**The Revolutionary Confucius Code Agent: A Game-Changer in AI Software Engineering**
Imagine an AI that can work across massive software repositories and long-running sessions while producing reproducible results on demanding benchmarks like SWE-Bench Pro and SWE-Bench Verified. Meet the Confucius Code Agent (CCA), an open-sourced AI software engineer developed by researchers from Meta and Harvard. Let’s dive into its features and capabilities.
**Scaffolding, Redefined**
The Confucius SDK, the core of CCA, treats scaffolding as a first-class design problem rather than a thin wrapper around a language model. The SDK is organized around three axes: **Agent Experience**, **User Experience**, and **Developer Experience**. Its hierarchical working memory and context compression let the agent reason over dozens of files and many interaction steps, keeping prompts within the model’s context limits while preserving essential artifacts like patches, error logs, and design decisions.
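To make the idea of context compression concrete, here is a minimal sketch. It is not the Confucius SDK's implementation: the `Message` type, the `compress` function, the word-count token estimate, and the placeholder summary are all assumptions made for illustration. The point is only that older chatter can be folded away while patches, error logs, and decisions are kept verbatim.

```python
from dataclasses import dataclass

# Hypothetical artifact kinds that should survive compression.
PRESERVED_KINDS = {"patch", "error_log", "decision"}


@dataclass
class Message:
    kind: str  # e.g. "chat", "patch", "error_log", "decision"
    text: str


def approx_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word."""
    return len(text.split())


def compress(history: list[Message], budget: int, keep_recent: int = 2) -> list[Message]:
    """Shrink the history when its estimated size exceeds the budget.

    The most recent messages and all preserved artifacts are kept verbatim.
    Other older messages are replaced by one placeholder summary line
    (a real system would ask a model to write that summary).
    """
    total = sum(approx_tokens(m.text) for m in history)
    if total <= budget:
        return list(history)

    cut = max(len(history) - keep_recent, 0)
    older, recent = history[:cut], history[cut:]
    kept = [m for m in older if m.kind in PRESERVED_KINDS]
    dropped = len(older) - len(kept)

    result: list[Message] = []
    if dropped:
        result.append(Message("summary", f"[{dropped} earlier messages summarized]"))
    return result + kept + recent
```

In this sketch the summary is placed first, followed by preserved artifacts and then the recent turns. A production system would likely need a better token counter and a second pass if the result is still over budget.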
**Persistent Notes: The Secret to Efficient Cross-Session Memory**
One of the most distinctive features of the Confucius SDK is its persistent note-taking system. A dedicated note-taking agent writes structured Markdown notes from execution traces, capturing task-specific strategies, repository conventions, and common failure modes. These notes are stored as long-term memory that can be reused across sessions. On 151 SWE-Bench Pro tasks with Claude 4.5 Sonnet, reusing notes cut the number of turns from 64 to 61 and token usage from 104k to 93k, while raising Resolve@1 from 53.0 to 54.4.
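A rough sketch of what cross-session notes could look like in practice follows. The function names (`render_note`, `save_note`, `load_notes`), the section headings, and the one-file-per-note layout are assumptions for illustration, not the SDK's actual format. In the real system an agent distills the notes from traces; here the content is passed in by hand.

```python
from pathlib import Path


def render_note(title: str, sections: dict[str, list[str]]) -> str:
    """Render a note as structured Markdown: one heading per section, one bullet per item."""
    lines = [f"# {title}", ""]
    for heading, items in sections.items():
        lines.append(f"## {heading}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines)


def save_note(notes_dir: Path, name: str, markdown: str) -> Path:
    """Persist a note to disk so that a later session can find it."""
    notes_dir.mkdir(parents=True, exist_ok=True)
    path = notes_dir / f"{name}.md"
    path.write_text(markdown, encoding="utf-8")
    return path


def load_notes(notes_dir: Path) -> str:
    """Concatenate all saved notes, e.g. to prepend them to a new session's prompt."""
    if not notes_dir.exists():
        return ""
    files = sorted(notes_dir.glob("*.md"))
    return "\n".join(p.read_text(encoding="utf-8") for p in files)
```

A new session would call `load_notes` before its first model call, so that conventions and known failure modes do not have to be rediscovered turn by turn.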
**Modular Extensions and Tool Sophistication**
The Confucius SDK exposes tools as extensions covering file editing, command execution, test runners, and code search. Each extension can maintain its own state and prompt wiring, which makes it easier to tailor the agent to specific needs. The researchers also studied tool-use sophistication through an ablation on a 100-task subset of SWE-Bench Pro, finding that how the agent chooses and sequences tools matters almost as much as the choice of backbone model.
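The extension idea can be sketched as a small plug-in interface. Everything below is hypothetical (the `Extension` and `Agent` classes, the `prompt_hint` field, and the toy `search_handler`), and it assumes that each extension carries its own state plus a short prompt fragment describing how to use it. The actual SDK interfaces may look quite different.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Extension:
    name: str
    prompt_hint: str                      # text contributed to the agent's prompt
    handler: Callable[[dict, str], str]   # receives (state, argument)
    state: dict = field(default_factory=dict)

    def call(self, argument: str) -> str:
        return self.handler(self.state, argument)


def search_handler(state: dict, query: str) -> str:
    """Toy code search over an in-memory corpus held in the extension's state."""
    state["queries"] = state.get("queries", 0) + 1
    corpus = state.setdefault("corpus", {})
    hits = sorted(path for path, text in corpus.items() if query in text)
    return ", ".join(hits) or "no matches"


class Agent:
    def __init__(self, extensions: list[Extension]):
        self.extensions = {ext.name: ext for ext in extensions}

    def system_prompt(self) -> str:
        """Assemble the tool section of the prompt from each extension's hint."""
        return "\n".join(f"{e.name}: {e.prompt_hint}" for e in self.extensions.values())

    def use(self, name: str, argument: str) -> str:
        if name not in self.extensions:
            raise KeyError(f"unknown extension: {name}")
        return self.extensions[name].call(argument)
```

Because state lives inside each extension, swapping a tool in or out does not require touching the agent loop, which is the kind of modularity the article describes.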
**The Meta Agent: Automating Agent Design**
The Confucius SDK also includes a meta agent that takes a natural-language specification of an agent and iteratively proposes configurations, prompts, and extension sets. It then runs the candidate agent on tasks, inspects traces and metrics, and edits the configuration in a build, test, improve loop. This turns agent engineering itself into an LLM-guided optimization problem, and it is how the production Confucius Code Agent was generated, rather than by manual tuning.
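The build, test, improve loop can be pictured as a simple search over configurations. The sketch below is an assumption-laden illustration: `meta_loop`, `toy_propose`, and `toy_evaluate` are hypothetical names, and the deterministic stand-ins replace what would really be an LLM proposing edits and a benchmark run producing metrics.

```python
from typing import Callable

Config = dict

# Hypothetical tool names used only by the toy stand-ins below.
REQUIRED = ("edit_file", "run_tests", "search_code")


def meta_loop(
    spec: str,
    propose: Callable[[str, Config, float], Config],
    evaluate: Callable[[Config], float],
    initial: Config,
    rounds: int = 3,
) -> tuple[Config, float]:
    """Repeatedly propose a candidate config, score it, and keep it if it improves."""
    best, best_score = initial, evaluate(initial)
    for _ in range(rounds):
        candidate = propose(spec, best, best_score)   # build
        score = evaluate(candidate)                   # test
        if score > best_score:                        # improve
            best, best_score = candidate, score
    return best, best_score


def toy_propose(spec: str, config: Config, score: float) -> Config:
    """Stand-in for an LLM: add the first required tool that is still missing."""
    tools = list(config["tools"])
    for tool in REQUIRED:
        if tool not in tools:
            tools.append(tool)
            break
    return {**config, "tools": tools}


def toy_evaluate(config: Config) -> float:
    """Stand-in for a benchmark run: fraction of required tools present."""
    return sum(tool in config["tools"] for tool in REQUIRED) / len(REQUIRED)
```

For example, `meta_loop("fix failing tests", toy_propose, toy_evaluate, {"tools": []}, rounds=5)` converges on a config containing all three tools. A real meta agent would also read traces to decide what to change, which this greedy loop does not model.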
**Results on SWE-Bench Pro and SWE-Bench Verified**
The results are strong. With Claude 4.5 Sonnet, the Confucius Code Agent reaches a Resolve@1 of 52.7 on SWE-Bench Pro, surpassing Claude 4.5 Opus paired with a weaker scaffold at 52.0. On SWE-Bench Verified, the Confucius Code Agent with Claude 4 Sonnet reaches a Resolve@1 of 74.6, outperforming SWE Agent and OpenHands.
**Key Takeaways**
* Scaffolding can outweigh model size: the Confucius Code Agent demonstrates that strong scaffolding can lead to better results even with a smaller model.
* Hierarchical working memory is essential for long-horizon coding: the Confucius SDK’s hierarchical working memory and context compression enable it to handle large repositories and long trajectories.
* Persistent notes act as efficient cross-session memory: reusing structured notes can reduce the number of turns and token usage while increasing Resolve@1.
* Tool configuration is crucial: the agent’s tool routing and recovery strategies are a significant performance lever, not just an implementation detail.
* The meta agent automates agent design and tuning: this process can generate production-ready agents without manual tuning.
The Confucius Code Agent is a testament to the power of innovative design and the potential of AI in software engineering. With its modular extensions, persistent notes, and meta agent, this system is poised to revolutionize the way we approach software development.
You can find the full paper [here](link to paper).
