How mini‑SWE‑agent Solves 65% of SWE‑bench Bugs with Only 100 Lines of Code
The mini‑SWE‑agent, a lightweight open‑source software‑engineering AI built by the original SWE‑bench team, achieves about 65% bug‑fix success on the SWE‑bench benchmark using roughly 100 lines of Python, thanks to its minimal dependencies, shell‑based execution, linear history, and support for various container environments, offering a fast, extensible alternative to the full‑featured SWE‑agent.
Overview
mini‑SWE‑agent is an open‑source software‑engineering agent that implements the same bug‑fix task as SWE‑agent but with a minimal code base (~100 lines of core Python, ~200 lines including setup). It executes shell commands directly without a tool‑call interface.
Performance
On the SWE‑bench validation set the agent solves approximately 65 % of the problems , comparable to the original SWE‑agent while being far lighter.
Key design features
Minimal code and dependencies : ~100 lines of Python, no heavy third‑party libraries.
Direct shell execution : each model output is a complete command executed by Python; no separate tool‑call protocol.
Linear history : steps are appended to the message stream, avoiding complex state management.
Independent step execution : commands run in isolated subprocesses, simplifying sandboxing and extension.
Simplified configuration : built‑in code templates replace YAML configuration; CLI commands mini (run) and mini‑v (visual UI) start the agent.
Broad environment support : works in local shells and inside Docker, Podman, Singularity, Apptainer, etc., without code changes.
Retained tooling : batch inference, trajectory browsing, and a visual UI are still provided for large‑scale evaluation.
Installation and usage
Clone the repository and install the minimal requirements:
git clone https://github.com/SWE-agent/mini-swe-agent.git
cd mini-swe-agent
pip install -r requirements.txt # typically only standard librariesRun the agent from the command line:
mini # start the agent in terminal mode
mini-v # launch the optional visual interfaceThe agent reads a problem description (e.g., a GitHub issue) from stdin or a file, generates a shell command, executes it, and appends the result to the conversation.
Recommended scenarios
Rapid local experimentation, fine‑tuning (FT) or reinforcement‑learning (RL) loops where a lightweight control flow is desired.
Environments where installing large frameworks is impractical.
Evaluations that require a stable, reproducible sandbox.
For use cases that need extensive toolchains, configurable YAML pipelines, or complex multi‑tool state, the full‑featured SWE‑agent is more appropriate.
Background
SWE‑bench is a benchmark built from real GitHub issues and pull requests to assess whether large language models can understand bug reports and automatically fix code. SWE‑agent was released in 2024 by researchers from Princeton University and OpenAI to achieve high bug‑fix rates on this benchmark. mini‑SWE‑agent was created to provide a ~100‑fold reduction in code size while preserving performance.
References
Project repository: https://github.com/SWE-agent/mini-swe-agent
Readme and additional documentation: https://github.com/SWE-agent/mini-swe-agent?tab=readme-ov-file
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Data Party THU
Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
