Unlock Multi‑Model AI Decision Power with LLM Council – A Hands‑On Guide
LLM Council, an open‑source platform created by former OpenAI researcher Andrej Karpathy, lets users query several top LLMs at once (by default GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5, and Grok 4), have the models anonymously peer‑review one another's answers, and synthesize a final report. This multi‑model workflow improves answer quality for research, technology selection, and learning, while remaining easy to install and run locally.
Why LLM Council is called an "AI decision power tool" – the core pain points it solves
Multi‑model simultaneous answering : One query can invoke several top LLMs (default GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5, Grok 4). Each model returns its own answer, providing diverse perspectives in a single view.
Anonymous peer‑review scoring : Models evaluate each other's answers on accuracy and insight, producing a ranking that makes quality instantly visible.
Chairman model aggregation : A designated "chairman" model (default Gemini 3 Pro) merges all answers and review scores, removes redundancy, fills gaps, and generates a comprehensive final report.
Local deployment & full open‑source : All data stays in local JSON files, API calls are routed through OpenRouter, and the code can be customized to add new models, change review criteria, or adjust the chairman model.
Zero‑code operation : The web UI mimics ChatGPT; users type a question and the system automatically runs answering → reviewing → aggregation, displaying results in separate tabs.
Authored by Karpathy : Former OpenAI scientist Andrej Karpathy ensures clean, efficient code and a trustworthy design.
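The three-stage workflow above (answer → anonymous review → chairman synthesis) can be sketched as follows. This is an illustrative outline with placeholder names and a stubbed scoring step, not the project's actual backend code:

```python
from statistics import mean

# Placeholder model identifiers, not real OpenRouter model IDs.
COUNCIL = ["model_a", "model_b", "model_c"]
CHAIRMAN = "model_chair"

def ask(model, prompt):
    # Stand-in for a real LLM call routed through OpenRouter.
    return f"{model}'s answer to: {prompt}"

def run_council(question):
    # Stage 1: every council member answers the question independently.
    answers = {m: ask(m, question) for m in COUNCIL}
    # Stage 2: members score the *anonymized* answers; the fixed score
    # here is a stub, whereas real reviews come from the models themselves.
    anonymized = list(answers.values())
    scores = {a: mean(7.5 for _ in COUNCIL) for a in anonymized}
    # Stage 3: the chairman merges ranked answers into a final report.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ask(CHAIRMAN, f"Synthesize these ranked answers: {ranked}")
```

The key design point is that reviewers never see which model wrote which answer, so the ranking reflects content quality rather than brand reputation.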
Three practical scenarios where LLM Council shines
1. Academic research – multi‑model validation of paper ideas
A researcher asks whether a "dynamic sparse attention" improvement is feasible. Four models respond with complementary feedback, the peer‑review stage highlights the most critical issue, and the chairman model synthesizes a concise recommendation, saving weeks of trial‑and‑error.
2. Technical selection – data‑driven LLM deployment decisions
When choosing between LangServe and vLLM for a 7B model deployment, the four models compare performance, usability, and cost. Peer reviews favor vLLM, and the chairman model produces a final recommendation that leads to a 2× throughput increase and 30% cost reduction.
3. Learning boost – diverse explanations of concepts
For the ε‑greedy strategy in Q‑Learning, GPT, Claude, and Gemini each give a different real‑world analogy. Peer reviews rank Claude's example as the clearest, letting the learner grasp the concept far faster than with a single‑model explanation.
Quick‑start guide for newcomers (three steps)
Step 1: Prepare environment and install dependencies
Ensure Python 3.10+, Node.js 18+, and uv are installed.
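These prerequisites can be verified with a short script before installing. This is a convenience sketch, not part of the repository (it checks the Python version directly and looks for the other tools on PATH):

```python
import shutil
import sys

def check_prerequisites():
    """Return (python_ok, missing_tools) for the setup requirements above."""
    # Python 3.10+ is required by the backend.
    python_ok = sys.version_info >= (3, 10)
    # git, node, and uv must be on PATH; Node's 18+ version check is
    # left out here for brevity.
    missing = [t for t in ("git", "node", "uv") if shutil.which(t) is None]
    return python_ok, missing
```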
Clone the repository and install backend and frontend dependencies:
# Clone the project
git clone https://github.com/karpathy/llm-council.git
cd llm-council
# Install backend dependencies
uv sync
# Install frontend dependencies
cd frontend
npm install
cd ..
Step 2: Configure API key (critical)
Register at openrouter.ai and obtain an API key (a small credit balance is required).
Create a .env file in the project root containing:
OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY
Step 3: Launch the application and start asking
Run the one‑click start script: ./start.sh
Or start manually:
Backend: uv run python -m backend.main
Frontend: cd frontend && npm run dev
Open http://localhost:5173 in a browser, type a question (e.g., "Explain Transformer self‑attention"), and watch the answers, peer reviews, and final report appear.
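Under the hood, every council member is reached through OpenRouter's OpenAI‑compatible chat completions endpoint, authenticated with the key from .env. The sketch below prepares such a request without sending it; it is an illustration of the API shape, not the project's actual backend code:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, question: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request to OpenRouter."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            # The key set in .env as OPENROUTER_API_KEY.
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request (e.g., with urllib.request.urlopen) returns a standard chat completion JSON response, which LLM Council then stores locally.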
Advanced: customizing the model list
Edit backend/config.py to add or replace models. Example snippet:
COUNCIL_MODELS = [
"openai/gpt-5.1",
"google/gemini-3-pro-preview",
"anthropic/claude-sonnet-4.5",
"x-ai/grok-4",
"qwen/qwen-72b-chat" # added Chinese model
]
CHAIRMAN_MODEL = "google/gemini-3-pro-preview" # can be changed to any trusted model
Final thoughts
LLM Council is not meant to replace a single AI model but to enable smarter, more reliable AI usage. By orchestrating multiple models, applying anonymous peer review, and aggregating results, it reduces bias, fills knowledge gaps, and delivers trustworthy answers for research, technical decisions, and learning.
Project repository: https://github.com/karpathy/llm-council
Old Meng AI Explorer
Tracking global AI developments 24/7, focusing on large model iterations, commercial applications, and tech ethics. We break down hardcore technology into plain language, providing fresh news, in-depth analysis, and practical insights for professionals and enthusiasts.