Unlock Multi‑Model AI Decision Power with LLM Council – A Hands‑On Guide
LLM Council, an open‑source platform created by former OpenAI researcher Andrej Karpathy, lets users query several top LLMs at once (by default GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5, and Grok 4), have the models anonymously peer‑review one another's answers, and synthesize a final report. This multi‑model workflow improves answer quality for research, technology selection, and learning, while remaining easy to install and run locally.
Why LLM Council is called an "AI decision power tool" – the core pain points it solves
Multi‑model simultaneous answering : One query can invoke several top LLMs (default GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5, Grok 4). Each model returns its own answer, providing diverse perspectives in a single view.
Anonymous peer‑review scoring : Models evaluate each other's answers on accuracy and insight, producing a ranking that makes quality instantly visible.
Chairman model aggregation : A designated "chairman" model (default Gemini 3 Pro) merges all answers and review scores, removes redundancy, fills gaps, and generates a comprehensive final report.
Local deployment & full open‑source : All data stays in local JSON files, API calls are routed through OpenRouter, and the code can be customized to add new models, change review criteria, or adjust the chairman model.
Zero‑code operation : The web UI mimics ChatGPT; users type a question and the system automatically runs answering → reviewing → aggregation, displaying results in separate tabs.
Authored by Karpathy : Former OpenAI scientist Andrej Karpathy ensures clean, efficient code and a trustworthy design.
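The three-stage workflow above (answer → anonymous review → chairman synthesis) can be sketched as follows. This is an illustrative outline with placeholder names and a stubbed scoring step, not the project's actual backend code:

```python
from statistics import mean

# Placeholder model identifiers, not real OpenRouter model IDs.
COUNCIL = ["model_a", "model_b", "model_c"]
CHAIRMAN = "model_chair"

def ask(model, prompt):
    # Stand-in for a real LLM call routed through OpenRouter.
    return f"{model}'s answer to: {prompt}"

def run_council(question):
    # Stage 1: every council member answers the question independently.
    answers = {m: ask(m, question) for m in COUNCIL}
    # Stage 2: members score the *anonymized* answers; the fixed score
    # here is a stub, whereas real reviews come from the models themselves.
    anonymized = list(answers.values())
    scores = {a: mean(7.5 for _ in COUNCIL) for a in anonymized}
    # Stage 3: the chairman merges ranked answers into a final report.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ask(CHAIRMAN, f"Synthesize these ranked answers: {ranked}")
```

The key design point is that reviewers never see which model wrote which answer, so the ranking reflects content quality rather than brand reputation.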
Three practical scenarios where LLM Council shines
1. Academic research – multi‑model validation of paper ideas
A researcher asks whether a "dynamic sparse attention" improvement is feasible. Four models respond with complementary feedback, the peer‑review stage highlights the most critical issue, and the chairman model synthesizes a concise recommendation, saving weeks of trial‑and‑error.
2. Technical selection – data‑driven LLM deployment decisions
When choosing between LangServe and vLLM for a 7B model deployment, the four models compare performance, usability, and cost. Peer reviews favor vLLM, and the chairman model produces a final recommendation that leads to a 2× throughput increase and 30% cost reduction.
3. Learning boost – diverse explanations of concepts
For the ε‑greedy strategy in Q‑Learning, GPT, Claude, and Gemini each give a different real‑world analogy. Peer reviews rank Claude's example as the clearest, letting the learner grasp the concept far faster than with a single‑model explanation.
Quick‑start guide for newcomers (three steps)
Step 1: Prepare environment and install dependencies
Ensure Python 3.10+, Node.js 18+, and uv are installed.
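These prerequisites can be verified with a short script before installing. This is a convenience sketch, not part of the repository (it checks the Python version directly and looks for the other tools on PATH):

```python
import shutil
import sys

def check_prerequisites():
    """Return (python_ok, missing_tools) for the setup requirements above."""
    # Python 3.10+ is required by the backend.
    python_ok = sys.version_info >= (3, 10)
    # git, node, and uv must be on PATH; Node's 18+ version check is
    # left out here for brevity.
    missing = [t for t in ("git", "node", "uv") if shutil.which(t) is None]
    return python_ok, missing
```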
Clone the repository and install backend and frontend dependencies:
# Clone the project
git clone https://github.com/karpathy/llm-council.git
cd llm-council
# Install backend dependencies
uv sync
# Install frontend dependencies
cd frontend
npm install
cd ..
Step 2: Configure API key (critical)
Register at openrouter.ai and obtain an API key (a small credit balance is required).
Create a .env file in the project root containing:
OPENROUTER_API_KEY=sk-or-v1-YOUR_KEY
Step 3: Launch the application and start asking
Run the one‑click start script: ./start.sh
Or start manually:
Backend: uv run python -m backend.main
Frontend: cd frontend && npm run dev
Open http://localhost:5173 in a browser, type a question (e.g., "Explain Transformer self‑attention"), and watch the answers, peer reviews, and final report appear.
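Under the hood, every council member is reached through OpenRouter's OpenAI‑compatible chat completions endpoint, authenticated with the key from .env. The sketch below prepares such a request without sending it; it is an illustration of the API shape, not the project's actual backend code:

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, question: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request to OpenRouter."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode(),
        headers={
            # The key set in .env as OPENROUTER_API_KEY.
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request (e.g., with urllib.request.urlopen) returns a standard chat completion JSON response, which LLM Council then stores locally.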
Advanced: customizing the model list
Edit backend/config.py to add or replace models. Example snippet:
COUNCIL_MODELS = [
"openai/gpt-5.1",
"google/gemini-3-pro-preview",
"anthropic/claude-sonnet-4.5",
"x-ai/grok-4",
"qwen/qwen-72b-chat" # added Chinese model
]
CHAIRMAN_MODEL = "google/gemini-3-pro-preview" # can be changed to any trusted model
Final thoughts
LLM Council is not meant to replace a single AI model but to enable smarter, more reliable AI usage. By orchestrating multiple models, applying anonymous peer review, and aggregating results, it reduces bias, fills knowledge gaps, and delivers trustworthy answers for research, technical decisions, and learning.
Project repository: https://github.com/karpathy/llm-council
Old Meng AI Explorer
Tracking global AI developments 24/7, focusing on large model iterations, commercial applications, and tech ethics. We break down hardcore technology into plain language, providing fresh news, in-depth analysis, and practical insights for professionals and enthusiasts.