How a 9B‑parameter Qwen3.5 model achieves full‑auto data analysis on a consumer GPU
The open‑source CoPaw‑Flash‑9B‑DataAnalyst‑LoRA model, a LoRA fine‑tune, can autonomously load, explore, statistically analyze, visualize, and write structured reports on CSV/Excel/JSON datasets. In evaluation it produced usable output on 26 of 29 datasets (≈90 %) with an average of 26 iteration rounds, and it runs on a single consumer‑grade GPU via vLLM and the Data Analyst framework.
Model Overview
CoPaw‑Flash‑9B‑DataAnalyst‑LoRA is a LoRA‑fine‑tuned version of Alibaba’s open‑source CoPaw‑Flash‑9B (Qwen3.5‑9B architecture). The LoRA adapter is hosted at huggingface.co/jason1966/CoPaw-Flash-9B-DataAnalyst-LoRA. After fine‑tuning, the model can autonomously load CSV/Excel/JSON datasets, perform statistical analysis, generate visualizations, write and execute Python scripts, and produce structured analysis reports without any human intervention.
Performance Evaluation
Evaluation used 29 real Kaggle datasets with the Data Analyst framework (max 50 rounds, 128 K context). Results (base model → LoRA fine‑tuned):
Average iteration rounds: 1.2 → 26.0 (≈21.7× increase)
Generated Python files: 0 → >100
Generated charts: 0 → >290
Total token consumption: ~5 K → 18.5 M (≈3700×)
Natural completion rate: 0 % → 89.7 %
Usable outputs: 0/29 (0 %) → 26/29 (90 %)
Human intervention: required at every step → fully autonomous
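The before/after multipliers in the list above can be rechecked with a few lines of arithmetic (all inputs are the figures reported in this article):

```python
# Recompute the reported multipliers from the raw before/after numbers.
baseline_rounds, tuned_rounds = 1.2, 26.0
baseline_tokens, tuned_tokens = 5_000, 18_500_000
usable, datasets = 26, 29

rounds_gain = tuned_rounds / baseline_rounds   # ~21.7x
token_gain = tuned_tokens / baseline_tokens    # 3700x
success_rate = usable / datasets               # ~89.7%, rounded to 90% above

print(f"{rounds_gain:.1f}x rounds, {token_gain:.0f}x tokens, {success_rate:.1%} usable")
```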
Demonstration
The agent autonomously analyzes a CSV, writes Python code, executes it, and produces box plots, scatter plots, bar charts, heatmaps, and a final report covering data overview, key findings, dimensional analysis, and conclusions. Sample visualizations (e.g., Toyota used‑car dataset) are shown in the original article.
Deployment Guide
Step 1 – Serve the model with vLLM
export HF_TOKEN=your_huggingface_token
CUDA_VISIBLE_DEVICES=0,1 vllm serve agentscope-ai/CoPaw-Flash-9B \
--enable-lora \
--lora-modules agent-lora=jason1966/CoPaw-Flash-9B-DataAnalyst-LoRA \
--max-lora-rank 64 \
--tensor-parallel-size 2 \
--gpu-memory-utilization 0.85 \
--max-model-len 131072 \
--gdn-prefill-backend triton \
--trust-remote-code \
--reasoning-parser qwen3 \
--enable-auto-tool-choice \
--tool-call-parser qwen3_xml \
--port 8000

Key flags:
--enable-lora + --lora-modules: load the LoRA adapter (core)
--max-lora-rank 64: must match the adapter’s rank
--reasoning-parser qwen3: expose the model’s reasoning process
--enable-auto-tool-choice: automatic tool selection for agent scenarios
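Once the server is up, it speaks the OpenAI‑compatible chat API. A minimal, stdlib‑only sketch of building such a request (the helper name and prompt are illustrative; the model name must be the LoRA alias from --lora-modules, not the base model):

```python
# Build a chat request for the vLLM OpenAI-compatible endpoint using only
# the standard library; send it with urllib.request.urlopen(req).
import json
import urllib.request

def chat_request(prompt: str, model: str = "agent-lora",
                 base_url: str = "http://localhost:8000/v1") -> urllib.request.Request:
    """Return a ready-to-send POST request for /v1/chat/completions."""
    payload = {"model": model,
               "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer unused"},  # vLLM ignores the key by default
    )

req = chat_request("Analyze data.csv and report sales trends")
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```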
Hardware Requirements
Dual‑GPU (bf16, TP=2): ≈11 GB per GPU
Single‑GPU (bf16): ≈22 GB
8‑bit quantization: ≈12 GB
4‑bit quantization: ≈6 GB (consumer‑grade GPU sufficient)
Official test environment: 2 × NVIDIA H200 GPUs with vLLM 0.19.1.
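The memory figures above follow from parameter count times bytes per parameter. A rough weights‑only estimate (KV cache and activation overhead come on top, which is why bf16 lands near 22 GB in practice):

```python
# Weights-only memory estimate: parameters x bytes per parameter.
GiB = 1024 ** 3

def weight_gib(num_params: float, bytes_per_param: float) -> float:
    return num_params * bytes_per_param / GiB

PARAMS = 9e9  # 9B parameters
print(f"bf16: {weight_gib(PARAMS, 2):.1f} GiB")   # ~16.8 GiB weights alone
print(f"int8: {weight_gib(PARAMS, 1):.1f} GiB")
print(f"int4: {weight_gib(PARAMS, 0.5):.1f} GiB")
```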
Step 2 – Install the Data Analyst framework
git clone https://github.com/IIIIQIIII/data-analyst.git
cd data-analyst
bun install

Configure .env:
CLAUDE_CODE_USE_OPENAI=1
OPENAI_BASE_URL=http://localhost:8000/v1
OPENAI_API_KEY=unused
OPENAI_MODEL=agent-lora

Step 3 – Run analysis
bun run start

Issue a natural‑language request, e.g.:
Analyze the CSV file in the current directory and find sales trends

The agent loads the data, writes and runs Python code, creates visualizations, and generates a full report automatically.
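For a sense of what the agent emits, here is an illustrative stand‑in for the kind of script it might write for a sales‑trend request (column names and sample data are made up, not taken from the evaluation):

```python
# Illustrative agent-style analysis: load CSV text, aggregate sales by month.
import csv
from collections import defaultdict
from io import StringIO

SAMPLE = """date,amount
2024-01-05,100
2024-01-20,150
2024-02-03,210
2024-02-28,190
"""

def monthly_sales(csv_text: str) -> dict[str, float]:
    """Sum the amount column per YYYY-MM month key."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(StringIO(csv_text)):
        totals[row["date"][:7]] += float(row["amount"])
    return dict(totals)

print(monthly_sales(SAMPLE))  # {'2024-01': 250.0, '2024-02': 400.0}
```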
Model–Framework Relationship
The model acts as the “brain”; the Data Analyst framework provides six tools that translate the model’s intentions into file I/O and code execution. Without the framework, the model’s analysis ability has no way to act on the world; without LoRA fine‑tuning, the base CoPaw‑Flash‑9B model stalls after each tool call and produces no useful output.
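A minimal sketch of that division of labor, with hypothetical tool names standing in for the framework’s actual six tools: the model emits a tool call, and the framework dispatches it to an implementation and returns the result.

```python
# Toy tool-dispatch loop: route a model-issued tool call to its implementation.
# Tool names and behaviors here are hypothetical placeholders.
from typing import Callable

TOOLS: dict[str, Callable[..., str]] = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_python": lambda code: f"<stdout of {len(code)}-char script>",
}

def dispatch(tool_call: dict) -> str:
    """Execute one tool call of the form {'name': ..., 'arguments': {...}}."""
    name, args = tool_call["name"], tool_call.get("arguments", {})
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**args)

print(dispatch({"name": "read_file", "arguments": {"path": "sales.csv"}}))
```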
Key Takeaways
True autonomy: the agent runs fully automatically, not a step‑by‑step “press‑continue” pseudo‑agent.
9 B parameters are sufficient: consumer‑grade hardware can handle the workload.
All components (model, framework, evaluation data) are released under Apache 2.0.
Empirical grounding: a 90 % success rate on 29 real datasets demonstrates practical viability.
This article has been distilled and summarized from source material and republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Old Zhang's AI Learning
AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.
