How an AI-Powered Question Recording System Supercharges Efficiency for Middle School Teachers
This article details the design and implementation of a locally deployed AI system that automatically extracts, structures, and manages exam questions from scanned papers, supporting multiple subjects, reducing manual effort, and enabling flexible test generation for teachers.
Construction Goals
Automatically convert paper, scanned, image, and Word exams into editable items to increase entry efficiency.
Eliminate repetitive manual entry, copying, formatting, and classification to reduce labor costs.
Support multi‑subject recognition for Chinese, Math, and English.
Extract structured fields such as stem, options, answer, analysis, type, difficulty, and knowledge points to build a structured question bank.
Enable flexible test assembly by subject, grade, type, knowledge point, and difficulty, with both automatic and manual modes.
Deploy locally using DeepSeek and Ollama so that exam and question data never leave the internal network.
User Roles
System Administrator: Manage users, roles, models, OCR services, and system parameters.
Curriculum Supervisor: Review question quality, maintain standards, and define test‑assembly rules.
Teacher / Entry Clerk: Upload exams, verify recognition results, maintain questions, and manually assemble tests.
General Viewer: View the question bank, download exams, and see assembly results.
Core Functional Requirements
1. Exam Upload & File Management
Support PDF, JPG, PNG, and Word uploads, including batch upload and conversion of files for preview.
Classify files by subject, grade, version, exam type, and uploader.
Record metadata (name, size, time, status) and allow retry on OCR/AI failures.
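The metadata and retry requirement above can be sketched as a small record plus a retry wrapper. This is a minimal illustration, not the system's actual implementation: the `PaperFile` fields, the status names, and the `process_with_retry` helper are all hypothetical, and the exponential backoff is an assumed policy.

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class PaperFile:
    """Hypothetical metadata record for one uploaded exam file."""
    name: str
    size_bytes: int
    uploaded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    status: str = "uploaded"  # assumed states: uploaded, processing, done, failed
    attempts: int = 0


def process_with_retry(
    record: PaperFile,
    step: Callable[[PaperFile], None],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> bool:
    """Run an OCR/AI step, retrying with exponential backoff when it raises."""
    record.status = "processing"
    for attempt in range(1, max_attempts + 1):
        record.attempts = attempt
        try:
            step(record)
        except Exception:
            if attempt == max_attempts:
                record.status = "failed"  # left for a manual retry later
                return False
            time.sleep(base_delay * 2 ** (attempt - 1))
        else:
            record.status = "done"
            return True
    return False
```

Keeping the attempt count on the record lets a file-management page show why a file is stuck and offer a manual retry once the automatic attempts are used up.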
2. OCR Automatic Recognition
Image OCR for JPG/PNG; for PDFs, extract text directly from text‑based files and run OCR on scanned ones.
Word parsing extracts text, tables, and images.
Layout analysis detects question number, stem, options, answer area, and analysis area.
Mathematical formulas are preserved as LaTeX or image references; original images are retained for picture‑based questions.
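As a rough illustration of the layout-analysis step, the sketch below groups already-recognized text lines into questions by looking for question numbers and option letters. The regular expressions, the `split_questions` name, and the output shape are assumptions made for this example; a production system would more likely combine OCR bounding boxes with such rules.

```python
import re

# Assumed conventions: "1." / "1、" / "1)" start a question, "A." to "D." start an option.
QUESTION_RE = re.compile(r"^\s*(\d{1,3})[.．、)]\s*(.+)$")
OPTION_RE = re.compile(r"^\s*([A-D])[.．、)]\s*(.+)$")


def split_questions(ocr_lines):
    """Group OCR text lines into questions with a stem and lettered options."""
    questions = []
    current = None
    for line in ocr_lines:
        if not line.strip():
            continue
        question_match = QUESTION_RE.match(line)
        option_match = OPTION_RE.match(line)
        if question_match:
            current = {
                "number": int(question_match.group(1)),
                "stem": question_match.group(2).strip(),
                "options": {},
            }
            questions.append(current)
        elif option_match and current is not None:
            current["options"][option_match.group(1)] = option_match.group(2).strip()
        elif current is not None:
            # Continuation line: extend the last option if any, otherwise the stem.
            text = line.strip()
            if current["options"]:
                last_label = next(reversed(current["options"]))
                current["options"][last_label] += " " + text
            else:
                current["stem"] += " " + text
    return questions
```

A known limitation of this rule-based approach: a wrapped line that happens to begin with a decimal such as "3.5 meters" would be mistaken for question 3, which is one reason the review step that follows remains necessary.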
3. AI Question Structuring
OCR output is fed to a local DeepSeek model to split into standard fields: subject, grade, type (single‑choice, multiple‑choice, fill‑in‑blank, true/false, short answer, essay, reading comprehension, calculation), stem, options, answer, analysis, knowledge points, difficulty, score, and source reference.
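One way to wire this step is to send the OCR text to Ollama's local HTTP endpoint and validate the JSON that comes back before it reaches the review queue. The sketch below is illustrative only: the prompt wording, the field names, the `deepseek-r1:7b` model tag, and the helper names are assumptions, and the article does not say how the original system builds its prompts.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address

# Field names are assumptions that mirror the list in the text above.
FIELDS = ("subject", "grade", "type", "stem", "options", "answer", "analysis",
          "knowledge_points", "difficulty", "score", "source")
REQUIRED_FIELDS = ("subject", "type", "stem", "answer")
ALLOWED_TYPES = {"single-choice", "multiple-choice", "fill-in-blank", "true/false",
                 "short answer", "essay", "reading comprehension", "calculation"}


def build_prompt(ocr_text: str) -> str:
    """Ask the model for one JSON object with a fixed set of keys."""
    return (
        "You are given OCR text of one exam question. "
        f"Return a single JSON object with the keys: {', '.join(FIELDS)}. "
        "Use null for anything not present in the text.\n\n"
        f"OCR text:\n{ocr_text}"
    )


def call_ollama(prompt: str, model: str = "deepseek-r1:7b", timeout: int = 120) -> str:
    """Send a non-streaming request to a local Ollama server and return its text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False, "format": "json"}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def parse_question(raw: str) -> dict:
    """Validate model output; raise ValueError so the task can be retried or reviewed."""
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("model output is not a JSON object")
    missing = [name for name in REQUIRED_FIELDS if data.get(name) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if not isinstance(data["type"], str) or data["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown question type: {data['type']!r}")
    return {name: data.get(name) for name in FIELDS}
```

With a local Ollama server running, the three helpers chain as `parse_question(call_ollama(build_prompt(ocr_text)))`. Validating before storage matters because model output can be malformed or use a type label outside the agreed list.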
4. Review & Audit
Side‑by‑side comparison of original scan and structured result.
Editable fields for stem, options, answer, analysis, knowledge points, and difficulty.
Formula correction (text edit or image upload) and image cropping/re‑association.
Batch confirmation for multiple correct recognitions.
Workflow states: pending, approved, rejected, stored.
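The four workflow states can be modeled as a small state machine. The article names the states but not the allowed moves between them, so the transition table below is an assumption: a rejected question returns to pending after correction, and only approved questions are stored.

```python
from enum import Enum


class AuditState(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    STORED = "stored"


# Assumed transitions; the source lists the states only.
TRANSITIONS = {
    AuditState.PENDING: {AuditState.APPROVED, AuditState.REJECTED},
    AuditState.REJECTED: {AuditState.PENDING},
    AuditState.APPROVED: {AuditState.STORED},
    AuditState.STORED: set(),
}


def transition(current, target) -> AuditState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    current, target = AuditState(current), AuditState(target)
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Centralizing the table in one place means batch confirmation can reuse the same check for every question it touches.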
5. Question‑Bank Management
Unified repository with filtering by subject, grade, type, knowledge point, and difficulty.
Advanced search on stem, answer, source, and uploader.
Edit and batch modify attributes; de‑duplicate based on similarity; version history for traceability.
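Similarity-based de-duplication can be approximated with the standard library alone. The sketch below compares normalized stems pairwise using `difflib.SequenceMatcher`; the 0.9 threshold and the helper names are assumptions, and the article does not state which similarity measure the system actually uses.

```python
import re
from difflib import SequenceMatcher


def normalize(stem: str) -> str:
    """Collapse whitespace and lowercase so formatting noise does not hide duplicates."""
    return re.sub(r"\s+", " ", stem).strip().lower()


def find_duplicates(stems, threshold: float = 0.9):
    """Return (i, j, ratio) for every pair of stems at or above the threshold."""
    normalized = [normalize(stem) for stem in stems]
    pairs = []
    for i in range(len(normalized)):
        for j in range(i + 1, len(normalized)):
            ratio = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
            if ratio >= threshold:
                pairs.append((i, j, round(ratio, 3)))
    return pairs
```

The pairwise loop is quadratic in the number of stems, so for a large bank it would make sense to compare a new question only against candidates that share its subject and type.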
6. Automatic Test Assembly
Configure rules (subject, grade, knowledge points, types, difficulty, quantity, total score).
Intelligent selection respecting difficulty and type ratios; avoid duplicate questions.
Auto‑generate answer sheets with analysis.
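A minimal sketch of ratio-based selection follows. It splits the requested question count across difficulty levels with the largest-remainder method, then samples without replacement within each level so no question appears twice. The dictionary shape of a question, the difficulty labels, and the function names are hypothetical; type ratios and score targets would be handled the same way and are left out for brevity.

```python
import random


def allocate(total: int, ratios: dict) -> dict:
    """Split `total` across keys in proportion to `ratios` (largest remainder)."""
    weight_sum = sum(ratios.values())
    exact = {key: total * weight / weight_sum for key, weight in ratios.items()}
    counts = {key: int(value) for key, value in exact.items()}
    leftover = total - sum(counts.values())
    # Hand out the remaining slots to the keys with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda key: exact[key] - counts[key], reverse=True)
    for key in by_remainder[:leftover]:
        counts[key] += 1
    return counts


def assemble(pool, total: int, difficulty_ratios: dict, seed=None):
    """Pick `total` distinct questions from `pool`, respecting difficulty ratios."""
    rng = random.Random(seed)
    selected = []
    for level, needed in allocate(total, difficulty_ratios).items():
        candidates = [q for q in pool if q["difficulty"] == level]
        if len(candidates) < needed:
            raise ValueError(f"not enough {level} questions: need {needed}")
        selected.extend(rng.sample(candidates, needed))  # sampling without replacement
    return selected
```

Raising an error when a level runs short, rather than silently padding with other difficulties, keeps the generated paper faithful to the configured rules and tells the teacher which part of the bank needs more questions.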
7. Manual Test Assembly
Search and drag‑drop questions; adjust order; set per‑question scores.
Define section titles and groupings; real‑time preview; export to Word or PDF.
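Behind the drag-and-drop interface, manual assembly reduces to reordering a list and summing per-question scores. The following sketch assumes a simple in-memory shape (sections holding `(question_id, score)` pairs) and hypothetical helper names; it is not taken from the original system.

```python
def move_question(order, from_index: int, to_index: int):
    """Return a new list with one question moved, as a drag-and-drop would do."""
    reordered = list(order)
    item = reordered.pop(from_index)
    reordered.insert(to_index, item)
    return reordered


def paper_total(sections) -> float:
    """Sum the per-question scores across all sections of a paper."""
    return sum(score for section in sections for _, score in section["questions"])
```

For example, a paper with a choice section of two 3-point questions and a calculation section with one 10-point question would be represented as two section dictionaries, and `paper_total` would give the figure shown in the real-time preview.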
8. Statistics & Quality Analysis
Track total question count, OCR/AI success rates, audit pass rates, type distribution, difficulty distribution, auto/manual assembly usage, and high‑frequency knowledge points.
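Most of these metrics are simple ratios and frequency counts. As a hedged sketch, assuming task statuses are stored as strings and questions as dictionaries (both hypothetical shapes), the core calculations could look like this:

```python
from collections import Counter


def success_rate(statuses, ok: str = "success") -> float:
    """Share of tasks whose status equals `ok`; 0.0 when there are no tasks."""
    return statuses.count(ok) / len(statuses) if statuses else 0.0


def distribution(questions, key: str) -> dict:
    """Fraction of questions per value of `key`, for example type or difficulty."""
    counts = Counter(q[key] for q in questions)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def top_knowledge_points(questions, n: int = 3):
    """Most frequent knowledge points across all questions."""
    counts = Counter(point for q in questions for point in q.get("knowledge_points", []))
    return counts.most_common(n)
```

The same `distribution` helper serves both the type and difficulty charts, since only the grouping key changes.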
AI Recognition Capability Design
Chinese
Recognizes pinyin, characters, poetry dictation, reading comprehension, classical Chinese, and essay prompts.
Splits reading material, questions, and answers; extracts essay requirements, word count, scoring hints; preserves line breaks and punctuation for poetry; extracts key points for subjective answers.
Math
Handles multiple‑choice, fill‑in‑blank, calculation, application, geometry, and function problems.
Converts formulas to LaTeX or retains images; links geometry diagrams; extracts solution steps; auto‑tags knowledge points such as functions, equations, and probability.
English
Supports single‑choice, cloze, reading comprehension, short‑answer, translation, and essay questions.
Distinguishes article, question, and options; identifies cloze blanks and candidates; reserves audio fields for future listening tasks; auto‑tags grammar points, tenses, and clauses.
System Technical Architecture
Overall Architecture
Frontend Stack
Vue 3 – Build management UI
Vite – Frontend build tool
TypeScript – Improve code maintainability
Element Plus – Component library for admin UI
Pinia – State management
Vue Router – Page routing
Axios – Call backend APIs
ECharts – Data visualization
Backend Stack
Spring Boot – Core business services
Spring Security / Sa‑Token – Authentication and authorization
MyBatis‑Plus / JPA – Database access
MySQL / PostgreSQL – Persist question bank, exams, users, tasks
Redis – Cache, task status, rate limiting
MinIO / Local Disk – Store PDFs, images, Word files, question images
EasyExcel / Apache POI – Process Excel (EasyExcel) and Word/Excel (Apache POI) documents
OpenAPI/Swagger – API documentation
AI Service Stack
Python – Main language for AI/OCR processing
FastAPI – Expose OCR and AI parsing endpoints
Uvicorn – Run FastAPI service
PaddleOCR – Chinese/English/digit OCR
pdfplumber / PyMuPDF – Extract text and render PDF pages
python‑docx – Parse Word documents
OpenCV – Image preprocessing, cropping, denoising
Ollama – Local large‑model serving
DeepSeek – Question structuring, knowledge‑point tagging, difficulty judgment
Core Business Process
System Modules
Login & Permission: User login, role management, menu permissions, operation logs.
File Upload: Exam upload, preview, classification, upload records.
OCR Recognition: Image, PDF, Word parsing, task management.
AI Parsing: Question splitting, field extraction, knowledge‑point tagging, difficulty assessment.
Question Review: Result comparison, manual correction, audit and storage.
Question Bank: List, filter, search, edit, batch operations, de‑duplication, version history.
Automatic Assembly: Rule config, smart selection, paper generation, answer sheet creation.
Manual Assembly: Manual selection, drag‑drop ordering, preview, score setting.
Statistics & Analysis: Question, OCR, assembly statistics, quality analysis.
System Configuration: OCR, model, prompt, file storage settings.
Core Data Model Recommendations
sys_user: User information.
sys_role: Role information.
paper_file: Original uploaded exam files.
ocr_task: OCR task records.
question: Main question table.
question_option: Options for choice questions.
question_image: Question‑related images or formula graphics.
question_tag: Tags and knowledge points.
question_audit: Audit records.
paper: Generated test papers.
paper_question: Association between papers and questions.
model_config: Ollama/DeepSeek model settings.
prompt_template: AI parsing prompt templates.
operation_log: System operation logs.
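To make the relationships between a few of these tables concrete, here is a small sketch using Python's built-in `sqlite3` with an in-memory database. SQLite stands in for the MySQL/PostgreSQL options named earlier purely so the example runs without a server, and every column name is an assumption, since the article lists table names only.

```python
import sqlite3

# Column names are illustrative; only the table names come from the list above.
SCHEMA = """
CREATE TABLE question (
    id INTEGER PRIMARY KEY,
    subject TEXT NOT NULL,
    grade TEXT,
    type TEXT NOT NULL,
    stem TEXT NOT NULL,
    answer TEXT,
    analysis TEXT,
    difficulty TEXT,
    status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE question_option (
    id INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES question(id),
    label TEXT NOT NULL,
    content TEXT NOT NULL
);
CREATE TABLE paper (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE paper_question (
    paper_id INTEGER NOT NULL REFERENCES paper(id),
    question_id INTEGER NOT NULL REFERENCES question(id),
    position INTEGER NOT NULL,
    score REAL NOT NULL,
    PRIMARY KEY (paper_id, question_id)  -- a question appears at most once per paper
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the sketch tables on the given connection."""
    conn.executescript(SCHEMA)


def paper_questions(conn: sqlite3.Connection, paper_id: int):
    """Return (stem, score) rows for a paper in display order."""
    return conn.execute(
        "SELECT q.stem, pq.score FROM paper_question pq "
        "JOIN question q ON q.id = pq.question_id "
        "WHERE pq.paper_id = ? ORDER BY pq.position",
        (paper_id,),
    ).fetchall()
```

Storing `position` and `score` on the association row, rather than on the question itself, lets the same question carry different scores and positions in different papers.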
SpringMeng
Focused on software development, sharing source code and tutorials for various systems.
