Educational Large Language Model Research and Product Applications for Youth Programming
The presentation outlines the challenges of sparse data and delayed learning effects in youth programming education, introduces three technical breakthroughs—dual‑data model training, hierarchical knowledge‑graph prompting, and reinforcement‑based cognitive recommendation—and showcases product implementations such as the Frog Programming Platform, AI learning machine, and digital‑human recorded courses.
The talk, led by Dr. Su Yu, senior engineer and deputy researcher at the Hefei Institute of Artificial Intelligence, focuses on the development of large language models (LLMs) for intelligent education, especially personalized programming for young learners.
Background and challenges: Traditional classroom teaching cannot meet individual learning needs; data sparsity (limited student interaction records) and learning latency (the delayed effect of recommendations on observable learning outcomes) hinder effective personalization.
Technical highlights:
Dual‑data model training and historical experience injection to build a domain‑specific programming LLM.
Hierarchical knowledge‑graph construction with prompt generation for small‑knowledge learning.
Reinforcement‑based cognitive recommendation that simulates learning paths using a model‑driven environment.
Product cases:
Frog Programming Platform – an AI‑driven coding practice environment with knowledge‑graph visualization.
AI Programming Learning Machine – a hardware device offering code repair, hint generation, and guided learning.
Digital‑human AI recorded‑course platform – provides personalized video lessons with interactive virtual teachers.
Implementation details:
Data acquisition via LLM‑generated error code and adversarial networks to create realistic training samples.
Fine‑tuning on open‑source LLaMA using LoRA, with dual loss functions ensuring code similarity to reference answers and passing test cases.
Knowledge injection through a high‑quality error‑code repository, embedding vectors, and similarity‑based prompt retrieval, improving code‑fix accuracy by ~20% over GPT‑3.5.
Small‑knowledge learning leverages multi‑level knowledge graphs and reasoning prompts to answer queries such as binary‑search implementation, enhancing model performance on limited user data.
Reinforcement cognitive recommendation models the recommendation process as an agent interacting with a simulated environment, evaluating state changes and immediate rewards, reducing the average number of steps needed to master a concept by about 30%.
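The dual loss described above can be illustrated with a toy computation. This sketch is not the talk's actual training objective: the function names, the token-mismatch stand-in for a cross-entropy term, and the weighting parameter `alpha` are all illustrative assumptions. It only shows how a similarity term and a test-pass term might be combined into one scalar loss.

```python
def token_similarity_loss(generated: list, reference: list) -> float:
    # Fraction of token positions where the generated code differs from
    # the reference answer (a toy stand-in for a cross-entropy term).
    n = max(len(generated), len(reference))
    mismatches = sum(1 for g, r in zip(generated, reference) if g != r)
    mismatches += abs(len(generated) - len(reference))
    return mismatches / n if n else 0.0

def test_pass_loss(code: str, tests: list) -> float:
    # Fraction of unit tests the candidate code fails; each test is a
    # callable that judges the code string.
    if not tests:
        return 0.0
    passed = sum(1 for t in tests if t(code))
    return 1.0 - passed / len(tests)

def dual_loss(generated, reference, code, tests, alpha=0.5):
    # Weighted combination of the two objectives: stay close to the
    # reference solution AND pass the test cases.
    return (alpha * token_similarity_loss(generated, reference)
            + (1 - alpha) * test_pass_loss(code, tests))
```

In a real LoRA fine-tuning run, both terms would be differentiable (or the test-pass signal would enter via a policy-gradient-style reward); the point here is only the two-term structure.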
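The similarity-based prompt retrieval over an error-code repository can be sketched as follows. The repository contents and the bag-of-tokens "embedding" are placeholders; the actual system presumably uses learned embedding vectors over a much larger curated repository.

```python
import math
from collections import Counter

def embed(code: str) -> Counter:
    # Toy bag-of-tokens embedding; a real system would use a learned
    # code-embedding model producing dense vectors.
    return Counter(code.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical repository of (buggy code, fixed code) pairs.
REPO = [
    ("for i in range(len(a)) print(a[i])",
     "for i in range(len(a)): print(a[i])"),
    ("if x = 1: pass", "if x == 1: pass"),
]

def build_repair_prompt(buggy: str, k: int = 1) -> str:
    # Retrieve the k most similar known errors and prepend them as
    # few-shot examples before asking the model to fix the new code.
    q = embed(buggy)
    ranked = sorted(REPO, key=lambda p: cosine(q, embed(p[0])), reverse=True)
    shots = "\n".join(f"Buggy: {b}\nFixed: {f}" for b, f in ranked[:k])
    return f"{shots}\nBuggy: {buggy}\nFixed:"
```

For a query like `build_repair_prompt("if x = 2: pass")`, the nearest stored error (the `=` vs `==` bug) is retrieved, so the prompt carries a directly relevant repair example.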
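The agent-environment framing of the recommendation process can be sketched with a toy simulated student. Everything here is an illustrative assumption: the mastery dynamics, the prerequisite effect, and the one-step-lookahead policy (a stand-in for a trained RL policy); it only demonstrates the loop of acting, observing a state change, and collecting an immediate reward.

```python
class SimulatedStudent:
    # Toy cognitive environment: mastery of each concept lies in [0, 1];
    # practicing a concept raises its mastery, more so when the
    # preceding concept (its assumed prerequisite) is already mastered.
    def __init__(self, n_concepts: int = 3):
        self.mastery = [0.0] * n_concepts

    def step(self, concept: int) -> float:
        before = sum(self.mastery)
        prereq = self.mastery[concept - 1] if concept > 0 else 1.0
        gain = 0.3 * (0.5 + 0.5 * prereq) * (1.0 - self.mastery[concept])
        self.mastery[concept] = min(1.0, self.mastery[concept] + gain)
        # Immediate reward = total mastery gained by this action.
        return sum(self.mastery) - before

def greedy_recommend(env: SimulatedStudent, steps: int = 20):
    # One-step-lookahead policy: recommend the concept with the highest
    # predicted immediate reward under the environment's dynamics.
    trajectory = []
    for _ in range(steps):
        best = max(
            range(len(env.mastery)),
            key=lambda c: 0.3
            * (0.5 + 0.5 * (env.mastery[c - 1] if c > 0 else 1.0))
            * (1.0 - env.mastery[c]),
        )
        reward = env.step(best)
        trajectory.append((best, round(reward, 3)))
    return trajectory
```

Rolling this policy out shows the expected behavior: the agent drills the first concept until its prerequisite value makes later concepts more rewarding, which is the kind of learning-path simulation the talk uses to evaluate step counts to mastery.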
Future outlook: discusses the trade‑offs between large and small models, the importance of model compression, domain‑specific knowledge integration, and the role of fine‑grained data in making education genuinely “intelligent”.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.