Advances in Educational Large Language Models for Youth Programming and Personalized Learning
Dr. Su Yu's presentation outlines the challenges of youth programming education and introduces three technical breakthroughs for educational large language models: dual-data training, knowledge-graph-driven learning, and reinforcement-based recommendation. It also showcases product implementations, including the Frog programming platform, an AI learning machine, and a digital-human AI recorded-course system.
Dr. Su Yu, a senior engineer and associate researcher at the Hefei Institute of Artificial Intelligence, gave an overview of the current state and challenges of intelligent education, focusing on youth programming. He highlighted two obstacles to personalized education: sparse learner data and learning effects that only become visible after a delay.
The talk detailed three technical highlights of educational large language model development: (1) training a domain-specific large language model for youth programming using dual data and historical experience injection; (2) enabling small-knowledge learning through hierarchical knowledge graphs and prompt generation; and (3) strengthening cognitive recommendation with reinforcement learning in learning environments simulated by large models.
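The third highlight, learning a recommendation policy inside a simulated learning environment, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the talk describes a learner simulated by a large model, whereas `simulate_learner` below is a hand-written toy rule, and tabular Q-learning stands in for whatever reinforcement learning method the speaker actually used. The names `simulate_learner`, `train_recommender`, and `greedy_policy` are hypothetical.

```python
import random

N_LEVELS = 3   # mastery levels 0, 1, 2; reaching 3 means "mastered"
N_ACTIONS = 3  # exercise difficulty: 0 (easy), 1 (medium), 2 (hard)


def simulate_learner(mastery, difficulty):
    """Toy stand-in for a simulated student (assumption, not from the talk).

    Only an exercise whose difficulty matches the current mastery level
    helps: it raises mastery by one and yields a reward of 1.
    """
    if difficulty == mastery:
        return mastery + 1, 1.0
    return mastery, 0.0


def train_recommender(episodes=500, max_steps=20, lr=0.5,
                      gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning over (mastery level, exercise difficulty)."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_LEVELS)]
    for _ in range(episodes):
        mastery = 0
        for _ in range(max_steps):
            # Epsilon-greedy: explore sometimes, otherwise pick the best known action.
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[mastery][a])
            nxt, reward = simulate_learner(mastery, action)
            # No future value once the learner has mastered the topic.
            future = 0.0 if nxt == N_LEVELS else max(q[nxt])
            q[mastery][action] += lr * (reward + gamma * future - q[mastery][action])
            mastery = nxt
            if mastery == N_LEVELS:
                break
    return q


def greedy_policy(q):
    """Recommended difficulty for each mastery level."""
    return [max(range(N_ACTIONS), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    print(greedy_policy(train_recommender()))
```

In this toy setup the learned policy ends up matching exercise difficulty to the learner's current level. The point of training against a simulator is that the policy can explore many recommendation sequences without waiting for real students, which is one way to work around the sparse data and delayed feedback mentioned earlier.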
Product applications were demonstrated, including the Frog programming platform, an AI‑powered programming learning device, and a digital‑human AI recorded‑course platform. These solutions integrate multimodal test‑question representation, personalized error‑correction, code hints, and knowledge‑graph visualizations to provide tailored learning paths.
Technical methods such as adversarial data generation, fine‑tuning with LoRA on LLaMA, knowledge injection via embedding vectors, and reinforcement learning for recommendation were explained. Comparative evaluations showed significant improvements over baseline models like GPT‑3.5, especially in code repair accuracy.
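To make the LoRA fine-tuning mentioned above concrete, here is a minimal, dependency-free sketch of the idea: the pretrained weight matrix stays frozen and only a low-rank update `B @ A`, scaled by `alpha / rank`, is trained. This is the general LoRA formulation, not the speaker's training code; the class name `LoRALinear`, the tiny dimensions, and the initialization constants are assumptions for illustration, and a real LLaMA fine-tune would rely on a deep learning framework rather than Python lists.

```python
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Linear layer with a frozen weight W and a trainable low-rank update.

    forward(x) = W x + (alpha / rank) * B (A x)
    A has shape (rank, d_in), B has shape (d_out, rank).
    """

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight              # frozen pretrained weights
        self.scale = alpha / rank
        # A starts with small random values, B with zeros, so the adapted
        # layer initially behaves exactly like the pretrained one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def num_trainable(self):
        """Number of parameters updated during fine-tuning (A and B only)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


if __name__ == "__main__":
    layer = LoRALinear([[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]], rank=1, alpha=2.0)
    print(layer.forward([1.0, 1.0, 1.0, 1.0]), layer.num_trainable())
```

With a 2 x 4 weight and rank 1, the adapter trains 6 parameters instead of 8. The saving grows quickly with layer size, which is why low-rank adapters are a common choice for adapting a large base model to a narrow domain such as youth programming.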
The presentation concluded with reflections on the trade‑offs between large and small models, the importance of domain‑specific knowledge integration, and the role of high‑quality, user‑generated data in advancing educational AI systems.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.