Tagged articles
10 articles
Data Party THU
Oct 10, 2025 · Artificial Intelligence

Can Language Models Self‑Train Without Data? Inside the Language Self‑Play Framework

This article examines the Language Self‑Play (LSP) approach for data‑free training of large language models, detailing its challenger‑solver game formulation, advantage calculations, loss functions, self‑reward extension, experimental setup on AlpacaEval, and results that show LSP can match or surpass data‑driven baselines.

LLM · data-free training · large language models
14 min read
Fighter's World
Sep 30, 2024 · Artificial Intelligence

Exploring Google NotebookLM: Use Cases, Interaction Experience, and Key Insights

The author reviews Google NotebookLM, describing how it aids deep paper reading, encourages users to keep chatting through guided prompts, and maintains conversation coherence through self‑play insights. The review also highlights the audio‑overview feature and reflects on AI concepts such as the "bitter lesson" and the limits of self‑play in open scenarios.

AI research · Google · LLM
22 min read
Architect
Sep 28, 2024 · Artificial Intelligence

How Does OpenAI’s o1 Model Leverage Self‑Play RL and New Scaling Laws?

The article provides an in‑depth technical analysis of OpenAI’s multimodal o1 model, explaining its self‑play reinforcement‑learning pipeline, the novel train‑time and test‑time compute scaling laws, its long‑think reasoning abilities demonstrated through a cipher example, and speculative architectures for generator‑verifier systems.

Inference · OpenAI · RL scaling
35 min read
DataFunSummit
Sep 22, 2023 · Artificial Intelligence

Exploring Game AI Agents: Review, LLM‑Driven Exploration, and Future Directions

This article reviews the evolution of game AI agents, examines how large language models (LLMs) can drive new AI behaviors in games, and discusses practical case studies across genres such as Werewolf‑style, war‑SLG, and MOBA games, concluding with challenges and future research directions.

AI agents · Game Development · LLM
31 min read
Code DAO
Dec 14, 2021 · Artificial Intelligence

Building a Chess AI from Scratch: Combining AlphaZero and Transformers (Part 2)

This article walks through constructing a learnable chess AI by integrating AlphaZero‑style Monte Carlo Tree Search with a decoder‑only Transformer, detailing the game tree logic, model architecture, input and output encodings, self‑play training loop, and code implementation in PyTorch.

AlphaZero · MonteCarloTreeSearch · PyTorch
23 min read
Programmer DD
Jan 3, 2021 · Artificial Intelligence

How Self‑Play and GAIL Powered the WeKick AI to Win the First Google Football Kaggle Championship

After a nostalgic gaming session, the author recounts how Tencent’s upgraded AI, WeKick, leveraged self‑play reinforcement learning, GAIL‑based adversarial simulation, and a multi‑style League framework to dominate the inaugural Google Football Kaggle competition, illustrating the escalating complexity of multi‑agent AI in real‑time strategy games.

GAIL · Kaggle competition · Tencent
8 min read
DataFunTalk
Mar 20, 2019 · Artificial Intelligence

Addressing Sparse Reward Problems in Model-Free Reinforcement Learning

This article reviews the challenges of model‑free reinforcement learning, especially sparse reward issues exemplified by Montezuma’s Revenge, and surveys recent approaches such as expert demonstrations, curriculum learning, self‑play, hierarchical reinforcement learning, and count‑based exploration to mitigate these problems.

Model-free · curriculum learning · exploration
12 min read