What’s Driving the AI Boom? New Models, Data Limits, and the Rise of Forgetting
This issue reviews the latest AI developments—OpenAI’s o3 and o1 models, API price cuts, new ChatGPT features, product launches such as Pika 2.0 and Gemini 2.0, a heated debate on pre‑training data bottlenecks sparked by Ilya Sutskever, a novel black‑box forgetting method, and DeepMind’s Genie 2 3D world generator—highlighting how industry dynamics and research directions are reshaping the field.
Market Updates and New Features
OpenAI’s holiday‑season announcements introduced several updates: the o3 reasoning model, the full o1 model available through the API, and a Realtime API with significantly lower token prices (e.g., audio token costs for gpt-4o-realtime-preview-2024-12-17 reduced by 60%). New capabilities include Projects in ChatGPT for organising chats and files into personal knowledge bases, Search for all users with voice input, conversation support over phone calls and WhatsApp, and a Mac app that can read code from IDEs and integrate with notes apps.
Other notable product releases this week were Pika 2.0 (text‑to‑video generation with customisable scene ingredients), a free tier for GitHub Copilot capped at 2,000 code completions per month, and Gemini 2.0 Flash Thinking, which exposes its step‑by‑step reasoning and is freely accessible via Google AI Studio.
Industry Debate on Pre‑training Limits
At NeurIPS 2024, former OpenAI chief scientist Ilya Sutskever argued that current pre‑training methods are approaching a data ceiling and that pre‑training as we know it will end, prompting a lively debate. Google AI Studio’s product lead countered that the perceived limit reflects a failure of imagination rather than a hard constraint, while Yann LeCun emphasised the largely untapped potential of video data compared with text.
The community’s key questions focus on whether data scarcity will truly bottleneck AI progress, if models must move beyond text‑driven training, and how multimodal models might break existing constraints.
Research Spotlight: Black‑Box Forgetting
Researchers from Tokyo University of Science and NEC propose a black‑box forgetting technique that selectively erases a model’s knowledge of specific classes without accessing its internals. Because gradients are unavailable, the method optimises textual prompts with derivative‑free optimisation, and it introduces Latent Context Sharing—sharing low‑dimensional latent components across prompt tokens—to shrink the otherwise high‑dimensional search space, achieving strong selective forgetting performance.
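To make the idea concrete, here is a minimal toy sketch of the two ingredients described above: derivative‑free prompt optimisation against a black‑box model, and a latent‑context‑sharing parameterisation that cuts the search dimensionality. Everything here—the surrogate model, the (1+1) evolution strategy, the dimensions, and the forget/retain loss—is an illustrative assumption, not the authors’ actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

M, D = 4, 64          # prompt tokens, embedding dim (naive search space: M*D = 256)
d_u, d_s = 4, 8       # per-token unique latent + shared latent (search space: M*d_u + d_s = 24)
P = rng.normal(size=(d_u + d_s, D)) / np.sqrt(d_u + d_s)  # fixed random projection

def decode(latents):
    """Latent context sharing: expand low-dim latents into full prompt embeddings."""
    unique = latents[: M * d_u].reshape(M, d_u)           # token-specific parts
    shared = np.tile(latents[M * d_u :], (M, 1))          # one part shared by all tokens
    return np.concatenate([unique, shared], axis=1) @ P   # (M, D)

# Black-box model: we may query class scores but see no gradients or weights.
W = rng.normal(size=(D, 10))
def black_box_scores(prompt):
    return prompt.mean(axis=0) @ W                        # (10,) class scores

FORGET, RETAIN = [3], [0, 1, 2]
def loss(latents):
    s = black_box_scores(decode(latents))
    # Suppress the forget class while preserving the retain classes.
    return s[FORGET].sum() - s[RETAIN].sum()

# (1+1) evolution strategy: derivative-free, uses only black-box loss queries.
x = np.zeros(M * d_u + d_s)
best = loss(x)
for _ in range(500):
    cand = x + 0.1 * rng.normal(size=x.shape)
    if (l := loss(cand)) < best:
        x, best = cand, l
```

The point of the shared latent is the dimensionality count in the comments: optimising all token embeddings directly would mean a 256‑dimensional search, while the factored parameterisation needs only 24 variables, which is what makes gradient‑free search tractable.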
Potential applications include specialised models for efficiency, risk‑controlled image generation, and privacy protection by preventing unwanted content generation.
Genie 2: Single‑Image 3D World Generation
DeepMind’s Genie 2, a foundation world model, can create unlimited, controllable 3D environments from a single image. Trained on massive video datasets, it supports object interaction, realistic character animation, physics simulation, and multi‑view rendering.
Key use cases are AI agent training, immersive interactive design, and AI‑driven game development, with interest from industry leaders such as Elon Musk.
Both the forgetting research and Genie 2 illustrate a shift toward specialised, efficient, and multimodal AI systems that address practical deployment challenges while pushing the boundaries of model capabilities.
ZhongAn Tech Team
China's first internet-only insurer. Through technology innovation, we make insurance simpler, warmer, and more valuable. Powered by technology, we underwrite 50 billion RMB of policies and serve 600 million users with smart, personalised solutions. Follow us for ZhongAn's engineering deep dives and technical articles.
