NewBeeNLP

Always insightful, always fun

119 Articles · 0 Likes · 1 View · 0 Comments

Recent Articles
Mar 18, 2025 · Interview Experience

How to Ace Multimodal Model Interviews at Taobao's Search AI Division

This article recounts a three‑stage interview for a multimodal large‑model position at Taobao's Search AI division, detailing typical questions on CLIP, LoRA, BLIP, Qwen‑VL, Transformer fundamentals, RLHF, and coding challenges, and offers insights on what interviewers focus on.

AI · CLIP · LoRA
5 min read
Mar 14, 2025 · Artificial Intelligence

How Open‑Sora 2.0 Achieves SOTA Video Generation with Only $200K Training Cost

Open‑Sora 2.0 is an open‑source 11B‑parameter video generation model that matches commercial SOTA performance while being trained on 224 GPUs for just $200,000, thanks to a 3D auto‑encoder, MMDiT architecture, aggressive data filtering, low‑resolution pre‑training, and highly optimized parallel training techniques.

AI model · MMDiT · Open-Sora
9 min read
Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts
18 min read
Feb 27, 2025 · Industry Insights

How DeepSeek’s Open‑Source Tools Exploit China‑Specific H800 GPUs to Boost AI Performance

The article analyzes DeepSeek’s three open‑source projects—FlashMLA, DeepEP, and DeepGEMM—showing how they optimize for the China‑only NVIDIA H800 GPU, contrast this with the abundant hardware resources of Western AI firms, and highlight the growing demand for talent that masters both AI models and GPU hardware.

AI hardware · DeepEP · DeepGEMM
7 min read
Feb 23, 2025 · Industry Insights

What I Learned After a Year Building Large Language Models: Wins, Losses, and Future Trends

A year after taking a pay cut to join a startup focused on large-model research, I reflect on the early uncertainty of exponential growth, the challenge of competing with AI giants, evolving career paths, emerging industry trends, and how balancing work with family shaped my perspective on long-term success.

AI industry · AI trends · Career Reflection
11 min read
Feb 21, 2025 · Artificial Intelligence

Do Scaling Laws Still Hold? Analyzing Grok‑3, DeepSeek, and LLM Training Trends

The article examines whether pre‑training scaling laws remain valid, compares Grok‑3’s architecture and training strategy with DeepSeek’s models, and explores how different scaling approaches (pre‑training, RL‑based, and test‑time) affect the cost‑effectiveness and intelligence of large language models.

AI research · Grok-3 · scaling laws
11 min read
Jan 17, 2025 · Artificial Intelligence

Unlocking Multimodal Intelligence: A Deep Dive into Next Token Prediction

This comprehensive survey examines the foundations, tokenization techniques, model architectures, training paradigms, evaluation benchmarks, and open challenges of multimodal next‑token prediction (MMNTP), offering researchers a clear roadmap for future advances in multimodal AI.

Next Token Prediction · Training Paradigms · evaluation
9 min read
Jan 14, 2025 · R&D Management

How to Kickstart Your CS Research Journey and Find LLM Serving Ideas

The author shares a candid reflection on the first half year of entering computer‑science research, outlining practical steps for discovering research ideas, navigating the literature, focusing on LLM serving systems, and emphasizing collaboration to help newcomers succeed in academia.

LLM serving · academic journey · research methodology
9 min read
Jan 2, 2025 · Artificial Intelligence

Unlocking Multimodal RAG: From Semantic Extraction to Scalable VLM Solutions

This article examines the implementation paths and future prospects of multimodal Retrieval‑Augmented Generation, covering semantic extraction, transformer‑based OCR, visual language models, scaling challenges, tensor indexing, and practical evaluations with tools like Infinity and ColPali.

AI retrieval · Document Understanding · Infinity Database
12 min read