NewBeeNLP

Always insightful, always fun

119 Articles · 0 Likes · 1 View · 0 Comments

Recent Articles
Mar 18, 2025 · Interview Experience

How to Ace Multimodal Model Interviews at Taobao's Search AI Division

This article recounts a three‑stage interview for a multimodal large‑model position at Taobao's Search AI division, detailing typical questions on CLIP, LoRA, BLIP, Qwen‑VL, Transformer fundamentals, RLHF, and coding challenges, and offers insights on what interviewers focus on.

AI · CLIP · LoRA
5 min read
Mar 14, 2025 · Artificial Intelligence

How Open‑Sora 2.0 Achieves SOTA Video Generation with Only $200K Training Cost

Open‑Sora 2.0 is an open‑source 11B‑parameter video generation model that matches commercial SOTA performance while being trained on 224 GPUs for just $200,000, thanks to a 3D auto‑encoder, MMDiT architecture, aggressive data filtering, low‑resolution pre‑training, and highly optimized parallel training techniques.

AI model · MMDiT · Open-Sora
9 min read
Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts
18 min read
Feb 27, 2025 · Industry Insights

How DeepSeek’s Open‑Source Tools Exploit China‑Specific H800 GPUs to Boost AI Performance

The article analyzes DeepSeek’s three open‑source projects—FlashMLA, DeepEP, and DeepGEMM—showing how they optimize for the China‑only NVIDIA H800 GPU, contrast this with the abundant hardware resources of Western AI firms, and highlight the growing demand for talent that masters both AI models and GPU hardware.

AI hardware · DeepEP · DeepGEMM
7 min read
Feb 23, 2025 · Industry Insights

What I Learned After a Year Building Large Language Models: Wins, Losses, and Future Trends

A year after taking a pay cut to join a startup focused on large-model research, I reflect on the early uncertainty of exponential growth, the challenge of competing with AI giants, evolving career paths, emerging industry trends, and how balancing work with family shaped my perspective on long-term success.

AI industry · AI trends · Career Reflection
11 min read
Feb 21, 2025 · Artificial Intelligence

Do Scaling Laws Still Hold? Analyzing Grok‑3, DeepSeek, and LLM Training Trends

The article examines whether pre‑training scaling laws remain valid, compares Grok‑3’s architecture and training strategy with DeepSeek’s models, and explores how different scaling approaches (pre‑training, RL‑based, and test‑time) affect the cost‑effectiveness and intelligence of large language models.

AI research · Grok-3 · scaling laws
11 min read
Jan 17, 2025 · Artificial Intelligence

Unlocking Multimodal Intelligence: A Deep Dive into Next Token Prediction

This comprehensive survey examines the foundations, tokenization techniques, model architectures, training paradigms, evaluation benchmarks, and open challenges of multimodal next‑token prediction (MMNTP), offering researchers a clear roadmap for future advances in multimodal AI.

Next Token Prediction · Training Paradigms · evaluation
9 min read
Jan 14, 2025 · R&D Management

How to Kickstart Your CS Research Journey and Find LLM Serving Ideas

The author shares a candid reflection on the first half year of entering computer‑science research, outlining practical steps for discovering research ideas, navigating the literature, focusing on LLM serving systems, and emphasizing collaboration to help newcomers succeed in academia.

LLM serving · academic journey · research methodology
9 min read
Jan 2, 2025 · Artificial Intelligence

Unlocking Multimodal RAG: From Semantic Extraction to Scalable VLM Solutions

This article examines the implementation paths and future prospects of multimodal Retrieval‑Augmented Generation, covering semantic extraction, transformer‑based OCR, visual language models, scaling challenges, tensor indexing, and practical evaluations with tools like Infinity and ColPali.

AI retrieval · Document Understanding · Infinity Database
12 min read