NewBeeNLP
Author · Always insightful, always fun

119 Articles · 0 Likes · 1 View · 0 Comments
Recent Articles

Latest from NewBeeNLP (showing up to 100 recent articles)
NewBeeNLP
Sep 9, 2024 · Artificial Intelligence

Can Real‑Time Learning at Serving Time Transform Recommendation Re‑ranking?

This article introduces LAST, a novel online learning approach that updates recommendation models instantly at serving time, addressing real‑time learning challenges, re‑ranking complexities, and demonstrating superior offline and online performance in industrial e‑commerce scenarios.

AI · LAST · Online Learning
0 likes · 12 min read
NewBeeNLP
Sep 5, 2024 · Artificial Intelligence

Why RLHF Is Irreplaceable: Uncovering the Limits of SFT

The article analyzes why supervised fine‑tuning (SFT) cannot replace reinforcement learning from human feedback (RLHF), highlighting SFT's lack of negative feedback and backward‑looking capability, and explains how RLHF’s reward model addresses these fundamental shortcomings.

RLHF · SFT · Training Methods
0 likes · 7 min read
NewBeeNLP
Sep 3, 2024 · Industry Insights

Why Pre‑training Teams Boost New Engineers’ Skills Faster Than SFT Teams

This Q&A answer argues that joining a pre‑training team accelerates a newcomer’s engineering abilities through hands‑on work with large‑scale data pipelines, distributed training code, and debugging, whereas SFT teams focus mainly on data labeling, making pre‑training the more effective path for rapid skill growth.

AI · Engineering Skills · SFT
0 likes · 6 min read
NewBeeNLP
Sep 2, 2024 · Artificial Intelligence

Boosting Large Language Model Math Reasoning: Mixed Instructions, Synthetic Data, and Training Optimizations

This article presents a comprehensive technical walkthrough on enhancing large language model mathematical reasoning by reviewing model architectures, introducing mixed CoT‑PoT instructions, generating and filtering synthetic data, and applying multi‑stage training optimizations such as RFT, PPO, and DPO, with detailed experimental results and Q&A insights.

AI · Reward Model · Large Language Models
0 likes · 17 min read
NewBeeNLP
Aug 22, 2024 · Artificial Intelligence

How to Fine‑Tune GPT‑4o for Free: Costs, Steps, and Real‑World Benchmarks

OpenAI has launched low‑cost fine‑tuning for GPT‑4o, offering free daily training tokens, a simple dashboard workflow, and early benchmark results that show significant performance gains, while the community debates the merits of fine‑tuning versus prompt‑caching for efficient AI applications.

AI Benchmarks · Fine-tuning · GPT-4o
0 likes · 6 min read
NewBeeNLP
Aug 15, 2024 · Industry Insights

Decoding Xiaohongshu’s Decentralized Recommendation: Side Information and Multimodal Fusion

This article analyzes how Xiaohongshu addresses the decentralization challenge in its recommendation system by strengthening side‑information usage, integrating multimodal signals across the full pipeline, and implementing interest exploration and protection mechanisms, while also outlining future research directions such as generative recommendation and large‑model‑driven user profiling.

decentralized-distribution · graph · interest-exploration
0 likes · 25 min read
NewBeeNLP
Aug 7, 2024 · Artificial Intelligence

Can Intuitive Fine‑Tuning Replace Expensive RLHF and DPO for LLM Alignment?

This article analyzes the shortcomings of current large language model training methods such as SFT, RLHF, and DPO, explains why they incur high data and compute costs, and introduces Intuitive Fine‑Tuning (IFT) with temporal residual connections as a cheaper yet effective alternative that better aligns training objectives with real generation tasks.

DPO · Intuitive Fine-Tuning · LLM
0 likes · 15 min read
NewBeeNLP
Aug 5, 2024 · Industry Insights

How Alibaba Cloud Scales Search Recommendations with Big Data, AI, and LLMs

This article details Alibaba Cloud's end‑to‑end architecture for search and advertising recommendation, covering the data platform, AI services, feature‑store design, training and inference optimizations, and the integration of large language models for new recommendation scenarios.

AI Platform · Alibaba Cloud · RAG
0 likes · 17 min read
NewBeeNLP
Aug 3, 2024 · Artificial Intelligence

Extending LLM Context to 1M Tokens: SAMBA, CoPE, RoPE, Retrieval Heads & Infini‑Attention

This article reviews recent research on extending large language model context windows to millions of tokens, covering SAMBA's hybrid architecture, Contextual Position Encoding (CoPE), RoPE base length theory, Retrieval Head analysis, and the memory‑efficient Infini‑Attention mechanism.

Efficient Attention · LLM Research · Large Language Models
0 likes · 10 min read