Tagged articles
1025 articles
DataFunTalk
Sep 15, 2025 · Artificial Intelligence

How AI+Data Agents Are Transforming the Automotive Industry’s Digital Leap

In an interview, Di Xingxing of Autohome details their AI+Data framework—unified lake‑warehouse, intelligent engine, and agent services—that breaks data silos, blends traditional models with LLMs, leverages causal inference and RAG knowledge bases, and uses continuous feedback to build explainable, evolving data agents for accurate sales forecasting, competitive analysis, and end‑to‑end business automation in the automotive industry.

Large Language Models · RAG · ai
10 min read
DataFunSummit
Sep 14, 2025 · Artificial Intelligence

How AI is Revolutionizing Chemistry and Drug Discovery: From Data to Breakthroughs

This article explores how AI-driven models and data pipelines are transforming the chemistry and pharmaceutical sectors by accelerating drug design, improving protein‑antibody predictions, automating patent data extraction, and outlining future goals for end‑to‑end AI‑enabled scientific discovery.

AI for Science · Chemistry AI · Large Language Models
13 min read
Alibaba Cloud Developer
Sep 12, 2025 · Operations

How to Build End‑to‑End Observability for Large‑Model Applications on Alibaba Cloud

This guide explains how to design and implement a complete observability solution for large‑model AI services on Alibaba Cloud, covering architecture, core metrics, logging standards, demo code, log collection, dashboard design, alerting, monitoring tools, troubleshooting SOPs, and recovery procedures.

AI Operations · Alibaba Cloud · Large Language Models
21 min read
Fun with Large Models
Sep 12, 2025 · Artificial Intelligence

When to Choose Model Fine‑Tuning vs RAG for Large‑Model Engineering Interviews

The article explains the technical background and suitable scenarios for Retrieval‑Augmented Generation (RAG) and model fine‑tuning, compares their strengths, discusses how they can be combined, and provides interview‑style Q&A on their capabilities, risks, and differences from model distillation.

AI Interview · Fine‑Tuning · Large Language Models
7 min read
AI2ML AI to Machine Learning
Sep 11, 2025 · Industry Insights

Key Takeaways from Asset Management Leaders on Large‑Model AI at the Bund Conference

The article compiles senior asset‑management executives' perspectives on applying large‑model AI—covering vertical versus generic models, integration strategies, talent and cost considerations, innovative C2C development, AI‑native platforms, and the practical challenges of using LLMs in investment research.

AI applications · C2C development · Large Language Models
5 min read
Baidu Geek Talk
Sep 10, 2025 · Artificial Intelligence

How to Cut Through the LLM SOTA Hype: Practical Evaluation Strategies for 2025

Amid the 2025 surge of large language models, this article demystifies misleading SOTA claims, critiques benchmark reliability, and presents a comprehensive, business‑focused evaluation framework—including dataset construction, metric selection, automated scoring, and practical guidelines—to help developers and product teams choose the right model for real‑world applications.

AI benchmarking · LLM-as-judge · Large Language Models
18 min read
Baobao Algorithm Notes
Sep 10, 2025 · Artificial Intelligence

Qwen3-Next Unveiled: Sparse MoE, Hybrid Attention & Multi‑Token Prediction

A recent Hugging Face pull request reveals Alibaba’s upcoming Qwen3‑Next series, highlighting its extreme‑context, parameter‑efficient design that combines a 1:50 high‑sparsity MoE, a hybrid attention architecture mixing gated attention with Gated DeltaNet, and a Multi‑Token Prediction technique, promising ten‑fold throughput gains for 32K‑plus token contexts.

AI Architecture · Large Language Models · Multi-token Prediction
8 min read
DataFunSummit
Sep 9, 2025 · Artificial Intelligence

How Baidu’s GRAB Model Uses Scaling Laws to Transform Ad Ranking

This article explains Baidu's generative ranking model GRAB, detailing how scaling laws from large language models inspire a new recommendation paradigm, the model's architecture, custom attention mechanisms, training strategies, deployment optimizations, and the resulting business gains in CTR and revenue.

Baidu · CTR prediction · Large Language Models
22 min read
JD Cloud Developers
Sep 9, 2025 · Artificial Intelligence

How JD’s PODM‑MI Framework Revolutionized E‑commerce Search Ranking

This article recounts a JD engineer’s journey from theory to practice, detailing the development of the PODM‑MI re‑ranking framework, its three‑layer distribution‑based design, the discovery of a novel SID bottleneck, and the resulting multi‑million‑order impact validated at SIGIR 2024.

E-commerce AI · Large Language Models · SIGIR
8 min read
DataFunSummit
Sep 8, 2025 · Artificial Intelligence

How High‑Quality Inference Data Is Powering the Next AI Revolution

This article explores how high‑quality inference data has become a new paradigm driving AI breakthroughs, detailing Ant Group's research on inference data paradigms, financial‑sector applications, intelligent labeling and quality inspection, and the AIGD AI data synthesis platform, followed by a technical Q&A.

AI data · AIGD · Financial AI
11 min read
DaTaobao Tech
Sep 8, 2025 · Artificial Intelligence

How to Make Large Language Models Understand Third‑Party Java Packages: From Failure to Success

This article explains why AI coding assistants like Cursor and Claude fail to read external Java libraries, explores naive "feed‑the‑code" tricks, evaluates built‑in IDE tools, and ultimately presents a robust solution using a local decompilation pipeline (MCP) that lets LLMs query class definitions and generate correct backend code.

AI code generation · Java decompilation · Large Language Models
19 min read
DataFunTalk
Sep 8, 2025 · Artificial Intelligence

When Claude Leaves China: How Domestic AI Models Are Rising to Fill the Gap

Anthropic's new ban on Claude for Chinese‑controlled firms forces developers to seek home‑grown alternatives, prompting a deep dive into Claude's strengths, the rapid rise of Chinese large‑language models, and the gaps that still separate them from the world‑leading offering.

AI Safety · AI models · Chinese AI
11 min read
Bighead's Algorithm Notes
Sep 5, 2025 · Artificial Intelligence

Weekly Quantitative Finance Paper Digest (Aug 30 – Sep 5, 2025)

This digest reviews four recent AI‑driven finance papers: a robust MCVaR portfolio optimizer with ellipsoidal support and RKHS uncertainty, a PPO‑based adaptive weighting system for LLM‑generated alphas, an empirical comparison of price‑based, GICS‑based, and LLM‑embedding stock clustering, and a diffusion‑model approach that generates future financial chart images from current charts and text prompts.

Diffusion Models · Large Language Models · Quantitative Finance
9 min read
ShiZhen AI
Sep 5, 2025 · Artificial Intelligence

Andrew Ng Highlights Core AI Engineer Skills Amidst Major AI Industry Updates

The article reports that ChatGPT now supports branch conversations, Anthropic restricts service use in certain regions, Andrew Ng outlines essential AI engineer capabilities such as AI‑assisted software building, prompting and agentic workflows, and highlights the market demand, while also covering the Kimi K2 model upgrade, Hugging Face’s FineVision dataset release, and Google’s AI‑driven Deep Loop Shaping method published in *Science*.

AI Engineering · AI Safety · AI for astronomy
8 min read
Instant Consumer Technology Team
Sep 5, 2025 · Artificial Intelligence

Why Context Engineering Is the Next Frontier for Large Language Models

This article surveys over 1,400 papers to define context engineering as a systematic discipline that structures retrieval, memory, tools, and multi‑agent coordination for LLMs, highlighting the critical asymmetry between understanding long contexts and generating equally complex outputs.

Context Engineering · LLM evaluation · Large Language Models
8 min read
DataFunSummit
Sep 4, 2025 · Artificial Intelligence

Unlocking Multi‑Agent AI: How Ant Group’s agentUniverse Transforms Financial Services

The article explores Ant Group’s agentUniverse team’s experience applying multi‑agent technology in finance, covering background on large language models, the agentUniverse framework, real‑world implementations, and the advantages of coordinated multi‑agent collaboration for complex analytical and decision‑making tasks.

AI Collaboration · Financial AI · Large Language Models
4 min read
Amap Tech
Sep 4, 2025 · Artificial Intelligence

How Hierarchical Sampling Boosts Self‑Taught Reasoning in LLMs

HS‑STAR introduces a three‑stage hierarchical sampling framework that identifies high‑utility boundary problems, reallocates computation budget to them, and fine‑tunes large language models, achieving significant accuracy gains on math reasoning benchmarks without extra sampling cost.

HS-STAR · Hierarchical Sampling · Large Language Models
10 min read
Data Party THU
Sep 3, 2025 · Artificial Intelligence

Exploring Multimodal Generative AI: A Tsinghua Tutorial at IJCAI 2025

This article introduces a 1.5‑hour tutorial presented by Tsinghua researchers at IJCAI 2025, covering the latest advances in multimodal generative AI, including multimodal large language models, diffusion models, post‑training generalization techniques, and unified understanding‑generation frameworks.

Diffusion Models · Generative Models · IJCAI 2025
5 min read
AI2ML AI to Machine Learning
Sep 2, 2025 · Artificial Intelligence

Why Enterprise Large‑Model Digitalization Is So Hard: Key Challenges and Capabilities

The article analyzes why enterprise‑wide large‑model AI projects face steep hurdles, outlining required human capabilities, historical labor shifts, current hot technologies such as RAG, Agent, CoT and multimodal, their limits, a three‑stage implementation roadmap, typical case pitfalls, and the key success factors for sustainable digital transformation.

CoT · Digital Transformation · Enterprise AI
15 min read
Amap Tech
Sep 2, 2025 · Artificial Intelligence

How Pos2Distill Eliminates Positional Bias in Large Language Models

This article introduces Pos2Distill, a novel knowledge‑distillation framework that transfers capabilities from advantageous to disadvantaged positions in large language models, effectively mitigating positional bias and improving performance on long‑text retrieval and in‑context reasoning tasks.

Large Language Models · in-context reasoning · knowledge distillation
10 min read
Alibaba Cloud Developer
Sep 2, 2025 · Artificial Intelligence

Turning Large Language Models into Business Results: Alibaba Cloud’s Playbook

In this talk, Alibaba Cloud CIO Jiang Linquan shares how his team systematically tackled organizational, technical, and operational challenges to deploy large‑language‑model applications across dozens of enterprise scenarios, presenting real‑world case studies, a RIDE methodology, and practical metrics for success.

Case Studies · Digital Transformation · Enterprise AI
36 min read
DataFunSummit
Aug 28, 2025 · Artificial Intelligence

Why Finance Needs Its Own Large Language Model: Insights from Du Xiaoman

This article explains how the unique data‑driven, knowledge‑intensive, and complex nature of the financial industry makes large language models especially valuable, outlines the limitations of generic models, and shows how domain‑specific, cost‑effective models can deliver superior performance for finance.

Large Language Models · Model Training · ai
5 min read
Architects' Tech Alliance
Aug 26, 2025 · Artificial Intelligence

How DeepSeek‑V3.1’s New FP8 Precision Supercharges Domestic Chip Performance

DeepSeek‑V3.1 introduces the UE8M0 FP8 Scale precision, cutting memory usage by up to 75% and enabling next‑generation Chinese chips such as Ascend 910B to run 128K context models efficiently, while the ecosystem rapidly adopts FP8, yet challenges in IP autonomy and software maturity remain before global competitiveness is achieved.

AI hardware · DeepSeek · Domestic Chips
10 min read
JD Tech
Aug 25, 2025 · Artificial Intelligence

How JD’s Large‑Model Tools are Shaping AI in Enterprise: Insights & Roadmap

JD’s recent technical salon reveals the rapid evolution of large‑model tools, detailing industry trends, JD’s JoyAI ecosystem—including JoyAgent, OxyGent and JoyCode—real‑world applications across office, code review, logistics and local services, and future policy and multi‑agent visions.

AI applications · AI tools · Enterprise AI
13 min read
Architecture and Beyond
Aug 24, 2025 · Artificial Intelligence

Why Master‑Slave Architecture Powers Modern Multi‑Agent AI Systems

The article explains how the master‑slave (or manager‑worker) architecture, inspired by both software micro‑services and biological systems, solves context fragmentation and coordination challenges in large‑model multi‑agent applications, detailing design principles, technical implementations, advantages, limitations, and suitable use cases.

AI coordination · Large Language Models · Multi-Agent
15 min read
Wu Shixiong's Large Model Academy
Aug 23, 2025 · Artificial Intelligence

Why LoRA, QLoRA, Prompt & Prefix Tuning Are Changing Large‑Model Fine‑Tuning

This article explains the mathematical basis of LoRA, compares it with QLoRA, Prompt Tuning, Prefix Tuning and P‑tuning, shows practical PyTorch implementations, and provides mixed‑precision training tips so readers can choose the most memory‑efficient fine‑tuning method for their large language models.

Large Language Models · LoRA · Prompt Tuning
17 min read
DataFunSummit
Aug 23, 2025 · Artificial Intelligence

Mastering Role‑Playing AI Agents: Challenges, Techniques, and Future Directions

This article surveys the latest research on role‑playing AI agents, covering their definition, core components, application scenarios, three main challenges—role fidelity, long‑term memory, and evaluation—and presents four technical approaches for each challenge along with future research directions and references.

AI agents · Large Language Models · Memory
22 min read
JD Retail Technology
Aug 22, 2025 · Artificial Intelligence

How JD’s Open‑Source Large‑Model Tools Are Shaping the Future of Enterprise AI

This article explores the rapid evolution of large‑model AI tools, outlines JD’s open‑source solutions such as JoyAI, JoyAgent, OxyGent and JoyCode, and examines real‑world applications, design principles, policy considerations, and future directions for AI agents and embodied intelligence.

AI applications · AI policy · Enterprise AI
12 min read
JD Tech Talk
Aug 20, 2025 · Artificial Intelligence

How Large AI Models Are Transforming Software Testing

This article explains what large AI models are, how they enhance capabilities across domains, and details their practical use in software testing—covering code review, automated test case generation, security and performance checks—while envisioning future impacts on manual testing efficiency.

AI in QA · Large Language Models · Software Testing
4 min read
Data Party THU
Aug 20, 2025 · Artificial Intelligence

How Dual‑Granularity Prompting Boosts Graph‑Enhanced LLMs for Fraud Detection

The article analyzes the Dual Granularity Prompting (DGP) framework, which mitigates information overload in graph‑enhanced large language models for fraud detection by applying fine‑grained processing to target nodes and coarse‑grained summarization to neighbors, achieving superior accuracy and token efficiency across multiple public and industrial datasets.

Large Language Models · dual granularity prompting · fraud detection
6 min read
Kuaishou Large Model
Aug 19, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO

Klear-Reasoner, built on Qwen3‑8B‑Base, introduces the Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm to overcome traditional clip limitations, achieving state‑of‑the‑art performance on AIME2024/2025 and LiveCodeBench while providing detailed experimental analysis and data‑quality insights.

GPPO · Large Language Models · Reinforcement Learning
11 min read
Alibaba Cloud Developer
Aug 18, 2025 · Artificial Intelligence

Mastering Claude Prompt Engineering: 9 Proven Strategies to Boost LLM Performance

This guide systematically breaks down Anthropic's official prompt‑engineering recommendations—clear instructions, multishot examples, chain‑of‑thought prompting, XML structuring, response pre‑filling, prompt chaining, long‑context handling, extended thinking, and practical code snippets—showing how to unlock Claude's full potential across complex tasks.

Chain-of-Thought · Claude · Large Language Models
15 min read
Fighter's World
Aug 15, 2025 · Artificial Intelligence

Why GPT‑5 Is Still Far From AGI Yet Near Scalable Profitability

The article analyzes GPT‑5’s release, its unified multi‑model architecture with a real‑time router, improved reasoning, coding and tool‑use capabilities, reduced hallucinations, and how these technical shifts reshape AI commercialization, investment logic, competition and enterprise adoption.

AI commercialization · Agentic AI · GPT-5
20 min read
Data Party THU
Aug 14, 2025 · Artificial Intelligence

How FilterLLM Turns One LLM Pass into Billion‑User Cold‑Start Recommendations

The article analyzes the FilterLLM approach, which augments a frozen LLM with billions of learnable user tokens to predict a full‑user interaction probability distribution in a single forward pass, dramatically speeding up cold‑start recommendation while preserving recommendation quality across multiple benchmarks.

FilterLLM · LLM · Large Language Models
8 min read
Data Party THU
Aug 13, 2025 · Artificial Intelligence

How Large Language Models Are Revolutionizing Automated Scholarly Paper Review

This survey examines the rapid rise of large language models in automated scholarly paper review (ASPR), analyzing model types, technical breakthroughs such as long‑text, multimodal, and multi‑turn capabilities, new generation methods, datasets, open‑source tools, current challenges, publisher policies, and future research directions.

ASPR · Large Language Models · Multimodal AI
19 min read
AI Info Trend
Aug 13, 2025 · Industry Insights

How China’s AI Labs Are Closing the Gap with the US in Q2 2025

The Q2 2025 State of AI report analyzes Chinese AI labs’ rapid progress across language models, open‑source weights, and multimodal generation, showing a shrinking performance gap with US leaders, detailed benchmark scores, ecosystem classifications, and emerging competitive dynamics.

China · Industry Analysis · Large Language Models
10 min read
AI Info Trend
Aug 11, 2025 · Industry Insights

What Q2 2025 Reveals About the AI Landscape: Key Trends and Model Rankings

The Q2 2025 State of AI Highlights Report analyzes benchmark data, model performance, and market dynamics, revealing five major industry trends, the rise of AI agents, rapid advances in language, vision, and speech models, and shifting hardware acceleration strategies that shape the future of artificial intelligence.

AI agents · Large Language Models · ai
11 min read
Data Party THU
Aug 11, 2025 · Artificial Intelligence

Can Hidden Signals Reveal Multimodal Model Jailbreaks? Introducing HiddenDetect

This article presents HiddenDetect, a training‑free method that leverages refusal‑semantic vectors and layer‑wise activation analysis to detect jailbreak attempts in multimodal large language models, revealing distinct safety signals across text and image modalities and demonstrating strong performance on several LVLM benchmarks.

LVLM · Large Language Models · Multimodal
7 min read
Alibaba Cloud Big Data AI Platform
Aug 8, 2025 · Artificial Intelligence

Unlocking Big Data Ops with Large Models: Opportunities, Challenges, Design

This article summarizes a Cloud Summit talk where Alibaba Cloud’s AI expert Zhang Yingying explains how large language models can enhance big‑data intelligent operations, covering opportunities, challenges, RAG‑based Q&A, multi‑agent diagnostics, and the engineering architecture needed for reliable, scalable deployment.

AI Engineering · Big Data Operations · Large Language Models
20 min read
Alibaba Cloud Big Data AI Platform
Aug 8, 2025 · Artificial Intelligence

What Von Neumann’s Brain Theory Reveals About Prompt Engineering for LLMs

The article explores how Von Neumann’s insights on the brain‑computer analogy illuminate modern large‑language‑model prompt engineering, comparing logical reasoning chains, memory mechanisms, and DSL‑driven computation to improve accuracy, reduce hallucinations, and balance reasoning depth with precise calculation.

DSL · Large Language Models · Prompt engineering
14 min read
AI Frontier Lectures
Jul 31, 2025 · Artificial Intelligence

What’s Driving the Latest LLM Architecture Trends? DeepSeek, OLMo, Gemma, and More Explained

This article examines the evolution of large language model architectures over the past seven years, comparing key design choices such as Multi‑Head Latent Attention, Grouped‑Query Attention, Mixture‑of‑Experts, sliding‑window attention, normalization placement, and optimizer variants across models like DeepSeek V3, OLMo 2, Gemma 3, Llama 4, Qwen 3, SmolLM 3, and Kimi K2.

AI research · LLM comparison · Large Language Models
30 min read
Data Thinking Notes
Jul 30, 2025 · Artificial Intelligence

Tracing the Evolution of Large Language Models: Key Papers and Breakthroughs

This article reviews the most influential papers in large language model research since 2017, covering foundational works such as the Transformer, GPT‑3, BERT, scaling laws, and recent innovations like FlashAttention, Mamba, and QLoRA, highlighting their core contributions and impact on AI development.

AI research · Large Language Models · Model Optimization
28 min read
JD Tech
Jul 29, 2025 · Artificial Intelligence

How Causal Inference Meets Large Language Models to Revolutionize E‑commerce Pricing

This article describes a QCon talk that combines causal inference with large language models to build a retrieval‑augmented generation pricing system for e‑commerce, detailing the three‑step algorithm, LLM‑driven modeling challenges, process‑reward tree search, reinforcement‑learning fine‑tuning, and experimental gains in accuracy and speed.

Large Language Models · Reinforcement Learning · Retrieval Augmented Generation
17 min read
FunTester
Jul 29, 2025 · Artificial Intelligence

Why AI Hallucinations Happen and How Test Engineers Can Reset Conversations

AI-generated content can contain hallucinations (misleading or illogical answers), especially during lengthy testing dialogues. Causes include context overload, limited training data, ambiguous prompts, and the model’s creative tendencies. Resetting the conversation with a new session and a proper handoff can dramatically improve accuracy and efficiency for software test engineers.

AI hallucination · Large Language Models · Prompt engineering
10 min read
AI Algorithm Path
Jul 27, 2025 · Artificial Intelligence

Understanding RLHF: How Human Feedback Trains Modern LLMs

This article explains the RLHF (Reinforcement Learning from Human Feedback) pipeline that powers ChatGPT and other large language models, covering the limitations of traditional fine‑tuning, the creation of human‑feedback datasets, reward‑model training, loss design, and the final PPO‑based fine‑tuning step.

ChatGPT · Human Feedback · Large Language Models
8 min read
AI Info Trend
Jul 24, 2025 · Industry Insights

What’s Driving AI Adoption in 2025? Six Key Trends Uncovered

The AI Adoption Survey H1 2025 reveals that nearly half of organizations have deployed AI in production, engineering and R&D lead usage, Chinese LLMs gain overseas interest, and cost, reliability and intelligence remain the top challenges, while tool preferences and multimodal trends reshape the market.

AI Infrastructure · AI adoption · AI trends
7 min read
AI2ML AI to Machine Learning
Jul 24, 2025 · Artificial Intelligence

Exploring Recent Large‑Model Agent Papers: Insights and Analyses

This article reviews a series of recent research papers on large‑model agents, covering topics such as reinforcement‑learning‑driven ML agents, premise‑critique ability of LLMs, long‑term tool‑augmented LLM evaluation, agentic RAG, set‑based retrieval for multi‑hop QA, mobile VLM agents, and broader surveys of LLM applications, summarizing each work’s problem statement, prior approaches, novel contributions, experimental results, limitations, and future directions.

Agentic AI · LLM evaluation · Large Language Models
46 min read
Alibaba Cloud Big Data AI Platform
Jul 23, 2025 · Artificial Intelligence

How to Distill Large Language Models for Efficient Text Generation with EasyDistill

This guide explains how to use the EasyDistill framework and Alibaba Cloud PAI to distill large language models for high‑quality text generation, covering model deployment, SFT and DPO training data construction, code examples, configuration files, and best practices for achieving resource‑efficient, high‑performance student models.

DPO · EasyDistill · Large Language Models
14 min read
Alibaba Cloud Big Data AI Platform
Jul 23, 2025 · Artificial Intelligence

Unlock Efficient LLMs: How Alibaba’s PAI EasyDistill Powers Model Post‑Training

This article explains how Alibaba Cloud's AI platform PAI leverages the EasyDistill framework for post‑training model optimization, covering knowledge distillation concepts, data synthesis techniques, basic and advanced distillation training, the DistilQwen model family, real‑world customer cases, and step‑by‑step practical demos.

AI Platform · EasyDistill · LLM optimization
12 min read
Tencent Cloud Developer
Jul 23, 2025 · Artificial Intelligence

Why Retrieval‑Augmented Generation Is Evolving Into Agentic AI Search

This article explains how the inherent knowledge limits of large language models drive the rise of Retrieval‑Augmented Generation (RAG), outlines its three evolutionary stages, introduces Agentic RAG and DeepSearch, and discusses the knowledge and ability boundaries that shape future AI search systems.

AI search · Agentic AI · DeepSearch
19 min read
Alibaba Cloud Developer
Jul 21, 2025 · Artificial Intelligence

Unlocking LLM Power: How Context Engineering Transforms AI Assistants

Context engineering, the emerging discipline of structuring and managing input information for large language models, goes beyond simple prompt design by addressing issues such as context poisoning, overload, and conflict, offering strategies like intelligent retrieval, isolation, pruning, and compression to build reliable, high‑performing AI agents.

AI productivity · Agent Design · Context Engineering
19 min read
DataFunTalk
Jul 21, 2025 · Artificial Intelligence

From Prompt Engineering to Context Engineering: Transforming LLM Interactions

This article traces the evolution from prompt engineering to context engineering, detailing technical milestones, core concepts, practical strategies, and future trends that together reshape large language model applications and enable sophisticated AI agents across diverse domains.

Large Language Models · Memory Management · Prompt engineering
0 likes · 35 min read
Data Thinking Notes
Jul 20, 2025 · Artificial Intelligence

Mastering Context Engineering: Boost LLM Performance with Advanced Techniques

This article introduces context engineering, a new discipline for optimizing large language model inputs. It discusses expanded context windows, compares context engineering with prompt engineering, outlines core techniques such as information organization, dynamic management, and semantic retrieval, and offers practical applications and recommendations for improving AI performance across domains.

AI Optimization · Large Language Models · Prompt engineering
0 likes · 11 min read
Fun with Large Models
Jul 17, 2025 · Artificial Intelligence

How to Integrate Large Models with LangChain: A Step‑by‑Step Tutorial

This tutorial explains LangChain's core modules and three‑layer architecture, shows how to set up a Python environment, and provides concrete code examples for connecting SiliconFlow Qwen3‑8B and DeepSeek models via the init_chat_model API, including result inspection and references to official documentation.

DeepSeek · LangChain · Large Language Models
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Jul 16, 2025 · Artificial Intelligence

ChunkFlow: Accelerating Long‑Context Model Fine‑Tuning Up to 4.5× Faster

The paper introduces ChunkFlow, an efficient training framework for variable‑length and ultra‑long sequence datasets that powers Qwen models, achieving up to 4.53× speedup over Megatron‑LM and more than 2× overall performance gains by reorganizing data into fixed‑size chunks and employing a state‑aware scheduler.

AI Performance · ChunkFlow · Distributed Training
0 likes · 7 min read
DataFunTalk
Jul 16, 2025 · Artificial Intelligence

How Jason Wei’s Breakthroughs Are Shaping the Future of Large Language Models

Jason Wei, a former Google Brain and OpenAI researcher now at Meta, has driven key advances in large language models—including chain‑of‑thought prompting, instruction tuning, emergent abilities, zero‑shot learning, and data augmentation—shaping both AI research paradigms and real‑world applications.

Chain-of-Thought · Instruction Tuning · Large Language Models
0 likes · 7 min read
DataFunTalk
Jul 16, 2025 · Artificial Intelligence

MiniMax-M1 Revealed: Hybrid Attention, RL Training, and 1M Token Context

MiniMax’s latest M1 model, unveiled after a $300 million funding round, showcases a 456‑billion‑parameter mixture‑of‑experts architecture with lightning attention, supports a context of up to one million tokens, and leverages reinforcement‑learning techniques to enhance long‑context handling, inference efficiency, and System 2 reasoning capabilities.

AI scaling · Large Language Models · Model architecture
0 likes · 16 min read
DataFunSummit
Jul 15, 2025 · Artificial Intelligence

Unlocking Semantic Search: Elasticsearch Vector Search & RAG Applications

This article explains why traditional keyword search falls short, introduces Elasticsearch's vector search and hybrid retrieval capabilities, and shows how combining it with large language models enables Retrieval‑Augmented Generation (RAG) for more accurate, context‑aware AI-driven search across text and multimedia data.

Elasticsearch · Large Language Models · RAG
0 likes · 5 min read
DataFunTalk
Jul 13, 2025 · Artificial Intelligence

What 2025’s AI API Market Data Reveals About the Future of Large Models

An in‑depth analysis of 2025 H1 OpenRouter token usage shows explosive growth in Q1, highlights Google Gemini’s market dominance, reveals diverse long‑tail demand across domains, and examines shifting API preferences, offering key insights into the evolving landscape of large‑model services.

AI market analysis · API trends · Large Language Models
0 likes · 10 min read
DataFunSummit
Jul 13, 2025 · Artificial Intelligence

How Alibaba Tackles Low-Resource Language Data for Multilingual LLMs

In this interview, Alibaba International’s senior data‑science expert Li Haijun explains the challenges of low‑resource languages for multilingual large models and details a five‑step data‑collection, augmentation, quality‑optimization, engineering, and evaluation framework that powers their cross‑border e‑commerce AI applications.

Large Language Models · ai · low-resource languages
0 likes · 12 min read
AI Frontier Lectures
Jul 11, 2025 · Artificial Intelligence

How Llama Evolved: From Llama‑1 to Llama‑3 – Architecture, Data, and Performance Insights

This article provides a comprehensive technical analysis of Meta's Llama series, tracing the evolution from Llama‑1 through Llama‑2 to Llama‑3, detailing model architectures, training data pipelines, optimization methods, benchmark results, and the broader impact on the open‑source AI community.

AI research · LLaMA · Large Language Models
0 likes · 25 min read
Kuaishou Tech
Jul 10, 2025 · Artificial Intelligence

How MODA’s Modular Duplex Attention Solves Multimodal Attention Imbalance and Boosts Emotion Understanding

The paper introduces MODA, a modular duplex attention multimodal model that addresses severe cross‑modal attention imbalance in existing large multimodal models, proposes a novel attention paradigm and masking scheme, and demonstrates significant performance gains across 21 benchmarks in perception, cognition, and emotion tasks, earning a Spotlight paper at ICML 2025.

Emotion Recognition · Large Language Models · MoDA
0 likes · 13 min read
Nightwalker Tech
Jul 10, 2025 · Artificial Intelligence

Master Prompt Engineering: From Basics to Advanced AI Prompt Techniques

This comprehensive guide introduces Prompt Engineering, explaining its core concepts, why clear prompts matter, and how to craft effective instructions using roles, tasks, requirements, and examples, while covering beginner to advanced techniques such as chain‑of‑thought, self‑correction, and building reusable prompt workflows for AI models.

ChatGPT · Large Language Models · Prompt engineering
0 likes · 29 min read
DataFunSummit
Jul 8, 2025 · Artificial Intelligence

Explore Cutting-Edge AI Knowledge Graphs: From Multimodal GraphRAG to Industry Applications

This article presents a curated catalog of cutting‑edge AI resources, covering multimodal GraphRAG, knowledge‑graph and large‑model integration, financial industry AI products, Chinese‑medicine decision support, AI‑driven knowledge‑graph evolution, private‑domain Q&A pipelines, and emerging trends and standards, with a QR code for the full ebook.

Artificial Intelligence · Document Intelligence · Large Language Models
0 likes · 2 min read
Data Thinking Notes
Jul 6, 2025 · Artificial Intelligence

How Quantization Shrinks Giant AI Models for Edge Devices

This article explains why quantizing massive AI models is essential for deploying them on resource‑constrained devices, outlines core quantization concepts, techniques, and methods, compares their pros and cons, and presents practical application scenarios such as smartphones, autonomous driving, IoT, and edge computing.

AI deployment · Large Language Models · Model Quantization
0 likes · 9 min read
dbaplus Community
Jul 6, 2025 · Artificial Intelligence

Why Build AI Agents? Benefits, Challenges, and Real-World Examples

This article explores the definition of AI agents, examines why they are essential despite challenges like latency and hallucinations, highlights their advantages such as lowered development barriers and workflow simplification, and presents real-world cases and future multi‑agent prospects.

AI agents · Large Language Models · Prompt engineering
0 likes · 25 min read
DataFunTalk
Jul 5, 2025 · Artificial Intelligence

Is AI Turning Human Thought into a Uniform, Safe Echo Chamber?

Recent studies from MIT, Cornell and Santa Clara reveal that reliance on AI tools like ChatGPT reduces brain activity, narrows creative thinking, and drives cultural homogenization, prompting urgent reflection on the trade‑off between efficiency and originality in human expression.

Artificial Intelligence · Cultural Homogenization · Large Language Models
0 likes · 12 min read
Nightwalker Tech
Jul 4, 2025 · Artificial Intelligence

Bypass Membership Limits: Access Overseas LLMs Easily with Chatbox

This guide explains how to overcome domestic membership restrictions and quickly connect to overseas large language models such as ChatGPT, Gemini, Claude, and Grok using the open‑source Chatbox client, covering download, configuration, model selection, and various interaction modes with step‑by‑step screenshots.

AI models · Chatbox · Large Language Models
0 likes · 8 min read
Instant Consumer Technology Team
Jul 3, 2025 · Artificial Intelligence

Why Buying an AI Appliance Is a Strategic Pitfall for Enterprises

Enterprises rushing to purchase DeepSeek AI appliances and smart‑agent platforms often face hidden technical, data, and organizational challenges that turn promised "plug‑and‑play" solutions into costly missteps, highlighting the need for realistic strategy, robust data governance, and continuous capability building.

AI capability building · AI deployment · Data Governance
0 likes · 28 min read
iQIYI Technical Product Team
Jul 3, 2025 · Artificial Intelligence

Three iQIYI AI Papers Break New Ground at ACL 2025 & INTERSPEECH 2025

iQIYI’s AI research team had three papers accepted: two at ACL 2025 (one main‑conference paper and one Findings paper) and one at INTERSPEECH 2025. The papers cover long‑context large language model evaluation, Chinese novel summarization, and efficient Thai speech recognition, with links to each work.

ACL 2025 · AI research · INTERSPEECH 2025
0 likes · 7 min read
AI Frontier Lectures
Jul 2, 2025 · Artificial Intelligence

Can Language Models Self‑Edit? Inside the SEAL Framework for Self‑Adapting LLMs

This article reviews recent AI self‑evolution research and provides an in‑depth analysis of the SEAL (Self‑Adapting Language) framework, which enables large language models to generate and learn from their own synthetic data through a nested reinforcement‑learning and fine‑tuning loop, with experimental results on few‑shot and knowledge‑integration tasks.

Few‑Shot Learning · Large Language Models · Meta Learning
0 likes · 11 min read
DataFunTalk
Jul 2, 2025 · Artificial Intelligence

How Multimodal Large Models Are Revolutionizing Complex Document OCR

In a detailed interview, Zhao Chenyang explains how multimodal large models (VLMs) overcome the limitations of traditional OCR on mixed layouts, table reconstruction, and handwritten text by leveraging self‑supervised pre‑training, lightweight fine‑tuning, and hybrid pipelines that dramatically cut annotation costs and improve recall rates.

AI deployment · Large Language Models · Multimodal AI
0 likes · 13 min read
Tencent Cloud Developer
Jul 2, 2025 · Artificial Intelligence

Big Model Evolution: From Transformers to Enterprise Deployment

This article surveys the rapid evolution of large language models from the Transformer breakthrough to trillion‑parameter capabilities, explains key techniques such as self‑attention, MoE and KV‑Cache, explores practical aspects like temperature tuning, sales AI applications, and compares private versus cloud deployment strategies for enterprises.

Enterprise Deployment · KV cache · Large Language Models
0 likes · 6 min read
DataFunTalk
Jul 1, 2025 · Artificial Intelligence

Will OpenAI Reach ASI First? Dylan Patel’s Bold Prediction

In a candid hour‑long interview, SemiAnalysis founder Dylan Patel predicts OpenAI will be the first to achieve artificial superintelligence (ASI), while dissecting GPT‑4.5’s failure, Meta’s costly AI missteps, Apple’s strategic lag, and the shifting partnership between OpenAI and Microsoft.

AI competition · ASI · Apple
0 likes · 11 min read
Ops Development Stories
Jul 1, 2025 · Artificial Intelligence

From Lean to AIOps: How AI is Transforming Modern Operations

This comprehensive guide walks through the evolution from Lean and Agile practices to DevOps and finally AIOps, explaining core concepts, key algorithms, the role of large language models, RAG‑based root‑cause analysis, and practical implementation steps for intelligent operations.

Large Language Models · Lean · RAG
0 likes · 19 min read
DataFunSummit
Jun 30, 2025 · Artificial Intelligence

How Large Language Models Are Evolving Toward Autonomous Meta‑Learning Agents

This talk reviews the rapid evolution of generative large‑model AI from rule‑based systems to massive pre‑training, examines the current bottlenecks in continual learning and knowledge discovery, and proposes large‑scale meta‑learning, especially in‑context reinforcement learning (ICRL), as a path toward truly autonomous, self‑learning agents.

AI research · Autonomous Agents · Large Language Models
0 likes · 24 min read
DataFunTalk
Jun 30, 2025 · Artificial Intelligence

Wenxin 4.5 Series: Open‑Source Multimodal MoE Models and FastDeploy Guide

The Wenxin 4.5 series introduces ten open‑source models—including large‑scale MoE and dense variants—featuring a novel multimodal heterogeneous architecture, high training efficiency, SOTA benchmark performance, and comprehensive toolkits (ERNIEKit, FastDeploy) for fine‑tuning and multi‑hardware deployment.

ERNIEKit · FastDeploy · Large Language Models
0 likes · 8 min read
DataFunTalk
Jun 29, 2025 · Artificial Intelligence

Large Models Boost Douyin User Experience: Expert Insights

In an interview at the DA Digital Intelligence Conference, ByteDance AI specialist Cai Conghuai explains how large language models, combined with techniques like SFT, DPO, and RAG, are reshaping Douyin's user‑experience signal detection, root‑cause analysis, and evaluation, while outlining future AI‑agent breakthroughs.

DPO · Large Language Models · Multimodal
0 likes · 12 min read
Alibaba Cloud Big Data AI Platform
Jun 26, 2025 · Artificial Intelligence

Master Cloud AI Inference: Load‑Testing Strategies with Alibaba PAI‑EAS

This article explains how Alibaba Cloud’s PAI‑EAS platform enables efficient, scalable AI inference by detailing distributed architecture, serverless resource scheduling, comprehensive load‑testing modes, key performance metrics, and step‑by‑step usage instructions, helping developers optimize latency, throughput, and cost for large language models.

AI inference · Alibaba PAI · Large Language Models
0 likes · 7 min read
Alimama Tech
Jun 25, 2025 · Artificial Intelligence

Introducing ROLL: A Scalable, User‑Friendly RL Framework for Large‑Scale LLM Training

ROLL is an open‑source reinforcement‑learning framework designed for large language model post‑training that combines multi‑task RL, agentic support, flexible algorithm configuration, elastic resource scheduling, and rich observability, delivering significant accuracy gains across benchmarks while remaining easy to use for researchers, product developers, and infrastructure engineers.

AI Framework · Large Language Models · RLHF
0 likes · 11 min read
DeWu Technology
Jun 25, 2025 · Artificial Intelligence

Engineering Large Language Models with Spring AI: From Basics to RAG and Function Calls

This article walks through the fundamentals of large language models, their stateless and structured-output nature, explains how Spring‑AI provides a Java‑friendly API for model integration, covers RAG architecture, the MCP protocol, and demonstrates end‑to‑end code examples for building intelligent agents.

AI integration · Function Calling · Large Language Models
0 likes · 15 min read
ITFLY8 Architecture Home
Jun 24, 2025 · Artificial Intelligence

How Transformers and Mixture-of-Experts Power Large Language Models

This article explores the role of Transformers and Mixture‑of‑Experts in large models, outlines five fine‑tuning methods, compares traditional and agentic RAG, presents classic agent design patterns, text‑chunking strategies, levels of intelligent agent systems, and explains KV‑caching techniques.

Fine-tuning · Large Language Models · Mixture of Experts
0 likes · 2 min read
AsiaInfo Technology: New Tech Exploration
Jun 23, 2025 · Artificial Intelligence

How Generative Data‑Driven Model Distillation Boosts Large‑Model Performance and Cuts Compute

This article examines generative data‑driven model distillation as a technique that not only compresses large language models but also improves their accuracy, addresses data‑privacy constraints, and reduces computational costs, offering a practical roadmap and real‑world results from a corporate AI platform.

AI Optimization · Knowledge Transfer · Large Language Models
0 likes · 22 min read
Programmer Xu Shu
Jun 23, 2025 · Artificial Intelligence

From Bag‑of‑Words to ChatGPT: How Large Language Models Evolved

Tracing the evolution of large language models—from early bag‑of‑words techniques, through word embeddings, RNNs, attention mechanisms, Transformers, BERT, and GPT—this article explains each breakthrough, its limitations, and how they culminated in ChatGPT’s conversational AI.

AI evolution · ChatGPT · Large Language Models
0 likes · 12 min read
Data Thinking Notes
Jun 22, 2025 · Artificial Intelligence

What Powers the Rise of AI Agents? Inside the Tech Behind Agentic AI

This report explores the fundamentals, core technologies, leading platforms, current state, and future outlook of AI Agents and Agentic AI, detailing how large language models and mature infrastructure enable autonomous, reactive, proactive, and adaptive agents, and examines prominent projects such as Manus, Genspark, and Lovart.

AI agents · Agentic AI · Large Language Models
0 likes · 5 min read
DataFunTalk
Jun 22, 2025 · Artificial Intelligence

How Cursor’s CEO Envisions the Future of AI‑Powered Programming

In this interview, Cursor CEO Michael Truell explains the company’s mission to revolutionize coding with AI, discusses the evolution of AI‑assisted development, shares insights on product strategy, scaling challenges, and the broader impact of intent‑driven programming on software engineering.

AI programming · Cursor · Large Language Models
0 likes · 37 min read
AI Algorithm Path
Jun 20, 2025 · Artificial Intelligence

Beginner’s Guide to Visual Language Models – Day 1: What They Are and Why They Matter

This article introduces visual‑language models (VLMs), explaining how they combine large language models with visual encoders, why they overcome the rigidity of traditional computer‑vision systems, their key advantages, modular architecture, training methods, and practical applications such as image captioning and visual question answering.

AI applications · Computer Vision · Large Language Models
0 likes · 8 min read
Xiaohongshu Tech REDtech
Jun 19, 2025 · Artificial Intelligence

Can Adaptive Chain‑of‑Thought Learning Halve LLM Thinking Time?

The article introduces the Think When You Need (TWYN) method, a reinforcement‑learning approach that dynamically adapts chain‑of‑thought length, dramatically cuts redundant token generation in large language models, and maintains or improves accuracy across diverse reasoning benchmarks.

Chain-of-Thought · Large Language Models · Reinforcement Learning
0 likes · 9 min read
Fun with Large Models
Jun 19, 2025 · Artificial Intelligence

How GraphRAG Boosts Answer Accuracy with Knowledge Graphs (Part 1)

This article explains GraphRAG’s architecture, compares it with traditional RAG, and presents experimental results showing that GraphRAG’s knowledge‑graph‑driven retrieval markedly improves answer accuracy, especially on low‑match, multi‑paragraph queries.

GraphRAG · Large Language Models · Performance Evaluation
0 likes · 11 min read
AntTech
Jun 18, 2025 · Artificial Intelligence

How Ant Group’s Baoling Models Push Toward AGI with MoE and Multimodal Innovations

In a detailed AICon talk, Ant Group’s Baoling team leader Zhou Jun outlines their latest large‑model training techniques, MoE architecture optimizations, multimodal breakthroughs, open‑source releases, and the strategic roadmap needed to turn AI into a ubiquitous, “scan‑code‑level” everyday assistant.

AI Infrastructure · Large Language Models · Mixture of Experts
0 likes · 25 min read
Instant Consumer Technology Team
Jun 17, 2025 · Artificial Intelligence

Mastering Fine‑Tuning Datasets: From Basics to Advanced LLM Techniques

This comprehensive guide explains the importance of fine‑tuning datasets for large language models, covering task classification, dataset formats, supervised and instruction tuning, domain adaptation, multimodal data, and practical code examples to help practitioners build effective training, validation, and test sets.

Fine-tuning · Instruction Tuning · Large Language Models
0 likes · 33 min read
Data Thinking Notes
Jun 15, 2025 · Artificial Intelligence

Mastering Fine-Tuning: From Basics to Advanced Techniques for Large Language Models

Fine‑tuning transforms a general‑purpose large language model into a domain‑specific expert by training on a small, labeled dataset, and this guide explains its background, core concepts, technical mechanisms, various methods—including full‑parameter, LoRA, adapters, and prompt tuning—plus practical use cases, advantages, challenges, and best‑practice recommendations.

Adapter · Large Language Models · LoRA
0 likes · 13 min read
ByteFE
Jun 13, 2025 · Artificial Intelligence

How AI Coding Powered a 3‑Day English Learning App: Insights from ByteDance’s TRAE

In a three‑day sprint, ByteDance’s VP Hong Dingkun built an English‑learning app using the AI‑coding platform TRAE, illustrating how large‑model‑driven code completion, natural‑language programming, and AI‑enhanced development can dramatically boost productivity, democratize coding, and push the limits of software intelligence.

AI Coding · ByteDance · Large Language Models
0 likes · 14 min read
Zuoyebang Tech Team
Jun 12, 2025 · Information Security

How AI‑Powered RAG and Agents Are Revolutionizing Enterprise Security Operations

This article explains how the rise of AI large‑model technology and Retrieval‑Augmented Generation (RAG) combined with autonomous AI agents enable a three‑layer network‑boundary defense, address deep operational challenges such as alert overload and response latency, and dramatically improve incident‑response efficiency in large‑scale enterprises.

AI agents · AI security · Large Language Models
0 likes · 16 min read
Open Source Linux
Jun 12, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The Evolution of Large Language Models (2017‑2025)

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, multimodal models, alignment techniques like RLHF, and finally the cost‑efficient DeepSeek‑R1 in 2025, highlighting key innovations, scaling trends, and real‑world impacts.

AI Alignment · Deep Learning · Large Language Models
0 likes · 26 min read
Architects' Tech Alliance
Jun 11, 2025 · Artificial Intelligence

From Transformers to DeepSeek‑R1: The 2017‑2025 Evolution of Large Language Models

This article chronicles the rapid development of large language models from the 2017 Transformer breakthrough through the rise of BERT, GPT‑3, ChatGPT, multimodal systems like GPT‑4V/o, and the recent cost‑efficient DeepSeek‑R1, highlighting key architectural innovations, scaling trends, alignment techniques, and their transformative impact on AI research and industry.

AI Alignment · BERT · Cost‑Efficient Inference
0 likes · 26 min read
DataFunTalk
Jun 9, 2025 · Artificial Intelligence

Can AI Models Pass the Chinese Math Gaokao? A Fair, Objective Test

The author conducts a transparent, objective assessment of several large language models on the 2025 Chinese national math exam, converting all questions to LaTeX, applying strict Gaokao scoring rules, and revealing each model's strengths and weaknesses across single‑choice, multiple‑choice, and fill‑in‑the‑blank items.

AI benchmarking · Gaokao · Large Language Models
0 likes · 7 min read
DataFunSummit
Jun 6, 2025 · Artificial Intelligence

Automating High‑Quality NL2SQL Data Synthesis with Intermediate Representations

This work tackles the difficulty of incorporating extensive domain knowledge into in‑domain NL2SQL tasks by proposing an intermediate‑representation‑based data synthesis method that decouples knowledge compliance from SQL generation, enabling automated creation of high‑quality training data with 60× human efficiency and over 97% accuracy.

Large Language Models · NL2SQL · SQL generation
0 likes · 2 min read