Tagged articles
1026 articles
Efficient Ops
Mar 9, 2025 · Artificial Intelligence

Essential LLMOps Tools: Build, Deploy, Monitor, and Manage Large Language Models

LLMOps, the end-to-end methodology for managing large language models, encompasses a curated set of development, deployment, monitoring, and local management tools—such as LangChain, vLLM, LangSmith, and Ollama—enabling practitioners to efficiently build, scale, and maintain AI applications.

AI Development · LLMOps · Model Deployment
0 likes · 6 min read
Architects' Tech Alliance
Mar 9, 2025 · Industry Insights

DeepSeek’s AI Ecosystem: From Core Tech to Market Impact

This article provides a comprehensive analysis of DeepSeek, covering its foundational AI research, technology stack, product offerings, and the broader upstream, midstream, and downstream AI industry landscape, including hardware, server, cloud, and market trends.

AI Infrastructure · Artificial Intelligence · DeepSeek
0 likes · 13 min read
Fun with Large Models
Mar 8, 2025 · Artificial Intelligence

Make AI Obey: A Detailed Prompt Engineering Guide to Boost Large‑Model Logic

This tutorial explains how to enhance large language models' logical reasoning by using DeepSeek‑R1's deep‑thinking mode, few‑shot prompting, chain‑of‑thought, and zero‑shot chain‑of‑thought techniques, providing concrete examples, comparisons, and a step‑by‑step template for effective prompt design.

AI reasoning · Chain-of-Thought · DeepSeek
0 likes · 10 min read
Code Mala Tang
Mar 8, 2025 · Artificial Intelligence

14 Powerful Prompt Engineering Techniques to Unlock AI’s Full Potential

This article introduces the fundamentals of prompt engineering and presents fourteen practical techniques—ranging from role‑playing and step‑by‑step reasoning to chain‑of‑thought and ReAct—that help users craft precise, high‑quality prompts for any large language model, dramatically improving AI output.

AI · AI productivity · LLM techniques
0 likes · 16 min read
Cognitive Technology Team
Mar 7, 2025 · Artificial Intelligence

From Word Embeddings to Large Language Models: A Comprehensive Overview of AI Model Evolution

This article traces the development of AI models—from early word embeddings like Word2Vec and ELMo, through transformer‑based encoders such as BERT and decoder‑only models like GPT‑1/2/3, to recent multimodal systems and scaling laws—explaining their architectures, training methods, and impact on modern AI applications.

AI · Embedding · Multimodal
0 likes · 22 min read
Alibaba Cloud Big Data AI Platform
Mar 7, 2025 · Artificial Intelligence

How Pai‑Megatron‑Patch Boosts Qwen2‑VL Multimodal Training Efficiency

This article explains how the Pai‑Megatron‑Patch toolkit enhances the usability and training performance of the Qwen2‑VL multimodal large model by introducing model‑parallel weight conversion, user‑friendly data loading, visual feature processing optimizations, optimizer offloading, and pipeline parallelism techniques, supported by extensive experimental analysis.

Megatron · Pipeline Parallelism · Qwen2-VL
0 likes · 25 min read
dbaplus Community
Mar 7, 2025 · Artificial Intelligence

Master Prompt Engineering: Frameworks, Strategies, and Real‑World Examples for Large Language Models

This comprehensive guide explains what prompts are, outlines essential prompt components and multiple engineering frameworks, presents practical strategies for crafting clear and structured prompts, addresses model limitations such as hallucinations, and showcases a wide range of advanced prompting techniques with code examples.

AI · LLM · Prompt engineering
0 likes · 29 min read
Data Thinking Notes
Mar 6, 2025 · Artificial Intelligence

How China’s State‑Owned Giants Are Accelerating AI with DeepSeek

Amid a global digital surge, 45% of China’s central state‑owned enterprises have deployed the DeepSeek large‑model platform, rapidly integrating AI across energy, power, telecom, construction and other sectors to boost intelligent transformation and operational efficiency.

AI adoption · China · DeepSeek
0 likes · 7 min read
JD Retail Technology
Mar 6, 2025 · Artificial Intelligence

Dynamic Margin Selection for Efficient Deep Learning and Low-Resource Large Model Training

Jia Xing’s research introduces Dynamic Margin Selection, a technique that repeatedly refreshes a core set of boundary‑close samples to train large language models efficiently on limited resources, achieving comparable loss to full‑data training, enabling six‑fold model compression, faster inference, and a proposed exponential scaling law for data‑efficient AI.

ICLR · Low‑Resource Training · dynamic data selection
0 likes · 10 min read
Tencent Technical Engineering
Mar 5, 2025 · Information Security

Detecting Critical AI Infrastructure Vulnerabilities with AI-Infra-Guard

As open‑source large language model tools such as Ollama, OpenWebUI and ComfyUI gain popularity, they expose numerous security flaws, including unauthenticated APIs, exploitable CVEs, model theft and remote code execution. This prompted the development of AI‑Infra‑Guard, a lightweight, cross‑platform scanner that identifies over 30 component vulnerabilities and offers both web UI and CLI modes for rapid risk assessment.

AI security · AI-Infra-Guard · CVE
0 likes · 13 min read
Architects' Tech Alliance
Mar 5, 2025 · Industry Insights

DeepSeek R1 & Kimi 1.5: Inside the Development of Near‑Strong Reasoning Models

The article analyzes DeepSeek's recent releases, the V3 dialogue model and the R1 reasoning model, detailing their launch dates, rapid surge in popularity, and R1's reinforcement‑learning‑based design for code and math tasks, and provides links to related Peking University technical reports.

AI · DeepSeek · Industry Analysis
0 likes · 3 min read
AntTech
Mar 4, 2025 · Artificial Intelligence

GraphCLIP and 2D‑TPE: Enhancing Transferability of Graph Models and Table Understanding for Large Language Models

This article introduces GraphCLIP, a self‑supervised graph‑summary pre‑training framework that boosts zero‑ and few‑shot transferability of graph foundation models for text‑attributed graphs, and 2D‑TPE, a two‑dimensional positional encoding method that preserves table structure to markedly improve large language model performance on table‑understanding tasks, while also announcing a live paper session at WWW 2025 featuring the authors.

Positional Encoding · Self‑Supervised Learning · Table Understanding
0 likes · 6 min read
JD Retail Technology
Feb 28, 2025 · Artificial Intelligence

Generative Recommendation with DPO Alignment for JD Alliance Advertising: Multi‑Objective Optimization and Online Results

The paper presents a generative recommendation framework for JD Alliance advertising that combines semantic‑ID modeling, large‑model pre‑training and fine‑tuning, and Direct Preference Optimization (including Softmax‑DPO and β‑DPO) to jointly boost click‑through and conversion rates, achieving +0.6% UCTR and +8% UCVR in online tests while outlining future multi‑objective extensions.

Advertising · DPO · Generative Recommendation
0 likes · 12 min read
Architect
Feb 27, 2025 · Artificial Intelligence

Understanding Inference Large Language Models: DeepSeek‑R1 and the Rise of Test‑Time Computation

This article explains how inference‑oriented large language models such as DeepSeek‑R1 and OpenAI o1‑mini shift AI research from training‑time scaling to test‑time computation, detailing the underlying principles, new scaling laws, verification techniques, reinforcement‑learning pipelines, and practical methods for distilling reasoning capabilities into smaller models.

DeepSeek-R1 · Inference · Reinforcement Learning
0 likes · 18 min read
Code Mala Tang
Feb 27, 2025 · Artificial Intelligence

Do New AI Reasoning Models Really Think? Unpacking the Debate

The article examines whether the latest AI models that claim to perform true reasoning—by breaking problems into steps and using chain‑of‑thought—actually reason like humans, presenting skeptical and supportive expert viewpoints, and offering practical guidance on how to use such models responsibly.

AI Safety · AI reasoning · Chain-of-Thought
0 likes · 14 min read
DataFunSummit
Feb 26, 2025 · Artificial Intelligence

Applying Multimodal Large Models to Music Recommendation at NetEase Cloud Music

This article details how NetEase Cloud Music leverages multimodal large language models to improve music recommendation across daily, personalized, and playlist scenarios by extracting rich audio, text, and visual features, addressing data skew, cold‑start challenges, and achieving measurable gains in user engagement and distribution efficiency.

Multimodal AI · NetEase Cloud Music · feature extraction
0 likes · 12 min read
AntTech
Feb 26, 2025 · Artificial Intelligence

Ant Group’s 18 Accepted Papers at AAAI 2025: Summaries and Highlights

This article presents concise English summaries of the 18 Ant Group papers accepted at AAAI 2025, covering topics such as privacy‑preserving large‑model tuning, knowledge‑graph integration, AI‑generated image detection, multi‑task learning, generative retrieval, role‑playing evaluation, and video hallucination mitigation.

AAAI 2025 · AI Evaluation · Generative Retrieval
0 likes · 29 min read
Ops Development & AI Practice
Feb 25, 2025 · Artificial Intelligence

What Is Hybrid Reasoning in Claude 3.7 Sonnet and Why It Matters

Hybrid reasoning lets Claude 3.7 Sonnet dynamically switch between fast, intuition‑like answers and step‑by‑step, deep analysis, improving both speed and accuracy for tasks ranging from simple code snippets to complex algorithm design, and signals a broader shift in large language model capabilities.

AI reasoning · Claude 3.7 · Hybrid Reasoning
0 likes · 9 min read
21CTO
Feb 25, 2025 · Artificial Intelligence

How Alibaba’s Qwen 2.5‑Max Challenges GPT‑4o and Redefines China’s AI Race

Chinese tech giants Huawei and Alibaba respond to President Xi’s call for stronger innovation, with Huawei showcasing its HarmonyOS and server‑grade Arm processor while Alibaba unveils the Qwen 2.5‑Max large language model that outperforms leading Western AI systems on multiple benchmarks, highlighting China’s accelerating AI ambitions.

AI · Alibaba · China
0 likes · 5 min read
Architecture Digest
Feb 25, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Overview, Innovations, Architecture, Training, Performance, and Challenges

DeepSeek’s distillation technology combines data and model distillation to transfer knowledge from large teacher models to compact student models, detailing its definitions, principles, key innovations, architecture, training methods, performance gains, and challenges, especially in multimodal contexts.

AI research · DeepSeek · knowledge distillation
0 likes · 16 min read
21CTO
Feb 24, 2025 · Artificial Intelligence

From Transformers to DeepSeek-R1: Evolution of Large Language Models

This article chronicles the rapid development of large language models since the Transformer architecture was introduced in 2017, covering BERT, the GPT series, multimodal systems, and the cost‑effective DeepSeek‑R1, and highlighting key innovations, scaling trends, alignment techniques, and their transformative impact across AI research and industry.

AI evolution · DeepSeek · LLM History
0 likes · 23 min read
Architects' Tech Alliance
Feb 24, 2025 · Artificial Intelligence

NSA: Hardware‑Optimized Sparse Attention Mechanism from DeepSeek, Peking University and University of Washington

The NSA mechanism introduces a three‑branch hardware‑optimized sparse attention architecture—token compression, token selection, and sliding window—combined with learnable gating to balance global and local context, dramatically improving inference speed and efficiency for long‑context large language models.

AI Architecture · DeepSeek · Hardware acceleration
0 likes · 5 min read
Su San Talks Tech
Feb 23, 2025 · Artificial Intelligence

How DeepSeek’s Distillation Breaks AI Model Limits: Core Principles & Performance

This article explores DeepSeek’s cutting‑edge distillation technology, detailing its definition, underlying principles, innovative data‑model fusion, architecture choices, training strategies, performance gains over large language models, and the remaining challenges in knowledge transfer and multimodal data processing.

AI Optimization · DeepSeek · Multimodal Learning
0 likes · 16 min read
Architect
Feb 19, 2025 · Artificial Intelligence

Does Scaling Law Still Hold for Grok 3? A Deep Dive into LLM Training Economics

The article critically examines whether the pre‑training Scaling Law still applies to Grok 3, compares its compute usage and model size with DeepSeek and OpenAI models, evaluates the cost‑effectiveness of pre‑training, RL and test‑time scaling, and explores how these insights shape future large‑language‑model development strategies.

Grok-3 · Pre‑training · RL scaling
0 likes · 11 min read
AI Algorithm Path
Feb 19, 2025 · Artificial Intelligence

How Temperature Shapes Output in Large Language Models

The article explains the Temperature hyper‑parameter in large language models, shows how it modifies the softmax distribution, provides a Python visualisation script, and demonstrates through experiments that higher values increase creativity while lower values make outputs more deterministic.

Python · Sampling · Softmax
0 likes · 5 min read
DataFunTalk
Feb 19, 2025 · Artificial Intelligence

Large Models: Concepts, Principles, Classifications and Applications

This report provides a comprehensive overview of large-scale AI models, explaining their definition, massive parameter and data requirements, underlying transformer architecture, classification into language, vision and multimodal models, notable examples such as DeepSeek, and a survey of popular AIGC tools and practical use cases.

AIGC tools · Deep Learning · Multimodal AI
0 likes · 9 min read
Architects' Tech Alliance
Feb 19, 2025 · Industry Insights

Why DeepSeek One‑Stop AI Machines Are Redefining Private Model Deployment

The surge in demand for private AI deployment has prompted multiple vendors to launch DeepSeek one‑stop machines—integrated hardware solutions that support the full DeepSeek model family, offering higher stability, easier setup, customization, cost savings, and data security across diverse industry scenarios.

AI Infrastructure · AI hardware · DeepSeek
0 likes · 7 min read
Tencent Cloud Developer
Feb 19, 2025 · Industry Insights

Why Every Enterprise Needs a Knowledge‑Management System in the LLM Era

The article analyzes how the shift from data‑driven to knowledge‑driven operations, powered by large language models like DeepSeek, forces companies to build dynamic knowledge‑management platforms that integrate personal and corporate knowledge, improve efficiency, and create sustainable competitive advantage.

DeepSeek · Digital Transformation · Enterprise AI
0 likes · 14 min read
Architects' Tech Alliance
Feb 18, 2025 · Artificial Intelligence

How DeepSeek’s Latest Models Redefine AI Performance and Industry Adoption

The DeepSeek report details rapid model releases from 2024 onward, highlighting innovations such as model distillation, a 671 B MoE architecture, FP8 mixed‑precision, and the Janus‑Pro multimodal framework, while also documenting major cloud and chip providers' integration of these models into their services.

AI industry adoption · DeepSeek · MoE architecture
0 likes · 10 min read
DataFunTalk
Feb 18, 2025 · Artificial Intelligence

CODEI/O: Leveraging Code to Train Large Language Models for Enhanced Reasoning

The DeepSeek team introduced CODEI/O, a massive dataset that converts code into natural‑language reasoning chains, and demonstrated that training large language models on this data markedly improves their performance on diverse inference tasks, including non‑code domains, through a two‑stage training strategy.

CODEI/O · Dataset · code reasoning
0 likes · 8 min read
Cognitive Technology Team
Feb 18, 2025 · Artificial Intelligence

Two Major Bottlenecks in Deploying Large Language Models: Machine Deception and Hallucination

Deploying large language models faces two critical challenges—machine deception, where AI generates plausible yet false content, and machine hallucination, where outputs are logically coherent but factually inaccurate—both undermining trust, and the article outlines their causes, impacts, and technical, ethical, and regulatory mitigation strategies.

Artificial Intelligence · Machine Deception · hallucination
0 likes · 6 min read
Architect's Alchemy Furnace
Feb 17, 2025 · Artificial Intelligence

24 Proven Prompt Formulas to Unlock DeepSeek’s Full Potential

Discover a comprehensive collection of 24 structured prompting techniques—from basic role‑play formulas to advanced cross‑disciplinary and managerial frameworks—designed to help users of DeepSeek and other large language models craft precise, high‑impact queries that dramatically improve response quality and efficiency.

AI prompting · DeepSeek · Prompt engineering
0 likes · 12 min read
Java Architecture Diary
Feb 17, 2025 · Artificial Intelligence

What Is LLMs.txt? The New AI‑Friendly Web Standard Explained

LLMs.txt is a lightweight, AI‑optimized web standard that provides concise Markdown navigation files for large language models, addressing context limits, redundant content, and lack of structure, and is already adopted by companies like Mintlify, Anthropic, and Cursor.

AI Standards · AI indexing · large language models
0 likes · 6 min read
Fun with Large Models
Feb 16, 2025 · Artificial Intelligence

Can You Claim to Know Large Models? Guide to Distillation, Quantization & Fine‑Tuning

This article explains why the massive DeepSeek V3/R1 model (671 B parameters) is hard to deploy and introduces three key techniques—model distillation, quantization, and fine‑tuning—that can shrink, accelerate, or specialize large models, while outlining their trade‑offs and practical steps.

AI model compression · DeepSeek · large language models
0 likes · 10 min read
Architects' Tech Alliance
Feb 16, 2025 · Artificial Intelligence

How DeepSeek’s Distillation Breaks Bottlenecks and Boosts Multimodal AI Performance

This article provides an in‑depth technical analysis of DeepSeek’s model distillation technology, covering its core principles, innovative data‑model fusion strategies, architecture design, training optimizations, performance benchmarks, and the remaining challenges of scaling distillation to multimodal tasks.

AI Optimization · DeepSeek · Multimodal
0 likes · 16 min read
Lao Guo's Learning Space
Feb 15, 2025 · Artificial Intelligence

What Is deepseek-MoE? Understanding the Mixture‑of‑Experts Architecture

The article explains deepseek-MoE (Mixture of Experts), describing its full English name, Chinese translation, how a gating network selects and weights multiple expert models for each input, and uses an analogy to illustrate load‑balancing and the divide‑and‑conquer design in large AI models.

AI Architecture · Mixture of Experts · deepseek-MoE
0 likes · 2 min read
Ops Development & AI Practice
Feb 14, 2025 · Artificial Intelligence

Large Model Format Showdown: Hugging Face, TensorFlow, ONNX, TorchScript, GGUF

This comprehensive guide examines the leading large‑model storage formats—including Hugging Face Transformers, TensorFlow SavedModel, ONNX, TorchScript, and GGUF—detailing their file structures, serialization methods, strengths, weaknesses, and typical use‑cases, helping developers and researchers select the optimal format for their specific AI workloads.

AI deployment · GGUF · Model Formats
0 likes · 21 min read
DataFunSummit
Feb 14, 2025 · Artificial Intelligence

Building Large‑Scale Recommendation Systems with Big Data and Large Language Models on Alibaba Cloud AI Platform

This presentation details how Alibaba Cloud's AI platform integrates big‑data pipelines, feature‑store services, and large language model capabilities to construct high‑performance search‑recommendation architectures, covering system design, training and inference optimizations, LLM‑driven use cases, and open‑source RAG tooling.

AI Platform · Big Data · Distributed Training
0 likes · 17 min read
Top Architect
Feb 14, 2025 · Artificial Intelligence

DeepSeek Model Distillation: Principles, Innovations, Architecture, and Performance

This article provides an in‑depth overview of DeepSeek’s model distillation technology, covering its definition, core principles, innovative data‑model distillation integration, architecture design, training strategies, performance gains, and the challenges of scaling to multimodal data.

AI Optimization · DeepSeek · Knowledge Transfer
0 likes · 16 min read
Ma Wei Says
Feb 13, 2025 · Artificial Intelligence

Master AI Prompting: 5 Proven Techniques to Unlock Accurate Outputs

This guide presents five practical prompting techniques—including structured output, role‑playing, visual conversion, multi‑turn refinement, and multilingual handling—plus industry‑specific examples and common pitfalls, helping users craft precise commands for AI models like DeepSeek.

AI prompting · Prompt engineering · Structured Output
0 likes · 8 min read
Architect
Feb 12, 2025 · Artificial Intelligence

Can S‑Curve Theory Explain the Limits of Large‑Model Scaling Laws?

The article analyses how S‑shaped growth curves can model the apparent scaling laws of large language models, discusses the three phases of model development, proposes an ability‑density hypothesis, and explores future scenarios where scaling laws may plateau or shift.

AI growth · Ability Density · Model Training
0 likes · 16 min read
Architect
Feb 12, 2025 · Artificial Intelligence

Master Prompt Engineering: A Universal Framework for LLMs

This article presents a comprehensive, step‑by‑step Prompt engineering framework—including role definition, problem description, goal setting, and requirement specification—augmented with techniques such as RAG, few‑shot examples, memory handling, and parameter tuning, enabling users to craft effective prompts for large language models across domains.

AI Prompt Optimization · Few-Shot · Memory
0 likes · 27 min read
AIWalker
Feb 11, 2025 · Artificial Intelligence

LLMDet: LLM‑Powered Open‑Vocabulary Detector Beats Grounding DINO

LLMDet introduces a novel training pipeline that leverages large language models to generate detailed image‑level captions and region‑level phrases, fine‑tunes an open‑vocabulary detector with the GroundingCap‑1M dataset, and achieves state‑of‑the‑art zero‑shot performance surpassing Grounding DINO across multiple benchmarks.

GroundingCap · LLMDet · large language models
0 likes · 20 min read
DataFunTalk
Feb 11, 2025 · Artificial Intelligence

Roundtable on Enhancing Large Model Effectiveness: RAG, Tool Use, and Knowledge Engineering

Experts from Dipu, Ant Financial, iKang, and Zhihu discuss practical strategies for improving large model performance, covering RAG, tool use, offline knowledge engineering, multimodal training, evaluation metrics, and future trends, while sharing case studies from manufacturing, healthcare, retail, and consumer‑facing applications.

Knowledge Engineering · RAG · large language models
0 likes · 9 min read
Cognitive Technology Team
Feb 10, 2025 · Artificial Intelligence

Survey of Major Chinese AI Large Language Models: Technologies, Innovations, and Comparative Evaluation

This report systematically reviews the key technologies, innovations, and performance of leading Chinese AI large language models—including DeepSeek, Kimi, and Qwen2.5—detailing their architectures, training methods, multimodal capabilities, and comparative evaluations against each other and foreign models.

AI · China · large language models
0 likes · 20 min read
AI Algorithm Path
Feb 10, 2025 · Artificial Intelligence

Understanding DualPipe: DeepDive into DeepSeek‑R1 Architecture (Part 5)

This article explains how the DualPipe scheduling mechanism in DeepSeek‑R1 improves GPU cluster compute‑communication efficiency by using fine‑grained pipeline stages and bidirectional data flow, comparing it with Zero Bubble pipeline parallelism and discussing the challenges of large‑scale distributed training.

DeepSeek · Distributed Training · DualPipe
0 likes · 10 min read
IT Architects Alliance
Feb 10, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Principles, Innovations, Performance, and Future Outlook

The article explains DeepSeek's model distillation technique, covering its fundamental knowledge‑transfer principles, unique innovations such as data‑model fusion and task‑specific strategies, impressive benchmark results, practical applications in edge and online inference, existing challenges, and future research directions.

AI Optimization · Deep Learning · Edge Computing
0 likes · 15 min read
Baidu Geek Talk
Feb 10, 2025 · Artificial Intelligence

How Baidu Cloud Slashes Inference Costs: DeepSeek Model Optimizations Unveiled

Baidu Cloud's Qianfan platform launched DeepSeek‑R1 and DeepSeek‑V3 with ultra‑low inference pricing, leveraging advanced engine performance tweaks, a split Prefill/Decode architecture, and comprehensive security measures that together boost throughput, cut costs, and ensure enterprise‑grade reliability.

AI inference · Baidu Cloud · Model Serving
0 likes · 5 min read
Architects' Tech Alliance
Feb 10, 2025 · Artificial Intelligence

Why DeepSeek Is Disrupting the Global AI Landscape: Tech, Cost, and Open‑Source Edge

DeepSeek, a Chinese AI startup, has rapidly risen to global prominence by releasing high‑performance large language models such as V2, V3, and R1, which combine innovative architectures, dramatically lower training costs, and an open‑source strategy that challenges established AI giants and reshapes industry dynamics.

Artificial Intelligence · China AI · DeepSeek
0 likes · 14 min read
Open Source Linux
Feb 10, 2025 · Artificial Intelligence

How DeepSeek R1 Uses Large‑Scale Reinforcement Learning to Replicate OpenAI o1

This article examines DeepSeek R1’s large‑scale reinforcement‑learning approach, its training pipeline that combines rule‑based scaling and deep‑reasoning SFT data, and why its open‑source, low‑cost replication of OpenAI o1 marks a pivotal step toward more efficient, democratized AI models.

AI efficiency · DeepSeek · Model Scaling
0 likes · 18 min read
DevOps
Feb 9, 2025 · Artificial Intelligence

DeepSeek’s Impact on the Large Model Ecosystem and the Resurgence of AI PCs

The article examines DeepSeek’s rapid rise, its open‑source R1 model and distilled variants, the resurgence of AI PCs, hardware support from Nvidia, AMD and others, and how this ecosystem is reshaping personal AI experiences and the broader large‑model landscape.

AI PC · DeepSeek · Hardware
0 likes · 11 min read
AI Algorithm Path
Feb 9, 2025 · Artificial Intelligence

Understanding Multi-Token Prediction in DeepSeek‑R1 Architecture

This article dissects the Multi‑Token Prediction (MTP) technique used in DeepSeek‑R1, contrasting it with traditional next‑token prediction, detailing Meta’s MTP design, DeepSeek’s adapted architecture, loss weighting, and why MTP is applied only during training to boost efficiency and model capability.

DeepSeek · MTP · Model architecture
0 likes · 9 min read
Architect
Feb 9, 2025 · Artificial Intelligence

How DeepSeek’s Model Distillation Boosts AI Efficiency and Performance

This article provides an in‑depth analysis of DeepSeek’s model distillation technology, covering its definition, core principles, innovative strategies, architecture design, training optimizations, benchmark results, efficiency gains, and the remaining challenges of applying distillation to large language models and multimodal data.

AI efficiency · DeepSeek · Knowledge Transfer
0 likes · 16 min read
Architects' Tech Alliance
Feb 9, 2025 · Artificial Intelligence

How DeepSeek R1 Replicates OpenAI o1 Using Large‑Scale Reinforcement Learning

The article provides an in‑depth technical analysis of DeepSeek R1, explaining how it reproduces OpenAI o1's reasoning abilities through rule‑based large‑scale reinforcement learning, mixed SFT data, and efficient scaling, while discussing its broader impact on AI model development and capability density trends.

AI industry · Capability Density · DeepSeek
0 likes · 19 min read
AI2ML AI to Machine Learning
Feb 8, 2025 · Artificial Intelligence

Analyzing DeepSeek R1 Inference Projects: Source Code, Cold‑Start, and Scaling Techniques

This article examines DeepSeek R1’s three breakthroughs, its low‑cost optimizations that bypass CUDA, and the resulting impact on the AI ecosystem, then provides a detailed technical review of seven open‑source reproductions, among them Open‑R1, Tiny‑Zero, SimpleScaling‑S1, and simpleRL‑reason, covering their architectures, reinforcement‑learning pipelines, and code implementations.

DeepSeek · Inference Scaling · PTX
0 likes · 10 min read
Huawei Cloud Developer Alliance
Feb 8, 2025 · Artificial Intelligence

Why DeepSeek V3 and R1 Are Redefining Low‑Cost AI: Architecture, Training Tricks, and Industry Impact

This article analyses DeepSeek's V3 and R1 models, explaining how their innovative MoE architecture, Multi‑Head Latent Attention, low‑cost training strategies, and distributed‑training optimizations deliver high‑performance large language models while reducing GPU/NPU demand and sparking industry excitement.

AI inference · DeepSeek · Mixture of Experts
0 likes · 16 min read
IT Services Circle
Feb 7, 2025 · Artificial Intelligence

Building Low‑Cost AI Clusters with Old Phones Using Exo and Open WebUI

This article introduces Exo, an open‑source platform that lets you turn idle smartphones, tablets, and laptops into a distributed AI cluster capable of running large language models, and shows how Open WebUI provides a user‑friendly interface for deploying private AI assistants.

AI clustering · Distributed inference · Exo
0 likes · 6 min read
Java Captain
Feb 7, 2025 · Artificial Intelligence

DeepSeek: Disruptive Innovations in Large Language Model Architecture, Efficiency, and Ecosystem

DeepSeek reshapes the AI landscape by replacing brute‑force compute scaling with algorithmic breakthroughs such as a novel MoE architecture, memory compression, active‑learning data pipelines, and open‑source tooling, delivering dramatically lower training and inference costs while enabling edge deployment and a vibrant developer ecosystem.

Algorithmic Efficiency · DeepSeek · MoE
0 likes · 11 min read
Tencent Cloud Developer
Feb 6, 2025 · Artificial Intelligence

DeepSeek V Series: Technical Overview of Scaling Laws, Grouped Query Attention, and Mixture‑of‑Experts

The article reviews DeepSeek’s V‑series papers, explaining how scaling‑law insights, Grouped Query Attention, a depth‑first design, loss‑free load balancing, multi‑token prediction and Multi‑Head Latent Attention together enable economical mixture‑of‑experts LLMs that rival closed‑source models while cutting compute and hardware costs.

DeepSeek · Grouped Query Attention · Mixture of Experts
0 likes · 13 min read
Alibaba Cloud Developer
Feb 5, 2025 · Artificial Intelligence

10 Common Prompt Engineering Mistakes and How to Overcome Them

This article lists ten common misconceptions about prompt engineering, explains why each is flawed, and offers practical insights and strategies—such as using the CO‑STAR framework, tailoring prompts to specific models, keeping prompts concise, and continuously testing and refining—to help readers communicate effectively with large language models.

AI misconceptions · LLM · Prompt Design
0 likes · 10 min read
Architect
Feb 3, 2025 · Artificial Intelligence

How DeepSeek‑R1 Uses Pure Reinforcement Learning to Match OpenAI’s o1

This article presents DeepSeek‑R1 and DeepSeek‑R1‑Zero, two next‑generation LLMs trained with pure reinforcement learning and multi‑stage fine‑tuning, details their GRPO training framework, model‑distillation pipeline, open‑source release, and evaluation results that rival OpenAI’s o1‑1217 across reasoning, knowledge, and coding benchmarks.

DeepSeek · LLM evaluation · OpenAI o1
0 likes · 10 min read
Cognitive Technology Team
Feb 3, 2025 · Artificial Intelligence

DeepSeek R1 Introduces Group Relative Policy Optimization for Advanced Reasoning in Large Language Models

DeepSeek AI’s new open‑source model DeepSeek‑R1 leverages a novel Group Relative Policy Optimization (GRPO) reinforcement‑learning framework and multi‑stage training to dramatically boost complex reasoning performance, achieving AIME 2024 Pass@1 scores comparable to OpenAI’s o1 model.

AI · DeepSeek · GRPO
0 likes · 4 min read
DataFunSummit
Jan 31, 2025 · Artificial Intelligence

LLMOps: Building a Prompt‑Driven Engine for AI Operations

This article presents the concept of LLMOps—applying large language models to AIOps—by analyzing prompt challenges, introducing the LogPrompt engine for log analysis, describing a prompt‑learning data flywheel with CoachLM optimization, reporting experimental results, and outlining future multi‑modal directions.

CoachLM · Data Flywheel · LLMOps
0 likes · 16 min read
JD Cloud Developers
Jan 26, 2025 · Operations

How Large Language Models are Transforming Modern IT Operations

This article traces the evolution of IT operations from manual tasks to automation, AIOps, and ChatOps, and explains how large language models boost efficiency, enable intelligent assistants, automated diagnosis, and smart log analysis for more reliable, automated Ops workflows.

ChatOps · aiops · large language models
0 likes · 7 min read
ByteDance Web Infra
Jan 22, 2025 · Artificial Intelligence

Introducing UI‑TARS: A Native GUI Agent Model Integrated with Midscene.js for Multimodal UI Automation

The article presents UI‑TARS, a native GUI‑agent model that combines multimodal large‑language models with the open‑source Midscene.js framework to enable more accurate, token‑efficient, and privacy‑preserving UI automation, while discussing its architecture, advantages, limitations, and integration steps.

GUI Agent · Midscene.js · Multimodal AI
0 likes · 11 min read
Bilibili Tech
Jan 21, 2025 · Artificial Intelligence

Accelerating Large Model Inference: Challenges and Multi‑Level Optimization Strategies

The article outlines how exploding LLM sizes create compute, memory, and latency bottlenecks and proposes a full‑stack solution (operator fusion, high‑performance libraries, quantization, speculative decoding, sharding, continuous batching, PagedAttention, and specialized frameworks like MindIE‑LLM) to dramatically boost inference throughput and reduce latency, while highlighting future ultra‑low‑bit and heterogeneous hardware directions.

Continuous Batching · Hardware Optimization · Inference Acceleration
0 likes · 21 min read
Baidu Tech Salon
Jan 8, 2025 · Artificial Intelligence

Evolution of Video Search Ranking Architecture Toward an End‑to‑End Large‑Model Framework

The paper describes transforming a tightly coupled, multi‑stage video search ranking pipeline into a modular, end‑to‑end large‑model architecture that decouples recall, employs a graph‑engine parallel framework and elastic compute allocation, thereby boosting performance, flexibility, personalization and lowering long‑term operational costs.

End-to-End · System optimization · elastic resources
0 likes · 10 min read
ZhongAn Tech Team
Jan 5, 2025 · Artificial Intelligence

Weekly AI Roundup Issue 9: OpenAI Vision, LeCun Interview, ByteDance HLLM, and DeepSeek‑V3 Highlights

This issue presents a curated overview of recent AI developments, including Sam Altman's 2025 technology vision poll, LeCun's interview on future AI directions, ByteDance's hierarchical large language model for recommendation, and the performance and cost advantages of the open‑source DeepSeek‑V3 model.

AI · ByteDance · DeepSeek
0 likes · 10 min read
DataFunTalk
Jan 1, 2025 · Artificial Intelligence

Applying Large Language Models to Financial Risk Control at Akulaku

This article details Akulaku’s deployment of large language models across multimodal financial risk‑control scenarios—covering business background, a three‑module intelligent‑agent architecture, concrete tool‑ and planning‑enhancement case studies, and future outlook—demonstrating how LLMs boost efficiency, reduce labeling effort, and enable copilot‑style assistance.

Agent Architecture · KYC verification · Multimodal AI
0 likes · 15 min read
DataFunSummit
Dec 31, 2024 · Artificial Intelligence

How Momo Leverages Large Model Technology to Transform Business and R&D Processes

This article explains how Momo utilizes large language model technologies to revamp its AI application paradigm, achieve efficient inference through quantization and prefix caching, build a workflow‑based model platform, and outline future plans for framework optimization and multimodal support.

AI Platform · Inference Optimization · Momo
0 likes · 16 min read
Xiaohongshu Tech REDtech
Dec 26, 2024 · Artificial Intelligence

Instruction Embedding: Latent Representations of Instructions for Task Identification

The paper introduces Instruction Embedding—a task‑focused text representation learned on the new Instruction Embedding Benchmark—and shows that Prompt‑based Instruction Embedding (PIE) outperforms standard embeddings in clustering, similarity, and downstream tasks such as data selection, in‑context example retrieval, test‑set compression, and task‑correlation analysis.

Fine-tuning · contrastive learning · instruction embedding
0 likes · 15 min read
DeWu Technology
Dec 25, 2024 · Artificial Intelligence

AI-Powered Intelligent Coding: Product Evolution, Technical Advances, and Future Outlook

AI‑powered coding tools, from JetBrains’ free IDEs to VSCode extensions like Cursor and end‑to‑end web platforms, are evolving rapidly, offering code continuation, AI‑driven Q&A, multi‑file editing, and chat interfaces. Advances in context handling, caching, LLM fine‑tuning, and speculative decoding promise faster, more integrated development workflows and a future in which IDEs become chat‑centric assistants that streamline debugging, deployment, and junior developer support.

AI Coding · IDE integration · Intelligent code completion
0 likes · 18 min read
Architects' Tech Alliance
Dec 23, 2024 · Artificial Intelligence

Why High‑Quality, Massive, Diverse Data Fuels AI Breakthroughs

The article explains how breakthroughs in artificial intelligence depend on high‑quality, large‑scale, and diverse training data, outlines the data‑centric AI movement, details a six‑step workflow for building datasets, and surveys the data industry ecosystem supporting large language model development.

AI data · Data Quality · Data‑Centric AI
0 likes · 7 min read
Fighter's World
Dec 21, 2024 · Artificial Intelligence

Is Pre‑training Coming to an End? Evaluating Data Sufficiency

The article examines Ilya Sutskever’s claim that pre‑training will end, argues that scaling laws still hold and data is not yet a bottleneck, highlights the scarcity of high‑quality frontier data, and explains why the industry is shifting toward inference‑time compute (o1) as a more sustainable path for large language models.

AI trends · Data Wall · Inference‑time Compute
0 likes · 13 min read
Data Thinking Notes
Dec 18, 2024 · Artificial Intelligence

Mastering Prompt Engineering: Advanced Techniques from OpenAI, Anthropic, and Google

This article provides a comprehensive guide to modern prompt engineering, covering foundational principles, detailed techniques such as role‑playing, delimiters, step‑by‑step instructions, and advanced strategies like chain‑of‑thought, reflection, and external tool integration, with real‑world examples from major AI providers and a practical Img2Code case study.

AI best practices · LLM Development · img2code
0 likes · 24 min read
Baidu Geek Talk
Dec 16, 2024 · Artificial Intelligence

AIAPI: Baidu's AI-Native Retrieval System for Large Language Model Applications

AIAPI, Baidu’s AI‑native retrieval platform for large language models, tackles hallucination, slow domain updates, and output opacity by delivering authoritative, timely, full‑content data. Its dual‑channel architecture combines traditional search with RAG and employs reusable ranking, graph‑enhanced data layers, dynamic caching that cuts storage by 70%, and QueryPlan‑based QoS, yielding markedly higher retrieval quality and a 34% speed gain with Wenxin 4.0.

AI-Native Systems · AIAPI · Query Planning
0 likes · 12 min read
JD Tech
Dec 14, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical and Semantic ID Approaches

This article presents a comprehensive study of generative retrieval for large‑scale e‑commerce search, comparing lexical‑based and Semantic‑ID‑based methods, introducing a Query‑to‑MultiSpan framework, analyzing the sand‑glass distribution problem in residual quantization, and proposing heuristic and adaptive solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
Dec 12, 2024 · Artificial Intelligence

How PertEval Reveals the Real Knowledge Limits of Large Language Models

At NeurIPS 2024, Alibaba Cloud's PAI team presented the Spotlight paper PertEval, which introduces knowledge‑invariant perturbations to expose the true knowledge capacity of LLMs, critiques over‑optimistic static benchmarks, and showcases responsible AI solutions and platform demos for enterprise use.

Alibaba Cloud · NeurIPS 2024 · PertEval
0 likes · 6 min read
Tencent Tech
Dec 11, 2024 · Artificial Intelligence

Inside Tencent LeYong AI: Solving Enterprise RAG with Knowledge, Engineering & Algorithms

This article explores how Tencent's LeYong AI assistant leverages Retrieval‑Augmented Generation to empower enterprise knowledge retrieval, detailing three capability dimensions—knowledge management, engineering, and algorithmic—along with eight sub‑areas such as knowledge boundaries, quality, permissions, multimodal handling, long‑context span, and complex reasoning.

AI assistants · Enterprise AI · RAG
0 likes · 18 min read
AntTech
Dec 11, 2024 · Artificial Intelligence

Ant Group’s Selected NeurIPS 2024 Papers: Summaries and Highlights

This article presents a curated overview of fifteen Ant Group research papers accepted at NeurIPS 2024, covering topics such as large language models, knowledge graphs, recommendation systems, privacy-preserving inference, and multimodal learning, with abstracts, paper types, links, and key contributions highlighted.

Ant Group · Artificial Intelligence · NeurIPS2024
0 likes · 32 min read
DevOps
Dec 10, 2024 · Artificial Intelligence

Key Generative AI Trends to Watch in 2024

The article outlines the major 2024 generative AI trends—including realistic expectations, multimodal models, smaller open‑source LLMs, GPU shortages, easier model optimization, custom local pipelines, stronger virtual agents, regulatory and ethical challenges, and the rise of shadow AI—while explaining their technical and business implications.

AI Governance · large language models
0 likes · 17 min read
AntTech
Dec 10, 2024 · Artificial Intelligence

Three Representative Ant Group Papers at NeurIPS 2024

Ant Group will showcase three flagship papers at NeurIPS 2024—AMOR for adaptable modular knowledge agents, PaRO for efficient data‑parallel training of large language models, and LLMDFA for code data‑flow analysis using LLMs—highlighting novel methods, experimental results, and upcoming live discussions.

Ant Group · Artificial Intelligence · Dataflow Analysis
0 likes · 5 min read
AsiaInfo Technology: New Tech Exploration
Dec 9, 2024 · Artificial Intelligence

How Programming Large Models Transform Repository‑Level Code Completion

This article examines how programming large models combined with code knowledge graphs can overcome the limited context of traditional code‑completion tools, detailing key techniques, trigger strategies, context acquisition methods, model fine‑tuning practices, current challenges, and future research directions for intelligent, repository‑wide code suggestions.

AI programming · code completion · knowledge graph
0 likes · 14 min read
JD Retail Technology
Dec 9, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical‑Based and Semantic‑ID Approaches

This article presents a comprehensive study of generative retrieval in large‑scale e‑commerce search, detailing lexical‑based and Semantic‑ID‑based methods, their challenges such as long‑tail distribution and token length, experimental evaluations, the discovered "sandglass" effect, and proposed solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval
0 likes · 20 min read
ZhongAn Tech Team
Dec 8, 2024 · Artificial Intelligence

Weekly AI Digest Issue 5: Voice Interaction Trends, End‑to‑End vs. Chain Integration, and Enterprise Solutions

This issue examines the growing importance of voice interaction in AI, highlights Justin Uberti’s move to OpenAI and the launch of GPT‑4o, compares end‑to‑end large‑model and chain‑integration approaches, and offers practical enterprise deployment scenarios for both weak and strong voice‑based interactions.

AI · Chain Integration · End-to-End
0 likes · 14 min read
Fighter's World
Dec 7, 2024 · Artificial Intelligence

Does Scaling Law Still Hold? Analyzing OpenAI’s 12‑Day Mini Releases and the Future of GPT‑5

The article examines OpenAI’s 12‑day mini‑series, the emergence of o1 and Reinforcement Fine‑Tuning, and uses Epoch AI’s 2024 report to evaluate four critical constraints—power, chip capacity, data scarcity, and latency—that determine whether AI scaling laws can sustain the compute needed for a GPT‑5‑scale model by 2030.

AI scaling · Latency · chip manufacturing
0 likes · 11 min read
Baobao Algorithm Notes
Dec 7, 2024 · Artificial Intelligence

What Is Reinforcement Fine-Tuning (RFT) and How Does It Supercharge LLMs?

Reinforcement Fine-Tuning (RFT) combines supervised fine‑tuning with reinforcement learning to teach large language models to reason more effectively, using separate training and validation datasets, graders, and PPO optimization, and has shown superior performance on tasks like gene prediction and math reasoning compared to standard SFT.

AI · Reinforcement Learning · large language models
0 likes · 8 min read
NewBeeNLP
Dec 2, 2024 · Artificial Intelligence

What Are Today’s Unified Generation-and-Understanding Multimodal Model Architectures?

This article surveys current unified generation-and-understanding multimodal large-model architectures, compares LLM-centric and LLM-plus-diffusion designs, extracts common insights, details large-scale training tricks from models like Emu3, Chameleon and Janus, and outlines open research directions for visual encoders.

Multimodal · diffusion · large language models
0 likes · 5 min read
AntTech
Nov 29, 2024 · Artificial Intelligence

AI Industry Trends in 2024: From Global Slowdown to Chinese Market Acceleration

In 2024, despite a global slowdown in generative AI hype, China's AI market accelerates with rapid application deployments, emerging industries like embodied intelligence and autonomous driving, and a maturing ecosystem that shifts AI from hype to tangible industrial impact.

Artificial Intelligence · China · Digital Transformation
0 likes · 11 min read