Tagged articles

Large Language Models

1206 articles · Page 9 of 13

Feb 19, 2025 · Artificial Intelligence

Large Models: Concepts, Principles, Classifications and Applications

This report provides a comprehensive overview of large-scale AI models, explaining their definition, massive parameter and data requirements, underlying transformer architecture, classification into language, vision and multimodal models, notable examples such as DeepSeek, and a survey of popular AIGC tools and practical use cases.

AIGC tools · Large Language Models · Multimodal AI

0 likes · 9 min read

Architects' Tech Alliance

Feb 19, 2025 · Industry Insights

Why DeepSeek One‑Stop AI Machines Are Redefining Private Model Deployment

The surge in demand for private AI deployment has prompted multiple vendors to launch DeepSeek one‑stop machines—integrated hardware solutions that support the full DeepSeek model family, offering higher stability, easier setup, customization, cost savings, and data security across diverse industry scenarios.

AI Infrastructure · AI hardware · DeepSeek

0 likes · 7 min read

Tencent Cloud Developer

Feb 19, 2025 · Industry Insights

Why Every Enterprise Needs a Knowledge‑Management System in the LLM Era

The article analyzes how the shift from data‑driven to knowledge‑driven operations, powered by large language models like DeepSeek, forces companies to build dynamic knowledge‑management platforms that integrate personal and corporate knowledge, improve efficiency, and create sustainable competitive advantage.

DeepSeek · Enterprise AI · Knowledge Management

0 likes · 14 min read

Architects' Tech Alliance

Feb 18, 2025 · Artificial Intelligence

How DeepSeek’s Latest Models Redefine AI Performance and Industry Adoption

The DeepSeek report details rapid model releases from 2024 onward, highlighting innovations such as model distillation, a 671 B MoE architecture, FP8 mixed‑precision, and the Janus‑Pro multimodal framework, while also documenting major cloud and chip providers' integration of these models into their services.

AI industry adoption · DeepSeek · Large Language Models

0 likes · 10 min read

Mingyi World Elasticsearch

Feb 18, 2025 · Artificial Intelligence

Master Prompt Engineering for DeepSeek and ChatGPT‑4o: Essential Techniques

This guide explains the fundamentals of prompt engineering for large language models such as DeepSeek and ChatGPT‑4o, illustrating clear‑prompt design, giving models time to think, chaining prompts, iterative refinement, and advanced tricks with concrete good and bad examples.

AI · ChatGPT-4o · DeepSeek

0 likes · 12 min read

Software Engineering 3.0 Era

Feb 18, 2025 · Artificial Intelligence

Deep Dive into Grok 3: How the New Reasoning Model Beats OpenAI o3-mini and DeepSeek R1

The article examines xAI's newly released Grok 3, detailing its chain‑of‑thought reasoning, synthetic‑data training, benchmark dominance over rivals like DeepSeek V3 and GPT‑4o, internal controversy, massive GPU investment, pricing, and its broader impact on the competitive AI landscape.

AI benchmarking · Chain-of-Thought · Grok 3

0 likes · 9 min read

DataFunTalk

Feb 18, 2025 · Artificial Intelligence

CODEI/O: Leveraging Code to Train Large Language Models for Enhanced Reasoning

The DeepSeek team introduced CODEI/O, a massive dataset that converts code into natural‑language reasoning chains, and demonstrated that training large language models on this data markedly improves their performance on diverse inference tasks, including non‑code domains, through a two‑stage training strategy.

CODEI/O · Large Language Models · code reasoning

0 likes · 8 min read

Cognitive Technology Team

Feb 18, 2025 · Artificial Intelligence

Two Major Bottlenecks in Deploying Large Language Models: Machine Deception and Hallucination

Deploying large language models faces two critical challenges that undermine trust: machine deception, where AI generates plausible yet false content, and machine hallucination, where outputs are logically coherent but factually inaccurate. The article outlines their causes, impacts, and technical, ethical, and regulatory mitigation strategies.

Artificial Intelligence · Hallucination · Large Language Models

0 likes · 6 min read

Architect's Alchemy Furnace

Feb 17, 2025 · Artificial Intelligence

24 Proven Prompt Formulas to Unlock DeepSeek’s Full Potential

Discover a comprehensive collection of 24 structured prompting techniques—from basic role‑play formulas to advanced cross‑disciplinary and managerial frameworks—designed to help users of DeepSeek and other large language models craft precise, high‑impact queries that dramatically improve response quality and efficiency.

AI prompting · DeepSeek · Large Language Models

0 likes · 12 min read

Baobao Algorithm Notes

Feb 17, 2025 · Artificial Intelligence

Can TransMLA Turn GQA into a More Powerful MLA? A Deep Dive into DeepSeek Models

This article presents a theoretical and experimental analysis of converting Grouped Query Attention (GQA) models to Multi‑Head Latent Attention (MLA) using the TransMLA method, demonstrating superior expressiveness and performance on DeepSeek‑based large language models while keeping KV‑Cache costs unchanged.

DeepSeek · Large Language Models · MLA

0 likes · 11 min read

Java Architecture Diary

Feb 17, 2025 · Artificial Intelligence

What Is LLMs.txt? The New AI‑Friendly Web Standard Explained

LLMs.txt is a lightweight, AI‑optimized web standard that provides concise Markdown navigation files for large language models, addressing context limits, redundant content, and lack of structure, and is already adopted by companies like Mintlify, Anthropic, and Cursor.

AI indexing · AI standards · Large Language Models

0 likes · 6 min read

Fun with Large Models

Feb 16, 2025 · Artificial Intelligence

Can You Claim to Know Large Models? Guide to Distillation, Quantization & Fine‑Tuning

This article explains why the massive DeepSeek V3/R1 model (671 B parameters) is hard to deploy and introduces three key techniques—model distillation, quantization, and fine‑tuning—that can shrink, accelerate, or specialize large models, while outlining their trade‑offs and practical steps.

AI model compression · DeepSeek · Large Language Models

0 likes · 10 min read

Architects' Tech Alliance

Feb 16, 2025 · Artificial Intelligence

How DeepSeek’s Distillation Breaks Bottlenecks and Boosts Multimodal AI Performance

This article provides an in‑depth technical analysis of DeepSeek’s model distillation technology, covering its core principles, innovative data‑model fusion strategies, architecture design, training optimizations, performance benchmarks, and the remaining challenges of scaling distillation to multimodal tasks.

DeepSeek · Large Language Models · ai-optimization

0 likes · 16 min read

Lao Guo's Learning Space

Feb 15, 2025 · Artificial Intelligence

What Is deepseek-MoE? Understanding the Mixture‑of‑Experts Architecture

The article explains deepseek-MoE (Mixture of Experts), describing its full English name, Chinese translation, how a gating network selects and weights multiple expert models for each input, and uses an analogy to illustrate load‑balancing and the divide‑and‑conquer design in large AI models.

AI Architecture · Large Language Models · Mixture of Experts

0 likes · 2 min read

Ops Development & AI Practice

Feb 14, 2025 · Artificial Intelligence

Large Model Format Showdown: Hugging Face, TensorFlow, ONNX, TorchScript, GGUF

This comprehensive guide examines the leading large‑model storage formats—including Hugging Face Transformers, TensorFlow SavedModel, ONNX, TorchScript, and GGUF—detailing their file structures, serialization methods, strengths, weaknesses, and typical use‑cases, helping developers and researchers select the optimal format for their specific AI workloads.

AI Deployment · GGUF · Large Language Models

0 likes · 21 min read

DataFunSummit

Feb 14, 2025 · Artificial Intelligence

Building Large‑Scale Recommendation Systems with Big Data and Large Language Models on Alibaba Cloud AI Platform

This presentation details how Alibaba Cloud's AI platform integrates big‑data pipelines, feature‑store services, and large language model capabilities to construct high‑performance search‑recommendation architectures, covering system design, training and inference optimizations, LLM‑driven use cases, and open‑source RAG tooling.

AI platform · Big Data · Feature Store

0 likes · 17 min read

Top Architect

Feb 14, 2025 · Artificial Intelligence

DeepSeek Model Distillation: Principles, Innovations, Architecture, and Performance

This article provides an in‑depth overview of DeepSeek’s model distillation technology, covering its definition, core principles, innovative data‑model distillation integration, architecture design, training strategies, performance gains, and the challenges of scaling to multimodal data.

DeepSeek · Knowledge Transfer · Large Language Models

0 likes · 16 min read

Code Mala Tang

Feb 13, 2025 · Artificial Intelligence

Why Apple Chose Alibaba: Inside the AI Partnership Reshaping China’s Smartphone Market

Apple’s steep sales decline in China has driven it to partner with Alibaba’s Qwen AI platform, a move that blends cutting‑edge large‑model technology, cloud scalability, and local compliance to revive iPhone market share and showcase China’s rising AI prowess.

AI partnership · Alibaba · Apple

0 likes · 11 min read

Ma Wei Says

Feb 13, 2025 · Artificial Intelligence

Master AI Prompting: 5 Proven Techniques to Unlock Accurate Outputs

This guide presents five practical prompting techniques—including structured output, role‑playing, visual conversion, multi‑turn refinement, and multilingual handling—plus industry‑specific examples and common pitfalls, helping users craft precise commands for AI models like DeepSeek.

AI prompting · Large Language Models · Prompt Engineering

0 likes · 8 min read

Architect

Feb 12, 2025 · Artificial Intelligence

Can S‑Curve Theory Explain the Limits of Large‑Model Scaling Laws?

The article analyses how S‑shaped growth curves can model the apparent scaling laws of large language models, discusses the three phases of model development, proposes an ability‑density hypothesis, and explores future scenarios where scaling laws may plateau or shift.

AI growth · Ability Density · Large Language Models

0 likes · 16 min read

Architect

Feb 12, 2025 · Artificial Intelligence

Master Prompt Engineering: A Universal Framework for LLMs

This article presents a comprehensive, step‑by‑step Prompt engineering framework—including role definition, problem description, goal setting, and requirement specification—augmented with techniques such as RAG, few‑shot examples, memory handling, and parameter tuning, enabling users to craft effective prompts for large language models across domains.

AI Prompt Optimization · Few-shot · Large Language Models

0 likes · 27 min read

AIWalker

Feb 11, 2025 · Artificial Intelligence

LLMDet: LLM‑Powered Open‑Vocabulary Detector Beats Grounding DINO

LLMDet introduces a novel training pipeline that leverages large language models to generate detailed image‑level captions and region‑level phrases, fine‑tunes an open‑vocabulary detector with the GroundingCap‑1M dataset, and achieves state‑of‑the‑art zero‑shot performance surpassing Grounding DINO across multiple benchmarks.

GroundingCap · LLMDet · Large Language Models

0 likes · 20 min read

DataFunTalk

Feb 11, 2025 · Artificial Intelligence

Roundtable on Enhancing Large Model Effectiveness: RAG, Tool Use, and Knowledge Engineering

Experts from Dipu, Ant Financial, iKang, and Zhihu discuss practical strategies for improving large model performance, covering RAG, tool use, offline knowledge engineering, multimodal training, evaluation metrics, and future trends, while sharing case studies from manufacturing, healthcare, retail, and consumer‑facing applications.

Knowledge Engineering · Large Language Models · RAG

0 likes · 9 min read

Cognitive Technology Team

Feb 10, 2025 · Artificial Intelligence

Survey of Major Chinese AI Large Language Models: Technologies, Innovations, and Comparative Evaluation

This report systematically reviews the key technologies, innovations, and performance of leading Chinese AI large language models—including DeepSeek, Kimi, and Qwen2.5—detailing their architectures, training methods, multimodal capabilities, and comparative evaluations against each other and foreign models.

AI · China · Large Language Models

0 likes · 20 min read

AI Algorithm Path

Feb 10, 2025 · Artificial Intelligence

Understanding DualPipe: DeepDive into DeepSeek‑R1 Architecture (Part 5)

This article explains how the DualPipe scheduling mechanism in DeepSeek‑R1 improves GPU cluster compute‑communication efficiency by using fine‑grained pipeline stages and bidirectional data flow, comparing it with Zero Bubble pipeline parallelism and discussing the challenges of large‑scale distributed training.

DeepSeek · DualPipe · Large Language Models

0 likes · 10 min read

IT Architects Alliance

Feb 10, 2025 · Artificial Intelligence

DeepSeek Distillation Technology: Principles, Innovations, Performance, and Future Outlook

The article explains DeepSeek's model distillation technique, covering its fundamental knowledge‑transfer principles, unique innovations such as data‑model fusion and task‑specific strategies, impressive benchmark results, practical applications in edge and online inference, existing challenges, and future research directions.

Knowledge Transfer · Large Language Models · ai-optimization

0 likes · 15 min read

Software Engineering 3.0 Era

Feb 10, 2025 · Industry Insights

Can China’s Homegrown LLMs Compete After DeepSeek’s Open‑Source Disruption?

The open‑source release of DeepSeek R1 under an MIT license has reshaped the large‑model market, driving cost cuts, prompting rapid responses from global rivals and Chinese cloud providers, and forcing domestic AI firms to rethink differentiation and ecosystem strategies to stay competitive.

AI Ecosystem · Chinese AI · DeepSeek

0 likes · 11 min read

Baidu Geek Talk

Feb 10, 2025 · Artificial Intelligence

How Baidu Cloud Slashes Inference Costs: DeepSeek Model Optimizations Unveiled

Baidu Cloud's Qianfan platform launched DeepSeek‑R1 and DeepSeek‑V3 with ultra‑low inference pricing, leveraging advanced engine performance tweaks, a split Prefill/Decode architecture, and comprehensive security measures that together boost throughput, cut costs, and ensure enterprise‑grade reliability.

AI inference · Baidu Cloud · Large Language Models

0 likes · 5 min read

Architects' Tech Alliance

Feb 10, 2025 · Artificial Intelligence

Why DeepSeek Is Disrupting the Global AI Landscape: Tech, Cost, and Open‑Source Edge

DeepSeek, a Chinese AI startup, has rapidly risen to global prominence by releasing high‑performance large language models such as V2, V3, and R1, which combine innovative architectures, dramatically lower training costs, and an open‑source strategy that challenges established AI giants and reshapes industry dynamics.

Artificial Intelligence · China AI · DeepSeek

0 likes · 14 min read

Open Source Linux

Feb 10, 2025 · Artificial Intelligence

How DeepSeek R1 Uses Large‑Scale Reinforcement Learning to Replicate OpenAI o1

This article examines DeepSeek R1’s large‑scale reinforcement‑learning approach, its training pipeline that combines rule‑based scaling and deep‑reasoning SFT data, and why its open‑source, low‑cost replication of OpenAI o1 marks a pivotal step toward more efficient, democratized AI models.

AI efficiency · DeepSeek · Large Language Models

0 likes · 18 min read

DevOps

Feb 9, 2025 · Artificial Intelligence

DeepSeek’s Impact on the Large Model Ecosystem and the Resurgence of AI PCs

The article examines DeepSeek’s rapid rise, its open‑source R1 model and distilled variants, the resurgence of AI PCs, hardware support from Nvidia, AMD and others, and how this ecosystem is reshaping personal AI experiences and the broader large‑model landscape.

AI PC · DeepSeek · Hardware

0 likes · 11 min read

AI Algorithm Path

Feb 9, 2025 · Artificial Intelligence

Understanding Multi-Token Prediction in DeepSeek‑R1 Architecture

This article dissects the Multi‑Token Prediction (MTP) technique used in DeepSeek‑R1, contrasting it with traditional next‑token prediction, detailing Meta’s MTP design, DeepSeek’s adapted architecture, loss weighting, and why MTP is applied only during training to boost efficiency and model capability.

DeepSeek · Large Language Models · MTP

0 likes · 9 min read

Architect

Feb 9, 2025 · Artificial Intelligence

How DeepSeek’s Model Distillation Boosts AI Efficiency and Performance

This article provides an in‑depth analysis of DeepSeek’s model distillation technology, covering its definition, core principles, innovative strategies, architecture design, training optimizations, benchmark results, efficiency gains, and the remaining challenges of applying distillation to large language models and multimodal data.

AI efficiency · DeepSeek · Knowledge Transfer

0 likes · 16 min read

Architects' Tech Alliance

Feb 9, 2025 · Artificial Intelligence

How DeepSeek R1 Replicates OpenAI o1 Using Large‑Scale Reinforcement Learning

The article provides an in‑depth technical analysis of DeepSeek R1, explaining how it reproduces OpenAI o1's reasoning abilities through rule‑based large‑scale reinforcement learning, mixed SFT data, and efficient scaling, while discussing its broader impact on AI model development and capability density trends.

AI industry · Capability Density · DeepSeek

0 likes · 19 min read

AI2ML AI to Machine Learning

Feb 8, 2025 · Artificial Intelligence

Analyzing DeepSeek R1 Inference Projects: Source Code, Cold‑Start, and Scaling Techniques

This article examines DeepSeek R1’s three breakthroughs, its low‑cost optimizations that bypass CUDA, and the resulting impact on the AI ecosystem, then provides a detailed technical review of seven open‑source reproductions, including Open‑R1, Tiny‑Zero, SimpleScaling‑S1, and simpleRL‑reason, covering their architectures, reinforcement‑learning pipelines, and code implementations.

DeepSeek · Inference Scaling · Large Language Models

0 likes · 10 min read

Huawei Cloud Developer Alliance

Feb 8, 2025 · Artificial Intelligence

Why DeepSeek V3 and R1 Are Redefining Low‑Cost AI: Architecture, Training Tricks, and Industry Impact

This article analyses DeepSeek's V3 and R1 models, explaining how their innovative MoE architecture, Multi‑Head Latent Attention, low‑cost training strategies, and distributed‑training optimizations deliver high‑performance large language models while reducing GPU/NPU demand and sparking industry excitement.

AI inference · DeepSeek · Large Language Models

0 likes · 16 min read

IT Services Circle

Feb 7, 2025 · Artificial Intelligence

Building Low‑Cost AI Clusters with Old Phones Using Exo and Open WebUI

This article introduces Exo, an open‑source platform that lets you turn idle smartphones, tablets, and laptops into a distributed AI cluster capable of running large language models, and shows how Open WebUI provides a user‑friendly interface for deploying private AI assistants.

AI clustering · Distributed Inference · Exo

0 likes · 6 min read

Java Captain

Feb 7, 2025 · Artificial Intelligence

DeepSeek: Disruptive Innovations in Large Language Model Architecture, Efficiency, and Ecosystem

DeepSeek reshapes the AI landscape by replacing brute‑force compute scaling with algorithmic breakthroughs such as a novel MoE architecture, memory compression, active‑learning data pipelines, and open‑source tooling, delivering dramatically lower training and inference costs while enabling edge deployment and a vibrant developer ecosystem.

Algorithmic Efficiency · DeepSeek · Edge deployment

0 likes · 11 min read

Tencent Cloud Developer

Feb 6, 2025 · Artificial Intelligence

DeepSeek V Series: Technical Overview of Scaling Laws, Grouped Query Attention, and Mixture‑of‑Experts

The article reviews DeepSeek’s V‑series papers, explaining how scaling‑law insights, Grouped Query Attention, a depth‑first design, auxiliary‑loss‑free load balancing, multi‑token prediction and Multi‑Head Latent Attention together enable economical mixture‑of‑experts LLMs that rival closed‑source models while cutting compute and hardware costs.

DeepSeek · Grouped Query Attention · Large Language Models

0 likes · 13 min read

Alibaba Cloud Developer

Feb 5, 2025 · Artificial Intelligence

10 Common Prompt Engineering Mistakes and How to Overcome Them

This article lists ten common misconceptions about prompt engineering, explains why each is flawed, and offers practical insights and strategies—such as using the CO‑STAR framework, tailoring prompts to specific models, keeping prompts concise, and continuously testing and refining—to help readers communicate effectively with large language models.

AI misconceptions · LLM · Large Language Models

0 likes · 10 min read

Architect

Feb 3, 2025 · Artificial Intelligence

How DeepSeek‑R1 Uses Pure Reinforcement Learning to Match OpenAI’s o1

This article presents DeepSeek‑R1 and DeepSeek‑R1‑Zero, two next‑generation LLMs trained with pure reinforcement learning and multi‑stage fine‑tuning, details their GRPO training framework, model‑distillation pipeline, open‑source release, and evaluation results that rival OpenAI’s o1‑1217 across reasoning, knowledge, and coding benchmarks.

DeepSeek · LLM evaluation · Large Language Models

0 likes · 10 min read

Cognitive Technology Team

Feb 3, 2025 · Artificial Intelligence

DeepSeek R1 Introduces Group‑Related Policy Optimization for Advanced Reasoning in Large Language Models

DeepSeek AI’s new open‑source model DeepSeek‑R1 leverages a novel Group Relative Policy Optimization (GRPO) reinforcement‑learning framework and multi‑stage training to dramatically boost complex reasoning performance, achieving AIME 2024 Pass@1 scores comparable to OpenAI’s o1 model.

AI · DeepSeek · GRPO

0 likes · 4 min read

DataFunSummit

Jan 31, 2025 · Artificial Intelligence

LLMOps: Building a Prompt‑Driven Engine for AI Operations

This article presents the concept of LLMOps—applying large language models to AIOps—by analyzing prompt challenges, introducing the LogPrompt engine for log analysis, describing a prompt‑learning data flywheel with CoachLM optimization, reporting experimental results, and outlining future multi‑modal directions.

AIOps · CoachLM · Data Flywheel

0 likes · 16 min read

Alibaba Cloud Native

Jan 27, 2025 · Frontend Development

How Large Language Models Can Supercharge Frontend Development: Practical Insights

This article explores how large language models can be leveraged to automate and accelerate frontend development tasks, covering prompt engineering, repo‑level code generation, quality factors, hallucination mitigation, knowledge‑base integration, and practical strategies for improving developer productivity.

AI · Frontend · Knowledge Base

0 likes · 22 min read

JD Cloud Developers

Jan 26, 2025 · Operations

How Large Language Models are Transforming Modern IT Operations

This article traces the evolution of IT operations from manual tasks to automation, AIOps, and ChatOps, and explains how large language models boost efficiency, enable intelligent assistants, automated diagnosis, and smart log analysis for more reliable, automated Ops workflows.

AIOps · ChatOps · Large Language Models

0 likes · 7 min read

Software Engineering 3.0 Era

Jan 22, 2025 · Artificial Intelligence

When Will China Overtake the US in Large‑Model AI? A Technical Comparison

The article analyzes the US‑China large‑model race, detailing algorithmic and architectural strengths of OpenAI, Google and Microsoft versus Chinese innovations like Doubao 1.5, MiniMax‑01 and Vidu, and projects a timeline from 2025 to 2033 for China to close the gap.

AI competition · China · Large Language Models

0 likes · 12 min read

ByteDance Web Infra

Jan 22, 2025 · Artificial Intelligence

Introducing UI‑TARS: A Native GUI Agent Model Integrated with Midscene.js for Multimodal UI Automation

The article presents UI‑TARS, a native GUI‑agent model that combines multimodal large‑language models with the open‑source Midscene.js framework to enable more accurate, token‑efficient, and privacy‑preserving UI automation, while discussing its architecture, advantages, limitations, and integration steps.

GUI Agent · Large Language Models · Midscene.js

0 likes · 11 min read

Bilibili Tech

Jan 21, 2025 · Artificial Intelligence

Accelerating Large Model Inference: Challenges and Multi‑Level Optimization Strategies

The article outlines how exploding LLM sizes create compute, memory, and latency bottlenecks and proposes a full‑stack solution to boost inference throughput and reduce latency: operator fusion, high‑performance libraries, quantization, speculative decoding, sharding, continuous batching, PagedAttention, and specialized frameworks like MindIE‑LLM. It also highlights future ultra‑low‑bit and heterogeneous hardware directions.

Continuous Batching · Large Language Models · Multi-modal

0 likes · 21 min read

Software Engineering 3.0 Era

Jan 18, 2025 · Industry Insights

Is AI Self‑Programming and Recursive Self‑Improvement Signaling the Endgame?

The article examines Nvidia’s claim that AI can now write software and build an “AI factory,” analyzes OpenAI’s emerging o‑series models that purportedly achieve recursive self‑improvement, and surveys community reactions ranging from excitement to safety concerns about a potential AI “game over.”

AI safety · Industry Analysis · Large Language Models

0 likes · 8 min read

Fighter's World

Jan 10, 2025 · Artificial Intelligence

How to Escape the Demo Dilemma: A Three‑Stage Leap for B2B Large‑Model Deployment

The article analyzes why B2B large‑model projects often stall at demo, prototype, or POC stages and proposes a three‑level value‑lift framework—model domain intelligence, business‑process smart density, and pervasive seamless interaction—to turn demos into real‑world impact.

AI value ladder · AI-native · B2B AI

0 likes · 13 min read

Baidu Tech Salon

Jan 8, 2025 · Artificial Intelligence

Evolution of Video Search Ranking Architecture Toward an End‑to‑End Large‑Model Framework

The paper describes transforming a tightly coupled, multi‑stage video search ranking pipeline into a modular, end‑to‑end large‑model architecture that decouples recall, employs a graph‑engine parallel framework and elastic compute allocation, thereby boosting performance, flexibility, personalization and lowering long‑term operational costs.

End-to-End · Large Language Models · System Optimization

0 likes · 10 min read

ZhongAn Tech Team

Jan 5, 2025 · Artificial Intelligence

Weekly AI Roundup Issue 9: OpenAI Vision, LeCun Interview, ByteDance HLLM, and DeepSeek‑V3 Highlights

This issue presents a curated overview of recent AI developments, including Sam Altman's 2025 technology vision poll, LeCun's interview on future AI directions, ByteDance's hierarchical large language model for recommendation, and the performance and cost advantages of the open‑source DeepSeek‑V3 model.

AI · ByteDance · DeepSeek

0 likes · 10 min read

DataFunTalk

Jan 1, 2025 · Artificial Intelligence

Applying Large Language Models to Financial Risk Control at Akulaku

This article details Akulaku’s deployment of large language models across multimodal financial risk‑control scenarios—covering business background, a three‑module intelligent‑agent architecture, concrete tool‑ and planning‑enhancement case studies, and future outlook—demonstrating how LLMs boost efficiency, reduce labeling effort, and enable copilot‑style assistance.

KYC verification · Large Language Models · Multimodal AI

0 likes · 15 min read

DataFunSummit

Dec 31, 2024 · Artificial Intelligence

How Momo Leverages Large Model Technology to Transform Business and R&D Processes

This article explains how Momo utilizes large language model technologies to revamp its AI application paradigm, achieve efficient inference through quantization and prefix caching, build a workflow‑based model platform, and outline future plans for framework optimization and multimodal support.

AI platform · Inference Optimization · Large Language Models

0 likes · 16 min read


Xiaohongshu Tech REDtech

Dec 26, 2024 · Artificial Intelligence

Instruction Embedding: Latent Representations of Instructions for Task Identification

The paper introduces Instruction Embedding—a task‑focused text representation learned on the new Instruction Embedding Benchmark—and shows that Prompt‑based Instruction Embedding (PIE) outperforms standard embeddings in clustering, similarity, and downstream tasks such as data selection, in‑context example retrieval, test‑set compression, and task‑correlation analysis.

Large Language Models · contrastive learning · fine‑tuning

0 likes · 15 min read


DeWu Technology

Dec 25, 2024 · Artificial Intelligence

AI-Powered Intelligent Coding: Product Evolution, Technical Advances, and Future Outlook

AI‑powered coding tools, from JetBrains’ free IDEs to VS Code‑based tools like Cursor and end‑to‑end web platforms, are evolving rapidly, offering code continuation, AI‑driven Q&A, multi‑file editing, and chat interfaces. Advances in context handling, caching, LLM fine‑tuning, and speculative decoding promise faster, more integrated development workflows and a future where IDEs become chat‑centric assistants that streamline debugging, deployment, and junior developer support.

AI coding · IDE integration · Intelligent code completion

0 likes · 18 min read


Architects' Tech Alliance

Dec 23, 2024 · Artificial Intelligence

Why High‑Quality, Massive, Diverse Data Fuels AI Breakthroughs

The article explains how breakthroughs in artificial intelligence depend on high‑quality, large‑scale, and diverse training data, outlines the data‑centric AI movement, details a six‑step workflow for building datasets, and surveys the data industry ecosystem supporting large language model development.

AI data · Annotation · Data Quality

0 likes · 7 min read


Fighter's World

Dec 21, 2024 · Artificial Intelligence

Is Pre‑training Coming to an End? Evaluating Data Sufficiency

The article examines Ilya Sutskever’s claim that pre‑training will end, argues that scaling laws still hold and data is not yet a bottleneck, highlights the scarcity of high‑quality frontier data, and explains why the industry is shifting toward inference‑time compute (o1) as a more sustainable path for large language models.

AI trends · Inference‑time Compute · Large Language Models

0 likes · 13 min read


Data Thinking Notes

Dec 18, 2024 · Artificial Intelligence

Mastering Prompt Engineering: Advanced Techniques from OpenAI, Anthropic, and Google

This article provides a comprehensive guide to modern prompt engineering, covering foundational principles, detailed techniques such as role‑playing, delimiters, step‑by‑step instructions, and advanced strategies like chain‑of‑thought, reflection, and external tool integration, with real‑world examples from major AI providers and a practical Img2Code case study.

AI best practices · LLM Development · Large Language Models

0 likes · 24 min read


Baidu Geek Talk

Dec 16, 2024 · Artificial Intelligence

AIAPI: Baidu's AI-Native Retrieval System for Large Language Model Applications

AIAPI, Baidu’s AI‑native retrieval platform for large language models, tackles hallucination, slow domain updates, and output opacity by delivering authoritative, timely, full‑content data. Its dual‑channel architecture combines traditional search and RAG, and it employs reusable ranking, graph‑enhanced data layers, dynamic caching that cuts storage by 70%, and QueryPlan‑based QoS, achieving markedly higher retrieval quality and a 34% speed gain with Wenxin 4.0.

AI-Native Systems · AIAPI · Large Language Models

0 likes · 12 min read


JD Tech

Dec 14, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical and Semantic ID Approaches

This article presents a comprehensive study of generative retrieval for large‑scale e‑commerce search, comparing lexical‑based and Semantic‑ID‑based methods, introducing a Query‑to‑MultiSpan framework, analyzing the sand‑glass distribution problem in residual quantization, and proposing heuristic and adaptive solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval

0 likes · 20 min read


Alibaba Cloud Big Data AI Platform

Dec 12, 2024 · Artificial Intelligence

How PertEval Reveals the Real Knowledge Limits of Large Language Models

At NeurIPS 2024, Alibaba Cloud's PAI team presented the Spotlight paper PertEval, which introduces knowledge‑invariant perturbations to expose the true knowledge capacity of LLMs, critiques over‑optimistic static benchmarks, and showcases responsible AI solutions and platform demos for enterprise use.

Alibaba Cloud · Evaluation · Large Language Models

0 likes · 6 min read


Tencent Tech

Dec 11, 2024 · Artificial Intelligence

Inside Tencent LeYong AI: Solving Enterprise RAG with Knowledge, Engineering & Algorithms

This article explores how Tencent's LeYong AI assistant leverages Retrieval‑Augmented Generation to empower enterprise knowledge retrieval, detailing three capability dimensions—knowledge management, engineering, and algorithmic—along with eight sub‑areas such as knowledge boundaries, quality, permissions, multimodal handling, long‑context span, and complex reasoning.

AI assistants · Enterprise AI · Knowledge Management

0 likes · 18 min read


AntTech

Dec 11, 2024 · Artificial Intelligence

Ant Group’s Selected NeurIPS 2024 Papers: Summaries and Highlights

This article presents a curated overview of fifteen Ant Group research papers accepted at NeurIPS 2024, covering topics such as large language models, knowledge graphs, recommendation systems, privacy-preserving inference, and multimodal learning, with abstracts, paper types, links, and key contributions highlighted.

Ant Group · Artificial Intelligence · Large Language Models

0 likes · 32 min read


DevOps

Dec 10, 2024 · Artificial Intelligence

Key Generative AI Trends to Watch in 2024

The article outlines the major 2024 generative AI trends—including realistic expectations, multimodal models, smaller open‑source LLMs, GPU shortages, easier model optimization, custom local pipelines, stronger virtual agents, regulatory and ethical challenges, and the rise of shadow AI—while explaining their technical and business implications.

AI Governance · Large Language Models

0 likes · 17 min read


AntTech

Dec 10, 2024 · Artificial Intelligence

Three Representative Ant Group Papers at NeurIPS 2024

Ant Group will showcase three flagship papers at NeurIPS 2024—AMOR for adaptable modular knowledge agents, PaRO for efficient data‑parallel training of large language models, and LLMDFA for code data‑flow analysis using LLMs—highlighting novel methods, experimental results, and upcoming live discussions.

Ant Group · Artificial Intelligence · Dataflow Analysis

0 likes · 5 min read


AsiaInfo Technology: New Tech Exploration

Dec 9, 2024 · Artificial Intelligence

How Programming Large Models Transform Repository‑Level Code Completion

This article examines how programming large models combined with code knowledge graphs can overcome the limited context of traditional code‑completion tools, detailing key techniques, trigger strategies, context acquisition methods, model fine‑tuning practices, current challenges, and future research directions for intelligent, repository‑wide code suggestions.

AI programming · Large Language Models · code completion

0 likes · 14 min read


JD Retail Technology

Dec 9, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical‑Based and Semantic‑ID Approaches

This article presents a comprehensive study of generative retrieval in large‑scale e‑commerce search, detailing lexical‑based and SemanticID‑based methods, their challenges such as long‑tail distribution and token length, experimental evaluations, the discovered "sandglass" effect, and proposed solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval

0 likes · 20 min read


ZhongAn Tech Team

Dec 8, 2024 · Artificial Intelligence

Weekly AI Digest Issue 5: Voice Interaction Trends, End‑to‑End vs. Chain Integration, and Enterprise Solutions

This issue examines the growing importance of voice interaction in AI, highlights Justin Uberti’s move to OpenAI and the launch of GPT‑4o, compares end‑to‑end large‑model and chain‑integration approaches, and offers practical enterprise deployment scenarios for both weak and strong voice‑based interactions.

AI · Chain Integration · End-to-End

0 likes · 14 min read


Fighter's World

Dec 7, 2024 · Artificial Intelligence

Does Scaling Law Still Hold? Analyzing OpenAI’s 12‑Day Mini Releases and the Future of GPT‑5

The article examines OpenAI’s 12‑day mini‑series, the emergence of o1 and Reinforcement Fine‑Tuning, and uses Epoch AI’s 2024 report to evaluate four critical constraints—power, chip capacity, data scarcity, and latency—that determine whether AI scaling laws can sustain the compute needed for a GPT‑5‑scale model by 2030.

AI scaling · Data Scarcity · Large Language Models

0 likes · 11 min read


Baobao Algorithm Notes

Dec 7, 2024 · Artificial Intelligence

What Is Reinforcement Fine-Tuning (RFT) and How Does It Supercharge LLMs?

Reinforcement Fine-Tuning (RFT) combines supervised fine‑tuning with reinforcement learning to teach large language models to reason more effectively, using separate training and validation datasets, graders, and PPO optimization, and has shown superior performance on tasks like gene prediction and math reasoning compared to standard SFT.

AI · Large Language Models · Machine Learning

0 likes · 8 min read


NewBeeNLP

Dec 3, 2024 · Artificial Intelligence

Can LLMs Self‑Correct Their Answers? Exploring Reward Models, Loss Functions, and Training Dynamics

The article reflects on open‑source LLMs like Qwen2 and Llama 3.1, questioning whether models should self‑review answers, how hidden states might signal uncertainty, the role of loss‑function design, scaling laws, and the trade‑offs between PPO and DPO in alignment.

Large Language Models · Reward Model · Scaling Law

0 likes · 9 min read


NewBeeNLP

Dec 2, 2024 · Artificial Intelligence

What Are Today’s Unified Generation-and-Understanding Multimodal Model Architectures?

This article surveys current unified generation-and-understanding multimodal large-model architectures, compares LLM-centric and LLM-plus-diffusion designs, extracts common insights, details large-scale training tricks from models like Emu3, Chameleon and Janus, and outlines open research directions for visual encoders.

Large Language Models · diffusion · multimodal

0 likes · 5 min read


ZhongAn Tech Team

Dec 1, 2024 · Artificial Intelligence

AI Weekly Digest Issue 4: Market Insights, Industry Solutions, and Emerging Technologies

The fourth AI weekly newsletter reviews recent industry news—including Jensen Huang's robot era vision and Tesla's Optimus plans—introduces Claude's new style‑customization feature, explores AI‑enhanced input methods, and evaluates DeepSeek's R1‑Lite model performance on complex reasoning tasks.

AI · AI Applications · Claude

0 likes · 10 min read


AntTech

Nov 29, 2024 · Artificial Intelligence

AI Industry Trends in 2024: From Global Slowdown to Chinese Market Acceleration

In 2024, despite a global slowdown in generative AI hype, China's AI market accelerates with rapid application deployments, emerging industries like embodied intelligence and autonomous driving, and a maturing ecosystem that shifts AI from hype to tangible industrial impact.

Artificial Intelligence · China · Industry Trends

0 likes · 11 min read


AI Large Model Application Practice

Nov 29, 2024 · Artificial Intelligence

Understanding RAG: How Retrieval‑Augmented Generation Reduces Large‑Model Hallucinations

This article explains the hallucination problem of large language models, introduces Retrieval‑Augmented Generation (RAG) as a solution, compares RAG with model fine‑tuning, and outlines basic RAG architecture and workflow for practical applications.

Large Language Models · RAG · hallucination mitigation

0 likes · 10 min read


Ximalaya Technology Team

Nov 29, 2024 · Artificial Intelligence

Applying Large Language Models for AIGC Advertising: Content Generation, Multimodal Understanding, and Creative Optimization at Ximalaya

Ximalaya leverages large language models and AI‑generated content to automate ad creative production, multimodal semantic understanding, and creative selection, slashing image costs to 0.2 CNY, boosting CTR by up to 3.5%, improving revenue and eCPM by over 2%, and expanding material diversity fivefold.

AIGC · Large Language Models · creative optimization

0 likes · 21 min read


Alibaba Cloud Developer

Nov 28, 2024 · Artificial Intelligence

Mooncake: Open-Source KVCache-Centric Architecture Boosting Large-Model Inference

Mooncake, an open-source KVCache-centric inference architecture co-developed by Alibaba Cloud and Tsinghua University's MADSys lab, dramatically improves large-model throughput and reduces cost by decoupling resources, standardizing cache pooling, and integrating with frameworks like vLLM, sparking broad industry interest.

AI Infrastructure · KVCache · Large Language Models

0 likes · 4 min read


Kuaishou Large Model

Nov 22, 2024 · Artificial Intelligence

Boost LLM Training on Massive Clusters with DP/TP Overlap and Context Parallelism

This article details a comprehensive set of techniques—including data‑ and tensor‑parallel overlap, context‑parallelism, activation rematerialization, and a performance‑driven cost model—that dramatically improve large‑language‑model training efficiency on ultra‑large GPU clusters while preserving model quality.

Large Language Models · Performance Modeling · activation recomputation

0 likes · 28 min read


HyperAI Super Neural

Nov 20, 2024 · Artificial Intelligence

From Computer Vision to Medical AI: Prof. Xie's Work Hits Nature, NeurIPS, CVPR

Professor Xie's team at Shanghai Jiao Tong University reports rapid progress in AI for Science, detailing multimodal medical AI models, large open datasets, language and vision‑language models, and knowledge‑enhanced representations that outperform existing baselines across multiple benchmarks.

Large Language Models · Open Datasets · knowledge graphs

0 likes · 14 min read


DataFunSummit

Nov 18, 2024 · Artificial Intelligence

Intelligent Data Analysis: Agent Architecture Combined with Semantic Layer for Product Implementation

This article explores how large‑model technologies can address data analysis challenges by introducing an Agent‑based architecture integrated with a semantic layer, detailing design principles, optimization paths, technical implementation, real‑world retail case studies, product design considerations, and future directions for intelligent analytics.

AI · Business Intelligence · Large Language Models

0 likes · 22 min read


Alibaba Cloud Developer

Nov 18, 2024 · Artificial Intelligence

Solving Knowledge Challenges in Retrieval‑Augmented Generation: Practical Optimizations

This article shares a half‑year of hands‑on experience with Retrieval‑Augmented Generation, analyzing why simple RAG setups often feel unintelligent, identifying three core knowledge issues, and presenting concrete optimization strategies—including chunking, knowledge expansion, and tag‑based conflict resolution—to improve retrieval and generation performance in low‑resource environments.

AI · Information Retrieval · Large Language Models

0 likes · 25 min read


ZhongAn Tech Team

Nov 16, 2024 · Artificial Intelligence

Weekly AI Digest Issue 2: Video Generation, Large Models, AGI, and LoRA Fine‑Tuning

This weekly AI roundup discusses emerging video generation tools like PixelDance and Vidu 1.5, debates on the scaling limits of large models, AGI geopolitical considerations, and an MIT study comparing LoRA with full fine‑tuning for domain adaptation.

AGI · AI · Large Language Models

0 likes · 8 min read


NewBeeNLP

Nov 14, 2024 · Artificial Intelligence

What’s Trending in Recommendation Systems at KDD 2024? A Comprehensive Paper Overview

The 30th SIGKDD conference in Barcelona received 2,046 research paper submissions with a 20% acceptance rate. This article compiles the 59 recommendation‑system papers, covering large‑model recommenders, graph‑based methods, sequential models, fairness, privacy, advertising, debiasing, reinforcement learning and more, so researchers can explore the latest academic advances.

Graph Neural Networks · KDD2024 · Large Language Models

0 likes · 15 min read


Tencent Docs Tech Team

Nov 13, 2024 · Artificial Intelligence

Technical Architecture and Practices of the AI Document Assistant

This article explores the challenges large language models bring to efficiency tools, outlines the AI document assistant's technical thinking and architecture, and details both application‑side and model‑side practices such as retrieval‑augmented generation, intent recognition, and code‑driven table handling, concluding with key lessons.

AI · AI Architecture · Document Automation

0 likes · 16 min read


JD Tech Talk

Nov 11, 2024 · Artificial Intelligence

Prompt Engineering: Concepts, Evolution, Techniques, and a Logistics Application Case

This article explains what Prompt Engineering is, traces its development from early command‑based interactions to modern adaptive and multimodal prompting, details prompting techniques such as zero‑shot, few‑shot, Chain‑of‑Thought, and hallucination‑reduction methods, and demonstrates their practical use in a JD Logistics SKU piece‑type classification case with code examples.

AI prompting · Chain-of-Thought · LLM applications

0 likes · 26 min read


DataFunSummit

Nov 9, 2024 · Artificial Intelligence

GraphRAG: Using Graph Structures to Enhance Retrieval‑Augmented Generation – Challenges, Methods, and Product Deployments

This article introduces GraphRAG, explains the limitations of traditional RAG, outlines four major challenges (fine‑grained retrieval, global context, similarity vs relevance, and macro‑level reasoning), describes GraphRAG’s graph‑based retrieval strategies, showcases comparative experiments, and presents NebulaGraph’s GenAI Suite and RAG products along with future research directions.

AI · GraphRAG · Large Language Models

0 likes · 16 min read


Baobao Algorithm Notes

Nov 7, 2024 · Artificial Intelligence

Demystifying FlashAttention: A Minimalist Derivation of the Algorithm

This article presents a concise, step‑by‑step derivation of FlashAttention, explaining the prerequisite linear‑algebra concepts, the softmax simplifications, and the parallel computation workflow—including the LSE‑enhanced version—so readers can grasp the algorithm’s elegance without heavy mathematics.

Algorithm Derivation · Attention Mechanism · FlashAttention

0 likes · 8 min read


NewBeeNLP

Nov 7, 2024 · Artificial Intelligence

Tackling Large Model Hallucinations: Causes, Detection, and Mitigation Strategies

This article provides a comprehensive analysis of large language model hallucinations, detailing their definitions, classifications, root causes, detection techniques, and a wide range of mitigation approaches—including RAG pipelines, decoding strategies, and model‑enhancement methods—to improve reliability and safety in real‑world AI applications.

AI safety · Hallucination · Large Language Models

0 likes · 22 min read


DataFunSummit

Nov 6, 2024 · Artificial Intelligence

Applying AIGC to Transform Insurance Marketing at Ant Group

This article explains how Ant Group’s insurance marketing team leverages Artificial Intelligence‑generated content (AIGC) to create personalized marketing materials, automate recommendation workflows, and produce video scripts, thereby improving efficiency, compliance, and user engagement in the insurance sector.

AIGC · Artificial Intelligence · Content Generation

0 likes · 9 min read


Fighter's World

Nov 1, 2024 · Artificial Intelligence

How Fiercely Competitive Is the Large‑Model Landscape? Insights from the State of AI Report 2024

The State of AI Report 2024 reveals converging capabilities among open and closed LLMs, a shift toward inference compute, benchmark and data contamination challenges, rising synthetic‑data risks, booming robotics research, Nvidia's hardware dominance, and a mix of accurate and missed predictions for the coming year.

AI hardware · AI industry · Large Language Models

0 likes · 15 min read


Infra Learning Club

Oct 31, 2024 · Industry Insights

Top AI Startups to Watch in 2024: 10 Leading and 6 Emerging Companies

The article surveys the most funded and influential AI startups of 2024, profiling ten large‑scale companies such as OpenAI, Anthropic, and Scale AI, and highlighting six promising newcomers, while detailing their products, CEOs, valuations, recent milestones, and industry impact.

2024 · AI industry · AI startups

0 likes · 11 min read


Infra Learning Club

Oct 31, 2024 · Artificial Intelligence

What Is a Token in Large Language Models?

The article explains that a token is the unit processed by large language models, describes three common tokenizer methods—word‑level, character‑level, and sub‑word level—with English and Chinese examples, discusses their advantages and limitations, and shows how OpenAI’s tokenizer varies across model versions.

Large Language Models · NLP · Token

0 likes · 5 min read


Smart Era Software Development

Oct 31, 2024 · Artificial Intelligence

How D2LLM and Codefuse‑CGE Are Redefining Search with Large Language Models

The article analyzes D2LLM’s teacher‑student bi‑encoder architecture and Codefuse‑CGE’s PMA‑enhanced code embedding, showing how both models surpass BERT dual encoders and LLM cross‑encoders in accuracy, efficiency, and storage cost across semantic and code search benchmarks.

Bi-Encoder · Code Embedding · Large Language Models

0 likes · 7 min read


AntTech

Oct 29, 2024 · Artificial Intelligence

Three Ant Group Papers Featured at EMNLP 2024: Dynamic Transformers, Plug‑and‑Play Visual Reasoner, and Efficient Fine‑Tuning of Large Language Models

This announcement introduces three Ant Group papers accepted at EMNLP 2024—Mixture‑of‑Modules for dynamic Transformer assembly, a plug‑and‑play visual reasoning framework built via data synthesis, and a layer‑wise importance‑aware efficient fine‑tuning method for large language models—highlighting their innovations and upcoming live presentations.

AI research · EMNLP 2024 · Large Language Models

0 likes · 6 min read


Alibaba Cloud Infrastructure

Oct 28, 2024 · Artificial Intelligence

How AI Is Redefining the Enterprise CIO Role – Insights from Alibaba Cloud’s CIO

In a detailed interview, Alibaba Cloud’s CIO Jiang Linquan discusses how rapid AI advancements—from large language models to multimodal and reasoning systems—are reshaping CIO responsibilities, accelerating enterprise information system intelligence, and driving new strategies for knowledge bases, customer service, and cross‑departmental adoption.

AI · CIO · Knowledge Base

0 likes · 14 min read


Fighter's World

Oct 26, 2024 · Artificial Intelligence

Key Considerations for Deploying Large Language Models in Cloud Services

The article reflects on Alibaba Cloud's large‑model deployments, outlines four service scenarios, examines three fundamental questions about foundation models, and offers a prioritized roadmap—including prompt engineering, RAG, and organizational changes—to effectively bring LLMs to production.

AI Deployment · Alibaba Cloud · LLMOps

0 likes · 8 min read


AntTech

Oct 15, 2024 · Artificial Intelligence

AI Large Model Technology Exploration and Application Forum (CNCC2024)

The AI Large Model Technology Exploration and Application Forum, held on October 24‑26, 2024 in Hengdian, Zhejiang, gathers leading experts from Ant Group, universities and research institutes to discuss challenges, knowledge enhancement, data infrastructure, diffusion models, multimodal and medical large models through a series of keynote talks and panel sessions.

AI · Large Language Models · conference

0 likes · 12 min read


Tencent Advertising Technology

Oct 14, 2024 · Artificial Intelligence

Generative Retrieval Based on Yuan Large Model: Implementation and Practice in Tencent Advertising

This paper presents the implementation and practice of generative retrieval based on the Yuan large model in Tencent Advertising, addressing three key challenges: capturing user intent, aligning the model with the advertising domain, and designing a high-performance platform under ROI constraints.

Generative Retrieval · High-performance computing · Large Language Models

0 likes · 17 min read


360 Zhihui Cloud Developer

Oct 11, 2024 · Artificial Intelligence

How 360 Built a Thousand‑GPU AI Supercomputer with Kubernetes and Advanced Scheduling

This article details the design and implementation of 360’s AI Computing Center, covering server selection, network topology, Kubernetes scheduling, training and inference acceleration, and the AI platform’s core, visualization, and fault‑tolerance capabilities for large‑scale AI workloads.

AI Infrastructure · GPU Cluster · Kubernetes

0 likes · 22 min read
