Tagged articles

LLM

2301 articles
Bilibili Tech
Mar 15, 2024 · Artificial Intelligence

Hardware Resource Estimation and Bottleneck Analysis for Large Language Models (LLMs)

The article analyzes the compute, memory, and communication resources required to train and run large language models, quantifies bottlenecks such as the massive FLOP demand, terabyte‑scale GPU memory, and high‑bandwidth interconnect needs, and evaluates parallelism strategies and bandwidth estimates to guide hardware and software design for scaling LLMs.

AI Infrastructure · Hardware · LLM
0 likes · 53 min read
Sohu Tech Products
Mar 13, 2024 · Artificial Intelligence

Build a Minimal Retrieval‑Augmented Generation (Tiny‑RAG) from Scratch

This step‑by‑step guide explains how to implement a lightweight Retrieval‑Augmented Generation system—Tiny‑RAG—by creating embedding classes, loading and chunking documents, building a simple vector store, performing similarity search, and integrating a large language model for answer generation, complete with runnable Python code.

Embedding · LLM · Python
0 likes · 14 min read
Efficient Ops
Mar 13, 2024 · Operations

Why Traditional Ops Stalls and How AI‑Driven Solutions Can Revitalize It

The article examines common operational pain points such as cumbersome release processes, lack of standardization, and weak security controls, then explores how AI‑powered SRE tools and automation can address these challenges and guide teams toward more efficient, standardized, and resilient operations.

AI · LLM · SRE
0 likes · 9 min read
AI Large Model Application Practice
Mar 12, 2024 · Artificial Intelligence

How to Build a Corrective RAG Agent with LangGraph: A Step‑by‑Step Guide

This article explains how to use LangGraph—a graph‑based extension of LangChain—to implement a corrective RAG (C‑RAG) pipeline that evaluates retrieved documents, rewrites queries when needed, performs web search, and generates accurate answers, complete with code snippets and a runnable example.

Corrective RAG · LLM · LangChain
0 likes · 14 min read
AntTech
Mar 11, 2024 · Artificial Intelligence

Can Small Language Models be Good Reasoners in Recommender Systems?

This article presents SLIM, a knowledge‑distillation framework that transfers the reasoning abilities of large language models to compact models for sequential recommendation, enhancing item representation, user profiling, and bias mitigation while achieving comparable performance with far lower computational resources.

AI · Efficiency · LLM
0 likes · 12 min read
NewBeeNLP
Mar 10, 2024 · Industry Insights

What WWW'24 Papers Reveal About LLMs in Search & Recommendation

This overview summarizes six WWW 2024 industry papers that apply large language models to e‑commerce search, personalized query suggestion, article recommendation, collaborative filtering, and lifelong sequential behavior understanding, highlighting their methods, experimental results, deployment status, and emerging trends in LLM‑driven search and recommendation.

LLM · Search · WWW2024
0 likes · 16 min read
NewBeeNLP
Mar 8, 2024 · Industry Insights

Why Building LLMs Is Like Buying a Hardware Lottery – Lessons from a Startup

The article recounts Yi Tay’s experience founding Reka and building large language models from scratch, highlighting the unpredictable quality of GPU clusters, the challenges of multi‑cluster orchestration, code‑base choices, and how startups must rely on fast, intuition‑driven experimentation to succeed.

GPU · Hardware · LLM
0 likes · 12 min read
Sohu Tech Products
Mar 6, 2024 · Mobile Development

On‑Device Deployment of Large Language Models Using Sohu’s Hybrid AI Engine and GPT‑2

The article outlines how Sohu’s Hybrid AI Engine enables on‑device deployment of a distilled GPT‑2 model by converting it to TensorFlow Lite, detailing the setup, customization with Keras, inference workflow, and core SDK calls, and argues that this approach offers fast, private, and cost‑effective AI for mobile devices despite typical LLM constraints.

GPT-2 · Hybrid AI · Keras
0 likes · 9 min read
Alibaba Cloud Developer
Mar 6, 2024 · Artificial Intelligence

Unlocking LangChain: Build Powerful LLM Apps Like LEGO with Real-World Examples

This article explains how LangChain simplifies building and integrating large language model applications by providing modular components such as models, prompts, indexes, tools, memory, chains, and agents, illustrated with practical use cases like travel assistants, face‑recognition troubleshooting, and multi‑agent workflows.

AI agents · LLM · LangChain
0 likes · 44 min read
21CTO
Mar 5, 2024 · Artificial Intelligence

Can Generative AI Replace Human Programmers? LLM Insights & Future of Coding

The article examines why large language models (LLMs) cannot fully replace human programmers, compares major models like Gemma, Code Llama, GPT‑4 and Claude, discusses trust and copyright concerns, and explores how smaller, specialized LLMs may shape the future of software development.

AI ethics · Generative AI · LLM
0 likes · 7 min read
21CTO
Feb 29, 2024 · Artificial Intelligence

StarCoder2 Unveiled: Open-Source LLM That Outperforms Its Predecessor with Fewer Parameters

StarCoder2, the latest open-source large language model from ServiceNow, Hugging Face, and NVIDIA, comes in three sizes (3B, 7B, and 15B parameters), delivering performance comparable to the original 15B StarCoder while being more efficient and freely accessible under the BigCode Open RAIL‑M license.

LLM · Open-source AI · StarCoder2
0 likes · 4 min read
NewBeeNLP
Feb 27, 2024 · Artificial Intelligence

Boosting E‑Commerce AIGC with Knowledge Graphs: From Multimodal Inputs to Controlled LLMs

The article details how JD.com leverages domain‑specific and generic knowledge graphs to enhance multimodal product information, improve controlled text generation, and boost LLM performance for e‑commerce copywriting, covering model architecture, copy‑only mechanisms, token‑type encoding, experimental results, and practical deployment scenarios.

AIGC · Knowledge Graph · LLM
0 likes · 23 min read
Alibaba Cloud Big Data AI Platform
Feb 27, 2024 · Artificial Intelligence

Build a Knowledge‑Enhanced LLM Chatbot with Alibaba Cloud PAI: A Step‑by‑Step RAG Guide

This comprehensive guide walks AI developers through building a Retrieval‑Augmented Generation (RAG) chatbot on Alibaba Cloud PAI, covering architecture, vector store setup, model deployment, knowledge ingestion, multi‑modal retrieval, fusion, re‑ranking, prompt design, and end‑to‑end configuration with code examples.

Alibaba Cloud · Chatbot · LLM
0 likes · 26 min read
DataFunTalk
Feb 26, 2024 · Artificial Intelligence

Large Language Model Empowered Recommendation Systems: Overview, Techniques, and Future Directions

With the rapid rise of ChatGPT and large language models, recommendation systems are undergoing a transformative shift, moving beyond traditional behavior‑based methods to leverage LLMs for improved generalization, representation, and prompt‑based learning, while addressing challenges such as scalability, interpretability, bias, and deployment costs.

AI · LLM · Representation
0 likes · 19 min read
DaTaobao Tech
Feb 21, 2024 · Artificial Intelligence

An Overview of LangChain: Core Concepts and Practical Implementations

The article introduces LangChain as a framework that unifies LLM providers through model I/O, connects external data via retrievers, composes workflows with chains, maintains context with memory, and enables tool use through agents, and demonstrates Java examples for TongYi embeddings, a ChatGLM‑6B RetrievalQA chain, and discusses agent registration and micro‑service‑based agent factories.

Embedding · Java · LLM
0 likes · 9 min read
21CTO
Feb 20, 2024 · Artificial Intelligence

Which LLM Dominates Coding? GPT‑4 vs CodeLlama vs Mixtral vs Gemini

This article presents a head‑to‑head evaluation of four leading large language models—GPT‑4, CodeLlama 70B, CodeLlama 7B, and Mixtral 8x7B—across eight coding‑related tasks, revealing GPT‑4 as the overall winner while highlighting the trade‑offs of smaller models and emerging competitors like Google Gemini.

AI evaluation · CodeLlama · GPT-4
0 likes · 9 min read
DaTaobao Tech
Feb 19, 2024 · Artificial Intelligence

AI/ML Technology Articles Collection

This collection compiles technical articles that explore diverse AI/ML applications, from deploying large language models on MacBooks and building e‑commerce recommendation engines, to leveraging the LangChain framework, creating AIGC‑driven fashion solutions, and implementing Stable Diffusion for image generation.

AI · AIGC · LLM
0 likes · 1 min read
DataFunTalk
Feb 19, 2024 · Artificial Intelligence

Large Language Model Inference Overview and Performance Optimizations

This article presents a comprehensive overview of large language model inference, detailing the prefill and decoding stages, key performance metrics such as throughput, latency and QPS, and a series of system-level optimizations—including pipeline parallelism, dynamic batching, specialized attention kernels, virtual memory allocation, KV‑cache quantization, and mixed‑precision strategies—to improve GPU utilization and overall inference efficiency.

GPU · LLM · Latency
0 likes · 24 min read
Java Tech Enthusiast
Feb 16, 2024 · Artificial Intelligence

Google's Gemini 1.5: Breakthrough in Long-Context Understanding and Multimodal Capabilities

Google’s Gemini 1.5, a new multimodal Mixture‑of‑Experts model, supports up to a million‑token context (10 million internally), can understand text, video, audio and code, learns a new language from a single prompt, and is already being used by Samsung, Jasper and Quora, positioning it as a direct challenger to OpenAI’s flagship models.

Gemini 1.5 · Google AI · LLM
0 likes · 7 min read
AI Large Model Application Practice
Feb 15, 2024 · Artificial Intelligence

How Generative AI is Transforming RPA: Three Powerful Integration Scenarios

This article explores three key ways large language models and multimodal generative AI can enhance robotic process automation, from cognition‑boosted RPA and AI‑Agent collaboration to visual‑intelligent navigation, illustrating practical examples and future prospects for smarter digital workers.

AI Agent · Automation · Generative AI
0 likes · 12 min read
NewBeeNLP
Feb 11, 2024 · Industry Insights

What 2023 Taught Us About LLMs and AI‑Guided Optimization

The author reviews a year of rapid progress in large language models, highlighting breakthrough papers such as Positional Interpolation, StreamingLLM, Deja Vu, and RLCD, and discusses how AI‑guided optimization techniques like SurCo, LANCER, and GenCo are reshaping research and industry applications.

LLM · Transformers · ai-optimization
0 likes · 13 min read
Rare Earth Juejin Tech Community
Feb 7, 2024 · Artificial Intelligence

Step-by-Step Guide to Building Multi‑Agent Applications with LangChain LangGraph in Google Colab

This tutorial walks through installing LangChain, LangGraph and related packages in Google Colab, configuring environment variables, defining search and Twitter‑writer tools, constructing a StateGraph workflow with supervisor logic, and executing a multi‑agent LLM pipeline using LangChain’s new multi‑agent capabilities.

AI · Google Colab · LLM
0 likes · 11 min read
DataFunSummit
Feb 3, 2024 · Artificial Intelligence

Practical Application of Large Language Models in MaShang Consumer Finance: From Model Building to Deployment

This article details how MaShang Consumer Finance leverages large language models for sales, collection, and customer service, covering company background, AI research achievements, model training infrastructure, data‑quality and compliance challenges, prompt engineering, inference acceleration, evaluation methods, and lessons learned from real‑world deployment.

Data Quality · LLM · Model Deployment
0 likes · 21 min read
NewBeeNLP
Feb 2, 2024 · Artificial Intelligence

ControlRec: Aligning LLMs with IDs to Boost Personalized Recommendations

ControlRec introduces heterogeneous feature matching and instruction contrastive learning to bridge the semantic gap between language models and discrete user/item IDs, enabling more effective personalized recommendation across multiple tasks such as rating prediction, sequential recommendation, and explanation generation.

ControlRec · Heterogeneous Feature Matching · Instruction Contrast Learning
0 likes · 10 min read
Ctrip Technology
Jan 26, 2024 · Artificial Intelligence

Implementing Plugin Functionality for a Large Language Model Chatbot Using Function Calling and Asynchronous Execution

This article explains how Ctrip's security R&D team built a web‑based LLM chatbot with version‑2.0 features such as plugin support, function calling, synchronous and asynchronous execution, WebSocket/Socket.IO communication, and provides full Python code examples for defining and invoking plugins.

AI · Function Calling · LLM
0 likes · 15 min read
Rare Earth Juejin Tech Community
Jan 22, 2024 · Artificial Intelligence

Prompt Engineering and CAMEL: Role‑Playing AI Agents for Automated Prompt Generation

This article explains how Prompt Engineering combined with the CAMEL framework enables role‑playing AI agents to automatically generate and manage prompts, illustrates the concept with a stock‑trading example, and provides Python code using LangChain to build a marketing‑automation agent for a small business.

AI agents · CAMEL · Inception Prompting
0 likes · 11 min read
Rare Earth Juejin Tech Community
Jan 21, 2024 · Artificial Intelligence

Understanding Pretraining and Fine‑Tuning of Large Language Models: Methods, Resources, and Practical Applications

This article explains the concepts of pretraining and fine‑tuning for large language models, compares full‑parameter, LoRA and QLoRA approaches, discusses resource consumption, introduces the ModelScope SWIFT framework with code examples, and shows how fine‑tuning can improve data‑visualisation tasks while reducing token usage.

Data Visualization · LLM · LoRA
0 likes · 24 min read
Bitu Technology
Jan 17, 2024 · Artificial Intelligence

Rosetta Stone: Scalable ID Mapping System for Tubi's Content Library Using LLMs and Embeddings

This article describes how Tubi built the Rosetta Stone system—a flexible ID mapping workflow that leverages large language models, embedding similarity ranking, and K‑nearest‑neighbors to unify and enrich metadata across a 200,000‑title library, improve content recommendation, and streamline operations.

Big Data · LLM · content ID mapping
0 likes · 10 min read
Tencent Cloud Developer
Jan 16, 2024 · Frontend Development

Frontend Technology Review 2023 and Outlook 2024

The 2023 frontend review highlights TypeScript’s size and speed gains, ECMAScript 2023 features, evolving frameworks like React, Vue, Svelte, Angular and emerging Qwik, while Rust tooling, Bun, browser changes, AI‑driven low‑code, and WASM progress set the stage for 2024’s LLM‑powered, Rust‑centric, cross‑platform development.

Bun · D2C · HarmonyOS
0 likes · 49 min read
Xiaohongshu Tech REDtech
Jan 12, 2024 · Artificial Intelligence

Negative Sample Assisted Distillation for Large Language Models

The AAAI‑2024 paper introduces a Negative Sample Assisted Distillation framework—comprising Negative Assistance Training, Negative Calibration Enhancement, and Adaptive Self‑Consistency—that leverages both correct and incorrect reasoning examples to train a compact LLaMA‑7B student, achieving up to 75.75 % accuracy gains over fine‑tuning on MATH and improving out‑of‑domain benchmarks.

LLM · chain-of-thought · knowledge distillation
0 likes · 13 min read
Baobao Algorithm Notes
Jan 6, 2024 · Artificial Intelligence

How to Pick the Best Fine‑Tuning Data for LLMs with the Nuggets Method

This article explains the Nuggets approach for selecting a high‑quality subset of annotated instructions to fine‑tune large language models, describing its three inputs, the gold‑score computation based on perplexity improvement, empirical results on Alpaca, and practical considerations such as task‑set design.

LLM · Nuggets · data selection
0 likes · 7 min read
DaTaobao Tech
Jan 5, 2024 · Mobile Development

Edge Deployment and Performance Optimization of Large Language Models with MNN

The upgraded mnn‑llm framework adds a unified llm‑export pipeline, cross‑platform inference with tokenizers and disk‑embedding, and ARM‑focused linear‑layer optimizations—including SIMD, hand‑written assembly and 4‑bit quantization—that dramatically speed up prefilling and achieve real‑time LLM conversation on mobile devices within a 2 GB memory budget, outperforming llama.cpp, fastllm and mlc‑llm.

ARM CPU · Edge deployment · LLM
0 likes · 17 min read
DataFunSummit
Jan 4, 2024 · Big Data

YY Live Business Metric Governance Practice

This presentation details YY Live’s data product team’s end‑to‑end business metric governance practice, covering problem background, analysis, governance objectives, multi‑team collaboration, implementation steps, achieved efficiencies, and future directions leveraging large language models.

Big Data · Data Platform · LLM
0 likes · 16 min read
Baobao Algorithm Notes
Jan 2, 2024 · Artificial Intelligence

Uncovering Mixtral‑8x7B: How MoE Experts Shape Performance and Training

This article analyses the Mixtral‑8x7B Mixture‑of‑Experts LLM, explains its gate‑driven 8‑expert architecture, presents a simplified PyTorch implementation, and reports a series of experiments that probe top‑2 gating during training, individual expert contributions, task‑specific pre‑training, the impact of expert count, and similarity with Mistral‑7B, ultimately offering hypotheses about its training pipeline.

LLM · Mixtral · Mixture of Experts
0 likes · 14 min read
Rare Earth Juejin Tech Community
Dec 29, 2023 · Artificial Intelligence

Overview of Major Benchmark Datasets for Evaluating Large Language Models

This article provides a comprehensive overview of major benchmark datasets—including CMMLU, MMLU, C‑Eval, GSM8K, Gaokao‑Bench, AGIEval, MATH, BBH, HumanEval, and MBPP—used to evaluate large language models' knowledge, reasoning, and coding abilities, and summarizes related leaderboards and evaluation tools.

Evaluation · LLM · artificial-intelligence
0 likes · 14 min read
Huolala Tech
Dec 28, 2023 · Artificial Intelligence

How Huolala Built a Low‑Code LLM Platform to Accelerate AI Agent Deployment

Huolala created a visual, drag‑and‑drop LLM application platform that streamlines AI integration, reduces development costs, and enables rapid deployment of agents across marketing, invitation, advertising, and modeling scenarios, boosting efficiency by over 98% while cutting integration time from hours to minutes.

AI · Agent · LLM
0 likes · 13 min read
DaTaobao Tech
Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

The guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them with a FastAPI OpenAI‑compatible service, routing requests through a One‑API gateway, storing metadata in MongoDB and vectors in PostgreSQL pgvector, deploying FastGPT for RAG, ingesting data, and demonstrating 5‑7 second response times, while outlining future improvements.

ChatGLM3 · FastAPI · Knowledge Base
0 likes · 23 min read
Rare Earth Juejin Tech Community
Dec 27, 2023 · Artificial Intelligence

Comprehensive Overview of Large Language Models: Capabilities, Limitations, Deployment, and Future Trends

This article provides a detailed examination of large language models, covering their underlying technologies, capabilities and constraints, model families, training processes, cloud and edge deployment challenges, agent architectures, and emerging trends, offering practical insights for developers, product managers, and researchers.

Agents · LLM · Model Deployment
0 likes · 43 min read
21CTO
Dec 15, 2023 · Artificial Intelligence

Why 2024 Will Be the Year of AI Engineers and LLM‑Driven Apps

The article outlines five major AI engineering trends for 2024—including the rise of AI engineers, evolving LLM tech stacks, open‑source large models, vector databases, and AI agents—highlighting how these shifts will reshape application development and industry competition.

2024 trends · AI Engineering · AI agents
0 likes · 9 min read
DataFunSummit
Dec 15, 2023 · Artificial Intelligence

Integrating Large Language Models into Recommender Systems: Opportunities, Methods, and Challenges

This article explores how large language models can be incorporated into recommender systems, discussing background challenges, specific integration points across the recommendation pipeline, practical implementation methods, experimental results, and future research directions, while highlighting industrial considerations and potential improvements.

Industrial Applications · LLM · Model Fusion
0 likes · 20 min read
Data Thinking Notes
Dec 12, 2023 · Artificial Intelligence

Boosting Text‑to‑SQL Accuracy with Prompt Engineering and LLMs

This article examines the challenges of LLM‑based Text‑to‑SQL such as hallucinations, data‑security risks, and user input errors, and presents prompt‑engineering strategies, fine‑tuning comparisons, prompt types, code examples, and experimental results to improve reliability and cost‑effectiveness.

LLM · LangChain · Prompt engineering
0 likes · 15 min read
NetEase Cloud Music Tech Team
Dec 12, 2023 · Artificial Intelligence

How LangChain Powers AI Agents: Principles, Debugging, and Real‑World Optimizations

This article explains the concept of AI Agents in the large‑language‑model era, details LangChain's implementation mechanics, shares practical challenges and optimizations encountered by NetEase Cloud Music, and provides step‑by‑step code examples and performance insights for building robust AI Agents.

AI Agent · LLM · LangChain
0 likes · 20 min read
Rare Earth Juejin Tech Community
Dec 8, 2023 · Artificial Intelligence

Simplifying Transformer Blocks: Removing Residual Connections, LayerNorm, and Other Components without Losing Performance

A recent ETH Zurich paper shows that standard Transformer blocks can be drastically simplified by removing residual connections, LayerNorm, projection and value parameters, and even MLP sub‑block components, achieving up to 16% fewer parameters and comparable training speed and downstream performance on both GPT‑style decoders and BERT models.

AI · LLM · Signal Propagation
0 likes · 11 min read
Sohu Tech Products
Dec 6, 2023 · Databases

GPTuner: LLM-Driven PostgreSQL Knob Tuning

GPTuner, an LLM‑driven system for PostgreSQL knob tuning developed by researchers at Sichuan University, demonstrates that knowledge processing, parameter selection, search‑range optimization, and a two‑stage Bayesian framework each significantly improve performance, while costing roughly 880 000 GPT‑4 tokens (≈ $30) with reusable knowledge.

Ablation Study · Database Tuning · GPTuner
0 likes · 9 min read
DataFunTalk
Dec 6, 2023 · Artificial Intelligence

Distributed Training Techniques and Quantitative Analysis for Large Language Models (GPT‑175B)

This article presents a comprehensive overview of state‑of‑the‑art distributed training methods for large language models, using GPT‑175B as a case study to analyze memory, communication, and compute overheads, and to recommend practical optimization strategies such as tensor, pipeline, and sequence parallelism, ZeRO‑1 optimizer, and selective activation checkpointing.

GPU memory optimization · LLM · Megatron
0 likes · 22 min read
Rare Earth Juejin Tech Community
Dec 6, 2023 · Artificial Intelligence

Multi-Agent Research Overview, Open-Source Implementations, and Design Considerations

This article reviews the background of multi‑agent systems, compares major open‑source frameworks such as AutoGen, MetaGPT, AgentVerse, and XAgent, discusses design principles, collaboration strategies, and offers conclusions on LLM‑driven versus SOP‑driven approaches for building multi‑agent applications.

AI · Agent framework · AutoGen
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Dec 5, 2023 · Artificial Intelligence

How to Efficiently Fine‑Tune Qwen LLMs on Alibaba Cloud PAI Lingjun

This guide walks you through setting up Alibaba Cloud PAI Lingjun resources, preparing Qwen‑7B/14B/72B models, preprocessing large‑scale WuDao data, configuring distributed training with Megatron‑LM, performing continued pre‑training and supervised fine‑tuning, and finally deploying the model as an online service via PAI‑EAS.

Alibaba Cloud · LLM · Megatron-LM
0 likes · 27 min read
Huawei Cloud Developer Alliance
Nov 30, 2023 · Artificial Intelligence

Mastering LLM Text Generation: Decoding Methods Explained

This review of the recent MindSpore NLP public class walks through the fundamentals of large language model text generation, detailing deterministic decoding such as greedy and beam search, stochastic sampling techniques like temperature, top‑k and top‑p, and advanced methods including constrained beam, contrastive, and assisted search, with illustrative examples.

Beam Search · Greedy Search · LLM
0 likes · 5 min read
Rare Earth Juejin Tech Community
Nov 29, 2023 · Artificial Intelligence

Building a Private LLM‑Powered Knowledge Base with LangChain and ChatGLM3

This article explains how to migrate personal notes into a private knowledge base by combining a large language model with an external vector store, detailing the concepts of tokenization, embedding, vector databases, and step‑by‑step deployment using LangChain‑Chatchat and the open‑source ChatGLM3 model.

ChatGLM3 · Embedding · Knowledge Base
0 likes · 10 min read
DataFunSummit
Nov 20, 2023 · Artificial Intelligence

ModelScope Agents: Open‑Source LLM Agent Framework and Practical Guide

This article introduces ModelScope Agents, an open‑source LLM‑based agent framework that addresses limitations of GPT Store, outlines its features, provides installation and usage instructions, showcases a RPG game example, and invites the community to contribute to its roadmap.

AI · Agent framework · LLM
0 likes · 7 min read
Baobao Algorithm Notes
Nov 13, 2023 · Artificial Intelligence

Mastering LLM Fundamentals: Tokenizers, Layer Norm, and PEFT Explained

This article provides a comprehensive technical guide on large language model fundamentals, covering tokenizer construction methods such as BPE, WordPiece, and SentencePiece, detailed explanations of Layer Normalization variants, Deep Norm concepts with code, and an overview of parameter‑efficient fine‑tuning techniques like LoRA and PEFT.

LLM · Layer Normalization · PEFT
0 likes · 36 min read
Data Thinking Notes
Nov 12, 2023 · Artificial Intelligence

Unlocking LLM Power: Semantic Search, Private Knowledge Bases, and Text‑to‑SQL for Data Teams

This article explores how large language models can boost data workflows by using embeddings for semantic retrieval, building domain‑specific knowledge bases for private Q&A, generating SQL code from natural language, and automating exploratory data analysis, offering practical steps and visual examples.

Embedding · Knowledge Base · LLM
0 likes · 7 min read
Baobao Algorithm Notes
Nov 7, 2023 · Artificial Intelligence

A Complete Technical Guide to LLM Foundations, Advanced Topics, Fine‑Tuning, and LangChain Applications

This article provides an in‑depth technical overview of large language models (LLMs), covering core model families, architectural differences, emergent abilities, common challenges such as repetition and token limits, detailed fine‑tuning strategies including PEFT, practical guidance for training custom models, and a thorough introduction to the LangChain framework with code examples, core concepts, and troubleshooting tips for building LLM‑powered applications.

LLM · LangChain · Vector Store
0 likes · 97 min read
Huawei Cloud Developer Alliance
Nov 3, 2023 · Artificial Intelligence

Can LLMs Master Lifelong Learning? Exploring MoE and Continuous Adaptation

This article explains how large language models can achieve continual lifelong learning, outlines the key properties required, reviews mixture‑of‑experts (MoE) techniques—including sparse MoE, GShard, Switch Transformer, GLaM and PanGu‑Sigma—and discusses the remaining challenges such as model complexity, expert balancing and distributed communication overhead.

LLM · Lifelong Learning · Mixture of Experts
0 likes · 9 min read
DataFunSummit
Nov 1, 2023 · Artificial Intelligence

DataFunCon2023 Shenzhen: Program Overview and Session Highlights

DataFunCon2023 Shenzhen showcases a comprehensive program featuring expert talks on building Data+LLM applications, large-scale storage, cloud‑native architectures, metric systems, data governance, A/B testing, and industry‑specific large language model use cases across finance, gaming, advertising, and more, providing valuable insights for practitioners and researchers alike.

@Data · AIGC · Big Data
0 likes · 50 min read
Software Development Quality
Oct 27, 2023 · Artificial Intelligence

TestAgent: Open-Source 7B LLM for Multi-Language Test Generation

TestAgent introduces an open-source 7B large language model tailored for software testing, offering multi‑language test case generation, automatic assert completion, and a lightweight engineering framework with quick‑start scripts, performance benchmarks, and deployment options for various hardware accelerators.

AI model · LLM · Multi-language Generation
0 likes · 10 min read
JD Retail Technology
Oct 26, 2023 · Artificial Intelligence

Leveraging Large Language Models for Text-to-SQL: Prompt Design and End-to-End Pipeline

This article explains how large language models can be used to convert natural language queries into SQL statements, describes two main approaches—direct generation and fine‑tuned open‑source models—details prompt engineering techniques, and outlines an end‑to‑end pipeline that executes the generated SQL and summarizes results.

ChatGLM · LLM · Prompt engineering
0 likes · 7 min read
phodal
Oct 19, 2023 · Operations

Can LLMs Revolutionize Code Review? Inside AutoDev’s AI‑Powered Approach

The article examines how rising code volume and AI‑generated snippets challenge traditional code review, proposes an LLM‑assisted workflow using AutoDev and DevOpsGenius, details prompt design, commit filtering, and implementation steps, and discusses the benefits and limitations for different team roles.

AI Automation · LLM · Prompt engineering
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Oct 19, 2023 · Artificial Intelligence

How to Build a Retrieval‑Augmented LLM Knowledge Base on Alibaba Cloud

This guide details a complete end‑to‑end solution for constructing a large‑language‑model knowledge‑base chatbot on Alibaba Cloud, covering background, modular architecture, vector database selection, text preprocessing, embedding models, LLM fine‑tuning, prompt engineering, deployment with PAI‑EAS and BladeLLM, and real‑world results.

AI · Cloud · LLM
0 likes · 37 min read
Baobao Algorithm Notes
Oct 19, 2023 · Artificial Intelligence

Efficient LLM Deployment: Low‑Precision, Flash Attention, and Architecture Tricks

This article reviews the main memory and compute challenges of deploying large language models and presents practical solutions—including low‑precision arithmetic, flash attention, advanced positional embeddings, key‑value caching, and quantization techniques—backed by code examples and performance measurements on models such as OctoCoder.

Flash Attention · LLM · Quantization
0 likes · 35 min read
Architect
Oct 18, 2023 · Artificial Intelligence

Code Understanding: Techniques, Applications, and AI‑Driven Solutions

This article explores the fundamentals of code understanding, including static, dynamic, and non‑code analysis, presents a three‑layer architecture for scalable code comprehension, and demonstrates practical AI‑enhanced applications such as intelligent unit testing, dead‑code detection, and AI‑based static analysis within CI/CD pipelines.

AI · CI/CD · LLM
0 likes · 16 min read
AI Large Model Application Practice
Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

LLM · LangChain · PDF Processing
0 likes · 12 min read
Ximalaya Technology Team
Oct 18, 2023 · Artificial Intelligence

The Evolution of AI Agents: From Philosophy to Modern Implementations

Tracing AI agents from Aristotle’s and Zhuangzi’s philosophical notions through the coining of “agent” in computer science to today’s learning‑based systems powered by large language models, the article outlines key milestones, core components—LLM brain, memory, planning, tool use—and showcases applications such as AlphaGo, Siri, and autonomous platforms, while forecasting their expanding, industry‑wide ubiquity.

AI agents · Autonomous Systems · Generative Agents
0 likes · 21 min read
Model Perspective
Oct 15, 2023 · Artificial Intelligence

How to Use Large Language Models Ethically in Math Modeling Contests

COMAP’s new policy outlines why and how teams in mathematical modeling competitions should responsibly employ large language models and generative AI, detailing guiding principles, risks, citation requirements, and ethical considerations to ensure fairness, transparency, and academic integrity.

AI policy · Generative AI · LLM
0 likes · 9 min read
dbaplus Community
Oct 14, 2023 · Artificial Intelligence

Demystifying Retrieval‑Augmented Generation: From Theory to Working Chatbot

This guide explains the Retrieval‑Augmented Generation (RAG) technique, detailing how user queries are matched to private knowledge bases, how relevant passages are retrieved, and how large language models use those passages to generate context‑aware answers, complete with code examples and practical tips.

Chatbot · Embedding · LLM
0 likes · 19 min read
21CTO
Oct 12, 2023 · Frontend Development

How Vercel’s AI‑Powered v0 Tool Is Transforming Frontend Development

Vercel has launched v0, an AI‑driven tool that lets developers describe desired UI components in plain text and receive generated frontend code, streamlining creation, offering multiple design options, and shifting developer focus toward creativity and design.

AI · LLM · Vercel
0 likes · 4 min read
Zhuanzhuan Tech
Oct 11, 2023 · Artificial Intelligence

Building a ChatGPT‑Based Intelligent Customer Service System with BERT Classification and Knowledge Filtering

This article describes how to construct an intelligent customer‑service assistant using ChatGPT for natural‑language understanding, BERT for user‑question classification, and Sentence‑BERT for knowledge‑selection, detailing system architecture, prompt design, model training, performance results, and practical cost reductions.

BERT · ChatGPT · Intelligent Customer Service
0 likes · 16 min read
ByteFE
Oct 11, 2023 · Artificial Intelligence

CR Copilot: An Open‑Source LLM‑Based Code Review Assistant with Private Knowledge Base

This article describes the design and implementation of a code‑review assistant powered by open‑source large language models and a privately hosted knowledge base, covering background, pain points, system architecture, model selection, vector‑store integration, prompt engineering, diff parsing, and practical reflections.

AI · Knowledge Base · LLM
0 likes · 24 min read
DataFunTalk
Oct 10, 2023 · Artificial Intelligence

Integrating Large Language Models into Recommender Systems: Opportunities, Methods, and Challenges

This article surveys how large language models can be incorporated into recommender systems, discussing their strengths and limitations, outlining where and how they can be applied across the recommendation pipeline, presenting recent research examples, and highlighting challenges and future directions for industrial deployment.

LLM · feature engineering · recommender systems
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
Oct 10, 2023 · Artificial Intelligence

Create a Custom Enterprise Conversational Search with Alibaba Cloud OpenSearch Vector & LLM

This guide walks you through setting up Alibaba Cloud OpenSearch Vector Search and LLM Intelligent Q&A editions, covering environment preparation, instance creation, data source configuration, field and index setup, document ingestion, query processing, and a complete Java SDK demo for building a flexible enterprise conversational search system.

Alibaba Cloud · Conversational AI · Java SDK
0 likes · 20 min read
Baidu Geek Talk
Oct 9, 2023 · Artificial Intelligence

Code Understanding Technology: Building White-Box Software Knowledge Graph at Baidu

Baidu’s white‑box code understanding platform combines static, dynamic, non‑code and LLM‑based analyses in a three‑layer architecture that accelerates C/C++ processing ninefold, supports multiple languages, and powers applications such as intelligent unit testing, orphan‑function cleanup and AI‑driven risk detection, while future integration with models like GPT‑4 aims to enable multi‑turn code Q&A, automated refactoring and predictive testing.

AST · Baidu · CI/CD
0 likes · 15 min read
Baobao Algorithm Notes
Oct 8, 2023 · Interview Experience

Must‑Know Large‑Model Interview Questions for RLHF Candidates

The article shares a practitioner’s transition story from reinforcement‑learning‑focused game AI to large‑model work, outlines the challenges faced during job hunting at major Chinese tech firms, and provides a curated list of 23 technical interview questions covering PPO, RLHF, dataset evaluation, model fine‑tuning, and broader LLM concepts.

AI research · LLM · RLHF
0 likes · 10 min read
21CTO
Oct 4, 2023 · Artificial Intelligence

How LangStream Merges Data Streams with Generative AI for Real‑Time LLM Apps

LangStream, the new open‑source framework from DataStax, combines event‑driven data streaming with generative AI, offering seamless integration with vector databases like Astra DB, Milvus, and Pinecone, and providing a Kubernetes‑based runtime that enables real‑time LLM applications without extensive coding.

Data Streaming · LLM · LangStream
0 likes · 7 min read