Tagged articles
891 articles
Model Perspective
Jul 30, 2024 · Artificial Intelligence

Your Complete AI Learning Roadmap: From Basics to Large Model Mastery

This guide presents a comprehensive AI learning roadmap, dividing study into five progressive stages—from foundational math and programming to core deep‑learning and reinforcement‑learning techniques, large‑model training, industry applications, and future trends—plus curated book lists, tool recommendations, and practical RAG tutorials.

AI learning roadmap · AI resources · RAG
9 min read
Tencent Cloud Developer
Jul 30, 2024 · Artificial Intelligence

A Systematic Guide to Prompt Engineering: From Zero to One

This guide walks readers from beginner to proficient Prompt Engineer by outlining the evolution of prompting, introducing a universal four‑component template, and detailing a five‑step workflow—including refinement, retrieval‑augmented generation, chain‑of‑thought reasoning, and advanced tuning techniques—plus evaluation metrics for LLM performance.

AI prompting · LLM optimization · Prompt engineering
51 min read
phodal
Jul 24, 2024 · Artificial Intelligence

How to Build Trustworthy Coding Agents with Shire’s Custom RAG Workflow

This article explains how to use the Shire language to create reliable coding agents by defining custom RAG workflows, leveraging IDE APIs, code verification functions, and vector‑based search, with detailed examples, configuration snippets, and a roadmap for future enhancements.

AI · Coding Agent · IDE
10 min read
Alibaba Cloud Developer
Jul 22, 2024 · Artificial Intelligence

How Alibaba’s Logistics AI Overcame B2B Large Model Challenges

Alibaba’s logistics AI team shares their year‑long journey building a vertical‑domain large language model for logistics, detailing model alignment, Text2API, RAG, SFT techniques, challenges like accuracy and knowledge‑base maintenance, and showcasing real‑world applications such as chatbots, DingTalk assistants, and custom AI assistants.

Model Alignment · RAG · SFT
16 min read
DevOps
Jul 21, 2024 · Artificial Intelligence

LLM Fundamentals, Applications, Prompt Engineering, RAG, and Agentic Workflows

This article provides a comprehensive overview of large language models (LLMs), covering their basic concepts, relationship with NLP, development history, parameter scaling, offline deployment, practical applications, prompt‑engineering frameworks, retrieval‑augmented generation, LangChain integration, agents, workflow orchestration, and future directions toward multimodal AI and AGI.

AI applications · Agent · LLM
36 min read
DaTaobao Tech
Jul 19, 2024 · Artificial Intelligence

Practices and Techniques for Vertical Domain Large Language Models

Vertical domain large language models, fine‑tuned on specialized data, deliver higher expertise and task performance, but require continual knowledge updates and careful alignment; techniques such as BPO‑guided instruction tuning (+1.8% accuracy), Reflexion‑based Text2API (+4% API correctness), advanced RAG preprocessing, and SFT combined with ORPO (+5.2% gain) demonstrate notable improvements while underscoring remaining challenges and collaborative opportunities.

AI · Alignment · RAG
9 min read
Tencent Cloud Developer
Jul 18, 2024 · Artificial Intelligence

Exploring Large Language Models (LLM): Fundamentals, Applications, and Future Directions

This article surveys large language models: their core concepts, evolution through Transformers, GPT and BERT, and generation challenges; applications such as QA, multimodal creation, summarization and retrieval‑augmented generation; prompt‑engineering frameworks and tools, LangChain‑based pipelines, and AI‑driven agents; and future prospects toward domain‑specific use, multimodality, and AGI.

AI · Agent · LLM
35 min read
JD Tech Talk
Jul 16, 2024 · Artificial Intelligence

Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models

TaD, a task‑aware decoding technique jointly developed by JD.com and Tsinghua University and presented at IJCAI 2024, leverages differences between pre‑ and post‑fine‑tuned LLM outputs to construct knowledge vectors, significantly reducing hallucinations across various models, tasks, and data‑scarce scenarios, especially when combined with RAG.

AI · LLM · RAG
18 min read
Architect
Jul 13, 2024 · Artificial Intelligence

Practical Guide to Building LLM Products: Prompt Engineering, RAG, Evaluation, and Operations

This article provides a comprehensive, step‑by‑step guide for developing large‑language‑model (LLM) applications, covering prompt design techniques, n‑shot and chain‑of‑thought strategies, retrieval‑augmented generation, structured I/O, workflow optimization, evaluation pipelines, operational best practices, and team organization to create reliable, scalable AI products.

AI Operations · LLM · Product Development
54 min read
AsiaInfo Technology: New Tech Exploration
Jul 12, 2024 · Artificial Intelligence

How AI‑Native Transforms User Experience Management in Telecom Networks

This article examines how the AI‑Native approach reshapes the AISWare CEM platform by integrating large language models, Retrieval‑Augmented Generation, and atomic capability decomposition to improve user perception, streamline interactions, and enable intelligent diagnostic assistants for telecom operators.

AI-native · Atomic Capabilities · Diagnostic Assistant
12 min read
JD Tech
Jul 10, 2024 · Artificial Intelligence

Implementing Retrieval‑Augmented Generation (RAG) with LangChain4j in Java

This article provides a step‑by‑step guide for Java engineers on building a Retrieval‑Augmented Generation (RAG) application using the LangChain4j framework, covering RAG fundamentals, environment setup, Maven integration, document loading, splitting, embedding with OpenAI, vector store management with Chroma, and prompt‑based LLM interaction.

Embedding · Java · LLM
35 min read
21CTO
Jul 7, 2024 · Artificial Intelligence

How to Build a Secure Local LLM Chatbot with Ollama, Python, and ChromaDB

This tutorial walks you through creating a privacy‑preserving, locally hosted large language model chatbot using Ollama, Python 3, and ChromaDB, covering RAG fundamentals, GPU selection, environment setup, and full source code for a Flask‑based application.

ChromaDB · LLM · Ollama
19 min read
AI Large Model Application Practice
Jul 4, 2024 · Artificial Intelligence

Mastering Multimodal RAG: From PDF Parsing to Advanced Query Rewriting

This article explains how to handle complex multimodal PDFs in RAG systems, outlines extraction, indexing, and multimodal model integration, details four query‑rewriting strategies (HyDE, stepwise, sub‑question, backward), and presents key evaluation metrics and tools for assessing RAG performance.

Document Parsing · Query Rewriting · RAG
12 min read
AntTech
Jul 2, 2024 · Artificial Intelligence

Design and Implementation of a Generalized Retrieval‑Augmented Generation (RAG) Framework with Graph RAG Support

This article surveys Retrieval‑Augmented Generation (RAG), analyzes the limitations of traditional vector‑based RAG, introduces Graph RAG that leverages knowledge graphs for more reliable context, proposes a universal RAG architecture compatible with vector, graph and full‑text indexes, and details its open‑source implementation, code components, testing, and future research directions.

AIEngineering · GraphRAG · KnowledgeGraph
26 min read
JD Tech
Jun 28, 2024 · Artificial Intelligence

An Overview of Large Language Models: History, Fundamentals, Prompt Engineering, Retrieval‑Augmented Generation, Agents, and Multimodal AI

This article provides a comprehensive introduction to large language models, covering their historical development, core architecture, training process, prompt engineering techniques, Retrieval‑Augmented Generation, agent frameworks, multimodal capabilities, safety challenges, and future research directions.

AI Safety · AI agents · Deep Learning
22 min read
Baobao Algorithm Notes
Jun 27, 2024 · Artificial Intelligence

Engineering Data for R&D Large Language Models: From Pre‑training to Prompt Design

This article presents a comprehensive guide to data engineering for research‑focused large language models, covering domain‑adaptive pre‑training, supervised fine‑tuning, retrieval‑augmented generation, dataset construction, data cleaning pipelines, tokenizer adaptation, and prompt engineering best practices to boost model performance in specialized tasks.

Fine‑Tuning · LLM · RAG
20 min read
Alibaba Cloud Developer
Jun 27, 2024 · Artificial Intelligence

How to Supercharge Retrieval‑Augmented Generation: Papers, Techniques, and Real‑World Tips

This article surveys the main challenges of deploying large language models, introduces key RAG optimization papers such as RAPTOR, Self‑RAG, and CRAG, and compiles practical engineering tricks—including chunking, query rewriting, hybrid and progressive retrieval—to help practitioners build more accurate and efficient RAG systems.

AI research · LLM optimization · RAG
22 min read
DataFunTalk
Jun 21, 2024 · Artificial Intelligence

Fine‑tuning Large Language Models with Alibaba Cloud PAI: Practices, Techniques, and Deployment

This article introduces the Alibaba Cloud PAI platform for large language model (LLM) fine‑tuning, covering model‑training pipelines, performance‑cost trade‑offs, retrieval‑augmented generation, fine‑tuning methods such as full‑parameter, LoRA and QLoRA, model selection, data preparation, evaluation, and real‑world deployment examples.

AI Platform · Fine-tuning · LLM
20 min read
JD Cloud Developers
Jun 20, 2024 · Artificial Intelligence

How Large Language Models Boost Courier Efficiency: From Voice Commands to Smart QA

This article explains how large language models like ChatGPT can transform courier operations by automating voice‑driven tasks, enabling intelligent question answering with retrieval‑augmented generation, extracting and splitting document content, embedding it for vector search, and delivering smart prompts and agents to improve productivity and accuracy.

AI · Embedding · Logistics
15 min read
Architecture & Thinking
Jun 19, 2024 · Artificial Intelligence

Build AI‑Native Apps Quickly with Spring AI: From Chat Models to RAG

This guide explains what an AI‑native application is, compares AI‑native and AI‑based approaches, and walks through Spring AI’s core features—including chat models, prompt templates, function calling, structured output, image generation, embedding, and vector stores—showing step‑by‑step code examples and how to assemble a complete AI‑native app with RAG support.

AI native application · Function Calling · Java
43 min read
JD Tech
Jun 19, 2024 · Artificial Intelligence

Advances in Large AI Models: Prompt Engineering, RAG, Agents, Fine‑Tuning, Vector Databases and Knowledge Graphs

This article surveys the rapid expansion of large AI models, covering prompt engineering, structured prompts, retrieval‑augmented generation, AI agents, fine‑tuning strategies, vector database technology, knowledge graphs, function calling, and their collective role in moving toward artificial general intelligence.

AI · Agent · Fine‑tuning
23 min read
AI Large Model Application Practice
Jun 17, 2024 · Artificial Intelligence

Boost Your RAG Pipeline with Cohere and BGE Rerank Models

This guide explains why post‑retrieval reranking is essential for Retrieval‑Augmented Generation, compares the commercial Cohere Rerank service with the open‑source bge‑reranker‑large model, and provides step‑by‑step code for integrating both into LlamaIndex pipelines, including a custom TEI‑based processor.

BGE · Cohere · LlamaIndex
11 min read
Alibaba Cloud Big Data AI Platform
Jun 14, 2024 · Artificial Intelligence

How Alibaba Cloud OpenSearch Powers RAG: Insights from AICon 2024

In this talk, Alibaba Cloud's OpenSearch RAG team shares their year‑long journey of building retrieval‑augmented generation systems, covering data parsing, slicing, vectorization, hybrid retrieval, model fine‑tuning, performance optimizations, cost reduction, and future directions such as multimodal queries and agents.

AI search · Hybrid Retrieval · LLM
25 min read
Alibaba Cloud Developer
Jun 11, 2024 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: Challenges, Paradigms, and Engineering Best Practices

This article explores Retrieval‑Augmented Generation (RAG) by outlining its background, inherent challenges such as knowledge limits and hallucinations, describing the Naïve, Advanced, and Modular RAG paradigms, and presenting practical engineering strategies for pre‑retrieval, retrieval, and post‑retrieval optimization.

Knowledge Retrieval · NLP · RAG
25 min read
AI Large Model Application Practice
Jun 7, 2024 · Artificial Intelligence

Mastering Advanced Retrieval: Fusion and Recursive Strategies for RAG

This article explores two advanced retrieval paradigms—Fusion Retrieval, which merges results from multiple retrievers using re‑ranking, and Recursive Retrieval, which builds hierarchical chunk‑to‑chunk or chunk‑to‑retriever links—to boost the quality and flexibility of Retrieval‑Augmented Generation pipelines.

Fusion Retrieval · LLM · LangChain
12 min read
Bilibili Tech
Jun 7, 2024 · Artificial Intelligence

AI Development for Frontend Developers: From Basics to Agent Implementation

This article guides frontend developers through AI development, comparing model training, fine‑tuning, prompt engineering, and Retrieval‑Augmented Generation, then explains agent creation via ReAct and tool‑call methods, and showcases LangChain and Flowise as low‑code frameworks for building domain‑specific AI agents.

AI Development · Agent · Flowise
13 min read
Sohu Tech Products
Jun 5, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Concepts, Workflow, and LangChain Implementation

The article outlines LLM issues such as hallucination, outdated knowledge, and data privacy, then explains Retrieval‑Augmented Generation—detailing its data‑preparation and query‑time retrieval workflow, demonstrates a full LangChain implementation, and contrasts RAG with fine‑tuning as complementary strategies for up‑to‑date, grounded responses.

LLM · LangChain · Prompt engineering
15 min read
Tencent Cloud Developer
Jun 5, 2024 · Artificial Intelligence

Introduction to AI Development and Practical Applications

The article surveys AI development from early GPT experiments to real‑world deployments, explaining how tools like LangChain and Retrieval‑Augmented Generation enable sophisticated agents, multi‑prompt workflows, and function calls for chatbots, education, and creative content while addressing accuracy, resource, and ethical challenges.

AI Demos · AI Development · Agent Frameworks
34 min read
JD Retail Technology
Jun 4, 2024 · Databases

How to Deploy and Query JD’s Open‑Source Vearch Vector Database for LLM Retrieval

This article walks through the practical use of JD’s self‑developed Vearch vector database—covering cluster creation, space setup, data insertion, and both text and vector search—illustrating how it integrates with LangChain and OpenAI embeddings to enable retrieval‑augmented generation for large language models.

Embedding · LLM Retrieval · LangChain
16 min read
Baobao Algorithm Notes
Jun 3, 2024 · Artificial Intelligence

Can Adversarial Training Make Retrieval‑Augmented Generators More Robust?

Recent arXiv work introduces ATM, an adversarially‑tuned multi‑agent system that iteratively pits a fake‑knowledge attacker against a generator, dramatically improving retrieval‑augmented language models’ resistance to hallucinated content and boosting performance on knowledge‑intensive benchmarks, even with noisy or irrelevant documents.

RAG · adversarial training · hallucination mitigation
12 min read
JD Tech
May 31, 2024 · Artificial Intelligence

Understanding Large Language Models, Retrieval‑Augmented Generation, and AI Agents: Concepts, Engineering Practices, and Applications

This article explains the fundamentals and engineering practices of large language models (LLM), retrieval‑augmented generation (RAG) and AI agents, compares small and large embedding models, provides Python code for vector‑database RAG with Chroma, and discusses integration, use cases, and future challenges in AI development.

AI Engineering · AI agents · LLM
41 min read
G7 EasyFlow Tech Circle
May 29, 2024 · Artificial Intelligence

Engineering Large Model Enterprise Applications: Best Practices

This article outlines the key characteristics of large‑model enterprise applications, compares them with consumer use cases, and presents a comprehensive engineering roadmap—including model selection, knowledge‑base integration, tool implementation, intent recognition, output control, high‑availability deployment, and ongoing optimization—to help practitioners effectively harness AI models in real‑world business environments.

AI Engineering · Large Model · RAG
12 min read
37 Interactive Technology Team
May 27, 2024 · Artificial Intelligence

Enhancing AI Code Review Quality with Contextual Embedding and Function Calling

The article explains how AI code reviews suffer from missing context, and improves them by embedding the codebase, using Retrieval‑Augmented Generation to fetch relevant snippets, and adding a function‑calling tool that lets the model autonomously request additional code, resulting in precise, bug‑detecting feedback.

AI code review · Embedding · Function Calling
8 min read
Baidu Intelligent Cloud Tech Hub
May 27, 2024 · Databases

Baidu’s Enterprise Vector Database: Architecture, Performance, and RAG Secrets

An exclusive interview with Baidu’s senior database architects reveals the motivations behind building a dedicated enterprise vector database, details its novel column‑store engine, C++‑based retrieval stack, performance gains over open‑source solutions, multi‑modal support, RAG integration, and future research directions.

AI · RAG · Storage Engine
28 min read
Eric Tech Circle
May 22, 2024 · Artificial Intelligence

Deploy and Build AI Apps with Dify: A Complete Open‑Source Guide

This article introduces Dify, an open‑source LLM application platform, outlines its core features such as workflows, model support, RAG pipelines, agents, and observability, compares it with alternatives, and provides step‑by‑step deployment instructions using Docker Compose and Helm for local and Kubernetes environments.

AI Platform · Docker · Kubernetes
7 min read
Baidu Tech Salon
May 10, 2024 · Artificial Intelligence

Baidu Comate: Core Capabilities of Intelligent Code Assistant

The article surveys Baidu Comate, an AI‑powered code assistant built on the Wenxin (ERNIE) large model, tracing software development from the 1950s crisis through the internet and open‑source era to today’s AI‑driven tools, and highlights its features and demonstration at a global development conference.

AI Coding · Baidu Comate · IDE plugin
7 min read
DataFunSummit
May 10, 2024 · Artificial Intelligence

LLMOps: Definition, Fine‑tuning Techniques, Application Architecture, Challenges and Solutions

This article introduces LLMOps by defining large language model operations, explains the three stages of LLM development, details modern fine‑tuning methods such as PEFT, Adapter, Prefix, Prompt and LoRA, outlines the architecture for building LLM applications, discusses the main difficulties of agent‑based deployments, and presents practical solutions including Prompt IDE, low‑code deployment, monitoring and cost control.

AI Operations · Fine-tuning · LLMOps
14 min read
Java Backend Technology
May 8, 2024 · Artificial Intelligence

Explore the Latest Open‑Source AI Projects: Llama 3, MaxKB, Phidata & RAGFlow

This article highlights four cutting‑edge open‑source AI initiatives—Meta’s Llama 3 large language model, the MaxKB knowledge‑base Q&A system, the Phidata framework for building AI assistants, and the RAGFlow retrieval‑augmented generation engine—detailing their capabilities, licensing, and where to access the code.

AI · Knowledge Base · LLM
7 min read
21CTO
May 6, 2024 · Databases

How Oracle’s New 23ai Database Brings AI-Powered Vector Search to Enterprises

Oracle’s latest release, Database 23ai, upgrades its 23c platform with AI-driven vector search, RAG capabilities, and enhanced JSON and graph querying, positioning the database as a unified, secure, and scalable solution for handling structured, semi‑structured, and unstructured data across cloud and on‑premises environments.

AI · Oracle · RAG
7 min read
AI Large Model Application Practice
May 3, 2024 · Artificial Intelligence

Can Giant Context LLMs Replace RAG? Exploring the Limits of Long‑Context Retrieval

This article examines whether the rapid growth of large‑language‑model context windows can eliminate the need for retrieval‑augmented generation, presenting experimental needle‑in‑a‑haystack tests, analysis of model performance across token lengths and needle positions, and practical guidance using an open‑source evaluation tool.

AI · LLM · Needle-in-a-Haystack
13 min read
DataFunTalk
Apr 29, 2024 · Artificial Intelligence

Practical Experience and Q&A Exploration of Patent Large Models

This article presents a comprehensive overview of the development, training, data preparation, algorithmic strategies, evaluation methods, and RAG integration for a domain‑specific patent large language model, highlighting challenges, practical results, and future research directions.

Domain-specific Model · Patent AI · RAG
19 min read
Rare Earth Juejin Tech Community
Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems, detailing common failure points, architectural components such as authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations, and provides actionable best‑practice recommendations for developers and technical leaders.

Enterprise AI · LLM · Observability
32 min read
Huolala Tech
Apr 25, 2024 · Artificial Intelligence

How LLM‑Powered Multi‑Agent AI Boosts Vehicle Selection in HuoLala’s Customer Service

This article details the design and implementation of an LLM‑driven multi‑agent AI customer‑service assistant for vehicle selection at HuoLala, covering system architecture, algorithmic solutions, retrieval‑augmented generation, NLU/NLG agents, performance improvements, and future outlooks.

AI Customer Service · LLM · Multi-Agent System
12 min read
DevOps
Apr 17, 2024 · Artificial Intelligence

Engineering Capabilities for Enterprise Large Model Applications: Prompt Engineering, RAG, and Fine‑Tuning

The article explores how enterprises can build and improve large‑model applications by combining prompt engineering, retrieval‑augmented generation (RAG), and fine‑tuning, discusses their relationships, optimization dimensions, testing challenges, and provides practical guidance for SE4AI implementation.

AI Engineering · Enterprise AI · Fine-tuning
20 min read
21CTO
Apr 12, 2024 · Artificial Intelligence

How I Built an AI‑Powered Resume Chatbot with LLMs and RAG

Senior developer Jon Olson shares how he created an AI resume assistant using GPT‑4/3.5, LangChain, LlamaIndex, and retrieval‑augmented generation, detailing prompt engineering, backend integration, and future routing features to help job seekers showcase their skills.

AI chatbot · LLM · LangChain
8 min read
Rare Earth Juejin Tech Community
Apr 12, 2024 · Artificial Intelligence

Typical Business and Technical Architectures for Large Language Model Applications

This article reviews the common business and technical architectures used in large language model (LLM) applications, explains AI Embedded, AI Copilot, and AI Agent modes—including single‑ and multi‑agent systems—and offers guidance on selecting appropriate technology stacks such as prompt‑only, function‑calling agents, RAG, and fine‑tuning.

AI Agent · Fine-tuning · LLM
9 min read
Eric Tech Circle
Apr 11, 2024 · Artificial Intelligence

Build a Generative AI RAG App with Spring AI in Minutes

This guide walks you through setting up Spring AI, configuring model providers and vector stores, initializing a Spring Boot project, adding OpenAI credentials, and running a complete RAG (Retrieval‑Augmented Generation) demo with code snippets and sample API calls.

Java · OpenAI · RAG
15 min read
HelloTech
Apr 10, 2024 · Artificial Intelligence

An Overview of LangChain: Architecture, Core Components, and Code Examples

LangChain is an open‑source framework that provides Python and JavaScript SDKs, templates, and services such as LangServe and LangSmith to compose models, embeddings, prompts, indexes, memory, chains, and agents via a concise expression language, enabling rapid prototyping, debugging, and deployment of LLM‑driven applications.

AI Engineering · JavaScript · LLM
19 min read
Alibaba Cloud Developer
Apr 10, 2024 · Artificial Intelligence

Master LangChain in 10 Minutes: From Basics to Advanced AI Engineering

This guide walks AI engineers through a 10‑minute bootstrap of LangChain, explaining its purpose, core concepts, design questions, environment setup, and step‑by‑step code examples that cover APIs, chains, memory, retrieval‑augmented generation, tools, agents, and the overall architecture.

AI Engineering · LLM · LangChain
28 min read
Rare Earth Juejin Tech Community
Apr 8, 2024 · Artificial Intelligence

PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

The article introduces PreFLMR, an open‑source, general‑purpose pre‑trained multimodal retriever that leverages fine‑grained late‑interaction to boost retrieval‑augmented generation for knowledge‑intensive visual tasks, describes its M2KR benchmark, training stages, and strong experimental results across multiple tasks.

AI · FLMR · Knowledge Retrieval
11 min read
Rare Earth Juejin Tech Community
Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI botCozeLLM
0 likes · 25 min read
AI Large Model Application Practice
Mar 29, 2024 · Artificial Intelligence

How RAG Architecture Evolves: From Simple Chains to Flexible RAG Flows

This article examines the evolution of Retrieval‑Augmented Generation (RAG) architectures for large language models, outlines the challenges they face, introduces the modular RAG Flow concept with four workflow paradigms, and provides a step‑by‑step implementation using LangChain and LlamaIndex with code examples.

LLM · LangChain · RAG
0 likes · 15 min read
Sohu Tech Products
Mar 27, 2024 · Artificial Intelligence

Building a RAG Application with Baidu Vector Database and Qianfan Embedding

This tutorial walks through building a Retrieval‑Augmented Generation application by setting up Baidu’s Vector Database and Qianfan embedding service, configuring credentials, creating a document database and vector table, loading and chunking PDFs, generating embeddings, storing them, and performing scalar, vector and hybrid similarity searches, ready for integration with Wenxin LLM for answer generation.

AI applications · Baidu Qianfan · Embedding
0 likes · 11 min read
Sohu Tech Products
Mar 27, 2024 · Artificial Intelligence

NVIDIA NeMo Framework, TensorRT‑LLM, and RAG for Large Language Model Solutions

NVIDIA’s comprehensive LLM ecosystem combines the full‑stack NeMo Framework for data curation, distributed training, and fine‑tuning with inference acceleration through TensorRT‑LLM and Triton, plus Retrieval‑Augmented Generation and Guardrails, enabling efficient, low‑latency, knowledge‑grounded model deployment across clusters.

AI acceleration · Model Training · NeMo Framework
0 likes · 16 min read
Eric Tech Circle
Mar 24, 2024 · Artificial Intelligence

Running Local LLMs: Ollama vs Hugging Face – A Hands‑On Comparison

This guide compares Ollama and Hugging Face for running large language models locally, detailing API and local execution methods, installation steps, model selection, resource requirements, integration with AnythingLLM, container deployment, embedding and vector store setup, and practical observations on performance and limitations.

AnythingLLM · Docker · Embedding
0 likes · 15 min read
NewBeeNLP
Mar 18, 2024 · Artificial Intelligence

Mastering RAG and LLM Techniques: From Retrieval to Fine‑Tuning

This article provides a comprehensive technical guide on Retrieval‑Augmented Generation (RAG), open‑source large language models such as LLaMA, fine‑tuning methods, evaluation metrics, memory‑optimization tricks, and attention‑related optimizations for modern AI systems.

LLM · LangChain · Memory Optimization
0 likes · 19 min read
DataFunTalk
Mar 15, 2024 · Artificial Intelligence

NVIDIA’s NeMo Framework and TensorRT‑LLM: Full‑Stack Solutions for Large Language Models and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end ecosystem for large language models, covering the NeMo Framework’s data processing, distributed training, model fine‑tuning, inference acceleration with TensorRT‑LLM, deployment via Triton, and Retrieval‑Augmented Generation (RAG) techniques that enhance model reliability and performance.

AI · NeMo · Nvidia
0 likes · 16 min read
Sohu Tech Products
Mar 13, 2024 · Artificial Intelligence

Build a Minimal Retrieval‑Augmented Generation (Tiny‑RAG) from Scratch

This step‑by‑step guide explains how to implement a lightweight Retrieval‑Augmented Generation system—Tiny‑RAG—by creating embedding classes, loading and chunking documents, building a simple vector store, performing similarity search, and integrating a large language model for answer generation, complete with runnable Python code.

Embedding · LLM · Python
0 likes · 14 min read
Baidu Geek Talk
Mar 13, 2024 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG) and Building a Personal Knowledge Base with ERNIE SDK and LangChain

The article explains Retrieval-Augmented Generation (RAG), its workflow, its advantages, and how it compares with fine-tuning, and provides a step-by-step implementation using Baidu's ERNIE SDK, LangChain, and ChromaDB to build a personal knowledge base that answers queries with retrieved context.

AI · ERNIE SDK · Knowledge Base
0 likes · 13 min read
Xiaohe Frontend Team
Mar 6, 2024 · Artificial Intelligence

What the New “Generative AI Act Two” Reveals About the Next AI Wave

Sequoia Capital’s “Generative AI Act Two” report highlights a shift from hype‑driven model releases to user‑centric, end‑to‑end solutions, emphasizing the rise of foundational models as components, the importance of developer tools, emerging RAG and fine‑tuning techniques, and the evolving competitive landscape.

AI Market · Fine-tuning · Foundational models
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
Feb 27, 2024 · Artificial Intelligence

Build a Knowledge‑Enhanced LLM Chatbot with Alibaba Cloud PAI: A Step‑by‑Step RAG Guide

This comprehensive guide walks AI developers through building a Retrieval‑Augmented Generation (RAG) chatbot on Alibaba Cloud PAI, covering architecture, vector store setup, model deployment, knowledge ingestion, multi‑modal retrieval, fusion, re‑ranking, prompt design, and end‑to‑end configuration with code examples.

Alibaba Cloud · Chatbot · LLM
0 likes · 26 min read
Rare Earth Juejin Tech Community
Feb 25, 2024 · Artificial Intelligence

Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course

This article reviews the author’s hands‑on experience with Pinecone’s serverless vector database, various embedding and generation models such as all‑MiniLM‑L6‑v2, text‑embedding‑ada‑002, clip‑ViT‑B‑32, and GPT‑3.5‑turbo‑instruct, and demonstrates how they are applied to semantic search, RAG, recommendation, hybrid, and facial similarity tasks using Python code examples.

AI · Pinecone · Python
0 likes · 9 min read
Cloud Native Technology Community
Feb 8, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts LLM Accuracy and Trust

Retrieval‑augmented generation (RAG) enhances large language models by fetching up‑to‑date, authoritative information from external sources, addressing hallucinations, outdated knowledge, and lack of citations, while offering cost‑effective implementation, improved relevance, user trust, and greater developer control through vector databases, semantic search, and prompt engineering.

AI · Prompt engineering · RAG
0 likes · 10 min read
Baobao Algorithm Notes
Feb 4, 2024 · Industry Insights

Balancing Fun, Utility, and Slow Thinking: The Future of AI Agents

In this talk, the speaker examines the dual goals of AI agents—being entertaining and useful—while introducing the concepts of fast and slow thinking, multimodal perception, long‑term memory, retrieval‑augmented generation, and tool integration as essential steps toward building truly valuable digital companions.

AI agents · Future AI · Long-term Memory
0 likes · 18 min read
Rare Earth Juejin Tech Community
Jan 31, 2024 · Artificial Intelligence

Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB

This tutorial demonstrates how to build an advanced Retrieval‑Augmented Generation (RAG) system for semi‑structured PDF data by leveraging LangChain, the unstructured library, ChromaDB vector store, and OpenAI models, covering installation, PDF partitioning, element classification, summarization, and query execution.

AI · ChromaDB · LangChain
0 likes · 11 min read
DaTaobao Tech
Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

The guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them with a FastAPI OpenAI‑compatible service, routing requests through a One‑API gateway, storing metadata in MongoDB and vectors in PostgreSQL pgvector, deploying FastGPT for RAG, ingesting data, and demonstrating 5‑7 second response times, while outlining future improvements.

ChatGLM3 · Deployment · FastAPI
0 likes · 23 min read
Baobao Algorithm Notes
Dec 6, 2023 · Artificial Intelligence

How to Systematically Fix Bad Cases in Large Language Models

The article outlines a structured approach to identifying, categorizing, evaluating impact, and repairing undesirable responses from large language models, covering both model‑level interventions across training stages and practical inference‑time techniques such as parameter tuning, prompt engineering, RAG, and pre/post‑processing safeguards.

Model Alignment · Prompt engineering · RAG
0 likes · 9 min read
DataFunTalk
Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases gain traction because they cut storage, learning, and scaling costs and offset the limitations of large models. They do so through semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures, which makes them essential for modern AI applications despite the promotional context of the article.

AI · Big Data · RAG
0 likes · 11 min read
Architect
Nov 8, 2023 · Artificial Intelligence

AI Agents Unleashed: From Assistants API to Multi‑Agent Frameworks

The article dissects the rise of AI agents—from OpenAI's Assistants API and multimodal perception‑brain‑action pipelines to retrieval‑augmented generation, tool‑use strategies, single‑ and multi‑agent deployments, and emerging frameworks like AutoGen—while highlighting concrete examples, benchmark results, and current limitations.

AI agents · Assistants API · Embodied AI
0 likes · 38 min read
AI Large Model Application Practice
Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

LLM · LangChain · PDF processing
0 likes · 12 min read
dbaplus Community
Oct 14, 2023 · Artificial Intelligence

Demystifying Retrieval‑Augmented Generation: From Theory to Working Chatbot

This guide explains the Retrieval‑Augmented Generation (RAG) technique, detailing how user queries are matched to private knowledge bases, how relevant passages are retrieved, and how large language models use those passages to generate context‑aware answers, complete with code examples and practical tips.

Chatbot · Embedding · LLM
0 likes · 19 min read
phodal
Sep 24, 2023 · Artificial Intelligence

Designing a JVM‑Based LLM Framework: Insights from Chocolate Factory

This article explores the design principles, architectural decisions, and practical code examples behind the Chocolate Factory framework, a JVM‑centric LLM development platform inspired by LangChain, LlamaIndex, Spring AI, and PromptFlow, highlighting SDK construction, RAG workflows, and prompt engineering challenges.

AI Development · Framework · JVM
0 likes · 11 min read
Java High-Performance Architecture
Aug 18, 2023 · Databases

Redis 7.2 Unified Release: Boost AI, Vector Search, and Real‑Time Functions

Redis 7.2, the first Unified Redis Release, introduces AI‑ready vector indexing, hybrid semantic search, scalable RAG support, server‑side Triggers and Functions, enhanced geospatial queries, and a preview of high‑performance searchable indexes, while expanding client library support and integrating Redis Data Integration for seamless enterprise data pipelines.

AI · RAG · Serverless Functions
0 likes · 8 min read