Tagged articles

891 articles

Jul 30, 2024 · Artificial Intelligence

What Does Galileo’s New Hallucination Index Reveal About Today’s Top Generative AI Models?

Galileo’s Hallucination Index evaluates 22 leading generative AI models using a contextual‑adherence metric, ranking Claude 3.5 Sonnet as the overall RAG leader, Gemini 1.5 Flash as the most cost‑effective, and highlighting open‑source and context‑length performance nuances for AI practitioners.

AI · Model Evaluation · RAG

0 likes · 5 min read

Model Perspective

Jul 30, 2024 · Artificial Intelligence

Your Complete AI Learning Roadmap: From Basics to Large Model Mastery

This guide presents a comprehensive AI learning roadmap, dividing study into five progressive stages—from foundational math and programming to core deep‑learning and reinforcement‑learning techniques, large‑model training, industry applications, and future trends—plus curated book lists, tool recommendations, and practical RAG tutorials.

AI learning roadmap · AI resources · RAG

0 likes · 9 min read

Tencent Cloud Developer

Jul 30, 2024 · Artificial Intelligence

A Systematic Guide to Prompt Engineering: From Zero to One

This guide walks readers from beginner to proficient Prompt Engineer by outlining the evolution of prompting, introducing a universal four‑component template, and detailing a five‑step workflow—including refinement, retrieval‑augmented generation, chain‑of‑thought reasoning, and advanced tuning techniques—plus evaluation metrics for LLM performance.

AI prompting · LLM optimization · Prompt engineering

0 likes · 51 min read

phodal

Jul 24, 2024 · Artificial Intelligence

How to Build Trustworthy Coding Agents with Shire’s Custom RAG Workflow

This article explains how to use the Shire language to create reliable coding agents by defining custom RAG workflows, leveraging IDE APIs, code verification functions, and vector‑based search, with detailed examples, configuration snippets, and a roadmap for future enhancements.

AI · Coding Agent · IDE

0 likes · 10 min read

Alibaba Cloud Developer

Jul 22, 2024 · Artificial Intelligence

How Alibaba’s Logistics AI Overcame B2B Large Model Challenges

Alibaba’s logistics AI team shares their year‑long journey building a vertical‑domain large language model for logistics, detailing model alignment, Text2API, RAG, SFT techniques, challenges like accuracy and knowledge‑base maintenance, and showcasing real‑world applications such as chatbots, DingTalk assistants, and custom AI assistants.

Model Alignment · RAG · SFT

0 likes · 16 min read

DevOps

Jul 21, 2024 · Artificial Intelligence

LLM Fundamentals, Applications, Prompt Engineering, RAG, and Agentic Workflows

This article provides a comprehensive overview of large language models (LLMs), covering their basic concepts, relationship with NLP, development history, parameter scaling, offline deployment, practical applications, prompt‑engineering frameworks, retrieval‑augmented generation, LangChain integration, agents, workflow orchestration, and future directions toward multimodal AI and AGI.

AI applications · Agent · LLM

0 likes · 36 min read

DaTaobao Tech

Jul 19, 2024 · Artificial Intelligence

Practices and Techniques for Vertical Domain Large Language Models

Vertical domain large language models, fine‑tuned on specialized data, deliver higher expertise and task performance, but require continual knowledge updates and careful alignment; techniques such as BPO‑guided instruction tuning (+1.8% accuracy), Reflexion‑based Text2API (+4% API correctness), advanced RAG preprocessing, and SFT combined with ORPO (+5.2% gain) demonstrate notable improvements while underscoring remaining challenges and collaborative opportunities.

AI · Alignment · RAG

0 likes · 9 min read

Tencent Cloud Developer

Jul 18, 2024 · Artificial Intelligence

Exploring Large Language Models (LLM): Fundamentals, Applications, and Future Directions

This article surveys large language models: their core concepts, evolution through Transformers, GPT and BERT, generation challenges, diverse applications such as QA, multimodal creation, summarization and retrieval‑augmented generation, prompt‑engineering frameworks and tools, LangChain‑based pipelines, AI‑driven agents, and future prospects toward domain‑specific use, multimodality, and AGI.

AI · Agent · LLM

0 likes · 35 min read

JD Tech Talk

Jul 16, 2024 · Artificial Intelligence

Task‑Aware Decoding (TaD): A Plug‑and‑Play Method to Mitigate Hallucinations in Large Language Models

TaD, a task‑aware decoding technique jointly developed by JD.com and Tsinghua University and presented at IJCAI 2024, leverages differences between pre‑ and post‑fine‑tuned LLM outputs to construct knowledge vectors, significantly reducing hallucinations across various models, tasks, and data‑scarce scenarios, especially when combined with RAG.

AI · LLM · RAG

0 likes · 18 min read

Architect

Jul 13, 2024 · Artificial Intelligence

Practical Guide to Building LLM Products: Prompt Engineering, RAG, Evaluation, and Operations

This article provides a comprehensive, step‑by‑step guide for developing large‑language‑model (LLM) applications, covering prompt design techniques, n‑shot and chain‑of‑thought strategies, retrieval‑augmented generation, structured I/O, workflow optimization, evaluation pipelines, operational best practices, and team organization to create reliable, scalable AI products.

AI Operations · LLM · Product Development

0 likes · 54 min read

AsiaInfo Technology: New Tech Exploration

Jul 12, 2024 · Artificial Intelligence

How AI‑Native Transforms User Experience Management in Telecom Networks

This article examines how the AI‑Native approach reshapes the AISWare CEM platform by integrating large language models, Retrieval‑Augmented Generation, and atomic capability decomposition to improve user perception, streamline interactions, and enable intelligent diagnostic assistants for telecom operators.

AI-native · Atomic Capabilities · Diagnostic Assistant

0 likes · 12 min read

JD Tech

Jul 10, 2024 · Artificial Intelligence

Implementing Retrieval‑Augmented Generation (RAG) with LangChain4j in Java

This article provides a step‑by‑step guide for Java engineers on building a Retrieval‑Augmented Generation (RAG) application using the LangChain4j framework, covering RAG fundamentals, environment setup, Maven integration, document loading, splitting, embedding with OpenAI, vector store management with Chroma, and prompt‑based LLM interaction.

Embedding · Java · LLM

0 likes · 35 min read

21CTO

Jul 7, 2024 · Artificial Intelligence

How to Build a Secure Local LLM Chatbot with Ollama, Python, and ChromaDB

This tutorial walks you through creating a privacy‑preserving, locally hosted large language model chatbot using Ollama, Python 3, and ChromaDB, covering RAG fundamentals, GPU selection, environment setup, and full source code for a Flask‑based application.

ChromaDB · LLM · Ollama

0 likes · 19 min read

AI Large Model Application Practice

Jul 4, 2024 · Artificial Intelligence

Mastering Multimodal RAG: From PDF Parsing to Advanced Query Rewriting

This article explains how to handle complex multimodal PDFs in RAG systems, outlines extraction, indexing, and multimodal model integration, details four query‑rewriting strategies (HyDE, stepwise, sub‑question, backward), and presents key evaluation metrics and tools for assessing RAG performance.

Document Parsing · Query Rewriting · RAG

0 likes · 12 min read

Baidu Geek Talk

Jul 3, 2024 · Databases

How Vector Databases Power AI‑Driven Retrieval: Inside Baidu’s VectorDB

This article reviews the evolution of databases and large models, explains vector database fundamentals and RAG pipelines, and details Baidu's VectorDB architecture, performance advantages, and its role in AI‑enhanced database operations.

AI integration · Database operations · RAG

0 likes · 15 min read

AntTech

Jul 2, 2024 · Artificial Intelligence

Design and Implementation of a Generalized Retrieval‑Augmented Generation (RAG) Framework with Graph RAG Support

This article surveys Retrieval‑Augmented Generation (RAG), analyzes the limitations of traditional vector‑based RAG, introduces Graph RAG that leverages knowledge graphs for more reliable context, proposes a universal RAG architecture compatible with vector, graph and full‑text indexes, and details its open‑source implementation, code components, testing, and future research directions.

AIEngineering · GraphRAG · KnowledgeGraph

0 likes · 26 min read

JD Tech

Jun 28, 2024 · Artificial Intelligence

An Overview of Large Language Models: History, Fundamentals, Prompt Engineering, Retrieval‑Augmented Generation, Agents, and Multimodal AI

This article provides a comprehensive introduction to large language models, covering their historical development, core architecture, training process, prompt engineering techniques, Retrieval‑Augmented Generation, agent frameworks, multimodal capabilities, safety challenges, and future research directions.

AI Safety · AI agents · Deep Learning

0 likes · 22 min read

Baobao Algorithm Notes

Jun 27, 2024 · Artificial Intelligence

Engineering Data for R&D Large Language Models: From Pre‑training to Prompt Design

This article presents a comprehensive guide to data engineering for research‑focused large language models, covering domain‑adaptive pre‑training, supervised fine‑tuning, retrieval‑augmented generation, dataset construction, data cleaning pipelines, tokenizer adaptation, and prompt engineering best practices to boost model performance in specialized tasks.

Fine‑Tuning · LLM · RAG

0 likes · 20 min read

Alibaba Cloud Developer

Jun 27, 2024 · Artificial Intelligence

How to Supercharge Retrieval‑Augmented Generation: Papers, Techniques, and Real‑World Tips

This article surveys the main challenges of deploying large language models, introduces key RAG optimization papers such as RAPTOR, Self‑RAG, and CRAG, and compiles practical engineering tricks—including chunking, query rewriting, hybrid and progressive retrieval—to help practitioners build more accurate and efficient RAG systems.

AI research · LLM optimization · RAG

0 likes · 22 min read

Baidu Intelligent Cloud Tech Hub

Jun 21, 2024 · Databases

How Vector Databases and Large Models are Transforming AI-Driven Database Operations

This article reviews the evolution of databases and large models, explains the role of vector databases and Retrieval‑Augmented Generation (RAG) in AI‑enhanced data management, and showcases Baidu Cloud's VectorDB and DBSC solutions for intelligent database operations and knowledge‑driven services.

AI4DB · Database operations · RAG

0 likes · 15 min read

DataFunTalk

Jun 21, 2024 · Artificial Intelligence

Fine‑tuning Large Language Models with Alibaba Cloud PAI: Practices, Techniques, and Deployment

This article introduces the Alibaba Cloud PAI platform for large language model (LLM) fine‑tuning, covering model‑training pipelines, performance‑cost trade‑offs, retrieval‑augmented generation, fine‑tuning methods such as full‑parameter, LoRA and QLoRA, model selection, data preparation, evaluation, and real‑world deployment examples.

AI Platform · Fine-tuning · LLM

0 likes · 20 min read

JD Cloud Developers

Jun 20, 2024 · Artificial Intelligence

How Large Language Models Boost Courier Efficiency: From Voice Commands to Smart QA

This article explains how large language models like ChatGPT can transform courier operations by automating voice‑driven tasks, enabling intelligent question answering with retrieval‑augmented generation, extracting and splitting document content, embedding it for vector search, and delivering smart prompts and agents to improve productivity and accuracy.

AI · Embedding · Logistics

0 likes · 15 min read

Architecture & Thinking

Jun 19, 2024 · Artificial Intelligence

Build AI‑Native Apps Quickly with Spring AI: From Chat Models to RAG

This guide explains what an AI‑native application is, compares AI‑native and AI‑based approaches, and walks through Spring AI’s core features—including chat models, prompt templates, function calling, structured output, image generation, embedding, and vector stores—showing step‑by‑step code examples and how to assemble a complete AI‑native app with RAG support.

AI native application · Function Calling · Java

0 likes · 43 min read

JD Tech

Jun 19, 2024 · Artificial Intelligence

Advances in Large AI Models: Prompt Engineering, RAG, Agents, Fine‑Tuning, Vector Databases and Knowledge Graphs

This article surveys the rapid expansion of large AI models, covering prompt engineering, structured prompts, retrieval‑augmented generation, AI agents, fine‑tuning strategies, vector database technology, knowledge graphs, function calling, and their collective role in moving toward artificial general intelligence.

AI · Agent · Fine‑tuning

0 likes · 23 min read

AI Large Model Application Practice

Jun 17, 2024 · Artificial Intelligence

Boost Your RAG Pipeline with Cohere and BGE Rerank Models

This guide explains why post‑retrieval reranking is essential for Retrieval‑Augmented Generation, compares the commercial Cohere Rerank service with the open‑source bge‑reranker‑large model, and provides step‑by‑step code for integrating both into LlamaIndex pipelines, including a custom TEI‑based processor.

BGE · Cohere · LlamaIndex

0 likes · 11 min read

JD Tech Talk

Jun 14, 2024 · Artificial Intelligence

Building a Retrieval‑Augmented Generation (RAG) System with JD Cloud Docs, ClickHouse, LangChain, and FastAPI

This guide explains how to build a Retrieval‑Augmented Generation (RAG) system using JD Cloud documentation as a knowledge base, storing document embeddings in ClickHouse, leveraging LangChain for vector retrieval, and exposing query and answer services via FastAPI and a Gradio UI.

AI · ClickHouse · FastAPI

0 likes · 13 min read

JD Cloud Developers

Jun 14, 2024 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System Using JD Cloud Docs and ClickHouse

This guide walks through creating a Retrieval‑Augmented Generation pipeline that harvests JD Cloud documentation, stores vector embeddings in ClickHouse, and serves queries via FastAPI, LangChain, a Qwen LLM, and a Gradio front‑end.

ClickHouse · FastAPI · LLM

0 likes · 14 min read

Alibaba Cloud Big Data AI Platform

Jun 14, 2024 · Artificial Intelligence

How Alibaba Cloud OpenSearch Powers RAG: Insights from AICon 2024

In this talk, Alibaba Cloud's OpenSearch RAG team shares their year‑long journey of building retrieval‑augmented generation systems, covering data parsing, slicing, vectorization, hybrid retrieval, model fine‑tuning, performance optimizations, cost reduction, and future directions such as multimodal queries and agents.

AI search · Hybrid Retrieval · LLM

0 likes · 25 min read

Alibaba Cloud Developer

Jun 11, 2024 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: Challenges, Paradigms, and Engineering Best Practices

This article explores Retrieval‑Augmented Generation (RAG) by outlining its background, inherent challenges such as knowledge limits and hallucinations, describing the Naïve, Advanced, and Modular RAG paradigms, and presenting practical engineering strategies for pre‑retrieval, retrieval, and post‑retrieval optimization.

Knowledge Retrieval · NLP · RAG

0 likes · 25 min read

AI Large Model Application Practice

Jun 7, 2024 · Artificial Intelligence

Mastering Advanced Retrieval: Fusion and Recursive Strategies for RAG

This article explores two advanced retrieval paradigms—Fusion Retrieval, which merges results from multiple retrievers using re‑ranking, and Recursive Retrieval, which builds hierarchical chunk‑to‑chunk or chunk‑to‑retriever links—to boost the quality and flexibility of Retrieval‑Augmented Generation pipelines.

Fusion Retrieval · LLM · LangChain

0 likes · 12 min read

Bilibili Tech

Jun 7, 2024 · Artificial Intelligence

AI Development for Frontend Developers: From Basics to Agent Implementation

This article guides frontend developers through AI development, comparing model training, fine‑tuning, prompt engineering, and Retrieval‑Augmented Generation, then explains agent creation via ReAct and tool‑call methods, and showcases LangChain and Flowise as low‑code frameworks for building domain‑specific AI agents.

AI Development · Agent · Flowise

0 likes · 13 min read

Sohu Tech Products

Jun 5, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Concepts, Workflow, and LangChain Implementation

The article outlines LLM issues such as hallucination, outdated knowledge, and data privacy, then explains Retrieval‑Augmented Generation—detailing its data‑preparation and query‑time retrieval workflow, demonstrates a full LangChain implementation, and contrasts RAG with fine‑tuning as complementary strategies for up‑to‑date, grounded responses.

LLM · LangChain · Prompt engineering

0 likes · 15 min read

Tencent Cloud Developer

Jun 5, 2024 · Artificial Intelligence

Introduction to AI Development and Practical Applications

The article surveys AI development from early GPT experiments to real‑world deployments, explaining how tools like LangChain and Retrieval‑Augmented Generation enable sophisticated agents, multi‑prompt workflows, and function calls for chatbots, education, and creative content while addressing accuracy, resource, and ethical challenges.

AI Demos · AI Development · Agent Frameworks

0 likes · 34 min read

JD Retail Technology

Jun 4, 2024 · Databases

How to Deploy and Query JD’s Open‑Source Vearch Vector Database for LLM Retrieval

This article walks through the practical use of JD’s self‑developed Vearch vector database—covering cluster creation, space setup, data insertion, and both text and vector search—illustrating how it integrates with LangChain and OpenAI embeddings to enable retrieval‑augmented generation for large language models.

Embedding · LLM Retrieval · LangChain

0 likes · 16 min read

Baobao Algorithm Notes

Jun 3, 2024 · Artificial Intelligence

Can Adversarial Training Make Retrieval‑Augmented Generators More Robust?

Recent arXiv work introduces ATM, an adversarially‑tuned multi‑agent system that iteratively pits a fake‑knowledge attacker against a generator, dramatically improving retrieval‑augmented language models’ resistance to hallucinated content and boosting performance on knowledge‑intensive benchmarks, even with noisy or irrelevant documents.

RAG · adversarial training · hallucination mitigation

0 likes · 12 min read

JD Tech

May 31, 2024 · Artificial Intelligence

Understanding Large Language Models, Retrieval‑Augmented Generation, and AI Agents: Concepts, Engineering Practices, and Applications

This article explains the fundamentals and engineering practices of large language models (LLM), retrieval‑augmented generation (RAG) and AI agents, compares small and large embedding models, provides Python code for vector‑database RAG with Chroma, and discusses integration, use cases, and future challenges in AI development.

AI Engineering · AI agents · LLM

0 likes · 41 min read

G7 EasyFlow Tech Circle

May 29, 2024 · Artificial Intelligence

Engineering Large Model Enterprise Applications: Best Practices

This article outlines the key characteristics of large‑model enterprise applications, compares them with consumer use cases, and presents a comprehensive engineering roadmap—including model selection, knowledge‑base integration, tool implementation, intent recognition, output control, high‑availability deployment, and ongoing optimization—to help practitioners effectively harness AI models in real‑world business environments.

AI Engineering · Large Model · RAG

0 likes · 12 min read

37 Interactive Technology Team

May 27, 2024 · Artificial Intelligence

Enhancing AI Code Review Quality with Contextual Embedding and Function Calling

The article explains how AI code reviews suffer from missing context, and improves them by embedding the codebase, using Retrieval‑Augmented Generation to fetch relevant snippets, and adding a function‑calling tool that lets the model autonomously request additional code, resulting in precise, bug‑detecting feedback.

AI code review · Embedding · Function Calling

0 likes · 8 min read

Huawei Cloud Developer Alliance

May 27, 2024 · Artificial Intelligence

Unlocking AI Large Model Potential: Key Takeaways from Shanghai HCDG AI Model Salon

The Shanghai HCDG AI Model Salon showcased cutting‑edge large‑model technologies, including ModelArts‑based LLM deployment, RAG advancements, and observability best practices, highlighting how Huawei Cloud and partners are accelerating AI adoption across industries.

AI · Huawei Cloud · LangChain

0 likes · 5 min read

Baidu Intelligent Cloud Tech Hub

May 27, 2024 · Databases

Baidu’s Enterprise Vector Database: Architecture, Performance, and RAG Secrets

An exclusive interview with Baidu’s senior database architects reveals the motivations behind building a dedicated enterprise vector database, details its novel column‑store engine, C++‑based retrieval stack, performance gains over open‑source solutions, multi‑modal support, RAG integration, and future research directions.

AI · RAG · Storage Engine

0 likes · 28 min read

AI Large Model Application Practice

May 27, 2024 · Artificial Intelligence

Building Agentic RAG with LlamaIndex: From Tool Agents to a Top Agent

This article walks through the design and implementation of an Agentic Retrieval‑Augmented Generation system using LlamaIndex, showing how to wrap multiple RAG engines as tools, orchestrate them with hierarchical AI agents, and scale the solution with tool retrieval for large document collections.

AI Agent · LlamaIndex · Python

0 likes · 14 min read

Alibaba Cloud Native

May 25, 2024 · Cloud Native

Build a Retrieval‑Augmented Generation (RAG) App with Spring Cloud Alibaba AI and Redis

This guide explains how to implement a Retrieval‑Augmented Generation (RAG) workflow by loading a beer‑info JSON dataset into a Redis vector store, wiring it with Spring Cloud Alibaba AI Starter, and exposing a web API that returns LLM‑generated answers.

AI · Java · RAG

0 likes · 5 min read

Eric Tech Circle

May 22, 2024 · Artificial Intelligence

Deploy and Build AI Apps with Dify: A Complete Open‑Source Guide

This article introduces Dify, an open‑source LLM application platform, outlines its core features such as workflows, model support, RAG pipelines, agents, and observability, compares it with alternatives, and provides step‑by‑step deployment instructions using Docker Compose and Helm for local and Kubernetes environments.

AI Platform · Docker · Kubernetes

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

May 11, 2024 · Artificial Intelligence

How to Build a High‑Performance RAG System with Milvus on Alibaba Cloud PAI

This guide explains how to integrate Milvus vector search with Alibaba Cloud PAI to create a Retrieval‑Augmented Generation (RAG) solution, covering background, prerequisites, deployment steps, configuration parameters, and practical usage through the Web UI.

AI · Alibaba Cloud · LangChain

0 likes · 7 min read

Baidu Tech Salon

May 10, 2024 · Artificial Intelligence

Baidu Comate: Core Capabilities of Intelligent Code Assistant

The article surveys Baidu Comate, an AI‑powered code assistant built on the Wenxin (ERNIE) large model, tracing software development from the 1950s crisis through the internet and open‑source era to today’s AI‑driven tools, and highlights its features and demonstration at a global development conference.

AI Coding · Baidu Comate · IDE plugin

0 likes · 7 min read

DataFunSummit

May 10, 2024 · Artificial Intelligence

LLMOps: Definition, Fine‑tuning Techniques, Application Architecture, Challenges and Solutions

This article introduces LLMOps by defining large language model operations, explains the three stages of LLM development, details modern fine‑tuning methods such as PEFT, Adapter, Prefix, Prompt and LoRA, outlines the architecture for building LLM applications, discusses the main difficulties of agent‑based deployments, and presents practical solutions including Prompt IDE, low‑code deployment, monitoring and cost control.

AI Operations · Fine-tuning · LLMOps

0 likes · 14 min read

Java Backend Technology

May 8, 2024 · Artificial Intelligence

Explore the Latest Open‑Source AI Projects: Llama 3, MaxKB, Phidata & RAGFlow

This article highlights four cutting‑edge open‑source AI initiatives—Meta’s Llama 3 large language model, the MaxKB knowledge‑base Q&A system, the Phidata framework for building AI assistants, and the RAGFlow retrieval‑augmented generation engine—detailing their capabilities, licensing, and where to access the code.

AI · Knowledge Base · LLM

0 likes · 7 min read

21CTO

May 6, 2024 · Databases

How Oracle’s New 23ai Database Brings AI-Powered Vector Search to Enterprises

Oracle’s latest release, Database 23ai, upgrades its 23c platform with AI-driven vector search, RAG capabilities, and enhanced JSON and graph querying, positioning the database as a unified, secure, and scalable solution for handling structured, semi‑structured, and unstructured data across cloud and on‑premises environments.

AI · Oracle · RAG

0 likes · 7 min read

AI Large Model Application Practice

May 3, 2024 · Artificial Intelligence

Can Giant Context LLMs Replace RAG? Exploring the Limits of Long‑Context Retrieval

This article examines whether the rapid growth of large‑language‑model context windows can eliminate the need for retrieval‑augmented generation, presenting experimental needle‑in‑a‑haystack tests, analysis of model performance across token lengths and needle positions, and practical guidance using an open‑source evaluation tool.

AI · LLM · Needle-in-a-Haystack

0 likes · 13 min read

DataFunTalk

Apr 29, 2024 · Artificial Intelligence

Practical Experience and Q&A Exploration of Patent Large Models

This article presents a comprehensive overview of the development, training, data preparation, algorithmic strategies, evaluation methods, and RAG integration for a domain‑specific patent large language model, highlighting challenges, practical results, and future research directions.

Domain-specific Model · Patent AI · RAG

0 likes · 19 min read

Rare Earth Juejin Tech Community

Apr 29, 2024 · Artificial Intelligence

Building Enterprise‑Grade Retrieval‑Augmented Generation (RAG) Systems: Challenges, Fault Points, and Best Practices

This comprehensive guide explores the complexities of building enterprise‑level Retrieval‑Augmented Generation (RAG) systems, detailing common failure points, architectural components such as authentication, input guards, query rewriting, document ingestion, indexing, storage, retrieval, generation, observability, caching, and multi‑tenant considerations, and provides actionable best‑practice recommendations for developers and technical leaders.

Enterprise AI · LLM · Observability

0 likes · 32 min read

Huolala Tech

Apr 25, 2024 · Artificial Intelligence

How LLM‑Powered Multi‑Agent AI Boosts Vehicle Selection in HuoLala’s Customer Service

This article details the design and implementation of an LLM‑driven multi‑agent AI customer‑service assistant for vehicle selection at HuoLala, covering system architecture, algorithmic solutions, retrieval‑augmented generation, NLU/NLG agents, performance improvements, and future outlook.

AI Customer Service · LLM · Multi-Agent System

0 likes · 12 min read

Volcano Engine Developer Services

Apr 18, 2024 · Databases

How VikingDB Powers AI Retrieval and Scalable Vector Search

VikingDB is a cloud‑native vector database that originated at ByteDance, offering high‑performance ANN search, hybrid dense‑sparse retrieval for Retrieval‑Augmented Generation, extensive scaling and filtering capabilities, and ready‑to‑use SDKs for real‑world AI applications.

AI Retrieval · ANN indexing · RAG

0 likes · 17 min read

DevOps

Apr 17, 2024 · Artificial Intelligence

Engineering Capabilities for Enterprise Large Model Applications: Prompt Engineering, RAG, and Fine‑Tuning

The article explores how enterprises can build and improve large‑model applications by combining prompt engineering, retrieval‑augmented generation (RAG), and fine‑tuning, discusses their relationships, optimization dimensions, testing challenges, and provides practical guidance for SE4AI implementation.

AI Engineering · Enterprise AI · Fine-tuning

0 likes · 20 min read

dbaplus Community

Apr 17, 2024 · Operations

Boost IT Operations with Offline LLMs: A Step‑by‑Step RAG Guide Using LangChain

This article explains how to build an offline knowledge base for IT operations by combining large language models with Retrieval‑Augmented Generation (RAG) using LangChain, covering document loading, chunking, embedding, vector storage, and query‑time retrieval with concrete code examples.

Embedding · LLM · LangChain

0 likes · 6 min read

21CTO

Apr 12, 2024 · Artificial Intelligence

How I Built an AI‑Powered Resume Chatbot with LLMs and RAG

Senior developer Jon Olson shares how he created an AI resume assistant using GPT‑4/3.5, LangChain, LlamaIndex, and retrieval‑augmented generation, detailing prompt engineering, backend integration, and future routing features to help job seekers showcase their skills.

AI chatbot · LLM · LangChain

0 likes · 8 min read

Rare Earth Juejin Tech Community

Apr 12, 2024 · Artificial Intelligence

Typical Business and Technical Architectures for Large Language Model Applications

This article reviews the common business and technical architectures used in large language model (LLM) applications, explains AI Embedded, AI Copilot, and AI Agent modes—including single‑ and multi‑agent systems—and offers guidance on selecting appropriate technology stacks such as prompt‑only, function‑calling agents, RAG, and fine‑tuning.

AI Agent · Fine-tuning · LLM

0 likes · 9 min read

Eric Tech Circle

Apr 11, 2024 · Artificial Intelligence

Build a Generative AI RAG App with Spring AI in Minutes

This guide walks you through setting up Spring AI, configuring model providers and vector stores, initializing a Spring Boot project, adding OpenAI credentials, and running a complete RAG (Retrieval‑Augmented Generation) demo with code snippets and sample API calls.

Java · OpenAI · RAG

0 likes · 15 min read

Rare Earth Juejin Tech Community

Apr 11, 2024 · Artificial Intelligence

AI Hackathon Journey: Building the "Novel Jump" Bot on Coze Platform

This article recounts the author's participation in a Shenzhen AI Hackathon, detailing the development of an interactive novel‑character chatbot using the Coze platform, describing the workflow design, technical challenges, model choices, knowledge‑base construction, and the final demo and award outcomes.

AI · Chatbot · Coze

0 likes · 12 min read

HelloTech

Apr 10, 2024 · Artificial Intelligence

An Overview of LangChain: Architecture, Core Components, and Code Examples

LangChain is an open‑source framework that provides Python and JavaScript SDKs, templates, and services such as LangServe and LangSmith to compose models, embeddings, prompts, indexes, memory, chains, and agents via a concise expression language, enabling rapid prototyping, debugging, and deployment of LLM‑driven applications.

AI Engineering · JavaScript · LLM

0 likes · 19 min read

Alibaba Cloud Developer

Apr 10, 2024 · Artificial Intelligence

Master LangChain in 10 Minutes: From Basics to Advanced AI Engineering

This guide walks AI engineers through a rapid 10‑minute bootstrap of LangChain, explaining its purpose, core concepts, design questions, and environment setup, with step‑by‑step code examples that cover APIs, chains, memory, retrieval‑augmented generation, tools, agents, and the overall architecture.

AI Engineering · LLM · LangChain

0 likes · 28 min read

Rare Earth Juejin Tech Community

Apr 8, 2024 · Artificial Intelligence

PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

The article introduces PreFLMR, an open‑source, general‑purpose pre‑trained multimodal retriever that leverages fine‑grained late‑interaction to boost retrieval‑augmented generation for knowledge‑intensive visual tasks, describes its M2KR benchmark, training stages, and strong experimental results across multiple tasks.

AI · FLMR · Knowledge Retrieval

0 likes · 11 min read

Rare Earth Juejin Tech Community

Mar 30, 2024 · Artificial Intelligence

Comprehensive Guide to Coze: AI Bot Development, Prompt Engineering, and Workflow Design

This article provides an in‑depth overview of the Coze low‑code AI bot platform, covering its core features, product comparisons, step‑by‑step bot creation, RAG implementation, plugin usage, memory mechanisms, cron jobs, agent design, advanced workflow techniques, quality management, and future prospects.

AI bot · Coze · LLM

0 likes · 25 min read

AI Large Model Application Practice

Mar 29, 2024 · Artificial Intelligence

How RAG Architecture Evolves: From Simple Chains to Flexible RAG Flows

This article examines the evolution of Retrieval‑Augmented Generation (RAG) architectures for large language models, outlines the challenges they face, introduces the modular RAG Flow concept with four workflow paradigms, and provides a step‑by‑step implementation using LangChain and LlamaIndex with code examples.

LLM · LangChain · RAG

0 likes · 15 min read

Sohu Tech Products

Mar 27, 2024 · Artificial Intelligence

Building a RAG Application with Baidu Vector Database and Qianfan Embedding

This tutorial walks through building a Retrieval‑Augmented Generation application by setting up Baidu’s Vector Database and Qianfan embedding service, configuring credentials, creating a document database and vector table, loading and chunking PDFs, generating embeddings, storing them, and performing scalar, vector and hybrid similarity searches, ready for integration with Wenxin LLM for answer generation.

AI applications · Baidu Qianfan · Embedding

0 likes · 11 min read

Sohu Tech Products

Mar 27, 2024 · Artificial Intelligence

NVIDIA NeMo Framework, TensorRT‑LLM, and RAG for Large Language Model Solutions

NVIDIA’s LLM ecosystem combines the full‑stack NeMo Framework for data curation, distributed training, and fine‑tuning with inference acceleration through TensorRT‑LLM and Triton, plus Retrieval‑Augmented Generation and Guardrails, enabling efficient, low‑latency, knowledge‑grounded model deployment across clusters.

AI acceleration · Model Training · NeMo Framework

0 likes · 16 min read

Eric Tech Circle

Mar 24, 2024 · Artificial Intelligence

Running Local LLMs: Ollama vs Hugging Face – A Hands‑On Comparison

This guide compares Ollama and Hugging Face for running large language models locally, detailing API and local execution methods, installation steps, model selection, resource requirements, integration with AnythingLLM, container deployment, embedding and vector store setup, and practical observations on performance and limitations.

AnythingLLM · Docker · Embedding

0 likes · 15 min read

NewBeeNLP

Mar 18, 2024 · Artificial Intelligence

Mastering RAG and LLM Techniques: From Retrieval to Fine‑Tuning

This article provides a comprehensive technical guide on Retrieval‑Augmented Generation (RAG), open‑source large language models such as LLaMA, fine‑tuning methods, evaluation metrics, memory‑optimization tricks, and attention‑related optimizations for modern AI systems.

LLM · LangChain · Memory Optimization

0 likes · 19 min read

DataFunTalk

Mar 15, 2024 · Artificial Intelligence

NVIDIA’s NeMo Framework and TensorRT‑LLM: Full‑Stack Solutions for Large Language Models and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end ecosystem for large language models, covering the NeMo Framework’s data processing, distributed training, model fine‑tuning, inference acceleration with TensorRT‑LLM, deployment via Triton, and Retrieval‑Augmented Generation (RAG) techniques that enhance model reliability and performance.

AI · NeMo · Nvidia

0 likes · 16 min read

Sohu Tech Products

Mar 13, 2024 · Artificial Intelligence

Build a Minimal Retrieval‑Augmented Generation (Tiny‑RAG) from Scratch

This step‑by‑step guide explains how to implement a lightweight Retrieval‑Augmented Generation system—Tiny‑RAG—by creating embedding classes, loading and chunking documents, building a simple vector store, performing similarity search, and integrating a large language model for answer generation, complete with runnable Python code.

Embedding · LLM · Python

0 likes · 14 min read

Baidu Geek Talk

Mar 13, 2024 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG) and Building a Personal Knowledge Base with ERNIE SDK and LangChain

The article explains Retrieval-Augmented Generation (RAG), its workflow, advantages, comparison with fine-tuning, and provides a step-by-step implementation using Baidu's ERNIE SDK, LangChain, and ChromaDB to build a personal knowledge base that answers queries with retrieved context.

AI · ERNIE SDK · Knowledge Base

0 likes · 13 min read

Xiaohe Frontend Team

Mar 6, 2024 · Artificial Intelligence

What the New “Generative AI Act Two” Reveals About the Next AI Wave

Sequoia Capital’s “Generative AI Act Two” report highlights a shift from hype‑driven model releases to user‑centric, end‑to‑end solutions, emphasizing the rise of foundational models as components, the importance of developer tools, emerging RAG and fine‑tuning techniques, and the evolving competitive landscape.

AI Market · Fine-tuning · Foundational models

0 likes · 6 min read

JD Retail Technology

Mar 4, 2024 · Artificial Intelligence

How JD Retail Integrates LLMs with SFT, RAG, and AI Agents for Real-World Impact

This article examines JD Retail's end‑to‑end large language model framework that combines supervised fine‑tuning, retrieval‑augmented generation, and ReAct‑based AI agents to overcome retail‑specific challenges, improve model accuracy, reduce hallucinations, and enable autonomous multi‑step business workflows.

AI Agent · LLM · RAG

0 likes · 20 min read

AI Large Model Application Practice

Mar 1, 2024 · Artificial Intelligence

Why LangGraph Is Needed: Extending LangChain with Loops and Fine‑Grained Agent Control

LangGraph, introduced in LangChain 0.1, addresses the limitations of simple Chains by adding loop capabilities and detailed control over Agent execution, enabling complex multi‑Agent, RAG, and self‑repair scenarios through a state‑graph architecture.

LCEL · LangChain · LangGraph

0 likes · 12 min read

Alibaba Cloud Big Data AI Platform

Feb 27, 2024 · Artificial Intelligence

Build a Knowledge‑Enhanced LLM Chatbot with Alibaba Cloud PAI: A Step‑by‑Step RAG Guide

This comprehensive guide walks AI developers through building a Retrieval‑Augmented Generation (RAG) chatbot on Alibaba Cloud PAI, covering architecture, vector store setup, model deployment, knowledge ingestion, multi‑modal retrieval, fusion, re‑ranking, prompt design, and end‑to‑end configuration with code examples.

Alibaba Cloud · Chatbot · LLM

0 likes · 26 min read

Rare Earth Juejin Tech Community

Feb 25, 2024 · Artificial Intelligence

Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course

This article reviews the author’s hands‑on experience with Pinecone’s serverless vector database, various embedding and generation models such as all‑MiniLM‑L6‑v2, text‑embedding‑ada‑002, clip‑ViT‑B‑32, and GPT‑3.5‑turbo‑instruct, and demonstrates how they are applied to semantic search, RAG, recommendation, hybrid, and facial similarity tasks using Python code examples.

AI · Pinecone · Python

0 likes · 9 min read

AI Large Model Application Practice

Feb 23, 2024 · Artificial Intelligence

How to Build a Text‑to‑SQL Chatbot with Vanna’s Open‑Source RAG Framework

This guide explains Vanna, an open‑source Python RAG framework for Text2SQL, covering its core concepts, RAG‑based architecture, step‑by‑step model training, code examples for customization, and how to deploy a conversational database chatbot with a Flask web UI.

Chatbot · LLM · Python

0 likes · 11 min read

Cloud Native Technology Community

Feb 8, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts LLM Accuracy and Trust

Retrieval‑augmented generation (RAG) enhances large language models by fetching up‑to‑date, authoritative information from external sources, addressing hallucinations, outdated knowledge, and lack of citations, while offering cost‑effective implementation, improved relevance, user trust, and greater developer control through vector databases, semantic search, and prompt engineering.

AI · Prompt engineering · RAG

0 likes · 10 min read

Baobao Algorithm Notes

Feb 4, 2024 · Industry Insights

Balancing Fun, Utility, and Slow Thinking: The Future of AI Agents

In this talk, the speaker examines the dual goals of AI agents—being entertaining and useful—while introducing the concepts of fast and slow thinking, multimodal perception, long‑term memory, retrieval‑augmented generation, and tool integration as essential steps toward building truly valuable digital companions.

AI agents · Future AI · Long-term Memory

0 likes · 18 min read

Rare Earth Juejin Tech Community

Jan 31, 2024 · Artificial Intelligence

Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB

This tutorial demonstrates how to build an advanced Retrieval‑Augmented Generation (RAG) system for semi‑structured PDF data by leveraging LangChain, the unstructured library, ChromaDB vector store, and OpenAI models, covering installation, PDF partitioning, element classification, summarization, and query execution.

AI · ChromaDB · LangChain

0 likes · 11 min read

Data Thinking Notes

Jan 7, 2024 · Artificial Intelligence

Boost Text2SQL Accuracy with Retrieval‑Augmented Generation and LangChain

This article explains how Retrieval‑Augmented Generation (RAG) can improve LLM‑based Text2SQL conversion, covering RAG fundamentals, LangChain implementation steps, practical enhancements for SQL agents, and future directions for integrating domain knowledge.

AI agents · LLM · LangChain

0 likes · 16 min read

DaTaobao Tech

Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

The guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them with a FastAPI OpenAI‑compatible service, routing requests through a One‑API gateway, storing metadata in MongoDB and vectors in PostgreSQL pgvector, deploying FastGPT for RAG, ingesting data, and demonstrating 5‑7 second response times, while outlining future improvements.

ChatGLM3 · Deployment · FastAPI

0 likes · 23 min read

AI Large Model Application Practice

Dec 12, 2023 · Artificial Intelligence

Boost Enterprise LLM Performance: Solving Common RAG Challenges

This article explains Retrieval‑Augmented Generation for enterprise LLMs, outlines four production‑grade problems, and presents practical solutions such as parent‑child chunking, multi‑vector and multi‑query retrieval, and context‑aware question refinement with concrete prompts and workflow diagrams.

LLM · RAG

0 likes · 13 min read

Baobao Algorithm Notes

Dec 6, 2023 · Artificial Intelligence

How to Systematically Fix Bad Cases in Large Language Models

The article outlines a structured approach to identifying, categorizing, evaluating impact, and repairing undesirable responses from large language models, covering both model‑level interventions across training stages and practical inference‑time techniques such as parameter tuning, prompt engineering, RAG, and pre/post‑processing safeguards.

Model Alignment · Prompt engineering · RAG

0 likes · 9 min read

DataFunTalk

Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases are gaining traction because they cut the costs of storage, learning, scaling, and working around large‑model limitations through semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures. The article argues that this makes them essential for modern AI applications, although it is written in a promotional context.

AI · Big Data · RAG

0 likes · 11 min read

Architect

Nov 8, 2023 · Artificial Intelligence

AI Agents Unleashed: From Assistants API to Multi‑Agent Frameworks

The article dissects the rise of AI agents—from OpenAI's Assistants API and multimodal perception‑brain‑action pipelines to retrieval‑augmented generation, tool‑use strategies, single‑ and multi‑agent deployments, and emerging frameworks like AutoGen—while highlighting concrete examples, benchmark results, and current limitations.

AI agents · Assistants API · Embodied AI

0 likes · 38 min read

AI Large Model Application Practice

Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and the LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

LLM · LangChain · PDF processing

0 likes · 12 min read

dbaplus Community

Oct 14, 2023 · Artificial Intelligence

Demystifying Retrieval‑Augmented Generation: From Theory to Working Chatbot

This guide explains the Retrieval‑Augmented Generation (RAG) technique, detailing how user queries are matched to private knowledge bases, how relevant passages are retrieved, and how large language models use those passages to generate context‑aware answers, complete with code examples and practical tips.

Chatbot · Embedding · LLM

0 likes · 19 min read

phodal

Sep 24, 2023 · Artificial Intelligence

Designing a JVM‑Based LLM Framework: Insights from Chocolate Factory

This article explores the design principles, architectural decisions, and practical code examples behind the Chocolate Factory framework, a JVM‑centric LLM development platform inspired by LangChain, LlamaIndex, Spring AI, and PromptFlow, highlighting SDK construction, RAG workflows, and prompt engineering challenges.

AI Development · Framework · JVM

0 likes · 11 min read

phodal

Sep 3, 2023 · Artificial Intelligence

Engineering LLM Applications: Architecture, Prompt Modeling, and Multi‑Language Strategies

This article shares practical insights from months of building LLM proof‑of‑concepts, covering language‑agnostic architectures, FFI integration, prompt engineering, RAG patterns, DSL design, and four core architectural principles for scalable AI applications.

AI Architecture · DSL · FFI

0 likes · 13 min read

Java High-Performance Architecture

Aug 18, 2023 · Databases

Redis 7.2 Unified Release: Boost AI, Vector Search, and Real‑Time Functions

Redis 7.2, the first Unified Redis Release, introduces AI‑ready vector indexing, hybrid semantic search, scalable RAG support, server‑side Triggers and Functions, enhanced geospatial queries, and a preview of high‑performance searchable indexes, while expanding client library support and integrating Redis Data Integration for seamless enterprise data pipelines.

AI · RAG · Serverless Functions

0 likes · 8 min read