Tagged articles
26 articles
DataFunTalk
May 15, 2026 · Artificial Intelligence

Exploring Multimodal GraphRAG: Combining Document Intelligence, Knowledge Graphs, and Large Models

This article provides a comprehensive technical overview of multimodal GraphRAG, detailing document‑intelligence parsing pipelines, layout analysis, OCR‑pipeline vs OCR‑free approaches, knowledge‑graph integration for chunk relationships, multimodal indexing, retrieval‑generation workflows, and a comparative analysis of RAG, GraphRAG, and KG‑QA solutions.

Document Intelligence · GraphRAG · Knowledge Graph
0 likes · 23 min read
DataFunTalk
May 10, 2026 · Artificial Intelligence

Exploring Multimodal GraphRAG: Combining Document Intelligence, Knowledge Graphs, and Large Models

This article presents a detailed technical walkthrough of multimodal GraphRAG, covering document‑intelligence parsing pipelines, multimodal graph index construction, knowledge‑graph‑driven chunk linking, recent research progress, performance trade‑offs, and practical recommendations for deploying RAG solutions.

Document Intelligence · GraphRAG · Knowledge Graph
0 likes · 23 min read
DataFunTalk
May 5, 2026 · Artificial Intelligence

Agent Architecture in Action: Building Next‑Gen Recommendation and Search Systems

This article reviews cutting‑edge AI search and recommendation techniques—including Alibaba Cloud's Agentic RAG, Huawei Noah's LLM‑enhanced recommendation pipeline, and Baidu's generative ranking model GRAB—detailing their architectural evolution, multimodal retrieval strategies, GPU acceleration, and measured performance gains.

AI search · Agentic RAG · GPU Acceleration
0 likes · 6 min read
Alibaba Cloud Big Data AI Platform
May 1, 2026 · Artificial Intelligence

Zero Deployment, Zero Ops: Alibaba Cloud Milvus Embedding Service Makes Vectorization Plug‑and‑Play

The article explains how Alibaba Cloud's Milvus Embedding Service eliminates the need for self‑hosted embedding models by integrating model inference, vector generation and Milvus indexing into a managed pipeline, dramatically reducing deployment complexity, operational overhead, and time‑to‑value for semantic search, RAG and multimodal retrieval use cases.

Alibaba Cloud · Embedding · Milvus
0 likes · 19 min read
DataFunTalk
Apr 24, 2026 · Artificial Intelligence

Exploring Multimodal GraphRAG: Document Intelligence, Knowledge Graphs, and Large‑Model Integration

This article presents a detailed technical walkthrough of multimodal GraphRAG, covering document‑intelligence parsing pipelines, layout‑analysis models, knowledge‑graph augmentation, multimodal indexing and retrieval, and a comparative analysis of RAG, GraphRAG, and KG‑QA approaches, with concrete examples, model sizes, benchmark scores, and research citations.

Document Intelligence · GraphRAG · Knowledge Graph
0 likes · 25 min read
AI Waka
Mar 26, 2026 · Artificial Intelligence

Building Production‑Ready AI Agents with NVIDIA Nemotron: A Full‑Stack Guide

This guide explains how to assemble NVIDIA's Nemotron Speech, RAG, and Safety models into a low‑latency, secure production AI agent stack, covering performance benchmarks, multimodal retrieval, safety data sets, integration code, and deployment options for cloud, on‑premise, and edge environments.

Content Safety · Edge Computing · Multimodal Retrieval
0 likes · 9 min read
Machine Learning Algorithms & Natural Language Processing
Mar 17, 2026 · Artificial Intelligence

DeepImageSearch Ushers in the Deep Search Era: Enabling AI to Understand Visual Histories

DeepImageSearch introduces a new paradigm that shifts image retrieval from isolated semantic matching to corpus‑level contextual reasoning, supported by the DISBench benchmark and the ImageSeeker framework, revealing that even state‑of‑the‑art multimodal models struggle with multi‑step visual‑history queries.

DISBench · DeepImageSearch · ImageSeeker
0 likes · 15 min read
DataFunSummit
Dec 19, 2025 · Artificial Intelligence

How Agentic RAG, LLM‑Powered Recommendations, and Generative Ranking Transform AI Search and Ads

This article surveys cutting‑edge AI techniques—including Alibaba Cloud's Agentic RAG for multimodal search, Huawei Noah's LLM‑enhanced recommendation evolution, and Baidu's generative ranking (GRAB) for ads—detailing their architectures, optimization tricks, performance gains, and real‑world deployment results.

AI search · Agentic RAG · GPU Acceleration
0 likes · 9 min read
PaperAgent
Dec 12, 2025 · Artificial Intelligence

How BookRAG Redefines Long-Document Retrieval with Hierarchical Indexing

BookRAG introduces a hierarchical, structure‑aware indexing method that combines tree‑based document representation with graph‑based entity linking and an agent‑driven retrieval pipeline, achieving up to 71.2% recall improvement on multimodal long‑document benchmarks while cutting token usage and latency dramatically.

Agent Retrieval · Hierarchical Indexing · LLM
0 likes · 7 min read
Tencent Advertising Technology
Nov 28, 2025 · Artificial Intelligence

How Retrv-R1 Redefines Universal Multimodal Retrieval with Reasoning‑Driven MLLM

Retrv‑R1, a reasoning‑driven multimodal large language model framework, tackles the precision‑efficiency dilemma of universal multimodal retrieval by introducing a two‑stage coarse‑to‑fine pipeline, an information‑compression module, a detail‑inspection mechanism, and a three‑stage training strategy, achieving SOTA performance across accuracy, efficiency, and generalization benchmarks.

Generalization · MLLM · Multimodal Retrieval
0 likes · 21 min read
Alibaba Cloud Native
Aug 25, 2025 · Artificial Intelligence

How 1688 AI App Redefines B2B E‑commerce with AI‑Powered Search and Multimodal Interfaces

The article examines the design shift from the traditional 1688 App to the AI‑native 1688 AI App, detailing how AI‑driven interfaces, system prompts, embedding‑based retrieval, multi‑agent routing, and AI gateways transform B2B product discovery, recommendation, and customization.

AI search · B2B e-commerce · Multimodal Retrieval
0 likes · 20 min read
Alibaba Cloud Big Data AI Platform
Jul 8, 2025 · Artificial Intelligence

How Video Retrieval‑Augmented Generation Transforms Multimodal AI Search

This article explains the end‑to‑end implementation of Video RAG in OpenSearch LLM, covering offline parsing, key‑frame extraction, audio transcription, slice creation, multimodal vectorization, hybrid indexing, and online query processing while addressing challenges like recall performance and long‑video efficiency.

ASR · Key Frame Extraction · LLM
0 likes · 10 min read
Alibaba Cloud Developer
Jun 19, 2025 · Artificial Intelligence

Build Efficient Multimodal Text‑Image Search with Alibaba Cloud Milvus

This guide explains how to use Alibaba Cloud Milvus to create a scalable, high‑performance multimodal search system that supports text‑to‑image, image‑to‑image, and cross‑modal queries across various business scenarios, detailing architecture, deployment steps, validation, and resource cleanup.

AI · Milvus · Multimodal Retrieval
0 likes · 8 min read
Big Data Technology & Architecture
Jun 9, 2025 · Databases

Why Data Warebase Could Be the Next Game‑Changer for AI Workloads

The article examines how emerging data‑infrastructure trends, multi‑modal databases like Neon, Supabase, and ClickHouse, and the convergence of OLTP, OLAP, and vector search are reshaping AI workloads, introducing the Data Warebase concept that unifies warehouse and database capabilities to meet modern AI workflow demands.

AI · HTAP · Multimodal Retrieval
0 likes · 32 min read
Meituan Technology Team
Oct 31, 2024 · Artificial Intelligence

Selected Meituan Papers from CIKM 2024: Summaries of Eight Research Works

This article highlights eight Meituan research papers accepted at CIKM 2024—spanning self‑supervised sequential recommendation, rating‑consistent explanation generation, CTR prediction via recommendation pre‑training, cross‑domain interest transfer, multimodal vector retrieval, design‑aware poster layout, order‑fulfillment cycle‑time forecasting, and delivery‑scope substitution—offering insights from both internal and university collaborations.

AI research · CTR prediction · Cross‑Domain Recommendation
0 likes · 16 min read
Huolala Tech
Aug 22, 2024 · Artificial Intelligence

How Large Language Models Automate Order Cancellation Responsibility at HuoLala

This article explains how HuoLala leverages large language models, multimodal feature integration, and retrieval‑augmented generation to automatically determine responsibility for order cancellations, improving accuracy, explainability, and driver‑user experience.

AI · Multimodal Retrieval · Order Cancellation
0 likes · 10 min read
Meituan Technology Team
Jul 4, 2024 · Artificial Intelligence

Meituan Search Advertising: Evolution of Recall Strategies and Generative Approaches

Meituan’s search advertising has progressed from rule‑based keyword mining to hierarchical recall that partitions traffic and supply, and now to generative recall using large language models, chain‑of‑thought generation, diffusion‑enhanced multimodal vectors, and knowledge distillation, expanding the decision space while tackling compute and ROI challenges.

Generative Models · Meituan · Multimodal Retrieval
0 likes · 19 min read
Rare Earth Juejin Tech Community
Apr 8, 2024 · Artificial Intelligence

PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers

The article introduces PreFLMR, an open‑source, general‑purpose pre‑trained multimodal retriever that leverages fine‑grained late‑interaction to boost retrieval‑augmented generation for knowledge‑intensive visual tasks, describes its M2KR benchmark, training stages, and strong experimental results across multiple tasks.

AI · FLMR · Knowledge Retrieval
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Jul 12, 2023 · Artificial Intelligence

How ConaCLIP Boosts Lightweight Text-Image Retrieval with Dual‑Encoder Distillation

ConaCLIP introduces a fully‑connected knowledge interaction graph to distill large dual‑encoder models into compact ones, enhancing text‑image retrieval accuracy and efficiency on edge devices, with extensive experiments and supervision strategies demonstrating significant gains over existing baselines.

AI · ConaCLIP · Dual Encoder
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Jul 11, 2023 · Artificial Intelligence

How FashionKLIP Boosts E‑Commerce Image‑Text Retrieval with a Multimodal Knowledge Graph

The ACL 2023 paper introduces FashionKLIP, an e‑commerce visual‑language model enhanced by a multimodal concept knowledge graph, detailing its automated knowledge graph construction, dual‑stream training strategy, and superior performance on FashionGen retrieval benchmarks compared to state‑of‑the‑art methods.

FashionKLIP · Knowledge Graph · Multimodal Retrieval
0 likes · 10 min read
Architect
May 18, 2021 · Big Data

Design and Optimization of Baidu's Image Processing and Ingestion Platform (Imazon) for Multimodal Retrieval

This article details Baidu's multimodal retrieval architecture, explaining the separation of online and offline services, the design of the Imazon image processing and ingestion platform, its technical indicators, large‑scale streaming and batch pipelines, optimization practices for high throughput, and the underlying content‑relationship engine.

DAG · Image Processing · Multimodal Retrieval
0 likes · 13 min read
High Availability Architecture
May 18, 2021 · Big Data

Design and Optimization of Baidu's Image Processing and Multimodal Retrieval Platform (Imazon)

This article details Baidu's large‑scale image processing and multimodal retrieval system, describing its offline‑online architecture, massive data ingestion pipeline, ANN search techniques, performance metrics, infrastructure components, and a series of optimizations for throughput, cost, and reliability in a high‑volume streaming environment.

Baidu · Image Processing · Imazon
0 likes · 12 min read
Baidu Geek Talk
May 17, 2021 · Artificial Intelligence

Design and Optimization of Baidu's Image Processing and Multimodal Retrieval Platform (Imazon)

The Imazon platform unifies Baidu’s image acquisition, feature extraction, and ANN‑based multimodal retrieval into a cloud‑native, real‑time pipeline that ingests billions of images daily, optimizes storage and GPU usage, reduces message‑queue costs, and ensures high‑throughput, low‑latency search across text, visual, and voice queries.

Cloud Native · DAG · Image Processing
0 likes · 13 min read
Meituan Technology Team
Sep 24, 2020 · Artificial Intelligence

Multimodal Recall Solution for KDD Cup 2020: ImageBERT and LXMERT Based Approach

The second‑place team tackled KDD Cup 2020’s Multimodal Recall challenge by fine‑tuning ImageBERT and LXMERT on query‑image pairs, generating negatives, applying AMSoftmax and multi‑similarity losses, ensembling weighted predictions, and using score‑based post‑processing, boosting NDCG@5 to 0.8352 and powering Meituan’s multimodal search pipeline.

ImageBERT · KDD Cup 2020 · LXMERT
0 likes · 23 min read
iQIYI Technical Product Team
Jul 12, 2019 · Artificial Intelligence

Multimodal Video Retrieval Solution for iQIYI Challenge: Feature Fusion and Model Ensemble

The ‘One Name’ team from Nanjing University achieved a MAP of 0.8986 and third place in the iQIYI multimodal video retrieval challenge by fusing official face embeddings with scene features, using channel‑attention‑based video feature fusion, a multimodal SE‑ResNeXt module, and a carefully partitioned model ensemble.

Multimodal Retrieval · feature fusion · iQIYI challenge
0 likes · 7 min read