How MMKG‑RDS Generates High‑Quality Multimodal Reasoning Data from Knowledge Graphs
The MMKG‑RDS framework introduced by 360 AI Lab creates a complete pipeline—from multimodal document parsing and knowledge‑graph construction to customizable task synthesis and multi‑dimensional quality assessment—enabling the production of high‑quality reasoning data that significantly boosts large‑model performance across diverse domains.
Background
High‑quality reasoning data is a critical bottleneck for large‑model capabilities, especially for long‑tail knowledge coverage, cross‑modal inference, and quantifiable data quality. Existing methods and knowledge‑graph‑based pipelines lack sufficient coverage, validation, and interpretability.
Key Features
Defines seven core node types (Document, Chunk, Entity, Assertion, Image, Table, Formula) and sixteen semantic relations, converting heterogeneous text, tables, and formulas into a structured multimodal knowledge graph.
Provides an end‑to‑end automated pipeline: document parsing → knowledge‑graph construction → quality filtering → compatibility with pre‑built graphs.
Supports deep multimodal representation that captures both document structure and underlying logical connections.
Offers a highly configurable synthesis mechanism with 32 preset task types and domain‑specific schema options.
Implements multi‑dimensional quality assessment using support, difficulty, and complexity metrics.
Features a modular architecture with plug‑in support for large‑model APIs, Neo4j graph database, and standard benchmark datasets.
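The seven-node-type schema above can be pictured as a small typed-record layer. The sketch below is illustrative only: the `NodeType` enum lists the node types named in the paper, but the record shape and the `make_node` helper are assumptions, not the framework's actual API.

```python
from enum import Enum

class NodeType(Enum):
    # The seven core node types defined by MMKG-RDS.
    DOCUMENT = "Document"
    CHUNK = "Chunk"
    ENTITY = "Entity"
    ASSERTION = "Assertion"
    IMAGE = "Image"
    TABLE = "Table"
    FORMULA = "Formula"

def make_node(node_type: NodeType, node_id: str, payload: dict) -> dict:
    """Wrap raw parsed content in a typed node record (hypothetical shape)."""
    return {"id": node_id, "type": node_type.value, "payload": payload}

# Example: a table extracted from a financial report becomes a Table node.
node = make_node(NodeType.TABLE, "tbl-001", {"caption": "Q3 revenue"})
```

Typing every extracted element up front is what lets later stages (relation linking, task synthesis) dispatch on modality without re-parsing the source document.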
Architecture and Implementation
The framework consists of five modules: Preprocessing, KG‑Builder, Storage, Synthesis, and Analysis. The processing pipeline proceeds as follows:
Document processing extracts images, titles, formulas, and other atomic elements from PDFs, PNGs, PPTs, and DOCs, converting them into unified JSON/CSV triples. Specialized conversions such as flow‑chart → Mermaid or icons → JSON are also supported.
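The flattening of parsed atomic elements into unified triples can be sketched in a few lines. This is a minimal illustration: the element `kind` labels and field names are assumptions, not the framework's real record format.

```python
import json

def to_triples(elements):
    """Flatten parsed atomic elements into (doc_id, kind, content) triples.
    Record fields are hypothetical, for illustration only."""
    return [(e["doc_id"], e["kind"], e["content"]) for e in elements]

# Hypothetical output of the parsing stage for a single PDF.
parsed = [
    {"doc_id": "report.pdf", "kind": "title", "content": "Q3 Outlook"},
    {"doc_id": "report.pdf", "kind": "formula", "content": "E = mc^2"},
    {"doc_id": "report.pdf", "kind": "image", "content": "fig1.png"},
]
triples = to_triples(parsed)
print(json.dumps(triples))  # JSON-serializable, ready to dump as JSON/CSV
```

The point of the triple form is uniformity: titles, formulas, and image references all flow through the same downstream graph-building code.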
KG construction links the seven node types with sixteen relation types, forming a heterogeneous graph that captures deep logical connections.
Task synthesis supports 32 configurable task types—including table QA, text QA, formula QA, image QA, entity QA, and cross‑modal QA—with both single‑hop and multi‑hop variants.
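A task-synthesis call might look roughly like the following. The config fields, task-type strings, and `synthesize` signature are all assumptions made for illustration; the real framework's interface may differ.

```python
from dataclasses import dataclass

@dataclass
class TaskConfig:
    """Illustrative synthesis config; field names are assumptions."""
    task_type: str      # e.g. "table_qa", "formula_qa", "cross_modal_qa"
    hops: int           # 1 for single-hop, >1 for multi-hop reasoning
    domain: str = "general"

def synthesize(config: TaskConfig, path: list) -> dict:
    """Sketch: turn one sampled graph path into one QA sample.
    A real implementation would prompt an LLM with the path's content."""
    question = f"[{config.task_type}/{config.hops}-hop] question about {path[0]}"
    answer = path[-1]
    return {"question": question, "answer": answer, "domain": config.domain}

# A 2-hop table-QA sample for the finance domain (hypothetical path).
sample = synthesize(TaskConfig("table_qa", hops=2, domain="finance"),
                    ["revenue table", "2023 total"])
```

Separating the task configuration from the sampled graph path is what makes the same synthesis machinery reusable across the 32 task types and across domains.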
Generated data can be stored in Neo4j, NetworkX, or other common formats and customized for domains such as finance, law, or scientific research.
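Loading the resulting graph into Neo4j amounts to emitting Cypher statements for nodes and relations. The sketch below shows that idea with assumed node/edge record shapes; it builds plain Cypher strings rather than calling a Neo4j driver.

```python
def to_cypher(nodes, edges):
    """Emit Cypher CREATE statements for loading a graph into Neo4j.
    Node and edge record shapes here are illustrative assumptions."""
    stmts = []
    for n in nodes:
        stmts.append(f"CREATE (:{n['type']} {{id: '{n['id']}'}})")
    for src, rel, dst in edges:
        stmts.append(
            f"MATCH (a {{id: '{src}'}}), (b {{id: '{dst}'}}) "
            f"CREATE (a)-[:{rel}]->(b)"
        )
    return stmts

# Two nodes joined by one (hypothetical) relation type.
stmts = to_cypher(
    [{"type": "Entity", "id": "e1"}, {"type": "Table", "id": "t1"}],
    [("e1", "MENTIONED_IN", "t1")],
)
```

Emitting Cypher as text keeps the storage layer pluggable: the same node/edge records could instead be fed to NetworkX or serialized to flat files.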
Evaluation
The MMKG‑RDS‑Bench dataset covers five domains (history, organic chemistry, law, stock research reports, and academic papers) and 17 task types, comprising 14,950 high‑quality samples. Fine‑tuning Qwen‑3 models (0.6B, 8B, 32B) on a modest number of synthetic samples improves inference accuracy by up to 9.2%, demonstrating substantial gains even for lightweight models.
Cross‑modal table and formula reasoning data generated by the framework directly target existing model weaknesses, offering clear directions for next‑generation model optimization.
Application Scenarios
The framework is applicable to large‑model research and optimization, vertical‑domain intelligence, education and assessment, benchmark dataset construction, and enterprise knowledge management. It is especially valuable when high‑quality multimodal reasoning data, cross‑modal knowledge utilization, or custom task development are required.
Usage and Availability
Code and dataset are open‑source at https://github.com/360AILAB-NLP/MMKG-RDS. The technical report is available on arXiv at https://arxiv.org/pdf/2602.23632 ("MMKG‑RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs").
360 Tech Engineering