Evaluating Research Ideas with InnoEval and SciAtlas: Leveraging 43M Papers and 3B Triples

As large language models accelerate idea generation and the volume of scientific papers soars, InnoEval formalizes multi‑perspective, knowledge‑grounded evaluation of research ideas, while SciAtlas provides a massive cross‑disciplinary knowledge graph that powers evidence‑rich assessments and agent‑driven workflows.

Machine Learning Algorithms & Natural Language Processing

Jun 28, 2026

Why Evaluate Research Ideas?

As large language models become embedded in the research pipeline, the barrier to generating ideas drops sharply, and both scientific output and the number of papers grow rapidly. This creates a core challenge: how to reliably identify the ideas that have real scientific value, can be verified, and will have long‑term impact.

InnoEval: A Knowledge‑Grounded, Multi‑Perspective Evaluation System

InnoEval (ICML 2026) formalizes research‑idea evaluation as a systematic decision problem that relies on external knowledge, multi‑perspective reasoning, and evidence alignment. It decomposes evaluation into three key stages:

Heterogeneous Deep Knowledge Search: The system first parses the raw idea, generates multi‑turn queries, and performs fast search plus slow reading to retrieve high‑quality, relevant evidence from papers, online literature, web pages, and code repositories, producing a background‑knowledge report.

Innovation Review Committee: Multiple reviewer personas with different academic backgrounds, knowledge familiarity, and bias profiles independently assess the idea, mitigating the bias of a single model.

Multi‑Dimensional Decoupled Evaluation: Each idea is scored on Clarity, Novelty, Feasibility, Validity, and Significance (with support for custom dimensions). Dedicated evaluator agents analyze each dimension and produce a comprehensive report containing evidence, sub‑scores, a meta‑review, and improvement suggestions.
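To make the last two stages concrete, here is a minimal sketch of how independent persona reviews could be combined per dimension. It is an illustration only: the Review class, the persona labels, the 1 to 10 scale, and the plain averaging are all assumptions, not InnoEval's actual aggregation method.

```python
from dataclasses import dataclass
from statistics import mean

# The five default dimensions named in the article.
DIMENSIONS = ("Clarity", "Novelty", "Feasibility", "Validity", "Significance")


@dataclass
class Review:
    persona: str              # hypothetical persona label
    scores: dict[str, float]  # one score per dimension (assumed 1-10 scale)


def aggregate(reviews: list[Review]) -> dict[str, float]:
    """Average each dimension across personas, then add an unweighted overall mean."""
    report = {d: mean(r.scores[d] for r in reviews) for d in DIMENSIONS}
    report["Overall"] = mean(report[d] for d in DIMENSIONS)
    return report


# Made-up scores from two made-up personas.
reviews = [
    Review("domain expert",
           {"Clarity": 7, "Novelty": 5, "Feasibility": 8, "Validity": 7, "Significance": 6}),
    Review("adjacent-field skeptic",
           {"Clarity": 6, "Novelty": 7, "Feasibility": 6, "Validity": 5, "Significance": 6}),
]
report = aggregate(reviews)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Keeping the dimensions separate until the final step means a low Novelty score from one persona stays visible in the report instead of being hidden inside a single overall number.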

Experimental Results

Using real submissions from NeurIPS 2025 and ICLR 2025, three tasks were constructed: single‑idea evaluation, pairwise idea comparison, and group‑wise idea ranking. InnoEval outperformed baselines such as CoT, RAG, ResearchAgent, InternAgent, GraphEval, and ScholarEval. On the single‑idea (point‑wise) task it achieved a 16.18% F1 improvement over the strongest baseline, and it also showed higher accuracy on the pairwise and group‑wise tasks. Moreover, its evaluation reports were closer to human expert reviews in terms of reasonableness, evidence support, depth, constructiveness, and overall quality.
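For readers less familiar with the metrics, the sketch below computes a binary F1 score for a point‑wise task and plain accuracy for a pairwise task. The labels are toy data, and the article does not say which F1 variant (binary, macro, or other) the benchmark uses, so treat the binary form here as an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pairwise_accuracy(true_winners, predicted_winners):
    """Fraction of idea pairs where the predicted better idea matches the label."""
    correct = sum(t == p for t, p in zip(true_winners, predicted_winners))
    return correct / len(true_winners)


# Toy point-wise labels: 1 = accepted, 0 = rejected (made-up data).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
point_f1 = f1_score(y_true, y_pred)

# Toy pairwise labels: which idea of each pair is better (made-up data).
pair_acc = pairwise_accuracy(["A", "B", "A", "A"], ["A", "B", "B", "A"])

print(f"F1 = {point_f1:.3f}, pairwise accuracy = {pair_acc:.2f}")
```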

SciAtlas: A Large‑Scale Cross‑Disciplinary Scientific Knowledge Graph

InnoEval’s success highlighted the importance of high‑quality knowledge grounding. SciAtlas addresses this by integrating 26 disciplines, over 43 million papers, 1.57 billion entities, and 3 billion triples, covering papers, authors, institutions, keywords, venues, fields, sub‑fields, topics, and relationships such as citations, co‑authorship, keyword co‑occurrence, and hierarchical topics.

Unlike traditional keyword or vector similarity search, SciAtlas explicitly models scientific knowledge as a computable, structured network, enabling topological reasoning and evidence tracing across the graph.
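As a rough illustration of what evidence tracing over a structured network means, the sketch below stores a handful of triples and finds the shortest chain of relations linking two entities. The entity names, relation names, and breadth‑first search are illustrative assumptions and do not reflect SciAtlas's real schema or algorithms.

```python
from collections import deque

# Toy triples (head, relation, tail); every name here is made up.
TRIPLES = [
    ("paper:A", "cites", "paper:B"),
    ("paper:B", "has_keyword", "kw:idea evaluation"),
    ("paper:C", "has_keyword", "kw:idea evaluation"),
    ("paper:C", "written_by", "author:X"),
]


def build_adjacency(triples):
    """Index triples in both directions so paths can follow edges either way."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((rel, tail))
        adj.setdefault(tail, []).append((f"inverse:{rel}", head))
    return adj


def trace_path(triples, start, goal):
    """Breadth-first search returning [entity, relation, entity, ...] or None."""
    adj = build_adjacency(triples)
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [rel, nxt]))
    return None


path = trace_path(TRIPLES, "paper:A", "author:X")
print(" -> ".join(path))
```

A flat similarity search could return paper:A and paper:C as two unrelated hits; the path makes explicit that they connect through a cited paper and a shared keyword.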

From Idea to Evidence in SciAtlas

A research idea is no longer matched only to isolated documents; it can be located via semantic similarity, title anchors, citation links, keyword co‑occurrence, author collaboration networks, and domain hierarchies, creating a stable, explainable evidence space and positioning map.

SciAtlas employs neuro‑symbolic retrieval: keyword and semantic vector recall are combined with graph propagation and re‑ranking to uncover deeper knowledge connections. The system returns not just a paper list but also score breakdowns and path explanations, supporting literature review, idea grounding, trend analysis, and innovation assessment.
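The recall, propagation, and re‑ranking steps can be sketched on toy data as follows. The scoring formulas, the alpha and damping weights, and the paper records are assumptions chosen for readability, not the scoring SciAtlas actually uses.

```python
def keyword_score(query, title):
    """Share of query tokens that also appear in the title."""
    q, t = set(query.lower().split()), set(title.lower().split())
    return len(q & t) / len(q) if q else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0


def retrieve(query, query_vec, papers, links, alpha=0.5, damping=0.3):
    # Stage 1: recall score mixing keyword overlap and vector similarity.
    base = {
        pid: alpha * keyword_score(query, p["title"])
        + (1 - alpha) * cosine(query_vec, p["vec"])
        for pid, p in papers.items()
    }
    # Stage 2: one propagation step; a paper gains a damped share of the
    # mean base score of its graph neighbours.
    results = []
    for pid in papers:
        nbrs = links.get(pid, [])
        boost = damping * sum(base[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        results.append({"id": pid, "base": base[pid],
                        "graph_boost": boost, "total": base[pid] + boost})
    # Stage 3: re-rank by total score, keeping the breakdown for explanation.
    return sorted(results, key=lambda r: r["total"], reverse=True)


# Made-up papers with two-dimensional toy embeddings.
papers = {
    "p1": {"title": "Open world agent benchmarks", "vec": [1.0, 0.0]},
    "p2": {"title": "Graph neural networks", "vec": [0.0, 1.0]},
    "p3": {"title": "Planning for embodied agents", "vec": [0.6, 0.8]},
}
links = {"p1": ["p3"], "p3": ["p1"]}  # toy citation neighbourhood

ranked = retrieve("open world agent", [1.0, 0.0], papers, links)
for r in ranked:
    print(r["id"], f"base={r['base']:.2f}", f"boost={r['graph_boost']:.2f}",
          f"total={r['total']:.2f}")
```

In this toy run p3 shares no query keyword but still moves ahead of p2, because it is linked to the strongest hit. The per‑result breakdown is a simple stand‑in for the score breakdowns the article mentions.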

Agent Skills Built on SciAtlas

SciAtlas functionality is exposed as a set of Agent Skills, including:

sciatlas-quick-paper-search: fast evidence seed retrieval.
sciatlas-literature-review: generate reading lists and related‑work drafts.
sciatlas-idea-grounding: check an idea’s relationship to existing work.
sciatlas-idea-evaluate: assess novelty, feasibility, soundness, etc.
sciatlas-idea-generate: propose new research directions based on evidence.
sciatlas-trend-report: summarize development trajectories of a field.
sciatlas-researcher-review: analyze a researcher’s trajectory and representative contributions.

All skills share the underlying search-papers retrieval pipeline, producing request.json, response.json, summary.txt, and report.md artifacts that make the process traceable and reproducible.

Getting Started

Install the CLI from the GitHub repository:

pip install "git+https://github.com/zjunlp/SciAtlas.git#subdirectory=sciatlas"

Configure the API base URL, token, and timeout:

export SCIATLAS_API_BASE_URL="http://scinet.openkg.cn"
export SCIATLAS_API_KEY="your-personal-sciatlas-token"
export SCIATLAS_TIMEOUT=900
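If you drive the CLI from a script, a small preflight check can catch a missing or placeholder token before a long request starts. The helper below is a hypothetical convenience, not part of SciAtlas; it only reads the three variables shown above, and the fallback timeout of 900 simply mirrors the example value.

```python
import os


def load_sciatlas_config(env=os.environ):
    """Collect the three documented variables; fail early if the token is unset."""
    api_key = env.get("SCIATLAS_API_KEY", "")
    if not api_key or api_key == "your-personal-sciatlas-token":
        raise RuntimeError("Set SCIATLAS_API_KEY to a real token before running the CLI.")
    return {
        "base_url": env.get("SCIATLAS_API_BASE_URL", "").rstrip("/"),
        "api_key": api_key,
        # Fallback mirrors the example value above; it is an assumption.
        "timeout": int(env.get("SCIATLAS_TIMEOUT", "900")),
    }


# Demo with an explicit mapping instead of the real environment.
demo_env = {
    "SCIATLAS_API_BASE_URL": "http://scinet.openkg.cn/",
    "SCIATLAS_API_KEY": "demo-token",
    "SCIATLAS_TIMEOUT": "900",
}
config = load_sciatlas_config(demo_env)
print(config["base_url"], config["timeout"])
```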

Example commands:

sciatlas search-papers \
  --query "open world agent" \
  --keyword "high:open world agent" \
  --top-k 10

sciatlas idea-evaluate \
  --idea "LLM-based multi-perspective evaluation for scientific research ideas" \
  --domain "artificial intelligence" \
  --time-range 2020-2025 \
  --keyword "high:idea evaluation" \
  --keyword "middle:LLM as a judge" \
  --top-k 10
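When the same evaluation needs to run for many ideas, it may be easier to assemble the command in Python. The sketch below builds the argument list for the idea-evaluate command using only the flags shown above; the function name and its parameters are hypothetical, and the final subprocess call is left commented out because it requires an installed CLI and a valid token.

```python
import shlex


def build_idea_evaluate_cmd(idea, domain, time_range, keywords, top_k=10):
    """Return the argv list for `sciatlas idea-evaluate`.

    `keywords` is a list of (weight, term) pairs, rendered as "weight:term"
    in the same form as the example commands above.
    """
    cmd = ["sciatlas", "idea-evaluate",
           "--idea", idea,
           "--domain", domain,
           "--time-range", time_range]
    for weight, term in keywords:
        cmd += ["--keyword", f"{weight}:{term}"]
    cmd += ["--top-k", str(top_k)]
    return cmd


cmd = build_idea_evaluate_cmd(
    idea="LLM-based multi-perspective evaluation for scientific research ideas",
    domain="artificial intelligence",
    time_range="2020-2025",
    keywords=[("high", "idea evaluation"), ("middle", "LLM as a judge")],
)
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# To actually run it:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when an idea description contains quotes or other special characters.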

To use a skill within an Agent, copy the agent-skill/ directory from the repository into the Agent’s skill folder (e.g., ~/.codex/skills) and restart the Agent.

Future Directions

InnoEval and SciAtlas together aim to enable AI research systems not only to generate ideas but also to understand, locate, critique, and improve them. Current limitations include a focus on paper‑level knowledge; deeper integration with scientific datasets, experiments, and vertical resources is still early. Future work will link SciAtlas with resources such as SciGraph to build a full‑stack knowledge infrastructure for scientific agents.
