Quantifying Robot Data Value: ATHENA Scales Influence Functions to Billion‑Parameter VLA with 313× Speedup

ATHENA introduces a data‑curation framework for billion‑parameter multi‑task Vision‑Language‑Action models that extends influence functions via Kronecker gradient compression and a multitask influence interaction scheme, achieving a 313× reduction in compute (from 8054.6 to 25.7 GPU‑hours) and improving task success rates while using fewer, higher‑value demonstrations.

Machine Heart

Problem

Robot Vision‑Language‑Action (VLA) models can learn from large demonstration datasets, but low‑quality demonstrations increase collection, runtime, storage and training costs and may degrade downstream task performance.

Limitations of existing data valuation

Heuristic metrics such as trajectory length or motion smoothness measure superficial quality and lack causal linkage to closed‑loop task success. Influence functions can estimate the effect of adding or removing a training sample on downstream metrics, but prior work targets small, single‑task models and cannot scale to billion‑parameter multi‑task VLA models.

ATHENA framework

ATHENA extends influence functions to multi‑task VLA models with up to 1 B parameters. It introduces two technical components:

Kronecker gradient compression and random‑truncation Hessian inverse approximation. Linear‑layer weight gradients have an outer‑product (Kronecker) structure. ATHENA projects input activations and backward errors separately, then combines them into low‑dimensional features, avoiding explicit construction of full per‑sample gradients. Random truncation retains dominant low‑rank subspaces of the Hessian, reducing memory and compute for the inverse.
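The factored projection described above can be sketched for a single linear layer. The example assumes Gaussian random projections and hypothetical layer and projection sizes; it only checks that projecting activations and backward errors separately gives the same compressed feature as projecting the full gradient, and does not reproduce ATHENA's actual projection choice or its Hessian approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d_out = 8, 64, 48       # timesteps in one demonstration, layer sizes (hypothetical)
k_in, k_out = 4, 3               # projection sizes (hypothetical)

X = rng.normal(size=(T, d_in))   # inputs to the linear layer, one row per timestep
D = rng.normal(size=(T, d_out))  # backward errors dL/d(output), one row per timestep

# Independent random projections for the two factors.
P_in = rng.normal(size=(k_in, d_in)) / np.sqrt(k_in)
P_out = rng.normal(size=(k_out, d_out)) / np.sqrt(k_out)

# Naive route: build the full (d_out, d_in) weight gradient, then project both sides.
full_grad = D.T @ X                                   # sum over t of outer(delta_t, x_t)
naive = (P_out @ full_grad @ P_in.T).ravel()

# Factored route: project each factor first; the full gradient is never formed.
compressed = ((D @ P_out.T).T @ (X @ P_in.T)).ravel()

# Same result written as one Kronecker-structured projection of the flattened gradient:
# row-major vec(P_out G P_in^T) = (P_out kron P_in) vec(G).
kron_form = np.kron(P_out, P_in) @ full_grad.ravel()

print(compressed.shape)                 # (12,), i.e. k_out * k_in features
print(np.allclose(naive, compressed))   # True
print(np.allclose(naive, kron_form))    # True
```

In this sketch the factored route stores T x (k_in + k_out) projected values per demonstration instead of a d_out x d_in gradient, which is where the memory saving comes from.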

Multitask Influence Interaction (MII). For each demonstration ATHENA computes a local‑task influence (effect on its own task) and a cross‑task influence (effect on other tasks). The two scores are aggregated to produce a task‑balanced influence ranking, preventing tasks with large gradient magnitudes from dominating the selection.
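The summary does not give MII's aggregation rule, so the following is only one plausible reading: mix local and cross-task influence with a hypothetical weight `alpha`, then standardize the result within each task before ranking. The scores, task names and `alpha` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical scores: (demo id, task, local influence, cross-task influence).
# The "pick" task has much larger raw magnitudes than "wipe".
demos = [
    ("d0", "pick", 120.0, 30.0),
    ("d1", "pick", 80.0, -10.0),
    ("d2", "pick", -40.0, 20.0),
    ("d3", "wipe", 0.9, 0.5),
    ("d4", "wipe", 0.2, 0.0),
    ("d5", "wipe", -0.5, -0.3),
]

def raw_scores(demos, alpha=0.5):
    """Weighted mix of local and cross-task influence, with no balancing."""
    return {d: alpha * local + (1 - alpha) * cross for d, _, local, cross in demos}

def balanced_scores(demos, alpha=0.5):
    """Standardize the mixed score within each task (assumed balancing rule)."""
    raw = raw_scores(demos, alpha)
    by_task = defaultdict(list)
    for demo_id, task, _, _ in demos:
        by_task[task].append(demo_id)
    balanced = {}
    for ids in by_task.values():
        values = [raw[i] for i in ids]
        mu, sigma = mean(values), pstdev(values)
        for i in ids:
            balanced[i] = (raw[i] - mu) / sigma if sigma > 0 else 0.0
    return balanced

def top_k(scores, k):
    return sorted(scores, key=scores.get, reverse=True)[:k]

task_of = {d: task for d, task, _, _ in demos}
raw_top = top_k(raw_scores(demos), 2)
balanced_top = top_k(balanced_scores(demos), 2)

print(raw_top, [task_of[d] for d in raw_top])            # both picks come from "pick"
print(balanced_top, [task_of[d] for d in balanced_top])  # one pick from each task
```

Without balancing, the task with the larger score scale fills the top of the ranking. After per-task standardization, the best demonstration of each task competes on equal footing.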

Computation efficiency

On a benchmark of 50 tasks and ≈560.5 K timesteps, the unaccelerated influence‑function pipeline requires ≈8054.6 GPU‑hours. ATHENA reduces the total to 25.7 GPU‑hours, a 313.4× speed‑up, making influence‑based data curation feasible for billion‑parameter VLA models.
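The reported figures can be checked with a few lines of arithmetic. The per-timestep costs below are derived values, not numbers stated in the source, and they assume the approximate count of 560.5 K timesteps.

```python
baseline_gpu_hours = 8054.6
athena_gpu_hours = 25.7
timesteps = 560_500          # approximately 560.5 K, as reported

speedup = baseline_gpu_hours / athena_gpu_hours
baseline_gpu_days = baseline_gpu_hours / 24
athena_gpu_days = athena_gpu_hours / 24

# Derived cost per timestep in GPU-seconds (not stated in the source).
baseline_sec_per_step = baseline_gpu_hours * 3600 / timesteps
athena_sec_per_step = athena_gpu_hours * 3600 / timesteps

print(f"speed-up: {speedup:.1f}x")                        # speed-up: 313.4x
print(f"baseline: {baseline_gpu_days:.1f} GPU-days")      # baseline: 335.6 GPU-days
print(f"ATHENA:   {athena_gpu_days:.1f} GPU-days")        # ATHENA:   1.1 GPU-days
print(f"per timestep: {baseline_sec_per_step:.1f} s vs {athena_sec_per_step:.3f} s")
```

Put differently, the unaccelerated pipeline would take close to a year on a single GPU, while the accelerated one fits in roughly a day.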

Experimental evaluation

Simulation (RoboTwin 2.0). The JAX pi‑series VLA was fine‑tuned on 2500 clean demonstrations (9.34 h total) across 50 tasks and evaluated in clean and randomized environments.

With 90 % of demonstrations retained, ATHENA achieves average success rates of 44.70 % (clean) and 17.72 % (randomized), surpassing full‑data fine‑tuning (43.42 % / 15.44 %).

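With 50 % of demonstrations, ATHENA matches full‑data performance in the clean setting and exceeds it in the randomized setting. The reported 30.33 % versus 29.43 % appears to be an average over both settings, since 29.43 % is the mean of the full‑data rates 43.42 % and 15.44 %.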
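To make the simulation numbers concrete, the snippet below converts the retention fractions into demonstration counts and the 90 % success rates into percentage-point gains. It assumes the fractions apply to the pooled 2500 demonstrations; how the retained set is split across tasks is not stated in the summary.

```python
total_demos = 2500
total_hours = 9.34

full_data = {"clean": 43.42, "randomized": 15.44}    # success rates in percent
athena_90 = {"clean": 44.70, "randomized": 17.72}

kept_90 = round(0.90 * total_demos)
kept_50 = round(0.50 * total_demos)
gain_points = {k: round(athena_90[k] - full_data[k], 2) for k in full_data}
avg_demo_seconds = total_hours * 3600 / total_demos

print(kept_90, kept_50)                  # 2250 1250
print(gain_points)                       # {'clean': 1.28, 'randomized': 2.28}
print(f"{avg_demo_seconds:.1f} s per demonstration on average")
```

Dropping 250 demonstrations therefore coincides with gains of 1.28 and 2.28 percentage points, with the larger gain in the randomized environments.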

Real‑robot (ALOHA platform). A total of 720 high‑quality demonstrations (≈6.9 h) were collected across six tasks (Pick Fruits, Wipe Board, Stack Bowls, Box Return, Seal Stamping, Shelf Retrieval). Each task was tested with 25 random object placements.

Single‑task full‑data training (Single‑100 %) yields 46.7 % average success.

Joint full‑data training (Joint‑100 %) yields 60.0 %.

ATHENA using 66.7 % of the data reaches 68.0 % average success, outperforming Single‑100 %, Joint‑100 %, random 66.7 % sampling and a human‑prior Oracle baseline.
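The real-robot percentages can be translated into trial counts. This sketch assumes that each average is pooled over all 6 × 25 = 150 trials and that 66.7 % means two thirds of the 720 demonstrations; neither is stated explicitly in the summary.

```python
tasks = 6
trials_per_task = 25
total_trials = tasks * trials_per_task

# Reported average success rates in percent.
rates = {"Single-100%": 46.7, "Joint-100%": 60.0, "ATHENA-66.7%": 68.0}

# Implied number of successful trials under the pooling assumption.
successes = {name: round(rate / 100 * total_trials) for name, rate in rates.items()}
demos_kept = round(720 * 2 / 3)
gain_over_joint = round(rates["ATHENA-66.7%"] - rates["Joint-100%"], 1)
gain_over_single = round(rates["ATHENA-66.7%"] - rates["Single-100%"], 1)

print(total_trials)      # 150
print(successes)         # {'Single-100%': 70, 'Joint-100%': 90, 'ATHENA-66.7%': 102}
print(demos_kept)        # 480
print(gain_over_joint, gain_over_single)   # 8.0 21.3
```

Under these assumptions ATHENA succeeds in about 12 more trials than joint full-data training while using 240 fewer demonstrations.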

Implications

ATHENA shows that causal, scalable influence‑function‑based valuation can select fewer but more valuable demonstrations, improving both simulated and real‑world robot performance. As robot data pools grow, such methods offer a principled alternative to heuristic filters and to quality judgments based on human perception.

Reference

Paper: “ATHENA: Accelerated Multi‑Task Heterogeneous Influence Functions for Robot Data Curation”, arXiv:2606.16208. Project page: https://sii-quantum.github.io/ATHENA.github.io/

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
