ICLR 2026: Nvidia & Oxford Introduce Atom‑Level Protein Binder Generator with SOTA Performance

A joint team from Nvidia, Oxford University and the Quebec AI Institute presents Complexa, an atom‑level protein binder generation framework that unifies generative and refinement steps, achieves state‑of‑the‑art in‑silico success rates, and scales efficiently with test‑time compute.

HyperAI Super Neural

Dataset: From Monomer Enrichment to Complex Reconstruction

The study identifies a structural data gap: public protein‑protein complex entries are scarce, while the AlphaFold Database (AFDB) contains abundant monomer structures. By treating multi‑domain proteins as pseudo‑multimers and extracting inter‑domain contacts, the authors construct a new dataset called Teddymer with roughly 3.5 million dimer clusters, converting monomer‑rich data into usable complex‑like examples.
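As a rough illustration of the pseudo‑multimer idea, the sketch below takes one chain with per‑residue domain labels and collects residue pairs from different domains that sit close together in space. The function names, the 8 Å C‑alpha cutoff, and the minimum contact count are assumptions chosen for the example; the article does not state the thresholds or tooling used to build Teddymer.

```python
from itertools import combinations
from math import dist

def interdomain_contacts(ca_coords, domain_ids, cutoff=8.0):
    """Return residue index pairs (i, j) that belong to different domains
    and whose C-alpha atoms lie within `cutoff` angstroms (assumed threshold)."""
    contacts = []
    for i, j in combinations(range(len(ca_coords)), 2):
        if domain_ids[i] != domain_ids[j] and dist(ca_coords[i], ca_coords[j]) <= cutoff:
            contacts.append((i, j))
    return contacts

def pseudo_dimers(ca_coords, domain_ids, cutoff=8.0, min_contacts=2):
    """Group contacts by domain pair and keep pairs with enough contacts
    to be treated as a dimer-like example (min_contacts is an assumption)."""
    per_pair = {}
    for i, j in interdomain_contacts(ca_coords, domain_ids, cutoff):
        key = tuple(sorted((domain_ids[i], domain_ids[j])))
        per_pair.setdefault(key, []).append((i, j))
    return {pair: c for pair, c in per_pair.items() if len(c) >= min_contacts}

# Toy chain: domains A and B pack against each other, domain C is far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0), (50.0, 0.0, 0.0)]
domains = ["A", "A", "B", "B", "C"]
dimers = pseudo_dimers(coords, domains)
```

In this toy case only the A/B pair survives the filter, which mirrors how a multi‑domain monomer can yield a complex‑like training example.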

Complexa: An Atom‑Level Framework for Protein Binder Generation

Complexa builds on the La‑Proteína foundation and introduces a transformer‑based architecture that generates only the binder, conditioned on the target and its interface hotspots. Protein targets are encoded per residue with atomic coordinates in the Atom37 layout, amino‑acid type, and hotspot flags; small‑molecule targets are encoded per atom with type, charge, and coordinates. This shared atom‑level representation enables joint modeling of binder and target.
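To make the target encoding concrete, here is a minimal sketch of an Atom37‑style residue record: 37 fixed atom slots per residue, a mask marking which slots are filled, the residue type, and a hotspot flag. The slot ordering follows the convention popularized by AlphaFold; whether Complexa uses exactly this ordering, and the `ResidueFeatures` and `encode_residue` names, are assumptions made for illustration.

```python
from dataclasses import dataclass

# 37 heavy-atom slots shared by all amino acids (AlphaFold-style ordering, assumed here).
ATOM37 = [
    "N", "CA", "C", "CB", "O", "CG", "CG1", "CG2", "OG", "OG1", "SG", "CD",
    "CD1", "CD2", "ND1", "ND2", "OD1", "OD2", "SD", "CE", "CE1", "CE2", "CE3",
    "NE", "NE1", "NE2", "OE1", "OE2", "CH2", "NH1", "NH2", "OH", "CZ", "CZ2",
    "CZ3", "NZ", "OXT",
]
SLOT = {name: i for i, name in enumerate(ATOM37)}

@dataclass
class ResidueFeatures:
    aa_type: str        # three-letter residue name
    coords: list        # 37 (x, y, z) tuples; unused slots stay at the origin
    mask: list          # 37 flags, 1 where the atom is present
    is_hotspot: bool    # marks residues the binder should contact

def encode_residue(aa_type, atoms, is_hotspot=False):
    """Place named atoms into their fixed Atom37 slots."""
    coords = [(0.0, 0.0, 0.0)] * len(ATOM37)
    mask = [0] * len(ATOM37)
    for name, xyz in atoms.items():
        idx = SLOT[name]
        coords[idx] = tuple(xyz)
        mask[idx] = 1
    return ResidueFeatures(aa_type, coords, mask, is_hotspot)

# Glycine has only backbone heavy atoms, so most slots remain masked out.
gly = encode_residue(
    "GLY",
    {"N": (0.0, 0.0, 0.0), "CA": (1.46, 0.0, 0.0), "C": (2.0, 1.4, 0.0), "O": (1.3, 2.4, 0.0)},
    is_hotspot=True,
)
```

A fixed slot layout keeps every residue the same shape regardless of side‑chain size, which is what makes it convenient as model input.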

During training, random global translation noise is added to the binder coordinates, forcing the network to learn where the binder sits relative to the target. Training proceeds in stages: monomer modeling → generic structure generation → binder‑specific fine‑tuning, with LoRA adapters used to limit overfitting while reusing the monomer encoder.
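The translation augmentation can be sketched as a single random shift applied to every binder atom, leaving the binder's internal geometry untouched while displacing it relative to the target. The Gaussian distribution, the `sigma` value, and the function name are assumptions; the article only says that random global translation noise is added.

```python
import random

def add_global_translation(binder_coords, sigma=1.0, rng=random):
    """Shift all binder atoms by one shared random vector (rigid translation).

    sigma is the per-axis standard deviation in angstroms (assumed value).
    Returns the shifted coordinates and the shift that was applied.
    """
    shift = tuple(rng.gauss(0.0, sigma) for _ in range(3))
    moved = [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in binder_coords]
    return moved, shift

# Target coordinates are left as they are; only the binder is displaced.
binder = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
moved, shift = add_global_translation(binder, sigma=2.0, rng=random.Random(0))
```

Because the same vector is added to every atom, distances within the binder are preserved and only its placement changes, so the model has to recover the correct position from the target context.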

At inference, Complexa scales test‑time compute by drawing more samples and, optionally, by running beam search or Monte Carlo tree search. Generation quality therefore improves as more compute is allocated.
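Two of these strategies can be sketched generically: best‑of‑N sampling and a simple beam search over partial generations. Everything here is a toy stand‑in: `score` plays the role of whatever in‑silico reward ranks candidates, and the samplers replace the actual generative model; none of these names come from the paper. Monte Carlo tree search is not shown.

```python
import random

def best_of_n(sample_fn, score_fn, n):
    """Draw n independent candidates and return the highest-scoring one."""
    return max((sample_fn() for _ in range(n)), key=score_fn)

def beam_search(init_fn, step_fn, score_fn, width, branch, steps):
    """Keep the `width` best states; expand each into `branch` children per step."""
    beam = [init_fn() for _ in range(width)]
    for _ in range(steps):
        children = [step_fn(state) for state in beam for _ in range(branch)]
        beam = sorted(children, key=score_fn, reverse=True)[:width]
    return beam  # sorted best-first after at least one step

# Toy reward (hypothetical): candidates are numbers, the ideal design is 3.0.
def score(x):
    return -abs(x - 3.0)

def make_sampler(seed):
    rng = random.Random(seed)
    return lambda: rng.uniform(-10.0, 10.0)

# Same seed, so the 64-sample run contains the single-sample candidate.
single = best_of_n(make_sampler(0), score, n=1)
many = best_of_n(make_sampler(0), score, n=64)

rng = random.Random(1)
beam = beam_search(
    init_fn=lambda: rng.uniform(-10.0, 10.0),
    step_fn=lambda x: x + rng.gauss(0.0, 1.0),  # one stochastic generation step
    score_fn=score,
    width=4,
    branch=8,
    steps=10,
)
```

Best‑of‑N spends compute only on final candidates, while beam search prunes weak trajectories early, which is one plausible reason search helps more on harder targets.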

Experimental Evaluation

Benchmarks span protein‑protein, protein‑small‑molecule, and enzyme design tasks. Across all targets, Complexa outperforms prior methods (e.g., RFdiffusion, BindCraft) in success rate, sampling speed, and structural novelty. Notably, it outputs high‑quality sequences directly, without a separate sequence‑redesign step such as ProteinMPNN.

Conditional tags enable explicit control over generated secondary‑structure types (α‑helix vs. β‑sheet), increasing diversity. In test‑time compute experiments, simply raising the number of samples surpasses baselines on easy tasks, while advanced search strategies further boost performance on harder targets, demonstrating scalable gains.

Physical plausibility analyses show improved interface hydrogen‑bonding and energy metrics, indicating that Complexa can refine fine‑grained interactions for stronger binding stability.

On multi‑chain targets where existing methods fail under limited compute, Complexa succeeds once more test‑time compute is allocated. It also generalizes to enzyme design benchmarks, which supports its broad applicability.

Broader Context

Recent AI‑driven protein binder work (e.g., RFdiffusion, BoltzGen) has shown feasibility, but Complexa advances the paradigm by merging generation and refinement into a single system that scales with test‑time compute. Industry collaborations, such as Bayer’s integration of AI protein‑engineering platforms, illustrate the shift from isolated model performance to systematic, scalable design pipelines.

Overall, the paper argues that the next frontier in protein design is not merely achieving high‑quality designs, but enabling continuous, efficient, and extensible generation at scale.

Complexa framework illustration
Training data composition for Complexa
Conditional generation process in Complexa
Performance comparison on protein targets
Inference time scaling analysis
Enzyme design benchmark results
Tags: Generative AI, SOTA, ICLR 2026, Complexa, protein binder design, Teddymer dataset