How Uni‑Fold + Alibaba PAI Boost Protein Structure Prediction to 6.6k Amino Acids
Uni‑Fold, inspired by DeepMind’s AlphaFold and now accelerated with Alibaba Cloud’s PAI platform, can predict structures for proteins of up to 6.6k amino acids, covering 99.992% of known sequences. It completes inference for the SARS‑CoV‑2 spike trimer in about ten minutes, the best known inference performance for this class of model.
Abstract: By combining DP Technology’s Uni‑Fold with Alibaba Cloud PAI’s inference acceleration (FoldAcc), the maximum amino‑acid sequence length supported in a single prediction rises to 6.6k, covering 99.992% of known protein sequences and achieving the best known inference performance.
Take the SARS‑CoV‑2 spike protein, a trimer with a typical total length close to 4k residues: the original AlphaFold model runs out of memory on it, whereas Uni‑Fold + FoldAcc completes inference in about ten minutes.
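A rough back-of-the-envelope estimate suggests why inputs near 4k residues can exhaust GPU memory. The sketch below assumes the published AlphaFold2 Evoformer sizes (128 pair channels, 4 triangle-attention heads) and 2-byte bf16 values, none of which are stated in this article. It counts only two tensors and ignores everything else, so the numbers are illustrative, not a measurement of AlphaFold or of Uni‑Fold + FoldAcc.

```python
# Rough activation-size estimate for one Evoformer block (illustrative only).
# Assumptions (not taken from the article): AlphaFold2's published sizes of
# 128 pair channels and 4 triangle-attention heads, with 2-byte bf16 values.

BYTES_BF16 = 2
PAIR_CHANNELS = 128
TRIANGLE_HEADS = 4


def pair_representation_bytes(length: int) -> int:
    """Size of the L x L x c_z pair representation."""
    return length * length * PAIR_CHANNELS * BYTES_BF16


def triangle_attention_logits_bytes(length: int) -> int:
    """Size of the attention logits if they are materialized all at once.

    Triangle attention runs an L x L attention for each of the L rows
    (or columns), per head, so the logits tensor has heads * L^3 entries.
    """
    return TRIANGLE_HEADS * length ** 3 * BYTES_BF16


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / 2 ** 30


if __name__ == "__main__":
    for length in (1000, 4000, 6600):
        pair = to_gib(pair_representation_bytes(length))
        logits = to_gib(triangle_attention_logits_bytes(length))
        print(f"L={length}: pair {pair:.1f} GiB, unchunked logits {logits:.1f} GiB")
```

Under these assumptions the unchunked logits for a 4k-residue input come to hundreds of GiB, far more than a single 80 GB GPU holds. The cubic growth in sequence length is why chunking, reduced precision, and splitting work across GPUs matter for long sequences.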
Since AlphaFold2’s release in 2020, AI‑assisted protein‑structure prediction has attracted great attention. However, scaling AI‑based prediction for industrial use still faces challenges in infrastructure, tool completeness, and model deployment efficiency.
In August 2022, DP Technology (DPTech) upgraded and open‑sourced Uni‑Fold, reproducing full‑size AlphaFold2 and AlphaFold‑Multimer training, and improving training speed by 220% over competing projects such as OpenFold and FastFold.
To address the long‑standing Evoformer inference bottleneck, DP Technology and Alibaba Cloud’s PAI team combined multi‑GPU parallelism, mixed precision, and compiler optimizations, enabling accelerated multi‑GPU inference on longer amino‑acid sequences.
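One reason multi‑GPU parallelism helps with long sequences is that row‑wise operations on the L × L pair representation treat each row independently, so the rows can be divided among devices. The pure‑Python sketch below illustrates only that idea. The function names are hypothetical, plain lists stand in for GPU tensors, and it is not the FoldAcc implementation.

```python
# Illustration of row-wise sharding (hypothetical helper names; plain Python
# lists stand in for GPU tensors; this is not the FoldAcc implementation).
import math
from typing import List

Matrix = List[List[float]]


def row_softmax(rows: Matrix) -> Matrix:
    """Apply a numerically stable softmax to each row independently."""
    out = []
    for row in rows:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def shard_rows(matrix: Matrix, num_devices: int) -> List[Matrix]:
    """Split rows into contiguous shards, at most one per simulated device."""
    per_device = math.ceil(len(matrix) / num_devices)
    return [matrix[i:i + per_device] for i in range(0, len(matrix), per_device)]


def sharded_row_softmax(matrix: Matrix, num_devices: int) -> Matrix:
    """Process each shard separately, then gather the results in row order."""
    gathered: Matrix = []
    for shard in shard_rows(matrix, num_devices):
        # Each simulated device only ever holds about L / num_devices rows.
        gathered.extend(row_softmax(shard))
    return gathered


if __name__ == "__main__":
    logits = [[float((i * 7 + j * 3) % 5) for j in range(6)] for i in range(6)]
    assert sharded_row_softmax(logits, 4) == row_softmax(logits)
    print("sharded result matches the single-device result")
```

Because no row depends on another, the sharded result is identical to the single‑device result while each device stores only a fraction of the rows. Operations that mix information across rows need communication between devices, which this sketch does not model.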
[Figure: typical acceleration results on A100‑80G GPUs with bf16 enabled]
Uni‑Fold upgrade adds support for protein‑complex training
Protein structure research underpins cancer detection, targeted drug discovery, and aging studies. Traditional methods (X‑ray crystallography, cryo‑EM) are time‑ and resource‑intensive, making rapid, scalable prediction essential.
In December 2021, DP Technology released Uni‑Fold v1.0.0, the first open‑source full‑size AlphaFold2 training code. The August 2022 update added support for both monomer and protein‑complex (multimer) prediction.
The open‑source Uni‑Fold, built on PyTorch, reproduces and improves on the AlphaFold and AlphaFold‑Multimer models, achieving comparable or better accuracy on recent PDB test sets with template similarity below 40%.
Training time has been reduced from 11 days to about 4 days, outperforming other open‑source projects.
Machine Learning Platform PAI provides end‑to‑end AI engineering support. It is the only Chinese platform continuously listed in Gartner’s Data Science and Machine Learning Platform report, offering comprehensive services for AI development and deployment.
PAI’s proprietary inference accelerator, PAI‑Blade, delivers optimal performance across GPUs, CPUs, and edge devices through joint model‑system optimization. Its core component, BladeDISC, features industry‑leading dynamic‑size model optimization and large‑granularity operator fusion, and was open‑sourced in February 2022.
Looking ahead, AI‑driven protein‑structure prediction exemplifies the emerging "AI for Science" paradigm, driving breakthroughs across biology, physics, and chemistry, while posing new challenges for AI infrastructure.
DP Technology and Alibaba Cloud will continue to provide robust AI foundations for biomedicine, energy, materials, and other scientific domains.
Appendix
DP Technology open‑source Uni‑Fold: https://github.com/dptech-corp/Uni-Fold
Alibaba open‑source AI compiler BladeDISC: https://github.com/alibaba/BladeDISC