How AI Learned to Read the Genomic “Dialects” of 300,000 People for Precise Expression Prediction
This article reviews a study that addresses a key limitation of genomic language models trained only on the reference genome. The authors pre-train a model on variants from 300,000 European individuals, producing UKBioBERT and the two-stage UKBioFormer. Together, these models deliver markedly better gene-function representations and personalized expression predictions across cell lines and populations.
