MIT's Pichia-CLM model learns yeast DNA language, boosting protein yield up to 3‑fold
An MIT research team introduced Pichia-CLM, a GRU‑based language model that optimizes codon usage and was trained on a 27 k‑pair Pichia pastoris dataset. Across six proteins, the team showed that it consistently outperforms four commercial codon‑optimization tools, delivering up to a three‑fold increase in heterologous protein secretion.
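
For readers unfamiliar with the term, codon optimization means choosing, for each amino acid, one of several synonymous DNA codons that encode it. The sketch below is a toy, context-free baseline in Python and is not Pichia-CLM's method: the table covers only a few residues, and the weights are made-up placeholders rather than measured Pichia pastoris frequencies. The names `SYNONYMOUS_CODONS`, `encode_greedy`, and `translate` are hypothetical. A language model would presumably differ by conditioning each codon choice on the surrounding sequence instead of using a fixed table.

```python
# Toy illustration only. The weights are made-up placeholders, NOT measured
# Pichia pastoris codon frequencies, and this is NOT how Pichia-CLM works.
SYNONYMOUS_CODONS = {
    "M": {"ATG": 1.0},                            # methionine (single codon)
    "K": {"AAG": 0.6, "AAA": 0.4},                # lysine
    "F": {"TTC": 0.55, "TTT": 0.45},              # phenylalanine
    "*": {"TAA": 0.5, "TAG": 0.3, "TGA": 0.2},    # stop
}


def encode_greedy(protein: str) -> str:
    """Pick the highest-weight synonymous codon for each residue."""
    codons = []
    for residue in protein:
        options = SYNONYMOUS_CODONS[residue]
        codons.append(max(options, key=options.get))
    return "".join(codons)


def translate(dna: str) -> str:
    """Translate DNA back to protein using the same toy table."""
    codon_to_residue = {
        codon: residue
        for residue, options in SYNONYMOUS_CODONS.items()
        for codon in options
    }
    return "".join(codon_to_residue[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    dna = encode_greedy("MKF*")
    print(dna)
    print(translate(dna))
```

Whichever codons are chosen, translating the DNA back must give the original protein. Only the DNA spelling changes, and that spelling is what codon-optimization tools tune.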
