Researchers from Zhejiang University and HKUST (Guangzhou) have developed an advanced AI model, ProtET, that harnesses the power of multi-modal learning to enable controllable protein editing through simple text-based instructions. This breakthrough, detailed in Health Data Science, bridges the gap between biological language and the manipulation of protein sequences, advancing functional protein design across various domains, such as enzyme activity, stability, and antibody binding.
Proteins are vital to all biological processes, and their precise modification holds tremendous potential in areas like medical therapies, synthetic biology, and biotechnology. Traditional protein editing typically relies on time-consuming laboratory experiments and on models optimized for a single task. ProtET takes a different approach, combining a transformer-structured encoder with a hierarchical training paradigm. The model aligns protein sequences with natural language descriptions through contrastive learning, allowing researchers to modify proteins intuitively using text-based instructions.
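To make the idea of contrastive alignment more concrete, here is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss of the kind commonly used to align two modalities. It is not ProtET's actual implementation: the article does not specify the loss, and the function names, the temperature value, and the random stand-in embeddings are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(protein_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Row i of protein_emb and row i of text_emb are assumed to describe
    the same protein; every other row in the batch acts as a negative.
    The temperature of 0.07 is an illustrative default, not a value
    taken from the ProtET study.
    """
    p = l2_normalize(protein_emb)
    t = l2_normalize(text_emb)
    logits = p @ t.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(p))                 # matching pairs sit on the diagonal
    loss_p2t = -log_softmax(logits, axis=1)[idx, idx].mean()  # protein -> text
    loss_t2p = -log_softmax(logits, axis=0)[idx, idx].mean()  # text -> protein
    return 0.5 * (loss_p2t + loss_t2p)

# Hypothetical stand-ins for encoder outputs: 8 pairs of 16-dimensional vectors.
rng = np.random.default_rng(0)
protein = rng.normal(size=(8, 16))
text = protein + 0.05 * rng.normal(size=(8, 16))   # well-aligned descriptions

aligned = contrastive_loss(protein, text)
mismatched = contrastive_loss(protein, np.roll(text, 1, axis=0))  # wrong pairing
print(aligned < mismatched)
```

Training against a loss like this pulls each protein embedding toward the embedding of its own description and pushes it away from the other descriptions in the batch, which is what would let a text instruction steer edits in a shared embedding space.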
Led by Mingze Yin from Zhejiang University and Jintai Chen from HKUST (Guangzhou), the research team trained ProtET on a dataset of over 67 million protein–biotext pairs sourced from the Swiss-Prot and TrEMBL databases. ProtET demonstrated exceptional performance, achieving improvements in protein stability of up to 16.9%, as well as optimization of catalytic activities and antibody-specific binding.
“ProtET introduces a flexible, controllable approach to protein editing, allowing researchers to fine-tune biological functions with unparalleled precision,” said Mingze Yin, the study’s lead author. The model successfully optimized protein sequences across several tasks, including enzyme catalytic activity, protein stability, and antibody-antigen binding. In zero-shot tasks, ProtET also designed antibodies against SARS-CoV that formed stable and functional 3D structures, which the researchers present as evidence of its practical potential in biomedical research.
Looking ahead, the research team envisions ProtET becoming an indispensable tool in protein engineering, paving the way for breakthroughs in synthetic biology, genetic therapies, and biopharmaceutical manufacturing.
This study represents a significant milestone in AI-driven protein design, demonstrating how cross-modal integration can unlock new possibilities for scientific innovation and discovery.
By Impact Lab

