Thesis Details
Principles of Machine Learning-Guided Protein Engineering
Biswas, Surojit; Joung, Keith
University: Harvard University
Department: Medical Sciences
Keywords: machine learning; protein engineering; synthetic biology
Others: https://dash.harvard.edu/bitstream/handle/1/37365914/BISWAS-DISSERTATION-2020.pdf?sequence=1&isAllowed=y
United States | English
Source: Digital Access to Scholarship at Harvard
【 Abstract 】

Protein engineering has enormous academic, industrial, and biomedical potential. However, it is limited by our ability to efficiently explore astronomically large sequence spaces to find rare high-functioning variants. In this thesis, we find that when screening or selection capacity is high, directed evolution is often sufficient to find such variants. In such settings, machine learning can be used to explore distant regions of sequence space that may serve as substrates for directed evolution. However, under the resource constraints typical of many high-value protein systems and of late-stage or high-fidelity engineering efforts, screening and selection capacity is low, making directed evolution substantially less effective. To address this, we developed a semi-supervised machine learning framework, UniRep, that from scratch and from sequence alone learned to distill the fundamental features of a protein (including biophysical, structural, and evolutionary information) into a holistic statistical representation. Trained on a vast, exponentially growing, unlabeled sequence database, UniRep not only enables state-of-the-art predictive performance on a diverse variety of protein informatics tasks but also, when combined with in silico directed evolution, enables engineering in resource-constrained settings where only a small number of variants (low-N) can be functionally characterized. Taken together, we conclude that semi- and self-supervised machine learning, process virtualization, and a few carefully chosen experimental measurements may rapidly accelerate and reduce the costs of protein engineering in a manner that other (semi-)rational design approaches and directed evolution cannot.

【 Preview 】
Attachments
Files Size Format View
Principles of Machine Learning-Guided Protein Engineering 16051KB PDF download