Journal of computational biology: A journal of computational molecular cell biology | |
BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces | |
Jonathan D.Jou^11  Bruce R.Donald^1,42  Adegoke A.Ojewole^1,23  Vance G.Fowler^34  | |
[1] Computational Biology and Bioinformatics Program, Duke University, Durham, North Carolina^2;Department of Biochemistry, Duke University Medical Center, Durham North Carolina^4;Department of Computer Science, Duke University, Durham, North Carolina^1;Division of Infectious Diseases, Duke University Medical Center, Durham, North Carolina^3 | |
关键词: molecular ensembles; OSPREY; predicting binding affinity; protein design; structural biology; sublinear algorithms.; | |
DOI : 10.1089/cmb.2017.0267 | |
学科分类:生物科学(综合) | |
来源: Mary Ann Liebert, Inc. Publishers | |
【 摘 要 】
Computational protein design (CPD) algorithms that compute binding affinity, Ka, search for sequences with an energetically favorable free energy of binding. Recent work shows that three principles improve the biological accuracy of CPD: ensemble-based design, continuous flexibility of backbone and side-chain conformations, and provable guarantees of accuracy with respect to the input. However, previous methods that use all three design principles are single-sequence (SS) algorithms, which are very costly: linear in the number of sequences and thus exponential in the number of simultaneously mutable residues. To address this computational challenge, we introduce BBK*, a new CPD algorithm whose key innovation is the multisequence (MS) bound: BBK* efficiently computes a single provable upper bound to approximate Ka for a combinatorial number of sequences, and avoids SS computation for all provably suboptimal sequences. Thus, to our knowledge, BBK* is the first provable, ensemble-based CPD algorithm to run in time sublinear in the number of sequences. Computational experiments on 204 protein design problems show that BBK* finds the tightest binding sequences while approximating Ka for up to 105-fold fewer sequences than the previous state-of-the-art algorithms, which require exhaustive enumeration of sequences. Furthermore, for 51 protein–ligand design problems, BBK* provably approximates Ka up to 1982-fold faster than the previous state-of-the-art iMinDEE// algorithm. Therefore, BBK* not only accelerates protein designs that are possible with previous provable algorithms, but also efficiently performs designs that are too large for previous methods.
【 授权许可】
Unknown
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
RO201910255205648ZK.pdf | 572KB | download |