Protein engineering can be performed by combinatorial techniques (directed evolution) and by data-driven methods that use machine-learning algorithms. The main characteristic of directed evolution (DE) is the application of an effective and efficient screen or selection to a diverse mutant library. Because a diverse mutant library is important for the success of DE, we compared the performance of DNA-shuffling and recombination PCR on fluorescent proteins using sequence information as well as statistical methods. We found that the diversity of the libraries generated by DNA-shuffling and recombination PCR was dependent on the type of skew primers used and sensitive to the level of nucleotide identity between genes. DNA-shuffling and recombination PCR produced libraries with different crossover tendencies, suggesting that the two protocols could be used in combination to produce better libraries. Data-driven protein engineering uses sequence, structure and function data along with analyzed empirical activity information to guide library design. Boolean Learning Support Vector Machines (BLSVM) were used to identify interacting residues in fluorescent proteins, and the gene templates were modified to preserve these interactions after recombination. Through site-directed mutagenesis, recombination and expression experiments, we validated that BLSVM can be used to identify interacting residues and to increase the fraction of active proteins in the library. As an extension of these experiments, DE was applied to monomeric Red Fluorescent Proteins to improve their spectral characteristics, and structure-guided protein engineering was performed on penicillin G acylase (PGA), an industrially relevant catalyst, to change its substrate specificity.
Optimization of Recombination Methods and Expanding the Utility of Penicillin G Acylase