BMC Bioinformatics
Software for the analysis and visualization of deep mutational scanning data
Jesse D Bloom [1]
[1] Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
Keywords: Amino-acid preferences; Sequence logo; Deep mutational scanning
Others: 1232559 | DOI: 10.1186/s12859-015-0590-4
Received 2015-01-09, accepted 2015-04-22, published 2015
【 Abstract 】
Background
Deep mutational scanning is a technique to estimate the impacts of mutations on a gene by using deep sequencing to count mutations in a library of variants before and after imposing a functional selection. The impacts of mutations must be inferred from changes in their counts after selection.
Results
I describe a software package, dms_tools, to infer the impacts of mutations from deep mutational scanning data using a likelihood-based treatment of the mutation counts. I show that dms_tools yields more accurate inferences on simulated data than simply calculating ratios of counts pre- and post-selection. Using dms_tools, one can infer the preference of each site for each amino acid given a single selection pressure, or assess the extent to which these preferences change under different selection pressures. The preferences and their changes can be intuitively visualized with sequence-logo-style plots created using an extension to weblogo.
Conclusions
dms_tools implements a statistically principled approach for the analysis and subsequent visualization of deep mutational scanning data.
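For orientation, the simple baseline mentioned in the Results (ratios of counts pre- and post-selection) can be sketched as follows. This is not the likelihood-based method implemented in dms_tools and does not use its API. The function name `ratio_preferences`, the pseudocount, and the example counts are hypothetical, chosen only to illustrate how per-site preferences could be obtained by normalizing enrichment ratios.

```python
import numpy as np


def ratio_preferences(pre_counts, post_counts, wt_index, pseudocount=1.0):
    """Estimate preferences at one site from normalized enrichment ratios.

    Illustrative baseline only; the pseudocount is an assumption added to
    avoid division by zero when a character has zero counts.
    """
    pre = np.asarray(pre_counts, dtype=float) + pseudocount
    post = np.asarray(post_counts, dtype=float) + pseudocount
    # Enrichment of each character relative to the wild-type character:
    # (post-selection frequency ratio) / (pre-selection frequency ratio).
    enrichment = (post / post[wt_index]) / (pre / pre[wt_index])
    # Normalize so the preferences at this site sum to one.
    return enrichment / enrichment.sum()


# Hypothetical counts for a single site with four characters;
# index 0 is treated as the wild-type character.
pre = [9000, 300, 400, 300]
post = [9500, 300, 190, 10]
prefs = ratio_preferences(pre, post, wt_index=0)
print(prefs.round(3))
```

Ratio estimates of this kind become noisy when counts are low, which is the situation in which the abstract reports that the likelihood-based treatment is more accurate.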
【 License 】
2015 Bloom; licensee BioMed Central.
Files | Size | Format | View |
---|---|---|---|
Figure 6. | 156KB | Image | download |
Figure 5. | 88KB | Image | download |
Figure 4. | 68KB | Image | download |
Figure 3. | 295KB | Image | download |
Figure 2. | 149KB | Image | download |
Figure 1. | 119KB | Image | download |
【 Figures 】
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.
Figure 6.