Journal article details
BMC Bioinformatics
Software for the analysis and visualization of deep mutational scanning data
Jesse D Bloom [1]
[1] Division of Basic Sciences and Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle 98109, WA, USA
Keywords: Amino-acid preferences; Sequence logo; Deep mutational scanning
DOI  :  10.1186/s12859-015-0590-4
Received 2015-01-09, accepted 2015-04-22, published 2015
【 Abstract 】

Background

Deep mutational scanning is a technique to estimate the impacts of mutations on a gene by using deep sequencing to count mutations in a library of variants before and after imposing a functional selection. The impacts of mutations must be inferred from changes in their counts after selection.

Results

I describe a software package, dms_tools, to infer the impacts of mutations from deep mutational scanning data using a likelihood-based treatment of the mutation counts. I show that dms_tools yields more accurate inferences on simulated data than simply calculating ratios of counts pre- and post-selection. Using dms_tools, one can infer the preference of each site for each amino acid given a single selection pressure, or assess the extent to which these preferences change under different selection pressures. The preferences and their changes can be intuitively visualized with sequence-logo-style plots created using an extension to weblogo.
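For orientation, the baseline that the abstract contrasts against (calculating ratios of counts pre- and post-selection) can be sketched as follows. This is a minimal illustration and not the likelihood-based method of dms_tools or its API. The function name `ratio_preferences`, the pseudocount, the normalization to wild type, and the example counts are all assumptions introduced here.

```python
import numpy as np

def ratio_preferences(pre_counts, post_counts, wildtype_index, pseudocount=1.0):
    """Illustrative ratio-based estimate of one site's preferences.

    Hypothetical helper, not part of dms_tools. Counts are per-character
    counts at a single site before and after selection.
    """
    # A pseudocount (an assumption here) avoids division by zero.
    pre = np.asarray(pre_counts, dtype=float) + pseudocount
    post = np.asarray(post_counts, dtype=float) + pseudocount
    # Enrichment of each character relative to wild type, post vs. pre.
    enrichment = (post / post[wildtype_index]) / (pre / pre[wildtype_index])
    # Normalize so the preferences at this site sum to one.
    return enrichment / enrichment.sum()

# Made-up counts for four characters at one site; index 0 is wild type.
pre = [9000, 300, 400, 300]
post = [9500, 50, 400, 50]
prefs = ratio_preferences(pre, post, wildtype_index=0)
print(prefs)
```

With low counts such ratios become noisy, which is the situation in which the abstract reports that the likelihood-based treatment gives more accurate inferences on simulated data.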

Conclusions

dms_tools implements a statistically principled approach for the analysis and subsequent visualization of deep mutational scanning data.

【 License 】

   
2015 Bloom; licensee BioMed Central.
