Journal Article Details
BMC Medical Genomics
From big data analysis to personalized medicine for all: challenges and opportunities
David Meyre [2]; Michelle Turcotte [1]; Akram Alyass [1]
[1] Department of Clinical Epidemiology and Biostatistics, McMaster University, 1280 Main Street West, Hamilton, ON, Canada
[2] Department of Pathology and Molecular Medicine, McMaster University, 1280 Main Street West, Hamilton, ON, Canada
Keywords: High-dimensionality; Integrative methods; Cloud computing; High-throughput technologies; Personalized medicine; Omics; Big data
Others  :  1219426
DOI  :  10.1186/s12920-015-0108-y
Received 2015-01-21, accepted 2015-06-15, published 2015
【 Abstract 】

Recent advances in high-throughput technologies have led to the emergence of systems biology as a holistic science to achieve more precise modeling of complex diseases. Many predict the emergence of personalized medicine in the near future. We are, however, moving from two-tiered health systems to a two-tiered personalized medicine. Omics facilities are restricted to affluent regions, and personalized medicine is likely to widen the growing gap in health systems between high- and low-income countries. This is mirrored by an increasing lag between our ability to generate big data and our ability to analyze it. Several bottlenecks slow down the transition from conventional to personalized medicine: generation of cost-effective high-throughput data; hybrid education and multidisciplinary teams; data storage and processing; data integration and interpretation; and individual and global economic relevance. This review provides an update on important developments in the analysis of big data and on forward strategies to accelerate the global transition to personalized medicine.

【 License 】

   
2015 Alyass et al.
