Journal Article Details
Frontiers in Neuroinformatics
Generic Machine Learning Pattern for Neuroimaging-Genetic Studies in the Cloud
Jean-Baptiste Poline2  Benoit Da Mota2  Gaël Varoquaux2  Marcella Rietschel3  Patricia Conrod4  Gabriel Antoniu5  Hervé Lemaitre6  Alexandru Costan7  Radu Tudoran7  Bertrand Thirion8  Vincent Frouin1,10  Goetz Brasche1,11  Tomas Paus1,14
[1] psychiatry, Université Paris Sud, Université Paris Descartes;CEA, DSV, I²BM, Neurospin;Central Institute of Mental Health;Department of psychiatry, University Montreal;Henry H. Wheeler Brain Imaging Center, University of California at Berkeley;INSERM CEA unit 1000 imaging &Inria Rennes;Inria Saclay;Institute of psychiatry, Kings College;Medical Faculty Mannheim, University of Heidelberg;Microsoft, Advance Technology Lab Europe;Montreal Neurological Institute, McGill University;Rotman Research Institute, University of Toronto;School of psychology, University of Nottingham;
Keywords: fMRI; machine learning; heritability; neuroimaging genetics; cloud computing
DOI: 10.3389/fninf.2014.00031
Source: DOAJ
【 Abstract 】

Brain imaging is a natural intermediate phenotype for understanding the link between genetic information and behavior or risk factors for brain pathologies. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, the increasing computational power of distributed architectures can be harnessed if new neuroinformatics infrastructures are designed and training in the use of these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a two-week deployment on hundreds of virtual machines.
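
The abstract's idea of splitting a non-parametric test into independent map tasks and merging their results in a reduce step can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the Python standard library rather than TomusBLOB or Scikit-learn, it tests a single Pearson correlation rather than the paper's model, and every name in it (`correlation`, `map_task`, `reduce_counts`, `permutation_p_value`) is hypothetical and not taken from the article.

```python
import random
from functools import reduce


def correlation(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def map_task(x, y, seed, n_perm):
    """Map step (hypothetical): one independent batch of permutations.

    Returns (hits, n_perm), where hits counts the permutations whose
    absolute correlation reaches the observed one.
    """
    rng = random.Random(seed)  # per-task seed keeps each batch reproducible
    observed = abs(correlation(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(correlation(x, shuffled)) >= observed:
            hits += 1
    return hits, n_perm


def reduce_counts(a, b):
    """Reduce step: merge two partial (hits, total) counts."""
    return a[0] + b[0], a[1] + b[1]


def permutation_p_value(x, y, n_tasks=8, perms_per_task=250):
    """Run the map tasks locally, then reduce to a permutation p-value."""
    # On a cluster each map_task call could run on a different machine;
    # here they run sequentially so the same code can be tried on a laptop.
    partials = [map_task(x, y, seed, perms_per_task) for seed in range(n_tasks)]
    hits, total = reduce(reduce_counts, partials)
    return (hits + 1) / (total + 1)  # add-one correction avoids p = 0


if __name__ == "__main__":
    x = list(range(20))
    y = [2 * v + 1 for v in x]  # toy data with a perfect linear relation
    print(permutation_p_value(x, y))
```

Because each map task depends only on its seed and the shared data, the batches can run in any order or in parallel, and the reduce step only needs to add counts. That independence is what lets the same procedure be tested on a small scale before it is distributed.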

【 License 】

Unknown   
