Journal Article Details
Frontiers in Genetics
Large-Scale Automatic Feature Selection for Biomarker Discovery in High-Dimensional OMICs Data
Olivier Perin [2]; Yves Fradet [3]; Alain Bergeron [3]; Marie Pier Scott Boyer [4]; Mickael Leclercq [4]; Benjamin Vittrant [4]; Arnaud Droit [4]; Marie Laure Martin-Magniette [6]
[1] Centre de Recherche du CHU de Québec–Université Laval, Québec City, QC, Canada; [2] Digital Sciences Department, L'Oréal Advanced Research, Aulnay-sous-bois, France; [3] Département de Chirurgie, Oncology Axis, Université Laval, Québec City, QC, Canada; [4] Département de Médecine Moléculaire, Université Laval, Québec City, QC, Canada; [5] Institute of Plant Sciences Paris Saclay IPS2, CNRS, INRA, Université Paris-Sud, Université Evry, Université Paris-Saclay, Paris Diderot, Sorbonne Paris-Cité, Orsay, France; [6] UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France
Keywords: machine learning; omics; biomarkers signature; feature selection; precision medicine
DOI: 10.3389/fgene.2019.00452
Source: DOAJ
【 Abstract 】

The identification of biomarker signatures in omics molecular profiling is usually performed to predict outcomes in a precision medicine context, such as patient disease susceptibility, diagnosis, prognosis, and treatment response. To identify these signatures, we have developed a biomarker discovery tool called BioDiscML. From a collection of samples and their associated characteristics, i.e., the biomarkers (e.g., gene expression, protein levels, clinico-pathological data), BioDiscML exploits various feature selection procedures to produce signatures associated with machine learning models that efficiently predict a specified outcome. To this end, BioDiscML uses a large variety of machine learning algorithms to select the best combination of biomarkers for predicting categorical or continuous outcomes from highly unbalanced datasets. The software has been implemented to automate all machine learning steps, including data pre-processing, feature selection, model selection, and performance evaluation. BioDiscML is delivered as a stand-alone program and is available for download at https://github.com/mickaelleclercq/BioDiscML.

【 License 】

Unknown   
