Journal article details
Journal of Translational Medicine
In silico prediction of novel therapeutic targets using gene–disease association data
Research
Enrico Ferrero [1], Philippe Sanseau [2], Ian Dunham [3]
[1] Computational Biology and Stats, Target Sciences, GSK Medicines Research Centre, Gunnels Wood Road, SG1 2NY, Stevenage, UK; Open Targets, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK; European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
Keywords: Drug discovery; Target discovery; Gene–disease associations; Machine learning; Data mining
DOI: 10.1186/s12967-017-1285-6
Received 2017-07-07, accepted 2017-08-22, published 2017
Source: Springer
【 Abstract 】

Background: Target identification and validation is a pressing challenge in the pharmaceutical industry, with many of the programmes that fail for efficacy reasons showing poor association between the drug target and the disease. Computational prediction of successful targets could have a considerable impact on attrition rates in the drug discovery pipeline by significantly reducing the initial search space. Here, we explore whether gene–disease association data from the Open Targets platform is sufficient to predict therapeutic targets that are actively being pursued by pharmaceutical companies or are already on the market.

Methods: To test our hypothesis, we train four different classifiers (a random forest, a support vector machine, a neural network and a gradient boosting machine) on partially labelled data and evaluate their performance using nested cross-validation and testing on an independent set. We then select the best performing model and use it to make predictions on more than 15,000 genes. Finally, we validate our predictions by mining the scientific literature for proposed therapeutic targets.

Results: We observe that the data types with the best predictive power are animal models showing a disease-relevant phenotype, differential expression in diseased tissue and genetic association with the disease under investigation. On a test set, the neural network classifier achieves over 71% accuracy with an AUC of 0.76 when predicting therapeutic targets in a semi-supervised learning setting. We use this model to gain insights into current and failed programmes and to predict 1431 novel targets, of which a highly significant proportion has been independently proposed in the literature.

Conclusions: Our in silico approach shows that data linking genes and diseases is sufficient to predict novel therapeutic targets effectively and confirms that this type of evidence is essential for formulating or strengthening hypotheses in the target discovery process. Ultimately, more rapid and automated target prioritisation holds the potential to reduce both the costs and the development times associated with bringing new medicines to patients.
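
The nested cross-validation mentioned under Methods can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, a simple k-nearest-neighbour classifier stands in for the four models named in the abstract, and all names (`make_data`, `folds`, `nested_cv`) are hypothetical. It uses only the Python standard library. The point it shows is that hyperparameters are tuned in an inner loop on the outer-training portion only, so the outer-fold scores are not biased by the tuning.

```python
import random
from statistics import mean

def make_data(n=120, seed=0):
    """Synthetic stand-in for per-gene evidence scores (assumption, not Open Targets data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = "target", 0 = "non-target" (illustrative labels)
        # Three hypothetical evidence features clipped to [0, 1]; positives are shifted upwards.
        X.append([min(1.0, max(0.0, rng.gauss(0.35 + 0.3 * label, 0.15))) for _ in range(3)])
        y.append(label)
    return X, y

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train_idx, test_idx, X, y, k):
    """Fraction of test points classified correctly by k-NN fitted on train_idx."""
    train_X = [X[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / len(test_idx)

def folds(indices, n_folds):
    """Split indices into n_folds disjoint (train, test) pairs."""
    parts = [indices[i::n_folds] for i in range(n_folds)]
    return [
        ([j for p in parts[:f] + parts[f + 1:] for j in p], parts[f])
        for f in range(n_folds)
    ]

def nested_cv(X, y, ks=(1, 3, 5, 7), outer=5, inner=3, seed=0):
    """Return one accuracy per outer fold, with k tuned inside each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for train_idx, test_idx in folds(idx, outer):
        # Inner loop: choose k using only the outer-training portion.
        best_k = max(
            ks,
            key=lambda k: mean(accuracy(tr, va, X, y, k) for tr, va in folds(train_idx, inner)),
        )
        # Outer loop: score the tuned model on data never used during tuning.
        outer_scores.append(accuracy(train_idx, test_idx, X, y, best_k))
    return outer_scores

X, y = make_data()
scores = nested_cv(X, y)
print(f"outer-fold accuracies: {[round(s, 2) for s in scores]}, mean: {mean(scores):.2f}")
```

In practice the same structure would usually be expressed with a machine learning library rather than hand-written loops. The abstract also describes a final test on an independent set, which this sketch omits.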

【 License 】

CC BY   
© The Author(s) 2017
