With the advent of big data, scientists are collecting biological data faster than ever before, including genomic profiles that describe individuals by thousands of genes at a time. Adding to this library of knowledge are gene interaction networks, which model overarching cellular processes by describing how genes interact with each other. Given genomic profile data together with gene interaction data, the question becomes how to integrate these two sources of knowledge for machine learning. Previous studies have employed some form of feature engineering to "collapse" the network topology alongside the genomic profiles, at the cost of losing global network information. Instead, we explore a framework based on network propagation. We explain how network propagation algorithms can enhance standalone genomic profiles, producing what we call embeddings, and show that these enhancements lead to improved predictive accuracy on drug response classification. We next show that these embeddings contain predictive signals that are not necessarily implicated by gene ranking methods such as PageRank. Last, we apply network propagation to a dataset presented by the DREAM organization, and show that it improves on a naive linear regression for a drug sensitivity ranking task.
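To make the idea of network propagation concrete, the following is a minimal sketch that assumes a random walk with restart, one common propagation scheme. The abstract does not say which algorithm the work uses, so the function name `propagate`, the `restart` parameter, and the toy network below are all illustrative assumptions rather than the authors' method.

```python
import numpy as np

def propagate(adjacency, profiles, restart=0.5, tol=1e-10, max_iter=10_000):
    """Smooth genomic profiles over a gene network (random walk with restart).

    adjacency: (n_genes, n_genes) symmetric, non-negative interaction matrix.
    profiles:  (n_samples, n_genes) matrix, one genomic profile per row.
    restart:   hypothetical restart probability in (0, 1]; higher values
               keep the result closer to the original profile.
    """
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=0)
    degree[degree == 0] = 1.0      # avoid division by zero for isolated genes
    W = A / degree                 # column-normalized transition matrix

    F0 = np.asarray(profiles, dtype=float)
    F = F0.copy()
    for _ in range(max_iter):
        # Each row f is updated as f <- (1 - restart) * W f + restart * f0.
        F_next = (1.0 - restart) * F @ W.T + restart * F0
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next
    return F

# Toy example: three genes on a path (0 - 1 - 2), one sample with signal on gene 0.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
profiles = np.array([[1.0, 0.0, 0.0]])
embedding = propagate(adjacency, profiles, restart=0.5)
```

In this sketch the signal on gene 0 spreads to genes 1 and 2 according to the network topology, so the resulting row can be used as a network-aware feature vector in place of the raw profile.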
Towards the integration of genomic profiles and gene interaction networks for machine learning