The ;;big data;; revolution of the past decade has allowed researchers to procure or access biological data at an unprecedented scale, on the front of both volume (low-cost high-throughput technologies) and variety (multi-platform genomic profiling).This has fueled the development of new integrative methods, which combine and consolidate across multiple sources of data in order to gain generalizability, robustness, and a more comprehensive systems perspective.The key challenges faced by this new class of methods primarily relate to heterogeneity, whether it is across cohorts from independent studies or across the different levels of genomic regulation.While the different perspectives among data sources is invaluable in providing different snapshots of the global system, such diversity also brings forth many analytic difficulties as each source introduces a distinctive element of noise.In recent years, many styles of data integration have appeared to tackle this problem ranging from Bayesian frameworks to graphical models, a wide assortment as diverse as the biology they intend to explain.My focus in this work is dimensionality reduction-based methods of integration, which offer the advantages of efficiency in high-dimensions (an asset among genomic datasets) and simplicity in allowing for elegant mathematical extensions.In the course of these chapters I will describe the biological motivations, the methodological directions, and the applications of three canonical reductionist approaches for relating information across multiple data groups.
【 预 览 】
附件列表
Files
Size
Format
View
Integrative Analysis Methods for Biological Problems Using Data Reduction Approaches