Machine learning has been a source of continuous methodological advances in the field of computational learning from data. Systems biology has profited from machine learning techniques in various ways, but in particular from network inference, i.e. the learning of interactions given observed quantities of the involved components or data that stem from interventional experiments. Originally this domain of systems biology was confined to the inference of gene regulation networks, but it has recently expanded to other levels of organization of biological and ecological systems. The application to species interaction networks in a varying environment in particular is of mounting importance for improving our understanding of the dynamics of species extinctions, invasions, and population behaviour in general.

The aim of this thesis is to present an extensive study of various state-of-the-art machine learning techniques applied to a genetic regulation system in plants, and to expand and modify some of these methods to infer species interaction networks in an ecological setting. The first study attempts to improve the knowledge about circadian regulation in the plant Arabidopsis thaliana from the viewpoint of machine learning. It gives suggestions on which methods are best suited for inference, how the data should be processed and modelled mathematically, and what quality of network learning can be expected by doing so. To achieve this, I generate a rich and realistic synthetic dataset that is used for various studies under consideration of different effects and method setups. The best method and setup are applied to real transcriptional data, which leads to a new hypothesis about the circadian clock network structure.

The ecological study is focused on the development of two novel inference methods that exploit a common principle from transcriptional time series, which states that expression profiles over time can be temporally heterogeneous. The corresponding concept in a 2-dimensional spatial domain is that species interaction dynamics can be spatially heterogeneous, i.e. can change in space depending on the environment and other factors. I will demonstrate the expansion from the 1-dimensional time domain to the 2-dimensional spatial domain, introduce two distinct space segmentation schemes, and consider species dispersion effects with spatial autocorrelation. The two novel methods display a significant improvement in species interaction inference compared to competing methods and learn the spatial structure of different species neighbourhoods or environments with high confidence.
Machine learning in systems biology at different scales: from molecular biology to ecology