Data do not always obey the normality assumption, and outliers can have a dramatic impact on the quality of least squares methods. We use Huber's loss function to develop robust methods for time-course multivariate responses. We use a spline basis expansion of the time-varying regression coefficients to reduce dimensionality, and we downweight the influence of outliers by applying Huber's loss function to vectors of residuals.

Our research is motivated by time-course microarray experiments, which aim to better understand the transcription regulatory network by studying the relationship between gene expressions and transcription factors. The gene expressions are taken as multivariate responses in such studies.

The dissertation consists of three parts. The first part develops a robust score test for linear models by modifying the well-known Rao's score test, based on Huber's M-estimator. The test statistic is asymptotically normal, and the simulation study suggests that, in the presence of outliers, the test has higher power than the score test based on least squares.

In the second part of the dissertation, we propose a robust clustering method based on the EM algorithm applied to a modified multivariate normal density, designed to downweight outliers by Huber's loss function. We discuss practical algorithms and assess the performance of the proposed method through Monte Carlo simulations.

Variable selection has received much attention in the recent literature, and a number of methods have been developed, including the Lasso. The group Lasso is an extension of the Lasso whose goal is to select important groups of variables rather than individual variables. In the third part of the dissertation, we propose two robust group Lasso algorithms for multivariate time-course data, and illustrate the robustness properties of the proposed methods for analyzing time-course data.
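To illustrate how Huber's loss function downweights outliers, the following is a minimal sketch of Huber's M-estimation for an ordinary univariate linear model, fitted by iteratively reweighted least squares. It is not the dissertation's method, which applies the loss to vectors of residuals from multivariate responses. The function names, the tuning constant 1.345, and the MAD-based scale estimate are assumptions introduced here for illustration only.

```python
import numpy as np


def huber_loss(r, delta=1.345):
    """Huber's loss: quadratic for |r| <= delta, linear beyond it."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))


def huber_weights(r, delta=1.345):
    """IRLS weights psi(r)/r: 1 inside the threshold, delta/|r| outside."""
    a = np.abs(r)
    return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))


def huber_regression(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients (illustrative sketch).

    delta=1.345 is a conventional tuning constant, assumed here;
    the residual scale is re-estimated at each step by the normalized MAD.
    """
    # Start from the ordinary least squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale estimate: median absolute deviation / 0.6745.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        w = huber_weights(r / scale, delta)
        # Weighted least squares via square-root weights.
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta
```

Observations with large standardized residuals receive weights below 1, so a small group of gross outliers pulls the fitted coefficients far less than it would under least squares.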
Robust methods for analyzing multivariate responses with application to time-course data