In this thesis, I develop different techniques for the pattern extraction and visual exploration of a collection of data matrices. Specifically, I present methods to help home in on and visualize an underlying structure and its evolution over ordered (e.g., time) or unordered (e.g., experimental conditions) index sets. The first part of the thesis introduces a biclustering technique for such three-dimensional data arrays. This technique is capable of discovering potentially overlapping groups of samples and variables that evolve similarly with respect to a subset of conditions. To facilitate and enhance visual exploration, I introduce a framework that utilizes kernel smoothing to guide the estimation of bicluster responses over the array. In the second part of the thesis, I introduce two matrix factorization models. The first is a data integration model that decomposes the data into two factors: a basis common to all data matrices, and a coefficient matrix that varies for each data matrix. The second model is meant for visual clustering of nodes in dynamic network data, which often contains complex evolving structure. Hence, this approach is more flexible and additionally lets the basis evolve for each matrix in the array. Both models utilize regularization within the framework of non-negative matrix factorization to encourage local smoothness of the basis and coefficient matrices, which improves interpretability and highlights the structural patterns underlying the data, while mitigating noise effects. I also address computational aspects of applying regularized non-negative matrix factorization models to large data arrays by presenting multiple algorithms, including an approximation algorithm based on alternating least squares.
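To make the first factorization model concrete, the following is a minimal sketch of a shared-basis non-negative factorization, not the algorithm developed in the thesis. It assumes a data array `X` of shape `(T, n, p)`, approximates each slice as `X[t] ≈ W @ H[t]` with one common basis `W`, and encourages smoothness with a squared-difference penalty between consecutive coefficient matrices. The function names, the penalty form, the weight `lam`, and the use of multiplicative updates (rather than alternating least squares) are all illustrative assumptions.

```python
import numpy as np


def objective(X, W, H, lam):
    """Reconstruction error plus an assumed smoothness penalty on consecutive H[t]."""
    fit = sum(np.sum((X[t] - W @ H[t]) ** 2) for t in range(len(X)))
    smooth = sum(np.sum((H[t] - H[t - 1]) ** 2) for t in range(1, len(X)))
    return fit + lam * smooth


def shared_basis_nmf(X, rank, lam=0.1, n_iter=200, seed=0, eps=1e-9):
    """Hypothetical sketch: X[t] ~ W @ H[t] with a basis W shared by all slices.

    X has shape (T, n, p) and must be non-negative.
    Returns W of shape (n, rank) and H of shape (T, rank, p).
    """
    rng = np.random.default_rng(seed)
    T, n, p = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((T, rank, p)) + eps
    for _ in range(n_iter):
        # Multiplicative update for the common basis, pooling all slices.
        num = sum(X[t] @ H[t].T for t in range(T))
        den = W @ sum(H[t] @ H[t].T for t in range(T)) + eps
        W *= num / den
        WtW = W.T @ W
        # Update each coefficient matrix in turn; the penalty pulls H[t]
        # toward its neighbours H[t-1] and H[t+1] where they exist.
        for t in range(T):
            nbrs = [H[s] for s in (t - 1, t + 1) if 0 <= s < T]
            num = W.T @ X[t] + lam * sum(nbrs)
            den = WtW @ H[t] + lam * len(nbrs) * H[t] + eps
            H[t] *= num / den
    return W, H


# Usage on synthetic data: a fixed basis with slowly drifting coefficients.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
H_seq = [rng.random((3, 15))]
for _ in range(5):
    H_seq.append(np.abs(H_seq[-1] + 0.05 * rng.standard_normal((3, 15))))
X_demo = np.stack([W_true @ H_t for H_t in H_seq])
W_hat, H_hat = shared_basis_nmf(X_demo, rank=3, lam=0.1)
```

Because the updates only multiply by non-negative ratios, `W_hat` and `H_hat` stay non-negative throughout. Letting the basis vary with `t` as well, as the abstract describes for the dynamic network model, would require an additional smoothness term on the basis and is not shown here.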
Statistical Techniques for Exploratory Analysis of Structured Three-Way and Dynamic Network Data.