Modern measurement systems monitor a growing number of variables at low cost. When characterizing the observed measurements, budget limitations usually constrain the number n of samples that one can acquire, leading to situations where the number p of variables is much larger than n. In this regime, classical statistical methods, founded on the assumption that n is large and p is fixed, fail both in theory and in practice. A successful approach to overcoming this problem is to assume a parsimonious generative model characterized by a number k of parameters, where k is much smaller than p.

In this dissertation we develop algorithms to fit low-dimensional generative models and extract relevant information from high-dimensional, multivariate signals. First, we define extensions of the well-known Scalar Shrinkage-Thresholding Operator, which we name the Multidimensional and Generalized Shrinkage-Thresholding Operators, and show that these extensions arise in numerous algorithms for structured-sparse linear and non-linear regression. Using convex optimization techniques, we show that these operators, defined as the solutions to a class of convex, non-differentiable optimization problems, admit an equivalent convex, low-dimensional reformulation. Our equivalence results shed light on the behavior of a general class of penalties that includes classical sparsity-inducing penalties such as the LASSO and the Group LASSO. In addition, our reformulation leads in some cases to new, efficient algorithms for a variety of high-dimensional penalized estimation problems.

Second, we introduce two new classes of low-dimensional factor models that account for temporal shifts commonly occurring in multivariate signals. Our first contribution, called Order Preserving Factor Analysis, can be seen as an extension of the non-negative, sparse matrix factorization model that allows for order-preserving temporal translations in the data.
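For readers unfamiliar with the operator the abstract builds on, the following is a minimal sketch of the classical scalar shrinkage-thresholding (soft-thresholding) operator and its standard block analogue, the proximal operators of the LASSO and Group LASSO penalties. This is textbook background, not the dissertation's Multidimensional or Generalized extensions; the function names are ours.

```python
import numpy as np

def soft_threshold(x, lam):
    """Scalar shrinkage-thresholding: S_lam(x) = sign(x) * max(|x| - lam, 0).

    Applied entrywise, it is the proximal operator of the l1 (LASSO)
    penalty: entries with magnitude below lam are set exactly to zero,
    which is how the penalty induces sparsity.
    """
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def group_soft_threshold(x, lam):
    """Block (multidimensional) analogue: shrink the whole vector toward
    zero along its own direction, the proximal operator of the l2-norm
    penalty used by the Group LASSO. The entire group is zeroed when
    its norm falls below lam.
    """
    nrm = np.linalg.norm(x)
    return np.maximum(1.0 - lam / nrm, 0.0) * x if nrm > 0 else x

# Example: entries with |x| <= 1 vanish; the others move toward zero by 1.
print(soft_threshold(np.array([-3.0, -0.5, 0.2, 1.5]), 1.0))
# Example: the group with norm 5 is scaled by max(1 - 2.5/5, 0) = 0.5.
print(group_soft_threshold(np.array([3.0, 4.0]), 2.5))
```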
We develop an efficient descent algorithm to fit this model using techniques from convex and non-convex optimization. Our second contribution extends Principal Component Analysis to the analysis of observations suffering from circular shifts; we call it Misaligned Principal Component Analysis. We quantify the effect of the misalignments on the spectrum of the sample covariance matrix in the high-dimensional regime and develop simple algorithms to jointly estimate the principal components and the misalignment parameters.
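The misalignment setting can be illustrated with a toy example: observations that are circularly shifted, noisy copies of a single component. The sketch below uses a simple two-stage heuristic (FFT-based circular cross-correlation to estimate shifts, then an SVD on the realigned data); it is an assumption-laden stand-in for intuition only, not the joint estimation algorithms developed in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 64, 50
template = np.sin(2 * np.pi * np.arange(p) / p)    # hidden component
shifts = rng.integers(0, p, size=n)                # unknown misalignments

# Each observation is a circularly shifted, noisy copy of the component.
X = np.stack([np.roll(template, s) for s in shifts])
X += 0.1 * rng.standard_normal((n, p))

# Estimate each shift by circular cross-correlation with a reference
# observation (computed in O(p log p) via the FFT), then undo it.
ref_fft = np.fft.fft(X[0])
est = [int(np.argmax(np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * ref_fft))))
       for x in X]
aligned = np.stack([np.roll(x, k) for x, k in zip(X, est)])

# The leading right singular vector of the realigned data recovers the
# component, up to sign and a common circular shift.
u = np.linalg.svd(aligned, full_matrices=False)[2][0]
```

Without the alignment step, the shifts smear the component's energy across many eigenvectors of the sample covariance matrix, which is the spectral effect the dissertation quantifies in the high-dimensional regime.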
Learning from High-Dimensional Multivariate Signals.