Thesis Details
Exploring New Forms of Random Projections for Prediction and Dimensionality Reduction in Big-Data Regimes
Random Projections;Dimensionality Reduction;Nonlinear Random Projections;Biologically Inspired Random Projections;Supervised Random Projections;Deep Learning;Random-weighted Neural Networks
Karimi, Amir-Hossein (Faculty of Mathematics); advisors: Ghodsi, Ali; Wong, Alexander
University of Waterloo
Keywords: Dimensionality Reduction; Master Thesis; Supervised Random Projections; Random Projections; Biologically Inspired Random Projections; Nonlinear Random Projections; Deep Learning; Random-weighted Neural Networks
Others: https://uwspace.uwaterloo.ca/bitstream/10012/13220/3/karimi_amir-hossein.pdf
Canada | English
Source: UWSPACE Waterloo Institutional Repository
PDF
【 Abstract 】

The story of this work is dimensionality reduction. Dimensionality reduction is a method that takes as input a point-set P of n points in R^d, where d is typically large, and attempts to find a lower-dimensional representation of that dataset, in order to ease the burden of processing for downstream algorithms. In today's landscape of machine learning, researchers and practitioners work with datasets that have a very large number of samples and/or include high-dimensional samples. Therefore, dimensionality reduction is applied as a pre-processing technique, primarily to overcome the curse of dimensionality.

Generally, dimensionality reduction improves the time and storage space required for processing the point-set, removes multi-collinearity and redundancies in the dataset where different features may depend on one another, and may enable simple visualizations of the dataset in 2-D and 3-D, making the relationships in the data easy for humans to comprehend. Dimensionality reduction methods come in many shapes and sizes. Methods such as Principal Component Analysis (PCA), Multi-dimensional Scaling, IsoMap, and Locally Linear Embedding are amongst the most commonly used methods of this family of algorithms. However, the choice of dimensionality reduction method proves critical in many applications, as there is no one-size-fits-all solution, and special care must be taken for different datasets and tasks. Furthermore, the aforementioned popular methods are data-dependent, and commonly rely on computing either the Kernel / Gram matrix or the covariance matrix of the dataset.
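To make the data dependence concrete, the covariance-based route that PCA takes can be sketched as follows (a minimal NumPy sketch for illustration only, not the thesis's code; the `pca` function name is ours). Note the explicit d x d covariance matrix, which is exactly the object that grows with data dimensionality:

```python
import numpy as np

def pca(X, k):
    """Project X (n x d) onto its top-k principal components.

    Data-dependent: requires forming the d x d covariance matrix
    of the dataset before any point can be embedded.
    """
    Xc = X - X.mean(axis=0)              # center the data
    cov = (Xc.T @ Xc) / (len(X) - 1)     # d x d covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :k]           # top-k eigenvectors as columns
    return Xc @ top                      # n x k embedding

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Z = pca(X, 2)
print(Z.shape)                           # a 2-D embedding of 200 points
```

The eigendecomposition of the d x d covariance (or, for kernel methods, the n x n Gram matrix) is what makes these methods costly in big-data regimes, motivating the data-independent alternative below.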
These matrices scale with increasing number of samples and increasing number of data dimensions, respectively, and are consequently poor choices in today's landscape of big-data applications. Therefore, it is pertinent to develop new dimensionality reduction methods that can be efficiently applied to large and high-dimensional datasets, by either reducing the dependency on the data or side-stepping it altogether. Furthermore, such new dimensionality reduction methods should be able to perform on par with, or better than, traditional methods such as PCA. To achieve this goal, we turn to a simple and powerful method called random projections.

Random projections are a simple, efficient, and data-independent method for stably embedding a point-set P of n points in R^d into R^k, where d is typically large and k is on the order of log n. Random projections have a long history of use in the dimensionality reduction literature with great success. In this work, we build on the ideas of random projection theory, extend the framework, and construct a powerful new setup of random projections for large high-dimensional datasets, with performance comparable to state-of-the-art data-dependent and nonlinear methods. Furthermore, we study the use of random projections in domains other than dimensionality reduction, including prediction, and show the competitive performance of such methods in small-dataset regimes.
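The stable-embedding claim above can be illustrated with a short sketch (ours, not the thesis's code): a Gaussian random matrix, drawn without looking at the data, maps n points from R^d to R^k with k on the order of log n, and pairwise distances survive up to a modest distortion, as the Johnson-Lindenstrauss lemma predicts. The constant 8 in the choice of k is an illustrative assumption:

```python
import numpy as np

def random_projection(X, k, rng):
    """Embed rows of X into R^k with a Gaussian random matrix.

    Data-independent: R is drawn without looking at X, and the
    1/sqrt(k) scaling preserves squared distances in expectation.
    """
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
    return X @ R

def pairwise_distances(A):
    """All pairwise Euclidean distances between rows of A."""
    sq = (A ** 2).sum(axis=1)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * A @ A.T, 0))

rng = np.random.default_rng(1)
n, d = 100, 10_000
X = rng.normal(size=(n, d))        # n points in high-dimensional R^d
k = int(8 * np.log(n))             # target dimension on the order of log n

Z = random_projection(X, k, rng)

# Ratio of embedded to original distance for every pair of points:
# close to 1.0 means the embedding is stable.
iu = np.triu_indices(n, 1)
ratios = pairwise_distances(Z)[iu] / pairwise_distances(X)[iu]
print(Z.shape, float(ratios.min()), float(ratios.max()))
```

Unlike the PCA sketch, no covariance or Gram matrix is ever formed; the cost of embedding is a single matrix multiply, which is what makes the method attractive for large and high-dimensional datasets.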

【 Preview 】
Attachments
Files Size Format View
Exploring New Forms of Random Projections for Prediction and Dimensionality Reduction in Big-Data Regimes 2593KB PDF download
Document metrics
Downloads: 23    Views: 29