Fairbanks, James Paul ; Bader, David A. Computational Science and Engineering Vuduc, Richard Park, Haesun Chau, Duen Horng (Polo) Randall, Dana ; Bader, David A.
Graph analysis uses graph data collected on a physical, biological, or socialphenomena to shed light on the underlying dynamics and behavior of the agents in that system. Many fields contribute to this topic including graph theory, algorithms, statistics, machine learning, and linear algebra. This dissertation advances a novel framework for dynamic graph analysis that combines numerical, statistical, and streaming algorithms to provide deepunderstanding into evolving networks. For example, one can be interested in the changing influence structure over time. These disparate techniques eachcontribute a fragment to understanding the graph; however, their combinationallows us to understand dynamic behavior and graph structure. Spectral partitioning methods rely on eigenvectors for solving data analysisproblems such as clustering. Eigenvectors of large sparse systems must be approximated with iterative methods. This dissertation analyzes how data analysis accuracy depends on the numerical accuracy of the eigensolver. This leads to new bounds on the residual tolerance necessary to guarantee correct partitioning. We present a novel stopping criterion for spectral partitioning guaranteed to satisfy the Cheeger inequality along with an empirical study of the performance on real world networks such as web, social, and e-commerce networks. This work bridges the gap between numerical analysis and computational data analysis.
【 预 览 】
附件列表
Files
Size
Format
View
Graph analysis combining numerical, statistical, and streaming techniques