Zakrzewska, Anita N.; Bader, David A.
Computational Science and Engineering
Catalyurek, Umit; Dilkina, Bistra; Dovrolis, Constantine; Aluru, Srinivas; Riedy, Jason; Bader, David A.
Graph analysis can be used to study streaming data from a variety of sources, such as social networks, financial transactions, and online communication. The analysis of streaming data poses many challenges, including dealing with the high volume of data and the speed with which it is generated. This dissertation addresses challenges that occur throughout the graph analysis process. Because many datasets are large and growing, it may be infeasible to collect and build a graph from all the data that has been generated. This work addresses the challenges created by large volumes of streaming data through new sampling techniques. The algorithms presented can sample a subgraph in a single pass over an edge stream and are therefore appropriate for streaming applications. A sampling algorithm that can produce a temporally biased subgraph is also presented. Before graph analysis techniques can be applied, a graph must first be created from the data collected. When creating dynamic graphs, it is not obvious how to de-emphasize old information, especially when edges are derived from interactions. This work evaluates several methods of aging old data to create dynamic graphs. This dissertation also contributes new techniques for dynamic community detection and analysis. A new algorithm for local community detection on dynamic graphs is presented. Because it incrementally updates results when the graph changes, the method is suitable for streaming data. The creation of dynamic graphs allows us to study community changes over time. This work addresses the topic of community analysis with a vertex-level measure of community change. Together, these contributions advance the study of streaming relational data through graph analysis.
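To make the idea of sampling a subgraph in a single pass over an edge stream concrete, the sketch below uses standard uniform reservoir sampling over edges. This is not the dissertation's algorithm, which the abstract does not specify. It is a generic, well-known technique shown only for illustration, and it has no temporal bias. The function name `reservoir_sample_edges`, its parameters, and the edge representation as vertex pairs are all assumptions made for this example.

```python
import random
from typing import Hashable, Iterable, List, Optional, Tuple

# Hypothetical edge representation: a pair of vertex identifiers.
Edge = Tuple[Hashable, Hashable]


def reservoir_sample_edges(
    stream: Iterable[Edge], k: int, seed: Optional[int] = None
) -> List[Edge]:
    """Keep a uniform random sample of at most k edges from an edge stream.

    The stream is read exactly once and only k edges are held in memory,
    so the total number of edges does not need to be known in advance.
    """
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for i, edge in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k edges.
            reservoir.append(edge)
        else:
            # Draw a slot uniformly from 0..i (inclusive); the new edge
            # replaces a stored edge only if the slot falls in the reservoir.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = edge
    return reservoir


if __name__ == "__main__":
    # Illustrative stream: a path graph on 1001 vertices, read as a generator.
    edge_stream = ((v, v + 1) for v in range(1000))
    sample = reservoir_sample_edges(edge_stream, k=10, seed=0)
    print(len(sample), "edges sampled")
```

The sampled subgraph is the set of retained edges together with their endpoints. A temporally biased variant would change the replacement rule so that recent edges are more likely to be kept, but the details of such a rule are left open here because the abstract does not describe them.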