| Final report: spatio-temporal data mining of scientific trajectory data | |
| Gaffney, S ; Smyth, P | |
| Lawrence Livermore National Laboratory | |
| Keywords: Pacific Ocean; Mining; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Vortices; Trajectories | |
| DOI : 10.2172/15005339 RP-ID : UCRL-CR-142043 RP-ID : W-7405-ENG-48 RP-ID : 15005339 |
| United States | English | |
| Source: UNT Digital Library | |
【 Abstract 】
With the increasing availability of massive observational and experimental data sets (across a wide variety of scientific disciplines) there is an increasing need to provide scientists with efficient computational tools to explore such data in a systematic manner. For example, techniques such as classification and clustering are now being widely used in astronomy to categorize and organize stellar objects into groups and catalogs, which in turn provide the impetus for scientific hypothesis formation and discovery (e.g., see Fayyad, Djorgovski and Weir (1996); or Cheeseman and Stutz (1996) or Fayyad and Smyth (1999) in a more general context). Data-driven exploration of massive spatio-temporal data sets is an area where there is particular need of data mining techniques. Scientists are overwhelmed by the vast quantities of data which simulations, experiments, and observational instruments can produce. Analysis of spatio-temporal data is inherently challenging, yet most current research in data mining is focused on algorithms based on more traditional feature-vector data representations. Scientists are often not particularly interested in raw grid-level data, but rather in the phenomena and processes which are "driving" the data. In particular, they are often interested in the temporal and spatial evolution of specific "spatially local" structures of interest, e.g., birth-death processes for vortices and interfaces in fluid-flow simulations and experiments, trajectories of extra-tropical cyclones from sea-level pressure data over the Atlantic and Pacific oceans, and sunspot shape and size evolution over time from daily chromospheric images of the Sun. The ability to automatically detect, cluster, and catalog such objects in principle provides an important "data reduction front-end" which can convert 4-D data sets (3 spatial and 1 temporal dimension) on a massive grid to a much more abstract representation of local structures and their evolution.
In turn, these higher-level representations provide a general framework and basis for further scientific hypothesis generation and investigation, e.g., investigating correlations between local phenomena (such as storm paths) and global trends (such as temperature changes). In this work we focused on detecting and clustering trajectories of individual objects in massive spatio-temporal data sets. There are two primary technical problems involved. First, the local structures of interest must be detected, characterized, and extracted from the mass of overall data. Second, the evolution (in space and/or time) of these structures needs to be modeled and characterized in a systematic manner if the overall goal of producing a reduced and interpretable description of the data is to be met.
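The two problems named above (extracting local structures, then characterizing their evolution) can be illustrated with a minimal sketch. This is not the method used in the report, which the abstract does not specify; it assumes a toy 2-D grid per time step, treats strict local minima as the "structures" (loosely analogous to low-pressure centres), and links them across frames with a greedy nearest-neighbour rule. All function names and the `max_jump` threshold are hypothetical.

```python
import math

def local_minima(grid):
    """Return (row, col) of interior cells strictly lower than all 8 neighbours."""
    found = []
    for r in range(1, len(grid) - 1):
        for c in range(1, len(grid[0]) - 1):
            neighbours = [grid[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)]
            if grid[r][c] < min(neighbours):
                found.append((r, c))
    return found

def link_tracks(frames, max_jump=2.0):
    """Greedily link per-frame detections into trajectories (assumed rule)."""
    finished, active = [], []
    for detections in frames:
        unused = list(detections)
        still_active = []
        for track in active:
            # Closest unused detection to the track's last known position.
            best = min(unused, key=lambda p: math.dist(p, track[-1]), default=None)
            if best is not None and math.dist(best, track[-1]) <= max_jump:
                track.append(best)
                unused.remove(best)
                still_active.append(track)
            else:
                finished.append(track)  # track ends: no nearby detection
        # Unmatched detections start new tracks.
        still_active.extend([p] for p in unused)
        active = still_active
    return finished + active

def summarize(track):
    """Reduce a trajectory to a small feature record (length, net displacement)."""
    (r0, c0), (r1, c1) = track[0], track[-1]
    return {"length": len(track), "displacement": (r1 - r0, c1 - c0)}

def synthetic_frame(size, centre):
    """Toy field: distance from a moving centre, so the centre is the only minimum."""
    return [[math.hypot(r - centre[0], c - centre[1]) for c in range(size)]
            for r in range(size)]

# One structure drifting diagonally over four time steps.
frames = [local_minima(synthetic_frame(8, (2 + t, 2 + t))) for t in range(4)]
tracks = link_tracks(frames)
summaries = [summarize(t) for t in tracks]
```

The per-track summaries are the kind of reduced representation that a later clustering step could take as input; that step is not shown here.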
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| 15005339.pdf | 5789KB | PDF | |