The proliferation of mobile devices and sensors in common use is anticipated to play a key role in building a smarter planet. The future Internet will be dominated by vast amounts of sensing data. This thesis attacks a grand challenge in this vision: how to extract human-consumable information from such data. The prime contribution of this work is a framework called FusionSuite that facilitates the development of future applications and services based on mobile sensing data. The data in this type of "wide-area" sensing application are not obtained from a controlled sensing system, but are instead collected from a dynamic and uncoordinated set of users and autonomous systems.

This thesis focuses on the research challenges that are revealed when building a general-purpose data distillation framework. Four research problems are identified and addressed: i) how to model non-linear cyber-physical systems using sparsely collected data with guarantees on the accuracy, ii) how to clean the unreliable, irrelevant, and noisy data collected by unscreened users to achieve the best performance in modeling, iii) how to share users' information without breaching their personal privacy, and iv) how to collect data from autonomous sensing systems to optimize the modeling accuracy. FusionSuite tackles all four issues in an application-independent manner to simplify wide-area sensing application development.

The first challenge is addressed by combining techniques from data mining and estimation theory, so that the advantages of error-bounded modeling meet the analysis of complex, high-dimensional data. We overcome the second challenge by creating a general abstraction of sensing data that can be handed to a machine learning and data mining construct called FactFinder. The privacy concerns are addressed by allowing users to share features that are unusable for inferring private information, yet still useful for data modeling. Finally, an optimized data collection protocol prioritizes communication in a resource-constrained multi-hop collection network so that the desired modeling accuracy is reached at minimal cost.

We demonstrate the features of FusionSuite by developing a participatory sensing application for green transportation, called GreenGPS. The service helps users save fuel by calculating the most fuel-efficient route between any given origin and destination. To predict how fuel consumption varies from road to road and vehicle to vehicle, engine performance data are collected from users. The resulting fuel consumption models enable accurate prediction and routing. FusionSuite resolves major challenges in this application that are common to many wide-area sensing services, and hence proves to be a necessary tool in building such systems. As an experimental testbed, 100 vehicles are equipped with cellphones to collect fuel efficiency data, enabling the GreenGPS service and the FusionSuite libraries to run at a small scale.
Cyber-physical data distillation in a sensor rich world