Thesis Details
Query Answering over Functional Dependency Repairs
Galiullin, Artur
University of Waterloo
Keywords: Data Cleaning; Probabilistic Databases; Computer Science
Others  :  https://uwspace.uwaterloo.ca/bitstream/10012/7890/1/Artur_Galiullin.pdf
Switzerland|English
Source: UWSPACE Waterloo Institutional Repository
【 Abstract 】

Inconsistency often arises in real-world databases and, as a result, critical queries over dirty data may lead users to make ill-informed decisions. Functional dependencies (FDs) can be used to specify intended semantics of the underlying data and aid with the cleaning task. Enumerating and evaluating all the possible repairs to FD violations is infeasible, while approaches that produce a single repair or attempt to isolate the dirty portion of data are often too destructive or constraining. In this thesis, we leverage a recent advance in data cleaning that allows sampling from a well-defined space of reasonable repairs, and provide the user with a data management tool that gives uncertain query answers over this space. We propose a framework to compute probabilistic query answers as though each repair sample were a possible world. We show experimentally that queries over many possible repairs produce results that are more useful than other approaches and that our system can scale to large datasets.
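
As a rough illustration of treating each repair sample as a possible world, the sketch below runs a query over every sampled repair and reports, for each answer, the fraction of samples in which it appears. This is not the thesis's implementation: the function `probabilistic_answers`, the dictionary row format, the toy FD (zip determines city), and the assumption that all samples carry equal weight are hypothetical choices made for the example.

```python
from collections import Counter

def probabilistic_answers(repairs, query):
    """Return {answer: fraction of repair samples whose query result contains it}.

    Assumption (not from the thesis): every sampled repair is weighted equally.
    """
    if not repairs:
        return {}
    counts = Counter()
    for repair in repairs:
        # Use a set so an answer counts at most once per repair sample.
        counts.update(set(query(repair)))
    total = len(repairs)
    return {answer: count / total for answer, count in counts.items()}

# Hypothetical toy data: the FD zip -> city is violated by Bob's tuple,
# and the sampled repairs resolve the violation in two different ways.
ann = {"name": "Ann", "zip": "10001", "city": "NYC"}
bob_nyc = {"name": "Bob", "zip": "10001", "city": "NYC"}
bob_boston = {"name": "Bob", "zip": "02101", "city": "Boston"}
repairs = [
    [ann, bob_nyc],
    [ann, bob_boston],
    [ann, bob_nyc],
    [ann, bob_boston],
]

def names_in_nyc(rows):
    """Hypothetical query: names of people whose city is NYC."""
    return [row["name"] for row in rows if row["city"] == "NYC"]

answers = probabilistic_answers(repairs, names_in_nyc)
for name, probability in sorted(answers.items()):
    print(name, probability)
```

In this toy case Ann is returned by the query in every sampled repair, while Bob is returned only in the repairs that keep him in NYC, so his answer carries a lower probability than hers.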
