This dissertation studies the schema matching problem: finding semantic correspondences (called matches) between disparate data sources. Examples of semantic matches include “location = address” and “name = concat(first-name, last-name).” Schema matching is one of the key challenges for many data sharing and exchange applications. Prime examples of such applications arise in numerous contexts, including data warehousing, scientific collaboration, e-commerce, bioinformatics, and data integration on the World Wide Web. Despite significant progress, many challenges remain. These include discovering complex matches (a prevalent problem in practice), tuning a matching system, and deploying a matching system effectively in an application.

In this dissertation, we develop solutions for the three challenges mentioned above. First, we develop a system that discovers both one-to-one and complex matches and provides a novel explanation facility that helps users analyze matches. Next, we develop a framework that automatically tunes multi-component matching systems by synthesizing a collection of matching scenarios. Finally, we show that, in certain applications, we can efficiently exploit discovered semantic matches without extra user effort.
Developing, tuning, and using schema matching systems