Technical Report Details
Data Foundry: Data Warehousing and Integration for Scientific Data Management
Musick, R. ; Critchlow, T. ; Ganesh, M. ; Fidelis, Z. ; Zemla, A. ; Slezak, T.
Lawrence Livermore National Laboratory
Keywords: Data Base Management; Management; Business; Data; 99 General And Miscellaneous//Mathematics, Computing, And Information Science
DOI: 10.2172/793555
RP-ID: UCRL-ID-127593
RP-ID: W-7405-Eng-48
RP-ID: 793555
United States | English
Source: UNT Digital Library
【 Abstract 】

Data warehousing is an approach for managing data from multiple sources by representing them with a single, coherent point of view. Commercial data warehousing products have been produced by companies such as Red Brick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Other companies have chosen to develop their own in-house data warehousing solutions using relational databases, such as those sold by Oracle, IBM, Informix, and Sybase. The typical approaches include federated systems and mediated data warehouses, each of which, to some extent, makes use of a series of source-specific wrapper and mediator layers to integrate the data into a consistent format, which is then presented to users as a single virtual data store. These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, relatively little work is required to maintain that connection. However, that is not the case for all data sources. Data sources from scientific domains tend to change their data model, format, and interface regularly. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse interfaces to properly read, interpret, and represent the modified data source. Furthermore, the data that scientists require to carry out research changes continuously as their understanding of a research question develops or as their research objectives evolve. The difficulty and cost of these updates effectively limit the number of sources that can be integrated into a single data warehouse, or make an approach based on warehousing too expensive to consider.
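
The wrapper and mediator layering described in the abstract can be illustrated with a minimal Python sketch. This is not taken from the report: the two sources, their field names (`name`, `len_kb`), the common record shape (`id`, `length`), and the `Mediator` class are all hypothetical, chosen only to show why a change in a source's format forces a change in its wrapper.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical common format: {"id": ..., "length": ...}


def wrap_source_a(rows: List[tuple]) -> List[Record]:
    """Wrapper for hypothetical source A, which yields (id, length) tuples."""
    return [{"id": rid, "length": length} for rid, length in rows]


def wrap_source_b(rows: List[dict]) -> List[Record]:
    """Wrapper for hypothetical source B, which uses other field names and
    reports length in thousands, so the wrapper must rename and rescale."""
    return [{"id": r["name"], "length": round(r["len_kb"] * 1000)} for r in rows]


class Mediator:
    """Presents several wrapped sources as one virtual data store."""

    def __init__(self) -> None:
        self._sources: List[tuple] = []

    def register(self, wrapper: Callable, fetch: Callable) -> None:
        # Each source is a (wrapper, fetch) pair; fetch returns raw rows.
        self._sources.append((wrapper, fetch))

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        # Pull from every source, normalize through its wrapper, then filter.
        results: List[Record] = []
        for wrapper, fetch in self._sources:
            results.extend(r for r in wrapper(fetch()) if predicate(r))
        return results


mediator = Mediator()
mediator.register(wrap_source_a, lambda: [("a1", 500), ("a2", 1500)])
mediator.register(wrap_source_b, lambda: [{"name": "b1", "len_kb": 2.5}])
long_records = mediator.query(lambda r: r["length"] > 1000)
```

In this sketch, if source B renamed `len_kb` or changed its units, only `wrap_source_b` would need editing, but it would need editing every time. That recurring maintenance is the cost the abstract attributes to frequently changing scientific sources.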
