Thesis details
Fault Tolerance and Reliability in Scientific Workflows
Fault Tolerance; Reliability; Scientific Workflows; Web Services
Mouallem, Pierre; Peter Wurman, Committee Member; Munindar Singh, Committee Member; Mladen Vouk, Committee Chair
University: North Carolina State University
Keywords: Fault Tolerance; Reliability; Scientific Workflows; Web Services
Others  :  https://repository.lib.ncsu.edu/bitstream/handle/1840.16/306/etd.pdf?sequence=1&isAllowed=y
United States | English
【 Abstract 】
The emerging technologies of web services, agents and service-oriented workflows will enable scientific projects and experiments to be conducted on a larger scale than ever before. Data used and produced in such projects and experiments are becoming increasingly complex and heterogeneous. This creates a need for a tool (or a set of tools) to efficiently design, manage and maintain problem-solving flows (scientific workflows) built from various components. The DOE Scientific Data Management (SDM) initiative aims to develop a framework that helps scientists manage data in distributed and collaborative environments. It also provides tools that help them create and manage scientific workflows that use network-based (web) services, agent technologies and semantic mediation techniques. The current SDM framework is known as SPA/Kepler and is based on Ptolemy II. One of the vulnerabilities of service-dependent workflows is that they require the web services they use to be available whenever the workflow is run. If key web services are not available, the workflow cannot finish successfully. At that point, a scientist using such a service would have to wait for it to be restored. This, of course, impacts workflow reliability and availability, and may be sufficient for an end-user to stop using workflows that use those services. The work reported here uses the SPA/Kepler framework to explore the issue of reliability of service-based scientific workflows. For example, a workflow that invokes 3 services in a series may have an unacceptably high overall failure probability. This thesis explores the issues related to improvement of the overall workflow reliability using fault tolerance. Specifically, the work focuses on failure-masking and fail-over through redundancy, in the context of individual services, rather than on provision of checkpointing and recovery.
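The remark about three services in a series can be made concrete with a small calculation. The following is only an illustrative sketch, not taken from the thesis: it assumes services fail independently, and the per-service reliability of 0.95 is a hypothetical figure chosen for the example.

```python
from math import prod


def series_reliability(reliabilities):
    """Probability that every service in a serial chain succeeds.

    Assumes independent failures, which is a simplification.
    """
    return prod(reliabilities)


# Hypothetical per-invocation reliability for each of three services.
service_reliabilities = [0.95, 0.95, 0.95]

workflow_reliability = series_reliability(service_reliabilities)
workflow_failure = 1 - workflow_reliability

print(f"workflow reliability: {workflow_reliability:.4f}")
print(f"workflow failure probability: {workflow_failure:.4f}")
```

Under these assumed numbers, a chain of three individually reliable services fails in roughly one run out of seven, because the workflow succeeds only if every service in the chain succeeds.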
Analyses show that even a relatively simple redundancy-based fault-tolerance approach, such as duplication of key services, can improve reliability by an order of magnitude or more. In the context of an actual implementation, one option is to find locations of alternative (functionally equivalent) services during workflow design, and then use that information at run-time if the primary service fails. A more practical method is to publish the list of services used by the workflow to a UDDI-type service and have a way of dynamically matching needed services with functionally equivalent ones if a fail-over is required. A prototype solution of the latter, based on a commercially available brokering service, was developed for one of the SDM pilot workflows to show its viability. It is discussed in detail.
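The effect of duplicating services, and the fail-over idea itself, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the thesis's implementation: failures are assumed independent, the reliability value 0.95 is hypothetical, and `call_with_failover`, `primary` and `backup` are invented names standing in for a service invocation and its functionally equivalent alternative. No UDDI lookup or brokering service is modelled here; the list of providers is simply given.

```python
def duplicated_reliability(reliability, copies=2):
    """Reliability of one logical service backed by `copies` equivalent
    providers: it fails only if all of them fail (independence assumed)."""
    return 1 - (1 - reliability) ** copies


def call_with_failover(providers, request):
    """Try functionally equivalent providers in order and return the
    first successful result; raise if every provider fails."""
    errors = []
    for provider in providers:
        try:
            return provider(request)
        except Exception as exc:  # broad on purpose: any failure triggers fail-over
            errors.append(exc)
    raise RuntimeError(f"all {len(errors)} providers failed")


# Hypothetical numbers: three services in series, each with reliability 0.95.
single_failure = 1 - 0.95 ** 3
duplicated_failure = 1 - duplicated_reliability(0.95) ** 3
improvement = single_failure / duplicated_failure


# Hypothetical providers: the primary is down, the backup answers.
def primary(request):
    raise ConnectionError("primary service unavailable")


def backup(request):
    return f"processed {request}"


result = call_with_failover([primary, backup], "job-1")
print(result)
print(f"failure probability: {single_failure:.4f} -> {duplicated_failure:.4f}")
```

With these assumed figures, duplicating each service lowers the workflow failure probability from about 0.14 to about 0.0075, roughly a nineteen-fold reduction, which is consistent with the order-of-magnitude claim above. In a real deployment the providers would not fail independently, so the actual gain would depend on how correlated the outages are.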
Attachments
Files Size Format View
Fault Tolerance and Reliability in Scientific Workflows 998KB PDF download