| Cooperative fault-tolerant distributed computing U.S. Department of Energy Grant DE-FG02-02ER25537 Final Report | |
| Sunderam, Vaidy S. | |
| Emory University, Atlanta, GA | |
| Keywords: Productivity; Resource Management; Distributed Computing; Fault Tolerance; High Performance Computing; 99 General And Miscellaneous//Mathematics, Computing, And Information Science; Kernels | |
| DOI : 10.2172/916972 RP-ID : DOE/ER/25537-1 RP-ID : FG02-02ER25537 RP-ID : 916972 |
| United States|English | |
| Source: UNT Digital Library | |
【 Abstract 】
The Harness project has developed novel software frameworks for the execution of high-end simulations in a fault-tolerant manner on distributed resources. The H2O subsystem comprises the kernel of the Harness framework and controls the key functions of resource management across multiple administrative domains, especially issues of access and allocation. It is based on a “pluggable” architecture that enables the aggregated use of distributed heterogeneous resources for high performance computing. The major contributions of the Harness II project significantly enhance the overall computational productivity of high-end scientific applications by enabling robust, failure-resilient computations on cooperatively pooled resource collections.
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| 916972.pdf | 82KB | PDF | |