Conference paper details
21st International Conference on Computing in High Energy and Nuclear Physics
Resilient FTS3 service at GridKa
Physics; Computer Science
Hartmann, T.^1 ; Bubeliene, J.^1 ; Hoeft, B.^1 ; Obholz, L.^1 ; Petzold, A.^1 ; Wisniewski, K.^1
Karlsruhe Institute of Technology (KIT), Steinbuch Centre for Computing (SCC), Hermann-von-Helmholtz-Platz 1, Eggenstein-Leopoldshafen D-76344, Germany^1
Keywords: Database access; Database clusters; Database queries; File transfers; High availability; Normal operations; Resilient systems; Service components
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/664/6/062019/pdf
DOI  :  10.1088/1742-6596/664/6/062019
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

The File Transfer Service (FTS) provides a transfer job scheduler to distribute and replicate vast amounts of data over the heterogeneous WLCG infrastructures. Compared to the channel model of previous versions, the most recent version, FTS3, simplifies the service and makes it more flexible while reducing the load on the service components. These improvements allow a single FTS3 setup to handle a higher number of transfers. Because a single installation now covers continent-wide transfers, whereas installations of the previous version handled only transfers within specific clouds, the number of dependent users has grown and a resilient system has become even more necessary. Having set up an FTS3 service at the German Tier-1 (T1) site GridKa at KIT in Karlsruhe, we present our experience in preparing a high-availability FTS3 service. To avoid single points of failure, we rely on a database cluster as a fault-tolerant data back-end and deploy the FTS3 service on its own cluster setup to provide a resilient infrastructure for the users. While the database cluster provides basic resilience for the data back-end, we ensure consistent and reliable database access at the FTS3 service level through a proxy solution. On each FTS3 node, an HAProxy instance monitors the integrity of each database node and distributes database queries over the whole cluster for load balancing during normal operations; if a database node fails, the proxy excludes it transparently to the local FTS3 service. The FTS3 service itself consists of a main and a backup instance; in case of an error, the backup takes over the identity of the main instance, i.e., its IP address, using a CTDB (Cluster Trivial Database) infrastructure, so that clients are offered a consistent service.
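
The two resilience mechanisms described in the abstract can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the GridKa configuration: in the actual setup, HAProxy and CTDB are driven by their own configuration files, which the abstract does not reproduce. The class `DatabaseProxy`, the function `service_ip_holder`, the node names, and the use of round-robin scheduling are all hypothetical choices made for illustration. The abstract states only that queries are distributed over the cluster and that a broken node is excluded.

```python
from typing import Callable, List


class DatabaseProxy:
    """Toy model of a per-node proxy that spreads queries over healthy database nodes.

    Round-robin is an assumed balancing policy; the abstract does not name one.
    """

    def __init__(self, nodes: List[str], is_healthy: Callable[[str], bool]):
        if not nodes:
            raise ValueError("at least one database node is required")
        self._nodes = list(nodes)
        self._is_healthy = is_healthy  # hypothetical health-check callback
        self._next = 0  # index of the node to try first on the next query

    def pick_node(self) -> str:
        """Return the next healthy node, skipping nodes that fail the health check."""
        for offset in range(len(self._nodes)):
            index = (self._next + offset) % len(self._nodes)
            node = self._nodes[index]
            if self._is_healthy(node):
                self._next = (index + 1) % len(self._nodes)
                return node
        raise RuntimeError("no healthy database node available")


def service_ip_holder(main_ok: bool, backup_ok: bool) -> str:
    """Decide which instance should hold the service IP (main preferred, backup on error)."""
    if main_ok:
        return "main"
    if backup_ok:
        return "backup"
    raise RuntimeError("neither the main nor the backup instance is available")


if __name__ == "__main__":
    health = {"db1": True, "db2": True, "db3": True}  # hypothetical node names
    proxy = DatabaseProxy(list(health), lambda node: health[node])
    print([proxy.pick_node() for _ in range(4)])  # normal operation: all nodes used
    health["db2"] = False                         # simulate a broken database node
    print([proxy.pick_node() for _ in range(3)])  # db2 is skipped
    print(service_ip_holder(main_ok=False, backup_ok=True))
```

In this model the caller never sees the failed node: `pick_node` simply skips it, which mirrors the exclusion of a broken database node that is transparent to the local FTS3 service. In the same way, `service_ip_holder` mirrors the backup instance taking over the service IP when the main instance reports an error.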
