We discuss an object-based, multi-paradigm approach to the development of large-scale, high-performance parallel applications. Our approach is characterized by three essential ingredients: (i) Plurality, i.e. a programmer decomposes her application into a number of smaller, relatively independent modules. Each of these modules is written in the language/framework that allows for its most compact and elegant expression; (ii) Specialization of languages, i.e. each language is specialized for the expression of a particular and important subclass of parallel programs; and (iii) Interoperability between paradigms, which is to say that modules written in different languages and frameworks can actively interoperate. We believe that language specialization engenders productivity, since it allows programmers to employ higher-level abstractions that are closely attuned to the semantics of an intended domain. Specialization also affords performance benefits, since the designers of the runtime system can make assumptions about the dynamic behaviors of programs, and optimize for these behaviors. Finally, interoperability is a core requirement of the system, and allows it to achieve completeness of expression simultaneously with high-level specification in abstract notations. As a proof of concept, we develop three specialized programming languages. Each one addresses an important subclass of computational patterns encountered in scientific and engineering applications. The first of these, Charisma, captures parallel programs with fixed communication patterns that can be determined by static analysis. The second, DivCon, allows the succinct expression of divide-and-conquer applications, especially those that exhibit generative recursion on distributed collections of data elements.
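As a generic illustration of the divide-and-conquer pattern that DivCon targets, consider the following sketch. This is plain sequential Python, not DivCon syntax; the function name and grain-size parameter are illustrative choices, and in a parallel runtime the two subproblems could execute concurrently on different objects.

```python
# Generic divide-and-conquer sketch (plain Python, not DivCon syntax):
# a problem is recursively split until a grain-size threshold is reached,
# solved directly at the leaves, and partial results are combined on the
# way back up the recursion tree.
def dac_sum(data, grain=4):
    # Base case: small enough to solve directly.
    if len(data) <= grain:
        return sum(data)
    # Divide: split the collection into two independent subproblems.
    mid = len(data) // 2
    left = dac_sum(data[:mid], grain)
    right = dac_sum(data[mid:], grain)
    # Conquer: combine the partial results.
    return left + right

print(dac_sum(list(range(100))))  # 4950
```

The grain size controls where recursion stops; in a distributed setting it would bound the size of the leaf tasks assigned to individual objects.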
The third, Distree, is a flexible framework for the expression of iterative, tree-based algorithms. We assume the presence of a common programming substrate on top of which translated specifications of these languages execute. In our work, we utilize the Charm++ adaptive runtime system (ARTS) for this purpose. The Charm++ ARTS is based on a coarse-grained, message-driven actor model. There are typically tens of such coarse-grained actors per processing element. In the terminology of Charm++, these actors are simply called objects. The co-location of multiple objects enables run-time optimizations such as automated overlap of computation on one object with the communication latency of another, and migration-based dynamic load balancing. This leads to good performance of our translated codes. The runtime system provides an object-based, message-driven substrate through which our specialized languages can actively interoperate. We shall see that the models of computation provided by Charisma and Distree are well aligned with this object-based substrate. Even though the DivCon language provides a mixture of imperative and functional semantics, it is ultimately translated into the interactions of coarse-grained, message-driven objects. This means that our three mini-languages can interoperate with each other, and also with Charm++. In addition, the message-driven nature of Charm++ allows the implicit transfer of control and data between modules. Multi-module programs are therefore automatically interleaved based on the availability of data. In fact, idle time in one module is automatically overlapped with useful work in another. Therefore, interoperability across different views of data and control is not only possible, but also efficient. We believe that this combination of abstract specification, interoperation between modules, and an object-based view of parallelism backed by an adaptive runtime system affords our approach significant productivity and performance benefits.
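A minimal sketch of message-driven scheduling may help fix the idea (this is plain Python modeling the concept, not the Charm++ API; the class and method names are illustrative): each object makes progress only when a message for it is available, so independent modules interleave automatically and waiting in one is overlapped with work in another.

```python
from collections import deque

class Obj:
    """A coarse-grained, message-driven object: it runs only when the
    scheduler delivers it a message (illustrative, not the Charm++ API)."""
    def __init__(self, name):
        self.name = name
        self.log = []

    def recv(self, msg, sched):
        # A handler may send further messages via sched; here we just
        # record what was delivered.
        self.log.append(msg)

class Scheduler:
    """A single queue of (object, message) pairs; whichever object has a
    message ready runs next, interleaving independent modules."""
    def __init__(self):
        self.queue = deque()
        self.order = []

    def send(self, obj, msg):
        self.queue.append((obj, msg))

    def run(self):
        while self.queue:
            obj, msg = self.queue.popleft()
            self.order.append((obj.name, msg))
            obj.recv(msg, self)

sched = Scheduler()
a, b = Obj("A"), Obj("B")
# Messages from two independent modules interleave by availability:
sched.send(a, "step1"); sched.send(b, "step1"); sched.send(a, "step2")
sched.run()
print([name for name, _ in sched.order])  # ['A', 'B', 'A']
```

Because execution order is driven by message availability rather than a fixed program order, an object blocked on remote data simply yields the processor to whichever co-located object has work ready.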
We substantiate our claims through discussions along the following themes. (i) We present the syntactic and semantic constructs of our specialized languages. We demonstrate their simplicity, and the semantic consonance between the constructs provided by each language and the characteristics of programs that fall within its range of expression. (ii) We consider the expression of several common and important examples of HPC applications in these specialized languages. As we shall see, the specifications of these applications in our specialized languages are succinct and abstract away details such as the schedule of computation. (iii) We provide performance comparisons between hand-tuned codes and their counterparts written in the high-productivity programming systems. (iv) Finally, we identify and overcome the challenges in enabling interoperability between modules expressed in different paradigms. Support for interoperability allows the composition of large parallel applications from productively expressed modules, without sacrificing performance. We will demonstrate this through a Barnes-Hut application that is composed from pieces of code written in Charisma, DivCon, Distree and Charm++.
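As a generic illustration of the iterative, tree-based pattern that Distree targets and that underlies Barnes-Hut (plain Python in one dimension, not Distree syntax; all names are illustrative), bodies are organized into a spatial tree whose internal nodes aggregate per-subtree summaries, here total mass and center of mass, which a force-computation phase would later consult.

```python
# Generic tree-based aggregation sketch (plain Python, not Distree syntax):
# 1-D bodies are split recursively into a binary tree, and each node
# records the total mass and center of mass of its subtree -- the kind
# of summary data a Barnes-Hut traversal uses to approximate far-field
# interactions.
class Node:
    def __init__(self, bodies):
        # bodies: non-empty list of (position, mass) pairs.
        self.mass = sum(m for _, m in bodies)
        self.com = sum(x * m for x, m in bodies) / self.mass
        if len(bodies) > 1:
            # Internal node: split the spatially sorted bodies in half.
            bodies = sorted(bodies)
            mid = len(bodies) // 2
            self.children = [Node(bodies[:mid]), Node(bodies[mid:])]
        else:
            # Leaf node: a single body.
            self.children = []

bodies = [(0.0, 1.0), (2.0, 1.0), (4.0, 2.0)]  # (position, mass) pairs
root = Node(bodies)
print(root.mass, root.com)  # 4.0 2.5
```

In a distributed setting, subtrees would live on different objects and the same bottom-up aggregation would proceed through messages between child and parent nodes.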
Incompleteness + interoperability: a multi-paradigm approach to parallel programming for science and engineering applications