• Directive-based language extensions: tools like OpenMP [1], OpenACC [2], and OpenStream [3] decorate a host
language, such as C++ or Fortran, with annotations. In OpenMP and its derivatives, the objective is
that these “pragmas” can be ignored to yield purely sequential code with the same semantics. The
directive language is separate from the host language. Directives are similar to annotations in Java
and C#, which are user-extensible and often used to drive aspect-oriented transformation tools. Both
directives and annotations share problems with integration with the host language. Directive-based
tools like OpenMP suffer from compositionality issues, for example when calling a parallel function from
within a parallel loop.
• Global-view vs. Local-view Languages: Global-view languages are those in which data structures, such
as multidimensional arrays, are declared and accessed in terms of their global problem size and indices,
as in shared-memory programming. In contrast, local-view languages are those in which such data
structures are accessed in terms of local indices and node IDs.
• Multiresolution Language Philosophy: This is a concept in which programmers can move from language
features that are more declarative, abstract, and higher-level to those that are more imperative, control-
oriented, and low-level, as required by their algorithm or performance goals. The goal of this approach
is to support higher-level abstractions for convenience and productivity without removing the fine-
grained control that HPC programmers often require in practice. Ideally, the high-level features are
implemented in terms of the lower-level ones in a way that permits programmers to supply their own
implementations. Such an approach supports a separation of roles in which computational scientists
can write algorithms at high levels while parallel computing experts can tune the mappings of those
algorithms to the hardware platform(s) in distinct portions of the program text.
5.1 Key Points
During the PADAL workshop, we identified the following key points to be considered when designing languages
that address data locality.
• Communication and locality should be clearly evident in the source code, so that programmers have a
clear model of data movement and its associated costs. At the same time, the programming language
should make it easy to port flat-memory code to locality-aware code, or to write code that can execute
efficiently on both local and remote data. One mechanism to accomplish this is to encode locality in
the type system, so that modifying the locality characteristics of a piece of code only requires changing
type declarations. In languages that support generic programming, this also enables a programmer to
write the same code for both local and remote data, with the compiler producing efficient translations
for both cases.
• In addition to providing primitives for moving data to where computation is located, a programming
language should also enable a user to move computation to where data are located. This is particularly
important for irregular applications in which the distribution of data is not known until runtime or
changes over the course of the computation. For large data sets, code movement is likely to be
significantly cheaper than moving data.
• A program should not require rewriting when moving to a different machine architecture. Instead, the
language should provide a machine model that does not have to be hard-coded into an application.
In particular, the machine model should be represented separately from user code, using a runtime
data structure. The language should either automatically map user code to the machine structure at
compile or launch time or provide the user with mechanisms for adapting to the machine structure
during execution.
• A unified machine model should be provided that encompasses all elements of a parallel program,
including placement of execution, load balancing, data distribution, and resilience.
[1] http://openmp.org/
[2] http://www.openacc-standard.org/
[3] http://openstream.info/
Programming Abstractions for Data Locality
25
• Seamless composition of algorithms and libraries should be supported by the language; composition
should not require a code rewrite. The machine model can facilitate composition by allowing a subset
of the machine structure to be provided to an algorithm or library.
• The language should provide features at multiple levels of abstraction, following the multiresolution
design philosophy. For example, it may provide data-parallel operations over distributed data struc-
tures, with the compiler and runtime responsible for scheduling and balancing the computation. At
the same time, the language might also allow explicit operations over the local portions of the data
structure. Such a language would be a combination of global and local view, providing default global-
view declarations and operations while also allowing the user to build and access data structures in a
local-view manner.
• Higher-level features in the language and runtime should be built on top of the same lower-level
features that the user has access to. This enables a user to replace the built-in, default operations with
customized mechanisms that are more suitable to the user’s application. The compiler and runtime
should perform optimizations at multiple levels of abstraction, enabling such custom implementations
to reap the advantages of lower-level optimizations.
5.2 State of the Art
HPF and ZPL are two languages from the 1990s that support high-level locality specifications through the
distribution of multidimensional arrays and index sets to rectilinear views of the target processors. Both
can be considered global-view languages, and as a result all communication was managed by the compiler
and runtime. A key distinction between the languages was that all communication in ZPL was syntactically
evident, while in HPF it was invisible. While ZPL’s approach made locality simpler for a programmer
to reason about, it also required code to be rewritten whenever a local/non-distributed data structure or
algorithm was converted to a distributed one. HPF’s lack of syntactic communication cues saved it from this
problem, but it fell afoul of others: it did not provide a clear semantic model for how locality would be
implemented for a given program, requiring programmers to wrestle with a compiler to optimize for locality,
and then to rewrite their code when moving to a second compiler that took a different approach.
As we consider current and next-generation architectures, we can expect the locality model for a compute
node to differ from one vendor or machine generation to the next. For this reason, the ZPL and HPF
approaches are not viable as-is. Instead, we advocate pursuing languages that make communication
syntactically invisible (to avoid ZPL’s pitfall) while supporting a strong semantic model as a contract between
the compiler and the programmer (to avoid HPF’s). Ideally, this model would be reinforced by execution-time
queries to support introspection about the placement of data and tasks on the target architecture.
Chapel is an emerging language that takes this prescribed approach, using a first-class language-level
feature, the locale, to represent regions of locality in the target architecture. Programmers can reason about
the placement of data and tasks on the target architecture using Chapel’s semantic model, or via runtime
queries. Chapel follows the Partitioned Global Address Space (PGAS) philosophy, supporting direct access
to variables stored on remote locales based on traditional lexical scoping rules. Chapel also follows the
multiresolution philosophy by supporting low-level mechanisms for placing data or tasks on specific locales,
as well as high-level mechanisms for mapping global-view data structures or parallel loops to the locales.
Advanced users may implement these data distributions and loop decompositions within Chapel itself, and
can even define the model used to describe a machine’s architecture in terms of locales.
X10 [22] is another PGAS language that uses places as analogues to Chapel’s locales. In X10, execution
must be colocated with data. Operating on remote data requires spawning a task at the place that owns the
data. The user can specify that the new task run asynchronously, in which case it can be explicitly synchro-
nized later and any return value accessed through a future. Thus, X10 makes communication explicit in the
form of remote tasks. Hierarchical Place Trees [92] extend X10’s model of places to arbitrary hierarchies,
allowing places to describe every location in a hierarchical machine.
Unified Parallel C (UPC), Co-Array Fortran (CAF), and Titanium [93] are three of the founding PGAS
languages. UPC supports global-view data structures and syntactically-invisible communication while CAF
has local-view data structures and syntactically-evident communication. Titanium has a local-view data