5.3. DISCUSSIONS
model built around ZPL-style multidimensional arrays. Its type system distinguishes between data guaran-
teed to be local and data that may be remote using annotations on variable declarations. On the other hand,
access to local and remote data is provided by the same syntax. Thus, Titanium strikes a balance between
the HPF and ZPL approaches, making communication explicit in declarations but allowing the same code
fragments to operate on local and remote data.
UPC, CAF, and Titanium differ from HPF, ZPL, Chapel, and X10 in that programs are written using
an explicit bulk-synchronous Single-Program, Multiple Data (SPMD) execution model. These copies of the
executing binary form the units of locality within these languages, and remote variable instances are refer-
enced based on the symmetric namespaces inherent in the SPMD model. While the flat SPMD models used
in these languages (and commonly in MPI) do allow programmers to colocate data and execution, they will
likely be insufficient for leveraging the hierarchical architectural locality present in emerging architectures.
Currently, such models are often forced into hybrid programming models, in which distinct locality
abstractions are used to express finer-grained locality concerns.
Recent work in Titanium has replaced the flat SPMD model with the more hierarchical Recursive Single-
Program, Multiple-Data (RSPMD) model [53]. This model groups together data and execution contexts into
teams that are arranged in hierarchical structures, which match the structure of recursive and compositional
algorithms and emerging hierarchical architectures. While the total set of threads is fixed at startup as in
SPMD, hierarchical teams can be created dynamically, and threads can enter and exit teams as necessary.
Titanium provides a mechanism for querying the machine structure at runtime, allowing the same program
to target different platforms by building the appropriate team structure during execution.
Other work has been done to address the limitations of the flat SPMD model in the context of Phalanx
[36] and UPC++ [96], both active libraries for C++. The Phalanx library uses the Hierarchical Single-
Program, Multiple-Data (HSPMD) model, which is a hybrid of SPMD and dynamic tasking. The HSPMD
model retains the cooperative nature of SPMD through thread teams, as in RSPMD, but it also allows
new teams of threads to be spawned dynamically. Unlike in SPMD and RSPMD, the total set of executing threads
is not fixed at startup. Both RSPMD and HSPMD allow expression of locality and concurrency at multiple
levels, though through slightly different mechanisms, allowing the user to take advantage of hierarchical
architectures. The UPC++ library uses RSPMD as its basic execution model but additionally allows X10-
style asynchronous tasks to be spawned at remote locations. This allows execution to be moved dynamically
to where data are located and adds a further degree of adaptability to the basic bulk-synchronous SPMD
model.
5.3 Discussions
AGREEMENTS
• PRINCIPLES
– Avoid losing information through premature “lowering” of the program representation. In particular,
many locality-oriented analyses and optimizations are most naturally effective when applied
to multidimensional index spaces and data structures. Consequently, languages lacking multidimensional
data structures, or compilers that aggressively normalize to 1D representations, undermine
such optimization strategies. Expressing computations in their natural dimensionality, and
maintaining that dimensionality during compilation, are key.
– A common theme in many promising language- and library-oriented approaches to locality is
to express distribution- and/or locality-oriented specifications in a program’s variable and type
declarations rather than scattering them throughout the computation. Since locality is a cross-cutting
concern, this minimizes the amount of code that must change when the mapping of the
program’s constructs to the hardware changes. The compiler and runtime can then leverage the
locality properties exposed in these declarations to customize and optimize code based on that
information.
– Isolate cross-cutting locality concerns. Locality — data layout and distribution — is fundamen-
tally more difficult than parallelization because it affects all the code that touches the data.
Programming Abstractions for Data Locality
27
• SOLUTIONS
– There is broad consensus around multidimensional data, partitioning, and slicing; around how to
iterate over them (generators, iterators); and around parameterization of layouts.
– There is potential to expose polyhedral concepts to the programmer and the IR, and to pass them
down to the back end (e.g., Chapel’s domains and mappings).
– A back-end model of the code-generation target, for implementing these models and compilers,
is still missing; it is needed both for portable implementation and for interoperability.
– While much can be done with libraries and template metaprogramming, there is a compelling case
for a language-based approach.
DISAGREEMENTS
• There is no consensus on the requirements for intermediate representations and runtime systems.
• There was disagreement among the workshop attendees about the extent to which a language’s locality-specification
mechanisms should be explicit (“allocate/run this here”) vs. suggestive (“allocate/run
this somewhere near-ish to here, please”) vs. locality-oblivious or automatic (“I don’t want to worry
about this; someone else [the compiler / the parallel expert] should figure it out for me”). This
disagreement is arguably an indication that pursuing multiresolution features would be attractive. In
such a model, a programmer could be more or less explicit as the situation warrants, and/or distinct
concerns (algorithm vs. mapping) could be expressed in different parts of the program by programmers
with differing levels of expertise.
• Language-based abstractions are central to tackling locality management, but two fundamentally different
design philosophies have emerged that lead to very different implementation strategies: an
explicit “do what I say” approach, covered in subsection 5.3.2, and an implicit “do what I mean”
approach, described in subsection 5.3.3. More research is required to tease out the benefits and
challenges of implementing each of these approaches in practice, because the strategies differ at a very
fundamental level.
• Explicit programmer control over locality is essential, but the expression of locality needs to be portable
across machines. The naive interpretation is that “explicit control” is synonymous with explicitly
mapping a particular task to a specific core; in that case, however, machine portability would be
compromised. The loose interpretation of “explicit control” led to misunderstandings and requires
a more specific definition. For the purpose of these discussions, “explicit control” refers primarily to the
interface provided to the performance layer (the “do what I say” layer for expert programmers), and
is not intended to be exposed at the “do what I mean” layer for non-expert programmers.
5.3.1 Multiresolution Tools
The power of a language-based attack on locality is that it allows us to support abstractions that genuinely help.
A key distinction to be understood and confronted is the tension between tools that control how code is
executed (perhaps with the help of powerful abstractions), and tools that automatically make good choices,
based on information provided by the programmer:
“do what I say, but tell me only once” vs. “do what I mean, and let me take it from here”
Vigorous debate on this question led to the conclusion that there is a clear need for both; in fact, there is a need
for a spectrum of levels of control. A key feature of any automated solution is that it can interoperate with
explicit control.