Chapter 2
Related Background
This chapter presents the related background material that defines fundamental notions
of the project. In section 2.1 the concept of algorithmic skeletons is presented. In
2.2 the background of the MapReduce model is covered and in 2.3 Phoenix, the first
implementation of MapReduce on shared memory architectures, is described. In 2.4
we introduce the details of the Skandium library. Finally in 2.5 we describe a set of
typical MapReduce applications that we use to evaluate the performance of the skeleton
and in 2.6 we present the principles behind Java's automatic memory management
which, as we will see in later chapters, has a significant impact on the performance of
our implementation.
2.1 Algorithmic Skeletons
Algorithmic skeletons [6] are a programming model that provides a high level of
abstraction to the programmer and aims at facilitating the efficient use of parallel
hardware. This model presents the user with a set of algorithmic skeletons, which are
independent algorithmic techniques that capture the high-level computational structure
of an algorithm, e.g. the pipeline or the task farm. The user selects the skeleton that best
describes the problem at hand and fills the gaps of the abstract structure with his own code
fragments that turn the abstract computation into a complete solution to the problem.
By selecting a skeleton, the user implicitly selects the problem decomposition (i.e.
the identification of processes that can operate concurrently), as it is inherent to the
structure of the skeleton. He also implicitly selects the problem distribution (i.e. the static
or dynamic mapping of the parallel processes to the computational resources), which
is provided by one of the carefully optimized implementations of the skeleton for the
particular machine. With the decomposition and the distribution defined, the commu-
nication and the synchronization of the processes can also be handled by the runtime
system, completely hiding the difficult aspects of parallel programming from the user.
The unique characteristic of this model is its fragmented nature, which contrasts with the
monolithic programming model of previous abstract systems that also aimed at hiding
parallel programming issues from the programmer. The skeletons are totally independent
of each other, each can more easily find an efficient implementation on a variety
of architectures, and new skeletons can be added to the system without affecting the
behaviour of the others. More importantly, whereas previous systems presented an abstract,
unrestricted programming model that was meant to solve any problem and inevitably
could not be supported by a realistic and efficient implementation, this model presents
restricted algorithmic structures that the system knows how to implement efficiently
and at the same time can guide the user towards good programming style.
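The idea can be illustrated with a minimal task-farm skeleton sketched in Java. This is not the API of any particular skeleton library; the class and method names (`Farm`, `execute`) are illustrative. It shows how the skeleton fixes the parallel structure (decomposition into independent tasks, distribution over a thread pool, synchronization on completion) while the user supplies only the sequential worker code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// A minimal task-farm skeleton: the parallel structure is fixed here,
// and the user supplies only the sequential worker function.
// All names are illustrative, not taken from any specific library.
class Farm<I, O> {
    private final Function<I, O> worker;   // user-supplied sequential code

    Farm(Function<I, O> worker) { this.worker = worker; }

    List<O> execute(List<I> inputs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            // Decomposition: one independent task per input element.
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs)
                futures.add(pool.submit(() -> worker.apply(input)));
            // Synchronization: collect results in input order.
            List<O> results = new ArrayList<>();
            for (Future<O> f : futures)
                results.add(f.get());
            return results;
        } finally {
            pool.shutdown();
        }
    }
}

public class FarmDemo {
    public static void main(String[] args) throws Exception {
        // The user writes only the computation (squaring here); the skeleton
        // handles decomposition, distribution and synchronization.
        Farm<Integer, Integer> farm = new Farm<>(x -> x * x);
        System.out.println(farm.execute(List.of(1, 2, 3, 4)));  // [1, 4, 9, 16]
    }
}
```

Note that the user code contains no threads, locks or futures; swapping in a different skeleton (e.g. a pipeline) would change the parallel structure without touching the worker fragments.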
2.2 The MapReduce Skeleton
The computation described by the MapReduce skeleton is the following: The input
is processed by a set of map functions that run in parallel. Each mapper is assigned
a section of the original input and its output is a set of intermediate key-value pairs.
The intermediate values are then passed to the reduce stage, where a set of reducers
run in parallel. All the pairs that have the same key are processed by the same reduce
function. Each reduce function applies a reduction operation to the values associated
with the same key to form a smaller set of values for each key, typically one.
The user-defined parts of the code in this skeleton are the map and the reduce functions,
while the runtime system automatically generates parallelism by creating several
instances of these functions and distributing them across the nodes of the parallel machine.
The computation is depicted in Figure 2.1.
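The data flow just described can be sketched in Java using the classic word-count example. The sketch below is sequential for clarity; a real skeleton would run the map and reduce invocations in parallel, but the grouping of intermediate pairs by key is the same. All names (`MapReduceSketch`, `map`, `reduce`) are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// A sequential sketch of the MapReduce data flow, using word counting
// as the example. Names are illustrative, not any library's actual API.
public class MapReduceSketch {

    // Map: one section of the input -> intermediate (key, value) pairs.
    static List<Map.Entry<String, Integer>> map(String section) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : section.split("\\s+"))
            pairs.add(Map.entry(word, 1));
        return pairs;
    }

    // Reduce: all values sharing one key -> typically a single value.
    static int reduce(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Each mapper would be assigned one section of the original input.
        List<String> sections = List.of("a rose is", "a rose");

        // Intermediate stage: group all emitted pairs by key, so that each
        // key is handled by exactly one reduce invocation.
        Map<String, List<Integer>> grouped = sections.stream()
                .flatMap(s -> map(s).stream())
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));

        Map<String, Integer> counts = new TreeMap<>();
        grouped.forEach((k, v) -> counts.put(k, reduce(v)));
        System.out.println(counts);  // {a=2, is=1, rose=2}
    }
}
```

The user-visible surface is exactly the two functions `map` and `reduce`; everything between them (emitting, grouping, scheduling) belongs to the runtime.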
In Google's implementation of the model [8], which targets large-scale clusters of
commodity machines, the runtime system also hides from the programmer issues of
fault tolerance, load balancing and inter-machine communication. The model has been
used across a wide range of domains within Google including large scale machine
learning problems and large scale graph computations. According to [8] the fact that
the programmer focuses on functionality and not on parallelization and develops a
simple computation that is automatically executed efficiently on a thousand machines
has sped up the development and prototyping process. The code of an
application is simpler and easier to understand, since all the details that obscure the
original simple computation are hidden in the library.

Figure 2.1: The MapReduce programming model
2.3 The Phoenix Implementation
Phoenix [5] [13] [10] was the first implementation of the MapReduce programming
model that targets multi-core and multi-processor platforms. The increasing number
of cores on a single chip, and the fact that traditional programming techniques fail
to exploit the high levels of parallelism in these machines effectively, were the main
motivations for applying a structured parallel programming model like MapReduce on
shared memory architectures.
Unlike Google's implementation, which targets large clusters and whose main
performance concern is to reduce network traffic and avoid remote file accesses, the Phoenix
implementation focuses on the careful selection of the in-memory data structures
that are used to store and retrieve the intermediate key-value pairs. The system
also aims at using the memory hierarchy of the machine as efficiently as possible.
Phoenix’s first implementation [5] provided an application programming interface
for C and C++. Apart from the Map and the Reduce functions, the programmer provides
a function that is used by the runtime for key comparison and a function that splits the
input data across the Map tasks. The programmer can optionally provide a function
that partitions the intermediate pairs for Reduce tasks based on their keys while the