Providing High Availability Using Lazy Replication. Rivka Ladin, Barbara Liskov, Liuba Shrira, Sanjay Ghemawat


Providing High Availability Using Lazy Replication

Outline

Replication Model

System Guarantees

Operation Classification

Update operation classification

Vector timestamp

RM components

Query

Causal Update

Gossip messages

Control the size of update log

Control the size of executed operation table

Control the size of executed operation table (cont'd)

Forced Update

Two phase protocol

Fail Recovery

Immediate Update

3 phase protocol

Number of Messages for different operations

Capacity of a 3-replica system

Capacity of the Unreplicated System

Discussion

Questions?


Providing High Availability Using Lazy Replication

Rivka Ladin, Barbara Liskov, Liuba Shrira, Sanjay Ghemawat

Presented by Huang-Ming Huang

Outline

Model

Algorithm

Performance Analysis

Discussion

Replication Model

System Guarantees

Each client obtains a consistent service over time

Relaxed consistency between replicas

Operation Classification

Update operation classification

Causal update

Forced update : performed in the same order (relative to one another) at all replicas.

Immediate update : performed at all replicas in the same order relative to all other operations.

Vector timestamp

Given two timestamps

Each component of the vector timestamp corresponds to one replica manager in the system.
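
The comparison and merge rules for two timestamps did not survive in the slide text. The sketch below assumes the usual vector-timestamp definitions: one timestamp is less than or equal to another when every component is, and merging takes the component-wise maximum. The names `leq`, `merge` and `concurrent` are illustrative, not taken from the slides.

```python
from typing import Tuple

# One integer per replica manager, in a fixed replica order (assumption).
VectorTS = Tuple[int, ...]


def leq(t: VectorTS, s: VectorTS) -> bool:
    """True when every component of t is <= the matching component of s."""
    if len(t) != len(s):
        raise ValueError("timestamps must cover the same set of replicas")
    return all(a <= b for a, b in zip(t, s))


def merge(t: VectorTS, s: VectorTS) -> VectorTS:
    """Component-wise maximum of two timestamps."""
    if len(t) != len(s):
        raise ValueError("timestamps must cover the same set of replicas")
    return tuple(max(a, b) for a, b in zip(t, s))


def concurrent(t: VectorTS, s: VectorTS) -> bool:
    """Neither timestamp dominates the other."""
    return not leq(t, s) and not leq(s, t)
```

For a three-replica system, `(2, 0, 0)` and `(1, 1, 0)` are concurrent under these definitions, and merging them gives `(2, 1, 0)`.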

RM components

Query

The replica manager blocks the query operation q until the condition holds:

The replica manager returns valueTS to the FE.

FE updates its own timestamp
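
The query steps above can be sketched as follows. The blocking condition itself is missing from the slide text, so the sketch assumes the common form: the replica answers only when the timestamp carried by the query is no greater than its own valueTS. Instead of blocking, the hypothetical `try_query` reports "not ready" so the caller can retry. All class and method names here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple

VectorTS = Tuple[int, ...]


@dataclass
class ReplicaManager:
    value: Dict[str, Any]   # replica state reflecting applied updates
    value_ts: VectorTS      # timestamp of the updates reflected in `value`

    def try_query(self, key: str, prev: VectorTS) -> Optional[Tuple[Any, VectorTS]]:
        """Answer only if this replica has seen everything the FE has seen."""
        # Assumed condition: prev <= value_ts, component by component.
        if not all(p <= v for p, v in zip(prev, self.value_ts)):
            return None  # a real RM would hold the query until gossip catches up
        return self.value.get(key), self.value_ts


@dataclass
class FrontEnd:
    ts: VectorTS  # what this FE has observed so far

    def query(self, rm: ReplicaManager, key: str) -> Tuple[bool, Any]:
        reply = rm.try_query(key, self.ts)
        if reply is None:
            return False, None
        result, value_ts = reply
        # The FE merges the returned valueTS into its own timestamp.
        self.ts = tuple(max(a, b) for a, b in zip(self.ts, value_ts))
        return True, result
```

Merging the returned timestamp is what lets a client switch replicas later without reading older state than it has already seen.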

Causal Update

Gossip messages

Goal : bring the states of replica managers up to date.

Consists of :

Upon receiving gossip

Control the size of update log

Timestamp table

A log record r can be removed from the log when
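
The removal condition is missing from the slide text. A minimal sketch, assuming the timestamp table holds the latest timestamp received from each replica and that a record may be dropped once every replica is known to have seen it, could look like this. The record fields `node`, `ts` and `op` are hypothetical, and the sketch ignores whether the update has already been applied locally.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

VectorTS = Tuple[int, ...]


@dataclass(frozen=True)
class LogRecord:
    node: int      # index of the replica that created the record (assumption)
    ts: VectorTS   # timestamp assigned to the update
    op: str        # the update itself, opaque here


def known_everywhere(r: LogRecord, ts_table: Dict[int, VectorTS]) -> bool:
    """Every replica's last known timestamp covers r at r's originating slot."""
    return all(ts[r.node] >= r.ts[r.node] for ts in ts_table.values())


def prune_log(log: List[LogRecord], ts_table: Dict[int, VectorTS]) -> List[LogRecord]:
    """Keep only the records that some replica may still be missing."""
    return [r for r in log if not known_everywhere(r, ts_table)]
```

With `ts_table = {0: (2, 1, 0), 1: (2, 0, 0), 2: (1, 0, 0)}`, a record created at replica 0 with timestamp `(1, 0, 0)` can be pruned, while one with `(2, 0, 0)` must stay because replica 2 has not yet reported seeing it.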

Control the size of executed operation table

Each update carries an extra time field

FE returns an ACK

RM inserts the received ACK into the log.

Control the size of executed operation table (cont'd)

A message m from FE is late if

An update is discarded if it is late

An ACK is kept at least until it is late

Remove an entry c in executed operation table when
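
The lateness test is missing from the slide text. The sketch below assumes a message is late when its time field plus a fixed bound `delta` is earlier than the receiver's clock, and shows only the "discard late updates" rule. The bound `delta`, the message layout and the function names are assumptions, and the rule for removing table entries is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    cid: str     # identifier of the client operation (assumption)
    time: float  # the extra time field carried by the message


def is_late(m: Message, now: float, delta: float) -> bool:
    """Assumed rule: m is late once m.time + delta has passed on the local clock."""
    return m.time + delta < now


def accept_updates(updates: List[Message], now: float, delta: float) -> List[Message]:
    """Drop late updates; the rest may be processed normally."""
    return [m for m in updates if not is_late(m, now, delta)]
```

Because a late update is always discarded, no duplicate of an operation can be accepted after its messages have become late, which is what makes it safe to forget old entries at that point.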

Forced Update

Use the primary to assign a globally unique identifier (UID).

The primary carries out a two-phase protocol for updates.

Two phase protocol

Upon receiving an update, the primary sends it to all other replicas.

Upon receiving responses from a sub-majority of the backups, the primary commits the update.

Backups know the commitment from gossip messages.
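
The primary's side of this exchange can be sketched as below. The sketch assumes a sub-majority means one fewer than a majority, so that the acknowledging backups plus the primary form a majority. The class and method names are illustrative, networking is omitted, and ordering between concurrent commits is not modelled.

```python
from typing import Dict, List, Set


def sub_majority(n_replicas: int) -> int:
    """One fewer than a majority of n_replicas (assumed definition)."""
    return n_replicas // 2


class Primary:
    def __init__(self, n_replicas: int) -> None:
        self.n_replicas = n_replicas
        self.next_uid = 0
        self.pending: Dict[int, Set[int]] = {}  # uid -> backups that acknowledged
        self.committed: List[int] = []

    def start_update(self) -> int:
        """Phase 1: assign a UID; the caller then sends the update to all backups."""
        uid = self.next_uid
        self.next_uid += 1
        self.pending[uid] = set()
        return uid

    def on_ack(self, uid: int, backup_id: int) -> bool:
        """Phase 2: commit once a sub-majority of backups has acknowledged."""
        if uid not in self.pending:
            return False  # unknown UID or already committed
        self.pending[uid].add(backup_id)  # duplicate acks count once
        if len(self.pending[uid]) >= sub_majority(self.n_replicas):
            del self.pending[uid]
            self.committed.append(uid)
            return True
        return False
```

With five replicas the primary commits after two distinct backups acknowledge; no separate commit message is modelled because backups learn of the commit through gossip.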

Fail Recovery

New coordinator informs participants about the failure.

Participants inform coordinator about most recent forced updates

After a sub-majority of replicas has responded, the coordinator resumes assigning UIDs from the largest one it has learned of.

Immediate Update

The primary uses a three-phase protocol.

3 phase protocol

Number of Messages for different operations

Query : 2

Causal : 2 + (N-1)/K

Forced : 2N/2 + (N-1)/K

Immediate : 2N +2(N/2-1)+(N-1)K

Capacity of a 3-replica system

Capacity of the Unreplicated System

Discussion

No time guarantee for gossip messages

Scalability

Questions?