Programming for beginners: Role of Quorum Consistency in Distributed Systems

Distributed System

A Distributed System is a collection of independent but interconnected computers that appear to the users of the system as a single coherent system. These computers are connected by a computer network and are equipped with distributed system software.

Distributed system software enables these independent computers to coordinate their activities and share computing resources to perform operations seamlessly. This software ensures that the distributed system functions as a unified entity, providing services and resources efficiently and reliably.

Consistency

Consistency in distributed systems refers to the guarantee that all the systems that are part of a distributed system show the same data to users at the same time. It ensures that whenever you read data, you get the latest update, and all parts of the system are in sync with each other regarding the data.

There are two primary types of consistency in Distributed Systems.

1. Strong Consistency

It guarantees that after a write is done, any read afterward will show the latest value. This means any change made is instantly shown across the whole system.

Example: If you update a record in a distributed database, all future reads, no matter which part of the system they come from, will immediately show the updated value.

Let’s take an example on the concept of consistency in a distributed system step-by-step using an example of a person's bank account balance across 5 nodes.

Initial State

All 5 nodes have the same initial balance for a Bob's account, say $100.

Write and Read Operations

Suppose Bob deposited 50$ into his account, then new balance is 150$, and the update should be reflected in all the computes.

Any read operation, regardless of which node it queries, will return the updated balance of $150 immediately after the write is complete.

Eventual Consistency

In a system with eventual consistency, updates eventually reach all nodes, but there might be a temporary delay between a write happening and it being reflected everywhere. This implies that read operations might not always immediately fetch the latest data.

Write and Read Operations

Suppose Bob deposited 50$ into his account, then new balance is 150$ it might not update immediately in all the systems.

In Eventual consistency, after the write, some nodes might still return the old balance of $100. After some time, all nodes will converge to the new balance of $150.

Role of Quorum in Distributed Computing

In a distributed system, you have many computers connected and working together. Keeping everything consistent and synchronized across these computers can be challenging. This is where the idea of a quorum comes into play.

A quorum is the minimum number of computers (or nodes) that must agree on a decision for it to be approved and considered valid. Imagine it as a voting system where a decision is made only if enough people vote in favour of it. If the required number of computers agree, the decision passes; otherwise, it doesn’t. This ensures that even if some computers fail or disagree, the system can still make reliable decisions as long as the quorum is met.

Let me explain it with an example of a distributed system with 10 connected computers and a replication factor of 3 for the data and a quorum size of 2.

How Quorum works in this Scenario?

Imagine you want to perform an operation, such as updating a piece of data stored in your system. Client initiate a request to update the data. The update request is sent to all 3 nodes where the data is replicated. For the update to be accepted and applied, at least 2 out of the 3 nodes must agree (vote in favour) of the update. If 2 or more nodes agree, the quorum is achieved, and the update is applied. If fewer than 2 nodes agree, the quorum is not met, and the update is not applied.

How to choose a Quorum value?

In a system with N nodes and a replication factor R (where R<N), determining a good quorum value depends on balancing the need for consistency, availability, and fault tolerance. Here’s a simple guideline to help you determine a good quorum value:

Basic Quorum rule

A common rule for choosing a quorum value Q is that, Q should be more than half of the replication factor R. Mathematically, Q should be ⌈𝑅/2⌉ + 1 = ⌈3/2⌉ + 1 = 2.

By setting Q to more than half of R, you ensure that any update requires agreement from a majority of the replicas, reducing the risk of conflicting updates.

Points to note

1. Higher Quorum for Higher Consistency

A higher quorum value provides stronger consistency guarantees but increase the latency.

2. Lower Quorum for Better Latency

Since fewer nodes involved in decisions, leading to quicker responses.

3. If the Write Quorum is set to 1, then the system is designed for faster writes.

A write quorum of 1 means that only one node needs to acknowledge the write operation for it to be considered successful. This setup minimizes the time it takes to complete a write because the system does not wait for multiple nodes to respond. Therefore, the system is designed for faster writes.

4. If the Read Quorum is set to 1, then the system is designed for faster reads.

A read quorum of 1 means that only one node needs to respond to a read request for it to be considered successful. This setup minimizes the time it takes to complete a read because the system does not wait for multiple nodes to provide data. Thus, the system is designed for faster reads.

5. If the Read_Quorum + Write_Quorum > R, then strong consistency is Guaranteed.

If the sum of the read quorum and the write quorum is greater than the replication factor R, strong consistency is achieved. This is because at least one node that participated in the most recent write will be involved in any read operation. This ensures that reads will always reflect the most recent write.

System Design Questions

Programming for beginners

Wednesday, 22 May 2024

Role of Quorum Consistency in Distributed Systems

No comments:

Post a Comment