Programming for beginners: Understanding Partitioning and Replication in Distributed Computing

Partitioning

Partitioning means dividing a large dataset into smaller, more manageable pieces so that each piece can be stored and processed on a smaller server or commodity hardware. This makes it feasible, and often more efficient, to work with data that is too large for any single machine.

Let me explain with an example. Imagine you have a collection of family photos and videos totalling around 5TB in size. Suppose you cannot afford a single piece of hardware with that much storage capacity. Instead, you can buy three smaller, more affordable hardware units, each with a capacity of 2TB. You would then divide the 5TB of data into three segments, or splits (2TB + 2TB + 1TB), and store each segment on a separate machine. This way, you manage the large dataset using multiple smaller, more economical hardware units.
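The split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming machines of equal capacity that are filled one after another; the function name `partition` is made up for this example and does not come from any library.

```python
def partition(total_tb, capacity_tb):
    """Fill equal-capacity machines one by one until all data is placed.

    Returns the amount of data (in TB) assigned to each machine.
    """
    splits = []
    remaining = total_tb
    while remaining > 0:
        # A machine takes a full load, or whatever is left over.
        chunk = min(capacity_tb, remaining)
        splits.append(chunk)
        remaining -= chunk
    return splits


splits = partition(total_tb=5, capacity_tb=2)
print(splits)       # [2, 2, 1]
print(len(splits))  # 3 machines are needed
```

Real systems usually split data by key or by fixed-size blocks rather than by filling disks in order, but the idea is the same: no single machine has to hold everything.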

Everything is going well so far. However, after some time, one of your systems (let's call it 'System 2') crashes due to a hardware fault, and all the data stored on it is lost. Because you did not keep a backup of any segment, that portion of your family photos and videos is gone for good. This highlights the importance of having a backup strategy in place to prevent such data loss.
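To make the failure concrete, here is a small Python sketch of the three-machine setup with a single copy of each split. The dictionary layout and the helper name `lost_splits` are assumptions made for this example.

```python
# One copy of each split, one split per machine (no backups).
layout = {
    "System 1": ["Split 1"],
    "System 2": ["Split 2"],
    "System 3": ["Split 3"],
}


def lost_splits(layout, failed):
    """Return the splits that no surviving machine still holds."""
    all_splits = {s for splits in layout.values() for s in splits}
    surviving = {
        s
        for system, splits in layout.items()
        if system != failed
        for s in splits
    }
    return sorted(all_splits - surviving)


print(lost_splits(layout, "System 2"))  # ['Split 2']
```

With only one copy of each split, any machine failure loses whatever that machine held.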

This is where redundancy becomes crucial.

Replication

Redundancy, achieved here by replicating each piece of data on more than one machine, ensures that a system can tolerate partial failures, thereby increasing its availability and reliability.

To implement redundancy, you can purchase five commodity systems, each with a 2TB capacity. Instead of dividing your 5TB of data into three blocks (2TB + 2TB + 1TB), you divide it into five 1TB segments and keep two copies of each segment. You then arrange the copies across the five systems so that the two copies of any segment are never stored on the same hardware. Each system holds two different 1TB segments, which fills its 2TB capacity. This way, even if one system fails, a copy of every segment it held still exists on another system. One possible arrangement of this redundant storage is shown below.

System 1: Split 1 and 2

System 2: Split 2 and 3

System 3: Split 3 and 4

System 4: Split 4 and 5

System 5: Split 5 and 1

By using redundancy in this manner, you protect the data against loss even if one of the systems crashes, so your family memories are preserved.
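The arrangement listed above can be checked with a short Python sketch. It assumes the same ring-style placement (System i holds Split i and the next split, wrapping around at the end); the names `ring_layout` and `lost_splits` are made up for this example.

```python
from itertools import combinations


def ring_layout(n):
    """System i holds split i and split i+1, wrapping around after split n."""
    return {i: {i, i % n + 1} for i in range(1, n + 1)}


def lost_splits(layout, failed):
    """Return the splits that no surviving system still holds."""
    all_splits = set().union(*layout.values())
    surviving = set()
    for system, splits in layout.items():
        if system not in failed:
            surviving |= splits
    return sorted(all_splits - surviving)


layout = ring_layout(5)

# Any single system can fail without losing data.
for system in layout:
    assert lost_splits(layout, {system}) == []

# Two failures can still lose data: Split 2 lives only on Systems 1 and 2.
print(lost_splits(layout, {1, 2}))  # [2]

# Count the two-system failures that lose at least one split.
risky = [pair for pair in combinations(layout, 2) if lost_splits(layout, set(pair))]
print(len(risky))  # 5 (the five pairs of neighbouring systems)
```

Keeping two copies of each split protects against one failure at a time, at the cost of 10TB of raw storage for 5TB of data. Tolerating two simultaneous failures would require a third copy of each split.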


Sunday, 26 May 2024
