System design fundamentals: What is the CAP theorem?

👋 CAP Theorem is one of the important concepts used in Distributed Systems.

Its applicability to various systems states that any distributed system or data store can simultaneously provide only two of three guarantees: consistency, availability, and partition tolerance (CAP).The theorem formalizes the tradeoff between consistency and availability when there’s a partition.

A distributed system is a network that stores data on more than one node (physical or virtual machines) at the same time. Because all cloud applications are distributed systems, it’s essential to understand the CAP theorem when designing a cloud app so that you can choose a data management system that delivers the characteristics your application needs most.

CAP Theorem Proof

Let’s look at a simple proof of the CAP theorem. Imagine a distributed system consisting of two nodes:

The distributed system acts as a plain register with the value of variable X. There’s a network failure that results in a network partition between the two nodes in the system. An end-user performs a write request, and then a read request. Let’s examine a case where a different node of the system processes each request. In this case, our system has two options:

It can fail at one of the requests, breaking the system’s availability
It can execute both requests, returning a stale value from the read request and breaking the system’s consistency

The system can’t process both requests successfully while also ensuring that the read returns the latest value written by the write. This is because the results of the write operation can’t be propagated from node A to node B because of the network partition.

Consistency, Availability, and Partition Tolerance explained

Let’s take a detailed look at the three distributed system characteristics to which the CAP theorem refers.

Consistency

Consistency means that all clients see the same data at the same time, no matter which node they connect to. For this to happen, whenever data is written to one node, it must be instantly forwarded or replicated to all the other nodes in the system before the write is deemed ‘successful.’

Availability

Availability means that any client making a request for data gets a response, even if one or more nodes are down. Another way to state this—all working nodes in the distributed system return a valid response for any request, without exception.

Partition Tolerance

A partition is a communications break within a distributed system—a lost or temporarily delayed connection between two nodes. Partition tolerance means that the cluster must continue to work despite any number of communication breakdowns between nodes in the system.

The CAP theorem states that it is impossible for a distributed system to simultaneously achieve consistency, availability, and partition tolerance. Therefore, it is important to understand the requirements of the system and choose a data management approach that meets its critical needs. It is also important to be aware of the CAP theorem when designing any cloud application or networked system.

CAP Theorem Database Architecture

Distributed networks heavily depend on NoSQL databases as they offer horizontal scalability, and they are highly distributed. Hence, they can easily and rapidly scale across a growing network of multiple interconnected nodes. But as discussed above, one can only have two of the three available functionalities. The different combinations and their use cases are discussed below:

CP System:

This system focuses more on consistency and partition tolerance. So these systems are not available most of the time. When any issue occurs in the system, it has to shut down the non-consistent node until the partition is resolved, and during that time, it is not available.

AP System:

This type of database focuses more on availability and partition tolerance rather than consistency. When any issue occurs in the system, then it will no longer remain in a consistent state. However, all the nodes remain available, and affected nodes might return a previous version of data, and the system will take some time to become consistent.

CA System:

This type of database focuses more on consistency and availability across all nodes than partition tolerance. Fault-Tolerance is the basic necessity of any distributed system, and hence it is almost rare to use a CA type of architecture for any practical purpose.

CAP Theorem and Microservices

Micro-services are loosely coupled, independently deployable application components that incorporate their own stack—including their own database and database model—and communicate with each other over a network. as have become especially popular in hybrid cloud and multi-cloud environments, and they are also widely used in on-premises data centers. If you want to create a micro-services application, you can use the CAP theorem to help you determine a database that will best fit your needs.

Conclusion

We are able to reach a comparatively greater degree of computing power thanks to the distributed systems architecture. Systems must be designed by taking into account the practical implications in everyday life and selecting the best design for the application. Now we understand why the CAP theorem is crucial for data management strategies. Given the complexity of the architecture, efficient network management is also necessary. When designed appropriately, these systems may eliminate problems like human error to maintain data relevancy and cut down on extra components.

Manish Kumar Paneri

Search This Blog