Review of CAP Theorem in Distributed Systems


 

Recap

In our previous article (Review of challenges in distributed systems), we covered the challenges of using distributed database systems.

In this article, we discuss the three properties of the CAP theorem and how different distributed database systems trade them off.

 

The CAP theorem

Proposed by Eric Brewer, a Berkeley computer scientist, the CAP theorem asserts that a distributed system with shared state can provide at most two of the following three desirable properties:

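  • Consistency: every read returns the most recent write, or an error
  • Availability: every request to a non-failing node receives a non-error response
  • Partition tolerance: the system keeps operating even when messages between nodes are lost or delayed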

In a distributed, heterogeneous environment there is no guarantee that nodes and the network links between them stay free of failure, so partition tolerance is effectively mandatory. The CAP theorem therefore implies that, when a partition occurs, the designer of a large distributed system has to trade off consistency against availability. A traditional RDBMS, by contrast, is designed to provide consistency and availability in a centralized system.
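The trade-off can be made concrete with a toy model. The following is a minimal sketch, not a real database: the `TwoNodeStore` class, its `mode` labels, and the `demo` helper are hypothetical names introduced only for illustration. It assumes two replicas that normally replicate synchronously, and shows what each design gives up once they can no longer reach each other.

```python
class Unavailable(Exception):
    """Raised when a CP-style store refuses a request during a partition."""


class TwoNodeStore:
    """Toy store with two replicas; `mode` is "CP" or "AP" (illustrative labels)."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False

    def write(self, node: str, key: str, value: str) -> None:
        if not self.partitioned:
            # Healthy network: replicate synchronously to both nodes.
            for data in self.replicas.values():
                data[key] = value
            return
        if self.mode == "CP":
            # The peer is unreachable, so refuse the write rather than diverge.
            raise Unavailable(f"node {node} cannot replicate {key!r}")
        # AP: accept the write locally; the peer does not see it for now.
        self.replicas[node][key] = value

    def read(self, node: str, key: str):
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"node {node} cannot confirm the latest {key!r}")
        return self.replicas[node].get(key)


def demo(mode: str):
    """Return (write_accepted, replicas_agree) after a write during a partition."""
    store = TwoNodeStore(mode)
    store.write("a", "x", "1")
    store.partitioned = True
    try:
        store.write("a", "x", "2")
        accepted = True
    except Unavailable:
        accepted = False
    agree = store.replicas["a"] == store.replicas["b"]
    return accepted, agree


print(demo("CP"))  # (False, True): the write is rejected, the replicas still agree
print(demo("AP"))  # (True, False): the write is accepted, the replicas diverge
```

In this sketch the CP store stays consistent by becoming unavailable, while the AP store stays available by letting the replicas disagree until the partition heals.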
The following diagram illustrates the three properties of the CAP theorem and the designer’s choices in different distributed database systems:

[Diagram: the three CAP properties and the design choices of different distributed database systems]

For example, Apache Cassandra, a massively scalable open source NoSQL database from the Apache Software Foundation, is a good fit for use cases where scalability, high availability, and performance are the most important design objectives.
Design choices made in Cassandra favor availability and partition tolerance over consistency. Consistency is tunable, though: by choosing the replication factor and the per-request consistency level, you can move from eventual consistency toward strong consistency, at the cost of latency and availability. Its linear scalability, together with its fault tolerance on commodity hardware or cloud infrastructure, makes it a strong option for many mission-critical applications. CouchDB falls in the same category.
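The tuning mentioned above is usually reasoned about with the overlap rule: if the number of replicas that acknowledge a write plus the number consulted on a read exceeds the replication factor, every read overlaps at least one replica holding the latest write. The sketch below only does that arithmetic; it does not talk to a cluster. The helper names are hypothetical, and only the level names `ONE`, `QUORUM`, and `ALL` are taken from Cassandra.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 of 3 or 3 of 5."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Map a consistency level name to the number of replicas that must respond."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return w + r > replication_factor


# With a replication factor of 3:
print(read_sees_latest_write("QUORUM", "QUORUM", 3))  # True  (2 + 2 > 3)
print(read_sees_latest_write("ONE", "ONE", 3))        # False (1 + 1 <= 3)
print(read_sees_latest_write("ONE", "ALL", 3))        # True  (1 + 3 > 3)
```

The stricter combinations buy consistency by requiring more replicas to be reachable, which is exactly the availability cost the CAP theorem predicts.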

 


On the other hand, MongoDB favors consistency in its default configuration, which means it gives up some availability. It uses a single-primary system: all writes go to the primary node, and clients never write directly to a secondary, which only replicates the primary's changes. All reads go to the primary by default as well. If the primary goes down, no writes can happen until a secondary takes over as the new primary. HBase falls in the same category.
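The single-primary behavior described above can be sketched as a toy model. This is not MongoDB's implementation or API: the `ReplicaSet` class and its methods are hypothetical, replication is simplified to be synchronous, and the election simply promotes the member with the longest log.

```python
class NoPrimary(Exception):
    """Raised when a write arrives while no primary is available."""


class ReplicaSet:
    """Toy single-primary replica set; each member keeps an ordered log of writes."""

    def __init__(self, members):
        self.logs = {name: [] for name in members}
        self.primary = members[0]

    def write(self, value) -> None:
        if self.primary is None:
            # Consistency is preserved by rejecting the write: no availability.
            raise NoPrimary("writes are rejected until a new primary takes over")
        # The primary applies the write, then secondaries copy it.
        # (Simplified: a real system replicates asynchronously.)
        for log in self.logs.values():
            log.append(value)

    def fail_primary(self) -> None:
        """Simulate the primary going down."""
        del self.logs[self.primary]
        self.primary = None

    def elect(self) -> None:
        """Promote the surviving member with the most complete log."""
        if not self.logs:
            raise NoPrimary("no members left to elect")
        self.primary = max(self.logs, key=lambda name: len(self.logs[name]))


rs = ReplicaSet(["n1", "n2", "n3"])
rs.write("a")
rs.fail_primary()
try:
    rs.write("b")
except NoPrimary as err:
    print("write refused:", err)
rs.elect()
rs.write("b")
print(rs.primary, rs.logs[rs.primary])
```

The window between `fail_primary()` and `elect()` is the availability that a consistency-first design gives up.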

 

Next Article

In our next article (Horizontal Scaling versus Vertical Scaling in Distributed Systems), we discuss the differences between horizontal and vertical scaling in distributed systems.

This article is written in collaboration with Brian Wu, a leading author of the book “Learn Ethereum: Build your own decentralized applications with Ethereum and smart contracts”. He has written 7 books on blockchain development.

 
