Note

If you are new to blockchain technology, taking our Introduction to Blockchain Technology self-paced course is highly recommended. Also, for comprehensive coverage of blockchain development in Ethereum or for mastering Solidity programming, the self-paced courses listed under Resources at the end of this article are highly recommended.

Recap

In our previous article (A roadmap for Implementing Ethereum 2.0), we covered the roadmap for implementing version 2 of Ethereum.

In this article, we learn how to work with decentralized data and content storage systems, such as Swarm, IPFS, and BigchainDB, in Ethereum.

Working with decentralized data and content storage

Data and content management are two of the main capabilities of many real-world business applications, such as information portals, Wikipedia, and e-commerce and social media applications. The decentralized world is no exception. During the EVM discussion, we briefly looked at the EVM's capability for storing data on Ethereum. Although convenient, EVM storage is not generally intended for bulk data storage, and it is very expensive.

There are a few options application developers can leverage to manage and access decentralized data and content for decentralized applications, including Swarm (the Ethereum blockchain solution), IPFS, and BigchainDB (a big data platform for blockchain). We will cover them in detail in our next article series.
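To give a rough sense of why on-chain storage is considered expensive, the sketch below estimates the gas needed to write raw data into fresh contract storage slots. It assumes the commonly cited SSTORE cost of 20,000 gas per new 32-byte word and ignores transaction overhead, access surcharges, and the block gas limit. The 20 gwei gas price is purely an illustrative assumption, and the function names are hypothetical.

```python
# Rough estimate of the gas needed to write raw data into contract storage.
WORD_BYTES = 32            # the EVM stores data in 32-byte words
GAS_PER_NEW_WORD = 20_000  # assumed SSTORE cost for a previously empty slot


def storage_gas(num_bytes: int) -> int:
    """Gas to write num_bytes into fresh storage slots (storage writes only)."""
    words = -(-num_bytes // WORD_BYTES)  # ceiling division
    return words * GAS_PER_NEW_WORD


def cost_in_ether(gas: int, gas_price_gwei: int) -> float:
    """Convert a gas amount to ether at a given gas price (1 ether = 1e9 gwei)."""
    return gas * gas_price_gwei / 1e9


one_megabyte = 1024 * 1024
gas = storage_gas(one_megabyte)
print(gas)                     # 655360000
print(cost_in_ether(gas, 20))  # ether needed at an assumed 20 gwei gas price
```

Even under these simplified assumptions, a single megabyte needs hundreds of millions of gas, which is far more than fits in one block. This is the main reason DApps keep only references on-chain and move the content itself elsewhere.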

Swarm

Swarm provides a content distribution service for Ethereum and DApps. Here are some features of Swarm:

  • It is a decentralized storage platform, a native base layer service of the Ethereum web 3 stack.
  • It intends to be a decentralized store of Ethereum’s public record as an alternative to an Ethereum on-chain storage solution.
  • It allows DApps to store and distribute code, data, and contents, without jamming all the information on the blockchain.

Imagine you are developing a blockchain-based medical record system. You want to keep track of when medical records are added, where they are recorded, and who has accessed them and for what purpose. All of these are immutable transaction records that you want to maintain on the blockchain. However, the medical records themselves, including physician notes, medical diagnoses, imaging, and so on, may not be suitable for storage on the Ethereum blockchain. Swarm or IPFS is better suited to such use cases.

A typical Ethereum DApp architecture for such use cases may look like the diagram here:

[Diagram: Ethereum blockchain development]

DApps can create, manage, and store data and content directly in a decentralized file system, like IPFS or Swarm, and retrieve it using a content hash (a Swarm hash, in the case of Swarm). When DApps submit transactions to the Ethereum network, the transactions can reference the Swarm resources by their Swarm hash.
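The split between off-chain content and on-chain references can be illustrated with a minimal, in-memory sketch. Everything here is a hypothetical stand-in: `ContentStore` plays the role of Swarm or IPFS, `Ledger` plays the role of the blockchain, and SHA-256 is used in place of the hash functions those systems actually use.

```python
import hashlib


class ContentStore:
    """Toy stand-in for a content-addressed store such as Swarm or IPFS."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        # Re-hash on read so tampering with the stored bytes is detected.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match its hash")
        return data


class Ledger:
    """Toy stand-in for the blockchain: an append-only list of records."""

    def __init__(self):
        self._records = []

    def submit(self, actor: str, action: str, content_hash: str) -> int:
        record = {"actor": actor, "action": action, "content_hash": content_hash}
        self._records.append(record)
        return len(self._records) - 1

    def records(self):
        return list(self._records)


store = ContentStore()
ledger = Ledger()

# The medical record itself lives off-chain; only its hash goes on the ledger.
note = b"physician note: patient reports mild headache"
ref = store.put(note)
ledger.submit("dr_smith", "ADD_RECORD", ref)
ledger.submit("nurse_lee", "READ_RECORD", ref)
```

The ledger stays small because each record carries only a fixed-size hash, while the hash still lets anyone holding the content verify that it is the exact data the transaction referred to.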

Internally, Swarm maintains a specific type of content-addressed distributed hash table (DHT) across the decentralized nodes. A file or piece of content uploaded to the Swarm network is treated as a blob and chopped into chunks. A Merkle tree is then created out of all those chunks and is used to ensure content integrity. The chunks are then distributed to participating nodes and stored in the DHT. When an access request is made, the content is served by the node(s) closest to the address of each chunk.

Swarm offers several APIs for accessing and managing content, including a CLI (command-line interface) and JSON-RPC APIs. JavaScript support is available through the erebos, swarm-js, and swarmgw packages, which can be leveraged by most UI/JavaScript-based DApps.
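The chunk, hash, and distribute flow described above can be sketched in a few lines. This is a simplified illustration rather than Swarm's actual algorithm: it assumes SHA-256 instead of Swarm's own hash function, a binary Merkle tree instead of Swarm's wider tree, and a Kademlia-style XOR distance for picking the closest node. All function names are hypothetical.

```python
import hashlib
from typing import List

CHUNK_SIZE = 4096  # chunk size in bytes, chosen here for illustration


def h(data: bytes) -> bytes:
    """Hash helper (SHA-256 as a stand-in for the real hash function)."""
    return hashlib.sha256(data).digest()


def split_chunks(blob: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Chop a blob into fixed-size chunks; the last chunk may be shorter."""
    return [blob[i:i + size] for i in range(0, len(blob), size)] or [b""]


def merkle_root(chunks: List[bytes]) -> bytes:
    """Fold chunk hashes pairwise until a single root hash remains."""
    level = [h(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def closest_node(chunk_address: bytes, node_ids: List[bytes]) -> bytes:
    """Pick the node whose ID has the smallest XOR distance to the address."""
    target = int.from_bytes(chunk_address, "big")
    return min(node_ids, key=lambda n: int.from_bytes(n, "big") ^ target)


# 10,240 bytes of non-repeating data, which splits into three chunks.
blob = b"".join(i.to_bytes(4, "big") for i in range(2560))
chunks = split_chunks(blob)
root = merkle_root(chunks)

# Hypothetical node IDs; each chunk is assigned to its closest node.
nodes = [h(name) for name in (b"node-a", b"node-b", b"node-c")]
placement = {h(c).hex()[:8]: closest_node(h(c), nodes).hex()[:8] for c in chunks}
```

Because the root hash depends on every chunk, changing a single byte anywhere in the blob produces a different root, which is what lets a reader verify content integrity from the hash alone.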

IPFS

IPFS is similar to Swarm; it is a peer-to-peer distributed filesystem designed to store and share content across a decentralized network. Both IPFS and Swarm offer decentralized data and content storage with a content-addressable hash, generated directly from the content. Both can store any kind of content, which can then be referenced from transactions in the blockchain network.

Behind the scenes, there are quite a few technical differences, mainly in terms of how each chops large datasets into chunks and stores them in a distributed network. IPFS may be thought of as a single BitTorrent swarm, exchanging objects within one Git repository. Swarm may be seen as more integrated with the Ethereum blockchain, and it has an incentivized system for content sharing. However, Filecoin can be used as an overlay on top of IPFS to provide a similar incentivized system.

The DApp architecture in the Swarm section applies to IPFS too. In the same way, IPFS offers several APIs for accessing and managing content, including a CLI, JSON-RPC APIs, and an HTTP interface. JavaScript packages and a Go library are available too, which can be leveraged by most UI/JavaScript or Go-based DApps.
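As a minimal sketch of the HTTP interface, the snippet below builds (but does not send) two requests for retrieving content by its content identifier. It assumes a local IPFS node with the RPC API on port 5001 and the gateway on port 8080, with endpoint paths following the Kubo RPC API. The CID is a made-up placeholder, and the helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

RPC_BASE = "http://127.0.0.1:5001/api/v0"    # assumed local node RPC API
GATEWAY_BASE = "http://127.0.0.1:8080/ipfs"  # assumed local HTTP gateway


def cat_request(cid: str) -> Request:
    """Build the RPC call that returns the content stored under cid."""
    query = urlencode({"arg": cid})
    # RPC API commands are issued as POST requests.
    return Request(f"{RPC_BASE}/cat?{query}", method="POST")


def gateway_url(cid: str, path: str = "") -> str:
    """Build a read-only gateway URL, optionally pointing inside a directory."""
    return f"{GATEWAY_BASE}/{cid}/{path}" if path else f"{GATEWAY_BASE}/{cid}"


cid = "QmExampleCid"  # hypothetical placeholder, not a real content identifier
req = cat_request(cid)
print(req.get_method(), req.full_url)
print(gateway_url(cid, "notes/visit-1.txt"))
```

With a node running, passing `req` to `urllib.request.urlopen` would fetch the content. A DApp would typically store only the CID in its transactions and resolve it through one of these two routes when needed.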

BigchainDB

BigchainDB is a decentralized database that combines traditional database and data management capabilities with blockchain features. As a blockchain database, BigchainDB is complementary to other decentralized systems, such as decentralized file storage like IPFS or Swarm, and smart contract blockchains like Ethereum or EOS. It is another alternative for storing decentralized data and content. It can be used as the data storage for traditional applications, or it can be leveraged as the decentralized data storage for decentralized blockchain platforms, like Ethereum. Although it can be used as a file repository, this is not recommended, since it is best suited to structured or unstructured data.

Within the Ethereum community, there is a lot of interest in integrating BigchainDB with Ethereum smart contracts. Some EIPs and PoCs (proofs of concept) have been proposed to explore such integration options. One of the PoCs leverages the Oraclize service to retrieve data from BigchainDB within a smart contract. On successful retrieval of the data, the smart contract evaluates and executes its logic and performs the requested operation. As the diagram shows, there are two ways a DApp can integrate with BigchainDB. One is to interact with BigchainDB directly as the decentralized data storage through HTTP GET and POST. The second option is to leverage the Oraclize service in smart contracts to access external data from BigchainDB:

[Diagram: Ethereum blockchain development]

This process uses the following rules:

  • BigchainDB offers several interfaces for connecting to BigchainDB servers and storing and retrieving data from the blockchain database, including a CLI interface and HTTP APIs.
  • When storing data in the database, you will need to use HTTP POST to send the data to the database server. You use the HTTP GET interface to retrieve data from the database.
  • BigchainDB also provides database drivers for developers to connect to the network servers from high-level programming languages, like Java, Python, and JavaScript/Node.js.
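The POST and GET interactions listed above can be sketched as follows. The snippet only builds the requests and does not send them. It assumes a local BigchainDB node at `localhost:9984` and the `/api/v1/transactions` endpoint. The transaction body is a placeholder: a real node expects a fully prepared and signed transaction, which is normally produced by one of the drivers. The helper names are hypothetical.

```python
import json
from urllib.request import Request

API_ROOT = "http://localhost:9984/api/v1"  # assumed local BigchainDB node


def post_transaction_request(signed_tx: dict) -> Request:
    """Build the HTTP POST that submits a signed transaction to the node."""
    body = json.dumps(signed_tx).encode("utf-8")
    return Request(
        f"{API_ROOT}/transactions?mode=commit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_transaction_request(tx_id: str) -> Request:
    """Build the HTTP GET that retrieves a transaction by its ID."""
    return Request(f"{API_ROOT}/transactions/{tx_id}", method="GET")


# Placeholder payload for illustration only; a real node would reject it
# because it is neither complete nor signed.
placeholder_tx = {"id": "example-tx-id", "asset": {"data": {"record": "demo"}}}

post_req = post_transaction_request(placeholder_tx)
get_req = get_transaction_request(placeholder_tx["id"])
print(post_req.get_method(), post_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

In practice a driver hides these HTTP details, but seeing the raw requests makes it clear that a DApp front end can talk to the database directly, without going through a smart contract.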

 

Next Article

In our next article (How Decentralized Messaging with Whisper Works in Ethereum), we discuss how decentralized messaging with Whisper works in Ethereum.

This article was written in collaboration with Brian Wu, lead author of the book “Learn Ethereum: Build your own decentralized applications with Ethereum and smart contracts”. He has written 7 books on blockchain development.

Resources

Self-Paced Blockchain Courses

If you would like to learn more about Hyperledger Fabric, Hyperledger Sawtooth, Ethereum, or Corda, taking the following self-paced classes is highly recommended:

  1. Intro to Blockchain Technology
  2. Blockchain Management in Hyperledger for System Admins
  3. Hyperledger Fabric for Developers
  4. Intro to Blockchain Cybersecurity
  5. Learn Solidity Programming by Examples
  6. Introduction to Ethereum Blockchain Development
  7. Learn Blockchain Dev with Corda R3
  8. Intro to Hyperledger Sawtooth for System Admins

 
