How to Work with Ethereum Swarm Storage

Recap

In our previous article (How to Run Ethereum IPFS Storage), we discussed how to run Ethereum IPFS storage.

 

In this article, we learn how to work with Ethereum Swarm storage.

 

Working with Swarm

We discussed Swarm in our previous article series, Deep Research on Ethereum, in the Decentralized data and content storage article. We learned that Swarm is a distributed storage platform and content distribution service, and a native base layer service of the Ethereum Web3 stack. We showed the use case for Swarm and a typical Ethereum DApp architecture diagram that includes it. In this article, we look at how Swarm stores and organizes content, so that you are ready to install it and start using it for decentralized content storage.

Swarm is very similar to IPFS. It is a storage layer built on the Ethereum Web3 stack and is supported by the Ethereum Geth client. A Swarm node connects to the Ethereum blockchain and requires an Ethereum account. Nodes in the Swarm network communicate using the bzz wire protocol, which runs on top of devp2p/RLPx, the TCP-based transport protocol used for P2P communication among Ethereum nodes. RLPx carries encrypted messages belonging to one or more capabilities. One of the primary objectives of Swarm is to allow DApps to store and share their data efficiently with end users. Swarm is still under development and not yet fully operational.
The following diagram shows the Swarm distributed storage architecture:

[Figure: Swarm distributed storage architecture]

 

Swarm has a distributed chunk store. A chunk is the basic unit of storage and has a fixed maximum size (currently 4 KB). A chunk's address is deterministically derived from its content. When any kind of readable source, such as an image, text, or video, is uploaded to Swarm, the Swarm API layer splits the data into fixed-size chunks and generates a unique cryptographic hash for each one. The hashes of these chunks are then packed together to form a new chunk, which is hashed in turn; currently, 128 hashes make up one such chunk. Repeating this step maps the content to a chunk tree, which is a Merkle tree, and the root hash of the tree is the address that you use to retrieve the uploaded file.
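The chunking and hashing steps above can be sketched in a few lines of Python. This is only a simplified illustration, not Swarm's actual implementation: it uses SHA-256 from the standard library as a stand-in hash, and the function names (`chunk_hash`, `split_into_chunks`, `root_hash`) are hypothetical. The real Swarm hash differs in its hash function and chunk encoding.

```python
import hashlib

CHUNK_SIZE = 4096  # maximum chunk size mentioned in the text (4 KB)
BRANCHES = 128     # number of child hashes packed into one intermediate chunk
# With 32-byte hashes, 128 * 32 = 4096 bytes, so an intermediate chunk also fits in 4 KB.


def chunk_hash(data: bytes) -> bytes:
    """Stand-in chunk hash (assumption: SHA-256 instead of Swarm's own hash)."""
    return hashlib.sha256(data).digest()


def split_into_chunks(content: bytes) -> list:
    """Cut the content into pieces of at most CHUNK_SIZE bytes."""
    chunks = [content[i:i + CHUNK_SIZE] for i in range(0, len(content), CHUNK_SIZE)]
    return chunks or [b""]  # empty content still yields one (empty) chunk


def root_hash(content: bytes) -> bytes:
    """Hash every data chunk, then repeatedly pack up to BRANCHES hashes
    into a new chunk and hash it, until a single root hash remains."""
    level = [chunk_hash(c) for c in split_into_chunks(content)]
    while len(level) > 1:
        level = [
            chunk_hash(b"".join(level[i:i + BRANCHES]))
            for i in range(0, len(level), BRANCHES)
        ]
    return level[0]


if __name__ == "__main__":
    data = b"example content " * 1000  # 16,000 bytes -> 4 data chunks
    print(len(split_into_chunks(data)), root_hash(data).hex())
```

Because the root depends on every chunk hash beneath it, changing a single byte anywhere in the content produces a different root, which is what makes the root usable as the content's address.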

This hierarchical Swarm hash construct commits all of the chunk data to a single unique root hash, which provides data integrity and allows protected random access. If a chunk is damaged or has been tampered with, this can be detected simply by hashing it and comparing the result with the hash recorded in the tree.
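As a rough illustration of protected random access, the sketch below checks one chunk against the hash recorded for it in its parent, without touching any sibling chunks. It assumes SHA-256 as a stand-in hash and a parent chunk that is simply the concatenation of its children's 32-byte hashes; `make_parent` and `verify_child` are hypothetical helper names, not part of any Swarm API.

```python
import hashlib

HASH_SIZE = 32  # size in bytes of a SHA-256 digest (stand-in for Swarm's hash)


def chunk_hash(data: bytes) -> bytes:
    """Stand-in chunk hash (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


def make_parent(children: list) -> bytes:
    """Build a parent chunk by concatenating the hashes of its children."""
    return b"".join(chunk_hash(child) for child in children)


def verify_child(parent: bytes, index: int, child: bytes) -> bool:
    """Check a single child chunk against the hash stored at its position
    in the parent. No other child needs to be fetched."""
    expected = parent[index * HASH_SIZE:(index + 1) * HASH_SIZE]
    return chunk_hash(child) == expected


children = [b"chunk zero", b"chunk one", b"chunk two"]
parent = make_parent(children)

print(verify_child(parent, 1, b"chunk one"))  # True: chunk is intact
print(verify_child(parent, 1, b"chunk 0ne"))  # False: chunk was altered
```

The same check applies at every level of the tree, so a reader who trusts the root hash can verify any chunk along the path to the data it wants.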

On top of the chunk Merkle trees, Swarm provides a crucial third layer: manifest files. A manifest defines a document collection for organizing content, mapping arbitrary paths to files. The metadata in a Swarm manifest describes the collection and its files, including their media (MIME) types. Here is an example:

{
  "entries": [
    {
      "hash": "4b3a73e43............. 048d",
      "contentType": "text/html; charset=utf-8",
      "path": "index/"
    },
    {
      "hash": "69b0a42a9382........... 98Ta",
      "contentType": "application/pdf",
      "path": "a.pdf"
    }
  ]
}

The preceding manifest file specifies a document collection; each document defines its path, unique cryptographic hash, and content type. You can think of a manifest as a dictionary that provides a service for the user and tells them what the content is and where to find it. Swarm exposes the manifest API via the bzz URL scheme.
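To make the dictionary analogy concrete, here is a minimal sketch that loads a manifest shaped like the example and resolves a path to its hash and content type. The hash values are placeholders rather than real Swarm hashes, `build_index` and `resolve` are hypothetical names, and the lookup is a plain exact match, which may be simpler than how Swarm itself matches paths.

```python
import json

# Hypothetical manifest with the same shape as the example above.
# The hash values are placeholders, not real Swarm hashes.
MANIFEST_JSON = """
{
  "entries": [
    {"hash": "placeholder-hash-of-index-page",
     "contentType": "text/html; charset=utf-8",
     "path": "index/"},
    {"hash": "placeholder-hash-of-pdf",
     "contentType": "application/pdf",
     "path": "a.pdf"}
  ]
}
"""


def build_index(manifest_json: str) -> dict:
    """Turn the manifest's entry list into a path -> (hash, content type) dictionary."""
    manifest = json.loads(manifest_json)
    return {
        entry["path"]: (entry["hash"], entry["contentType"])
        for entry in manifest["entries"]
    }


def resolve(index: dict, path: str):
    """Return (hash, content type) for an exact path match, or None if absent."""
    return index.get(path)


index = build_index(MANIFEST_JSON)
print(resolve(index, "a.pdf"))      # ('placeholder-hash-of-pdf', 'application/pdf')
print(resolve(index, "missing.txt"))  # None
```

A client would then use the returned hash to fetch the file's chunk tree and the content type to decide how to present it.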

 


Each Swarm node connects directly to the Ethereum network and is identified by a cryptographic hash associated with its bzz account address. Together, the pool of nodes provides a distributed storage and content distribution service.
The actual storage layer of Swarm supports both localstore and netstore, and they have the following properties:

  • Localstore has both a memory store and dbstore:
    • In-memory fast cache
    • Persistent disk storage
  • Netstore does the following:
    • Implements the distributed preimage archive
    • Extends localstore to a Swarm distributed storage
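The layering described in this list can be sketched as two small classes. This is an illustrative model only, not Swarm's code: the class names mirror the terms above, files in a directory stand in for the dbstore, and `fetch_from_peers` is a hypothetical callable representing a lookup in the rest of the network.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


class LocalStore:
    """Two tiers: a fast in-memory cache in front of persistent disk storage."""

    def __init__(self, directory: Path):
        self.memory = {}            # in-memory cache: address -> chunk data
        self.directory = directory  # stand-in for the persistent dbstore

    def put(self, address: str, data: bytes) -> None:
        self.memory[address] = data
        (self.directory / address).write_bytes(data)

    def get(self, address: str) -> Optional[bytes]:
        if address in self.memory:          # 1. try the memory cache
            return self.memory[address]
        path = self.directory / address
        if path.exists():                   # 2. fall back to disk
            data = path.read_bytes()
            self.memory[address] = data     # warm the cache for next time
            return data
        return None


class NetStore(LocalStore):
    """Extends the local store: on a local miss, ask the network."""

    def __init__(self, directory: Path,
                 fetch_from_peers: Callable[[str], Optional[bytes]]):
        super().__init__(directory)
        self.fetch_from_peers = fetch_from_peers  # hypothetical network lookup

    def get(self, address: str) -> Optional[bytes]:
        data = super().get(address)
        if data is None:
            data = self.fetch_from_peers(address)
            if data is not None:
                self.put(address, data)     # keep a local copy
        return data


if __name__ == "__main__":
    remote = {"abc123": b"chunk held by another node"}  # pretend remote chunks
    with tempfile.TemporaryDirectory() as tmp:
        store = NetStore(Path(tmp), remote.get)
        print(store.get("abc123"))  # fetched from "peers", then stored locally
        print(store.get("nope"))    # None: not available anywhere
```

The design choice worth noting is that callers only ever talk to one `get` method; whether a chunk came from memory, disk, or another node is hidden behind it.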

In the distributed preimage archive model adopted by Swarm, the nodes that are closest to a chunk's address actually host the data and provide information about its content. What a node stores is determined by how frequently chunks are accessed: when a node reaches its storage limit, the chunks that have gone unaccessed the longest are purged.
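Both ideas in this paragraph can be sketched briefly. The example below assumes that "closeness" is measured with a Kademlia-style XOR distance between addresses, which the text above does not spell out, and it models purging with a simple least-recently-used cache. `xor_distance`, `closest_node`, and `ChunkCache` are hypothetical names.

```python
from collections import OrderedDict
from typing import Optional


def xor_distance(a: bytes, b: bytes) -> int:
    """Distance between two equal-length addresses (assumption: XOR metric)."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def closest_node(chunk_address: bytes, node_addresses: list) -> bytes:
    """Pick the node whose address is nearest to the chunk's address."""
    return min(node_addresses, key=lambda node: xor_distance(node, chunk_address))


class ChunkCache:
    """Bounded chunk store that purges the least recently accessed chunk."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.chunks = OrderedDict()  # oldest access first, newest last

    def get(self, address: bytes) -> Optional[bytes]:
        if address not in self.chunks:
            return None
        self.chunks.move_to_end(address)  # mark as recently accessed
        return self.chunks[address]

    def put(self, address: bytes, data: bytes) -> None:
        self.chunks[address] = data
        self.chunks.move_to_end(address)
        while len(self.chunks) > self.capacity:
            self.chunks.popitem(last=False)  # purge the oldest unaccessed chunk


nodes = [b"\x00", b"\x0e", b"\xff"]
print(closest_node(b"\x0f", nodes))  # b'\x0e'

cache = ChunkCache(capacity=2)
cache.put(b"a", b"chunk A")
cache.put(b"b", b"chunk B")
cache.get(b"a")               # touch A, so B is now the oldest
cache.put(b"c", b"chunk C")   # exceeds capacity, B is purged
print(cache.get(b"b"))        # None
```

Frequently requested chunks keep moving to the "recent" end of the cache and survive, while rarely requested ones drift to the front and are dropped first.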

 

Next Article
In our next article (How to Install Ethereum Swarm Storage), we discuss how to install Ethereum Swarm storage.

This article was written in collaboration with Brian Wu, a leading author of the book “Learn Ethereum: Build your own decentralized applications with Ethereum and smart contracts”. He has written 7 books on blockchain development.

 
