It’s never a good idea to store big chunks of data on a blockchain. For starters, it is often simply impossible, as the amount of data that may be included in a transaction is limited.
But even if you could store big chunks of data on a blockchain, it would be prohibitively expensive in terms of cost and resources, as every peer in the network would have to store the data that you chose to put on-chain. Moreover, if your plan is to do all of this over a public blockchain network, everything that you store on-chain would become visible to every peer in the network. That makes storing sensitive information a terrible idea, unless you have a fetish for disclosing all your secrets to the world.
You may be wondering, “How can someone implement a decentralized application that requires the storage of large chunks of data, if those data can’t be stored directly on-chain?” Fortunately, decentralized storage solutions like IPFS already exist within the Web3 ecosystem to help us with this goal.
IPFS (InterPlanetary File System) is a distributed system for storing and accessing files, websites, applications, and data. It is a public network, which means that anyone can start downloading and storing content on the network right away. IPFS is a content-addressable network, so all content stored in the network is identified by a unique identifier called the Content Identifier, or CID. The CID of some content is derived from the hash of the content. This means that if the content changes, its CID changes, so different versions of the same content will have completely different identifiers.
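To make the link between content and identifier concrete, here is a minimal sketch of how a CID can be derived from a hash. It assumes the simplest possible case: a single small block encoded as CIDv1 with the raw codec, SHA-256, and base32. Real IPFS clients split large files into chunks and hash the resulting tree, so the CID they report for a large file will not match this function's output. The function name cid_v1_raw is made up for illustration.

```python
import base64
import hashlib


def cid_v1_raw(content: bytes) -> str:
    """Build a CIDv1 string (raw codec, sha2-256, base32) for one small block."""
    digest = hashlib.sha256(content).digest()
    # multihash = <0x12: sha2-256> <0x20: digest length of 32 bytes> <digest>
    multihash = bytes([0x12, 0x20]) + digest
    # CIDv1 = <0x01: CID version> <0x55: raw codec> <multihash>
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # The multibase prefix "b" means lowercase base32 without padding
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


original = cid_v1_raw(b"my song, version 1")
edited = cid_v1_raw(b"my song, version 2")

print(original)
print(edited)
print(original != edited)  # True: changing the content changes the CID
```

Changing a single character in the input produces an unrelated identifier, which is exactly why two versions of the same file never share a CID.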
To download content from the IPFS network, we need to craft a request specifying the CID of the content we want to download. The IPFS client will take care of the rest, leveraging the network’s underlying protocols. It will find the peers in the network that are storing the content we are looking for, and download it for us. “Get” requests, as download operations are called in IPFS, are usually specified through a link that looks like this:
In the end, downloading data from the IPFS network means providing IPFS clients with one of these links. If you want to learn more about the specifics of IPFS, you can check these tutorials out, or the IPFS doc page.
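As a rough sketch, links of this kind commonly take one of two forms: a native ipfs:// URI for IPFS-aware clients, or an HTTP URL served by a gateway. The helper names below are hypothetical, the CID is a placeholder rather than a real identifier, and the choice of ipfs.io as the default gateway is only an assumption for the example.

```python
def ipfs_uri(cid: str, path: str = "") -> str:
    """Native IPFS URI, understood by IPFS-aware clients and browsers."""
    return f"ipfs://{cid}{path}"


def gateway_url(cid: str, gateway: str = "https://ipfs.io", path: str = "") -> str:
    """Path-style HTTP gateway URL, for clients that only speak HTTP."""
    return f"{gateway.rstrip('/')}/ipfs/{cid}{path}"


EXAMPLE_CID = "YOUR_CID_HERE"  # placeholder, not a real CID

print(ipfs_uri(EXAMPLE_CID))     # ipfs://YOUR_CID_HERE
print(gateway_url(EXAMPLE_CID))  # https://ipfs.io/ipfs/YOUR_CID_HERE
```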
So why is IPFS blockchain’s best friend for dealing with data? The fact that content in IPFS is uniquely identified, and that any change to a piece of content also changes its CID, means that data referenced by a given CID is effectively immutable. CIDs are only a few dozen bytes long, so they can be stored on-chain and used in smart contracts to point to data stored in the IPFS network. With this functionality, we don’t need to store large chunks of data on-chain anymore. The only thing stored and managed on-chain is the CID of the corresponding data. If a user needs to access the specific content (and not just the identifier), they can do so by making a request to the IPFS network for that CID. Cool, right?
But enough with the theory behind IPFS. What are some good use cases for this integration between IPFS and the blockchain? A perfect example of the need for decentralized storage in the blockchain world is NFTs (Non-Fungible Tokens).
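The following sketch illustrates why a short on-chain identifier is enough: anyone who retrieves the content can recompute its CID and compare it with the value stored on-chain. It makes the same simplifying assumption as a single-block file (CIDv1, raw codec, SHA-256), and the names cid_v1_raw, matches_cid, and on_chain_cid are invented for the example.

```python
import base64
import hashlib


def cid_v1_raw(content: bytes) -> str:
    """CIDv1 (raw codec, sha2-256, base32) for a single small block."""
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + hashlib.sha256(content).digest()
    return "b" + base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")


def matches_cid(content: bytes, expected_cid: str) -> bool:
    """Check that retrieved content is exactly what the CID points to."""
    return cid_v1_raw(content) == expected_cid


asset = b"\x00" * 100_000          # stand-in for a 100 kB file kept off-chain
on_chain_cid = cid_v1_raw(asset)   # the only value a contract would need to store

print(len(asset), "bytes stored off-chain")
print(len(on_chain_cid), "characters stored on-chain")
print(matches_cid(asset, on_chain_cid))          # True
print(matches_cid(asset + b"!", on_chain_cid))   # False: tampered content
```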
NFTs are used to represent one-of-a-kind digital assets. When an NFT represents a collectible crypto-cat, all the specifics of that kitty NFT can be stored directly on-chain. But what happens when what we are minting as an NFT is a digital asset like a song, an image, or a deep learning dataset? After all, we can’t store such items directly on-chain.
This is where IPFS and content-addressable decentralized storage solutions excel. We can store our digital asset in IPFS, and then use the CID (and maybe some additional metadata) to mint the NFT. Anyone can validate the ownership on-chain, and access the asset in question in the IPFS network. You can see an illustration of how this process would work at nft.storage, and in the image below.
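The flow described here can be sketched with two in-memory stand-ins, one for the storage network and one for the NFT contract. Everything in this example is hypothetical: ToyStorage and ToyNFTContract are not real libraries, and a hex SHA-256 digest is used in place of a real CID to keep the code short.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ToyStorage:
    """In-memory stand-in for IPFS: content is looked up by its hash."""
    blocks: Dict[str, bytes] = field(default_factory=dict)

    def add(self, content: bytes) -> str:
        cid = hashlib.sha256(content).hexdigest()  # stand-in for a real CID
        self.blocks[cid] = content
        return cid

    def get(self, cid: str) -> bytes:
        return self.blocks[cid]


@dataclass
class ToyNFTContract:
    """In-memory stand-in for an NFT contract: token id -> (owner, cid)."""
    tokens: Dict[int, Tuple[str, str]] = field(default_factory=dict)
    next_id: int = 1

    def mint(self, owner: str, cid: str) -> int:
        token_id = self.next_id
        self.tokens[token_id] = (owner, cid)
        self.next_id += 1
        return token_id


storage = ToyStorage()
contract = ToyNFTContract()

song = b"(bytes of my new song)"
cid = storage.add(song)                  # 1. store the asset off-chain
token_id = contract.mint("alice", cid)   # 2. mint a token that points to it

owner, stored_cid = contract.tokens[token_id]
print(token_id, owner)                   # 1 alice
print(storage.get(stored_cid) == song)   # True
```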
Do you need to operate an Ethereum node and an IPFS node to implement and orchestrate these kinds of use cases and interactions in a decentralized application? Well, not really. There are several alternatives: you can use IPFS gateways for the IPFS side of things (like Pinata, Infura, or Textile), or even delegate the operation of all your nodes to someone else. What is clear is that even taking those steps, the operation of storing your asset and minting your token cannot be done atomically.
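To see why the lack of atomicity matters, consider this small sketch in which the storage step succeeds and the minting step fails. The storage and tokens dictionaries are in-memory stand-ins, the fail flag simply simulates a reverted transaction, and a hex SHA-256 digest replaces a real CID.

```python
import hashlib
from typing import Dict, Tuple

storage: Dict[str, bytes] = {}            # stand-in for the IPFS network
tokens: Dict[int, Tuple[str, str]] = {}   # stand-in for contract state


def store(content: bytes) -> str:
    cid = hashlib.sha256(content).hexdigest()  # stand-in for a real CID
    storage[cid] = content
    return cid


def mint(owner: str, cid: str, fail: bool = False) -> int:
    if fail:
        raise RuntimeError("transaction reverted")  # simulated failure
    token_id = len(tokens) + 1
    tokens[token_id] = (owner, cid)
    return token_id


cid = store(b"my new song")           # step 1 succeeds
try:
    mint("alice", cid, fail=True)     # step 2 fails
except RuntimeError as err:
    print("mint failed:", err)

# The asset is now stored, but no token references it.
minted_cids = {c for _, c in tokens.values()}
orphaned = cid in storage and cid not in minted_cids
print(orphaned)  # True
```

The two systems end up out of sync, and it is up to the application to detect and repair that state.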
I was reflecting on the idea of atomicity of operations between decentralized storage systems and blockchain platforms when I realized something. A few weeks ago I wrote a comparison between different Optimistic Layer 2 solutions. In that article, one of the platforms looked at was Metis. Metis is an optimistic L2 rollup solution.
One of the Metis features that caught my attention was its VM integration with IPFS. According to the Metis whitepaper, the company supports decentralized storage “out-of-the-box” in its VM through an IPFS resolver. The idea of atomically interacting with the IPFS network and making transactions on-chain is something that really interested me. Atomic operation in IPFS and on-chain seemed like an impossible thing to me, but this may actually be possible in the L2 world. I decided to look deeper into Metis technology to understand if this kind of atomic operation would be possible using the company’s L2 solution.
Metis includes two types of storage in its VM: the regular VM storage, responsible for storing the blocks and account states, and a special storage layer that integrates with IPFS (see figure below). Metis leverages IPFS cluster technology. IPFS cluster nodes are regular IPFS peers that can run private sub-networks, so that the data stored in the IPFS cluster is not shared with peers from the public IPFS network (making it really convenient for the storage of sensitive information). IPFS cluster nodes can choose to store content in the public network, or restrict content access to one of their connected subnetworks.
Content stored in IPFS may be accessed from Metis through the IPFS resolver of the VM. When a user invokes a method that needs to interact with the special storage layer, the IPFS router in the VM intercepts the corresponding operations, and sends them to the IPFS network through the IPFS resolver. The IPFS resolver behaves as an IPFS client and is also responsible for encrypting both the data and the final CID of the content (if the information needs to be private), so it can be committed on-chain without privacy or security worries.
VM architecture (Source: Metis whitepaper)
To illustrate how all this integration works, let’s use an example. Imagine that you want to mint an NFT for your new song on the Ethereum network using Metis. If the NFT factory smart contract is already deployed and in place, the only thing that you need to worry about is triggering the right operations to store the song in IPFS and mint the NFT. The Metis VM will be responsible for intercepting the IPFS operation, encrypting the data (if necessary), and interacting with the IPFS network to store the song. The result of this operation is the CID of the song, which is then used in the L2 transaction sent to mint the NFT. This L2 transaction is then rolled up to L1, and eventually persisted in the Ethereum network. In this way, the Metis node manages all the interactions necessary to atomically store data in the IPFS network and persist the result on the blockchain.
Another interesting part of this integration, which is specific to Metis, is that DACs (Decentralized Autonomous Companies) can use these IPFS layers to store sensitive information in a decentralized way, without having to rely on centralized storage systems. The data are conveniently encrypted with the corresponding DAC credentials. Furthermore, when a DAC is created for the first time, a new “charter” is also created to determine the rules of that DAC. In the charter, the DAC creator can include the access rights and operation permissions for these IPFS and sensitive-data storage integrations.
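One way to picture a single component coordinating both steps is the sketch below. It is not Metis code and makes no claim about how the Metis VM is implemented: store_and_mint is a hypothetical coordinator that stores the asset, attempts the mint, and undoes the storage step if the mint fails, so the caller sees an all-or-nothing result. Deleting from a dictionary stands in for cleanup; on a real IPFS node the closest equivalent would be unpinning the content.

```python
import hashlib
from typing import Dict, Tuple

storage: Dict[str, bytes] = {}            # stand-in for the IPFS network
tokens: Dict[int, Tuple[str, str]] = {}   # stand-in for contract state


def mint(owner: str, cid: str, fail: bool = False) -> int:
    if fail:
        raise RuntimeError("transaction reverted")  # simulated failure
    token_id = len(tokens) + 1
    tokens[token_id] = (owner, cid)
    return token_id


def store_and_mint(owner: str, content: bytes, fail: bool = False) -> int:
    """Hypothetical coordinator: either both steps take effect, or neither."""
    cid = hashlib.sha256(content).hexdigest()  # stand-in for a real CID
    already_stored = cid in storage
    storage[cid] = content
    try:
        return mint(owner, cid, fail=fail)
    except Exception:
        # Undo the storage step only if this call introduced the content.
        if not already_stored:
            del storage[cid]
        raise


try:
    store_and_mint("alice", b"my new song", fail=True)
except RuntimeError:
    pass
print(len(storage), len(tokens))  # 0 0

token_id = store_and_mint("alice", b"my new song")
print(len(storage), len(tokens))  # 1 1
```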
Let’s imagine that a big retail company is using Metis to track the entire lifecycle of its products, from production to distribution to sale. (There are already numerous companies using blockchain technology for this purpose, such as Carrefour, Costco, and Maersk.) A smart contract in a blockchain network is used by every party involved in the lifecycle of the product. Every status update in the lifecycle of the product is conveniently registered on-chain. These updates can include information such as the time when a specific entity in the supply chain handled the product, how that was done, and what the next step (or owner) is in the chain. All of this can be done today with any blockchain network that supports smart contracts. In real life, however, all of these interactions are also governed through legal contracts, and acknowledged by “real-life documents” such as delivery notes.
One of the added values of having a blockchain network orchestrating these interactions is that all entities have a common information system that stores all the supply chain information. But what happens with the documents related to the actions performed on the blockchain? They need to be stored somewhere else.
This is where solutions like Metis work like a charm. These documents may include sensitive information, so they can’t be stored in the clear in a public network. What’s more, presumably not every document should be accessible by every party. Through the Metis IPFS integration, every DAC involved in this supply chain use case is able to perform the transaction that updates the state of a product on the blockchain, while storing the corresponding document in the IPFS network.
As described above, these documents would be conveniently encrypted with the keys that give access to the document exclusively to the right entities. The status update in the smart contract would add a pointer to the document’s CID in case anyone wants to check the “real-life document” associated with the product status update. In this process, DACs will be able to determine which other DACs or entities have access to these documents. Our companies are thus able to share a common information system which is consistent with their state on the blockchain, without having to worry about implementing additional schemes or having to maintain an independent system for document storage.
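As an illustration of the data involved, a status update of this kind could be modeled roughly as follows. This is a sketch under several assumptions: the field names, the DAC identifiers, and the may_read helper are all invented; a hex SHA-256 digest stands in for a real CID; and a plain list of readers stands in for the encryption keys that would actually enforce access.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class StatusUpdate:
    product_id: str
    handler: str              # entity that handled the product
    action: str               # what was done
    next_owner: str           # next step (or owner) in the chain
    timestamp: int            # Unix time of the update
    document_cid: str         # pointer to the off-chain "real-life document"
    readers: FrozenSet[str]   # parties allowed to open the document


def may_read(update: StatusUpdate, party: str) -> bool:
    """Stand-in for key-based access: is this party on the readers list?"""
    return party in update.readers


delivery_note = b"(encrypted delivery note bytes)"
ledger: List[StatusUpdate] = []   # stand-in for the contract's update log

ledger.append(StatusUpdate(
    product_id="pallet-001",
    handler="producer-dac",
    action="shipped",
    next_owner="distributor-dac",
    timestamp=1_700_000_000,
    document_cid=hashlib.sha256(delivery_note).hexdigest(),
    readers=frozenset({"producer-dac", "distributor-dac"}),
))

latest = ledger[-1]
print(may_read(latest, "distributor-dac"))  # True
print(may_read(latest, "retailer-dac"))     # False
```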
It should be clear to everyone by now exactly how important decentralized storage systems are for the success of Web3. But something that people often don’t realize when thinking about L2 solutions is that they are not exclusively about scalability; they can actually perform a much broader range of functions. L2 platforms can become across-the-board enhancements over L1 in terms of both scalability and features. This article offers a clear example of that additional functionality, in the form of integrated decentralized storage.
Decentralized applications increasingly need to store large amounts of data in an immutable way, leveraging blockchain technologies and decentralized storage systems. L2 solutions can take this opportunity to inject additional features into L1 networks, just as Metis has done with its IPFS integration.
I can’t wait to see what comes next in the L2 ecosystem. Are you aware of any other cool projects with innovative L2 ideas? Do not hesitate to ping me :).