Data availability (DA) does not mean permanent storage, and a blockchain has no obligation to store data permanently. This article clears up common misunderstandings about the data availability layer in plain language. The role of DA is to strengthen security, not to archive historical data.
Table of Contents
– Saving historical data does not equal security.
– What is data availability?
– What data availability is not.
– DA: Data availability only ensures complete data publication.
– DS: Data storage and historical indexing.
– Why is data availability important?
– Fraud-proof mechanism.
– Validium escape hatch requires the latest state.
– Data availability is a crucial pillar of network security.
– The market debate over whether data availability should move to external projects like Celestia or stay on Ethereum is about network security, not about where historical data should be stored.
Some readers may ask, “Isn’t historical data storage the same thing as security for Layer2?” In fact, historical data is not the main consideration for Layer2 security, and historical data storage is not what Vitalik insists on.
Questions like this show that the purpose and definition of the data availability layer have been misunderstood.
Data availability does not guarantee users or nodes permanent access to all historical data. Data availability projects such as Celestia or EigenDA provide temporary storage space. That is fundamentally different from decentralized storage networks like Arweave, which store data permanently, even though both ultimately keep data on disk.
Because existing mainstream Rollups use Ethereum as their DA layer and upload their complete transaction data in compressed form, many people have come to assume that data availability means permanent storage. However, since the Dencun upgrade introduced EIP-4844, Rollup transaction data posted as blobs is deleted once it is old enough, because the data was never uploaded for the purpose of permanent storage.
Data availability only guarantees that data can be accessed until blocks are finalized, which gives Ethereum a basis for arbitrating disputed blocks. For this reason, some people argue that the name should be changed to “Data Publication” (DP).
For example, if a node on Arbitrum finds an error in a block produced by other nodes and submits a fraud proof, Ethereum needs the accurate underlying data to re-execute the disputed computation and arbitrate. If DA does not guarantee that the data was published, the fraud-proof mechanism cannot proceed.
Once a transaction is finalized, for example when its block has been attested to by validators holding more than 2/3 of the stake on Ethereum (roughly 64 slots, or about 13 minutes, under normal conditions), it is permanently confirmed. From then on there are no further disputes, only consensus, and DA for those transactions is no longer needed. This is why EIP-4844 allows blob data to be deleted after a fixed retention period: permanent storage is irrelevant to its purpose.
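To put rough numbers on these time windows, the sketch below works through the arithmetic. It assumes Ethereum mainnet timing (12-second slots, 32 slots per epoch), finality after about two epochs under normal conditions, and the minimum blob retention constant from the consensus specification. The helper name `epochs_to_seconds` is made up for this illustration.

```python
# Ethereum mainnet consensus timing.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

# Minimum number of epochs for which nodes must serve blob sidecars
# (consensus-layer constant introduced alongside EIP-4844).
MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS = 4096

# Assumption: the chain is finalizing normally, so a block is finalized
# about two epochs after it is proposed.
EPOCHS_TO_FINALITY = 2


def epochs_to_seconds(epochs: int) -> int:
    """Convert a number of epochs into seconds."""
    return epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


finality_minutes = epochs_to_seconds(EPOCHS_TO_FINALITY) / 60
blob_retention_days = epochs_to_seconds(MIN_EPOCHS_FOR_BLOB_SIDECARS_REQUESTS) / 86_400

print(f"time to finality: about {finality_minutes:.1f} minutes")
print(f"minimum blob retention: about {blob_retention_days:.1f} days")
```

Under these assumptions the retention window comes out far longer than the time to finality, yet it is still a bounded window and not permanent storage.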
Some readers may find this strange: if data availability only ensures that complete data is published to the network, and that data is deleted or no longer guaranteed to be accessible after a period of time, what happens when the complete transaction history of a Rollup is needed for some special reason? This is where data storage comes in.
However, storing blockchain historical data is not a critical concern.
As long as at least one party among all the nodes voluntarily keeps the complete transaction data, whether out of self-interest or for other reasons, the history remains retrievable. For example:
– Blockchain explorers: because the data is a critical resource for their business.
– Rollup projects: because providing it is a service to their users.
– Enthusiastic users: because they want the industry to improve.
In addition, most node designs retain block header data (including the hash that commits to the transactions in each block), which makes it easy to verify the authenticity of complete historical data obtained from any single party.
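The check described here can be sketched as follows. This is a simplified illustration, not how any particular client does it: SHA-256 and a plain binary Merkle tree stand in for the chain's real hash function and trie, and the names `merkle_root` and `matches_header` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the chain's real hash function.
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree; an odd node is paired with itself."""
    if not leaves:
        return h(b"")
    # Distinct prefixes keep leaf hashes and inner-node hashes apart.
    level = [h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def matches_header(tx_root_in_header: bytes, txs_from_archive: list[bytes]) -> bool:
    """True if the transactions served by an archive match the retained root."""
    return merkle_root(txs_from_archive) == tx_root_in_header


# The node kept only this root from the block header.
original_txs = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1"]
retained_root = merkle_root(original_txs)

# Later, an untrusted archive serves the full transaction list.
tampered_txs = [b"alice->bob:500", b"bob->carol:2", b"carol->dave:1"]
print(matches_header(retained_root, original_txs))
print(matches_header(retained_root, tampered_txs))
```

Because the header is tiny compared with the block body, a node can discard the body and still detect any later tampering by whoever serves the history.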
To summarize, the trust assumption for data storage and historical indexing is close to 1-of-N. As long as the network is large enough (N nodes) and at least one node is willing to provide the complete data, it is almost guaranteed that anyone can obtain correct historical data from somewhere.
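A back-of-the-envelope model shows why a 1-of-N assumption becomes comfortable as the network grows. The sketch below assumes, purely for illustration, that each node keeps the full history independently with the same probability `p_keep`. Real nodes are not independent and the value 0.01 is invented, so treat the result as intuition only.

```python
def prob_history_recoverable(n_nodes: int, p_keep: float) -> float:
    """Probability that at least one of n_nodes keeps the full history.

    Assumption: every node keeps it independently with probability p_keep.
    """
    if n_nodes < 0:
        raise ValueError("n_nodes must be non-negative")
    if not 0.0 <= p_keep <= 1.0:
        raise ValueError("p_keep must be between 0 and 1")
    # The history is lost only if all n_nodes drop it.
    return 1.0 - (1.0 - p_keep) ** n_nodes


for n in (1, 10, 100, 1000):
    print(n, prob_history_recoverable(n, 0.01))
```

Even when any single node is unlikely to keep the data, the chance that nobody keeps it shrinks exponentially with N.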
Next, let us look at why DA is so crucial for Rollups and Validiums that L2BEAT treats it as one of its five major risk categories.
The fraud-proof mechanism relies on complete transaction information to operate.
In the extreme case where all other nodes collude and stop sending data to an honest node, that node cannot tell, without guaranteed data availability, whether it is facing an unstable network connection or an attack on the network. The 1-of-N trust assumption behind fraud proofs then no longer holds.
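The dependency of fraud proofs on published data can be made concrete with a toy model. The sketch below is not Arbitrum's protocol: the state is a plain balance table, the "state root" is a hash of that table, and the names `state_root`, `apply_batch` and `check_claim` are invented for this example.

```python
import hashlib
import json
from typing import Optional

Transfer = tuple[str, str, int]  # (sender, recipient, amount)


def state_root(balances: dict[str, int]) -> str:
    """Toy commitment: hash of the sorted balance table (real rollups use tries)."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def apply_batch(balances: dict[str, int], batch: list[Transfer]) -> dict[str, int]:
    """Re-execute a batch, skipping transfers the sender cannot afford."""
    new_balances = dict(balances)
    for sender, recipient, amount in batch:
        if amount > 0 and new_balances.get(sender, 0) >= amount:
            new_balances[sender] -= amount
            new_balances[recipient] = new_balances.get(recipient, 0) + amount
    return new_balances


def check_claim(
    pre_state: dict[str, int],
    published_batch: Optional[list[Transfer]],
    claimed_root: str,
) -> str:
    """Classify a sequencer's claimed post-state root."""
    if published_batch is None:
        # Data withheld: the honest node has nothing to re-execute,
        # so it can neither accept the claim nor prove fraud.
        return "unverifiable"
    recomputed = state_root(apply_batch(pre_state, published_batch))
    return "valid" if recomputed == claimed_root else "fraud"


pre = {"alice": 10, "bob": 0}
batch = [("alice", "bob", 4)]
honest_root = state_root({"alice": 6, "bob": 4})
forged_root = state_root({"alice": 0, "bob": 10})

print(check_claim(pre, batch, honest_root))
print(check_claim(pre, batch, forged_root))
print(check_claim(pre, None, forged_root))
```

The third call is the situation DA is meant to rule out: the forged root is wrong, but without the batch data nobody can demonstrate it.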
Most Layer2 networks have censorship-resistant withdrawal mechanisms, such as the escape hatch. When a withdrawal request is consistently ignored or maliciously rejected by the sequencer, the user can fall back on a forced withdrawal. If the forced withdrawal is also ignored by the nodes and no response arrives for several days, the user can trigger the escape hatch, which works like an emergency button.
When the escape hatch is activated, all transactions on the network are suspended, but users can still withdraw based on the state root. This is what makes the withdrawal mechanism censorship-resistant.
However, a Validium withdrawal requires the latest state root, so at least one node has to provide it. With reliable DA, users can be more confident that they will be able to obtain the state root and withdraw.
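One common way to withdraw "based on the state root" is to present a Merkle proof that an account balance is included under that root. The sketch below illustrates the idea under loud assumptions: SHA-256, a plain binary tree and the `account:balance` leaf format are all invented here, and real Validium contracts differ in their details.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the system's real hash function.
    return hashlib.sha256(data).digest()


def leaf_hash(account: str, balance: int) -> bytes:
    return h(b"\x00" + f"{account}:{balance}".encode())


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All tree levels, bottom to top, for a non-empty list of leaf hashes."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # an odd node is paired with itself
        levels.append([h(b"\x01" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Sibling hashes from the leaf up to, but not including, the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[index])
        index //= 2
    return proof


def verify_withdrawal(
    root: bytes, account: str, balance: int, index: int, proof: list[bytes]
) -> bool:
    """Check that (account, balance) sits at position index under root."""
    node = leaf_hash(account, balance)
    for sibling in proof:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root


# Building the proof needs the full state data, which is exactly what a
# Validium user cannot get if the off-chain data is withheld.
accounts = [("alice", 6), ("bob", 4), ("carol", 9)]
levels = build_levels([leaf_hash(a, b) for a, b in accounts])
root = levels[-1][0]
carol_proof = make_proof(levels, 2)

print(verify_withdrawal(root, "carol", 9, 2, carol_proof))
print(verify_withdrawal(root, "carol", 900, 2, carol_proof))
```

Verification needs only the root and a short proof, but constructing that proof requires someone to supply the underlying state, which is why the availability of that data matters for the escape hatch.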
These two scenarios show that data availability plays a very important role in the Layer2 ecosystem. Even though it is not responsible for storing data permanently, it remains a crucial security component.
The function of the data availability layer is not to provide the complete transaction history, but to keep the network running smoothly and user assets safe by making the relevant data available until transactions are finalized. It is a key part of the Layer2 security model.
Without DA, in extreme cases, transaction proof mechanisms (fraud proofs, zero-knowledge proofs) are basically useless no matter how well they are designed. This is why data availability is so important, and why it is not surprising that many Ethereum developers disagree with external DA.
While zkEVMs, fraud proofs, and ecosystem growth are heavily promoted, it is important to keep asking whether this fundamental infrastructure can actually ensure security.