Decentralised Storage: IPFS, Filecoin, Arweave, and the Future of Data Permanence
The centralisation of data storage represents one of the most significant vulnerabilities in modern internet infrastructure. Three cloud providers (Amazon Web Services, Microsoft Azure, and Google Cloud) account for approximately two-thirds of the global cloud infrastructure market. This concentration creates systemic risks: single points of failure capable of disrupting millions of services simultaneously, surveillance capabilities concentrated in entities subject to governmental pressure, and pricing power wielded by oligopolistic providers facing limited competitive constraint.
Decentralised storage protocols address these vulnerabilities by distributing data across networks of independent storage providers, using cryptographic techniques and economic incentives to ensure data availability, integrity, and permanence without relying on any single entity. For the Web3 ecosystem, decentralised storage provides the data layer that complements blockchain’s transaction layer — together forming the infrastructure for a genuinely decentralised internet.
The Architecture of Decentralised Storage
Decentralised storage systems share common architectural principles whilst differing substantially in implementation.
Content addressing replaces location-based addressing. Traditional storage identifies data by where it resides (a specific server, path, and filename). Decentralised storage identifies data by what it contains — a cryptographic hash of the content itself serves as the address. This seemingly simple shift carries profound implications: content-addressed data is automatically deduplicated, inherently verifiable (any corruption changes the hash), and location-independent (the data can be retrieved from any node holding a copy).
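The three properties named above (deduplication, verifiability, location independence) can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and an in-memory dictionary as the storage backend; the class and method names are hypothetical and do not correspond to any particular protocol.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is the SHA-256 of the value."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on read: any corruption would change the digest.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data

    def __len__(self):
        return len(self._blocks)


store = ContentStore()
a = store.put(b"hello")
b = store.put(b"hello")   # same bytes, same address
c = store.put(b"hello!")  # one byte differs, different address
```

Because the address depends only on the bytes, any other store holding the same content would answer to the same address, which is what makes retrieval location-independent.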
Data distribution spreads content across multiple nodes. Rather than storing a file on a single server (or a single provider’s replicated servers), decentralised systems distribute data fragments across geographically dispersed nodes. This distribution provides redundancy (no single node failure causes data loss), censorship resistance (no single authority can remove content), and performance optimisation (data can be retrieved from the nearest available node).
Economic incentives motivate storage providers. Decentralised storage systems cannot rely on corporate infrastructure budgets or advertising revenue. Instead, they create token-based incentive systems that compensate storage providers for reliably hosting data. These incentives must be carefully calibrated — sufficient to attract reliable storage capacity, not so excessive as to inflate costs beyond competitive levels.
Verification mechanisms ensure data integrity and availability. Storage providers must demonstrate that they continue to hold the data they committed to store. Various cryptographic proof mechanisms — proof of replication, proof of space-time, proof of access — enable continuous verification without requiring the complete re-download and comparison of stored data.
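One generic way to check possession without re-downloading everything is a Merkle-tree spot check: the client keeps only a root hash, challenges the provider for a randomly chosen chunk, and verifies the chunk against the root using a short list of sibling hashes. The sketch below illustrates that idea only. It is not the proof of replication, proof of space-time, or proof of access construction used by any of the networks discussed here, and all function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return every level of the tree, leaves first, root last."""
    level = [h(c) for c in chunks]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]   # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def prove(levels, index):
    """Sibling hashes from leaf to root for the chunk at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(root, chunk, proof):
    node = h(chunk)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


chunks = [f"chunk-{i}".encode() for i in range(5)]
levels = build_tree(chunks)
root = levels[-1][0]       # the client keeps only this 32-byte value
proof = prove(levels, 3)   # the provider answers a challenge for chunk 3
```

The proof grows with the logarithm of the number of chunks, so the verifier transfers one chunk and a handful of hashes rather than the whole dataset.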
IPFS: The InterPlanetary File System
IPFS provides the content-addressing and distribution layer that serves as the foundation for much of Web3’s storage infrastructure. Developed by Protocol Labs and operational since 2015, IPFS implements a peer-to-peer network for storing and sharing content-addressed data.
IPFS operates through several key mechanisms.
Content Identifiers (CIDs) — cryptographic hashes that uniquely identify content — serve as the addressing system. When data is added to IPFS, it receives a CID derived from its content. Any node holding the data can serve it to requesters who know the CID, and any requester can verify that received data matches the requested CID by recomputing the hash.
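To make the CID idea concrete, the following sketch builds a version 1 CID for a single raw block using only the standard library. It assumes the sha2-256 hash, the raw codec, and lowercase base32 encoding; the helper names are hypothetical. Files added through IPFS tooling are usually chunked and wrapped in a DAG, so the CID a node reports for a file will generally differ from the one computed here.

```python
import base64
import hashlib


def cid_v1_raw(data: bytes) -> str:
    digest = hashlib.sha256(data).digest()
    # CIDv1 layout: <version 0x01><codec 0x55 = raw><multihash>
    # multihash:    <0x12 = sha2-256><0x20 = 32-byte length><digest>
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    # Multibase prefix "b" marks lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


def matches(cid: str, data: bytes) -> bool:
    """Recompute the CID from received bytes and compare with the request."""
    return cid_v1_raw(data) == cid


cid = cid_v1_raw(b"hello world")
```

A requester who receives bytes from an untrusted node can call `matches(cid, received)` and discard anything that fails, which is the verification step described above.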
The Distributed Hash Table (DHT) maintains a mapping between CIDs and the network nodes hosting corresponding content. When a client requests a CID, the DHT directs the request to nodes that have announced they hold the requested data.
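The IPFS DHT is based on Kademlia, in which peers and keys share one identifier space and a record is stored on the peers whose identifiers are closest to the key by XOR distance. The toy below shows that placement rule under simplifying assumptions: every peer is known up front, there is no networking, and the class and function names are hypothetical. Real lookups are iterative queries across peers that each know only part of the network.

```python
import hashlib


def key_id(label: bytes) -> int:
    """Map a peer name or CID to a 256-bit integer in the shared key space."""
    return int.from_bytes(hashlib.sha256(label).digest(), "big")


def closest_peers(peers, target: bytes, k: int = 3):
    """Return the k peers whose IDs are nearest to the target by XOR distance."""
    t = key_id(target)
    return sorted(peers, key=lambda p: key_id(p) ^ t)[:k]


class ToyDHT:
    def __init__(self, peers):
        self.peers = list(peers)
        self.records = {p: {} for p in self.peers}   # peer -> {cid: providers}

    def provide(self, cid: bytes, provider: bytes):
        # Announce: store a provider record on the peers closest to the CID.
        for peer in closest_peers(self.peers, cid):
            self.records[peer].setdefault(cid, set()).add(provider)

    def find_providers(self, cid: bytes):
        # Lookup: ask the same closest peers which nodes announced the CID.
        found = set()
        for peer in closest_peers(self.peers, cid):
            found |= self.records[peer].get(cid, set())
        return found


dht = ToyDHT([f"peer-{i}".encode() for i in range(20)])
dht.provide(b"example-cid", b"peer-7")
dht.provide(b"example-cid", b"peer-12")
```

Because announcers and requesters compute the same closest peers from the CID alone, they meet at the same records without any central index.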
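Bitswap, IPFS’s data exchange protocol, manages the actual transfer of blocks between nodes. Peers exchange want-lists of the CIDs they are seeking, and each node keeps a ledger of the bytes it has sent to and received from every peer. The original design uses this ledger as a credit system that rewards nodes for sharing data and discourages freeloading.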
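The credit idea can be sketched as a per-peer ledger whose debt ratio drives the willingness to send. The formula below follows the strategy outlined in the original IPFS whitepaper as best understood here; treat it as an assumption about one possible strategy, since deployed implementations are not guaranteed to apply it. The class name and attributes are hypothetical.

```python
import math


class Ledger:
    """Bytes exchanged with one peer, from the local node's point of view."""

    def __init__(self):
        self.bytes_sent = 0
        self.bytes_recv = 0

    def debt_ratio(self) -> float:
        # High ratio: we have given this peer far more than it has given us.
        return self.bytes_sent / (self.bytes_recv + 1)

    def send_probability(self) -> float:
        # Sigmoid that stays near 1 for balanced peers and falls towards 0
        # as the peer's debt grows (assumed shape, see the lead-in).
        r = self.debt_ratio()
        return 1 - 1 / (1 + math.exp(6 - 3 * r))


newcomer = Ledger()

balanced = Ledger()
balanced.bytes_sent, balanced.bytes_recv = 1000, 1000

freeloader = Ledger()
freeloader.bytes_sent, freeloader.bytes_recv = 4000, 1000
```

Under these assumed parameters a new or balanced peer is almost always served, whilst a peer that has taken roughly four times what it has given is served very rarely.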
Pinning determines data persistence. By default, IPFS nodes may garbage-collect data they have retrieved but no longer need. Pinning instructs a node to retain specific content until it is explicitly unpinned. Centralised pinning services (Pinata, Infura) and decentralised storage networks (Filecoin, Crust) provide persistence guarantees beyond what individual node operators can offer.
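The interaction between caching, pinning, and garbage collection can be modelled in a few lines. This is a simplified in-memory model with hypothetical names, not the behaviour of any specific IPFS implementation; in particular, a real node asked to pin content it does not hold would normally fetch it first, whereas this toy raises an error.

```python
class Node:
    def __init__(self):
        self.blocks = {}      # cid -> bytes held locally
        self.pinned = set()   # cids exempt from garbage collection

    def cache(self, cid, data):
        self.blocks[cid] = data

    def pin(self, cid):
        if cid not in self.blocks:
            raise KeyError("cannot pin content this node does not hold")
        self.pinned.add(cid)

    def unpin(self, cid):
        self.pinned.discard(cid)

    def garbage_collect(self):
        """Drop every block that is not pinned and report what was removed."""
        removed = [cid for cid in self.blocks if cid not in self.pinned]
        for cid in removed:
            del self.blocks[cid]
        return removed


node = Node()
node.cache("cid-a", b"kept")
node.cache("cid-b", b"transient")
node.pin("cid-a")
removed = node.garbage_collect()
```

After collection only the pinned block survives, which is why content that nobody pins can disappear from the network even though its CID remains valid.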
IPFS is extensively used for storing NFT metadata and media, hosting DApp front ends, and deploying decentralised websites. Its content-addressing model is particularly well suited to NFT applications: when a token references its media by CID, the media cannot be altered after minting without the change being detectable, because modified content yields a different hash.
Filecoin: Incentivised Storage
Filecoin, also developed by Protocol Labs, adds an economic incentive layer atop IPFS infrastructure. Where IPFS provides the network protocol for content distribution, Filecoin creates a marketplace where storage providers are compensated for reliably storing data over specified time periods.
Storage deals are the fundamental unit of the Filecoin economy. A client specifying storage requirements (data size, duration, redundancy) is matched with storage providers willing to host the data at agreed prices. The deal terms are recorded on the Filecoin blockchain, creating an enforceable contract between client and provider.
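The deal lifecycle described above (requirements, matching, recorded terms) can be sketched with two small record types. Everything here is hypothetical: the field names, the cheapest-first matching rule, and the pricing unit are illustrative assumptions and do not reflect Filecoin’s actual deal format or market mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ask:
    provider: str
    price_per_gib_epoch: float
    max_duration_epochs: int


@dataclass(frozen=True)
class Deal:
    client: str
    provider: str
    size_gib: float
    duration_epochs: int
    total_price: float


def match_deal(client, size_gib, duration_epochs, asks, replicas=1):
    """Pick the cheapest providers able to cover the requested duration."""
    eligible = [a for a in asks if a.max_duration_epochs >= duration_epochs]
    eligible.sort(key=lambda a: a.price_per_gib_epoch)
    if len(eligible) < replicas:
        raise ValueError("not enough providers meet the requested duration")
    return [
        Deal(client, a.provider, size_gib, duration_epochs,
             a.price_per_gib_epoch * size_gib * duration_epochs)
        for a in eligible[:replicas]
    ]


asks = [
    Ask("sp-a", 0.002, 1000),
    Ask("sp-b", 0.001, 500),    # cheapest, but cannot cover 800 epochs
    Ask("sp-c", 0.0015, 2000),
]
deals = match_deal("alice", size_gib=10, duration_epochs=800, asks=asks, replicas=2)
```

Requesting two replicas produces two independent deals, one per provider, which is how redundancy becomes an explicit and priced parameter rather than an implicit property of one operator’s infrastructure.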
Proof of Replication (PoRep) demonstrates that a storage provider has created a unique physical copy of the client’s data. This proof prevents providers from claiming to store data they have not actually replicated, ensuring genuine redundancy.
Proof of Space-Time (PoSt) provides ongoing verification that stored data remains available over time. Providers must periodically generate cryptographic proofs that they continue to hold the data, with failure to prove resulting in collateral slashing — the forfeiture of tokens staked as performance guarantees.
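The combination of periodic proving and collateral slashing can be illustrated with a much simpler stand-in for PoSt. In this sketch the client precomputes one hash challenge per proving window before handing the data over, and a missed or wrong answer reduces the provider’s collateral. Filecoin’s actual proofs rely on sealed sectors and succinct cryptographic proofs verified on-chain; the class, parameters, and penalty rule below are hypothetical.

```python
import hashlib
import secrets


def respond(nonce: bytes, data: bytes) -> bytes:
    """Answer a challenge; only someone holding the data can compute this."""
    return hashlib.sha256(nonce + data).digest()


class Audit:
    def __init__(self, data: bytes, windows: int, collateral: int, penalty: int):
        # Precompute one challenge per window whilst the client still has the data.
        self.challenges = []
        for _ in range(windows):
            nonce = secrets.token_bytes(16)
            self.challenges.append((nonce, respond(nonce, data)))
        self.collateral = collateral
        self.penalty = penalty
        self.window = 0

    def next_nonce(self) -> bytes:
        return self.challenges[self.window][0]

    def submit(self, answer) -> bool:
        """Check this window's answer (None means no proof); slash on failure."""
        expected = self.challenges[self.window][1]
        ok = answer == expected
        if not ok:
            self.collateral = max(0, self.collateral - self.penalty)
        self.window += 1
        return ok


data = b"client archive" * 1000
audit = Audit(data, windows=3, collateral=100, penalty=40)
first = audit.submit(respond(audit.next_nonce(), data))            # honest proof
second = audit.submit(None)                                        # missed window
third = audit.submit(respond(audit.next_nonce(), b"corrupted"))    # data lost
```

The nonce for each window is unpredictable, so a provider cannot discard the data and replay an earlier answer; it must still hold the bytes when the window opens.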
Filecoin’s economics create a competitive storage marketplace. Storage prices are determined by supply and demand — when storage capacity exceeds demand, prices fall; when demand outstrips capacity, prices rise. This market-driven pricing has produced storage costs that are competitive with centralised alternatives for many use cases, particularly for archival and cold storage applications.
Arweave: Permanent Storage
Arweave takes a fundamentally different approach to decentralised storage: rather than offering time-limited storage deals, Arweave provides permanent storage through a single upfront payment. Data stored on Arweave is intended to remain accessible indefinitely — not for years or decades, but forever.
This permanence claim rests on a distinctive economic model. The upfront storage fee exceeds current storage costs, with the surplus placed in an endowment that is drawn down over time to pay for ongoing storage. Because hardware costs decline, each unit of the endowment funds progressively more storage. Arweave’s economic modelling therefore depends on storage costs continuing to fall: the historical decline has been approximately 30% per year, and the endowment can sustain storage indefinitely only if future declines stay at or above the rate assumed when the fee was priced.
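Why a finite upfront fee can in principle cover unbounded storage time follows from a geometric series. If the yearly cost of storing a fixed amount of data falls by a constant fraction d each year, the total over all future years converges to 1/d times the first year’s cost. The sketch below computes that multiple; the decline rates are illustrative inputs, not Arweave’s pricing parameters, and the model ignores token price movements, replication, and any other factor.

```python
def perpetual_cost_multiple(annual_decline: float) -> float:
    """Total cost of storing forever, as a multiple of the first year's cost.

    Year t costs (1 - d)**t of year 0, so the total is the geometric
    series 1 + (1 - d) + (1 - d)**2 + ... = 1 / d for 0 < d <= 1.
    """
    if not 0 < annual_decline <= 1:
        raise ValueError("decline rate must be in (0, 1]")
    return 1 / annual_decline


def cost_over_years(annual_decline: float, years: int) -> float:
    """The same sum truncated to a finite horizon, for comparison."""
    return sum((1 - annual_decline) ** t for t in range(years))


at_30_percent = perpetual_cost_multiple(0.30)
at_5_percent = perpetual_cost_multiple(0.05)
```

At a sustained 30% decline, perpetual storage costs about 3.3 times the first year’s cost; at 5% it costs 20 times. The sensitivity of the result to the assumed rate is the core of the debate over whether the model holds in the long run.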
The Permaweb — Arweave’s application layer — enables permanent websites, permanent documents, and permanent applications. Content deployed to the Permaweb remains accessible at its original address regardless of the deploying entity’s continued existence or willingness to maintain the content.
Bundling protocols (notably Irys, formerly Bundlr) provide scalable data ingestion, batching multiple storage transactions for efficient on-chain settlement. These protocols have made Arweave practical for applications generating high volumes of small data objects — social media posts, IoT sensor readings, and real-time data streams.
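The batching idea can be sketched independently of any specific bundler. In this toy, many small items are collected and settled as a single record once a threshold is reached, so the number of settlement transactions grows with the number of bundles rather than the number of items. The class, the threshold, and the bundle identifier scheme are hypothetical and do not describe the Irys protocol or Arweave’s bundle format.

```python
import hashlib


class Bundler:
    def __init__(self, max_items: int):
        self.max_items = max_items
        self.pending = []
        self.settled = []   # stand-in for on-chain settlement transactions

    def submit(self, item: bytes) -> str:
        item_id = hashlib.sha256(item).hexdigest()
        self.pending.append((item_id, item))
        if len(self.pending) >= self.max_items:
            self.flush()
        return item_id

    def flush(self):
        """Settle everything pending as one bundle; return its identifier."""
        if not self.pending:
            return None
        ids = [item_id for item_id, _ in self.pending]
        bundle_id = hashlib.sha256("".join(ids).encode()).hexdigest()
        self.settled.append({"bundle_id": bundle_id, "item_ids": ids})
        self.pending = []
        return bundle_id


bundler = Bundler(max_items=100)
for i in range(250):
    bundler.submit(f"sensor-reading-{i}".encode())
bundler.flush()   # settle the final partial batch
```

Here 250 items result in three settlement records, and each item keeps its own identifier so it can still be referenced individually.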
Arweave’s permanence model is particularly relevant for applications where data longevity is paramount: legal documents, cultural archives, scientific datasets, and historical records. For NFT art, Arweave’s permanence guarantee addresses the concern that NFT media stored on less permanent systems might become inaccessible over time.
Comparative Analysis
The choice between decentralised storage systems involves trade-offs across several dimensions.
Cost structure — IPFS is free for data distribution but provides no persistence guarantees. Filecoin charges recurring storage fees competitive with cloud storage for cold data. Arweave charges a single upfront fee that is higher than short-term alternatives but potentially economical for data requiring decades of storage.
Persistence guarantees — IPFS provides none natively (data persists only as long as nodes choose to host it). Filecoin guarantees persistence for the deal duration, typically measured in months or years. Arweave targets permanent persistence, though the practical meaning of “permanent” in a system that has existed for less than a decade remains a matter of reasonable debate.
Performance — IPFS provides the fastest retrieval for popular content (cached across many nodes) but slower retrieval for rare content. Filecoin prioritises storage reliability over retrieval speed, with retrieval requiring unsealing operations that introduce latency. Arweave provides consistent retrieval performance through its gateway infrastructure.
Decentralisation — All three systems are more decentralised than centralised cloud storage, but the degree varies. IPFS’s open participation model allows anyone to run a node. Filecoin’s hardware requirements (significant storage capacity and computational power) create barriers that concentrate participation among professional operators. Arweave’s mining requirements similarly favour well-resourced participants.
Enterprise Adoption
Enterprise adoption of decentralised storage is driven by specific use cases where decentralisation provides advantages beyond ideological preference.
Compliance and audit trails — Immutable, timestamped storage of compliance documentation provides audit trails that regulators trust precisely because they cannot be retrospectively modified. Financial institutions subject to record-keeping requirements are exploring decentralised storage for compliance documentation.
Data sovereignty — Organisations operating across jurisdictions face conflicting data residency requirements. Decentralised storage with geographic pinning controls allows data distribution that satisfies multiple jurisdictions’ requirements simultaneously.
Censorship resistance — Media organisations, human rights organisations, and whistleblower platforms use decentralised storage to ensure that published content cannot be removed through pressure on hosting providers.
Cost optimisation — For archival data with infrequent access requirements, decentralised storage can be more cost-effective than cloud provider archival tiers, particularly when considering the multi-decade storage horizons common for regulatory compliance.
Swiss enterprises benefit from the country’s robust data protection framework, the Federal Act on Data Protection (FADP), when implementing decentralised storage. The legal clarity around data handling, combined with proximity to Web3 infrastructure development teams, provides advantages for enterprises navigating the technical and regulatory dimensions of decentralised storage adoption.
Challenges
Data availability in purely decentralised systems depends on economic incentives remaining sufficient to motivate storage providers. If token values decline or storage market dynamics shift, providers may exit the network, potentially compromising data availability.
Data retrieval performance remains inferior to optimised centralised solutions for latency-sensitive applications. CDN-backed cloud storage typically delivers cached content within milliseconds to tens of milliseconds; decentralised systems typically take hundreds of milliseconds to several seconds.
Regulatory compliance — particularly data deletion requirements under GDPR and similar frameworks — conflicts with immutable storage models. The right to erasure is conceptually incompatible with permanent storage, requiring architectural solutions (encryption with key deletion, off-chain data with on-chain pointers) that add complexity.
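The second architectural pattern mentioned above, off-chain data with on-chain pointers, can be sketched as an append-only ledger that records only content hashes alongside a separate store that can delete the underlying records. The classes are hypothetical and in-memory. Note that a plain hash of low-entropy personal data may be guessable, so a real design would likely add a secret salt, and whether a residual hash still counts as personal data is a legal question this sketch does not settle.

```python
import hashlib


class PointerLedger:
    """Append-only list of content hashes; entries are never removed."""

    def __init__(self):
        self._entries = []

    def append(self, digest: str) -> int:
        self._entries.append(digest)
        return len(self._entries) - 1

    def get(self, index: int) -> str:
        return self._entries[index]


class ErasableStore:
    def __init__(self, ledger):
        self.ledger = ledger
        self._data = {}   # digest -> bytes, held off-chain and deletable

    def put(self, record: bytes) -> int:
        digest = hashlib.sha256(record).hexdigest()
        self._data[digest] = record
        return self.ledger.append(digest)

    def get(self, index: int):
        # Returns None once the record behind the pointer has been erased.
        return self._data.get(self.ledger.get(index))

    def erase(self, index: int):
        self._data.pop(self.ledger.get(index), None)


ledger = PointerLedger()
store = ErasableStore(ledger)
ref = store.put(b"customer record 42")
before = store.get(ref)
store.erase(ref)
after = store.get(ref)
```

The immutable layer retains proof that a record existed and what its hash was, whilst the content itself can be removed on request. This is the added complexity the paragraph above refers to.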
User experience for non-technical users remains poor. Interacting with decentralised storage systems requires understanding content hashes, pinning services, and gateway infrastructure — concepts alien to users accustomed to drag-and-drop cloud storage.
Outlook
Decentralised storage is transitioning from ideological alternative to practical infrastructure. The maturation of IPFS, Filecoin, and Arweave has produced systems capable of serving enterprise requirements, not merely crypto-native applications.
The most significant near-term development is likely the convergence of decentralised storage with traditional cloud infrastructure. Hybrid architectures — using decentralised storage for permanence and censorship resistance whilst leveraging centralised CDNs for performance — offer the practical advantages of both models. This convergence, rather than wholesale replacement, represents the most probable path to mainstream adoption.
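A hybrid read path can be sketched as a cache-first lookup that never trusts the cache. The reader tries a fast centralised cache, verifies the bytes against the content address, and falls back to the decentralised network when the cache misses or fails verification. Both backends are plain dictionaries here, SHA-256 stands in for the addressing scheme, and all names are hypothetical.

```python
import hashlib


def address_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class HybridReader:
    def __init__(self, cdn: dict, origin: dict):
        self.cdn = cdn         # fast, centralised cache (may be stale or tampered)
        self.origin = origin   # stand-in for a decentralised network

    def fetch(self, address: str) -> bytes:
        cached = self.cdn.get(address)
        if cached is not None and address_of(cached) == address:
            return cached
        data = self.origin[address]
        if address_of(data) != address:
            raise ValueError("origin returned content that fails verification")
        self.cdn[address] = data   # repopulate the cache for later reads
        return data


page = b"<html>permanent page</html>"
addr = address_of(page)
reader = HybridReader(cdn={}, origin={addr: page})
first_read = reader.fetch(addr)    # cache miss, served from the origin
reader.cdn[addr] = b"tampered"     # simulate a compromised cache entry
second_read = reader.fetch(addr)   # verification fails, origin is used again
```

Because every response is checked against its address, the centralised layer contributes speed without becoming a point of trust.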
For Web3’s broader development, decentralised storage is foundational infrastructure. Without reliable, permanent, censorship-resistant data storage, the promises of decentralised applications, identity systems, and governance structures rest on fragile centralised substrates. The maturation of decentralised storage removes one of the most significant infrastructure dependencies constraining Web3’s evolution.
Donovan Vanderbilt is a contributing editor at ZUG WEB3, the decentralised protocol intelligence publication of The Vanderbilt Portfolio AG, Zurich. He covers Web3 infrastructure, decentralised protocols, and the technical foundations of the decentralised internet.