TLDR: Zero Knowledge Proofs (ZKPs) unlock the ability for users to receive value for their personal data in a cost-effective, privacy-enabled way. zkTLS is one such application of ZKPs, enabling users to make existing (off-chain) data and credentials from web2 apps verifiable without revealing the contents of that data. We’re especially excited about a new category of crypto networks we see coalescing around users providing and receiving value from their data via ZKPs; we propose to call these DeData networks. Like DeFi and DePIN before them, DeData networks will enable new modalities of application experiences, capital market formation, and value exchange, with AI serving as the source of structural demand.
Users today are largely unable to receive value for the data they share online, despite it driving trillions of dollars in economic activity. Why not?
Data in web2 lives behind closed APIs operated by companies whose profit center comes from extracting value from this data.
Data in web3 is open but isn’t able to meaningfully capture value because it's public and isn’t cost-effective to query (requires gas).
But tides are shifting. Regulations such as GDPR now demand that companies give users ownership of their data. Businesses, specifically in web3, are beginning to provide incentives to users for their data. As these tailwinds grow stronger, users will recognize the power of their data and require tools to own and receive value from it. ZK (and blockchains) help solve this!
Zero knowledge proofs (or ZKPs) enable users to safely disclose specific properties about their data for a third party to verify without leaking unintended information. This unlocks the ability for users to capture value from their data, as it becomes verifiable while still being private. Real-life applications of ZKPs are rapidly emerging, from digital ID management for 3.6MM Argentinians to on-chain account recovery by email. One nascent application of ZKPs that has garnered attention recently is zkTLS: let’s explore what it is and why it serves as a stepping stone to a far bigger endgame.
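Before turning to zkTLS, it may help to see "verify without leaking" in miniature. The sketch below is a toy Schnorr proof of knowledge, made non-interactive with the Fiat-Shamir heuristic: the prover convinces a verifier that they know the secret behind a public key without sending the secret. This is an illustration only, not how zkTLS or any project mentioned in this piece is built. The modulus, base, and function names are assumptions picked for readability, and the parameters are not secure.

```python
import hashlib
import secrets

# Toy parameters, NOT secure: chosen only because they are easy to write down.
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # base for exponentiation
ORDER = P - 1    # exponents can be reduced mod P - 1 (Fermat's little theorem)


def challenge(public_key: int, commitment: int) -> int:
    """Fiat-Shamir: derive the challenge by hashing the public transcript."""
    digest = hashlib.sha256(f"{G}:{public_key}:{commitment}".encode()).digest()
    return int.from_bytes(digest, "big") % ORDER


def prove(secret: int) -> tuple[int, int]:
    """Prove knowledge of `secret` such that public_key = G**secret mod P."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(ORDER)          # fresh randomness hides the secret
    commitment = pow(G, nonce, P)
    response = (nonce + challenge(public_key, commitment) * secret) % ORDER
    return commitment, response


def verify(public_key: int, proof: tuple[int, int]) -> bool:
    """Check G**response == commitment * public_key**challenge (mod P)."""
    commitment, response = proof
    c = challenge(public_key, commitment)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P
```

A verifier who holds only `pow(G, secret, P)` can call `verify` on the pair returned by `prove(secret)`; the secret itself never leaves the prover. Production systems prove far richer statements (for example, "this API response contains a balance above X") using general-purpose proving systems rather than this single-purpose protocol.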
zkTLS, often referred to as “web proofs”, is an application of ZKPs that allows users to verifiably demonstrate the existence of personal data or credentials served by an API, such as those that power consumer websites or mobile apps, in a manner that ensures both privacy and authenticity (via an attestor/notary). For example, zkTLS allows a user to prove that they have a certain number of Twitter followers, or that their bank balance is over a certain amount, by generating a ZKP of the Twitter or Bank of America API response, respectively, produced in an SSL/TLS session. While the UX to generate these web proofs today is fairly clunky, it provides builders powerful new mechanisms to bootstrap:
User Base - zkTLS allows builders to leverage the giant trove of existing web2 credentials without needing to rely on an external API. For example, a new DEX might offer zero fees to users who can prove they have traded meaningful volume at Coinbase or Binance. A web3-powered social network might “vampire-attack” an existing social network by providing added rewards to influencers with a large audience. Nosh, a web3-powered food delivery network, is leveraging this to kickstart their network by enabling restaurants and drivers to port over their reputation data from platforms such as DoorDash.
Marketplaces - zkTLS allows builders to bootstrap custom marketplaces which offer improved UX, lower fees, and safer transfers on top of existing inventory. For example, ZKP2P is creating its own ticket reselling marketplace on top of Ticketmaster using ZKPs and an on-chain smart contract as escrow. Further, zkTLS introduces new capabilities for these marketplaces to stitch together web2 + web3 interactions, such as accepting Venmo payments for tickets (users can generate a proof that they sent a Venmo payment, which must be verified on-chain before the inventory is released).
Networks — zkTLS allows users to contribute verifiable proofs of their existing web2 data and credentials to networks in exchange for rewards (we cover this more in Passive DeData Networks below).
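The escrow flow from the Marketplaces example above can be sketched as a small state machine: a ticket sits in escrow and is released only after a payment proof verifies, matches the listing, and has not been used before. This is a minimal sketch under stated assumptions, not ZKP2P's actual contract. `PaymentProof`, `TicketEscrow`, and their fields are hypothetical names, and proof verification is passed in as a callable because a real web-proof verifier is out of scope here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PaymentProof:
    """Hypothetical public outputs of a web proof over a payment receipt."""
    payment_id: str     # unique ID of the off-chain payment
    recipient: str      # payee handle disclosed by the proof
    amount_cents: int   # amount disclosed by the proof
    proof_blob: bytes   # opaque proof bytes, checked by the injected verifier


@dataclass
class Listing:
    seller: str
    price_cents: int
    owner: str          # "escrow" until the ticket is released


class TicketEscrow:
    def __init__(self, verify: Callable[[PaymentProof], bool]):
        self._verify = verify                      # stand-in for on-chain proof verification
        self.listings: dict[str, Listing] = {}
        self._spent_payments: set[str] = set()     # blocks reuse of one payment

    def list_ticket(self, ticket_id: str, seller: str, price_cents: int) -> None:
        self.listings[ticket_id] = Listing(seller, price_cents, owner="escrow")

    def release(self, ticket_id: str, buyer: str, proof: PaymentProof) -> None:
        listing = self.listings[ticket_id]
        if listing.owner != "escrow":
            raise ValueError("ticket already released")
        if not self._verify(proof):
            raise ValueError("payment proof failed verification")
        if proof.payment_id in self._spent_payments:
            raise ValueError("payment proof already used")
        if proof.recipient != listing.seller or proof.amount_cents < listing.price_cents:
            raise ValueError("payment does not match listing")
        self._spent_payments.add(proof.payment_id)
        listing.owner = buyer
```

Tracking spent payment IDs is the design choice worth noting: without it, a single genuine payment proof could be replayed to claim several tickets.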
While zkTLS enables application builders to innovate especially on GTM, there’s a novel, much larger area of opportunity ZKPs enable that we’re especially excited by: token-enabled networks with data as the underlying primitive. These networks combine the magical properties of ZK along with the core unlock of blockchains (permissionless value exchange) to enable users to create and earn meaningful value from their data.
The birth of decentralized finance networks, or DeFi, coalesced around a singular and simple premise: provide users the ability to receive more value by bringing a previously off-chain asset, in this case money, on-chain — thanks to capital market formation and coordination mechanisms (including the ability to speculate) enabled by blockchains. Similarly, decentralized physical infrastructure networks, or DePIN, provide users the ability to receive more value by bringing a previously off-chain asset (physical infrastructure such as GPU chips or internet bandwidth) on-chain and de-risking these networks by bootstrapping the supply side.
Continuing this trend, we believe a new type of network will emerge that provides users the ability to receive more value by bringing a previously off-chain asset — their personal data — on-chain. We propose to name these networks decentralized data, or DeData, networks. The idea of coordinating around user-owned data isn’t a new concept (DataDAOs have been around since at least 2022). In our view, these types of networks weren’t previously successful for two primary reasons:
Lack of Demand – Marketplaces end up being extremely difficult to create (and sustain) without overwhelming demand, often in the form of a well-packaged consumer product. As much as we might think our personal data would be valuable to a person or business, there historically hasn’t been meaningful demand for it.
Lack of Privacy – Prior incarnations didn’t enable users to keep their data private (which erodes durable value capture), adding friction to a supply side that had demand-side challenges to begin with.
So why is now different? In short, AI and ZKPs. DeData networks will leverage ZKPs as the core primitive, or node, of the network and operate as a marketplace with AI being the new structural source of demand:
The “supply side” will consist of ZKPs per piece of user data either individually persisted on-chain or persisted off-chain with the aggregated proof persisted on-chain. Because of the verifiability ZKPs provide, it is now possible to bootstrap the supply side via token incentives similar to DeFi or DePIN. One way to frame this is to think of data as another form of liquidity. DeData protocols could create liquidity pools similar to DeFi around specific pockets of data to properly price the network density required for specific demand-side needs. In this design, users who submit data to these networks may receive a similar LP-style token that shows ownership of this data, provides the ability to claim rewards (or speculate on the amount of rewards), and enables them to ‘opt-out’ by burning this token.
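One way to picture the LP-style token described above is the share accounting commonly used by liquidity pools: contributors receive shares for verified proofs, rewards accrue pro rata, and burning the shares opts the contributor out. The sketch below is a hypothetical, off-chain model of that accounting, not a specification of any DeData protocol. `DataPool` and its method names are assumptions, and "one share per verified proof" is a simplification.

```python
from fractions import Fraction


class DataPool:
    """Toy pro-rata reward accounting for contributors to a data pool."""

    def __init__(self) -> None:
        self.shares: dict[str, int] = {}
        self.total_shares = 0
        self._reward_per_share = Fraction(0)         # cumulative rewards per share
        self._checkpoint: dict[str, Fraction] = {}   # accumulator value at last settle
        self._pending: dict[str, Fraction] = {}      # settled but unclaimed rewards

    def _settle(self, owner: str) -> None:
        """Move rewards accrued since the owner's last checkpoint into pending."""
        held = self.shares.get(owner, 0)
        since = self._reward_per_share - self._checkpoint.get(owner, self._reward_per_share)
        self._pending[owner] = self._pending.get(owner, Fraction(0)) + held * since
        self._checkpoint[owner] = self._reward_per_share

    def contribute(self, owner: str, verified_proofs: int) -> None:
        """Mint one share per verified proof (assumed already verified)."""
        if verified_proofs <= 0:
            raise ValueError("must contribute at least one proof")
        self._settle(owner)
        self.shares[owner] = self.shares.get(owner, 0) + verified_proofs
        self.total_shares += verified_proofs

    def distribute(self, amount: int) -> None:
        """Spread a reward across all outstanding shares."""
        if self.total_shares == 0:
            raise ValueError("no shares outstanding")
        self._reward_per_share += Fraction(amount, self.total_shares)

    def claim(self, owner: str) -> Fraction:
        self._settle(owner)
        return self._pending.pop(owner, Fraction(0))

    def burn(self, owner: str) -> Fraction:
        """Opt out: pay any remaining rewards and destroy the owner's shares."""
        payout = self.claim(owner)
        self.total_shares -= self.shares.pop(owner, 0)
        self._checkpoint.pop(owner, None)
        return payout
```

After a burn, later rewards flow only to the remaining shareholders, which mirrors the opt-out behavior described in the paragraph above.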
The “demand side” will consist of entities who are willing to pay to query the network’s verifiable data (likely off-chain via ZK co-processors). We believe the Cambrian explosion of LLMs (both foundational and narrow) as well as autonomous AI agents such as truth_terminal offer the missing demand that these networks have historically struggled with. For example, the size of the data set to train Llama 3 is ~15T tokens (roughly equivalent to the corpus of publicly available data). In a world where agents and models must specialize to compete, the training data must also become specialized. As AI quickly exhausts publicly available training sets, there will be increased demand to access specific niches of private data (~600-1200T tokens) to differentiate. In fact, we suspect the most successful networks will be the ones focused around a well-defined niche in order to achieve the level of data density needed for the demand side to find it valuable.
DeData protocols will likely support some form of multi-touch attribution (MTA) by disseminating revenue amongst the nodes of the network that were accessed by the demand side. For example, if a query accessing 90 ZKPs on a protocol with a 10% take rate generates $100, $10 would go to the protocol and $1 might be dispersed to each of the owners of the ZKPs accessed.
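The arithmetic in this example can be written out as a short function. This is a sketch of one possible attribution rule (an equal split per accessed proof), not a prescribed mechanism. The function name, the use of integer cents and basis points, and the choice to give rounding dust to the protocol are all assumptions made for illustration.

```python
from collections import Counter


def split_revenue(revenue_cents: int, take_rate_bps: int, accessed_owners: list[str]):
    """Split query revenue between the protocol and owners of accessed proofs.

    `accessed_owners` has one entry per accessed proof, so an owner whose
    data was touched several times appears several times.
    """
    if not accessed_owners:
        raise ValueError("query accessed no proofs")
    protocol_fee = revenue_cents * take_rate_bps // 10_000
    owner_pool = revenue_cents - protocol_fee
    per_proof, dust = divmod(owner_pool, len(accessed_owners))

    payouts = Counter()
    for owner in accessed_owners:
        payouts[owner] += per_proof

    # Assumption: any indivisible remainder goes to the protocol.
    return protocol_fee + dust, dict(payouts)


# The example from the text: $100 query, 10% take rate, 90 proofs accessed.
fee, payouts = split_revenue(10_000, 1_000, [f"owner-{i}" for i in range(90)])
# fee is 1_000 cents ($10); each of the 90 owners receives 100 cents ($1).
```

Working in integer cents avoids floating-point rounding, and a real protocol might instead weight payouts by how much each proof contributed to the query result.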
DeFi primitives such as trading, derivatives, and prediction markets will be created on top of these networks powered by blockchains. Trading is specifically interesting because it is only possible as a result of a core benefit of ZKPs: privacy. A user might trade the network ownership of a particular piece of data to someone else to earn yield without needing to reveal the data itself.
One category of DeData networks will be based on verifiable, non-native data that users contribute to the network via tools such as zkTLS. We call this category of DeData networks passive because users can passively contribute their data to the network from third-party applications, rather than from a native application they are actively using. For example, Vana is creating an ecosystem of these networks where users can contribute their data from Reddit, Twitter, or Tinder to various networks to earn rewards.
We’re excited to see what types of networks in this category get built and what data proves to be valuable. Some interesting ideas include:
Networks built on consumer finance data such as your shopping history. For example, imagine a shopping LLM that enables users to find exactly what they’re looking for while earning $ from surfacing personalized ads thanks to its Chrome extension which auto-generates zkTLS proofs of your past and ongoing online purchases from Amazon, Target, etc.
Networks built on health data from Apple Health, third-party fitness or sleep trackers, etc.
Dating app that leverages web2 credentials and data to offer a more personalized matching algorithm
“Your margin is my opportunity” — Jeff Bezos
In contrast to passive DeData networks, active DeData networks are vertically integrated such that all user-owned data that comprises the network is created from an application natively and explicitly designed to power this network. For example, a founder today building a health wearables or clinical trials company may design this as a DeData network where user data is stored on-chain as openly attributable ZKPs from day one. The benefits of this approach:
Revenue share with users (compared to existing competitors who capture 100% of revenue) in native tokens, aligning incentives by providing your users equity in the network
Bootstrap supply side with token incentives
Provide transparency of demand side so users know who their data is being sold to
Allow users to opt-out of data sharing
Minimize risk of data leak
While likely further out on the time horizon than passive DeData networks, I expect active DeData networks to become the big winners of this space. Users ultimately want to use apps which provide value to them. If they can do this AND receive incremental value in the form of rewards, all the better!
Thanks to ZKPs and the rise of AI, DeData networks offer a nascent, yet immense area for innovation. There are many challenges and unknowns, including identifying what types of data will monetize effectively, staleness issues, and marketplace dynamics. Even so, we at ZKsync, as OG builders, researchers, and stewards of all things ZK, are incredibly excited about the potential of these networks and the applications they will support.
If you are building in this area and/or this piece resonated with you, we’d love to hear from you and support you in building your app/network on ZKsync Era! Please DM me directly on TG (@karthiksenthil) or email me at kse@matterlabs.dev.
Special thanks to Porter and Alex G from Matter Labs, Regan + Mike + Pierre from Lattice, Tracy from Pluto, Richard from ZKP2P, Kyle from Multicoin, Ismael from LaGrange, Mike from Blockworks, Marek and Hubert from vlayer, Sal from EV3, and Aaron from Telah for their valuable feedback and contributions to this piece.