Announcing Spice.xyz — Data and AI infrastructure for web3
Spice is data and AI infrastructure for web3. Now available in Preview 🚀
Designed for apps and ML, Spice enables a new class of data-driven applications
Data is fundamental, but web3 data is hard to get and use
Developers can query web3 data with SQL over high-performance Apache Arrow APIs
Data and AI infrastructure is the first step in building an intelligent application platform
Ironically, while most blockchains are public and open, extracting data from them to power apps and ML is painfully difficult. If data is the new oil, then trying to use blockchain data in apps is like trying to fill up your car with a dinosaur. To get meaningful quantities of useful data, you are forced to build and operate massively complex infrastructure.
That means blockchain nodes, big data systems, ETL and ML pipelines, data lakes, data warehouses, and query systems. You also have to understand virtual-machine logs, the inner workings of smart contracts, and their state across a huge array of blockchains, scaling layers, and dApps.
It’s complex, error-prone, and costs massive resources and engineering time that have nothing to do with your core business.
One of the largest U.S. crypto hedge funds told us they have no choice but to source on-chain data externally. It’s simply not feasible to build and operate all the required infrastructure, even at their scale. It’s a notable admission for an industry notorious for keeping everything in-house.
We’re also seeing increasing demand for data to power AI for NFT marketplace recommendations, web3-native marketing, social network analysis, financial and trading analysis, fraud and security monitoring, and many others.
Developers need a solution for applications and ML because they can’t build data-driven apps and experiences without data.
So we built Spice.xyz.
Spice is data and AI infrastructure for web3.
It’s web3 data made easy.
And purpose-designed for applications and ML.
Create data-driven experiences
Imagine this. A user visits an NFT marketplace like OpenSea or GameStop’s new NFT Marketplace. Their wallet has built-in AI-driven security and fraud monitoring and displays a verification seal, giving them the peace of mind to connect.
After connecting, on their first visit to the site, they are shown a curated set of relevant NFTs based on their public wallet history. Wow! They can discover their next great NFT and get smart info, like an AI-driven estimate of its value.
They purchase some cool NFT sunglasses and are excited to use them in their next game. Because the NFT was minted on Immutable’s platform, they can take those same glasses to a completely different game or metaverse.
Combining public, private, and cross-chain data enables compelling experiences across sites, games, and metaverses. The unique data-native aspect of web3 and blockchain applications aligns perfectly with building intelligent data and AI-driven applications.
And building these experiences is possible when you leverage a data and AI platform designed for applications and ML.
How it works
Spice delivers data over high-performance Apache Arrow APIs to your application, notebook, or ML pipeline.
SQL queries can span multiple chains and even your own datasets.
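As a rough illustration of what a query spanning multiple chains and your own dataset can look like, here is a minimal sketch that uses Python's built-in sqlite3 module as a local stand-in. It is not the Spice API or schema: the table names (eth_transfers, polygon_transfers, my_users), the columns, and the sample rows are all hypothetical and exist only to show the shape of the SQL.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in.
# All table names, columns, and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eth_transfers (to_address TEXT, amount REAL);
    CREATE TABLE polygon_transfers (to_address TEXT, amount REAL);
    CREATE TABLE my_users (wallet TEXT, name TEXT);
    INSERT INTO eth_transfers VALUES ('0xaaa', 1.5), ('0xbbb', 2.0);
    INSERT INTO polygon_transfers VALUES ('0xaaa', 10.0), ('0xccc', 4.0);
    INSERT INTO my_users VALUES ('0xaaa', 'alice'), ('0xbbb', 'bob');
""")

# One query that unions two chains and joins them to a private dataset.
query = """
    SELECT u.name, t.chain, SUM(t.amount) AS total
    FROM my_users AS u
    JOIN (
        SELECT 'ethereum' AS chain, to_address, amount FROM eth_transfers
        UNION ALL
        SELECT 'polygon' AS chain, to_address, amount FROM polygon_transfers
    ) AS t ON t.to_address = u.wallet
    GROUP BY u.name, t.chain
    ORDER BY u.name, t.chain
"""
results = conn.execute(query).fetchall()

for name, chain, total in results:
    print(name, chain, total)
```

The wallet 0xccc appears on-chain but not in the private my_users table, so the join drops it. This is the kind of filtering that combining public and private data makes possible in a single query.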
We’re also working on applying ML to create out-of-the-box intelligent datasets, like web3 cohorts, community identification, and social network analysis: areas of massive, untapped value.
Built with a foundation of Apache Arrow and Flight
Apache Arrow is a specification for an in-memory columnar data format that’s very efficient for analytics operations. Arrow’s zero-copy read semantics, coupled with the Flight client-server framework, mean extremely fast and efficient data transport and access without serialization overhead.
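To make the columnar and zero-copy ideas concrete, here is a small sketch that uses only the Python standard library (array and memoryview). It is not Arrow itself and does not use any Arrow API. The sample data is made up, and the sketch only shows the general idea of keeping each field in its own contiguous buffer and reading it without copying.

```python
from array import array

# Row-oriented layout: one tuple per record (hypothetical sample data).
rows = [(1, 0.5), (2, 1.25), (3, 2.0), (4, 0.25)]

# Column-oriented layout: one contiguous typed buffer per field.
block_numbers = array("q", [r[0] for r in rows])  # 64-bit signed integers
values = array("d", [r[1] for r in rows])         # 64-bit floats

# Aggregating one column reads only that column's buffer.
total_value = sum(values)

# A memoryview exposes the column's buffer without copying it,
# and slicing a memoryview is also zero-copy.
view = memoryview(values)
tail = view[2:]

# Writing to the underlying buffer is visible through the view,
# which shows that the slice shares memory with the column.
values[2] = 10.0
shares_memory = tail[0] == 10.0

print(total_value, shares_memory)
```

An analytics operation such as a sum over one field only needs to scan that field's buffer, which is the property that makes columnar layouts efficient for this kind of workload.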
Spice is built with a foundation of Arrow and Flight. You benefit from the same high-performance data operations, and you can leverage the Apache Arrow ecosystem of popular projects like pandas, NumPy, and Apache Spark.
“XMTP is enabling web3 messaging between wallets. Spice enables us to leverage the rich Python ecosystem of data science and ML tools to build the best web3 data-driven experiences possible.”
Accelerated data access on an open lakehouse platform
A real-time data processing architecture is critical for next-generation intelligent applications. Spice uses an open lakehouse platform with Apache Parquet files stored in cloud data lake storage (S3, ADLS, Google Cloud Storage) combined with time-series databases. Data from blockchain nodes is coalesced and consolidated into the open lakehouse platform, enabling high-performance queries against very large data volumes with subsecond responses. Data is accessed from the platform by elastic compute engines such as Apache Spark (batch), Python (ML), Apache Kafka (streaming), and Dremio (SQL).
Current solutions don’t meet the needs of apps and ML
Today, developers often use Dune Analytics ($80M raised, $1B valuation), an awesome product that we love, but one that’s designed for analytics and not specifically for applications and ML.
Similarly, The Graph ($70M raised, $7B market cap) is also an awesome project that we even use internally. Still, it is not designed for real-time, low-latency applications or fetching millions of rows of data.
Alchemy ($560M raised, $10B valuation) and Infura ($725M raised, $7B valuation) are great platforms (we’re Alchemy Certified Infrastructure) that provide node and point data APIs. They are great for dApps and basic analytics, but are not designed for bulk data and machine learning.
These services are fantastic building blocks, but we need a solution purpose-designed for apps and ML.
Common use-cases with new challenges
Consider how marketing, gaming, marketplaces, security, financial services, trading, social networks, investing, and communications all need access to data. Data-driven solutions in these fields are already common and assumed in web2 but have entirely new challenges in web3. Every company will need easy access to blockchain data to meet these challenges.
We’re working with XMTP on web3-native marketing, wallet-to-wallet security, social network analysis, and community recommendations. These are areas with mature solutions in web2 but with significant challenges in the pseudonymous world of web3. They also apply to marketplaces, social networks, wallets like MetaMask or Brave Wallet, and more.
For example, take one very common scenario today: email newsletters. For a DAO to send each of its members a message through XMTP, it needs a bulk data-friendly way to query for its member addresses.
Similarly, metastitch is building web3-native marketing and NFT analytics, so data is critical to their business. Using Spice combined with popular Python data science libraries like pandas and NumPy, they built a new class of NFT analytics in less than a week.
Investors, hedge funds, and traders familiar with high-speed data feeds might be surprised that many crypto funds can only use manual queries and CSV files today. Top crypto funds we’re working with love Spice’s data science, ML, and time-series focus, along with its high-performance APIs and best-in-class data granularity. Other performance- and latency-sensitive fields like financial services, gaming, or security can also benefit.
The ability to fetch big historical datasets for ML training and, at the same time, be alerted in real time to new events is important to Yakoa, which is building a suite of NFT services, like a real-time NFT authenticity screener.
The case for data + AI was made successfully in web2 by massive data companies like Databricks. The same case applies to web3 as solutions for security, analytics, fraud detection, supply chain monitoring, trading, e-commerce, and recommendations are built.
We’re excited about the future of data-native, intelligent applications!
We’re growing the team and are focused on shipping value with Spice every week.
The roadmap includes enhanced NFT, IPFS, and ENS support, more blockchains including Solana and Binance Smart Chain (BSC), and of course, AI-powered, intelligent datasets.
We’re building the future of AI-driven apps
We’re building an application platform that combines the open-source Spice.ai decision engine with “ML-ready” data, enabling developers to build the next generation of intelligent applications.
Spice.xyz is an important step toward that vision and just the start. We are working to empower developers with the tools and data to build intelligent applications across web2, web3, and web4 😉!
Join us in building the next generation of data and AI-driven applications!
The Spice AI Team
About Spice AI
Founded in June 2021 by Microsoft and GitHub alumni Luke Kim and Phillip LeBlanc, Spice AI creates technology to help developers build intelligent apps that learn and adapt.
Before co-founding Spice AI, Luke was the co-creator of Azure Incubations in the Office of the Azure CTO, where he led cross-functional engineering teams to create and develop technologies like Dapr and OAM.
Phillip was an engineering manager and IC working on distributed systems at GitHub and Microsoft and has contributed to services developers use every day, including GitHub Actions, Azure Active Directory, and Visual Studio App Center.