Chutes
SN64 | Run any AI model instantly without managing servers: serverless AI compute at scale
The AWS Lambda of Bittensor. Pick any AI model, call an API endpoint, and Chutes handles the GPUs, load balancing, scaling, and infrastructure. 46+ models available, OpenAI-compatible API, used by multiple other subnets as their inference backbone.
// Serverless AI. Just call the API.
Chutes is a serverless AI compute platform. Developers choose an AI model (or bring their own code), call an API endpoint, and Chutes handles everything: finding available GPUs, routing requests, scaling up or down based on demand, and managing the entire infrastructure. No server setup. No GPU procurement. No DevOps.
The simple version: Imagine a power outlet for AI. You plug in what you want to run, and the electricity (compute) just flows. You don't think about power plants, transmission lines, or transformers. Chutes is that outlet for AI models.
Centralized equivalent: Think AWS Lambda or Google Cloud Functions, but specifically for AI inference, with GPUs instead of CPUs, and powered by a decentralized miner network instead of corporate data centers.
How it works:
- Miners run AI models ("chutes") on their GPUs and decide which models to keep "hot" in memory for fast response times. They compete for bounties by being the first to launch cold models. Rewards are based on compute units (55%), invocations (25%), unique chutes hosted (15%), and bounty completions (5%).
- Validators verify miner activity through digital fingerprints and activity reports. The main validator (operated by the Chutes team with approximately 16 H200 GPUs) coordinates the platform, while other validators audit reward calculations.
- The problem it solves: Running AI models requires expensive GPUs, complex infrastructure, and constant maintenance. Most developers and companies can't afford dedicated AI infrastructure. Cloud providers charge premium prices and lock you into their ecosystem.
- The opportunity: The AI inference market is growing exponentially. Every chatbot, image generator, code assistant, and AI agent needs compute. Serverless AI removes the biggest barrier: infrastructure management.
- The Bittensor advantage: Chutes has become infrastructure for the Bittensor ecosystem itself. Subnets like Affine (SN120) require miners to deploy models through Chutes. This creates recursive demand: as Bittensor grows, Chutes usage grows automatically.
- Traction signals: 13,999 holders (fourth-largest in the network). 420,115 TAO market cap. 82 GitHub stars. OpenAI-compatible API serving 46+ models. Founded by Jon Durbin. Used as deployment infrastructure by multiple other subnets.
Category: Inference and Compute | Centralized Competitor: AWS Lambda, Google Cloud Functions, Replicate, Together AI, Fireworks AI
Chutes is the closest thing Bittensor has to core infrastructure. While most subnets build applications on top of the network, Chutes provides the compute layer that other subnets build on. This "picks and shovels" positioning creates a flywheel: every new subnet that needs inference can use Chutes, driving organic demand.
Mechanism:
The reward structure is carefully designed to balance multiple objectives. Compute units (55% of rewards) incentivize raw throughput. Invocations (25%) reward being available when users call. Unique chutes (15%) encourage model diversity, preventing miners from all running the same popular model. Bounties (5%) reward responsiveness: being the first to spin up a requested model that isn't currently loaded.
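The weighting above can be sketched as a simple score function. The four weights come from the source; the metric names, the share-based normalization, and the toy numbers are illustrative assumptions, not the actual Chutes scoring code.

```python
# Hypothetical sketch of Chutes' reward weighting. Weights are from the docs;
# normalization (each miner's share of the network total per category) is an
# illustrative assumption.

WEIGHTS = {
    "compute_units": 0.55,   # raw throughput
    "invocations": 0.25,     # availability when users call
    "unique_chutes": 0.15,   # model diversity
    "bounties": 0.05,        # first to spin up a cold model
}

def miner_score(metrics: dict[str, float], totals: dict[str, float]) -> float:
    """Weighted sum of the miner's share in each reward category."""
    score = 0.0
    for key, weight in WEIGHTS.items():
        if totals[key] > 0:
            score += weight * metrics[key] / totals[key]
    return score

# Example: a miner with 10% of network compute, 20% of invocations,
# 5% of unique chutes hosted, and no bounty wins.
metrics = {"compute_units": 10, "invocations": 20, "unique_chutes": 5, "bounties": 0}
totals = {"compute_units": 100, "invocations": 100, "unique_chutes": 100, "bounties": 100}
print(miner_score(metrics, totals))  # 0.55*0.10 + 0.25*0.20 + 0.15*0.05
```

Note how the diversity and bounty terms stay small relative to throughput: a miner cannot win on breadth alone, but breadth breaks ties between equally fast miners.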
The OpenAI-compatible API is a strategic choice. Any application built for OpenAI can switch to Chutes with a single URL change. This dramatically lowers the adoption barrier and positions Chutes as a drop-in replacement for centralized inference providers.
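The "single URL change" claim can be illustrated by building the request directly: an OpenAI-style chat completion payload is identical for any compatible backend, so only the base URL (and API key) differ. The Chutes URL and model name below are placeholders, not confirmed endpoints.

```python
# Sketch of an OpenAI-compatible request. The payload shape follows the
# public OpenAI chat completions format; the Chutes base URL and model
# name are placeholders, not real endpoints.

def chat_request(base_url: str, model: str, prompt: str) -> tuple[str, dict]:
    """Build the endpoint URL and JSON body for an OpenAI-style chat call."""
    url = f"{base_url}/chat/completions"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, body

# Same application code, two backends: only base_url and model change.
openai_url, body_a = chat_request("https://api.openai.com/v1", "gpt-4o", "Hello")
chutes_url, body_b = chat_request("https://chutes.example/v1", "hosted-model", "Hello")

assert body_a["messages"] == body_b["messages"]  # identical request structure
print(openai_url)
print(chutes_url)
```

This is why OpenAI compatibility is strategically potent: the switching cost for an existing application is a config value, not a rewrite.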
The codebase has 414 commits across 9 contributors in a substantial 93MB repository. Development is active at 4-19 commits per week, with Jon Durbin as the primary contributor handling configuration, error handling, and integration improvements. Recent work includes Pydantic serialization fixes and cold model handling.
Market metrics reflect Chutes' infrastructure status. At 420,115 TAO market cap, it's one of the largest subnets. 13,999 holders make it the fourth most widely held. Gini of 0.613 and HHI of 0.042 show well-distributed ownership. Root proportion of 0.163 confirms strong organic demand.
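For readers unfamiliar with the two concentration metrics cited above, here is how they are standardly computed. The formulas are the textbook definitions of Gini and HHI; the holder balances are toy numbers, not real Chutes data.

```python
# Standard definitions of the two ownership-concentration metrics.
# Balances below are illustrative, not actual holder data.

def hhi(balances: list[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared ownership shares.
    Ranges from 1/n (perfectly equal) to 1.0 (one holder owns everything)."""
    total = sum(balances)
    return sum((b / total) ** 2 for b in balances)

def gini(balances: list[float]) -> float:
    """Gini coefficient via the sorted-rank formula.
    0 = perfectly equal, approaching 1 = one holder owns everything."""
    xs = sorted(balances)
    n = len(xs)
    total = sum(xs)
    rank_weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * rank_weighted) / (n * total) - (n + 1) / n

balances = [100, 80, 60, 40, 20, 10, 5, 5]  # toy TAO holdings
print(f"Gini: {gini(balances):.3f}")
print(f"HHI:  {hhi(balances):.3f}")
```

A low HHI like 0.042 means no single holder's share dominates the squared sum, which is why it can coexist with a moderate Gini: HHI is sensitive mainly to the largest holders, while Gini reflects inequality across the whole distribution.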
The 7-day picture shows a correction: a -3,599 TAO net outflow and a -2.8% price decline. The 30-day decline is sharper at -15.4%, reflecting a rotation from Chutes into subnets like Templar (SN3) that have captured recent narrative momentum. Realized PnL of 169,408 TAO is the highest we've seen, meaning significant profit-taking has already occurred. Unrealized PnL of 102,486 TAO shows remaining conviction.
The roadmap targets long-running jobs (training, not just inference), trusted execution environments for enterprise security, and expansion into a comprehensive AI platform. If Chutes executes on long jobs, it bridges the gap between inference subnet and full compute marketplace.
- Current outflows: -3,599 TAO net 7-day outflow and -15.4% 30-day price decline suggest a rotation phase. Large realized PnL (169k TAO) confirms significant profit-taking.
- Central validator dependency: The main validator operates approximately 16 H200 GPUs. While auditors exist, the platform's availability depends heavily on this central coordinator.
- Competition from centralized providers: Together AI, Replicate, and Fireworks AI offer similar serverless inference with VC-subsidized pricing. Chutes must compete on cost or decentralization advantages.
- Ecosystem dependency: Much of Chutes' value comes from being used by other subnets. If those subnets change their deployment requirements, demand could shift.