grail
SN81: Verifiable post-training for AI models, proving your model improvements are real
The "finishing school" for AI. Where τemplar pre-trains the raw model, grail teaches it how to behave. It runs verifiable reinforcement learning in which every training step is cryptographically proven, so nobody can fake improvements or substitute models.
// Proving AI got smarter, cryptographically.
grail is a post-training subnet that takes pre-trained AI models and makes them better at specific tasks through reinforcement learning. The critical innovation is that every training step produces a cryptographic proof, so anyone can independently verify that the improvements are real and haven't been fabricated.
The simple version: Imagine a tutor who teaches a smart student how to actually pass exams. The student (AI model) already knows a lot from reading textbooks (pre-training on τemplar), but grail trains it to solve specific problems and prove each answer is genuine.
Centralized equivalent: Think OpenAI's RLHF (reinforcement learning from human feedback) pipeline that turns GPT base models into ChatGPT, but with cryptographic proof that the training actually happened.
How it works:
- Miners generate multiple solution attempts ("rollouts") for assigned problems, tracking every token and probability. They solve math word problems (GSM8K) and logic puzzles (3-SAT), uploading their work with cryptographic commitments.
- Validators derive deterministic problem sets from public randomness, verify each rollout against the GRAIL protocol (checking token commitments, model bindings, and solution correctness), then score based on unique, valid, and successful rollouts.
- The problem it solves: Post-training is where AI models go from "smart but unreliable" to "useful and aligned." Right now, this process is a black box inside big AI labs. There's no way to verify whether published training improvements are real.
- The opportunity: If decentralized post-training can produce competitive models with verifiable proofs, it creates an open market for model improvement. Anyone can contribute to making AI better, and everyone can verify the results.
- The Bittensor advantage: The GRAIL protocol (Guaranteed Rollout Authenticity via Inference Ledger) uses PRF-based index derivation, sketch commitments, and verifier-supplied challenges to bind every training step to a specific model. This is security at the math level, not the policy level.
- Traction signals: Part of the One Covenant ecosystem alongside τemplar (SN3) and Basilica (SN39), forming a full-stack decentralized AI training pipeline. 2,661 holders. 112,699 TAO market cap. Bandwidth reduced by ~100x, enabling decentralized RL at centralized speed.
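The "solution correctness" check validators run is environment-specific; for 3-SAT it reduces to a few lines. A minimal sketch (the clause encoding and function name are illustrative, not from the grail codebase):

```python
def check_3sat(clauses: list[tuple[int, int, int]], assignment: dict[int, bool]) -> bool:
    """Verify a claimed 3-SAT solution.

    Literals are encoded as signed ints: a positive int means the variable,
    a negative int means its negation. A formula is satisfied when every
    clause contains at least one satisfied literal.
    """
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )

# (x1 OR ~x2 OR x3) AND (~x1 OR x2 OR ~x3)
clauses = [(1, -2, 3), (-1, 2, -3)]
satisfying = {1: True, 2: True, 3: False}
```

Because checking an assignment is cheap while finding one is hard, validators can score miner answers without redoing the miners' work.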
Category: Model Fine-Tuning | Centralized Competitor: OpenAI RLHF Pipeline, Anthropic Constitutional AI Training, Google DeepMind Gemini Post-Training
grail occupies a unique position: it's one of the few subnets building genuine cryptographic verification into AI training. Most subnets verify outputs. grail verifies the training process itself.
Mechanism:
The protocol works in windows. Each window, validators derive a set of problems from public randomness (mixing drand beacon values with block hashes for unpredictability). Miners receive these problems and generate GRPO-style rollouts: multiple attempts at solving each problem while tracking token IDs and log probabilities.
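The derivation step can be sketched as hashing the two randomness sources together and expanding the result into per-problem seeds. This is an illustrative construction, not grail's actual code; the function name and the SHA-256 mixing scheme are assumptions:

```python
import hashlib

def derive_problem_seeds(drand_value: bytes, block_hash: bytes, n_problems: int) -> list[int]:
    """Mix two public randomness sources into deterministic per-problem seeds.

    Nobody can predict the seeds before the window opens (drand and the
    block hash are unknown in advance), yet afterwards every validator
    and miner derives the identical problem set.
    """
    mixed = hashlib.sha256(drand_value + block_hash).digest()
    seeds = []
    for i in range(n_problems):
        h = hashlib.sha256(mixed + i.to_bytes(4, "big")).digest()
        seeds.append(int.from_bytes(h[:8], "big"))
    return seeds

# Two parties with the same inputs always agree on the problem set.
seeds = derive_problem_seeds(b"drand-round-123", b"block-hash-abc", n_problems=3)
```

The point of mixing two sources is that neither a miner colluding with a block producer nor one watching drand alone can bias the problem selection.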
The critical piece is the GRAIL protocol. Every rollout includes a cryptographic commitment that binds the tokens to the specific model that generated them. Validators can verify this binding without re-running the inference, catching miners who might try to substitute a better model's outputs for their own. This is what makes the training "verifiable": you can prove which model produced which results.
Currently the system supports two environments: 3-SAT (satisfiability problems with 3-10 variables) and GSM8K (grade-school math word problems). Scoring uses a superlinear curve that rewards sustained improvement, discouraging one-off lucky solutions.
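A superlinear curve of this kind can be illustrated by raising the success rate to a power greater than one; the exponent and function name below are assumptions for illustration, not grail's published scoring formula:

```python
def superlinear_score(successes: int, total: int, exponent: float = 2.0) -> float:
    """Score a miner's window superlinearly in its success rate.

    With exponent > 1, consistent performers earn disproportionately
    more than one-off lucky solvers: 90% success scores 0.81 while
    30% success scores only 0.09, a 9x gap instead of 3x.
    """
    if total == 0:
        return 0.0
    return (successes / total) ** exponent
```

Under a linear rule a miner could profit from spraying low-effort guesses; the convex curve makes sustained accuracy the only economical strategy.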
The codebase has 1,047 commits across 7 contributors, though development has paused over the last 2 weeks (0 commits). The preceding active period showed 37 commits in one week followed by 7 the next, suggesting a burst-and-stabilize pattern. The repo is 14.4MB, with ErfanMhi as the primary recent contributor.
Financially, grail is growing. Net 7-day inflow of 3,121 TAO is strong for its 112,699 TAO market cap. Root proportion of 0.188 shows organic demand. The Gini of 0.731 is higher than average, indicating more concentrated holdings. Emission buy acceleration of 1.35x (chain buys at 8.1% vs EMA of 6.0%) shows accumulation is picking up.
grail's strategic value is as the second stage of the Covenant pipeline: τemplar pre-trains, grail post-trains, and Basilica (SN39) serves inference. Together they form a complete decentralized alternative to what OpenAI, Anthropic, and Google do internally. The question is whether the cryptographic overhead is worth the verifiability premium.
- Development pause: Zero commits in the last 2 weeks. For a protocol this complex, sustained development is essential.
- Holder concentration: Gini of 0.731 and HHI of 0.078 are relatively high. A few large holders exiting could significantly impact price.
- Limited environments: Currently only 3-SAT and GSM8K. Expanding to more diverse training environments (code, reasoning, instruction following) would dramatically increase utility.
- Pipeline dependency: grail's value is tied to the Covenant ecosystem. If τemplar's models decline in quality, grail's post-training becomes less valuable.