Token Economics
Cost per correct answer. Why agent loops cost 10–100× more than baseline.
The Core Math
- CKG: $0.000506 per correct answer
- RAG: $0.013046 per correct answer
- GraphRAG: $0.020098 per correct answer
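Per correct answer, CKG is about 26× cheaper than RAG and about 40× cheaper than GraphRAG. At 1M queries/month, assuming one correct answer per query, the gap versus RAG comes to roughly $12.5K per month (about $150K per year); at these per-answer costs, $13M saved annually corresponds to roughly 86M queries/month.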
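The ratios follow directly from the three per-answer costs. A rough sketch of that arithmetic in Python is shown here; the function names are illustrative, and the savings helper assumes one correct answer per query and a constant monthly volume, neither of which the figures above state.

```python
# Per-correct-answer costs in USD, as quoted above.
COST_PER_CORRECT_ANSWER = {
    "CKG": 0.000506,
    "RAG": 0.013046,
    "GraphRAG": 0.020098,
}


def cost_ratio(baseline: str, system: str = "CKG") -> float:
    """How many times more the baseline costs per correct answer."""
    return COST_PER_CORRECT_ANSWER[baseline] / COST_PER_CORRECT_ANSWER[system]


def annual_savings(baseline: str, answers_per_month: int, system: str = "CKG") -> float:
    """Yearly USD gap between baseline and system at a given monthly volume.

    Assumption (not from the source): one correct answer per query and the
    same volume every month.
    """
    gap = COST_PER_CORRECT_ANSWER[baseline] - COST_PER_CORRECT_ANSWER[system]
    return gap * answers_per_month * 12


if __name__ == "__main__":
    for name in ("RAG", "GraphRAG"):
        print(f"{name} vs CKG: {cost_ratio(name):.1f}x")
    print(f"Annual gap vs RAG at 1M/month: ${annual_savings('RAG', 1_000_000):,.0f}")
```

Changing `answers_per_month` shows how strongly the savings figure depends on the assumed query volume.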
Where the Tokens Go
In a typical RAG system:
- Embedding vector search: 3,000 tokens
- Retrieved chunks (top-k): 8,000 tokens
- Prompt + query: 1,200 tokens
- Reranking overhead: 2,500 tokens
- Retries & refinement loops: 3,200 tokens
- Total: 17,900 tokens per query
With CKG:
- Pre-compiled graph: 274 tokens
- Traversal + typed edges: 0 additional tokens
- No reranking needed: 0 tokens
- Total: 274 tokens per query
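The two breakdowns can be checked by summing their components. The sketch below does only that; the dictionary keys are illustrative labels for the line items listed above, not names from any CKG or RAG library.

```python
# Token counts per query, copied from the two breakdowns above.
RAG_TOKENS = {
    "embedding_vector_search": 3_000,
    "retrieved_chunks_top_k": 8_000,
    "prompt_and_query": 1_200,
    "reranking_overhead": 2_500,
    "retries_and_refinement": 3_200,
}

CKG_TOKENS = {
    "precompiled_graph": 274,
    "traversal_and_typed_edges": 0,
    "reranking": 0,
}


def total_tokens(breakdown: dict[str, int]) -> int:
    """Sum every component of a per-query token breakdown."""
    return sum(breakdown.values())


def token_ratio(larger: dict[str, int], smaller: dict[str, int]) -> float:
    """How many times more tokens the larger breakdown uses per query."""
    return total_tokens(larger) / total_tokens(smaller)


if __name__ == "__main__":
    print("RAG tokens per query:", total_tokens(RAG_TOKENS))
    print("CKG tokens per query:", total_tokens(CKG_TOKENS))
    print(f"RAG / CKG: {token_ratio(RAG_TOKENS, CKG_TOKENS):.1f}x")
```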
⚠️ Runaway Agent Loops: The Real Cost
⚠️ Disclaimer: This excludes runaway-loop incidents
The $13M annual savings assumes single-shot queries. In agentic workflows, ~70% of session tokens carry history the model no longer needs. When agents loop, every call re-sends the full history, so the context grows with each iteration and the total tokens billed grow roughly quadratically with the number of iterations. Our baseline does NOT account for runaway costs: incidents where a single agent session burns 10–100× more tokens than expected.
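To see where the quadratic growth comes from, here is a minimal sketch under a simplifying assumption: each iteration adds a fixed number of tokens to the history, and every call re-sends the whole context. The function names are hypothetical, and the 12K base and 3K increment in the usage lines are borrowed from the scenario that follows; that scenario's contexts grow faster than a fixed increment would.

```python
def cumulative_tokens(base: int, added_per_iteration: int, iterations: int) -> int:
    """Total tokens sent when every call re-sends the whole context."""
    total = 0
    context = base
    for _ in range(iterations):
        total += context                # this call pays for everything so far
        context += added_per_iteration  # the new result joins the history
    return total


def cumulative_tokens_closed_form(base: int, added_per_iteration: int, iterations: int) -> int:
    """Same total as a formula: n*base + added*(0 + 1 + ... + (n-1)).

    The n*(n-1)/2 term is what makes the total quadratic in n.
    """
    n = iterations
    return n * base + added_per_iteration * n * (n - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        looped = cumulative_tokens(12_000, 3_000, n)
        formula = cumulative_tokens_closed_form(12_000, 3_000, n)
        print(n, looped, formula)
```

Doubling the number of iterations more than doubles the total, because the history term grows with the square of the iteration count.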
Scenario: Agent Loop Cost Explosion
Without CKG: Context accumulation spiral
Iteration 1: 12K context
→ tool_result (3K tokens)
→ re-send entire context on next call
Iteration 2: 15K (original + result)
Iteration 3: 19K
Iteration 4: 24K
Iteration 5: 31K
Iteration 6: 41K
Iteration 7: 54K
Iteration 8: 71K
8 iterations = 267K tokens (not the 96K baseline)
Cost: $7.41 per session (vs $0.002 expected)
With CKG: Fixed knowledge footprint
Iteration 1: 12K context + 274 tokens compiled knowledge
Iteration 2: 12K context + 274 tokens (knowledge re-arrives compiled, not as accumulated history)
Iteration 3: 12K context + 274 tokens
...
Iteration 8: 12K context + 274 tokens
8 iterations = 98K tokens (stays within budget)
Cost: $0.003 per session (3,000× cheaper than RAG runaway)
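The session totals in both scenarios can be recomputed from the per-iteration figures. The sketch below sums the context sizes listed above; the names are illustrative, and the per-token price is left as a required parameter because the scenario does not state one.

```python
# Context sent on each iteration without CKG, in tokens (from the scenario above).
CONTEXT_WITHOUT_CKG = [12_000, 15_000, 19_000, 24_000, 31_000, 41_000, 54_000, 71_000]

BASE_CONTEXT = 12_000   # fixed context per iteration in the CKG scenario
CKG_KNOWLEDGE = 274     # compiled knowledge re-sent on each iteration


def session_tokens_without_ckg(contexts: list[int]) -> int:
    """Every call is billed for the full context it sends."""
    return sum(contexts)


def session_tokens_with_ckg(iterations: int,
                            base: int = BASE_CONTEXT,
                            knowledge: int = CKG_KNOWLEDGE) -> int:
    """Each call sends the same fixed context plus the compiled knowledge."""
    return iterations * (base + knowledge)


def session_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert tokens to USD; the price is an assumption supplied by the caller."""
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    without = session_tokens_without_ckg(CONTEXT_WITHOUT_CKG)
    with_ckg = session_tokens_with_ckg(len(CONTEXT_WITHOUT_CKG))
    print("Without CKG:", without, "tokens")
    print("With CKG:   ", with_ckg, "tokens")
    print(f"Token ratio: {without / with_ckg:.1f}x")
```

Passing the same `usd_per_million_tokens` to `session_cost` for both totals keeps the dollar comparison on a like-for-like basis.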
The Three Cost Problems CKG Solves
- Token inflation (per query): RAG bloats queries with loosely matched text. CKG compiles knowledge once and reuses it across all queries, for 65× fewer tokens per query.
- Accumulation in loops: Agents send the full conversation history on every call. CKG knowledge re-arrives at 274 tokens, not 71K, keeping costs linear, not quadratic.
- Retry cycles: RAG failures trigger refinement loops (reranking, chunk expansion, query expansion). CKG guarantees first-pass accuracy (0% hallucination rate), eliminating retry overhead.
Enterprise Impact
For a mid-market company deploying agents across 3 teams (10 agents, 2M queries/month, avg 5 iterations per session):
- RAG baseline: $260K/month (assumes no runaway loops)
- RAG with incidents: $2.6M+/month (with 3–5 runaway cascades per month)
- CKG: $6K/month, with linear cost growth
Learn More
• Read the open CKG Compiler benchmark (PDF)
• What is Retrieval Density Score (RDS)?
• What is GAO? (Generative Agent Optimization)