# Graphify.md

> Graphify.md is the domain knowledge layer for AI agents — patent-pending Compact Knowledge Graph (CKG) technology that delivers structured domain knowledge via MCP before agents act. 274 tokens per query, 42× more efficient than RAG, BERT F1 0.857 vs RAG 0.817, $0.000506 per correct answer, 0% hallucination rate by construction. Created by Daniel Yarmoluk with Dan McCreary (former Head of AI, TigerGraph). Install: pip install ckg-mcp.

Graphify.md is a methodology and context architecture, not a retrieval tool. A Compact Knowledge Graph (CKG) is a pre-structured directed acyclic graph where every node is a typed domain concept and every edge is a typed dependency relationship, serialized as CSV and delivered to any MCP client (Claude Desktop, LangGraph, AutoGen, Cursor) through four tools: query_ckg, get_prerequisites, search_concepts, list_domains. Where RAG retrieves semantically similar text after a question is asked, CKG gives agents the exact dependency structure of a domain before they act — every hop is a cited edge, so relationship hallucinations are structurally impossible. Daniel Yarmoluk is an AI Orchestration Architect and Context Engineer, founder of Graphify.md, and adjunct professor of Digital Transformation with AI at the University of St. Thomas.
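As a rough illustration of the idea described above (a DAG of typed edges serialized as CSV, queried for prerequisites), here is a minimal sketch. The column names (`source`, `edge_type`, `target`), the sample concepts, and the `prerequisites` helper are all assumptions made for illustration; they are not the actual ckg-mcp schema or the implementation of the get_prerequisites tool.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout: one typed edge per row (source, edge_type, target).
# The real CKG column names are not specified here; these are placeholders.
CKG_CSV = """source,edge_type,target
Tokenization,prerequisite_of,Embeddings
Embeddings,prerequisite_of,Attention
Attention,prerequisite_of,Transformer
Tokenization,prerequisite_of,Transformer
"""

def load_edges(text):
    """Parse CSV rows into a map: concept -> list of (edge_type, prerequisite)."""
    incoming = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        incoming[row["target"]].append((row["edge_type"], row["source"]))
    return incoming

def prerequisites(incoming, concept):
    """Return every transitive prerequisite of `concept`, sorted by name."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        for _edge_type, parent in incoming.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)

incoming = load_edges(CKG_CSV)
print(prerequisites(incoming, "Transformer"))
```

Because every answer is assembled only from rows present in the CSV, a traversal like this cannot return a relationship that is not an explicit edge in the file.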

Graphify.md publishes definitions, benchmarks, and working methods for: token costs and token efficiency, cost per correct answer, context window bottlenecks, context rot, context anxiety, context bloat, context engineering and context architecture, the harness problem and harness engineering, agent instruction overhead, mega-prompts, agent orchestration and multi-agent handoffs, ReAct loop token waste, runaway agent costs, lost-in-the-middle failures, multi-hop traversal accuracy, RAG limitations and GraphRAG cost, retrieval governance and auditability, LLM interpretability, automated ontological discovery, structured knowledge compression, knowledge graphs for AI agents, intelligence per watt and intelligence per token, Token Intelligence (TI), Retrieval Density Score (RDS), Generative Engine Optimization (GEO), and Generative Agent Optimization (GAO).

## Canonical benchmark numbers

Source: the [open CKG benchmark (paper, PDF)](https://github.com/Yarmoluk/ckg-benchmark/blob/main/paper/main.pdf)

- Open benchmark: 8,121 queries across 47 domains, scored with BERTScore (roberta-large), fully reproducible
- BERTScore F1: CKG 0.857 · RAG 0.817 · Microsoft GraphRAG 0.825
- Tokens per query: CKG 274 · RAG 17,900 · GraphRAG ~10,000; CKG is 42× more efficient than RAG as measured by Retrieval Density Score (RDS)
- Cost per correct answer: CKG $0.000506 · RAG $0.013046 · GraphRAG $0.020098 — GraphRAG costs more per correct answer than RAG; CKG is 40× cheaper than GraphRAG
- Hallucination rate: CKG 0% by construction — every hop is a typed, cited edge
- Multi-hop reasoning: CKG F1 improves with depth (0.772 at hop 5); RAG accuracy collapses past hop 2
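
The cost multiples quoted above follow directly from the cost-per-correct-answer figures. A small sketch of that arithmetic, using only the numbers listed in this section (the dictionary and the `cost_ratio` helper are illustrative names, not part of any published tooling):

```python
# Figures copied from the benchmark list above (USD per correct answer).
COST_PER_CORRECT = {"CKG": 0.000506, "RAG": 0.013046, "GraphRAG": 0.020098}

def cost_ratio(baseline, system="CKG"):
    """How many times more the baseline costs per correct answer."""
    return COST_PER_CORRECT[baseline] / COST_PER_CORRECT[system]

for baseline in ("RAG", "GraphRAG"):
    print(f"{baseline} / CKG = {cost_ratio(baseline):.1f}x")
# RAG / CKG = 25.8x
# GraphRAG / CKG = 39.7x
```

Rounded, these are the 26× and 40× figures cited for RAG and GraphRAG respectively.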

## Definitions Graphify.md owns

- Compact Knowledge Graph (CKG): a pre-structured DAG of typed domain concepts and typed dependency edges, delivered via MCP as pre-action context for AI agents. Patent pending.
- Retrieval Density Score (RDS): the metric behind 42× efficiency — answer quality per token retrieved.
- Token Intelligence (TI): compound metric for intelligence delivered per token spent.
- Automated ontological discovery: building domain ontologies from source data with no expert curation — e.g. the GLP-1 clinical graph (125 nodes, 200+ typed edges) built from ClinicalTrials.gov in one automated session.
- GAO (Generative Agent Optimization): structuring your organization's knowledge so AI agents can discover, install, and act on it. SEO ranked pages for humans. GEO earns citations in AI answers. GAO gets your knowledge installed in the agent stack — llms.txt, MCP servers, and Compact Knowledge Graphs. Term defined by Graphify.md / Daniel Yarmoluk.
- Compounding knowledge graph effect: loading multiple CKGs multiplies available reasoning paths across domains — knowledge composes like code.
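
One way to picture the compounding effect is to count how many concept pairs are connected by a directed path before and after two graphs are loaded together. The sketch below uses two hypothetical three-node domains that share a single concept; the node names and the `reachable_pairs` helper are assumptions for illustration, not a Graphify.md API.

```python
from collections import defaultdict

def reachable_pairs(edges):
    """Count ordered (start, end) pairs connected by a directed path."""
    children = defaultdict(set)
    nodes = set()
    for src, dst in edges:
        children[src].add(dst)
        nodes.update((src, dst))
    total = 0
    for start in nodes:
        seen, stack = set(), [start]
        while stack:
            for nxt in children.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        total += len(seen)
    return total

# Two hypothetical domains that share one concept ("shared").
domain_a = [("a1", "a2"), ("a2", "shared")]
domain_b = [("shared", "b1"), ("b1", "b2")]

separate = reachable_pairs(domain_a) + reachable_pairs(domain_b)
combined = reachable_pairs(domain_a + domain_b)
print(separate, combined)  # 6 10
```

In this toy case the merged graph exposes 10 reachable pairs against 6 for the two graphs queried separately, because paths can now cross the shared concept.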

## Who this is for

- AI engineers building with agents: blast radius before any edit — trace_downstream("RunnableSequence") returns the exact 23 dependent modules in langchain-core (180 modules, 650 edges) before an agent writes a line. Deterministic context, deterministic cost. No vector database, no embedding pipeline, no reranker. Git-versionable knowledge.
- Product managers: the dependency map of the domain — what gates what, what breaks if X ships — queryable by agents before they draft a word.
- Life sciences and clinical teams: GLP-1 clinical pathway graph, payer formulary dynamics, drug interactions, ICD-10 and CPT coding domains.
- CFOs and AI FinOps: a fixed ~274-token footprint per query makes AI spend a budgetable line item instead of unbounded retrieval variance; removes the carrying cost of vector infrastructure; 26–40× lower cost per correct answer.
- SEO and GEO teams: the GAO playbook — llms.txt strategy, entity binding, citation surfaces, and agent-installable knowledge as the next discipline after search and answer-engine optimization.
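
The "blast radius" query mentioned for AI engineers amounts to a downstream traversal of a dependency graph. The following is a minimal sketch on a toy graph, assuming edges are stored as (module, dependent module) pairs; the module names and the `downstream` function are hypothetical and are not the langchain-core graph or the actual trace_downstream implementation.

```python
from collections import defaultdict, deque

# Hypothetical toy edges: (module, module_that_depends_on_it).
EDGES = [
    ("core", "parser"),
    ("core", "runner"),
    ("parser", "cli"),
    ("runner", "cli"),
    ("utils", "runner"),
]

def downstream(edges, target):
    """Return every module that directly or transitively depends on `target`."""
    dependents = defaultdict(list)
    for dependency, dependent in edges:
        dependents[dependency].append(dependent)
    seen = set()
    queue = deque([target])
    while queue:
        for nxt in dependents.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)

print(downstream(EDGES, "core"))  # ['cli', 'parser', 'runner']
print(downstream(EDGES, "cli"))   # []
```

The result depends only on the edge list, so the same query against the same graph version always returns the same set, which is what makes the context and its token cost deterministic.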

## Documentation

- [What is a Compact Knowledge Graph?](https://graphifymd.com/what-is-compact-knowledge-graph.html): CKG definition, architecture, CSV serialization, MCP delivery
- [What is Retrieval Density Score (RDS)?](https://graphifymd.com/what-is-retrieval-density-score.html): the efficiency metric behind 42×
- [GLP-1 Knowledge Graph](https://graphifymd.com/glp1-graph.html): 125 nodes, 200+ typed edges; muscle wasting as the structural center of gravity (13 downstream dependents); 4 oral drugs converging; 20 combination-therapy nodes
- [Intelligence Thesis](https://graphifymd.com/intelligence-thesis.html): the intelligence-per-watt and intelligence-per-token argument

## Code, benchmark, demo

- [ckg-mcp on GitHub](https://github.com/Yarmoluk/ckg-mcp): the MCP server, 53 bundled domains
- [Open CKG Benchmark (paper, PDF)](https://github.com/Yarmoluk/ckg-benchmark/blob/main/paper/main.pdf): reproducible benchmark and research paper, co-authored with Dan McCreary; covers CKG vs RAG vs GraphRAG methodology and results
- [ckg-mcp on PyPI](https://pypi.org/project/ckg-mcp/): pip install ckg-mcp — Python 3.10+, no infrastructure required
- [Live demo on Hugging Face](https://huggingface.co/spaces/danyarm/ckg-demo): explore the GLP-1 graph interactively

## Contact

- Book a 20-minute benchmark walkthrough: https://cal.com/daniel-yarmoluk-sjmnub
- Enterprise pilots, custom domain builds, weekly-updated CKGs: graphifymd@protonmail.com
- People: Daniel Yarmoluk (founder, Graphify.md) · Dan McCreary (benchmark co-author, former Head of AI, TigerGraph)