Open Benchmark · 45 Domains · 7,928 Queries

42× more intelligence
per token. Zero hallucinations.
Proven at scale.

Graphify.md replaces RAG and GraphRAG with pre-structured knowledge graphs that outperform on every structural query type — using 11× fewer tokens per query, with no hallucinations by construction. When domain graphs interact, intelligence compounds. That's CKGO.

Request a 30-Minute Pilot Call · Read the Paper → · Download PDF →
42×: RDS Advantage vs. RAG (Reasoning Density Score, F1 / tokens)
0.4709: CKG Macro F1 (Track 1 · 44 domains · vs. 0.123 RAG)
11×: Fewer Tokens per Query (269 vs. 2,982 for RAG vs. 3,450 for GraphRAG)
0.5298: CKG F1, Track 2 (GLP-1 · pipeline-generated · no expert curation)
7,928: Benchmark Queries (T1–T5 · 45 domains · open dataset)

The numbers aren't estimated.
They're measured.

An open benchmark across 45 domains, 7,928 queries, three retrieval architectures. Every result is reproducible.

System Comparison · Track 1 · 44 Domains

System     Macro F1   Tokens/query   Run cost   RDS
CKG        0.471      269            $7.81      0.00201
RAG        0.123      2,982          $76.23     0.0000482
GraphRAG   0.120      3,450          $44.43     0.0000452

F1 by Query Type · CKG / RAG
Entity Lookup: 0.21 / 0.09
Dependency Chain: 0.63 / 0.08
Multi-Hop Path: 0.66 / 0.20
Category Aggregate: 0.96 / 0.29
Cross-Concept: 0.32 / 0.12

Each pair is CKG / RAG. T1 entity lookup is the designed negative control: CKG stores structure, not prose.
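
The headline multiples can be re-derived from the System Comparison table. The sketch below is a minimal check, assuming the RDS column is used exactly as published (the page does not say how per-query F1 / tokens values are averaged, so the code does not recompute RDS from the aggregate columns). The names `systems` and `ratio_vs_ckg` are hypothetical.

```python
# Figures copied from the Track 1 System Comparison table.
systems = {
    "CKG":      {"f1": 0.471, "tokens": 269,  "rds": 0.00201},
    "RAG":      {"f1": 0.123, "tokens": 2982, "rds": 0.0000482},
    "GraphRAG": {"f1": 0.120, "tokens": 3450, "rds": 0.0000452},
}


def ratio_vs_ckg(system: str) -> dict:
    """Compare CKG with `system` on tokens, macro F1, and published RDS."""
    ckg, other = systems["CKG"], systems[system]
    return {
        # Above 1 means CKG spends fewer tokens per query.
        "token_ratio": other["tokens"] / ckg["tokens"],
        # Above 1 means CKG scores higher.
        "f1_ratio": ckg["f1"] / other["f1"],
        "rds_ratio": ckg["rds"] / other["rds"],
    }


rag = ratio_vs_ckg("RAG")
print(f"tokens: {rag['token_ratio']:.1f}x  RDS: {rag['rds_ratio']:.1f}x")
```

Rounded to whole numbers, the two printed ratios correspond to the 11× and 42× figures quoted on this page.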

Open Benchmark

Fully Reproducible · Peer Reviewed

45 domains · 7,928 queries · three systems · all results and evaluation code published on GitHub. Co-authored with Dan McCreary (Intelligent Textbooks, ex-Optum). arXiv submission (cs.IR) pending.  github.com/Yarmoluk/ckg-benchmark →


How it works

Three steps from your data to a deployed knowledge system — no annotation budget, no expert curation required.

01 Identify your structured data source

Any domain with stable relationships: regulatory registries, clinical trial databases, financial filings, product catalogues, policy documents, internal knowledge bases.

02 Pipeline extracts your knowledge graph

Entities, dependencies, and taxonomy are extracted into a Compact Knowledge Graph (CKG) — a directed acyclic graph encoding your domain's structure. Proprietary methodology. No hallucination by construction.

03 Query via deterministic BFS/DFS retrieval

Queries resolve by traversing the graph — not by semantic similarity. Results are exact, reproducible, and hallucination-free by construction. 269 tokens per query average.
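
To make step 03 concrete, here is a minimal sketch of deterministic breadth-first retrieval over a dependency graph. It is an illustration of the general technique, not the proprietary pipeline: the toy graph, its edges, and the function name `dependency_chain` are all hypothetical.

```python
from collections import deque

# Hypothetical toy graph: each concept maps to the concepts it depends on.
DEPENDS_ON = {
    "Tirzepatide": ["GIP receptor", "GLP-1 receptor"],
    "Semaglutide": ["GLP-1 receptor"],
    "GLP-1 receptor": ["Incretin signalling"],
    "GIP receptor": ["Incretin signalling"],
    "Incretin signalling": [],
}


def dependency_chain(graph, start):
    """Breadth-first list of everything `start` depends on, directly or transitively."""
    if start not in graph:
        return []  # unknown concept: return nothing rather than guess
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        # Sorting the neighbours gives the same visiting order on every run.
        for dep in sorted(graph.get(node, [])):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order


print(dependency_chain(DEPENDS_ON, "Tirzepatide"))
# ['GIP receptor', 'GLP-1 receptor', 'Incretin signalling']
```

Because the result is read directly off stored edges, a concept that is not in the graph yields an empty answer instead of an approximate one.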

Architecture (three layers, top to bottom):
Foundation Models: GPT · Claude · Gemini (LLM inference, any provider)
Context Intelligence Layer: Graphify.md CKG · Pre-structured domain knowledge · 269 tokens · no hallucinations · 42× RDS · Patent Pending
Enterprise Knowledge: APIs · Registries · Databases · Documents (SEC · USPTO · ClinicalTrials · GDELT)

Commercial proof: no expert curation required.

Track 2 built a GLP-1/Obesity pharmacology domain entirely from the ClinicalTrials.gov public API — no textbook, no domain expert, no annotation. The result exceeded the hand-curated educational average by 12.5%.

Track 2 · GLP-1/Obesity Pharmacology

Built from ClinicalTrials.gov API in one automated session

668 semaglutide trials · 224 tirzepatide trials · 158 pipeline agents (retatrutide, cagrisema, orforglipron). 90 concepts · 170 dependency edges · 170 benchmark queries.

F1 = 0.530

vs. 0.471 hand-curated educational average — commercial domain outperforms expert-curated domains.

Finding 1

Performance depends on graph structure, not curation source

Expert annotation is not a prerequisite for the CKG advantage. Any domain with stable concept relationships expressible in a DAG achieves the same retrieval superiority.

Finding 2

28× RDS advantage preserved on commercial domains

The token efficiency holds on enterprise data: 11× fewer tokens per query, 28× compound RDS over RAG. CKG F1 = 0.530 vs. RAG 0.154 on the same queries.

Finding 3

T4 category aggregation reaches F1 = 0.998

Near-perfect enumeration of agents by drug class, indication by anatomy, and trial by program. RAG: 0.108. GraphRAG: 0.031.
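
A category-aggregate (T4) query of this kind can be pictured as an index lookup rather than a similarity search. The sketch below assumes a hypothetical record layout with `name`, `taxonomy`, and `drug_class` fields; the rows are illustrative and are not taken from the benchmark graph.

```python
from collections import defaultdict

# Illustrative rows only; the benchmark graph's actual schema is not shown on this page.
CONCEPTS = [
    {"name": "Semaglutide", "taxonomy": "DRUG", "drug_class": "GLP-1RA"},
    {"name": "Liraglutide", "taxonomy": "DRUG", "drug_class": "GLP-1RA"},
    {"name": "Orforglipron", "taxonomy": "DRUG", "drug_class": "GLP-1RA"},
    {"name": "Tirzepatide", "taxonomy": "DRUG", "drug_class": "Dual GLP-1/GIP agonist"},
    {"name": "STEP", "taxonomy": "TRIAL", "drug_class": None},
]


def build_index(concepts, key):
    """Group concept names by the value stored under `key`, skipping missing values."""
    index = defaultdict(list)
    for concept in concepts:
        value = concept.get(key)
        if value is not None:
            index[value].append(concept["name"])
    # Sort members so the enumeration is stable across runs.
    return {value: sorted(names) for value, names in index.items()}


by_class = build_index(CONCEPTS, "drug_class")
print(by_class["GLP-1RA"])
# ['Liraglutide', 'Orforglipron', 'Semaglutide']
```

The same helper groups by any other field, for example `build_index(CONCEPTS, "taxonomy")`.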


Structural Insights — What the Graph Reveals
Hub Node Analysis

GLP-1RA drug class carries 20 downstream concepts

The GLP-1RA drug class node is the single highest-dependency hub in the graph: 20 concepts depend on it directly. Obesity pathophysiology (12 direct dependents) and Weight loss endpoints (10 direct dependents) form the next tier. Remove any of these three and the graph reorganizes structurally.
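
Hub analysis of this kind amounts to counting, for each concept, how many other concepts depend on it directly. A minimal sketch follows, assuming edges are stored as (dependent, prerequisite) pairs; the edge list and the name `rank_hubs` are hypothetical and far smaller than the 170-edge graph described here.

```python
from collections import Counter

# Hypothetical (dependent, prerequisite) edges for illustration.
EDGES = [
    ("Semaglutide", "GLP-1RA drug class"),
    ("Liraglutide", "GLP-1RA drug class"),
    ("Tirzepatide", "GLP-1RA drug class"),
    ("Visceral adiposity", "Obesity pathophysiology"),
    ("Insulin resistance", "Obesity pathophysiology"),
    ("STEP", "Weight loss endpoints"),
]


def rank_hubs(edges):
    """Rank prerequisites by number of direct dependents, ties broken by name."""
    # set() drops duplicate edges so each dependent is counted once per hub.
    counts = Counter(prereq for _dependent, prereq in set(edges))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


for concept, dependents in rank_hubs(EDGES):
    print(f"{concept}: {dependents} direct dependents")
```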

Unique Structural Position

Tirzepatide is the only dual GLP-1 + GIP agonist in the graph

Every other drug in the DRUG taxonomy activates a single receptor pathway. Tirzepatide's simultaneous GLP-1 and GIP receptor activation is not a marketing claim; it is a structural position no other agent in the 90-concept graph shares. The graph made this unambiguous before anyone had read a single trial paper.

Multi-Hop Path

Obesity to cardiovascular mortality: 7 hops

The graph traces a 7-hop dependency chain from obesity pathophysiology through insulin resistance, visceral adiposity, metabolic syndrome, dyslipidemia, cardiovascular disease, and MACE endpoints to cardiovascular mortality. The SELECT trial outcome was visible in this architecture before the data were published: the path already existed in the structure.
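
A multi-hop path query like this one can be sketched as a breadth-first shortest-path search that records each node's predecessor. The example assumes that consecutive concepts in the sentence above are joined by one edge each and that the chain ends at "Cardiovascular mortality", as the heading suggests; the real graph may contain additional edges. `shortest_path` is a hypothetical helper name.

```python
from collections import deque

# Chain assumed from the prose ordering; one edge between each consecutive pair.
CHAIN = [
    "Obesity pathophysiology", "Insulin resistance", "Visceral adiposity",
    "Metabolic syndrome", "Dyslipidemia", "Cardiovascular disease",
    "MACE endpoints", "Cardiovascular mortality",
]
LEADS_TO = {a: [b] for a, b in zip(CHAIN, CHAIN[1:])}


def shortest_path(graph, source, target):
    """Return the fewest-hop path from source to target, or None if unreachable."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, [])):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


path = shortest_path(LEADS_TO, CHAIN[0], CHAIN[-1])
print(len(path) - 1, "hops")  # 7 hops
```

A query in the reverse direction returns `None` here, because the sketch stores edges in one direction only.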

Pipeline Coverage

14 active trial programs across 7 taxonomic layers

SUSTAIN, STEP, SURMOUNT, AWARD, LEADER, SCALE, SELECT, and CVOT design are among the 14 programs mapped with their endpoint dependencies. Seven taxonomies (FOUND, PATH, DRUG, TRIAL, COMPL, SPEC, COMBO) partition 90 concepts into reasoning layers an LLM can traverse without hallucination.

Every knowledge-intensive industry
has the same problem.

27 verticals deployed. The architecture is identical across domains — only the knowledge graph changes.

New Series

5-Minute CKG Series

Live interactive knowledge graphs — any topic, production-ready in minutes.

CKG: 269 tokens/query  ·  RAG: 2,982 tokens/query  ·  11× fewer tokens
Food & Retail
Aldi Grocery Disruption
45 concepts · 78 edges · 180+ stores opening 2026
Explore Graph →
Workforce
Sell AI. Cut Jobs.
52 concepts · 88 edges · The C-Suite windfall map
Explore Graph →
AI / Tech
Agentic AI Landscape 2026
45 concepts · 82 edges · LangGraph to MCP
Explore Graph →
Trade & Economics
Tariffs & Supply Chain 2026
42 concepts · 70 edges · 82% of supply chains affected
Explore Graph →
Media & Culture
World's Greatest Rappers
47 concepts · 78 edges · GOAT debate settled in a graph
Explore Graph →
Music & Lyrics
Eminem Discography
45 concepts · 55 edges · 220M records, one breakdown
Explore Graph →
🚗
Automotive AI
China vs. West competitive intelligence. 75 entities, 6 domains — BYD, Waymo, NVIDIA, Tesla mapped against capital, policy, and technology layers.
Live Graph →
📊
Financial Services
Portfolio intelligence, EDGAR filings, compliance monitoring, deal sourcing. Structural reasoning across regulatory frameworks.
⚖️
Legal & IP
Contract analysis, regulatory compliance, precedent chains. Deterministic traversal where hallucinations are inadmissible.
🏦
Insurance
Risk selection, portfolio optimization, regulatory mapping. Multi-hop reasoning across policy logic RAG cannot traverse.
🏗️
Construction
Project specifications, subcontractor networks, compliance requirements. Structured intelligence for complex procurement decisions.
🧬
AI Infrastructure
Model architectures, benchmarking frameworks, API ecosystems. Technical intelligence for AI procurement and integration decisions.
🎓
Education
Curriculum knowledge graphs, learning dependency chains, competency frameworks. Personalized instruction at scale.

The moat is built in.

Patent-protected methodology, peer-reviewed benchmark, and a construction pipeline that scales to any structured domain.

Patent Pending · USPTO · Priority date locked

April 16, 2026 · Provisional application on file

Conversion in progress · Represented by patent counsel

Microsoft, Intel, Unlikely AI · Citations: Buehler Lab · MIT

44 domains · 7,758 queries · F1 3.8× over RAG (0.471 vs. 0.123) · 42× RDS · Published open benchmark, with an academic citation trail established before non-provisional filing.

GLP-1/Obesity Track 2: pipeline-generated CKG from ClinicalTrials.gov API, zero human annotation. F1 0.53 — exceeds hand-curated average by 12.5%. The method scales to any structured domain automatically.

What's Protected
  • 01 Multi-Domain Compound Environment — The architecture in which heterogeneous knowledge graph domains interact to produce emergent intelligence unavailable from any single graph. Non-linear returns by design.
  • 02 Dynamic Context Adaptation — Real-time graph re-weighting based on portfolio, customer profile, or operational context — with no manual reconfiguration. The system adapts to the user's world.
  • 03 Compression Methodology — Proprietary process for compressing domain knowledge into structured LLM context. The 42× RDS advantage is a direct output of this claim.
  • 04 Serialized Delivery Format — Proprietary encoding schema as the native delivery format for LLM context. Human-readable, zero build cost, hallucination rate of zero by construction.
The Next Frontier

The hardest problem in AI
isn't building agents.
It's representing information.

RAG searches by similarity. It doesn't understand structure. Higher accuracy, zero hallucination, and a token reduction of about 91% (269 vs. 2,982 tokens per query) aren't aspirational; they're the result of searching structure, not probability.

🕸️

Structure search, not similarity search

RAG guesses from all available information. CKG traverses pre-built dependency paths — no vector similarity, no hallucination. The answer is in the graph or it isn't there.

Intelligence compounds at the knowledge layer

When structured domain graphs interact, emergent connections appear that no single graph — and no retrieval pipeline — could surface. This is CKGO: the orchestration of knowledge, not agents.

Patent Pending · April 2026
📡

Generative Answer Optimization

As AI replaces search, the question isn't how to rank your content — it's whether your domain knowledge is structured enough for AI to cite accurately. CKG is the answer layer for GAO.

"Intelligence doesn't compound through agents. It compounds through structure."

Ready to deploy in your domain?

The technology is built. The benchmark is published. The patent is filed. The next step is a 30-minute call to scope your domain and structure a pilot.

Schedule a 30-Minute Call

Daniel Yarmoluk · Founder · Graphify.md
daniel.yarmoluk@gmail.com