CKG Specification 1.0

Status: Draft · final review  ·  Version: 1.0.0-draft  ·  Editor: Graphify.md (Daniel Yarmoluk)
License: format & spec — open · reference data CC BY 4.0 · reference client MIT

Draft — not for public distribution. Pending legal sign-off. This document specifies the open format and conformance of a Compressed Knowledge Graph. The discovery, extraction, and compression method that produces a CKG is proprietary and patent-pending — explicitly out of scope (§8).

1 Scope · 2 Overview · 3 Data model · 4 Serialization · 5 Query semantics · 6 Conformance (L0/L1/L2) · 7 Metadata & versioning · 8 Out of scope · 9 References

1 · Scope

This specification defines the structure, serialization, and conformance of a Compressed Knowledge Graph (CKG) — a portable, model-agnostic knowledge layer an AI agent reads before it acts. It defines what a conformant CKG is and how to validate and certify one. It does not define how one is produced.

2 · Overview

A CKG is a directed acyclic graph (DAG) of typed domain concepts connected by explicit prerequisite (dependency) edges. Relationships are declared, not inferred: there is no graph database, no embedding, and no runtime similarity search. A CKG is serialized as plain text (CSV or Markdown) and is exportable to RDF/Turtle and JSON-LD, so it is human-readable, Git-diffable, and reproducible. It is consumed by deterministic traversal: queried, never summarized.

3 · Data model

3.1 Concept (node)

A node is a single typed domain concept. A conformant node carries:

Field          Req.    Definition
ConceptID      MUST    Unique identifier within the graph (integer or stable string).
ConceptLabel   MUST    Human-readable concept name. Non-empty.
Dependencies   MUST    Pipe-delimited list of prerequisite ConceptIDs (may be empty). Encodes the edges.
TaxonomyID     MUST    Category/grouping code for the concept.
Confidence     SHOULD  Calibrated score in [0,1]. Required for L2.
Provenance     SHOULD  Source citation(s) for the concept. Required for L2.
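As a non-normative illustration, the node fields above could be held in memory roughly as follows. The class name `Concept`, the snake_case attribute names, and the helper `missing_for_l2` are assumptions made for this sketch; the specification itself only names the fields.

```python
from dataclasses import dataclass
from typing import Optional, Union

ConceptID = Union[int, str]  # "integer or stable string" per the field table


@dataclass(frozen=True)
class Concept:
    """One CKG node. Attribute names are hypothetical, not part of the spec."""
    concept_id: ConceptID                        # MUST
    concept_label: str                           # MUST, non-empty
    dependencies: tuple[ConceptID, ...] = ()     # MUST, may be empty
    taxonomy_id: str = ""                        # MUST
    confidence: Optional[float] = None           # SHOULD, required for L2
    provenance: tuple[str, ...] = ()             # SHOULD, required for L2

    def missing_for_l2(self) -> list[str]:
        """Names of the SHOULD fields that L2 requires but this node lacks."""
        missing = []
        if self.confidence is None:
            missing.append("Confidence")
        if not self.provenance:
            missing.append("Provenance")
        return missing


limit = Concept("3", "Limit", ("2",), "CALC")
print(limit.missing_for_l2())
```

A node built from the four MUST fields alone is usable at L0 or L1; the helper only reports what an L2 review would still need.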

3.2 Edge

An edge is a directed prerequisite/dependency relationship: A depends-on B, declared in A's Dependencies field. Extended typed relationships (enables, causes, gates, contradicts) MAY be layered on the base dependency edge. Every dependency MUST resolve to an existing ConceptID.
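The resolution rule can be checked mechanically. The sketch below is one possible reading, with hypothetical function names: it splits each raw Dependencies field on the pipe character and reports every edge whose target is not a known ConceptID.

```python
def parse_dependencies(field: str) -> list[str]:
    """Split a pipe-delimited Dependencies field; an empty field means no edges."""
    return [part.strip() for part in field.split("|") if part.strip()]


def dangling_edges(rows: dict[str, str]) -> list[tuple[str, str]]:
    """rows maps ConceptID -> raw Dependencies field.

    Returns (concept, missing_prerequisite) pairs that violate the rule
    that every dependency must resolve to an existing ConceptID.
    """
    return [
        (concept_id, dep)
        for concept_id, raw in rows.items()
        for dep in parse_dependencies(raw)
        if dep not in rows
    ]


rows = {"1": "", "2": "1", "3": "2|5"}   # concept 5 does not exist
print(dangling_edges(rows))              # [('3', '5')]
```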

3.3 Graph constraints

4 · Serialization

4.1 CSV (canonical)

ConceptID,ConceptLabel,Dependencies,TaxonomyID
1,Function,,FOUND
2,Domain and Range,1,FOUND
3,Limit,2,CALC
# Dependencies: pipe-delimited prerequisite ConceptIDs (e.g. "2|5")
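A minimal loading sketch for the CSV form, using only the Python standard library. Treating lines that start with `#` as comments is an assumption drawn from the example above, and the function name `load_ckg_csv` and the output dictionary keys are hypothetical.

```python
import csv

CKG_CSV = """\
ConceptID,ConceptLabel,Dependencies,TaxonomyID
1,Function,,FOUND
2,Domain and Range,1,FOUND
3,Limit,2,CALC
# Dependencies: pipe-delimited prerequisite ConceptIDs (e.g. "2|5")
"""


def load_ckg_csv(text: str) -> dict[str, dict]:
    """Parse canonical CKG CSV into {ConceptID: {label, deps, taxonomy}}."""
    # Assumption: blank lines and '#' lines carry no data and are skipped.
    lines = [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    graph = {}
    for row in csv.DictReader(lines):
        graph[row["ConceptID"]] = {
            "label": row["ConceptLabel"],
            "deps": [d for d in row["Dependencies"].split("|") if d],
            "taxonomy": row["TaxonomyID"],
        }
    return graph


graph = load_ckg_csv(CKG_CSV)
print(graph["3"])
```

ConceptIDs are kept as strings here so that integer and string identifiers are handled the same way.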

4.2 Markdown

An equivalent human-first serialization: one section per concept, with label, taxonomy, dependencies (as [[links]]), confidence, and source. It round-trips losslessly with CSV.
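The exact Markdown layout is not fixed by this section, so the following is only a sketch under assumed conventions: a level-2 heading per concept, a bullet per field, and dependency links written with the prerequisite's label. The function name, the heading level, and the bullet wording are all hypothetical.

```python
GRAPH = {
    "1": {"label": "Function", "deps": [], "taxonomy": "FOUND"},
    "2": {"label": "Domain and Range", "deps": ["1"], "taxonomy": "FOUND"},
    "3": {"label": "Limit", "deps": ["2"], "taxonomy": "CALC"},
}


def to_markdown(graph: dict) -> str:
    """Render one section per concept (assumed layout, not normative)."""
    sections = []
    for concept_id, node in graph.items():
        links = ", ".join(f"[[{graph[d]['label']}]]" for d in node["deps"])
        lines = [
            f"## {node['label']}",
            "",
            f"- ConceptID: {concept_id}",
            f"- TaxonomyID: {node['taxonomy']}",
            f"- Dependencies: {links or '(none)'}",
        ]
        # Confidence and source are optional below L2, so emit them only if set.
        if node.get("confidence") is not None:
            lines.append(f"- Confidence: {node['confidence']}")
        if node.get("source"):
            lines.append(f"- Source: {node['source']}")
        sections.append("\n".join(lines))
    return "\n\n".join(sections) + "\n"


markdown = to_markdown(GRAPH)
print(markdown)
```

Keeping the ConceptID in each section is what would let a reader rebuild the CSV rows from this form.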

4.3 RDF/Turtle & JSON-LD

A conformant CKG MAY be exported to RDF/Turtle (concepts as subjects, dependencies as a ckg:prerequisite predicate) and JSON-LD for interoperability with semantic-web tooling. These are views; CSV/Markdown remain canonical.
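A rough sketch of the Turtle view. Only the `ckg:prerequisite` predicate comes from the text above; the namespace IRI `https://example.org/ckg#`, the `c` prefix on local names, and the use of `rdfs:label` for ConceptLabel are assumptions made for illustration. The sketch also assumes ConceptIDs contain only letters, digits, and underscores.

```python
def to_turtle(graph: dict, namespace: str = "https://example.org/ckg#") -> str:
    """Emit concepts as subjects and dependencies as ckg:prerequisite objects."""
    lines = [
        f"@prefix ckg: <{namespace}> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for concept_id, node in graph.items():
        # Escape backslashes and quotes so the label is a valid Turtle string.
        label = node["label"].replace("\\", "\\\\").replace('"', '\\"')
        parts = [f'rdfs:label "{label}"']
        if node["deps"]:
            objects = ", ".join(f"ckg:c{dep}" for dep in node["deps"])
            parts.append(f"ckg:prerequisite {objects}")
        lines.append(f"ckg:c{concept_id} " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines) + "\n"


example = {
    "1": {"label": "Function", "deps": []},
    "3": {"label": "Limit", "deps": ["2"]},
}
turtle = to_turtle(example)
print(turtle)
```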

5 · Query semantics

A conformant client exposes at minimum:

Operation                            Returns
list_domains()                       Available CKG domains.
query_ckg(domain, concept, depth)    The sub-graph of prerequisites and dependents up to depth hops.
get_prerequisites(domain, concept)   The full prerequisite chain to root.
search_concepts(domain, query)       Concepts matching the query.
validate_ckg(graph, profile)         A conformance report (§6).

Query classes a conformant graph supports: T1 entity lookup · T2 direct dependency · T3 multi-hop path · T4 category aggregation · T5 cross-concept relationship.
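The two traversal operations can be sketched over an in-memory adjacency map. This is not the reference client: the functions below take the graph directly instead of a domain name, return ConceptIDs only, and read "prerequisites and dependents up to depth hops" as two separate directed searches, which is one possible interpretation.

```python
from collections import deque

# ConceptID -> prerequisite ConceptIDs (the CSV example from §4.1)
GRAPH = {"1": [], "2": ["1"], "3": ["2"]}


def _reach(adjacency: dict, start: str, max_hops=None) -> list:
    """Breadth-first search: nodes within max_hops edges of start (None = no limit)."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        concept_id, hops = queue.popleft()
        if max_hops is not None and hops == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in adjacency[concept_id]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, hops + 1))
    return order


def get_prerequisites(graph: dict, concept: str) -> list:
    """Every transitive prerequisite, nearest first."""
    return _reach(graph, concept)


def query_ckg(graph: dict, concept: str, depth: int) -> dict:
    """Prerequisites and dependents within depth hops of concept."""
    dependents = {concept_id: [] for concept_id in graph}
    for concept_id, deps in graph.items():
        for dep in deps:
            dependents[dep].append(concept_id)
    return {
        "prerequisites": _reach(graph, concept, depth),
        "dependents": _reach(dependents, concept, depth),
    }


print(get_prerequisites(GRAPH, "3"))   # ['2', '1']
print(query_ckg(GRAPH, "2", 1))        # {'prerequisites': ['1'], 'dependents': ['3']}
```

Because the graph is declared and acyclic, both calls return the same result on every run, which is the deterministic-traversal property described in §2.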

6 · Conformance

Three levels. A graph advances to a level only after it satisfies the level below.

Level  Meaning                              Checked by
L0     Raw: auto-extracted, unreviewed.
L1     Structurally valid (§6.1).           machine (validate_ckg)
L2     Authority-certified (§6.2).          a named human authority

6.1 · L1 — structural conformance

A graph is L1 if and only if it satisfies, machine-checkably:

  1. All required node fields present (§3.1); labels non-empty; valid TaxonomyID.
  2. ConceptIDs unique; every Dependencies reference resolves.
  3. The graph is acyclic (DAG).
  4. If present, Confidence ∈ [0,1].
  5. No untyped free-text assertions (no surface for hallucination).
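Rules 1 to 4 above lend themselves to a short checker. The sketch below is illustrative and is not the reference `validate_ckg`: the function name `validate_l1` and its error strings are hypothetical, "valid TaxonomyID" is read as "non-empty and, if an allowed set is supplied, a member of it", and rule 5 is left out because its machine test is not defined here. Acyclicity is tested with Kahn's algorithm.

```python
REQUIRED = ("ConceptID", "ConceptLabel", "Dependencies", "TaxonomyID")


def validate_l1(nodes: list, taxonomy_ids=None) -> list:
    """Return a list of rule violations; an empty list means rules 1-4 pass."""
    errors = [
        f"row {i}: missing {name}"
        for i, node in enumerate(nodes) for name in REQUIRED if name not in node
    ]
    if errors:
        return errors  # the remaining checks need every required field

    ids = [node["ConceptID"] for node in nodes]
    known = set(ids)
    if len(known) != len(ids):
        errors.append("ConceptIDs are not unique")

    for node in nodes:
        cid = node["ConceptID"]
        if not node["ConceptLabel"].strip():
            errors.append(f"{cid}: empty ConceptLabel")
        tax = node["TaxonomyID"]
        if not tax or (taxonomy_ids is not None and tax not in taxonomy_ids):
            errors.append(f"{cid}: invalid TaxonomyID")
        for dep in node["Dependencies"]:
            if dep not in known:
                errors.append(f"{cid}: unresolved dependency {dep}")
        conf = node.get("Confidence")
        if conf is not None and not 0.0 <= conf <= 1.0:
            errors.append(f"{cid}: Confidence outside [0,1]")

    # Rule 3, Kahn's algorithm: repeatedly remove nodes whose prerequisites
    # have all been removed; anything left over sits on or behind a cycle.
    deps = {n["ConceptID"]: [d for d in n["Dependencies"] if d in known] for n in nodes}
    pending = {cid: len(ds) for cid, ds in deps.items()}
    ready = [cid for cid, count in pending.items() if count == 0]
    processed = 0
    while ready:
        current = ready.pop()
        processed += 1
        for cid, ds in deps.items():
            hits = ds.count(current)
            if hits:
                pending[cid] -= hits
                if pending[cid] == 0:
                    ready.append(cid)
    if processed != len(deps):
        errors.append("graph contains a cycle")
    return errors


cyclic = [
    {"ConceptID": "a", "ConceptLabel": "A", "Dependencies": ["b"], "TaxonomyID": "T"},
    {"ConceptID": "b", "ConceptLabel": "B", "Dependencies": ["a"], "TaxonomyID": "T"},
]
print(validate_l1(cyclic))   # ['graph contains a cycle']
```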

6.2 · L2 — authority conformance & certification

An L1 graph becomes L2 when a domain authority reviews it against a Conformance Profile and signs. This is the human-in-the-loop, anti-black-box layer.

6.2.1 — The Conformance Profile

A profile, authored by the domain authority (ontologist / SME / brand manager), declares the rules the graph must obey:

6.2.2 — The certification procedure

  1. validate_ckg(graph, profile) confirms the graph passes L1 and meets the profile's machine-checkable rules → a conformance report.
  2. A named domain authority reviews every flagged item — approve · edit · prune · add — with each node's confidence and provenance visible.
  3. The authority signs. The attestation records: reviewer identity, timestamp, profile name + version, and a content hash of the graph.
  4. The result is a named, versioned L2 artifact (e.g. parcels-pursuit-ckg@1.2) with an immutable attestation block (§7).
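Step 3 of the procedure can be sketched as follows. The procedure names the four things an attestation records but not their encoding, so the dictionary keys, the choice of SHA-256 over the UTF-8 bytes of the canonical serialization, the `sha256:` prefix, and the reviewer and profile values are all assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone


def content_hash(serialized: str) -> str:
    """Hash the serialized graph (assumed: SHA-256 over its UTF-8 bytes)."""
    return "sha256:" + hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def attest(serialized: str, reviewer: str, profile: str) -> dict:
    """Build the attestation record a signing authority would attach."""
    return {
        "reviewer": reviewer,                                  # reviewer identity
        "timestamp": datetime.now(timezone.utc).isoformat(),   # time of signing
        "profile": profile,                                    # profile name + version
        "content_hash": content_hash(serialized),              # binds to this content
    }


graph_csv = "ConceptID,ConceptLabel,Dependencies,TaxonomyID\n1,Function,,FOUND\n"
attestation = attest(graph_csv, "Reviewer Name", "example-profile@1.0")
print(attestation)
```

A real deployment would also sign the record cryptographically; that step is outside what this sketch shows.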

6.2.3 — Re-certification

Any prune/add/edit after certification produces a new version (e.g. @1.3) that re-enters review. An L2 attestation binds to exactly one content hash; changing the content invalidates the attestation until re-signed.
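The binding between an attestation and one content hash can be checked in a few lines. As a sketch, and assuming the hash is SHA-256 over the serialized text (the hash function is not specified here), any edit to the graph makes the stored hash stop matching:

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def attestation_holds(serialized: str, attestation: dict) -> bool:
    """True only if the graph is byte-for-byte the content that was signed."""
    return attestation["content_hash"] == sha256_hex(serialized)


certified = "ConceptID,ConceptLabel,Dependencies,TaxonomyID\n1,Function,,FOUND\n"
attestation = {"content_hash": sha256_hex(certified)}   # other fields omitted

edited = certified + "2,Domain and Range,1,FOUND\n"     # an "add" after signing

print(attestation_holds(certified, attestation))   # True
print(attestation_holds(edited, attestation))      # False
```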

7 · Metadata & versioning

A CKG carries a metadata block: name, version (semver), conformance_level (L0/L1/L2), profile (name@version, for L2), attestation (reviewer, timestamp, content hash, for L2), license, and updated. Any content change increments at least PATCH and resets L2 to pending.
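The versioning rule can be illustrated with a small helper. How "pending" is represented is not stated, so this sketch assumes it means dropping conformance_level back to L1 and clearing the attestation until the graph is re-signed; the function name `bump_patch` and the sample values are hypothetical.

```python
from datetime import date


def bump_patch(metadata: dict, today: str = "") -> dict:
    """Return new metadata for a content change; the input is left untouched."""
    major, minor, patch = (int(part) for part in metadata["version"].split("."))
    new = dict(metadata)
    new["version"] = f"{major}.{minor}.{patch + 1}"
    new["updated"] = today or date.today().isoformat()
    if metadata.get("conformance_level") == "L2":
        # Assumption: "pending" is modelled as L1 with no attestation.
        new["conformance_level"] = "L1"
        new["attestation"] = None
    return new


certified = {
    "name": "parcels-pursuit-ckg",
    "version": "1.2.0",
    "conformance_level": "L2",
    "profile": "example-profile@1.0",
    "attestation": {"reviewer": "Reviewer Name", "content_hash": "..."},
    "license": "CC BY 4.0",
    "updated": "2026-01-01",
}
revised = bump_patch(certified, today="2026-02-01")
print(revised["version"], revised["conformance_level"])   # 1.2.1 L1
```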

8 · Out of scope (the method)

This specification does not define how a CKG is discovered, extracted, or compressed from source material. The automated ontological discovery and compression method — including the retrieval architecture (index-route + full-graph load) — is proprietary and the subject of pending patent applications. Conformance is defined entirely on the artifact, independent of how it was produced. Any process that emits a graph meeting §3–§6 is conformant.

9 · References

Reference client: github.com/Yarmoluk/ckg-mcp (MIT)
Reference data & benchmark: github.com/Yarmoluk/ckg-benchmark (CC BY 4.0) — 45 domains, 7,928 queries; Macro-F1 0.471 vs RAG 0.123 vs GraphRAG 0.120
Conformance levels: L0 raw · L1 structural · L2 authority-certified

CKG Specification 1.0.0-draft · Graphify.md · © 2026 · format open, method proprietary · draft for review, not for public distribution.