---
name: ckg-azure-purview-catalog
description: "Microsoft Purview data catalog & governance platform — complete CKG of the learn.microsoft.com/purview documentation surface. 288 nodes, 413 edges, 10 domains. Covers the Data Map (assets, collections/domains, scans, scan rule sets, resource sets, integration runtimes, sources), the Unified Catalog (governance domains, data products, glossary terms, critical data elements, OKRs, managed attributes), classification & sensitivity labels (MIP), Purview Data Quality (profiling, rules, dimensions, scoring), data lineage (ADF/Synapse/Fabric/Power BI), Unified Catalog access policies & Data Map policies, collection-based RBAC & data-governance roles, Fabric/OneLake integration, and data estate health/insights."
metadata:
  node_type: reference
  type: reference
  version: 1.0.0
  date: 2026-06-18
  source: "Microsoft Learn documentation — https://learn.microsoft.com/purview"
  nodes: 288
  edges: 413
  domains: 10
  formats:
    - md
---

# Microsoft Purview Data Catalog & Governance Platform — Compressed Knowledge Graph (CKG) v1.0.0
# Source: learn.microsoft.com/purview  (live docs, fetched 2026-06-18)
# NOTE: Microsoft Purview data governance splits into two solutions: the Data Map (metadata store,
#       scanning/classification, collections) and the new Unified Catalog (governance domains,
#       data products, glossary, CDEs, OKRs, data quality, health). The legacy 'classic Data Catalog'
#       experience is being superseded by the Unified Catalog. Several features are marked Preview in docs.
# Generated by Graphify.md | graphifymd.com
# 288 concepts · 413 dependency edges · 10 domains
# Paste into any LLM, ask: "What depends on [concept]?" or "Trace the path from Scan to Data quality score."

## META
domain:      azure-purview-catalog
nodes:       288
edges:       413
domains:     10 (CORE · META · SRCH · DQ · LIN · GOV · GLOS · OPEN · RBAC · INSIGHTS)
edge_type:   technical dependency (each entry in a line's deps list is required to understand/implement the concept that starts the line; "|" separates entries)
version:     1.0.0
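
The concept lines in the sections below share one shape: a concept name, padding, the `deps:` marker, then either `none (root)` or a `|`-separated list of prerequisites. As a minimal sketch, assuming that shape holds for every line and that no concept name itself contains the marker, the listing can be loaded into a dictionary and queried for "What depends on X?". The names `parse_ckg` and `dependents_of` are illustrative and not part of any tool.

```python
import re
from collections import defaultdict

# A concept line: name, whitespace padding, the "deps:" marker, then the list.
LINE_RE = re.compile(r"^(?P<name>.+?)\s+deps:\s*(?P<deps>.+)$")

def parse_ckg(text):
    """Map each concept to the list of concepts it depends on."""
    graph = {}
    for line in text.splitlines():
        if line.startswith("#"):          # headings and "# Source:" comments
            continue
        match = LINE_RE.match(line.rstrip())
        if not match:
            continue                      # blank lines, META key/value lines
        deps = match.group("deps").strip()
        if deps == "none (root)":
            graph[match.group("name")] = []
        else:
            graph[match.group("name")] = [d.strip() for d in deps.split("|")]
    return graph

def dependents_of(graph, concept):
    """Return the concepts that list `concept` directly in their deps."""
    reverse = defaultdict(list)
    for name, deps in graph.items():
        for dep in deps:
            reverse[dep].append(name)
    return reverse[concept]

# A few lines copied from the CORE section.
sample = """\
## CORE
Microsoft Purview                                   deps: none (root)
Microsoft Purview account                           deps: Microsoft Purview
Data governance solution                            deps: Microsoft Purview account
Data Map                                            deps: Data governance solution
Unified Catalog                                     deps: Data governance solution
Data Map vs Unified Catalog split                   deps: Data Map | Unified Catalog
"""

graph = parse_ckg(sample)
print(dependents_of(graph, "Data governance solution"))
# ['Data Map', 'Unified Catalog']
```

The lookup returns direct dependents only; a transitive answer would repeat it on each result.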

## CORE — Platform, Data Map & Unified Catalog (22 concepts)
# Source: data-map · data-map-scan-ingestion · unified-catalog · data-governance-overview · purview-portal · account-upgrades
Microsoft Purview                                   deps: none (root)
Microsoft Purview account                           deps: Microsoft Purview
Microsoft Purview portal (new)                      deps: Microsoft Purview account
Classic governance portal                           deps: Microsoft Purview account
Free vs Enterprise account type                     deps: Microsoft Purview account
Microsoft Entra ID tenant                           deps: none (root)
Data governance solution                            deps: Microsoft Purview account
Federated data governance                           deps: Data governance solution
Data Map                                            deps: Data governance solution
Unified Catalog                                     deps: Data governance solution
Classic Data Catalog                                deps: Data Map
Data Map vs Unified Catalog split                   deps: Data Map | Unified Catalog
New experience upgrade                              deps: Microsoft Purview account | Microsoft Purview portal (new)
Data Map capacity unit (CU)                         deps: Data Map
Operations throughput                               deps: Data Map capacity unit (CU)
Metadata storage                                    deps: Data Map capacity unit (CU)
Elastic Data Map (autoscale)                        deps: Data Map capacity unit (CU) | Operations throughput | Metadata storage
Technical metadata                                  deps: Metadata storage
Business metadata                                   deps: Metadata storage
Operational metadata                                deps: Metadata storage
Semantic metadata                                   deps: Metadata storage
Data Map monitoring metrics                         deps: Data Map capacity unit (CU) | Metadata storage

## META — Assets, Schema & Metadata Model (26 concepts)
# Source: data-map · data-map-scan-ingestion · data-map-resource-sets · data-map-data-sources-register-manage · data-gov-live-view
Data asset                                          deps: Data Map | Technical metadata
Asset fully qualified name (FQN)                    deps: Data asset
Asset schema                                        deps: Data asset | Technical metadata
Column                                              deps: Asset schema
Asset type                                          deps: Data asset
Table asset                                         deps: Asset type | Asset schema
View asset                                          deps: Asset type | Asset schema
File asset                                          deps: Asset type
Folder / container asset                            deps: Asset type
Nested data (JSON)                                  deps: Asset schema | Column
Asset business metadata curation                    deps: Data asset | Business metadata
Asset description                                   deps: Asset business metadata curation
Asset owner / expert (contacts)                     deps: Asset business metadata curation
Asset certification                                 deps: Data asset | Asset business metadata curation
Related entities (child objects)                    deps: Data asset
Live view                                           deps: Data asset | Microsoft Entra ID tenant
Resource set                                        deps: Data asset | Asset schema
Resource set pattern                                deps: Resource set
Built-in resource set patterns                      deps: Resource set pattern
Regex-based pattern ({GUID}/{N}/{HEX}/{LOC})        deps: Built-in resource set patterns
Complex pattern ({SparkPartitions}/date-in-path)    deps: Built-in resource set patterns
Resource set detection                              deps: Resource set pattern | Resource set
Resource set display name extraction                deps: Resource set detection
Advanced resource sets                              deps: Resource set | Resource set detection
Resource set partition count / sample path / size   deps: Advanced resource sets
Resource set pattern rules (custom)                 deps: Resource set pattern | Advanced resource sets
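
The idea behind resource set patterns can be illustrated by replacing variable path segments with a placeholder so that many partition files collapse into one key. The sketch below is a simplification under stated assumptions: the three regular expressions are hypothetical stand-ins rather than Purview's actual pattern definitions, `{LOC}` and `{SparkPartitions}` are left out, and the helper names are invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stand-ins; the real pattern definitions live in the Purview docs.
PLACEHOLDERS = [
    ("{GUID}", re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")),
    ("{N}", re.compile(r"^\d+$")),
    ("{HEX}", re.compile(r"^[0-9a-fA-F]{6,}$")),
]

def normalize_segment(segment):
    """Replace a path segment with a placeholder if its stem matches a pattern."""
    stem, dot, extension = segment.partition(".")
    for placeholder, pattern in PLACEHOLDERS:
        if pattern.match(stem):
            return placeholder + dot + extension
    return segment

def group_paths(paths):
    """Group file paths that share the same normalized shape."""
    groups = defaultdict(list)
    for path in paths:
        key = "/".join(normalize_segment(s) for s in path.split("/"))
        groups[key].append(path)
    return dict(groups)

paths = [
    "sales/2024/01/0001.parquet",
    "sales/2024/02/0002.parquet",
    "logs/3f2504e0-4f89-11d3-9a0c-0305e82c3301/events.json",
]
groups = group_paths(paths)
for key, members in groups.items():
    print(key, len(members))
```

Both `sales` files fall under the single key `sales/{N}/{N}/{N}.parquet`, which mirrors how a resource set stands in for many partition files as one catalog object.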

## SRCH — Scanning, Classification, Search & Browse (38 concepts)
# Source: data-map-scan-ingestion · data-map-data-sources · data-map-classification · data-map-scan-rule-set · register-scan-* · unified-catalog-data-assets-search
Data source registration                            deps: Data Map | Data source
Scan                                                deps: Data source registration
Scan authentication method                          deps: Scan | Microsoft Entra ID tenant
Managed identity (system-assigned)                  deps: Scan authentication method | Microsoft Purview account
Service principal credential                        deps: Scan authentication method
Azure Key Vault credential connection               deps: Scan authentication method
Scan scope (entity selection)                       deps: Scan
Scan level (L1/L2/L3)                               deps: Scan
L1 scan (basic metadata)                            deps: Scan level (L1/L2/L3)
L2 scan (schema)                                    deps: Scan level (L1/L2/L3) | Asset schema
L3 scan (classification)                            deps: Scan level (L1/L2/L3) | Data classification
Auto detect scan level                              deps: Scan level (L1/L2/L3)
Data sampling for classification                    deps: L3 scan (classification)
Scan rule set                                       deps: Scan
System scan rule set                                deps: Scan rule set
Custom scan rule set                                deps: Scan rule set
File types supported for scanning                   deps: Scan rule set
Custom file type / parser                           deps: Custom scan rule set | File types supported for scanning
Scan trigger                                        deps: Scan
Scan schedule (daily/weekly/monthly)                deps: Scan trigger
Run scan now (one-time)                             deps: Scan trigger
Full vs incremental scan                            deps: Scan
Deleted asset detection                             deps: Scan | Full vs incremental scan
Scan run monitoring                                 deps: Scan
Ingestion                                           deps: Scan | Data Map
Ingestion from scans                                deps: Ingestion | Resource set detection
Ingestion from lineage connections                  deps: Ingestion | Data lineage
Data classification                                 deps: Data asset
System classification (200+)                        deps: Data classification
Custom classification                               deps: Data classification
Classification rule                                 deps: Data classification
Custom classification rule (regex / dictionary)     deps: Custom classification | Classification rule
Asset-level vs column-level classification          deps: Data classification | Column
Auto-applied classification on scan                 deps: Data classification | L3 scan (classification)
Unified Catalog search                              deps: Unified Catalog | Data asset
AI-powered copilot search                           deps: Unified Catalog search
Browse by collection / source type                  deps: Unified Catalog search | Collection
Search by governance domain / data product          deps: Unified Catalog search | Governance domain | Data product

## DQ — Purview Data Quality (31 concepts)
# Source: unified-catalog-data-quality · unified-catalog-data-quality-rules · unified-catalog-data-quality-profiling · unified-catalog-data-quality-scan · unified-catalog-data-quality-scores
Data quality                                        deps: Unified Catalog | Data product
Data quality life cycle                             deps: Data quality
Data source connection (DQ)                         deps: Data quality | Managed identity (system-assigned)
Data quality compute (Spark/Delta)                  deps: Data quality
Data profiling                                      deps: Data quality | Data source connection (DQ)
AI-recommended columns for profiling                deps: Data profiling
Profiling statistics (min/max/std/uniqueness)       deps: Data profiling
Column-level profiling drill-down                   deps: Profiling statistics (min/max/std/uniqueness) | Column
Data quality rule                                   deps: Data quality | Data profiling
Out-of-the-box (OOB) rule                           deps: Data quality rule
AI-generated rule                                   deps: Data quality rule | AI-recommended columns for profiling
Custom rule (functions / expressions)               deps: Data quality rule
Data quality dimension                              deps: Data quality rule
Completeness dimension                              deps: Data quality dimension
Conformity dimension                                deps: Data quality dimension
Consistency dimension                               deps: Data quality dimension
Accuracy dimension                                  deps: Data quality dimension
Uniqueness dimension                                deps: Data quality dimension
Freshness / timeliness dimension                    deps: Data quality dimension
Rule-to-column assignment                           deps: Data quality rule | Column
Freshness rule (entity/table level)                 deps: Data quality rule | Freshness / timeliness dimension
Data quality rule at CDE level                      deps: Data quality rule | Critical data element (CDE)
Data quality scan                                   deps: Data quality rule | Rule-to-column assignment
Data quality scan scheduling                        deps: Data quality scan | Scan schedule (daily/weekly/monthly)
Data quality job monitoring                         deps: Data quality scan
Data quality score (rule level)                     deps: Data quality scan | Data quality rule
Data quality score (asset/product/domain)           deps: Data quality score (rule level) | Data asset | Data product | Governance domain
Data quality alert / notification                   deps: Data quality scan | Data quality score (rule level)
Data quality action center                          deps: Data quality scan | Data quality score (rule level)
Data quality managed virtual network                deps: Data quality | Managed Virtual Network IR
Data Governance Processing Unit (DGPU)              deps: Data quality compute (Spark/Delta)
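
The header prompt "Trace the path from Scan to Data quality score" can be answered with a breadth-first search over inverted dependency edges. The sketch below hard-codes a small subset of edges copied from the SRCH and DQ sections; the function name `trace` is illustrative, and the result describes only this subset.

```python
from collections import deque

# concept -> prerequisites, copied from the SRCH and DQ listings.
DEPS = {
    "Scan authentication method": ["Scan", "Microsoft Entra ID tenant"],
    "Managed identity (system-assigned)": ["Scan authentication method", "Microsoft Purview account"],
    "Data source connection (DQ)": ["Data quality", "Managed identity (system-assigned)"],
    "Data profiling": ["Data quality", "Data source connection (DQ)"],
    "Data quality rule": ["Data quality", "Data profiling"],
    "Rule-to-column assignment": ["Data quality rule", "Column"],
    "Data quality scan": ["Data quality rule", "Rule-to-column assignment"],
    "Data quality score (rule level)": ["Data quality scan", "Data quality rule"],
}

def trace(deps, start, goal):
    """Shortest chain from a prerequisite to a concept that builds on it."""
    # Invert the edges: prerequisite -> concepts that depend on it.
    dependents = {}
    for concept, prereqs in deps.items():
        for prereq in prereqs:
            dependents.setdefault(prereq, []).append(concept)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in dependents.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal does not build on start

path = trace(DEPS, "Scan", "Data quality score (rule level)")
print(" -> ".join(path))
```

On this subset the chain runs through the scan authentication method and the managed identity used by the data quality source connection, then through profiling and rules to the rule-level score.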

## LIN — Data Lineage (16 concepts)
# Source: data-gov-classic-lineage · data-map-lineage-azure-data-factory · data-map-lineage-azure-synapse-analytics · data-map-lineage-power-bi · register-scan-fabric-tenant
Data lineage                                        deps: Data Map | Data asset
Lineage graph (source-process-target)               deps: Data lineage
Lineage process / activity                          deps: Lineage graph (source-process-target)
Source and target entity                            deps: Lineage graph (source-process-target)
Entity-level (table) lineage                        deps: Lineage graph (source-process-target) | Table asset
Column / attribute-level lineage                    deps: Entity-level (table) lineage | Column
Process execution status                            deps: Lineage process / activity
Azure Data Factory lineage                          deps: Data lineage | Ingestion from lineage connections
Azure Synapse pipeline lineage                      deps: Data lineage | Ingestion from lineage connections
Power BI lineage                                    deps: Data lineage | Power BI tenant source
Microsoft Fabric lineage                            deps: Data lineage | Fabric tenant source
Azure Data Share lineage                            deps: Data lineage
Azure Machine Learning lineage                      deps: Data lineage
Databricks Unity Catalog lineage                    deps: Data lineage
Lineage on Data Map assets                          deps: Data lineage | Asset fully qualified name (FQN)
Unified Catalog asset lineage view                  deps: Data lineage | Unified Catalog search

## GOV — Governance, Access Policies & Security (31 concepts)
# Source: unified-catalog-data-product-access-policies · data-map-sensitivity-labels · data-governance-private-endpoints-managed-virtual-network · customer-key-overview · register-scan-azure-sql-database#policies
Unified Catalog access policy                       deps: Unified Catalog | Data product
Self-service data access                            deps: Unified Catalog access policy
Request access (data product)                       deps: Self-service data access | Data product
Permitted access / usage purpose                    deps: Unified Catalog access policy
Terms of use attestation                            deps: Unified Catalog access policy
No-copy attestation                                 deps: Terms of use attestation
Tiered / sequential approval                        deps: Request access (data product)
Manager approval tier                               deps: Tiered / sequential approval
Privacy & compliance review tier                    deps: Tiered / sequential approval
Access request approver                             deps: Tiered / sequential approval
Access provider tier                                deps: Tiered / sequential approval
Request status (pending/approved/completed)         deps: Request access (data product)
Inherited / aggregated policy                       deps: Unified Catalog access policy | Policy on glossary term | Policy on CDE | Policy on governance domain
Policy on governance domain                         deps: Unified Catalog access policy | Governance domain
Policy on glossary term                             deps: Unified Catalog access policy | Glossary term
Policy on CDE                                       deps: Unified Catalog access policy | Critical data element (CDE)
Disable access management                           deps: Unified Catalog access policy
Data owner policy (Data Map)                        deps: Data Map | Data source
DevOps policy                                       deps: Data owner policy (Data Map)
Self-service access policy (Data Map)               deps: Data owner policy (Data Map)
Sensitivity label                                   deps: Microsoft Entra ID tenant
Microsoft Purview Information Protection (MIP)      deps: Sensitivity label
Label scope (Files & other data assets)             deps: Sensitivity label
Sensitivity label on Data Map asset                 deps: Sensitivity label | Data asset | Microsoft Purview Information Protection (MIP)
Auto-labeling policy                                deps: Sensitivity label on Data Map asset | Auto-applied classification on scan
Label travels with data                             deps: Microsoft Purview Information Protection (MIP)
M365 license requirement (labels)                   deps: Sensitivity label on Data Map asset
Customer-managed key (CMK)                          deps: Microsoft Purview account
Encryption at rest                                  deps: Microsoft Purview account
Private endpoint                                    deps: Microsoft Purview account
Microsoft Purview firewall                          deps: Microsoft Purview account | Private endpoint
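
Tiered / sequential approval means each tier acts only after the previous one has signed off. The sketch below models that ordering for the four tiers listed above. It rests on assumptions that the source does not confirm: that all four tiers are enabled, and that the request status moves from pending to approved once the three approval tiers finish and to completed once the access provider acts. The class `AccessRequest` is hypothetical and not a Purview API.

```python
from dataclasses import dataclass, field

# Tier order as listed in this section; which tiers a policy enables is assumed.
TIERS = [
    "Manager",
    "Privacy & compliance review",
    "Access request approver",
    "Access provider",
]

@dataclass
class AccessRequest:
    data_product: str
    requester: str
    approvals: list = field(default_factory=list)

    @property
    def next_tier(self):
        """The tier that must act next, or None when every tier is done."""
        done = len(self.approvals)
        return TIERS[done] if done < len(TIERS) else None

    @property
    def status(self):
        # Assumed mapping of tier progress onto pending/approved/completed.
        if len(self.approvals) == len(TIERS):
            return "completed"
        if len(self.approvals) == len(TIERS) - 1:
            return "approved"
        return "pending"

    def approve(self, tier):
        """Record a sign-off, rejecting any tier that acts out of order."""
        if tier != self.next_tier:
            raise ValueError(f"expected {self.next_tier!r}, got {tier!r}")
        self.approvals.append(tier)

request = AccessRequest(data_product="Sales orders", requester="alice")
for tier in TIERS:
    print(request.status, "->", tier)
    request.approve(tier)
print(request.status)
```

Deriving the status from the approvals list, rather than storing it separately, keeps the two from drifting apart.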

## GLOS — Business Glossary, Domains, Products, CDEs & OKRs (37 concepts)
# Source: unified-catalog-governance-domains · unified-catalog-data-products · unified-catalog-glossary-terms · unified-catalog-critical-data-elements · unified-catalog-okrs · unified-catalog-attributes-business-concept
Governance domain                                   deps: Unified Catalog
Governance domain type                              deps: Governance domain
Functional unit domain                              deps: Governance domain type
Line of business domain                             deps: Governance domain type
Data domain (entity)                                deps: Governance domain type
Regulatory domain                                   deps: Governance domain type
Project domain                                      deps: Governance domain type
Governance domain owner                             deps: Governance domain
Governance domain published status                  deps: Governance domain
Business concept                                    deps: Governance domain
Data product                                        deps: Governance domain | Data asset
Data product owner                                  deps: Data product
Associated data assets                              deps: Data product | Data asset
Data product published / draft status               deps: Data product
Data product business context (use case)            deps: Data product
Glossary term                                       deps: Governance domain | Business concept
Classic glossary term migration                     deps: Glossary term | Classic Data Catalog
Active glossary term (carries policy)               deps: Glossary term | Policy on glossary term
Term applied to data product / asset / column       deps: Glossary term | Data product | Data asset | Column
Related terms                                       deps: Glossary term
Critical data element (CDE)                         deps: Governance domain | Business concept
CDE expected data type                              deps: Critical data element (CDE)
CDE column mapping                                  deps: Critical data element (CDE) | Column
CDE associated data products                        deps: Critical data element (CDE) | Data product
CDE status (draft/published/expired)                deps: Critical data element (CDE)
CDE bulk import (CSV)                               deps: Critical data element (CDE)
CDE related glossary terms                          deps: Critical data element (CDE) | Glossary term
Custom / business concept attributes                deps: Business concept
Managed attribute group                             deps: Custom / business concept attributes
OKR (objectives and key results)                    deps: Governance domain | Data product
Objective definition                                deps: OKR (objectives and key results)
Key result                                          deps: OKR (objectives and key results)
OKR target date                                     deps: OKR (objectives and key results)
OKR progress status                                 deps: Key result
OKR owner                                           deps: OKR (objectives and key results)
OKR related data products                           deps: OKR (objectives and key results) | Data product
Enterprise glossary view                            deps: Glossary term | Critical data element (CDE)

## OPEN — Sources, Integration Runtime & Fabric/OneLake (37 concepts)
# Source: data-map-data-sources · data-map-integration-runtime-choose · register-scan-fabric-tenant · register-scan-power-bi-tenant · register-scan-azure-databricks-unity-catalog
Data source                                         deps: Data Map
Azure data source category                          deps: Data source
Database source category                            deps: Data source
File source category                                deps: Data source
Services and apps source category                   deps: Data source
Multicloud source (AWS / GCP)                       deps: Data source
Azure SQL Database source                           deps: Azure data source category
Azure Data Lake Storage Gen2 source                 deps: Azure data source category
Azure Blob Storage source                           deps: Azure data source category
Azure Synapse Analytics source                      deps: Azure data source category
Azure Databricks Unity Catalog source               deps: Azure data source category
Amazon S3 source                                    deps: Multicloud source (AWS / GCP) | File source category
Amazon RDS / Redshift source                        deps: Multicloud source (AWS / GCP) | Database source category
Google BigQuery source                              deps: Multicloud source (AWS / GCP) | Database source category
Snowflake source                                    deps: Database source category
SAP source (ECC / S4HANA / HANA / BW)               deps: Services and apps source category
On-premises SQL Server source                       deps: Database source category
Power BI tenant source                              deps: Services and apps source category
Fabric tenant source                                deps: Services and apps source category
Same-tenant vs cross-tenant scan                    deps: Fabric tenant source | Power BI tenant source
OneLake                                             deps: Fabric tenant source
Fabric item                                         deps: Fabric tenant source
Fabric Lakehouse                                    deps: Fabric item
Fabric Warehouse                                    deps: Fabric item
Fabric semantic model (dataset)                     deps: Fabric item
Fabric sub-item (tables/files)                      deps: Fabric Lakehouse | OneLake
Fabric metadata scanning (admin API)                deps: Fabric tenant source
OneLake security role (Read)                        deps: OneLake | Fabric item
Integration runtime (IR)                            deps: Scan | Data source
Azure integration runtime                           deps: Integration runtime (IR)
Managed Virtual Network IR                          deps: Integration runtime (IR) | Private endpoint
Managed private endpoint                            deps: Managed Virtual Network IR | Private endpoint
Self-hosted integration runtime (SHIR)              deps: Integration runtime (IR)
Kubernetes-supported SHIR                           deps: Self-hosted integration runtime (SHIR)
AWS integration runtime                             deps: Integration runtime (IR) | Multicloud source (AWS / GCP)
Java Runtime Environment (JRE/JDK) prereq           deps: Self-hosted integration runtime (SHIR)
IR hibernation                                      deps: Managed Virtual Network IR

## RBAC — Domains, Collections & Roles (38 concepts)
# Source: data-map-domains-collections-manage · data-governance-roles-permissions · data-gov-classic-permissions · data-map-domains
Domain (Data Map)                                   deps: Data Map
Default domain                                      deps: Domain (Data Map)
Custom domain                                       deps: Domain (Data Map)
Root collection                                     deps: Default domain
Collection                                          deps: Domain (Data Map) | Root collection
Collection hierarchy (sub-collections)              deps: Collection
Collection-based RBAC                               deps: Collection
Permission inheritance                              deps: Collection-based RBAC | Collection hierarchy (sub-collections)
Restrict inherited permissions                      deps: Permission inheritance
Register source to collection                       deps: Collection | Data source registration
Move asset between collections                      deps: Collection | Data asset
Domain administrator                                deps: Domain (Data Map)
Collection administrator                            deps: Collection-based RBAC | Root collection
Data curator                                        deps: Collection-based RBAC
Data reader                                         deps: Collection-based RBAC
Data source administrator                           deps: Collection-based RBAC | Data source registration
Insights reader                                     deps: Collection-based RBAC | Data reader
Policy author                                       deps: Collection-based RBAC | Data owner policy (Data Map)
Workflow administrator                              deps: Collection-based RBAC
Tenant-level role group                             deps: Microsoft Purview account | Microsoft Entra ID tenant
Purview Administrators role group                   deps: Tenant-level role group
Data Source Administrators role group               deps: Tenant-level role group
Data Governance role group                          deps: Tenant-level role group
Catalog-level permission                            deps: Unified Catalog | Tenant-level role group
Data Governance Administrator                       deps: Catalog-level permission
Governance Domain Creator                           deps: Catalog-level permission | Governance domain
Global Catalog Reader                               deps: Catalog-level permission
Local Catalog Reader                                deps: Catalog-level permission | Governance domain
Global Asset Curator                                deps: Catalog-level permission | Glossary term
Data Health Owner                                   deps: Catalog-level permission | Data estate health
Data Health Reader                                  deps: Catalog-level permission | Data estate health
Governance-domain-level permission                  deps: Governance domain
Data Steward                                        deps: Governance-domain-level permission
Data Product Owner (role)                           deps: Governance-domain-level permission | Data product
Data Quality Steward                                deps: Governance-domain-level permission | Data quality scan
Data Profile Steward                                deps: Governance-domain-level permission | Data profiling
Data Quality Reader / Metadata Reader               deps: Governance-domain-level permission | Data quality score (rule level)
Governance Domain Reader                            deps: Governance-domain-level permission
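
Permission inheritance and its restriction can be sketched as a walk up the collection hierarchy that stops at any collection marked as restricting inherited permissions. This is a simplified model: the `Collection` class and its fields are hypothetical, and any role-specific exceptions in the product are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Collection:
    name: str
    parent: Optional["Collection"] = None
    restrict_inherited: bool = False
    assignments: dict = field(default_factory=dict)  # role -> set of principals

    def assign(self, role, principal):
        """Grant a role directly on this collection."""
        self.assignments.setdefault(role, set()).add(principal)

    def effective(self, role):
        """Direct assignments plus those inherited from parent collections."""
        members = set(self.assignments.get(role, set()))
        if self.parent is not None and not self.restrict_inherited:
            members |= self.parent.effective(role)
        return members

root = Collection("root")
finance = Collection("finance", parent=root)
payroll = Collection("payroll", parent=finance, restrict_inherited=True)

root.assign("Data reader", "alice")
finance.assign("Data reader", "bob")
payroll.assign("Data reader", "carol")

print(sorted(finance.effective("Data reader")))  # ['alice', 'bob']
print(sorted(payroll.effective("Data reader")))  # ['carol']
```

Because `payroll` restricts inheritance, only principals assigned directly on it hold the role there.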

## INSIGHTS — Data Estate Health & Insights (12 concepts)
# Source: unified-catalog · unified-catalog-controls · unified-catalog-observability · data-map-sensitivity-labels (precanned reports)
Data estate health                                  deps: Unified Catalog | Governance domain
Health management                                   deps: Data estate health
Health control                                      deps: Data estate health
Health score                                        deps: Health control
Health action                                       deps: Health control | Health score
Data observability                                  deps: Data estate health | Data quality score (asset/product/domain)
Governance progress reporting                       deps: Data estate health | Health score
Data Estate Insights (classic reports)              deps: Classic Data Catalog | Data asset
Classification insights                             deps: Data Estate Insights (classic reports) | Data classification
Sensitivity labeling insights                       deps: Data Estate Insights (classic reports) | Sensitivity label on Data Map asset
Asset / scan insights                               deps: Data Estate Insights (classic reports) | Scan run monitoring
Glossary insights                                   deps: Data Estate Insights (classic reports) | Glossary term

---

## APPENDIX A — KEY RESOURCES & API/PORTAL SURFACE

- **CORE** — Microsoft Purview account (Azure resource) -> Microsoft Purview portal (purview.microsoft.com) + classic governance portal (web.purview.azure.com). Two solutions: Data Map + Unified Catalog. Data Map billed in Capacity Units (1 CU = 25 ops/sec + 10 GB metadata storage), elastic autoscale. Free vs Enterprise account type gates Unified Catalog access.
- **META** — Data asset (FQN, schema, columns) ingested into Data Map. Resource set = one catalog object representing many partition files (Parquet/CSV/Avro/ORC); detected via patterns ({GUID},{N},{HEX},{LOC},{SparkPartitions}). Advanced Resource Sets add partition count/sample path/size + pattern rules (a Data Curator at the root collection toggles the feature in Settings > Account).
- **SRCH** — Register source -> Scan (auth: Managed Identity / Service Principal / Key Vault) -> Scan rule set (system or custom) + scan level (Auto/L1/L2/L3) + trigger (Once/Recurring daily-weekly-monthly). Ingestion applies resource-set patterns + lineage. 200+ system classifications + custom (regex/dictionary). Sampling: 128 rows / 1 MB (structured), 20 MB (docs).
- **DQ** — Purview Data Quality runs on Apache Spark 3.5 + Delta Lake 3.2.1, Managed Identity only. Profiling (AI-recommended columns) -> rules (OOB / AI-generated / custom) measuring 6 dimensions: Completeness, Conformity, Consistency, Accuracy, Uniqueness, Freshness. <=200 rules/asset. Scores roll up rule -> asset -> data product -> governance domain. Billed in DGPU.
- **LIN** — Lineage graph = Source entity -> Process/activity -> Target entity; entity (table) level + column/attribute level + process execution status. Auto-captured from ADF, Synapse pipelines, Microsoft Fabric, Power BI, Data Share, AML, Databricks Unity Catalog. Viewable on Data Map assets and in Unified Catalog asset details.
- **GOV** — Unified Catalog access policy on a data product: usage purpose + terms-of-use/attestations + tiered approval (Manager -> Privacy review -> Approver -> Access provider). Policies inherited/aggregated from governance domain, glossary term, CDE. Data Map policies: data owner, DevOps, self-service. Sensitivity labels via MIP (label scope Files & other data assets, M365 license required); auto-labeling policies fire on classification at scan. CMK, encryption, private endpoints, firewall.
- **GLOS** — Governance domain (Functional unit / Line of business / Data domain / Regulatory / Project) is a boundary that houses business concepts: data products (group of assets + use case), glossary terms (active, carry policy), critical data elements (CDE = logical column grouping, preview), OKRs (objective + key results, preview), and custom/business-concept attributes. Statuses: Draft / Published / Expired.
- **OPEN** — Sources by category: Azure, Database, File, Services & apps, multicloud (AWS S3/RDS/Redshift, GCP BigQuery). IR types: Azure IR (default, public), Managed Virtual Network IR (+ managed private endpoints, v2), Self-hosted IR (on-prem/VNet, needs JRE/JDK), Kubernetes-supported SHIR, AWS IR. Fabric/OneLake: Fabric tenant source -> items (Lakehouse, Warehouse, Semantic model, Notebook, Pipeline) + sub-items; requires Fabric metadata scanning + read-only admin API security group; OneLake security role (Read).
- **RBAC** — Data Map: Default domain (= upgraded root collection) + up to 4 custom domains -> collections (hierarchy, inheritance, restrict). Collection roles: Collection Admin, Data Curator, Data Reader, Data Source Admin, Insights Reader, Policy Author, Workflow Admin (+ Domain Admin). Unified Catalog roles: Catalog level (Data Governance Admin, Governance Domain Creator, Global/Local Catalog Reader, Global Asset Curator, Data Health Owner/Reader); Governance-domain level (Governance Domain Owner/Reader, Data Steward, Data Product Owner, Data Quality/Profile Stewards & Readers).
- **INSIGHTS** — Data estate health (Unified Catalog): Health management -> health controls -> health score -> health actions; data observability (uses Catalog roles). Classic Data Estate Insights reports (precanned): asset/scan, classification, sensitivity labeling, glossary insights — hydrated on scan.
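The capacity-unit figures in the CORE summary (1 CU = 25 ops/sec + 10 GB metadata storage) lend themselves to a rough sizing calculation. The sketch below is an illustration, not a billing formula from the docs: the function name `estimate_capacity_units`, the assumption that the larger of the two dimensions determines the CU count, and the 1 CU floor are all assumptions made for this example.

```python
import math

OPS_PER_CU = 25         # operations/sec per capacity unit (from the CORE summary)
STORAGE_GB_PER_CU = 10  # GB of metadata storage per capacity unit (from the CORE summary)


def estimate_capacity_units(peak_ops_per_sec: float, metadata_gb: float) -> int:
    """Return the smallest whole number of CUs that covers both dimensions.

    Assumption: whichever dimension needs more CUs sets the total,
    and at least 1 CU is always provisioned.
    """
    if peak_ops_per_sec < 0 or metadata_gb < 0:
        raise ValueError("inputs must be non-negative")
    by_throughput = math.ceil(peak_ops_per_sec / OPS_PER_CU)
    by_storage = math.ceil(metadata_gb / STORAGE_GB_PER_CU)
    return max(1, by_throughput, by_storage)


# Throughput-bound: 60 ops/sec needs 3 CUs even though 8 GB fits in 1.
print(estimate_capacity_units(60, 8))   # 3
# Storage-bound: 35 GB needs 4 CUs even though 10 ops/sec fits in 1.
print(estimate_capacity_units(10, 35))  # 4
```

Because the Data Map autoscales elastically, a figure like this is better read as an estimate of peak consumption than as a setting to provision.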

## APPENDIX B — KEY ROLES (Data Map collections + Unified Catalog)

- **Data Map collection roles** — Domain Administrator, Collection Administrator, Data Curator, Data Reader, Data Source Administrator, Insights Reader, Policy Author, Workflow Administrator. Assigned on a domain/collection; inherited by sub-collections unless restricted.
- **Tenant role groups** — Purview Administrators, Data Source Administrators, Data Governance.
- **Unified Catalog — catalog level** — Data Governance Administrator, Governance Domain Creator, Global Catalog Reader, Local Catalog Reader, Global Asset Curator, Data Health Owner, Data Health Reader.
- **Unified Catalog — governance-domain level** — Governance Domain Owner, Governance Domain Reader, Data Steward, Data Product Owner, Data Quality Steward, Data Profile Steward, Data Quality Reader, Data Quality Metadata Reader, Data Profile Reader, Local Catalog Reader.
- Data Stewards and Data Product Owners also need Data Map permissions (Data Reader) to add assets to data products. Policy Author alone is insufficient to create Data Map policies; the Data Source Admin role is also required.
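The inheritance rule for Data Map collection roles ("inherited by sub-collections unless restricted") can be modeled as a walk up the collection tree. This is a minimal sketch, not the Purview API: the `Collection` class, the `restrict_inheritance` flag, and `effective_roles` are hypothetical names. The model is also deliberately simplified: a restricted collection blocks every inherited role, and any exceptions the real service applies are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Collection:
    name: str
    parent: Optional["Collection"] = None
    restrict_inheritance: bool = False  # hypothetical flag for "restricted"
    # user -> roles assigned directly on this collection
    assignments: Dict[str, Set[str]] = field(default_factory=dict)


def effective_roles(collection: Collection, user: str) -> Set[str]:
    """Union of direct roles and roles inherited from ancestor collections."""
    roles: Set[str] = set()
    node: Optional[Collection] = collection
    while node is not None:
        roles |= node.assignments.get(user, set())
        if node.restrict_inheritance:
            break  # nothing above a restricted collection is inherited
        node = node.parent
    return roles


# Illustrative hierarchy: default domain -> finance -> payroll (restricted).
root = Collection("default-domain", assignments={"ana": {"Data Reader"}})
finance = Collection("finance", parent=root, assignments={"ana": {"Data Curator"}})
payroll = Collection(
    "payroll",
    parent=finance,
    restrict_inheritance=True,
    assignments={"ben": {"Data Reader"}},
)

print(sorted(effective_roles(finance, "ana")))  # ['Data Curator', 'Data Reader']
print(sorted(effective_roles(payroll, "ana")))  # []
```

In this model `ana` loses access at `payroll` because the restriction cuts the inheritance chain, while `ben` keeps the role assigned directly on that collection.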

## APPENDIX C — NAMING, STATE & NOTES

- **Data Map vs Unified Catalog:** the **Data Map** is the metadata-storage + scanning/classification engine (collections, assets, lineage); the **Unified Catalog** is the governance experience on top (governance domains, data products, glossary, CDEs, OKRs, data quality, health). They share assets but use different permission models (collection RBAC vs catalog/governance-domain roles).
- **Classic Data Catalog -> Unified Catalog:** the new Unified Catalog supersedes the classic Data Catalog experience and is rolling out by region; an account must be upgraded to the **Enterprise** version to access it. Classic glossary terms can be migrated into Unified Catalog (preview).
- **Default domain:** when an account is upgraded to the new portal experience, the primary account's **root collection becomes the default domain**; up to 4 custom domains can be added. Collections live under domains.
- **Sensitivity labels != classifications:** classifications (200+ system + custom regex/dictionary) categorize data by business content; **sensitivity labels** (Highly Confidential / Restricted / Public) come from **Microsoft Purview Information Protection (MIP)**, require an M365 license in the same Entra tenant, travel with the data, and can be auto-applied via auto-labeling policies triggered on classification at scan. Label-on-Data-Map is **Preview**.
- **Preview surfaces (per docs, June 2026):** Critical data elements, OKRs, sensitivity-label-on-Data-Map, scan-scope 'new assets' toggle, and several source policies are marked Preview; Advanced Resource Sets are GA and rolling out to all Unified Catalog customers.
- **Data Quality engine:** runs on managed Apache Spark 3.5 + Delta Lake 3.2.1, **Managed Identity auth only**, billed per **DGPU**; six dimensions (Completeness, Conformity, Consistency, Accuracy, Uniqueness, Freshness); scores aggregate rule -> column -> asset -> data product -> governance domain. Rules can be pinned at the **CDE** level to govern a logical column concept across assets.
- **Fabric / OneLake convergence:** a Fabric tenant source brings in metadata + lineage for Fabric items (Lakehouse, Warehouse, Semantic model, Notebook, Pipeline) and Power BI; sub-item (table/file) metadata is scanned but sub-item lineage isn't; requires Fabric metadata-scanning admin settings + a read-only-admin-API security group, and a OneLake **Read** security role when OneLake security is enabled.
- **Integration runtimes** decide network reach: Azure IR (public, default, autoscaled), Managed Virtual Network IR (+ managed private endpoints, region-pinned, hibernates after 90 idle days), Self-hosted IR (on-prem/private, needs JRE/JDK for Parquet & some sources), Kubernetes-supported SHIR (scale, manual updates), and AWS IR (Amazon sources).
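The Data Quality note above describes scores aggregating rule -> column -> asset -> data product -> governance domain. The sketch below shows one way such a roll-up could work. The unweighted mean at every level is an assumption made for illustration; the source does not state the actual weighting. The nested dictionary layout and the `roll_up` function are likewise hypothetical.

```python
from statistics import mean
from typing import Dict, List, Union

# A level is either a list of rule scores (0-100) or a dict of child levels.
Level = Union[List[float], Dict[str, "Level"]]


def roll_up(level: Level) -> float:
    """Average scores bottom-up (assumed unweighted mean at each level)."""
    if isinstance(level, list):
        return mean(level)  # rule scores -> column score
    return mean(roll_up(child) for child in level.values())


# governance domain -> data product -> asset -> column -> rule scores
estate = {
    "Finance": {
        "Invoices product": {
            "invoices_table": {
                "invoice_id": [100.0, 90.0],
                "amount": [80.0],
            },
            "vendors_table": {
                "vendor_id": [70.0],
            },
        }
    }
}

product = estate["Finance"]["Invoices product"]
print(roll_up(product["invoices_table"]))  # 87.5
print(roll_up(product))                    # 78.75
```

A mean of means gives every child equal weight regardless of how many rules it carries; a weighted variant would pass rule counts up alongside the scores.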

---
**Version:** 1.0.0 — extracted from live Microsoft Learn documentation (learn.microsoft.com/purview), 2026-06-18.
**Use for:** onboarding to Microsoft Purview, data-catalog/governance architecture & RBAC design, exam/cert prep (e.g., SC-401 governance topics), grounding an LLM/agent, mapping a data-governance program.
**Ask the graph:** "What must I understand before a Data quality scan?" · "Trace Data source -> Sensitivity label on Data Map asset." · "What depends on Governance domain?"
