This recipe automatically identifies potential reference data across all connected systems, uses an LLM to group similar columns into unified Reference Data Units (RDUs), and detects inconsistencies such as missing, extra, or misaligned values. It gives data owners clear visibility into how reference data is reused across the enterprise, where definitions diverge, and where standardization is required. The result is cleaner analytics, stronger governance, and more reliable reporting across all business functions.
Reference data powers reporting, analytics, and operational decisions across every part of an organization.
Yet the same concept—like State, Status, or Category—often appears differently across systems.
These inconsistencies silently cause inaccurate KPIs, reconciliation issues, and compliance risks.
This recipe uncovers those variations, groups similar fields into business-friendly Reference Data Units (RDUs), and reveals where alignment is strong and where it breaks.
It provides a clear, enterprise-wide understanding of how core reference values are defined and reused.
Step 1 — Identify candidate reference columns
The recipe scans your environment to detect fields with reference-data-like behavior.
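As a rough illustration of what "reference-data-like behavior" can mean, the sketch below flags columns that have few distinct values relative to their row count. This is only an assumed heuristic, not the recipe's actual detection logic; the function names and the `max_distinct` and `max_ratio` thresholds are hypothetical.

```python
from typing import Dict, List


def is_reference_candidate(
    values: List[object],
    max_distinct: int = 50,   # assumed cap on distinct values
    max_ratio: float = 0.05,  # assumed cap on distinct/non-null ratio
) -> bool:
    """Return True when a column has low cardinality relative to its size."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return False
    distinct = len(set(non_null))
    return distinct <= max_distinct and distinct / len(non_null) <= max_ratio


def find_candidates(table: Dict[str, List[object]]) -> List[str]:
    """Return the names of columns that look like reference data."""
    return [name for name, vals in table.items() if is_reference_candidate(vals)]


# Hypothetical sample table: 100 rows, one status column and one ID column.
sample_table = {
    "status": ["Active", "Inactive"] * 50,
    "customer_id": list(range(100)),
}
print(find_candidates(sample_table))  # ['status']
```

A ratio-based rule like this misjudges very small tables, so a real scan would likely combine it with other signals such as column names or data types.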
Step 2 — Group columns into Reference Data Units (RDUs)
Similar columns across systems are grouped into RDUs using LLM-assisted pattern detection.
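To make the grouping idea concrete, here is a minimal sketch that clusters columns by the overlap of their normalized value sets (Jaccard similarity). It is a simple stand-in for illustration, not the recipe's own grouping method; the column names, the `threshold` value, and the helper functions are all assumptions.

```python
from typing import Dict, List, Set


def normalize(value: str) -> str:
    """Trim whitespace and lowercase so 'Active ' and 'active' compare equal."""
    return value.strip().lower()


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of values two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_columns(columns: Dict[str, List[str]], threshold: float = 0.5) -> List[List[str]]:
    """Greedily place each column in the first group containing a similar column."""
    value_sets = {name: {normalize(v) for v in vals} for name, vals in columns.items()}
    groups: List[List[str]] = []
    for name in columns:
        for group in groups:
            if any(jaccard(value_sets[name], value_sets[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no similar group found, start a new one
    return groups


# Hypothetical columns from three systems.
columns = {
    "crm.customer.status": ["Active", "Inactive", "Suspended"],
    "billing.account.acct_status": ["active", "inactive", "Suspended "],
    "retail.store.state": ["Texas", "Ohio"],
}
for group in group_columns(columns):
    print(group)
```

Value overlap alone cannot link columns that encode the same concept with disjoint values (for example "Active" versus "A"), which is the kind of case where semantic matching earns its keep.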
Step 3 — Summarize enterprise-wide reuse
The recipe shows where each RDU is used across tables, schemas, and systems.
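A reuse summary of this kind can be sketched as a count of distinct systems, schemas, and tables per RDU. The example assumes, purely for illustration, that every column is addressed by a hypothetical four-part path `system.schema.table.column`; the recipe's real output format is not specified here.

```python
from typing import Dict, List


def summarize_reuse(rdu_columns: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Count distinct systems, schemas, and tables behind each RDU."""
    summary: Dict[str, Dict[str, int]] = {}
    for rdu, paths in rdu_columns.items():
        systems, schemas, tables = set(), set(), set()
        for path in paths:
            # Assumed path layout: system.schema.table.column
            system, schema, table, _column = path.split(".")
            systems.add(system)
            schemas.add((system, schema))
            tables.add((system, schema, table))
        summary[rdu] = {
            "systems": len(systems),
            "schemas": len(schemas),
            "tables": len(tables),
            "columns": len(paths),
        }
    return summary


# Hypothetical RDU membership.
rdu_columns = {
    "Customer Status": [
        "crm.sales.customer.status",
        "crm.sales.lead.status",
        "billing.core.account.acct_status",
    ],
}
print(summarize_reuse(rdu_columns))
```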
Step 4 — Detect inconsistencies across sources
The recipe highlights missing expected values, extra values, format differences, and placeholder entries.
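The four kinds of inconsistency can be illustrated with a small check that compares one source's values against an expected list. This is a sketch under stated assumptions: the `PLACEHOLDERS` set, the `check_source` function, and the rule that a "format difference" means a match after trimming and lowercasing are all hypothetical choices, not the recipe's documented behavior.

```python
from typing import Dict, Iterable, List, Set

# Assumed placeholder spellings, compared after trimming and lowercasing.
PLACEHOLDERS = {"", "unknown", "n/a", "na", "none", "null", "tbd"}


def check_source(values: Iterable[str], expected: Set[str]) -> Dict[str, List[str]]:
    """Classify a source's values against the expected reference values."""
    expected_by_key = {e.strip().lower(): e for e in expected}
    seen_keys: Set[str] = set()
    extra: Set[str] = set()
    format_diffs: Set[str] = set()
    placeholders: Set[str] = set()
    for raw in values:
        key = raw.strip().lower()
        if key in PLACEHOLDERS:
            placeholders.add(raw)
        elif key in expected_by_key:
            seen_keys.add(key)
            if raw != expected_by_key[key]:
                format_diffs.add(raw)  # same value, different casing or spacing
        else:
            extra.add(raw)  # not an expected value in any format
    missing = [e for k, e in expected_by_key.items() if k not in seen_keys]
    return {
        "missing": sorted(missing),
        "extra": sorted(extra),
        "format_differences": sorted(format_diffs),
        "placeholders": sorted(placeholders),
    }


expected = {"Active", "Inactive", "Suspended"}
source_values = ["Active", "active ", "A", "I", "Suspnd", "Unknown"]
print(check_source(source_values, expected))
```

In this run, "A", "I", and "Suspnd" are reported as extra values because a purely textual check cannot tell that they abbreviate expected ones. Mapping them back to "Active", "Inactive", and "Suspended" would need an explicit alias table or semantic matching.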
| Reference Data Unit | What the recipe discovered | Systems Impacted | Key Issues |
|---|---|---|---|
| RDU1 (Customer Status) | Appears in 11 tables across 4 systems. Expected values such as Active, Inactive, Suspended were found alongside variations like "A", "I", "Suspnd", "Unknown". | 4 Systems | Missing • Extra • Misaligned Formats |
| RDU2 (State / Region) | Used across Retail, Logistics, and Insurance. Mix of full names ("Texas"), abbreviations ("TX"), and legacy region codes. | 3 Systems | Format inconsistencies • Missing values |
| RDU3 (Product Category) | Unified 7+ naming conventions. Normalized values such as "Electronics", "Elec", "ELX", and "Electronics " into one standardized category. | 5 Systems | Duplicates • Spelling Variants • Legacy Codes |
Make sure the following ingredients are available in your workspace: