Loading...
The Stale & Zombie Table Detector recipe identifies tables that are no longer refreshed, no longer used, or fully abandoned across connected data sources in OvalEdge. It analyzes profiling history, row-count changes, last update timestamps, and user-activity patterns to classify assets as stale, zombie, or both. The recipe correlates these findings with governance attributes, CDE status, and ETL lineage to highlight root causes such as pipeline failures, disabled jobs, or genuine business dormancy. It also generates monthly freshness trends to track catalog health. This provides stewards and data owners with a clear, automated foundation for cleanup, optimization, and governance decisions.
Large catalogs accumulate tables that are no longer updated or used. These stale and zombie tables drive unnecessary storage cost, distort governance quality metrics, and add operational noise. Automated detection prevents dormant data from silently eroding catalog hygiene.
Step 1 — Identify stale tables:
Compares last profiling and update timestamps to detect objects with no recent data refresh activity.
Step 2 — Identify zombie tables:
Reviews usage logs and query footprints to find datasets not accessed for long operational windows.
Step 3 — Compute stale-and-unused intersection:
Highlights high-priority candidates for archival by finding tables that are both inactive and unused.
Step 4 — Validate CDE sensitivity:
Cross-checks CDE flags to ensure no critical data assets are removed without proper governance review.
Step 5 — Assess pipeline or ETL issues:
Flags upstream failures when sudden inactivity aligns with connector or job anomalies.
Step 6 — Map downstream dependencies:
Surfaces which reports, dashboards, or datasets rely on stale tables to avoid breaking analytics.
Step 7 — Generate monthly freshness trends:
Tracks decay or recovery patterns in catalog health to support operational governance cycles.
| Insight Category | What the recipe discovered | Business Impact |
|---|---|---|
| Stale tables | Several domain tables show no updates for 90+ days despite being part of active pipelines. | Indicates upstream ingestion failures or dormant systems driving report inaccuracies. |
| Zombie tables | Multiple datasets have not been queried, viewed, or touched for over 12 months. | Clear candidates for archival to reduce storage and governance workload. |
| Combined stale–unused | A subset of tables appear in both lists and have no downstream dependencies. | High-confidence removal targets with negligible business risk. |
Make sure the following ingredients are available in your workspace: