Expose data catalog context to AI workflows with DataHub
Use DataHub metadata, ownership, schema, and lineage context to ground agent-assisted data discovery and governance workflows.
npx skills add agentskillexchange/skills --skill expose-data-catalog-context-to-ai-workflows-with-datahub
Use this when an agent needs trusted enterprise data context before writing queries, explaining datasets, routing governance questions, or preparing data-change reviews. The operator workflow is to connect to DataHub, search for or fetch dataset metadata, inspect schema, lineage, ownership, and glossary context, then return a grounded summary or next action to the agent. It belongs in workflows where the agent would otherwise infer dataset meaning from names, stale docs, or ad hoc Slack context. This is not a generic DataHub platform listing; the scope is the repeatable handoff of catalog metadata into AI workflows so that agents do not guess about datasets, owners, lineage, or governed usage boundaries.
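The last step of that workflow, returning a grounded summary, might look like the sketch below. It assumes the metadata has already been fetched and flattened into a plain dictionary; the keys (`urn`, `owners`, `fields`, `upstream`, `glossary_terms`) and the function name `grounded_summary` are hypothetical and are not a DataHub response format. The point it illustrates is that missing context is reported explicitly, so the agent does not fill the gap by guessing.

```python
from typing import Any


def grounded_summary(meta: dict[str, Any]) -> str:
    """Render catalog context as text, marking every missing piece explicitly."""

    def listing(label: str, items: list[str]) -> str:
        # An empty list is surfaced as a gap, never silently dropped.
        return f"{label}: " + (", ".join(items) if items else "NOT IN CATALOG")

    fields = [f"{name} ({dtype})" for name, dtype in meta.get("fields", [])]
    return "\n".join([
        f"Dataset: {meta['urn']}",
        listing("Owners", meta.get("owners", [])),
        listing("Schema", fields),
        listing("Upstream lineage", meta.get("upstream", [])),
        listing("Glossary terms", meta.get("glossary_terms", [])),
    ])


# Hypothetical, hand-written metadata for illustration only.
example = {
    "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,sales.orders,PROD)",
    "owners": ["jdoe"],
    "fields": [("order_id", "bigint"), ("amount", "decimal")],
    "upstream": ["urn:li:dataset:(urn:li:dataPlatform:hive,raw.orders,PROD)"],
    "glossary_terms": [],
}

print(grounded_summary(example))
```

Here the empty glossary list is rendered as "NOT IN CATALOG", which gives the agent a concrete signal to ask an owner rather than invent a definition.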
What this skill actually does
Inputs and prerequisites: DataHub, DataHub API or CLI, catalog metadata access.
Setup notes: Deploy or connect to DataHub, configure metadata ingestion for relevant data systems, grant API access, then have the agent query datasets, ownership, schema, lineage, and glossary context before data work.
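Once API access is granted, the query step could be sketched as follows. This is a minimal illustration using only the Python standard library, and it only builds the request rather than sending it. It assumes the deployment exposes a GraphQL endpoint at `/api/graphql` and accepts a bearer token; the query's field names are assumptions that should be checked against the GraphQL schema of the DataHub version in use. The helper names `dataset_urn` and `build_request`, the base URL, and the token are hypothetical.

```python
import json
import urllib.request

# Assumed query shape; verify field names against the deployed GraphQL schema.
DATASET_QUERY = """
query getDataset($urn: String!) {
  dataset(urn: $urn) {
    urn
    name
    ownership { owners { owner { ... on CorpUser { username } } } }
    schemaMetadata { fields { fieldPath nativeDataType } }
  }
}
"""


def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN in the urn:li:dataset:(platform,name,env) form."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def build_request(base_url: str, token: str, urn: str) -> urllib.request.Request:
    """Prepare (but do not send) a GraphQL POST for one dataset."""
    body = json.dumps({"query": DATASET_QUERY, "variables": {"urn": urn}})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/graphql",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder host and token; replace with values from your own deployment.
request = build_request(
    "http://localhost:8080", "example-token", dataset_urn("hive", "sales.orders")
)
print(request.full_url)
```

Sending the request is a single `urllib.request.urlopen(request)` call. Keeping construction separate from sending leaves the exact query and target URL reviewable before anything touches the catalog.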
Source and verification boundary: use https://github.com/datahub-project/datahub as the canonical reference before running the workflow; keep commands, API calls, CLI usage, and generated outputs reviewable against that upstream source.
Framework fit: publish this as a Multi-Framework workflow only when the operator can invoke the documented toolchain directly, rather than treating the upstream project as a generic product listing.