Data Catalog - Datris

Beta. Data Catalogs may evolve as we add features like sharing and access control.

A Data Catalog is a named bucket that groups related taps and pipelines. It’s pure metadata — assigning a tap to a catalog doesn’t change how the tap runs or where its data lands. Catalogs exist to make a busy platform browsable: instead of a flat list of 40 taps, you see “Stock Prices (12)”, “Weather (6)”, “Earnings (8)” and so on.

When to use a catalog

Use one whenever you have more than a handful of related taps that should travel together — same source, same domain, same project. You don’t have to use catalogs. Anything you create outside a catalog falls into the Uncataloged group, which works fine for one-off taps and pipelines.

How catalogs are stored

Catalogs aren’t a separate table or external system. Each tap and pipeline carries a nullable catalog string field; the platform groups by that field. Two consequences:

A catalog “exists” as soon as anything is assigned to it. There’s no separate create step required to use a name.
Renaming a catalog means updating the catalog field on every tap and pipeline in it. The Data Catalog UI handles this for you.

When you create a catalog explicitly via the UI, Datris persists the name by writing a placeholder tap named __catalog__<name> (description “Catalog placeholder”, enabled=false). This makes empty catalogs survive across UI reloads. The placeholder is hidden from the Taps list and ignored by the runtime.
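The storage model above can be sketched in a few lines of Python. This is an illustration only, not Datris code: the dictionary shapes, the function names, and the idea of reading an empty catalog's name from the suffix of its placeholder tap are assumptions made for the example. Only the nullable catalog field and the __catalog__<name> placeholder convention come from the description above.

```python
from collections import defaultdict

PLACEHOLDER_PREFIX = "__catalog__"  # placeholder tap naming described above
UNCATALOGED = "Uncataloged"         # label used here for items with no catalog


def group_by_catalog(taps, pipelines):
    """Return {catalog: {"taps": [...], "pipelines": [...]}} built from the catalog field."""
    groups = defaultdict(lambda: {"taps": [], "pipelines": []})
    for tap in taps:
        name = tap["name"]
        if name.startswith(PLACEHOLDER_PREFIX):
            # Assumption: the placeholder only keeps an empty catalog visible,
            # so register the catalog name and do not list the tap itself.
            groups[name[len(PLACEHOLDER_PREFIX):]]
            continue
        groups[tap.get("catalog") or UNCATALOGED]["taps"].append(name)
    for pipeline in pipelines:
        groups[pipeline.get("catalog") or UNCATALOGED]["pipelines"].append(pipeline["name"])
    return dict(groups)


def rename_catalog(items, old, new):
    """Return copies of items with catalog `old` rewritten to `new` (placeholder handling omitted)."""
    return [{**item, "catalog": new} if item.get("catalog") == old else item for item in items]


# Hypothetical sample data.
taps = [
    {"name": "aapl_prices", "catalog": "stock_prices"},
    {"name": "msft_prices", "catalog": "stock_prices"},
    {"name": "one_off", "catalog": None},
    {"name": "__catalog__weather", "enabled": False},
]
pipelines = [
    {"name": "prices_daily", "catalog": "stock_prices"},
    {"name": "scratch"},
]

groups = group_by_catalog(taps, pipelines)
```

With this sample data, stock_prices holds two taps and one pipeline, weather shows up as an empty catalog because of its placeholder, and the remaining items land under Uncataloged.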

The Data Catalog page

The Data Catalog tab in the UI shows one card per catalog plus an Uncataloged card.

Each card shows the catalog name and counts of taps + pipelines.
Click a card to expand it. Inside you’ll see:
- Each tap with its enabled/disabled status, last-run outcome, and record count.
- Each pipeline with a quick-jump link to its config.
+ Create Catalog — make a new empty catalog. The name is sanitized (lowercase, alphanumerics + underscores).
Delete catalog — confirms, then deletes the catalog’s placeholder plus every tap and pipeline inside it. This is destructive and not reversible. If you want to keep the items, re-tag them to a different catalog first (or to Uncataloged by clearing the field on each).
Delete Uncataloged: the same delete icon on the Uncataloged group removes every tap and pipeline that isn't in a named catalog. The confirm prompt reads “Delete ALL uncataloged items?” to make the scope obvious.
Delete individual tap or pipeline — expand any catalog (including Uncataloged) and use the trash icon on a specific row to delete just that item. Clicking the trash icon swaps the row’s action icons for an inline confirm: Delete / × for a tap; Config & Data / Data Only / × for a pipeline (Data Only wipes the rows but keeps the pipeline config so the next ingest fills it again).
Row actions menu (Uncataloged) — each item in the Uncataloged group has a ⋮ kebab menu after the trash icon with three actions:
- Edit — opens the tap or pipeline editor.
- Delete — same as the trash icon (inline confirm).
- Move to catalog — submenu listing every other catalog. Picking one sets the item’s catalog field and reloads the view. Before persisting, the UI checks the target catalog for a tap or pipeline of the same name; on a clash the move is blocked and a red banner explains the collision (“A tap named ‘foo’ already exists in catalog ‘reporting’. Rename one of them first.”).
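Two of the behaviors listed above, name sanitization on Create Catalog and the same-name check before a move, can be approximated as follows. This is a sketch under assumptions, not the UI's actual logic: the page only says names end up lowercase with alphanumerics and underscores, so replacing other characters with underscores (rather than dropping them) is a guess, and both function names are hypothetical.

```python
import re


def sanitize_catalog_name(raw):
    """Lowercase the name and keep only a-z, 0-9 and underscores."""
    lowered = raw.strip().lower()
    # Assumption: each run of disallowed characters becomes one underscore.
    return re.sub(r"[^a-z0-9_]+", "_", lowered).strip("_")


def find_move_clash(name, kind, target_catalog, existing_items):
    """Return an error message if target_catalog already holds an item called `name`.

    `existing_items` is the list of items of the same kind (taps or pipelines).
    Returns None when the move is safe.
    """
    for other in existing_items:
        if other.get("catalog") == target_catalog and other["name"] == name:
            return (
                f"A {kind} named '{name}' already exists in catalog "
                f"'{target_catalog}'. Rename one of them first."
            )
    return None


# Hypothetical data: one "foo" already lives in "reporting", another is uncataloged.
existing_taps = [
    {"name": "foo", "catalog": "reporting"},
    {"name": "foo", "catalog": None},
]

blocked = find_move_clash("foo", "tap", "reporting", existing_taps)
allowed = find_move_clash("foo", "tap", "weather", existing_taps)
```

Here blocked carries the collision message and allowed is None, mirroring the blocked and permitted moves described above.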

Assigning a catalog

Four ways to put a tap or pipeline into a catalog:

Tap editor — when creating or editing a tap manually, set the Catalog field.
Pipeline editor — same field on the pipeline form.
CLI — pass --catalog <name> to datris ingest when creating a new pipeline. Re-ingesting an existing pipeline preserves its current catalog (and the rest of the config) — use the set_catalog MCP tool to change a catalog after the fact.
MCP agent — call set_catalog(pipeline="<name>", catalog="<label>") or set_catalog(tap="<name>", catalog="<label>") to set the label on an existing item. Pass an empty string (or omit catalog) to clear it back to Uncataloged. New pipelines created via create_pipeline accept an optional catalog argument directly.

Leave the field blank to keep the item Uncataloged.

API and MCP

There are no dedicated /catalogs REST endpoints — catalogs are just a field on the Tap and Pipeline configs. You read and write them through the existing endpoints:

GET /api/v1/taps and GET /api/v1/pipelines return the catalog field on each item.
POST and PUT requests to those endpoints accept a catalog field. To move an item, send a request with the new value.

From an MCP agent, the set_catalog tool wraps the read-modify-write for you on either a pipeline or a tap, so you don’t have to round-trip the full config yourself. For browsing, list taps or pipelines and group client-side by catalog.
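For callers that use the REST endpoints directly, the read-modify-write that set_catalog wraps might look like the sketch below. The HTTP calls are deliberately left as injected functions, because this page does not specify how a single tap or pipeline is addressed or which fields a PUT requires. The helper names, the schedule field in the sample config, and the choice to send a cleared catalog as null are all assumptions for illustration.

```python
import copy


def with_catalog(config, catalog=None):
    """Return a copy of `config` with its catalog set, or cleared when empty or omitted."""
    updated = copy.deepcopy(config)
    # Assumption: an empty string or None both mean "Uncataloged" and are stored as null.
    updated["catalog"] = catalog or None
    return updated


def set_catalog_via_rest(get_config, put_config, name, catalog=None):
    """Read the full config, change only the catalog field, and write it back.

    `get_config(name)` and `put_config(name, config)` stand in for the GET and
    PUT requests against the taps or pipelines endpoints.
    """
    current = get_config(name)
    updated = with_catalog(current, catalog)
    put_config(name, updated)
    return updated


# In-memory stand-in for the API so the sketch runs without a server.
store = {
    "aapl_prices": {"name": "aapl_prices", "schedule": "0 * * * *", "catalog": None},
}

moved = set_catalog_via_rest(store.__getitem__, store.__setitem__, "aapl_prices", "stock_prices")
cleared = set_catalog_via_rest(store.__getitem__, store.__setitem__, "aapl_prices", "")
```

Sending the whole config back with one field changed keeps the rest of the item intact, which is the round trip the set_catalog tool saves an agent from doing by hand.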

What catalogs don’t do

They don’t isolate access — anyone with API access to a tap can see it regardless of catalog.
They don’t affect tap or pipeline behavior, scheduling, or destination.
They aren’t permissioned — catalogs are a flat namespace per tenant.

If you need true tenant isolation, that’s handled at the Datris environment level, not via catalogs. See tenant isolation in the configuration reference.
