Monitoring

Pipeline Tokens

Every pipeline processing job is assigned a unique pipeline token — a UUID that identifies the job throughout its lifecycle. You can monitor jobs via the REST API, the MCP server, the CLI, or the Datris UI. Pipeline tokens are returned:

In the response body of POST /api/v1/pipeline/upload (for uncompressed files)
In the job status when querying by pipeline name

Job Status

Query by Pipeline Token

curl "http://localhost:8080/api/v1/pipeline/status?pipelinetoken=pt-abc12345-..."

Returns an array of status entries for the job, one per processing stage:

[
  {
    "id": 1,
    "dateTime": "2026-03-15T10:00:00Z",
    "pipeline": "sales_data",
    "processName": "StreamNotifier",
    "publisherToken": null,
    "pipelineToken": "pt-abc12345-...",
    "filename": "sales_data",
    "state": "begin",
    "code": "begin",
    "description": "Process started",
    "epoch": 1710500400000
  },
  {
    "id": 2,
    "dateTime": "2026-03-15T10:00:01Z",
    "pipeline": "sales_data",
    "processName": "DataQuality",
    "publisherToken": null,
    "pipelineToken": "pt-abc12345-...",
    "filename": "sales_data",
    "state": "processing",
    "code": "processing",
    "description": "Running CodeGen data quality rule",
    "epoch": 1710500401000
  },
  {
    "id": 3,
    "dateTime": "2026-03-15T10:00:03Z",
    "pipeline": "sales_data",
    "processName": "PostgresLoader",
    "publisherToken": null,
    "pipelineToken": "pt-abc12345-...",
    "filename": "sales_data",
    "state": "end",
    "code": "end",
    "description": "Process completed",
    "epoch": 1710500403000
  }
]
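The response above can be reduced to a short timeline on the client side. The following is a minimal sketch, not part of any Datris client library: the function names fetch_status and summarize are hypothetical, the host and port are copied from the curl example, and it assumes the endpoint needs no authentication header, as in that example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # host and port taken from the curl example


def fetch_status(pipeline_token, base_url=BASE_URL):
    """GET /api/v1/pipeline/status?pipelinetoken=... and decode the JSON array."""
    query = urllib.parse.urlencode({"pipelinetoken": pipeline_token})
    url = f"{base_url}/api/v1/pipeline/status?{query}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def summarize(entries):
    """Order entries by epoch and report the latest stage plus elapsed time."""
    ordered = sorted(entries, key=lambda e: e["epoch"])
    if not ordered:
        return None
    first, last = ordered[0], ordered[-1]
    return {
        "stages": [e["processName"] for e in ordered],
        "latest_process": last["processName"],
        "latest_state": last["state"],
        "elapsed_ms": last["epoch"] - first["epoch"],
    }
```

Calling summarize(fetch_status(token)) against a running server would give the ordered stage names, the most recent state, and the elapsed time in milliseconds between the first and last entry.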

Query by Publisher Token

curl "http://localhost:8080/api/v1/pipeline/status?publishertoken=pub-abc12345-..."

Returns every status row whose publisherToken matches, covering all ingestion jobs a single caller submitted. Tap runs set a publisherToken on every job they spawn, so one query covers a structured tap (1 job) or a document tap (N jobs, one per file) in a single call. Use publisherToken when you need to watch “this entire run,” and pipelineToken when you need the detail of one specific job.

Add &withrollup=true to wrap the response in a {rollup, events} object that classifies each job (success, warning, error, processing, timed_out) and exposes rollup.allDone as a single boolean to poll on. See the status API reference for the full shape.

Agents call the same query via the MCP get_pipeline_status tool: pass publisher_token from a run_tap response and poll until rollup.allDone is true. The MCP tool sets withrollup=true automatically. For an upload_data flow, get_job_status does the same with the pipelineToken returned from the upload.
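Polling on rollup.allDone over REST might look like the sketch below. It relies only on the rollup.allDone field named above and reads nothing else from the response. The function names, the polling interval, and the timeout are hypothetical choices for illustration, and the fetch parameter exists so the loop can be exercised without a server.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_rollup(publisher_token, base_url="http://localhost:8080"):
    """Query status by publisher token with withrollup=true."""
    query = urllib.parse.urlencode(
        {"publishertoken": publisher_token, "withrollup": "true"}
    )
    url = f"{base_url}/api/v1/pipeline/status?{query}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_run(publisher_token, fetch=fetch_rollup, interval_s=2.0, timeout_s=600.0):
    """Poll until rollup.allDone is true; raise TimeoutError if it never is."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch(publisher_token)
        if body["rollup"]["allDone"]:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {publisher_token} not done after {timeout_s}s")
        time.sleep(interval_s)
```

The returned body is the full {rollup, events} object, so the caller can inspect the per-job classification once the run has finished.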

Query by Pipeline Name

curl "http://localhost:8080/api/v1/pipeline/status?pipelinename=sales_data&page=1"

Returns an array of job summaries for the pipeline (20 per page):

[
  {
    "createdAtTimestamp": "2026-03-15T10:00:00Z",
    "createdAt": 1710500400000,
    "updatedAt": 1710500403000,
    "pipeline": "sales_data",
    "pipelineToken": "pt-abc12345-...",
    "process": "PostgresLoader",
    "startTime": "2026-03-15T10:00:00Z",
    "endTime": "2026-03-15T10:00:03Z",
    "totalTime": "3s",
    "status": "end"
  }
]
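Because results come back 20 per page, collecting a pipeline's full job history means walking the page parameter. A minimal sketch follows; it assumes that a page with fewer than 20 entries is the last one, which the text above does not confirm, and the names fetch_page, iter_jobs, and max_pages are hypothetical.

```python
import json
import urllib.parse
import urllib.request

PAGE_SIZE = 20  # page size stated in the text


def fetch_page(pipeline_name, page, base_url="http://localhost:8080"):
    """GET one page of job summaries for a pipeline."""
    query = urllib.parse.urlencode({"pipelinename": pipeline_name, "page": page})
    url = f"{base_url}/api/v1/pipeline/status?{query}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_jobs(pipeline_name, fetch=fetch_page, max_pages=50):
    """Yield job summaries page by page, starting at page=1."""
    for page in range(1, max_pages + 1):
        jobs = fetch(pipeline_name, page)
        yield from jobs
        # Assumption: a page shorter than PAGE_SIZE is the last one.
        if len(jobs) < PAGE_SIZE:
            return
```

The max_pages guard keeps the loop bounded if the server behaves differently from the assumption.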

Job Lifecycle

Jobs progress through these states:

State	Description
`INITIALIZED`	Job created, queued for processing
`PROCESSING`	Running in a dedicated thread
`COMPLETED`	Finished (check status messages for success or error)
`CANCELLED`	Job was killed via the `kill_job` API or MCP tool
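Client code that waits on a job usually needs to know which of these states are final. The sketch below models the table as an enum. Treating COMPLETED and CANCELLED as terminal is an inference from the descriptions, not something the table states outright, and the names JobState and is_terminal are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    INITIALIZED = "INITIALIZED"  # created, queued for processing
    PROCESSING = "PROCESSING"    # running in a dedicated thread
    COMPLETED = "COMPLETED"      # finished; success or error is in the status messages
    CANCELLED = "CANCELLED"      # killed via the kill_job API or MCP tool


# Assumption: a job never leaves COMPLETED or CANCELLED.
TERMINAL_STATES = frozenset({JobState.COMPLETED, JobState.CANCELLED})


def is_terminal(state):
    """Accept a JobState or its string value and report whether polling can stop."""
    return JobState(state) in TERMINAL_STATES
```

Note that COMPLETED alone does not mean success: the status messages still have to be checked for errors.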

Processing Stages

Each job logs status messages as it progresses through stages:

FileNotifier / StreamNotifier - Initial file or stream intake
DataQuality - Validation (if configured)
Transformation - Data transformation (if configured)
JobRunner - Orchestration of destination loaders
[LoaderName] - Each destination loader (e.g., PostgresLoader, SparkObjectStoreLoader)

Each stage logs begin, processing (with details), and end messages.
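Since every stage logs begin, processing, and end, the status entries for a job can be grouped by processName to see which stages are still open. This is a rough sketch over the entry shape shown in the pipeline-token response; the function names are hypothetical.

```python
from collections import defaultdict


def stage_progress(entries):
    """Map each processName to the states it has logged, in epoch order."""
    progress = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e["epoch"]):
        progress[entry["processName"]].append(entry["state"])
    return dict(progress)


def unfinished_stages(entries):
    """Stages that logged 'begin' but have not yet logged 'end'."""
    return [
        name
        for name, states in stage_progress(entries).items()
        if "begin" in states and "end" not in states
    ]
```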

Status Storage

Job statuses are stored in MongoDB in the {environment}-pipeline-status collection. Each entry contains the pipeline token, process name, status, message, and timestamp.
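For direct inspection, the collection can be read with any MongoDB client. The sketch below uses the third-party pymongo package and rests on several assumptions that should be checked against a real deployment: the connection URI and database name are supplied by the caller because this page does not give them, and the field names pipelineToken and epoch are assumed to match the REST response. The function names are hypothetical.

```python
def status_collection_name(environment):
    """Build the collection name from the {environment}-pipeline-status pattern."""
    return f"{environment}-pipeline-status"


def find_job_entries(mongo_uri, database, environment, pipeline_token):
    """Return stored status entries for one job, oldest first."""
    from pymongo import MongoClient  # third-party: pip install pymongo

    client = MongoClient(mongo_uri)
    try:
        collection = client[database][status_collection_name(environment)]
        # Assumption: stored documents use the same field names as the REST
        # response (pipelineToken, epoch); verify against your deployment.
        cursor = collection.find({"pipelineToken": pipeline_token}).sort("epoch", 1)
        return list(cursor)
    finally:
        client.close()
```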

Concurrent Job Handling

All destination loaders for a single job execute in parallel on a 20-thread pool
Jobs targeting the same database table are serialized (only one runs at a time)
Multiple jobs for different pipelines run concurrently
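The combination of a fixed-size pool and per-table serialization can be illustrated with a short sketch. It is a simplified Python model of the behavior described above, not the platform's actual implementation: it serializes individual loads per table using one lock per table name, and all names in it are hypothetical.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 20  # pool size stated in the text

_pool = ThreadPoolExecutor(max_workers=POOL_SIZE)
_table_locks = defaultdict(threading.Lock)
_registry_guard = threading.Lock()


def _lock_for(table):
    # Guard the registry so two threads never create separate locks for one table.
    with _registry_guard:
        return _table_locks[table]


def submit_load(table, load_fn, *args):
    """Run load_fn on the shared pool while holding the lock for its target table."""
    def task():
        with _lock_for(table):
            return load_fn(*args)
    return _pool.submit(task)
```

Loads for different tables proceed in parallel up to the pool size, while loads for the same table wait on each other. A waiting load still occupies a pool thread in this model, which is one reason a real scheduler might queue per table instead.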

Datris UI

The Datris UI provides a visual interface for managing your entire Datris platform. It includes tabs for MCP server status and tools, live AI agent activity (the Agents tab), pipeline management, ingestion monitoring with job history and error details, semantic search across vector databases, and secrets management, all without needing to use the API directly.

Agent Monitor

The Agents tab shows a live view of every AI agent currently connected to the platform’s MCP server, along with a streaming log of the tool calls each agent is making. The visualization pane draws one icon per active MCP session on the right of the MCP server icon, connected by a line that pulses whenever a call is in flight. Idle sessions fade out automatically once they disconnect.

Each agent label uses the most descriptive identifier available, in this order:

The MCP clientInfo.name supplied during the client’s handshake (e.g. claude-ai, claude-code, cursor)
The tenant name (multi-tenant deployments)
The API-key name from the api-keys secret (single-tenant deployments with named keys)
The API-key prefix
The session short-id (last-resort fallback)
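The fallback order above amounts to "first non-empty value wins". A minimal sketch, assuming session metadata is available as a dictionary; every key name here is hypothetical and chosen only for this example.

```python
def agent_label(session):
    """Return the first non-empty identifier, in the documented priority order."""
    # All dictionary keys below are hypothetical names chosen for this sketch.
    candidates = (
        session.get("client_info_name"),  # MCP clientInfo.name from the handshake
        session.get("tenant_name"),       # multi-tenant deployments
        session.get("api_key_name"),      # named key from the api-keys secret
        session.get("api_key_prefix"),
        session.get("session_short_id"),  # last-resort fallback
    )
    for value in candidates:
        if value:
            return value
    return "unknown"
```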

The activity log below the visualization lists every tool call as it happens, timestamped per the platform’s configured date format and timezone, with the calling agent, tool name, argument preview, record count, response size, status, and latency. Clicking a row expands it to reveal the full arguments and response bodies as pretty-printed JSON (capped at 2 KB per blob). A header toolbar lets you copy the full log (with per-row detail) to the clipboard or clear the on-screen history.

Activity is held in an in-memory ring buffer on the MCP server (the most recent 200 calls) and served via the UI’s internal /api/v1/mcp/activity proxy. It is not persisted, so a restart clears the history.
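The retention behavior described here (the most recent 200 calls, 2 KB per blob, nothing persisted) matches a bounded in-memory queue. The sketch below shows the idea with Python's collections.deque. It is an illustration of the technique, not the MCP server's code, and the class and field names are hypothetical.

```python
import json
from collections import deque

MAX_CALLS = 200        # "the most recent 200 calls"
MAX_BLOB_BYTES = 2048  # "capped at 2 KB per blob"


class ActivityLog:
    def __init__(self, max_calls=MAX_CALLS):
        # A deque with maxlen drops the oldest entry once it is full.
        self._calls = deque(maxlen=max_calls)

    def record(self, agent, tool, arguments, response):
        self._calls.append({
            "agent": agent,
            "tool": tool,
            "arguments": self._cap(arguments),
            "response": self._cap(response),
        })

    @staticmethod
    def _cap(value):
        # Pretty-print, then truncate the encoded text to the byte cap.
        text = json.dumps(value, indent=2)
        return text.encode("utf-8")[:MAX_BLOB_BYTES].decode("utf-8", errors="ignore")

    def recent(self):
        return list(self._calls)
```

Because the buffer lives only in process memory, discarding the object (or restarting the process) loses the history, which mirrors the non-persistence noted above.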

CLI

Check job status from the terminal:

datris status my_pipeline
datris health
