Documentation Index
Fetch the complete documentation index at: https://docs.datris.ai/llms.txt
Use this file to discover all available pages before exploring further.
The Datris CLI (datris) is a command-line interface for the Datris Data Platform. It communicates with the platform via the MCP server, providing a simple way to ingest data, run queries, search vector stores, and manage pipelines — all from the terminal.
Installation
brew tap datris/tap
brew install datris
Configuration
The CLI connects to the MCP server via SSE. Set the server URL with an environment variable:
export MCP_SERVER_URL=http://localhost:3000/sse # default
JSON Output
Every command supports --json to return raw JSON instead of human-readable output. This is useful for scripting and programmatic use.
datris pipelines --json
datris query "SELECT * FROM trades" --json
datris analyze "top 5 stocks" --table trades --json
datris health --json
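For scripting, the JSON output can be parsed with any standard JSON tooling. A minimal Python sketch of driving the CLI programmatically — note the response shape is not documented here, so the "name" field below is an assumption to adjust against your server's actual schema:

```python
import json
import subprocess

def pipeline_names(raw: str) -> list[str]:
    """Extract pipeline names from `datris pipelines --json` output.

    Assumes the response is a JSON array of objects carrying a
    "name" field; adjust to the schema your server actually returns.
    """
    return [p["name"] for p in json.loads(raw)]

def fetch_pipeline_names() -> list[str]:
    # Invoke the CLI and capture raw JSON instead of the table view.
    out = subprocess.run(
        ["datris", "pipelines", "--json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return pipeline_names(out)
```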
Commands
datris help
Show all available commands and options.
datris pipelines
List all registered pipelines.
datris ingest
Create a pipeline and ingest a data file in one step. The schema is auto-detected from the file.
If the named pipeline already exists, the existing config is preserved and the file is uploaded into it — --dest, --table, --ai-validate, --ai-transform, and --catalog are ignored on re-ingest. Delete the pipeline first (datris delete <name>) if you need a fresh config.
datris ingest <FILE> [OPTIONS]
| Option | Description |
| --- | --- |
| --pipeline, -p | Pipeline name (default: derived from filename) |
| --dest, -d | Destination: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres) |
| --table, -t | Table/collection name (default: pipeline name) |
| --database | Database name (default: datris) |
| --ai-validate | AI data quality rule — plain-English instruction |
| --ai-transform | AI transformation — plain-English instruction |
| --ai-analyze | Ask a question about the data after ingestion completes |
| --catalog | Catalog label to group this pipeline with related pipelines (e.g. openclaw). Free-form — no need to pre-create. Only applied when creating a new pipeline. |
| --json | Return raw JSON |
Examples:
# Basic ingestion — auto-derives pipeline name from filename
datris ingest sales-data.csv --dest postgres
# With explicit pipeline name
datris ingest sales-data.csv --pipeline sales --dest postgres
# With AI validation
datris ingest trades.csv --dest postgres --ai-validate "all prices must be positive and dates must be YYYY-MM-DD"
# With AI transformation
datris ingest trades.csv --dest postgres --ai-transform "convert all date columns to YYYY/MM/DD format"
# Both validation and transformation
datris ingest trades.csv --dest postgres \
--ai-validate "all prices must be positive" \
--ai-transform "convert dates to YYYY/MM/DD and uppercase all ticker symbols"
# Ingest into MongoDB
datris ingest events.json --dest mongodb --database analytics
# Ingest into a vector store for RAG
datris ingest manual.pdf --dest pgvector
# Group related pipelines under a catalog
datris ingest legal-2026.pdf --pipeline legal_2026 --dest pgvector --catalog legal
# Ingest + analyze in one command
datris ingest trades.csv --dest postgres --ai-analyze "What are the top 5 stocks by volume?"
# Ingest a document and ask a question about it
datris ingest annual-report.pdf --dest pgvector --ai-analyze "What was the company's revenue?"
# Ingest + analyze, raw JSON output for scripts
datris ingest trades.csv --dest postgres --ai-analyze "top 5 by volume" --json
datris query
Execute a read-only SQL SELECT query against PostgreSQL.
datris query "SELECT * FROM public.sales LIMIT 10"
datris query "SELECT symbol, close FROM public.trades WHERE close > 100" --limit 50
| Option | Description |
| --- | --- |
| --limit | Max rows returned (default: 100, max: 1000) |
| --json | Return raw JSON |
datris search
Semantic search across a vector database.
datris search "What is the return policy?" --store pgvector --collection support_docs
| Option | Description |
| --- | --- |
| --store | Vector store: qdrant, weaviate, milvus, chroma, pgvector (default: pgvector) |
| --collection | Collection/table name (required) |
| --top-k | Number of results (default: 5) |
| --json | Return raw JSON |
datris analyze
Ask a question about your data using AI. Works with any destination type — the CLI automatically picks the right query strategy based on --dest.
datris analyze <QUESTION> --table <TABLE> [OPTIONS]
| Option | Description |
| --- | --- |
| --table, -t | Table/collection name (required) |
| --dest, -d | Data source type: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres) |
| --top-k, -k | Number of search results for vector stores (default: 5) |
| --json | Return raw JSON instead of AI narrative |
Examples:
# Analyze PostgreSQL data — AI generates SQL, executes it, returns AI answer
datris analyze "What are the top 5 stocks by volume?" --table trades
# Analyze MongoDB data
datris analyze "How many events occurred in March?" --table events --dest mongodb
# Analyze vector store data (RAG) — semantic search + AI answer
datris analyze "What is the return policy?" --table support_docs --dest pgvector
datris analyze "What was quarterly revenue?" --table financial_docs --dest qdrant
# Raw JSON output for scripts
datris analyze "top 5 stocks" --table trades --json
How it works by destination:
- PostgreSQL — AI generates a SQL query from your question, executes it, then summarizes the results in a natural language answer
- MongoDB — fetches documents from the collection, then AI answers the question based on the data
- Vector stores — performs semantic search to find relevant chunks, then AI generates an answer from the retrieved context
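The vector-store path described above can be sketched as a small retrieve-then-answer loop. This is illustrative only: search and ask_llm below are hypothetical stand-ins for the platform's internals, not part of the datris CLI:

```python
def rag_answer(question, search, ask_llm, top_k=5):
    """Sketch of the vector-store path of `datris analyze`:
    semantic search retrieves relevant chunks, then an LLM
    answers the question grounded in the retrieved context."""
    chunks = search(question, top_k=top_k)   # retrieval step
    context = "\n\n".join(chunks)            # assemble retrieved context
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)                   # generation step
```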
datris query-mongo
Query a MongoDB collection with optional filter and projection.
datris query-mongo events
datris query-mongo events --filter '{"status": "active"}' --limit 20
datris query-mongo events --projection '{"name": 1, "status": 1}'
| Option | Description |
| --- | --- |
| --filter | MongoDB filter JSON (default: {}) |
| --projection | Fields to include/exclude |
| --limit | Max documents (default: 100) |
| --json | Return raw JSON |
datris status
Get the latest job status for a pipeline.
datris status my_pipeline
datris delete
Delete a pipeline configuration and optionally its destination data.
datris delete my_pipeline
datris delete my_pipeline --keep-data
| Option | Description |
| --- | --- |
| --keep-data | Keep destination data (only delete the pipeline config) |
| --json | Return raw JSON |
datris health
Check the health of all backend services.
datris secrets
List all configured secrets.
datris taps
List all taps.
datris tap create
Create a tap from a plain-English description.
datris tap create "<description>" --pipeline <name> [--name <tap-name>] [--cron "<expression>"] [--type structured|document]
| Option | Description |
| --- | --- |
| --pipeline, -p | Target pipeline name (required) |
| --name, -n | Tap name (default: derived from pipeline name) |
| --cron | CRON expression for scheduling (Quartz format) |
| --secret | Vault secret name for credentials injected as env vars |
| --script | Path to a Python script file with a fetch() function (skips AI generation) |
| --type | structured (default) returns rows of records; document returns {uri, filename, content} dicts destined for a vector-store pipeline. See Document Taps |
Examples:
# Structured: rows into a Postgres/Mongo pipeline
datris tap create "Fetch daily stock prices for S&P 500 from yfinance" --pipeline stocks --cron "0 0 0 * * ?"
datris tap create "Get weather data from Open-Meteo API" --pipeline weather --name weather-tap
# Document: raw file bytes into a vector-store pipeline
datris tap create "Discover all PDFs in the S3 bucket 'contracts' under prefix 2026/" \
--pipeline contracts-vec --type document --secret aws-creds
Document taps require a target pipeline whose source is unstructuredAttributes and whose destination is a vector store (qdrant, pgvector, weaviate, milvus, or chroma). The server rejects tap create --type document against a structured pipeline with HTTP 400.
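A --script file is expected to expose a fetch() function. A minimal sketch of both shapes — in a real script each function would be named fetch() in its own file; the field names in the structured example are illustrative, while the {uri, filename, content} keys for document taps come from the option table above:

```python
# Structured tap script: fetch() returns rows of records.
def fetch():
    # Field names here are illustrative, not a required schema.
    return [
        {"symbol": "AAPL", "close": 189.5},
        {"symbol": "MSFT", "close": 412.1},
    ]

# Document tap script: fetch() returns {uri, filename, content}
# dicts destined for a vector-store pipeline. (Named differently
# here only so both sketches fit in one file.)
def fetch_documents():
    # In a real script this would read raw bytes from S3 or disk.
    content = b"%PDF-1.7 ... raw file bytes ..."
    return [{
        "uri": "s3://contracts/2026/example.pdf",
        "filename": "example.pdf",
        "content": content,
    }]
```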
datris tap run
Run a tap manually. Output reflects whether the fetched records actually landed in the target pipeline:
$ datris tap run stock-prices
Running tap 'stock-prices'...
✓ success — 33 records fetched
→ persisted to stock-prices
→ watch: datris pipeline status --publisher 5b2f4a1d-8c7e-4f0a-9b3d-6e1c2a4f8b9e
If the run succeeded but the records were not persisted (missing target pipeline, test mode, script error, zero records), the CLI reports the reason:
✓ success — 33 records fetched
→ not persisted (no_target_pipeline)
Pass --json to get the full response including mode, persisted, persistedReason, publisherToken, and pipelineTokens.
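In scripts, the persisted and persistedReason fields from the --json response can be checked directly. A minimal sketch, using only the fields named above:

```python
import json

def summarize_tap_run(raw: str) -> str:
    """Summarize a `datris tap run <tap> --json` response using
    the documented `persisted` and `persistedReason` fields."""
    resp = json.loads(raw)
    if resp.get("persisted"):
        return "persisted"
    return f"not persisted ({resp.get('persistedReason', 'unknown')})"
```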
datris tap delete
Delete a tap.
datris version
Get the server version.
Pipeline Name Auto-Detection
When --pipeline is not specified, the CLI derives the pipeline name from the filename:
sales-data.csv → sales_data
Q1 Revenue Report.csv → q1_revenue_report
trades.json → trades
The extension is stripped, hyphens and spaces are replaced with underscores, and the name is lowercased.
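The derivation rules above can be mirrored in a few lines. A sketch of the documented behavior — the CLI may handle other special characters differently:

```python
import re

def derive_pipeline_name(filename: str) -> str:
    """Mirror the CLI's auto-detection: strip the extension,
    replace hyphens and spaces with underscores, lowercase."""
    stem = filename.rsplit(".", 1)[0]        # drop the extension
    return re.sub(r"[-\s]+", "_", stem).lower()
```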