Documentation Index

Fetch the complete documentation index at: https://docs.datris.ai/llms.txt

Use this file to discover all available pages before exploring further.

The Datris CLI (datris) is a command-line interface for the Datris Data Platform. It communicates with the platform via the MCP server, providing a simple way to ingest data, run queries, search vector stores, and manage pipelines — all from the terminal.

Installation

brew tap datris/tap
brew install datris

Configuration

The CLI connects to the MCP server via SSE. Set the server URL with an environment variable:
export MCP_SERVER_URL=http://localhost:3000/sse    # default

JSON Output

Every command supports --json to return raw JSON instead of human-readable output. This is useful for scripting and programmatic use.
datris pipelines --json
datris query "SELECT * FROM trades" --json
datris analyze "top 5 stocks" --table trades --json
datris health --json
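For scripting, the --json flag pairs naturally with a thin wrapper. Here is a minimal Python sketch, assuming only that every command accepts --json and prints a JSON document to stdout; the `runner` parameter is a hypothetical injection point for testing, not part of the CLI:

```python
import json
import subprocess

def datris_json(*args, runner=subprocess.run):
    """Run a datris subcommand with --json and return the parsed output.

    The runner is injectable so command construction can be exercised
    without the CLI installed.
    """
    cmd = ["datris", *args, "--json"]
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, `datris_json("query", "SELECT * FROM trades", "--limit", "50")` returns the query result as a Python object instead of printed text.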

Commands

datris help

Show all available commands and options.
datris --help

datris pipelines

List all registered pipelines.
datris pipelines

datris ingest

Create a pipeline and ingest a data file in one step. The schema is auto-detected from the file. If the named pipeline already exists, the existing config is preserved and the file is uploaded into it — --dest, --table, --ai-validate, --ai-transform, and --catalog are ignored on re-ingest. Delete the pipeline first (datris delete <name>) if you need a fresh config.
datris ingest <FILE> [OPTIONS]
Option            Description
--pipeline, -p    Pipeline name (default: derived from filename)
--dest, -d        Destination: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres)
--table, -t       Table/collection name (default: pipeline name)
--database        Database name (default: datris)
--ai-validate     AI data-quality rule, as a plain-English instruction
--ai-transform    AI transformation, as a plain-English instruction
--ai-analyze      Ask a question about the data after ingestion completes
--catalog         Catalog label to group this pipeline with related pipelines (e.g. openclaw). Free-form — no need to pre-create. Only applied when creating a new pipeline.
--json            Return raw JSON
Examples:
# Basic ingestion — auto-derives pipeline name from filename
datris ingest sales-data.csv --dest postgres

# With explicit pipeline name
datris ingest sales-data.csv --pipeline sales --dest postgres

# With AI validation
datris ingest trades.csv --dest postgres --ai-validate "all prices must be positive and dates must be YYYY-MM-DD"

# With AI transformation
datris ingest trades.csv --dest postgres --ai-transform "convert all date columns to YYYY/MM/DD format"

# Both validation and transformation
datris ingest trades.csv --dest postgres \
  --ai-validate "all prices must be positive" \
  --ai-transform "convert dates to YYYY/MM/DD and uppercase all ticker symbols"

# Ingest into MongoDB
datris ingest events.json --dest mongodb --database analytics

# Ingest into a vector store for RAG
datris ingest manual.pdf --dest pgvector

# Group related pipelines under a catalog
datris ingest legal-2026.pdf --pipeline legal_2026 --dest pgvector --catalog legal

# Ingest + analyze in one command
datris ingest trades.csv --dest postgres --ai-analyze "What are the top 5 stocks by volume?"

# Ingest a document and ask a question about it
datris ingest annual-report.pdf --dest pgvector --ai-analyze "What was the company's revenue?"

# Ingest + analyze, raw JSON output for scripts
datris ingest trades.csv --dest postgres --ai-analyze "top 5 by volume" --json

datris query

Execute a read-only SQL SELECT query against PostgreSQL.
datris query "SELECT * FROM public.sales LIMIT 10"
datris query "SELECT symbol, close FROM public.trades WHERE close > 100" --limit 50
Option    Description
--limit   Max rows returned (default: 100, max: 1000)
--json    Return raw JSON

datris search

Semantic search across a vector database.
datris search "What is the return policy?" --store pgvector --collection support_docs
Option        Description
--store       Vector store: qdrant, weaviate, milvus, chroma, pgvector (default: pgvector)
--collection  Collection/table name (required)
--top-k       Number of results (default: 5)
--json        Return raw JSON

datris analyze

Ask a question about your data using AI. Works with any destination type — auto-picks the right approach based on --dest.
datris analyze <QUESTION> --table <TABLE> [OPTIONS]
Option       Description
--table, -t  Table/collection name (required)
--dest, -d   Data source type: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres)
--top-k, -k  Number of search results for vector stores (default: 5)
--json       Return raw JSON instead of AI narrative
Examples:
# Analyze PostgreSQL data — AI generates SQL, executes it, returns AI answer
datris analyze "What are the top 5 stocks by volume?" --table trades

# Analyze MongoDB data
datris analyze "How many events occurred in March?" --table events --dest mongodb

# Analyze vector store data (RAG) — semantic search + AI answer
datris analyze "What is the return policy?" --table support_docs --dest pgvector
datris analyze "What was quarterly revenue?" --table financial_docs --dest qdrant

# Raw JSON output for scripts
datris analyze "top 5 stocks" --table trades --json
How it works by destination:
  • PostgreSQL — AI generates a SQL query from your question, executes it, then summarizes the results in a natural language answer
  • MongoDB — fetches documents from the collection, then AI answers the question based on the data
  • Vector stores — performs semantic search to find relevant chunks, then AI generates an answer from the retrieved context
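The routing above can be sketched as a small dispatch function. This is a hypothetical illustration of the documented behaviour, not the CLI's actual code:

```python
# Destinations the docs list as vector stores.
VECTOR_STORES = {"qdrant", "weaviate", "milvus", "chroma", "pgvector"}

def analyze_strategy(dest: str) -> str:
    """Return the analysis approach used for a given --dest value."""
    if dest == "postgres":
        return "sql"        # AI generates a SELECT, executes it, summarizes
    if dest == "mongodb":
        return "documents"  # fetch documents, AI answers from the data
    if dest in VECTOR_STORES:
        return "rag"        # semantic search, then AI answers from context
    raise ValueError(f"unknown destination: {dest}")
```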

datris query-mongo

Query a MongoDB collection with optional filter and projection.
datris query-mongo events
datris query-mongo events --filter '{"status": "active"}' --limit 20
datris query-mongo events --projection '{"name": 1, "status": 1}'
Option        Description
--filter      MongoDB filter JSON (default: {})
--projection  Fields to include/exclude
--limit       Max documents (default: 100)
--json        Return raw JSON
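Hand-writing JSON filters inside shell quotes is error-prone. A small sketch that serializes the filter with json.dumps and quotes it with shlex (the helper itself is illustrative, not part of the CLI):

```python
import json
import shlex

def mongo_query_cmd(collection, filter=None, projection=None, limit=None):
    """Build a `datris query-mongo` command line with safely quoted JSON."""
    cmd = ["datris", "query-mongo", collection]
    if filter is not None:
        cmd += ["--filter", json.dumps(filter)]
    if projection is not None:
        cmd += ["--projection", json.dumps(projection)]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    return shlex.join(cmd)
```

For example, `mongo_query_cmd("events", filter={"status": "active"}, limit=20)` yields a shell-safe command string matching the second example above.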

datris status

Get the latest job status for a pipeline.
datris status my_pipeline

datris delete

Delete a pipeline configuration and optionally its destination data.
datris delete my_pipeline
datris delete my_pipeline --keep-data
Option       Description
--keep-data  Keep destination data (only delete the pipeline config)
--json       Return raw JSON

datris health

Check the health of all backend services.
datris health

datris secrets

List all configured secrets.
datris secrets

datris taps

List all taps.
datris taps

datris tap create

Create a tap from a plain-English description.
datris tap create "<description>" --pipeline <name> [--name <tap-name>] [--cron "<expression>"] [--type structured|document]
Option          Description
--pipeline, -p  Target pipeline name (required)
--name, -n      Tap name (default: derived from pipeline name)
--cron          CRON expression for scheduling (Quartz format)
--secret        Vault secret name for credentials injected as env vars
--script        Path to a Python script file with a fetch() function (skips AI generation)
--type          structured (default) returns rows of records; document returns {uri, filename, content} dicts destined for a vector-store pipeline. See Document Taps
Examples:
# Structured: rows into a Postgres/Mongo pipeline
datris tap create "Fetch daily stock prices for S&P 500 from yfinance" --pipeline stocks --cron "0 0 0 * * ?"
datris tap create "Get weather data from Open-Meteo API" --pipeline weather --name weather-tap

# Document: raw file bytes into a vector-store pipeline
datris tap create "Discover all PDFs in the S3 bucket 'contracts' under prefix 2026/" \
  --pipeline contracts-vec --type document --secret aws-creds
Document taps require a target pipeline whose source is unstructuredAttributes and whose destination is a vector store (qdrant, pgvector, weaviate, milvus, or chroma). The server rejects tap create --type document against a structured pipeline with HTTP 400.
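A script passed via --script exposes a fetch() function. The docs state that a document tap's fetch() returns {uri, filename, content} dicts; the sketch below assumes content is raw bytes and discovers files from a local directory for illustration (a real tap would list an S3 bucket or similar):

```python
# Hypothetical document-type tap script for `datris tap create --script`.
from pathlib import Path

def fetch():
    """Discover local PDFs and return them as document records."""
    records = []
    for path in sorted(Path("contracts").glob("*.pdf")):
        records.append({
            "uri": path.resolve().as_uri(),  # stable identifier for the doc
            "filename": path.name,
            "content": path.read_bytes(),    # raw file bytes (assumption)
        })
    return records
```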

datris tap run

Run a tap manually. Output reflects whether the fetched records actually landed in the target pipeline:
$ datris tap run stock-prices
  Running tap 'stock-prices'...
 success 33 records fetched
 persisted to stock-prices
 watch: datris pipeline status --publisher 5b2f4a1d-8c7e-4f0a-9b3d-6e1c2a4f8b9e
If the run succeeded but records weren't persisted (missing target pipeline, test mode, script error, zero records), the CLI tells you exactly why:
 success 33 records fetched
 not persisted (no_target_pipeline)
Pass --json to get the full response including mode, persisted, persistedReason, publisherToken, and pipelineTokens.
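A script consuming that JSON can branch on persisted and persistedReason. The field names below come from the docs; the sample payloads are invented for illustration:

```python
import json

def summarize_tap_run(payload: str) -> str:
    """Turn a `datris tap run --json` response into a one-line summary."""
    run = json.loads(payload)
    if run.get("persisted"):
        return f"persisted ({run.get('mode', 'unknown')} mode)"
    return f"not persisted ({run.get('persistedReason', 'unknown')})"
```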

datris tap delete

Delete a tap.
datris tap delete <name>

datris version

Get the server version.
datris version

Pipeline Name Auto-Detection

When --pipeline is not specified, the CLI derives the pipeline name from the filename:
  • sales-data.csv → sales_data
  • Q1 Revenue Report.csv → q1_revenue_report
  • trades.json → trades
The extension is stripped, hyphens and spaces are replaced with underscores, and the name is lowercased.
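The rule is simple enough to re-implement, which is handy when a script needs to predict the pipeline name before ingesting. A sketch of the stated rule, not the CLI's actual code:

```python
import re
from pathlib import Path

def derive_pipeline_name(filename: str) -> str:
    """Strip the extension, replace hyphens and spaces with underscores,
    and lowercase, as described in the docs."""
    stem = Path(filename).stem
    return re.sub(r"[-\s]+", "_", stem).lower()
```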