Documentation Index
Fetch the complete documentation index at: https://docs.datris.ai/llms.txt
Use this file to discover all available pages before exploring further.
The Datris CLI (datris) is a command-line interface for the Datris Data Platform. It communicates with the platform via the MCP server, providing a simple way to ingest data, run queries, search vector stores, and manage pipelines — all from the terminal.
Installation
brew tap datris/tap
brew install datris
Configuration
The CLI connects to the MCP server via SSE. Set the server URL with an environment variable:
export MCP_SERVER_URL=http://localhost:3000/sse # default
JSON Output
Every command supports --json to return raw JSON instead of human-readable output. This is useful for scripting and programmatic use.
datris pipelines --json
datris query "SELECT * FROM trades" --json
datris analyze "top 5 stocks" --table trades --json
datris health --json
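For scripting, the JSON output can be parsed with any standard JSON tooling. A minimal Python sketch of driving the CLI programmatically — note the response shape is not documented here, so the "name" field below is an assumption to adjust against your server's actual schema:

```python
import json
import subprocess

def pipeline_names(raw: str) -> list[str]:
    """Extract pipeline names from `datris pipelines --json` output.

    Assumes the response is a JSON array of objects carrying a
    "name" field; adjust to the schema your server actually returns.
    """
    return [p["name"] for p in json.loads(raw)]

def fetch_pipeline_names() -> list[str]:
    # Invoke the CLI and capture raw JSON instead of the table view.
    out = subprocess.run(
        ["datris", "pipelines", "--json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return pipeline_names(out)
```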
Commands
datris help
Show all available commands and options.
datris pipelines
List all registered pipelines.
datris ingest
Create a pipeline and ingest a data file in one step. The schema is auto-detected from the file.
If the named pipeline already exists, the existing config is preserved and the file is uploaded into it — --dest, --table, --ai-validate, --ai-transform, and --catalog are ignored on re-ingest. Delete the pipeline first (datris delete <name>) if you need a fresh config.
datris ingest <FILE> [OPTIONS]
| Option | Description |
| --- | --- |
| --pipeline, -p | Pipeline name (default: derived from filename) |
| --dest, -d | Destination: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres) |
| --table, -t | Table/collection name (default: pipeline name) |
| --database | Database name (default: datris) |
| --ai-validate | AI data quality rule — plain-English instruction |
| --ai-transform | AI transformation — plain-English instruction |
| --ai-analyze | Ask a question about the data after ingestion completes |
| --catalog | Catalog label to group this pipeline with related pipelines (e.g. openclaw). Free-form — no need to pre-create. Only applied when creating a new pipeline. |
| --json | Return raw JSON |
Examples:
# Basic ingestion — auto-derives pipeline name from filename
datris ingest sales-data.csv --dest postgres
# With explicit pipeline name
datris ingest sales-data.csv --pipeline sales --dest postgres
# With AI validation
datris ingest trades.csv --dest postgres --ai-validate "all prices must be positive and dates must be YYYY-MM-DD"
# With AI transformation
datris ingest trades.csv --dest postgres --ai-transform "convert all date columns to YYYY/MM/DD format"
# Both validation and transformation
datris ingest trades.csv --dest postgres \
--ai-validate "all prices must be positive" \
--ai-transform "convert dates to YYYY/MM/DD and uppercase all ticker symbols"
# Ingest into MongoDB
datris ingest events.json --dest mongodb --database analytics
# Ingest into a vector store for RAG
datris ingest manual.pdf --dest pgvector
# Group related pipelines under a catalog
datris ingest legal-2026.pdf --pipeline legal_2026 --dest pgvector --catalog legal
# Ingest + analyze in one command
datris ingest trades.csv --dest postgres --ai-analyze "What are the top 5 stocks by volume?"
# Ingest a document and ask a question about it
datris ingest annual-report.pdf --dest pgvector --ai-analyze "What was the company's revenue?"
# Ingest + analyze, raw JSON output for scripts
datris ingest trades.csv --dest postgres --ai-analyze "top 5 by volume" --json
datris query
Execute a read-only SQL SELECT query against PostgreSQL.
datris query "SELECT * FROM public.sales LIMIT 10"
datris query "SELECT symbol, close FROM public.trades WHERE close > 100" --limit 50
| Option | Description |
| --- | --- |
| --limit | Max rows returned (default: 100, max: 1000) |
| --json | Return raw JSON |
datris search
Semantic search across a vector database.
datris search "What is the return policy?" --store pgvector --collection support_docs
| Option | Description |
| --- | --- |
| --store | Vector store: qdrant, weaviate, milvus, chroma, pgvector (default: pgvector) |
| --collection | Collection/table name (required) |
| --top-k | Number of results (default: 5) |
| --json | Return raw JSON |
datris analyze
Ask a question about your data using AI. Works with any destination type — the CLI automatically picks the right query strategy based on --dest.
datris analyze <QUESTION> --table <TABLE> [OPTIONS]
| Option | Description |
| --- | --- |
| --table, -t | Table/collection name (required) |
| --dest, -d | Data source type: postgres, mongodb, qdrant, weaviate, milvus, chroma, pgvector (default: postgres) |
| --top-k, -k | Number of search results for vector stores (default: 5) |
| --json | Return raw JSON instead of AI narrative |
Examples:
# Analyze PostgreSQL data — AI generates SQL, executes it, returns AI answer
datris analyze "What are the top 5 stocks by volume?" --table trades
# Analyze MongoDB data
datris analyze "How many events occurred in March?" --table events --dest mongodb
# Analyze vector store data (RAG) — semantic search + AI answer
datris analyze "What is the return policy?" --table support_docs --dest pgvector
datris analyze "What was quarterly revenue?" --table financial_docs --dest qdrant
# Raw JSON output for scripts
datris analyze "top 5 stocks" --table trades --json
How it works by destination:
- PostgreSQL — AI generates a SQL query from your question, executes it, then summarizes the results in a natural language answer
- MongoDB — fetches documents from the collection, then AI answers the question based on the data
- Vector stores — performs semantic search to find relevant chunks, then AI generates an answer from the retrieved context
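The vector-store path described above can be sketched as a small retrieve-then-answer loop. This is illustrative only: search and ask_llm below are hypothetical stand-ins for the platform's internals, not part of the datris CLI:

```python
def rag_answer(question, search, ask_llm, top_k=5):
    """Sketch of the vector-store path of `datris analyze`:
    semantic search retrieves relevant chunks, then an LLM
    answers the question grounded in the retrieved context."""
    chunks = search(question, top_k=top_k)   # retrieval step
    context = "\n\n".join(chunks)            # assemble retrieved context
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)                   # generation step
```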
datris query-mongo
Query a MongoDB collection with optional filter and projection.
datris query-mongo events
datris query-mongo events --filter '{"status": "active"}' --limit 20
datris query-mongo events --projection '{"name": 1, "status": 1}'
| Option | Description |
| --- | --- |
| --filter | MongoDB filter JSON (default: {}) |
| --projection | Fields to include/exclude |
| --limit | Max documents (default: 100) |
| --json | Return raw JSON |
datris status
Get the latest job status for a pipeline.
datris status my_pipeline
datris delete
Delete a pipeline configuration and optionally its destination data.
datris delete my_pipeline
datris delete my_pipeline --keep-data
| Option | Description |
| --- | --- |
| --keep-data | Keep destination data (only delete the pipeline config) |
| --json | Return raw JSON |
datris health
Check the health of all backend services.
datris secrets
List all configured secrets.
datris taps
List all taps.
datris tap create
Create a tap from a plain-English description.
datris tap create "<description>" --pipeline <name> [--name <tap-name>] [--cron "<expression>"] [--type structured|document]
| Option | Description |
| --- | --- |
| --pipeline, -p | Target pipeline name (required) |
| --name, -n | Tap name (default: derived from pipeline name) |
| --cron | CRON expression for scheduling (Quartz format) |
| --secret | Vault secret name for credentials injected as env vars |
| --script | Path to a Python script file with a fetch() function (skips AI generation) |
| --type | structured (default) returns rows of records; document returns {uri, filename, content} dicts destined for a vector-store pipeline. See Document Taps |
Examples:
# Structured: rows into a Postgres/Mongo pipeline
datris tap create "Fetch daily stock prices for S&P 500 from yfinance" --pipeline stocks --cron "0 0 0 * * ?"
datris tap create "Get weather data from Open-Meteo API" --pipeline weather --name weather-tap
# Document: raw file bytes into a vector-store pipeline
datris tap create "Discover all PDFs in the S3 bucket 'contracts' under prefix 2026/" \
--pipeline contracts-vec --type document --secret aws-creds
Document taps require a target pipeline whose source is unstructuredAttributes and whose destination is a vector store (qdrant, pgvector, weaviate, milvus, or chroma). The server rejects tap create --type document against a structured pipeline with HTTP 400.
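A --script file is expected to expose a fetch() function. A minimal sketch of both shapes — in a real script each function would be named fetch() in its own file; the field names in the structured example are illustrative, while the {uri, filename, content} keys for document taps come from the option table above:

```python
# Structured tap script: fetch() returns rows of records.
def fetch():
    # Field names here are illustrative, not a required schema.
    return [
        {"symbol": "AAPL", "close": 189.5},
        {"symbol": "MSFT", "close": 412.1},
    ]

# Document tap script: fetch() returns {uri, filename, content}
# dicts destined for a vector-store pipeline. (Named differently
# here only so both sketches fit in one file.)
def fetch_documents():
    # In a real script this would read raw bytes from S3 or disk.
    content = b"%PDF-1.7 ... raw file bytes ..."
    return [{
        "uri": "s3://contracts/2026/example.pdf",
        "filename": "example.pdf",
        "content": content,
    }]
```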
datris tap run
Run a tap manually. Output reflects whether the fetched records actually landed in the target pipeline:
$ datris tap run stock-prices
Running tap 'stock-prices'...
✓ success — 33 records fetched
→ persisted to stock-prices
→ watch: datris pipeline status --publisher 5b2f4a1d-8c7e-4f0a-9b3d-6e1c2a4f8b9e
If the run succeeded but the records were not persisted (missing target pipeline, test mode, script error, zero records), the CLI reports the reason:
✓ success — 33 records fetched
→ not persisted (no_target_pipeline)
Pass --json to get the full response including mode, persisted, persistedReason, publisherToken, and pipelineTokens.
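In scripts, the persisted and persistedReason fields from the --json response can be checked directly. A minimal sketch, using only the fields named above:

```python
import json

def summarize_tap_run(raw: str) -> str:
    """Summarize a `datris tap run <tap> --json` response using
    the documented `persisted` and `persistedReason` fields."""
    resp = json.loads(raw)
    if resp.get("persisted"):
        return "persisted"
    return f"not persisted ({resp.get('persistedReason', 'unknown')})"
```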
datris tap delete
Delete a tap.
datris version
Get the server version.
Pipeline Name Auto-Detection
When --pipeline is not specified, the CLI derives the pipeline name from the filename:
sales-data.csv → sales_data
Q1 Revenue Report.csv → q1_revenue_report
trades.json → trades
The extension is stripped, hyphens and spaces are replaced with underscores, and the name is lowercased.
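The derivation rules above can be mirrored in a few lines. A sketch of the documented behavior — the CLI may handle other special characters differently:

```python
import re

def derive_pipeline_name(filename: str) -> str:
    """Mirror the CLI's auto-detection: strip the extension,
    replace hyphens and spaces with underscores, lowercase."""
    stem = filename.rsplit(".", 1)[0]        # drop the extension
    return re.sub(r"[-\s]+", "_", stem).lower()
```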