- Schema generation — upload a file to `POST /api/v1/pipeline/generate` and receive a complete pipeline configuration with inferred field names and types
- Data quality rules — describe validation logic in plain English using `aiRule` instead of writing code
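For instance, a rule can be stated in plain English rather than code. The fragment below is a sketch only — the surrounding keys are assumptions about the pipeline configuration format; only the `aiRule` field itself comes from this guide:

```yaml
# Illustrative pipeline fragment — surrounding structure assumed
validation:
  - aiRule: "closing price must be positive and the high must not be below the low"
```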
## Choosing a Provider
The pipeline supports cloud providers (Anthropic, OpenAI) and local models via Ollama. Cloud providers are recommended for production use: they offer larger context windows (128K–200K tokens), higher accuracy, and stronger domain knowledge — all of which matter for AI data quality rules. The `aiRule` feature sends the entire file to the model in a single call, so a larger context window means larger files can be validated without falling back to batching.
| Provider | Context Window | Accuracy | API Key | Best For |
|---|---|---|---|---|
| Anthropic (Claude) | 200K tokens | Excellent | Required | Production, large files |
| OpenAI (GPT-4o) | 128K tokens | Excellent | Required | Production, large files |
| Ollama (local) | 32K–128K tokens | Good (varies by model) | No | Development, testing, no external dependencies |
## Supported Providers
| Provider | Value | API Key Required |
|---|---|---|
| Anthropic (Claude) | `anthropic` | Yes |
| OpenAI (GPT) | `openai` | Yes |
| Ollama (local) | `ollama` | No |
## `application.yaml`

Enable AI and set the provider in `application.yaml`. Set `aiSecretName` to match the Vault secret name for your chosen provider.
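A sketch of what that might look like — the exact nesting is an assumption (only `ai.enabled` is confirmed elsewhere in this guide), so verify against your project's actual configuration:

```yaml
ai:
  enabled: "true"             # quoted string, as in the Disabling AI section
  provider: anthropic         # anthropic | openai | ollama
  aiSecretName: ai-anthropic  # assumed name — must match the Vault secret
```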
## Vault Secret

The endpoint and model are stored in Vault. The secret must contain three fields:

| Field | Description |
|---|---|
| `endpoint` | API endpoint URL |
| `model` | Model identifier |
| `apiKey` | API key (empty string `""` for Ollama) |
### Anthropic (recommended)

`application.yaml`:
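A sketch under the field names described earlier (the secret name is an assumption — match your Vault setup):

```yaml
ai:
  enabled: "true"
  provider: anthropic
  aiSecretName: ai-anthropic  # assumed — match the Vault secret name
```

The matching Vault secret would then hold Anthropic's API endpoint (the public base URL is `https://api.anthropic.com`), a Claude model identifier, and your API key.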
### OpenAI

`application.yaml`:
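A sketch under the same assumed field names (secret name illustrative):

```yaml
ai:
  enabled: "true"
  provider: openai
  aiSecretName: ai-openai  # assumed — match the Vault secret name
```

The matching Vault secret would then hold OpenAI's API endpoint (the public base URL is `https://api.openai.com`), a GPT model identifier, and your API key.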
### Ollama (local model)
Ollama runs models locally on the host machine and exposes an OpenAI-compatible API. No API key is required. Install Ollama on your machine from ollama.com/download, then pull a model. Because the pipeline runs in Docker, it reaches the host's Ollama instance at `host.docker.internal`. The Vault secret is seeded automatically by `docker/vault-init.sh`. To set it manually:
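A sketch of the manual seeding command, assuming the secret lives at `secret/ai-ollama` (adjust the path and model tag to your setup):

```shell
# Seed the three required fields; apiKey is an empty string for Ollama
vault kv put secret/ai-ollama \
  endpoint="http://host.docker.internal:11434" \
  model="qwen2.5:14b-instruct" \
  apiKey=""
```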
The default model is `qwen2.5:14b-instruct`. To use a different model, set `OLLAMA_MODEL` in your `.env` file and pull it locally:
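For example, switching to a lighter model (the tag is a suggestion from the table of recommended models below):

```shell
# Record the choice for docker-compose / vault-init
echo "OLLAMA_MODEL=llama3.1:8b" >> .env

# Download the model into the local Ollama store
ollama pull llama3.1:8b
```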
Then re-run `docker-compose up --build` to update the Vault secret.
#### Recommended local models
Choose a model based on your available RAM. Larger models are significantly more accurate — 70B+ models approach cloud-level quality for data validation tasks. Models with 128K context windows work best with the `aiRule` full-file validation mode.
| Model | Size | Context | RAM Needed | Accuracy | Best For |
|---|---|---|---|---|---|
| `llama3.1:8b` | 8B | 128K | ~6 GB | Good | Lightweight dev/testing |
| `qwen2.5:14b-instruct` | 14B | 128K | ~10 GB | Good | Default for 16–24 GB machines |
| `qwen2.5:32b-instruct` | 32B | 128K | ~20 GB | Very good | 32 GB machines |
| `llama3.3:70b` | 70B | 128K | ~42 GB | Excellent | 64+ GB machines, near cloud quality |
| `qwen2.5:72b-instruct` | 72B | 128K | ~45 GB | Excellent | 64+ GB machines, near cloud quality |
#### Local model accuracy
Local models work well for obvious violations (e.g., a $5,000,000 stock price) but may miss subtler issues (e.g., high < low, negative volume). In testing, a 14B model caught 1 of 3 intentional violations, while cloud models (Claude Sonnet) caught all three. For production data quality where accuracy matters, use a cloud provider or a 70B+ local model. For development and testing with small files, 14B models provide fast iteration (2–7 seconds per validation) at acceptable accuracy.

#### Full-file mode and file size limits
The `aiRule` feature sends the entire file to the model in a single call. The maximum file size depends on the model's context window:
| Model Context | Max File Size (approx) | Typical Rows |
|---|---|---|
| 32K tokens | ~50 KB | ~1,000 rows |
| 128K tokens | ~400 KB | ~8,000 rows |
| 200K tokens (Claude) | ~600 KB | ~12,000 rows |
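The figures above follow from a rough rule of thumb of about four characters per token, minus headroom for the prompt and response. A back-of-the-envelope check (the ratios are assumptions, not measurements):

```shell
context_tokens=128000   # model context window
chars_per_token=4       # rough average for CSV text
prompt_overhead=25      # reserve ~25% of the window for prompt + response

raw_bytes=$(( context_tokens * chars_per_token ))
usable_bytes=$(( raw_bytes * (100 - prompt_overhead) / 100 ))
echo "$usable_bytes"    # ~384000 bytes, in line with the ~400 KB figure
```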
#### Running Ollama on a cloud instance
For production workloads, consider running Ollama on a GPU-enabled AWS EC2 Spot Instance. This gives you the accuracy of a 70B+ model with no per-token API costs or rate limits, at a fraction of on-demand pricing. Recommended instance: `g5.12xlarge` (4x NVIDIA A10G, 96 GB VRAM) — comfortably runs 70B models. On-demand ~$1.70/hr.
Setup with a pre-baked AMI (recommended):
- Launch a `g5.12xlarge` on-demand instance
- Install Ollama and pull the model
- Create an AMI from this instance
- Terminate the on-demand instance
- Launch Spot Instances from the AMI — they boot with Ollama and the model pre-installed, ready to serve in ~1 minute with no download required
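The install-and-pull step might look like this on the instance (this is Ollama's official Linux install script; the model tag is a suggestion from the table above):

```shell
# Install Ollama via the official Linux install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull a 70B model so it gets baked into the AMI
ollama pull llama3.3:70b
```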
Open port 11434 in the instance's security group, then point the pipeline at it:
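One way to do that, assuming the same three-field Vault secret described above (replace the placeholder with your instance's IP or DNS name, and adjust the secret path):

```shell
vault kv put secret/ai-ollama \
  endpoint="http://<instance-address>:11434" \
  model="llama3.3:70b" \
  apiKey=""
```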
GPU acceleration: Ollama automatically uses available GPU hardware (NVIDIA CUDA, AMD ROCm, Apple Silicon Metal). For local development on macOS, always run Ollama natively on the host — Docker Desktop does not expose Apple Silicon GPUs to containers.
## Automating with `.env`
To avoid setting Vault secrets manually after every `docker-compose up --build`, create a `.env` file at the project root (this file is gitignored):
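A sketch of such a file — `OLLAMA_MODEL` is documented earlier in this guide, but the API-key variable names are assumptions, so check `docker/vault-init.sh` for the names it actually reads:

```
# .env — read by docker-compose at startup (gitignored)
OLLAMA_MODEL=qwen2.5:14b-instruct

# Assumed variable names — verify against docker/vault-init.sh
ANTHROPIC_API_KEY=<your-key>
OPENAI_API_KEY=<your-key>
```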
Docker Compose reads `.env` automatically and passes these values to `vault-init`, which seeds the secrets on every startup.
## Verifying the Secret
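One way to confirm the secret was seeded, assuming the Vault CLI is available and the secret path used in the examples above (substitute the path matching your `aiSecretName`):

```shell
# Read back the secret and confirm endpoint, model, and apiKey are present
vault kv get secret/ai-ollama
```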
## Disabling AI
Set `ai.enabled: "false"` in `application.yaml`. Schema generation will return an error, and AI data quality rules will be rejected at pipeline registration time.