Prerequisites

  • Git (to clone the repository)
  • Docker with the Compose plugin (the quick start uses docker compose)

Quick Start

1. Clone the repository

```bash
git clone https://github.com/datris/datris-platform-oss.git
cd datris-platform-oss
```

2. Set your API keys

```bash
cp .env.example .env
```

Edit .env and add your API keys (at least one is required for AI features):

```
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-proj-...
```

3. Start all services

```bash
docker compose up -d
```

Docker Compose pulls the pre-built images from Docker Hub and starts the full stack. On first run, vault-init automatically seeds your API keys into Vault.

4. Verify

```bash
curl http://localhost:8080/api/v1/version
```

That’s it. The platform is running.
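If you want to script the verification, a readiness poll is more robust than a single request, since containers may still be starting. A minimal sketch in Python (standard library only; the endpoint path is the one above, while the timeout and retry counts are arbitrary choices, not platform defaults):

```python
import time
import urllib.request
import urllib.error

def wait_for_api(url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Return True once `url` answers with HTTP 200, False after `attempts` tries."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # connection refused / not ready yet
        time.sleep(delay)
    return False

if __name__ == "__main__":
    ok = wait_for_api("http://localhost:8080/api/v1/version")
    print("platform up" if ok else "gave up waiting")
```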

Upgrading

If you already have Datris installed and want to upgrade to the latest version:
```bash
cd datris-platform-oss
git pull origin main
docker compose pull
docker compose up -d
```

This pulls the latest pre-built images from Docker Hub and restarts the services. No build tools are required.
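The same three commands can be wrapped in a small script. An illustrative sketch (the step list mirrors the commands above; the injectable runner is only there so the sequence can be dry-run):

```python
import subprocess

# The steps mirror the documented upgrade commands, in order.
UPGRADE_STEPS = [
    ["git", "pull", "origin", "main"],
    ["docker", "compose", "pull"],
    ["docker", "compose", "up", "-d"],
]

def upgrade(runner=subprocess.run):
    """Run each upgrade step, stopping on the first failure."""
    for cmd in UPGRADE_STEPS:
        runner(cmd, check=True)

if __name__ == "__main__":
    upgrade()
```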

Services

| Service | Port | Purpose |
| --- | --- | --- |
| Pipeline Server | 8080 | REST API and data processing |
| Pipeline UI | 4200 | Web dashboard |
| MCP Server | 3000 | AI agent integration (MCP protocol) |
| MinIO | 9000 (API), 9001 (Console) | Object storage |
| MongoDB | 27017 | Configuration and status store |
| ActiveMQ | 61616 (broker), 8161 (console) | Message queue and notifications |
| Vault | 8200 | Secrets management |
| Kafka | 9092 | Streaming (optional) |
| Kafka UI | 8085 | Kafka topic browser |
| PostgreSQL | 5432 | Database destination + pgvector |
| Zookeeper | 2181 | Kafka coordination |
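To see which of the ports above are accepting connections, a quick TCP probe is enough. A hedged sketch (the port list is copied from the table; "localhost" assumes the default Compose port mappings):

```python
import socket

# Name -> port, taken from the services table above.
SERVICES = {
    "Pipeline Server": 8080,
    "Pipeline UI": 4200,
    "MCP Server": 3000,
    "MinIO (API)": 9000,
    "MinIO (Console)": 9001,
    "MongoDB": 27017,
    "ActiveMQ (broker)": 61616,
    "ActiveMQ (console)": 8161,
    "Vault": 8200,
    "Kafka": 9092,
    "Kafka UI": 8085,
    "PostgreSQL": 5432,
    "Zookeeper": 2181,
}

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if port_open("localhost", port) else "down"
        print(f"{name:20} :{port:<5} {status}")
```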

Web UIs

| UI | URL | Credentials |
| --- | --- | --- |
| Pipeline UI | http://localhost:4200 | none |
| Pipeline API | http://localhost:8080 | none |
| MCP Server (SSE) | http://localhost:3000/sse | none |
| MinIO Console | http://localhost:9001 | minioadmin / minioadmin |
| ActiveMQ Console | http://localhost:8161 | admin / admin |
| Kafka UI | http://localhost:8085 | none |
| Vault UI | http://localhost:8200 | Token: root-token |

API Keys and AI Providers

Datris supports three AI providers. Set your keys in .env:

| Provider | Environment Variable | Used For |
| --- | --- | --- |
| Anthropic Claude | ANTHROPIC_API_KEY | AI data quality, transformations, error explanation, schema generation, profiling |
| OpenAI | OPENAI_API_KEY | Same as above, plus embeddings for vector database / RAG |
| Ollama (local) | OLLAMA_MODEL | Same as above; no API key needed, runs locally |

At least one AI provider key is required for AI features. The embedding provider for RAG defaults to OpenAI but can be changed via EMBEDDING_PROVIDER in .env.
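The "at least one provider" rule is easy to check before starting a pipeline. A minimal sketch (the environment variable names come from the table above; the helper itself is illustrative, not part of the platform):

```python
import os

# Provider -> env var that enables it, per the table above.
# OLLAMA_MODEL is a local model name, not a secret key.
PROVIDER_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "ollama": "OLLAMA_MODEL",
}

def available_providers(env=os.environ):
    """Providers whose env var is set to a non-empty value."""
    return [name for name, var in PROVIDER_KEYS.items() if env.get(var)]

def require_ai_provider(env=os.environ):
    """Raise if no provider is configured; otherwise return the list."""
    providers = available_providers(env)
    if not providers:
        raise RuntimeError("Set at least one AI provider key in .env")
    return providers
```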

Infrastructure Details

MinIO

The minio-init container automatically creates the required buckets:
  • oss-raw - File upload staging
  • oss-raw-plus - Processed file staging
  • oss-temp - Temporary processing files
  • oss-data - Pipeline output (object store destination)
  • oss-config - Configuration files (validation schemas)
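When writing client code against the object store, it helps to keep the bucket roles in one place. A sketch (the bucket names are the ones minio-init creates, per the list above; the stage labels are hypothetical shorthand, not platform terminology):

```python
# Stage label (hypothetical shorthand) -> bucket created by minio-init.
BUCKETS = {
    "upload": "oss-raw",        # file upload staging
    "processed": "oss-raw-plus",  # processed file staging
    "temp": "oss-temp",         # temporary processing files
    "output": "oss-data",       # pipeline output (object store destination)
    "config": "oss-config",     # configuration files (validation schemas)
}

def bucket_for(stage: str) -> str:
    """Look up the bucket for a stage, with a helpful error for typos."""
    try:
        return BUCKETS[stage]
    except KeyError:
        raise ValueError(f"unknown stage {stage!r}; expected one of {sorted(BUCKETS)}")
```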

Vault

The vault-init container seeds Vault with default secrets for all services (MinIO, ActiveMQ, MongoDB, PostgreSQL, Kafka) plus your AI provider API keys from .env. Vault runs in dev mode with root token root-token.
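You can confirm what vault-init seeded by reading a secret back over Vault's HTTP API with the dev root token. A hedged sketch (the X-Vault-Token header and the double-nested KV v2 response shape are standard Vault behavior; the example path secret/data/minio is a guess at how vault-init lays out secrets, so adjust it to your actual paths):

```python
import json
import urllib.request

def read_secret(path: str, token: str = "root-token",
                addr: str = "http://localhost:8200"):
    """Read a KV v2 secret from Vault and return its key/value payload."""
    req = urllib.request.Request(
        f"{addr}/v1/{path}",
        headers={"X-Vault-Token": token},  # Vault's standard auth header
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["data"]["data"]  # KV v2 nests the payload twice

if __name__ == "__main__":
    print(read_secret("secret/data/minio"))  # path is an assumption
```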

Vector Databases

pgvector is included by default via the PostgreSQL service. To add other vector databases, uncomment the relevant sections in docker-compose.yml:
  • Qdrant — high-performance vector database
  • Weaviate — open-source vector database
  • Chroma — lightweight, single container
  • Milvus — scalable vector database (requires separate setup)

Configuration

The pipeline server reads configuration from application.yaml, mounted from docker/config/application.yaml. See Configuration Reference for the full list of properties.

Building from Source

For development or contributing:

Prerequisites

  • JDK and sbt (the server JAR is built with sbt clean assembly)
  • Docker with the Compose plugin

Build and run

```bash
# Build the server JAR
sbt clean assembly

# Start with local builds (edit docker-compose.yml to uncomment build: lines)
docker compose up --build
```

In docker-compose.yml, uncomment the build: lines and comment out the image: lines for the services you want to build locally:

```yaml
datris:
  # image: datris/datris-server:latest
  build: .  # Build from source
```