- Schema generation — upload a file to `POST /api/v1/pipeline/generate` and receive a complete pipeline configuration with inferred field names and types
- Data quality rules — describe validation logic in plain English using `aiRule` instead of writing code
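For instance, a rule can be stated in plain English rather than code. The fragment below is a sketch only — the surrounding keys are assumptions about the pipeline configuration format; only the `aiRule` field itself comes from this guide:

```yaml
# Illustrative pipeline fragment — surrounding structure assumed
validation:
  - aiRule: "closing price must be positive and the high must not be below the low"
```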
## Choosing a Provider
The pipeline supports cloud providers (Anthropic, OpenAI) and local models via Ollama. Cloud providers are recommended for production use: they offer larger context windows (128K–200K tokens), higher accuracy, and stronger domain knowledge — all of which matter for AI data quality rules. The `aiRule` feature sends the entire file to the model in a single call, so a larger context window means larger files can be validated without falling back to batching.
| Provider | Context Window | Accuracy | API Key | Best For |
|---|---|---|---|---|
| Anthropic (Claude) | 200K tokens | Excellent | Required | Production, large files |
| OpenAI (GPT-4o) | 128K tokens | Excellent | Required | Production, large files |
| Ollama (local) | 32K–128K tokens | Good (varies by model) | No | Development, testing, no external dependencies |
## Supported Providers
| Provider | Value | API Key Required |
|---|---|---|
| Anthropic (Claude) | `anthropic` | Yes |
| OpenAI (GPT) | `openai` | Yes |
| Ollama (local) | `ollama` | No |
## `application.yaml`

Enable AI and set the provider in `application.yaml`. Set `aiSecretName` to match the Vault secret name for your chosen provider.
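A sketch of what that might look like — the exact nesting is an assumption (only `ai.enabled` is confirmed elsewhere in this guide), so verify against your project's actual configuration:

```yaml
ai:
  enabled: "true"             # quoted string, as in the Disabling AI section
  provider: anthropic         # anthropic | openai | ollama
  aiSecretName: ai-anthropic  # assumed name — must match the Vault secret
```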
## Vault Secret

The endpoint and model are stored in Vault. The secret must contain three fields:

| Field | Description |
|---|---|
| `endpoint` | API endpoint URL |
| `model` | Model identifier |
| `apiKey` | API key (empty string `""` for Ollama) |
### Anthropic (recommended)

`application.yaml`:
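A sketch under the field names described earlier (the secret name is an assumption — match your Vault setup):

```yaml
ai:
  enabled: "true"
  provider: anthropic
  aiSecretName: ai-anthropic  # assumed — match the Vault secret name
```

The matching Vault secret would then hold Anthropic's API endpoint (the public base URL is `https://api.anthropic.com`), a Claude model identifier, and your API key.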
### OpenAI

`application.yaml`:
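A sketch under the same assumed field names (secret name illustrative):

```yaml
ai:
  enabled: "true"
  provider: openai
  aiSecretName: ai-openai  # assumed — match the Vault secret name
```

The matching Vault secret would then hold OpenAI's API endpoint (the public base URL is `https://api.openai.com`), a GPT model identifier, and your API key.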
### Ollama (local model)
Ollama runs models locally on the host machine and exposes an OpenAI-compatible API. No API key is required. Install Ollama on your machine from ollama.com/download, then pull a model. Because the pipeline runs in Docker, it reaches the host's Ollama instance at `host.docker.internal`. The Vault secret is seeded automatically by `docker/vault-init.sh`. To set it manually:
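A sketch of the manual seeding command, assuming the secret lives at `secret/ai-ollama` (adjust the path and model tag to your setup):

```shell
# Seed the three required fields; apiKey is an empty string for Ollama
vault kv put secret/ai-ollama \
  endpoint="http://host.docker.internal:11434" \
  model="qwen2.5:14b-instruct" \
  apiKey=""
```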
The default model is `qwen2.5:14b-instruct`. To use a different model, set `OLLAMA_MODEL` in your `.env` file and pull it locally:
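For example, switching to a lighter model (the tag is a suggestion from the table of recommended models below):

```shell
# Record the choice for docker-compose / vault-init
echo "OLLAMA_MODEL=llama3.1:8b" >> .env

# Download the model into the local Ollama store
ollama pull llama3.1:8b
```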
Then re-run `docker-compose up --build` to update the Vault secret.
#### Recommended local models
Choose a model based on your available RAM. Larger models are significantly more accurate — 70B+ models approach cloud-level quality for data validation tasks. Models with 128K context windows work best with the `aiRule` full-file validation mode.
| Model | Size | Context | RAM Needed | Accuracy | Best For |
|---|---|---|---|---|---|
| `llama3.1:8b` | 8B | 128K | ~6 GB | Good | Lightweight dev/testing |
| `qwen2.5:14b-instruct` | 14B | 128K | ~10 GB | Good | Default for 16–24 GB machines |
| `qwen2.5:32b-instruct` | 32B | 128K | ~20 GB | Very good | 32 GB machines |
| `llama3.3:70b` | 70B | 128K | ~42 GB | Excellent | 64+ GB machines, near cloud quality |
| `qwen2.5:72b-instruct` | 72B | 128K | ~45 GB | Excellent | 64+ GB machines, near cloud quality |
#### Local model accuracy
Local models work well for obvious violations (e.g., a $5,000,000 stock price) but may miss subtler issues (e.g., high < low, negative volume). In testing, a 14B model caught 1 of 3 intentional violations, while cloud models (Claude Sonnet) caught all three. For production data quality where accuracy matters, use a cloud provider or a 70B+ local model. For development and testing with small files, 14B models provide fast iteration (2–7 seconds per validation) at acceptable accuracy.

#### Full-file mode and file size limits
The `aiRule` feature sends the entire file to the model in a single call. The maximum file size depends on the model's context window:
| Model Context | Max File Size (approx) | Typical Rows |
|---|---|---|
| 32K tokens | ~50 KB | ~1,000 rows |
| 128K tokens | ~400 KB | ~8,000 rows |
| 200K tokens (Claude) | ~600 KB | ~12,000 rows |
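The figures above follow from a rough rule of thumb of about four characters per token, minus headroom for the prompt and response. A back-of-the-envelope check (the ratios are assumptions, not measurements):

```shell
context_tokens=128000   # model context window
chars_per_token=4       # rough average for CSV text
prompt_overhead=25      # reserve ~25% of the window for prompt + response

raw_bytes=$(( context_tokens * chars_per_token ))
usable_bytes=$(( raw_bytes * (100 - prompt_overhead) / 100 ))
echo "$usable_bytes"    # ~384000 bytes, in line with the ~400 KB figure
```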
#### Running Ollama on a cloud instance
For production workloads, consider running Ollama on a GPU-enabled AWS EC2 Spot Instance. This gives you the accuracy of a 70B+ model with no per-token API costs or rate limits, at a fraction of on-demand pricing. Recommended instance: `g5.12xlarge` (4x NVIDIA A10G, 96 GB VRAM) — comfortably runs 70B models. On-demand ~$1.70/hr.
Setup with a pre-baked AMI (recommended):
- Launch a `g5.12xlarge` on-demand instance
- Install Ollama and pull the model
- Create an AMI from this instance
- Terminate the on-demand instance
- Launch Spot Instances from the AMI — they boot with Ollama and the model pre-installed, ready to serve in ~1 minute with no download required
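The install-and-pull step might look like this on the instance (this is Ollama's official Linux install script; the model tag is a suggestion from the table above):

```shell
# Install Ollama via the official Linux install script
curl -fsSL https://ollama.com/install.sh | sh

# Pull a 70B model so it gets baked into the AMI
ollama pull llama3.3:70b
```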
Open port 11434 in the instance's security group, then point the pipeline at it:
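One way to do that, assuming the same three-field Vault secret described above (replace the placeholder with your instance's IP or DNS name, and adjust the secret path):

```shell
vault kv put secret/ai-ollama \
  endpoint="http://<instance-address>:11434" \
  model="llama3.3:70b" \
  apiKey=""
```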
GPU acceleration: Ollama automatically uses available GPU hardware (NVIDIA CUDA, AMD ROCm, Apple Silicon Metal). For local development on macOS, always run Ollama natively on the host — Docker Desktop does not expose Apple Silicon GPUs to containers.
## Automating with `.env`
To avoid setting Vault secrets manually after every `docker-compose up --build`, create a `.env` file at the project root (this file is gitignored):
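A sketch of such a file — `OLLAMA_MODEL` is documented earlier in this guide, but the API-key variable names are assumptions, so check `docker/vault-init.sh` for the names it actually reads:

```
# .env — read by docker-compose at startup (gitignored)
OLLAMA_MODEL=qwen2.5:14b-instruct

# Assumed variable names — verify against docker/vault-init.sh
ANTHROPIC_API_KEY=<your-key>
OPENAI_API_KEY=<your-key>
```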
Docker Compose reads `.env` automatically and passes these values to `vault-init`, which seeds the secrets on every startup.
## Verifying the Secret
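One way to confirm the secret was seeded, assuming the Vault CLI is available and the secret path used in the examples above (substitute the path matching your `aiSecretName`):

```shell
# Read back the secret and confirm endpoint, model, and apiKey are present
vault kv get secret/ai-ollama
```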
## Disabling AI
Set `ai.enabled: "false"` in `application.yaml`. Schema generation will return an error, and AI data quality rules will be rejected at pipeline registration time.