Assistant - Datris

The Assistant tab is an in-product agent that wires up taps and pipelines for you. You describe what data you want; the agent finds a source, generates the tap script, creates the pipeline, links them, and shows you the result — all in one chat. It uses the same MCP tools that external agents (Claude Desktop, Cursor) see when they connect to your Datris MCP server. There is one canonical workflow, shared between in-product and external agents.

What you can ask

The Assistant is open-ended. Some examples:

“Find a source of US public-company SEC filings and build a tap and pipeline.”
“Pull current weather data from a public API and create a pipeline for it.”
“Ingest the most recent NYC taxi trip data into a pipeline.”
“List the taps that are currently scheduled and tell me which haven’t run in over a week.”
“Show me the rows in the weather pipeline.” (works for MongoDB, PostgreSQL, Object Store (Parquet / ORC), or vector destinations — the agent picks the right query tool for the destination)

The agent will reach for tools as it needs them: web search to find a source, list_taps / list_pipelines to check for existing work, create_tap_secret when credentials are needed, create_tap, test_tap, create_pipeline, and so on.

Drop a file to build a pipeline

You can drag a file straight into the chat — or attach one with the file picker — and the Assistant will build a pipeline around it for you. Drop a CSV, JSON, XML, or document file onto the chat window (or click the attach control and pick it), then describe what you want done with it. When you send a message with a file attached, the agent:

Reads a sample of the file — a text extract from the attached content — so it can see the columns and data shape.
Proposes a destination, defaulting sensibly for the data and naming the alternatives (Postgres, MongoDB, an object store, or a vector store) so you can pick a different one.
Confirms before building — it tells you what it’s about to create and waits for your go-ahead.
Asks about an upsert key when relevant, so repeated loads update rows in place instead of duplicating them.
Creates the pipeline, loads the data, and reports the result — including how many rows landed in the destination.

This is the fastest path from “here’s a file” to a working pipeline: no wizard, no manual schema entry — just drop it in and confirm.
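
To make the upsert-key step concrete, here is a minimal sketch of what "update rows in place instead of duplicating them" means. It is an illustration only: the in-memory `table`, the `load_rows` helper, and the sample rows are hypothetical and are not part of the Datris API.

```python
from typing import Any

Row = dict[str, Any]


def load_rows(table: dict[Any, Row], rows: list[Row], upsert_key: str) -> int:
    """Insert or update rows in an in-memory table keyed by upsert_key.

    Returns the number of rows in the table after the load.
    """
    for row in rows:
        # An existing key is overwritten (update in place);
        # a new key adds a row (insert).
        table[row[upsert_key]] = row
    return len(table)


# Hypothetical data: two loads of the same file shape, overlapping on id=2.
table: dict[Any, Row] = {}
first_load = [
    {"id": 1, "city": "Austin", "temp_f": 91},
    {"id": 2, "city": "Boise", "temp_f": 78},
]
second_load = [
    {"id": 2, "city": "Boise", "temp_f": 80},
    {"id": 3, "city": "Miami", "temp_f": 88},
]

load_rows(table, first_load, "id")
row_count = load_rows(table, second_load, "id")
```

With `id` as the upsert key, the second load leaves three rows rather than four, and the row for `id=2` carries the newer value. Without a key, the same two loads would append all four rows.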

What you’ll see in real time

Three live layers in every assistant turn:

💭 Thinking: the model’s chain-of-thought, streamed character by character. Anthropic tenants see the full reasoning; OpenAI tenants see a high-level summary instead. Click the block to expand.
Tool cards: shown inline within the response. Each card shows a human-friendly label (“🔍 Searching the web for SEC EDGAR API…”, “🛠 Creating tap sec-edgar…”) with a spinner while running, then a ✓ or ✕. Click to expand the raw input and result.
Streaming prose: the response text appears as the model writes it, with tool calls inlined where the model invoked them.

When the agent finishes, Open tap and Open pipeline links appear at the bottom of the turn so you can jump straight to anything it created.

Safety

The agent has access to every MCP tool, including destructive ones (delete_tap, delete_pipeline, delete_tap_secret, update_secret). It is instructed to never invoke a destructive tool without asking you first — if you say “delete the broken tap,” the agent will name what it’s about to delete and wait for your confirmation in the chat before proceeding. The Stop button at the bottom of the chat aborts the loop server-side at the next checkpoint, so you can interrupt anytime. Stop halts the in-flight model call within a fraction of a second — cancelled responses cost only what was already streamed.
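
The confirmation rule above is described as an instruction to the agent, not as a code-level check. Purely as an illustration of the policy, a hypothetical gate could look like the sketch below. The tool names come from the list above; the `may_invoke` helper and its `user_confirmed` flag are assumptions and not a Datris API.

```python
# Destructive tools named in this section.
DESTRUCTIVE_TOOLS = {
    "delete_tap",
    "delete_pipeline",
    "delete_tap_secret",
    "update_secret",
}


def may_invoke(tool_name: str, user_confirmed: bool) -> bool:
    """Return True if the tool call may proceed under the confirmation policy."""
    if tool_name in DESTRUCTIVE_TOOLS:
        # Destructive calls proceed only after an explicit go-ahead in chat.
        return user_confirmed
    # Non-destructive tools need no confirmation.
    return True
```

Under this sketch, `may_invoke("list_taps", False)` is allowed, while `may_invoke("delete_tap", False)` is blocked until you confirm.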

Platform-tab secrets are visible but read-only

When a pipeline destination needs operator-owned credentials (AWS S3, an external Postgres, a vector-store endpoint, etc.), the agent can discover secrets you’ve created on Configuration → Secrets → Platform and verify their field shape — but it cannot create, modify, or delete them. If nothing suitable exists, the agent will name the field shape the destination needs and ask you to add the secret in the Secrets tab, then give it the name to use. Secret values are never exposed to the agent.

Long-running runs are carried to completion

When the agent starts a run (run_tap or upload_data), it owns that work all the way to a terminal success / warning / error outcome. It will not kick off the job and tell you to check back later, since that defeats the point of delegating. The polling pattern it uses:

Poll get_pipeline_status immediately after the run is submitted (most small structured taps finish in 1–5 seconds; no point sleeping first).
If still in progress, sleep with wait_seconds using exponential backoff (5s → 10s → 20s → 30s → 60s → 60s …, capped at 60s normally, 120s only if things are genuinely glacial).
Reset to a short wait (5–15s) the moment a poll shows new jobs flipping to a terminal state — progress means it’s about to finish.
Emit a one-line progress update each cycle (“12 of 28 jobs done, 16 still processing — checking again in 30s”) so you can see the run advance.
After ~20 polls, if the run is still in progress, pause and ask whether to keep waiting, kill the run, or check later — never silently stop.

The same pattern applies to upload_data → get_job_status. If the chat seems to be “hanging” mid-run, it’s waiting between polls on purpose.
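
The polling steps above can be sketched roughly as follows. This is a simplified model, not the agent's actual implementation: the status check and the sleep are passed in as hypothetical callables (standing in for get_pipeline_status and wait_seconds, whose real signatures are not shown here), the reset after progress always uses 5s from the documented 5–15s range, and the 120s ceiling for very slow runs is left out.

```python
from typing import Callable

BACKOFF_STEPS = [5, 10, 20, 30, 60]  # seconds; the last value repeats (normal cap)
MAX_POLLS = 20  # after this many polls, hand the decision back to the user


def next_wait(step: int) -> int:
    """Wait for the given backoff step, capped at the last schedule entry."""
    return BACKOFF_STEPS[min(step, len(BACKOFF_STEPS) - 1)]


def poll_until_done(
    get_done_and_total: Callable[[], tuple[int, int]],
    sleep: Callable[[float], None],
    report: Callable[[str], None] = print,
) -> bool:
    """Return True on a terminal state, False if the poll budget runs out."""
    step = 0
    last_done = None
    for _ in range(MAX_POLLS):
        done, total = get_done_and_total()  # the first poll happens immediately
        if done >= total:
            report(f"{done} of {total} jobs done")
            return True
        if last_done is not None and done > last_done:
            step = 0  # progress seen: drop back to a short wait
        last_done = done
        wait = next_wait(step)
        report(
            f"{done} of {total} jobs done, {total - done} still processing; "
            f"checking again in {wait}s"
        )
        sleep(wait)
        step += 1
    return False  # caller should ask: keep waiting, kill the run, or check later


# Simulated run: no progress for three polls, then two jobs finish, then all four.
readings = iter([(0, 4), (0, 4), (0, 4), (2, 4), (2, 4), (4, 4)])
waits: list[float] = []
finished = poll_until_done(lambda: next(readings), waits.append, report=lambda msg: None)
```

In the simulated run, the recorded waits are 5, 10, 20, 5, 10: the schedule climbs while nothing changes and drops back to 5s as soon as a poll shows newly finished jobs.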

Reasoning visibility by provider

Provider	Streaming text	Tool cards	Reasoning block
Anthropic Claude 4.x	✅	✅	Full chain-of-thought (extended thinking)
OpenAI GPT-5	✅	✅	Reasoning summary
Ollama / Llama	✅	If the model supports tool calling	None

Set ai.extendedThinking: "false" in application.yaml to disable the reasoning block platform-wide. Default is on. Extended thinking adds roughly $0.05/turn and 2–8s of latency on Opus 4.x — cheap insurance for transparency on an agent that’s writing production-bound config.

Configuration

The Assistant uses your tenant’s codegen AI config (the same one that powers tap script generation, AI data quality, and AI transformations). On Anthropic tenants the default is Opus 4.x; on OpenAI tenants it’s GPT-5. There is no per-tenant Assistant configuration to manage beyond what you already have for codegen. See AI Configuration for the underlying AI primary / codegen / web search / embedding settings.

Authentication

The Assistant inherits the identity of the user running it. How that maps to permissions depends on which auth mode the platform is running in:

USE_USER_AUTH=true (recommended): the Assistant runs as your logged-in user. Audit log entries and capability checks both show session:<your username>. Permissions follow your role: admin gets full access, editor gets create/update/run on data resources, and viewer gets read-only. See User Authentication for the role-to-capability mapping.
USE_USER_AUTH=false + USE_API_KEYS=true: the Assistant uses whatever value you pasted into the Connect prompt (typically the seeded ui key). Same identity for everyone using the browser.
USE_API_KEYS=false: no auth on tool calls; the Assistant runs as the platform’s anonymous identity.
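
As a rough illustration of the three modes above, the sketch below resolves an identity label from the two flags. It assumes the modes are checked in the order listed (user auth first). Only the `session:<username>` form comes from this page; the `apikey:` prefix, the `anonymous` label, and the `resolve_identity` helper are hypothetical.

```python
def resolve_identity(
    use_user_auth: bool,
    use_api_keys: bool,
    username: str = "",
    api_key_name: str = "ui",
) -> str:
    """Return an identity label for Assistant tool calls (illustrative only)."""
    if use_user_auth:
        # The Assistant runs as the logged-in user.
        return f"session:{username}"
    if use_api_keys:
        # Shared identity: whatever key was pasted into the Connect prompt.
        return f"apikey:{api_key_name}"
    # No auth on tool calls.
    return "anonymous"
```

For example, `resolve_identity(True, True, username="dana")` yields `session:dana`, while `resolve_identity(False, True)` yields the shared `apikey:ui` label for everyone using that browser.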

The Assistant doesn’t have its own dedicated API key — its server-side MCP-bound tool calls use the same identity the chat endpoint was authenticated under. Want to give the Assistant narrower permissions than the human user has? That’s not exposed today; the workaround is to log in as a different user (an editor or viewer) when running the Assistant, so the role mapping applies. See API Keys for the full capability model.

Limits

50 iterations per turn. The agent loop is hard-capped to bound cost. If the agent runs out of iterations, it stops and asks you to send a follow-up message to continue.
No conversation history. v1 is session-only — refreshing the tab clears the conversation. Persistence is on the roadmap.
16K output tokens per LLM call. Sufficient for thinking + reasoning + tool selection on every public model we support.
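
The iteration cap can be pictured as a bounded loop. The sketch below is a simplified model under stated assumptions: `step` is a hypothetical callable standing in for one model-plus-tool iteration and returns True once the agent has a final answer. The real loop is not shown on this page.

```python
from typing import Callable

MAX_ITERATIONS = 50  # per-turn cap described above


def run_turn(step: Callable[[], bool], max_iterations: int = MAX_ITERATIONS) -> tuple[str, int]:
    """Run one turn; return (outcome, iterations used)."""
    for iteration in range(1, max_iterations + 1):
        if step():
            return ("done", iteration)
    # Budget exhausted: stop and ask the user for a follow-up message.
    return ("needs_follow_up", max_iterations)


# Hypothetical agent that finishes on its third iteration.
calls = {"n": 0}


def finishes_on_third() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


outcome = run_turn(finishes_on_third)
```

A turn that never produces a final answer returns `("needs_follow_up", 50)`, which corresponds to the agent stopping and asking you to continue.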

How it relates to other tabs

MCP tab: a static catalog of the tools the Assistant uses. Read this if you want to understand what each tool does.
Agent Monitor (pop-out): a read-only activity monitor of external MCP clients (Claude Desktop, Cursor, scripts) calling your server. It opens in its own browser window via the pop-out icon on the MCP tab so you can keep an eye on Connections and the Activity Log while you work elsewhere. The Assistant is the in-product counterpart: same tools, different surface.
Catalog tab: anything the Assistant creates (taps and pipelines) lands here, grouped by catalog. The “→ Open tap” / “→ Open pipeline” links at the end of each conversation route you there.
