## Documentation Index

Fetch the complete documentation index at: https://docs.datris.ai/llms.txt

Use this file to discover all available pages before exploring further.
## Upload a File

Upload a data file for processing by a configured pipeline.

```
POST /api/v1/pipeline/upload
Content-Type: multipart/form-data
```
Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | form-data | Yes | The file to upload |
| pipeline | form-data | Yes | Target pipeline name |
| publishertoken | form-data | No | Publisher identifier for tracking |
Behavior:

- Compressed files (`.zip`, `.gz`, `.tar`, `.jar`): staged to the MinIO raw bucket for asynchronous processing
- Uncompressed files: processed immediately in-memory
- Schema evolution: if a CSV file contains new columns not in the pipeline schema, they are automatically added (as `string` type) and the destination table is altered. See Schema Evolution.
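The compressed-versus-uncompressed routing rule can be sketched as a small helper. This is an illustration of the documented behavior, not part of the API; the function name and constant are hypothetical:

```python
from pathlib import Path

# Extensions the upload endpoint treats as compressed archives
# (staged to the raw bucket rather than processed in-memory).
COMPRESSED_EXTENSIONS = {".zip", ".gz", ".tar", ".jar"}

def is_compressed(filename: str) -> bool:
    """Return True if the file would be staged for asynchronous processing."""
    return Path(filename).suffix.lower() in COMPRESSED_EXTENSIONS
```

Note that only the final suffix is checked here, so `data.tar.gz` matches on `.gz`.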
Example:

```bash
curl -X POST http://localhost:8080/api/v1/pipeline/upload \
  -F "file=@/path/to/data.csv" \
  -F "pipeline=sales_data" \
  -F "publishertoken=batch-001"
```
Response: 200 OK. For uncompressed files, the body contains the pipeline token. For compressed files, the response has no body; the file is processed asynchronously when the pipeline detects it in the raw bucket.
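A client can distinguish the two cases by checking whether the 200 response carries a body. A minimal sketch of that interpretation logic (the function name is hypothetical):

```python
from typing import Optional

def parse_upload_response(status_code: int, body: str) -> Optional[str]:
    """Interpret an upload response.

    Returns the pipeline token for uncompressed files, or None for
    compressed files, which are processed asynchronously.
    """
    if status_code != 200:
        raise RuntimeError(f"upload failed with HTTP {status_code}")
    token = body.strip()
    return token or None
```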
## Generate Pipeline Schema

Upload a CSV file to automatically infer the schema and generate a partial pipeline configuration.

```
POST /api/v1/pipeline/generate
Content-Type: multipart/form-data
```
Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | form-data | Yes | Data file to analyze (CSV, JSON, or XML) |
| pipeline | form-data | No | Pipeline name (auto-derived from filename if omitted) |
| delimiter | form-data | No | CSV delimiter (default: `,`) |
| header | form-data | No | Whether file has header row |
| allStrings | form-data | No | If true, all fields are typed as string (default: false) |
Example:

```bash
curl -X POST http://localhost:8080/api/v1/pipeline/generate \
  -F "file=@/path/to/sample.csv" \
  -F "pipeline=my_pipeline"
```
Response: 200 OK with a partial `PipelineConfig` JSON:

```json
{
  "name": "my_pipeline",
  "source": {
    "fileAttributes": {
      "csvAttributes": {
        "delimiter": ",",
        "header": true
      }
    },
    "schemaProperties": {
      "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "amount", "type": "double"},
        {"name": "created_at", "type": "string"}
      ]
    }
  },
  "destination": {
    "database": {
      "dbName": "datris",
      "schema": "public",
      "table": "my_pipeline",
      "usePostgres": true
    }
  }
}
```
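Before registering the config, it can be loaded and inspected programmatically. The snippet below parses an abbreviated copy of the example response above (the path into the JSON follows that example):

```python
import json

# Abbreviated copy of the example generate response.
generated = json.loads("""
{
  "name": "my_pipeline",
  "source": {
    "schemaProperties": {
      "fields": [
        {"name": "id", "type": "int"},
        {"name": "amount", "type": "double"}
      ]
    }
  }
}
""")

# Summarize the inferred fields before editing the destination section.
fields = generated["source"]["schemaProperties"]["fields"]
summary = {f["name"]: f["type"] for f in fields}
print(summary)  # {'id': 'int', 'amount': 'double'}
```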
Inferred types: `boolean`, `int`, `bigint`, `float`, `double`, `string`, `date`, `timestamp`
Note: For CSV files, the AI analyzes the content and infers types. For JSON and XML files, a default config is generated with a single `_json` or `_xml` string field. When `allStrings` is true, all fields are typed as `string` (used by the MCP server for reliable ingestion). Edit the generated JSON to add your destination configuration before registering it with `POST /pipeline`.
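The idea behind type inference can be illustrated with a simplified pass over sample values. This is a sketch, not the service's actual algorithm, and it covers only a subset of the listed types:

```python
def infer_type(values: list[str]) -> str:
    """Pick the narrowest type that fits every sample value."""
    def fits(parse) -> bool:
        try:
            for v in values:
                parse(v)
            return True
        except ValueError:
            return False

    if values and all(v.lower() in ("true", "false") for v in values):
        return "boolean"
    if fits(int):
        return "int"
    if fits(float):
        return "double"
    return "string"

print(infer_type(["1", "2", "3"]))    # int
print(infer_type(["1.5", "2"]))      # double
print(infer_type(["true", "false"])) # boolean
```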
## Profile Data

Upload a data file for AI-powered profiling: summary statistics, quality issues, and suggested validation rules.

```
POST /api/v1/pipeline/profile
Content-Type: multipart/form-data
```
Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| file | form-data | Yes | Data file to profile |
| delimiter | form-data | No | CSV delimiter (default: `,`) |
| header | form-data | No | Whether file has header row (default: true) |
| sampleSize | form-data | No | Number of rows to sample (default: 200) |
Example:

```bash
curl -X POST http://localhost:8080/api/v1/pipeline/profile \
  -F "file=@/path/to/data.csv" \
  -F "sampleSize=200"
```
Response: 200 OK with a JSON profile including summary statistics, quality issues, recommendations, and suggested data quality rules. See AI Data Profiling for the full response format.
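The kind of summary statistics a profile contains can be illustrated with a small local sketch. The per-column fields below (`count`, `nulls`, `distinct`) are illustrative, not the endpoint's actual response shape; see AI Data Profiling for the real format:

```python
import csv
import io

def profile_sample(csv_text: str, sample_size: int = 200) -> dict:
    """Compute basic per-column statistics over the first sample_size rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))[:sample_size]
    profile = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        non_empty = [v for v in values if v != ""]
        profile[col] = {
            "count": len(values),
            "nulls": len(values) - len(non_empty),
            "distinct": len(set(non_empty)),
        }
    return profile

sample = "id,name\n1,alice\n2,\n3,alice\n"
print(profile_sample(sample))
```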