

Upload a File

Upload a data file for processing by a configured pipeline.
POST /api/v1/pipeline/upload
Content-Type: multipart/form-data
Parameters:
Parameter       Type       Required  Description
file            form-data  Yes       The file to upload
pipeline        form-data  Yes       Target pipeline name
publishertoken  form-data  No        Publisher identifier for tracking
Behavior:
  • Compressed files (.zip, .gz, .tar, .jar): Staged to the MinIO raw bucket for asynchronous processing
  • Uncompressed files: Processed immediately in memory
  • Schema evolution: If a CSV file contains new columns not in the pipeline schema, they are automatically added (as string type) and the destination table is altered. See Schema Evolution.
Example:
curl -X POST http://localhost:8080/api/v1/pipeline/upload \
  -F "file=@/path/to/data.csv" \
  -F "pipeline=sales_data" \
  -F "publishertoken=batch-001"
Response: 200 OK with the pipeline token (for uncompressed files):
pt-abc12345-6789-...
For compressed files, the response is 200 OK with no body. The file is processed asynchronously when the pipeline detects it in the raw bucket.
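The two response cases can be handled in a small client. A minimal sketch in Python, assuming the third-party requests library and the local base URL from the curl example above (this is not an official SDK):
import requests

BASE_URL = "http://localhost:8080/api/v1/pipeline"

def upload_file(path, pipeline, publisher_token=None):
    """Upload a file to a pipeline. Returns the pipeline token for
    uncompressed files, or None for compressed files handled asynchronously."""
    data = {"pipeline": pipeline}
    if publisher_token:
        data["publishertoken"] = publisher_token
    with open(path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/upload", files={"file": f}, data=data)
    resp.raise_for_status()
    # Compressed files (.zip, .gz, .tar, .jar) return 200 OK with an empty body;
    # uncompressed files return the pipeline token as plain text.
    return resp.text.strip() or None

token = upload_file("/path/to/data.csv", "sales_data", publisher_token="batch-001")
print(token)  # e.g. pt-abc12345-6789-...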

Generate Pipeline Schema

Upload a CSV file to automatically infer the schema and generate a partial pipeline configuration.
POST /api/v1/pipeline/generate
Content-Type: multipart/form-data
Parameters:
Parameter   Type       Required  Description
file        form-data  Yes       Data file to analyze (CSV, JSON, or XML)
pipeline    form-data  No        Pipeline name (auto-derived from the filename if omitted)
delimiter   form-data  No        CSV delimiter (default: ,)
header      form-data  No        Whether the file has a header row
allStrings  form-data  No        If true, all fields are typed as string (default: false)
Example:
curl -X POST http://localhost:8080/api/v1/pipeline/generate \
  -F "file=@/path/to/sample.csv" \
  -F "pipeline=my_pipeline"
Response: 200 OK with a partial PipelineConfig JSON:
{
  "name": "my_pipeline",
  "source": {
    "fileAttributes": {
      "csvAttributes": {
        "delimiter": ",",
        "header": true
      }
    },
    "schemaProperties": {
      "fields": [
        {"name": "id", "type": "int"},
        {"name": "name", "type": "string"},
        {"name": "amount", "type": "double"},
        {"name": "created_at", "type": "string"}
      ]
    }
  },
  "destination": {
    "database": {
      "dbName": "datris",
      "schema": "public",
      "table": "my_pipeline",
      "usePostgres": true
    }
  }
}
Inferred types: boolean, int, bigint, float, double, string, date, timestamp
Note: For CSV files, the AI analyzes the content and infers types. For JSON and XML files, a default config is generated with a single _json or _xml string field. When allStrings is true, all fields are set to string (used by the MCP server for reliable ingestion).
Edit the generated JSON to add your destination configuration before registering it with POST /pipeline.
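As a rough end-to-end sketch in Python (assuming the requests library; the registration call and JSON payload for POST /pipeline are assumptions based on the note above, so check the pipeline registration docs for the actual contract):
import requests

BASE_URL = "http://localhost:8080/api/v1"

# Generate a partial config from a sample CSV.
with open("/path/to/sample.csv", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/pipeline/generate",
        files={"file": f},
        data={"pipeline": "my_pipeline"},
    )
resp.raise_for_status()
config = resp.json()

# Fill in or adjust the destination before registering (example values only).
config["destination"]["database"]["table"] = "my_pipeline"
config["destination"]["database"]["schema"] = "public"

# Register the completed config; the exact path and payload are assumed here.
requests.post(f"{BASE_URL}/pipeline", json=config).raise_for_status()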

Profile Data

Upload a data file for AI-powered profiling to get summary statistics, quality issues, and suggested validation rules.
POST /api/v1/pipeline/profile
Content-Type: multipart/form-data
Parameters:
Parameter   Type       Required  Description
file        form-data  Yes       Data file to profile
delimiter   form-data  No        CSV delimiter (default: ,)
header      form-data  No        Whether the file has a header row (default: true)
sampleSize  form-data  No        Number of rows to sample (default: 200)
Example:
curl -X POST http://localhost:8080/api/v1/pipeline/profile \
  -F "file=@/path/to/data.csv" \
  -F "sampleSize=200"
Response: 200 OK with a JSON profile including summary statistics, quality issues, recommendations, and suggested data quality rules. See AI Data Profiling for the full response format.
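A minimal sketch in Python (assuming the requests library) that profiles a CSV and prints the returned JSON:
import json
import requests

url = "http://localhost:8080/api/v1/pipeline/profile"
with open("/path/to/data.csv", "rb") as f:
    resp = requests.post(url, files={"file": f}, data={"sampleSize": "200"})
resp.raise_for_status()

# The profile includes summary statistics, quality issues, recommendations,
# and suggested data quality rules (see AI Data Profiling for the full format).
print(json.dumps(resp.json(), indent=2))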