API File Upload

Upload data files directly to the pipeline through the REST API. The pipeline accepts any file type (CSV, JSON, XML, Excel, PDF, Word, PowerPoint, HTML, email, EPUB, plain text, and archives), handles decompression automatically, and returns a tracking token.

Endpoint

POST /api/v1/pipeline/upload

Request

The request uses multipart/form-data encoding with the following parts:

Part	Required	Description
`file`	Yes	The data file to upload
`pipeline`	Yes	Name of the target pipeline configuration
`publishertoken`	No	Opaque token identifying the data publisher; used for lineage tracking

Example

curl -X POST "http://localhost:9000/api/v1/pipeline/upload" \
  -F "file=@transactions_2025.csv" \
  -F "pipeline=transactions" \
  -F "publishertoken=finance-team-a"
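
The same request can be built from a script. The following is a minimal sketch using only the Python standard library, not an official client. The helper names `build_multipart` and `build_upload_request` are hypothetical, and the `application/octet-stream` content type on the file part is an assumption, since the accepted content types are not specified here.

```python
import uuid
from urllib import request

# URL taken from the curl example; adjust host and port for your deployment.
UPLOAD_URL = "http://localhost:9000/api/v1/pipeline/upload"


def build_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"  # assumed content type
    )
    chunks.append(head.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_upload_request(file_name, file_bytes, pipeline,
                         publisher_token=None, boundary=None):
    """Build (but do not send) the POST request for an upload."""
    fields = {"pipeline": pipeline}
    if publisher_token is not None:
        fields["publishertoken"] = publisher_token  # optional part
    body, content_type = build_multipart(fields, file_name, file_bytes, boundary)
    return request.Request(
        UPLOAD_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the JSON body from the response.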

Response

A successful upload returns HTTP 200 with a JSON body containing a pipelineToken:

{
  "pipelineToken": "d4f8e1a2-7b3c-4e9f-a5d6-1c2b3e4f5a6b",
  "status": "ACCEPTED"
}

The pipelineToken is a unique UUID generated for each upload. Use it to track the job’s processing status, view errors, or cancel the job.
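
A client will usually want to extract and validate the token before storing it. The sketch below assumes the response body has exactly the shape shown above. The function name `parse_upload_response` is hypothetical, and no status values other than `ACCEPTED` are documented here, so the status is returned as-is rather than interpreted.

```python
import json
import uuid


def parse_upload_response(body: str) -> tuple[str, str]:
    """Return (pipelineToken, status) from an upload response body.

    Raises KeyError if a field is missing and ValueError if the
    token is not a well-formed UUID.
    """
    payload = json.loads(body)
    token = str(uuid.UUID(payload["pipelineToken"]))  # normalizes and validates
    return token, payload["status"]
```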

Compressed Files

When the uploaded file has one of the following archive or compression extensions, the pipeline stages it to the MinIO raw bucket before processing:

Extension	Handling
`.zip`	Staged to MinIO raw bucket, then extracted
`.gz`	Staged to MinIO raw bucket, then decompressed
`.tar`	Staged to MinIO raw bucket, then extracted
`.jar`	Staged to MinIO raw bucket, then extracted

Compressed archives may contain multiple data files. Each file inside the archive is processed as a separate unit against the same pipeline configuration.

Uncompressed Files

Files without a recognized compressed extension (e.g., .csv, .json, .xml, .xls, .pdf, .docx, .txt) are processed directly in memory. They are not staged to MinIO.
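
The split between the two paths depends only on the file extension. As a rough illustration (a sketch, not the pipeline's actual code), a client could predict which path an upload will take. The function name `is_compressed` is hypothetical, and case-insensitive matching is an assumption, since the matching rules are not stated here.

```python
# Extensions listed in the Compressed Files table.
COMPRESSED_EXTENSIONS = (".zip", ".gz", ".tar", ".jar")


def is_compressed(filename: str) -> bool:
    """Return True if the filename ends with a recognized compressed extension."""
    # Assumption: the check ignores case, so "DATA.ZIP" counts as compressed.
    return filename.lower().endswith(COMPRESSED_EXTENSIONS)
```

Under this sketch, `is_compressed("batch.tar.gz")` is true because the name ends in `.gz`, while `is_compressed("transactions_2025.csv")` is false.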

Processing Flow

1. The client sends the multipart request.
2. The pipeline inspects the file extension.
3. Compressed path: the file is written to the MinIO raw bucket under {env}-raw/temp/{pipeline}/{filename}. A background job picks it up, decompresses it, and feeds each inner file into the ingestion pipeline.
4. Uncompressed path: the file contents are read into memory and passed directly to the ingestion pipeline.
5. The pipeline returns the pipelineToken immediately. Processing continues asynchronously.
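
When troubleshooting a compressed upload, it can help to know where the staged object should appear. The sketch below derives the location from the path template in the compressed path step. It assumes the first path segment (`{env}-raw`) is the bucket name and the remainder is the object key. The function name `staging_location`, the example environment value `dev`, and the stripping of directory components from the filename are all illustrative assumptions.

```python
from pathlib import PurePosixPath


def staging_location(env: str, pipeline: str, filename: str) -> tuple[str, str]:
    """Return (bucket, object_key) for a staged compressed upload."""
    # Assumption: only the final path component of the filename is used.
    safe_name = PurePosixPath(filename).name
    return f"{env}-raw", f"temp/{pipeline}/{safe_name}"
```

For example, `staging_location("dev", "transactions", "batch.zip")` gives the bucket `dev-raw` and the key `temp/transactions/batch.zip`.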

Error Responses

HTTP Status	Cause
400	Missing `pipeline` parameter or empty file
413	File exceeds the configured upload size limit
500	Pipeline not found, internal error, or processing failure
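
A client can turn these status codes into readable messages for logs or alerts. This is a minimal sketch based only on the table above. The function name `explain_upload_status` is hypothetical, and the shape of any error response body is not documented here, so only the status code is inspected.

```python
# Causes copied from the Error Responses table.
UPLOAD_ERRORS = {
    400: "Missing `pipeline` parameter or empty file",
    413: "File exceeds the configured upload size limit",
    500: "Pipeline not found, internal error, or processing failure",
}


def explain_upload_status(status: int) -> str:
    """Map an HTTP status from the upload endpoint to a short explanation."""
    if status == 200:
        return "Upload accepted"
    return UPLOAD_ERRORS.get(status, f"Unexpected HTTP status {status}")
```

Because a 500 covers both an unknown pipeline name and an internal failure, checking the `pipeline` value is a reasonable first step before retrying.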

Size Limits

The maximum upload size is controlled by spring.servlet.multipart.max-file-size in application.yaml. The default is 1 GB. Adjust this value if your files exceed the limit:

spring:
  servlet:
    multipart:
      max-file-size: 1GB
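
Checking the file size on the client before uploading avoids sending a large body only to receive a 413. The sketch below parses a size string such as `1GB` the way Spring's `DataSize` does, with suffixes as powers of 1024. The function names `parse_data_size` and `within_upload_limit` are hypothetical, and treating a file exactly at the limit as acceptable is an assumption.

```python
import re

# Binary multipliers, matching Spring's DataSize units (1KB = 1024 bytes).
_UNITS = {"": 1, "B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_data_size(text: str) -> int:
    """Convert a size string such as "1GB" or "512MB" to a byte count."""
    match = re.fullmatch(r"(\d+)([A-Z]{0,2})", text.strip())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def within_upload_limit(size_bytes: int, limit: str = "1GB") -> bool:
    """Return True if a file of size_bytes fits under the configured limit."""
    # Assumption: a file exactly at the limit is accepted.
    return size_bytes <= parse_data_size(limit)
```

In practice, `size_bytes` would come from `os.path.getsize(path)`, and `limit` should mirror the `max-file-size` value in your application.yaml.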
