Defining Schemas
Schemas are declared in the `source.schemaProperties.fields` array of a pipeline configuration. Each field entry specifies a name and a data type.
Supported Data Types
| Type | Description |
|---|---|
| boolean | True/false value |
| int | 32-bit signed integer |
| tinyint | 8-bit signed integer |
| smallint | 16-bit signed integer |
| bigint | 64-bit signed integer |
| float | 32-bit floating point |
| double | 64-bit floating point |
| decimal(p,s) | Fixed-precision decimal with p total digits and s scale digits |
| string | Variable-length text, no upper bound |
| varchar(n) | Variable-length text with maximum length n |
| char(n) | Fixed-length text of exactly n characters |
| date | Calendar date (no time component) |
| timestamp | Date and time with microsecond precision |
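A minimal sketch of a source schema using several of the types above. The surrounding structure follows the `source.schemaProperties.fields` path described earlier; the field names and values are illustrative, not taken from any official example.

```python
# Hypothetical pipeline configuration fragment declaring a source schema.
# Field names ("order_id", "amount", ...) are invented for illustration.
pipeline_config = {
    "source": {
        "schemaProperties": {
            "fields": [
                {"name": "order_id", "type": "bigint"},
                {"name": "customer_code", "type": "char(8)"},
                {"name": "amount", "type": "decimal(10,2)"},
                {"name": "is_priority", "type": "boolean"},
                {"name": "placed_at", "type": "timestamp"},
            ]
        }
    }
}
```

Each entry pairs a `name` with one of the supported data types; parameterized types such as `decimal(p,s)` and `char(n)` carry their arguments inline in the type string.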
Auto-Generating a Schema
If you have a representative CSV file, the pipeline can infer a schema automatically using AI. POST the file to the `/api/v1/pipeline/generate` endpoint.
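A hedged sketch of calling the generation endpoint from Python. The host, the upload format (raw CSV body vs. multipart), and the response shape are all assumptions; check your deployment's API reference before relying on them.

```python
# Sketch: POST a representative CSV to the schema-generation endpoint.
# BASE_URL, the Content-Type, and the raw-body upload are assumptions.
import json
import urllib.request

BASE_URL = "https://pipeline.example.com"  # assumed host

def generate_schema(csv_path: str) -> dict:
    """Upload a sample CSV and return the inferred schema as a dict."""
    with open(csv_path, "rb") as f:
        body = f.read()
    req = urllib.request.Request(
        f"{BASE_URL}/api/v1/pipeline/generate",
        data=body,
        headers={"Content-Type": "text/csv"},  # assumed content type
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The returned schema can then be pasted into `source.schemaProperties.fields`, with any misinferred types corrected by hand.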
AI-Generated Validation Schemas
For JSON and XML pipelines, you can also generate validation schemas using AI:

- JSON Schema (Draft 4): for validating JSON data against an Everit-compatible schema
- W3C XSD: for validating XML data against an XML Schema

The two supported schema types are json-schema (generates a Draft 4 JSON Schema) and xsd (generates a W3C XSD). The generated schema is stored at `{environment}-config/validation-schema/{name}.json` or `{name}.xsd`.
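The storage convention above can be expressed as a small helper. The function name is hypothetical; only the path layout and the type-to-extension mapping come from the text.

```python
# Hypothetical helper mirroring the documented storage convention:
# {environment}-config/validation-schema/{name}.json or .xsd
def validation_schema_path(environment: str, name: str, schema_type: str) -> str:
    """Return the storage path for a generated validation schema."""
    # "json-schema" produces a .json file; "xsd" produces a .xsd file.
    ext = "json" if schema_type == "json-schema" else "xsd"
    return f"{environment}-config/validation-schema/{name}.{ext}"
```

For example, a json-schema named `orders` in a `prod` environment would land at `prod-config/validation-schema/orders.json`.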
Source vs Destination Schemas
A source schema describes the data as it arrives (CSV columns, JSON keys, database columns). It is always required. A destination schema describes the data as it should be written to the target system. It is optional; when omitted, the destination inherits the source schema unchanged. Use a destination schema when you need to:

- Widen a type (e.g., int to bigint) for the target table.
- Rename a field between ingestion and storage.
- Drop fields that should not reach the destination.
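The three cases above can be shown side by side. This is an illustrative source/destination pair; the field names and the exact config keys around these arrays are assumptions.

```python
# Hypothetical source and destination field arrays demonstrating
# widening, renaming, and dropping between ingestion and storage.
source_fields = [
    {"name": "user_id", "type": "int"},
    {"name": "ts", "type": "timestamp"},
    {"name": "debug_flag", "type": "boolean"},
]

destination_fields = [
    {"name": "user_id", "type": "bigint"},        # widened: int -> bigint
    {"name": "event_time", "type": "timestamp"},  # renamed from "ts"
    # "debug_flag" omitted: it never reaches the destination
]
```

If `destination_fields` were left out entirely, the target would receive `source_fields` unchanged.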