Header validation uses an AI model to check that the CSV file header matches the expected field names defined in the pipeline schema. The validation is fuzzy: it tolerates case differences, underscores versus spaces, abbreviations, and minor naming variations. Column order does not matter. All schema columns must be present; extra columns in the file are accepted.
Requirements
- The source file must be CSV format.
- The pipeline must be configured with header: true (i.e., the first row contains column names).
- ai.enabled: true must be set in application.yaml.
Configuration
Enable header validation by setting validateFileHeader to true in the dataQuality block.
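The requirements and the setting above can be summarized as a pre-flight check. The following is only a sketch: `unmet_requirements` is a hypothetical helper, not part of the product, and it assumes the relevant values have already been read from the pipeline configuration and application.yaml into plain Python values. Only the setting names (header, ai.enabled, dataQuality, validateFileHeader) come from this page.

```python
def unmet_requirements(file_format: str, header: bool, ai_enabled: bool,
                       data_quality: dict) -> list[str]:
    """Return the reasons (if any) why header validation cannot run.

    Hypothetical helper: the arguments stand in for values loaded from the
    pipeline configuration and application.yaml.
    """
    problems = []
    if file_format.lower() != "csv":
        problems.append("source file must be CSV")
    if not header:
        problems.append("pipeline must set header: true")
    if not ai_enabled:
        problems.append("ai.enabled: true must be set in application.yaml")
    if data_quality.get("validateFileHeader") is not True:
        problems.append("dataQuality.validateFileHeader must be true")
    return problems


# Example: everything is in place except the AI setting.
print(unmet_requirements("csv", True, False, {"validateFileHeader": True}))
```

An empty list means every prerequisite listed on this page is satisfied.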
Behavior
- The pipeline reads the first row of the CSV file (the header) and the schema field names.
- Both are sent to the AI model, which evaluates whether they match.
- The AI allows fuzzy matching: "First Name" matches "first_name", "qty" matches "quantity", etc.
- Column order does not matter; the pipeline uses the header to map columns to schema fields by name.
- All schema columns must be present in the header. Missing columns fail validation.
- Extra columns in the header beyond the schema are accepted.
- If validation fails, a clear error message explains which columns are missing or unmatched.
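
The matching rules above can be illustrated with a small deterministic stand-in. This is not how the pipeline itself works: the real check delegates the judgment to an AI model, whereas this sketch uses simple normalization plus a hypothetical abbreviation table (`ABBREVIATIONS`). The function names and the sample data are assumptions made for illustration.

```python
import csv
import io
import re

# Hypothetical abbreviation table; the real check leaves this judgment to the AI model.
ABBREVIATIONS = {"qty": "quantity"}


def normalize(name: str) -> str:
    """Lowercase, drop spaces/underscores/hyphens, and expand known abbreviations."""
    key = re.sub(r"[\s_\-]+", "", name.strip().lower())
    return ABBREVIATIONS.get(key, key)


def validate_header(csv_text: str, schema_fields: list[str]) -> list[str]:
    """Return the schema fields that have no matching column in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))  # first row only
    present = {normalize(col) for col in header}      # a set, so order is irrelevant
    return [field for field in schema_fields if normalize(field) not in present]


# Columns are reordered, one is abbreviated, and one is extra.
sample = "qty,Extra Column,First Name\n3,x,Ada\n"
missing = validate_header(sample, ["first_name", "quantity", "email"])
if missing:
    print(f"Header validation failed; missing columns: {', '.join(missing)}")
```

Here "First Name" and "qty" satisfy first_name and quantity, "Extra Column" is ignored, and only email is reported as missing.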
