Requirements
- The source file must be CSV format.
- The pipeline must be configured with header: true (i.e., the first row contains column names).
- ai.enabled: true must be set in application.yaml.
Configuration
Enable header validation by setting validateFileHeader to true in the dataQuality block.
Behavior
- The pipeline reads the first row of the CSV file (the header) and the schema field names.
- Both are sent to the AI model, which evaluates whether they match.
- The AI allows fuzzy matching: "First Name" matches "first_name", "qty" matches "quantity", etc.
- Column order does not matter: the pipeline uses the header to map columns to schema fields by name.
- All schema columns must be present in the header. Missing columns fail validation.
- Extra columns in the header beyond the schema are accepted.
- If validation fails, the error message lists the columns that are missing or unmatched.
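
The rules above can be illustrated with a small sketch. This is not the pipeline's implementation: the real check is delegated to an AI model, and here that step is replaced by a deterministic stand-in (name normalization plus a tiny alias table) so the example can run on its own. The names validate_header, normalize, and ALIASES are hypothetical and exist only for this illustration.

```python
import csv
import io
import re

# Hypothetical alias table: a stand-in for the AI model's fuzzy judgment.
ALIASES = {"qty": "quantity"}


def normalize(name: str) -> str:
    """Lowercase, turn runs of non-alphanumerics into underscores, apply aliases."""
    key = re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")
    return ALIASES.get(key, key)


def validate_header(csv_text: str, schema_fields: list[str]) -> dict[str, int]:
    """Return {schema field: column index}, or raise if a schema field is missing.

    Column order is irrelevant, and extra header columns are ignored.
    """
    # Read only the first row of the CSV (the header).
    header = next(csv.reader(io.StringIO(csv_text)), [])

    # Index header columns by their normalized name (first occurrence wins).
    positions: dict[str, int] = {}
    for index, column in enumerate(header):
        positions.setdefault(normalize(column), index)

    # Every schema field must be present; report all missing ones at once.
    missing = [f for f in schema_fields if normalize(f) not in positions]
    if missing:
        raise ValueError(
            "Header validation failed; missing columns: " + ", ".join(missing)
        )

    return {f: positions[normalize(f)] for f in schema_fields}
```

With the header row qty,Notes,First Name and the schema fields first_name and quantity, this sketch returns {"first_name": 2, "quantity": 0}: the extra Notes column is accepted and the column order does not matter. With the header id,name and the schema fields id and email, it raises a ValueError that names email as missing.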