COPY command. It streams data directly into the target table, making it significantly faster than row-by-row inserts.
Connection
Connection credentials are retrieved from Vault using the secret name specified insecrets.postgresSecretName in application.yaml. The secret must contain the following keys:
| Key | Description |
|---|---|
jdbcUrl | JDBC connection string for the database |
username | Database user |
password | Database password |
Table Management
WhenmanageTableManually is set to false (the default), the pipeline auto-creates the target table if it does not exist. The table schema is derived from the pipeline’s destination schema if one is defined, otherwise it falls back to the source schema. Types are mapped using the table below.
Set manageTableManually to true if you want to create and manage the table yourself. In this case the table must already exist and its column types must be compatible with the incoming data.
Type Mapping
When auto-creating tables, source schema types are mapped to PostgreSQL types as follows:| Source Type | PostgreSQL Type |
|---|---|
tinyint | int2 |
smallint | int2 |
float | float4 |
double | float8 |
string | text |
_json | json |
_xml | xml |
int, bigint, boolean, date, timestamp, decimal(p,s), varchar(n), char(n)) are passed through to PostgreSQL as-is.
Truncate Before Write
SettruncateBeforeWrite to true to issue a TRUNCATE statement on the target table before loading data. This replaces the full contents of the table on each run.
Transaction Support
All writes are wrapped in a database transaction by default (useTransaction: true). If any error occurs during the COPY operation, the transaction is rolled back and no partial data is committed. Set useTransaction to false to disable transactional writes.
Primary Key and Upsert
Specify one or more columns inkeyFields to define the natural key for the target table. keyFields controls two behaviors:
- Table shape. When
manageTableManuallyisfalse, the auto-created table includes aPRIMARY KEYconstraint on these columns. - Load strategy. When
keyFieldsis set andtruncateBeforeWriteisfalse, the loader switches from rawCOPYto an upsert path:COPYinto a session-local staging table, thenINSERT ... SELECT ... ON CONFLICT (keyFields) DO UPDATE SET <non_key_cols> = EXCLUDED.<non_key_cols>into the target. Rows that match an existing primary-key tuple replace the existing row; rows that don’t match are inserted. The same row can therefore be loaded across multiple runs without violating the unique constraint.
keyFields onto an existing table: if you add keyFields to a pipeline that’s been loading without it, the underlying table may not have a matching unique constraint yet. The loader auto-creates a unique index on the keyFields columns the next time it runs. If the existing data already contains rows that violate the proposed uniqueness, the index creation fails with a clear remediation message — deduplicate the existing rows, or set truncateBeforeWrite: true to wipe and reload from scratch.
Upsert semantics — on conflict, non-key columns are overwritten in full, including NULLs. This matches MongoDB’s keyFields behavior (true upsert, not a non-null merge). Sources that emit partial rows and don’t want NULLs to clobber existing values should coalesce upstream before the pipeline.
When keyFields is empty, the loader uses raw COPY directly into the target table. This is the fastest path; duplicate-key violations against any pre-existing unique constraint fail the load. Use this for append-only ingestion where duplicates aren’t expected.
When truncateBeforeWrite: true, the loader uses raw COPY regardless of keyFields because truncate already prevents duplicates. The upsert path would just be slower with no benefit.
Schema Auto-Creation
WhenmanageTableManually is false, the pipeline also creates the target schema if it doesn’t exist (CREATE SCHEMA IF NOT EXISTS).
Custom COPY Options
Use theoptions array to pass additional PostgreSQL COPY options. Each entry is appended to the COPY command. When no options are specified, the default is FORMAT csv — Postgres CSV mode treats unquoted empty fields as NULL automatically. If your source uses a non-empty placeholder for missing values (e.g. a literal .), normalize it at the source layer or set the NULL option explicitly via options.
Configuration Example
application.yaml via secrets.postgresSecretName, not per pipeline. See Configuration Reference for Vault secret format.
Field Reference
| Field | Required | Default | Description |
|---|---|---|---|
dbName | yes | Target database name | |
schema | no | public | Target schema |
table | yes | Target table name | |
usePostgres | yes | false | Must be true to enable PostgreSQL destination |
manageTableManually | no | false | If true, skip auto-creation of the table |
truncateBeforeWrite | no | false | Truncate the table before loading |
keyFields | no | Array of column names for the natural key. Sets a PRIMARY KEY constraint on auto-created tables AND switches the load strategy to upsert (INSERT ... ON CONFLICT DO UPDATE) when truncateBeforeWrite is false. See Primary Key and Upsert. | |
useTransaction | no | true | Wrap the COPY in a database transaction |
options | no | Array of additional COPY option strings |
