Skip to main content
The PostgreSQL destination performs bulk inserts using the PostgreSQL COPY command. It streams data directly into the target table, making it significantly faster than row-by-row inserts.

Connection

Connection credentials are retrieved from Vault using the secret name specified in secrets.postgresSecretName in application.yaml. The secret must contain the following keys:
KeyDescription
jdbcUrlJDBC connection string for the database
usernameDatabase user
passwordDatabase password

Table Management

When manageTableManually is set to false (the default), the pipeline auto-creates the target table if it does not exist. The table schema is derived from the pipeline’s destination schema if one is defined, otherwise it falls back to the source schema. Types are mapped using the table below. Set manageTableManually to true if you want to create and manage the table yourself. In this case the table must already exist and its column types must be compatible with the incoming data.

Type Mapping

When auto-creating tables, source schema types are mapped to PostgreSQL types as follows:
Source TypePostgreSQL Type
tinyintint2
smallintint2
floatfloat4
doublefloat8
stringtext
_jsonjson
_xmlxml
All other types (int, bigint, boolean, date, timestamp, decimal(p,s), varchar(n), char(n)) are passed through to PostgreSQL as-is.

Truncate Before Write

Set truncateBeforeWrite to true to issue a TRUNCATE statement on the target table before loading data. This replaces the full contents of the table on each run.

Transaction Support

All writes are wrapped in a database transaction by default (useTransaction: true). If any error occurs during the COPY operation, the transaction is rolled back and no partial data is committed. Set useTransaction to false to disable transactional writes.

Primary Key and Upsert

Specify one or more columns in keyFields to define the natural key for the target table. keyFields controls two behaviors:
  1. Table shape. When manageTableManually is false, the auto-created table includes a PRIMARY KEY constraint on these columns.
  2. Load strategy. When keyFields is set and truncateBeforeWrite is false, the loader switches from raw COPY to an upsert path: COPY into a session-local staging table, then INSERT ... SELECT ... ON CONFLICT (keyFields) DO UPDATE SET <non_key_cols> = EXCLUDED.<non_key_cols> into the target. Rows that match an existing primary-key tuple replace the existing row; rows that don’t match are inserted. The same row can therefore be loaded across multiple runs without violating the unique constraint.
Retrofitting keyFields onto an existing table: if you add keyFields to a pipeline that’s been loading without it, the underlying table may not have a matching unique constraint yet. The loader auto-creates a unique index on the keyFields columns the next time it runs. If the existing data already contains rows that violate the proposed uniqueness, the index creation fails with a clear remediation message — deduplicate the existing rows, or set truncateBeforeWrite: true to wipe and reload from scratch. Upsert semantics — on conflict, non-key columns are overwritten in full, including NULLs. This matches MongoDB’s keyFields behavior (true upsert, not a non-null merge). Sources that emit partial rows and don’t want NULLs to clobber existing values should coalesce upstream before the pipeline. When keyFields is empty, the loader uses raw COPY directly into the target table. This is the fastest path; duplicate-key violations against any pre-existing unique constraint fail the load. Use this for append-only ingestion where duplicates aren’t expected. When truncateBeforeWrite: true, the loader uses raw COPY regardless of keyFields because truncate already prevents duplicates. The upsert path would just be slower with no benefit.

Schema Auto-Creation

When manageTableManually is false, the pipeline also creates the target schema if it doesn’t exist (CREATE SCHEMA IF NOT EXISTS).

Custom COPY Options

Use the options array to pass additional PostgreSQL COPY options. Each entry is appended to the COPY command. When no options are specified, the default is FORMAT csv — Postgres CSV mode treats unquoted empty fields as NULL automatically. If your source uses a non-empty placeholder for missing values (e.g. a literal .), normalize it at the source layer or set the NULL option explicitly via options.

Configuration Example

{
  "name": "orders_pipeline",
  "source": { "..." : "..." },
  "destination": {
    "database": {
      "dbName": "analytics",
      "schema": "public",
      "table": "orders",
      "usePostgres": true,
      "manageTableManually": false,
      "truncateBeforeWrite": true,
      "keyFields": ["order_id"],
      "options": ["NULL 'NA'"]
    }
  }
}
Connection credentials are configured globally in application.yaml via secrets.postgresSecretName, not per pipeline. See Configuration Reference for Vault secret format.

Field Reference

FieldRequiredDefaultDescription
dbNameyesTarget database name
schemanopublicTarget schema
tableyesTarget table name
usePostgresyesfalseMust be true to enable PostgreSQL destination
manageTableManuallynofalseIf true, skip auto-creation of the table
truncateBeforeWritenofalseTruncate the table before loading
keyFieldsnoArray of column names for the natural key. Sets a PRIMARY KEY constraint on auto-created tables AND switches the load strategy to upsert (INSERT ... ON CONFLICT DO UPDATE) when truncateBeforeWrite is false. See Primary Key and Upsert.
useTransactionnotrueWrap the COPY in a database transaction
optionsnoArray of additional COPY option strings

Completion Notification

A pipeline notification is published to ActiveMQ on completion. See Notifications for details.