The pipeline server is configured via `application.yaml` (or `application.properties`). This page documents all available properties.

Full Reference

Spring Boot

| Property | Default | Description |
|---|---|---|
| `spring.servlet.multipart.max-file-size` | 1GB | Maximum upload file size |
| `spring.servlet.multipart.max-request-size` | 1GB | Maximum request size |
| `spring.server.tomcat.connection-timeout` | 600000 | Tomcat connection timeout (ms) |
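These properties can be set together in `application.yaml`. A minimal sketch using the defaults above (note that in stock Spring Boot the Tomcat timeout is bound under `server.tomcat.connection-timeout`, without the `spring.` prefix; verify which key your build binds):

```yaml
spring:
  servlet:
    multipart:
      max-file-size: 1GB
      max-request-size: 1GB
server:
  tomcat:
    connection-timeout: 600000  # ms; ten minutes for long-running uploads
```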

Logging

| Property | Default | Description |
|---|---|---|
| `logging.level.root` | INFO | Root log level |
| `logging.level.org.springframework.web` | INFO | Spring web log level |
| `logging.level.net.datris.pipeline` | INFO | Pipeline log level |
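For example, to raise pipeline verbosity while keeping everything else at the defaults:

```yaml
logging:
  level:
    root: INFO
    org.springframework.web: INFO
    net.datris.pipeline: DEBUG  # verbose pipeline logging for troubleshooting
```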

Scheduling

| Property | Default | Description |
|---|---|---|
| `schedule.checkFileNotifierQueue` | 5000 | Polling interval for the file notification queue (ms) |
| `schedule.findJobsToStart` | 5000 | Interval between checks for queued jobs (ms) |
| `schedule.checkDatabaseSourceQueries` | 30000 | Interval between checks for scheduled database pulls (ms) |
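An illustrative override that halves the polling load relative to the defaults (all intervals in milliseconds; the values below are examples, not recommendations):

```yaml
schedule:
  checkFileNotifierQueue: 10000      # poll the file notification queue every 10 s
  findJobsToStart: 10000             # look for queued jobs every 10 s
  checkDatabaseSourceQueries: 60000  # check scheduled database pulls every minute
```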

Pipeline

| Property | Default | Description |
|---|---|---|
| `environment` | oss | Environment name, used as a prefix for bucket names (`oss-raw`, `oss-data`, etc.) and table names |
| `useApiKeys` | false | Enable API key authentication |
| `sendPipelineNotifications` | true | Enable pipeline event notifications |
| `ttlFileNotifierQueueMessages` | 60 | Days to retain processed message IDs for deduplication |
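A sketch of these settings for a hypothetical "prod" deployment, where buckets and tables would be prefixed `prod-`:

```yaml
environment: prod                  # buckets become prod-raw, prod-data, etc.
useApiKeys: true                   # require API keys on requests
sendPipelineNotifications: true
ttlFileNotifierQueueMessages: 60   # days to keep deduplication IDs
```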

MinIO (Object Store)

| Property | Description |
|---|---|
| `minio.server` | MinIO endpoint URL (e.g., `http://localhost:9000`) |

MinIO credentials are stored in Vault under the secret specified by `secrets.minIOSecretName`:

```json
{
  "accessKey": "minioadmin",
  "secretKey": "minioadmin"
}
```

Secrets (HashiCorp Vault)

| Property | Description |
|---|---|
| `secrets.apiKeysSecretName` | Vault path for API keys |
| `secrets.postgresSecretName` | Vault path for PostgreSQL credentials |
| `secrets.minIOSecretName` | Vault path for MinIO credentials |
| `secrets.activeMQSecretName` | Vault path for ActiveMQ credentials |
| `secrets.mongoDbSecretName` | Vault path for MongoDB credentials |
| `secrets.kafkaProducerSecretName` | Vault path for Kafka producer credentials |
| `secrets.embeddingSecretName` | Vault path for embedding model credentials |
| `secrets.qdrantSecretName` | Vault path for Qdrant connection |
| `secrets.weaviateSecretName` | Vault path for Weaviate connection |
| `secrets.milvusSecretName` | Vault path for Milvus connection |
| `secrets.chromaSecretName` | Vault path for Chroma connection |
| `secrets.pgvectorSecretName` | Vault path for pgvector PostgreSQL connection |

Vault connection is configured via environment variables:

- `VAULT_ADDR`: Vault server URL (e.g., `http://vault:8200`)
- `VAULT_TOKEN`: Authentication token
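Putting the two halves together, a sketch of the secret-name configuration in `application.yaml` (the Vault paths shown are illustrative placeholders; use whatever paths your Vault deployment actually stores these secrets under, and supply `VAULT_ADDR`/`VAULT_TOKEN` through the process environment, not this file):

```yaml
secrets:
  postgresSecretName: secret/pipeline/postgres    # example path
  minIOSecretName: secret/pipeline/minio          # example path
  activeMQSecretName: secret/pipeline/activemq    # example path
  mongoDbSecretName: secret/pipeline/mongodb      # example path
```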

ActiveMQ (Queue & Notifications)

| Property | Description |
|---|---|
| `activemq.server` | ActiveMQ broker URL (e.g., `tcp://localhost:61616`) |

ActiveMQ credentials are stored in Vault under `secrets.activeMQSecretName`:

```json
{
  "username": "admin",
  "password": "admin"
}
```

MongoDB (NoSQL Store)

| Property | Description |
|---|---|
| `mongodb.connectionString` | MongoDB connection URI (e.g., `mongodb://localhost:27017`) |
| `mongodb.database` | Database name for pipeline metadata |
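The endpoint properties from the MinIO, ActiveMQ, and MongoDB sections can be sketched together for a local development setup (the database name `pipeline` is an assumption; credentials are never set here, since they are resolved from Vault):

```yaml
minio:
  server: http://localhost:9000
activemq:
  server: tcp://localhost:61616
mongodb:
  connectionString: mongodb://localhost:27017
  database: pipeline  # assumed name; pick your own
```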

Kafka Consumer (Optional)

| Property | Default | Description |
|---|---|---|
| `kafkaConsumer.enabled` | false | Enable Kafka topic consumption |
| `kafkaConsumer.bootstrapServers` | | Kafka broker address |
| `kafkaConsumer.groupId` | | Consumer group ID |
| `kafkaConsumer.topicPollingInterval` | 500 | Topic polling interval (ms) |
| `kafkaConsumer.topicPrefix` | | Prefix for topic names |
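A sketch of an enabled consumer; the broker address, group ID, and topic prefix shown are placeholder values, since these properties have no defaults:

```yaml
kafkaConsumer:
  enabled: true
  bootstrapServers: kafka:9092   # example broker address
  groupId: pipeline-consumer     # example consumer group
  topicPollingInterval: 500      # ms
  topicPrefix: ingest-           # example prefix
```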

AI Schema Generation (Optional)

| Property | Default | Description |
|---|---|---|
| `ai.enabled` | false | Enable the AI schema generation endpoint |
| `ai.provider` | anthropic | AI provider: `anthropic` or `openai` |
| `ai.aiSecretName` | | Vault secret name containing `endpoint`, `model`, and `apiKey` |
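For example, to turn the endpoint on with Anthropic as the provider (the secret name `ai-secret` is a placeholder; the Vault secret it names must contain `endpoint`, `model`, and `apiKey`):

```yaml
ai:
  enabled: true
  provider: anthropic   # or openai
  aiSecretName: ai-secret
```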
See AI Schema Generation for full setup details.

Vault Secret Formats

PostgreSQL

```json
{
  "username": "postgres",
  "password": "password",
  "jdbcUrl": "jdbc:postgresql://localhost:5432"
}
```

MySQL

```json
{
  "username": "root",
  "password": "password",
  "jdbcUrl": "jdbc:mysql://localhost:3306"
}
```

Kafka Producer

```json
{
  "bootstrapServers": "kafka:9092",
  "username": null,
  "password": null
}
```

MongoDB

```json
{
  "connectionString": "mongodb://localhost:27017"
}
```

MinIO Buckets

The following buckets are created automatically by the `minio-init` container:

| Bucket | Purpose |
|---|---|
| `{environment}-raw` | File upload staging |
| `{environment}-raw-plus` | Processed file staging |
| `{environment}-temp` | Temporary processing files |
| `{environment}-data` | Object store destination output |
| `{environment}-config` | Configuration files (validation schemas) |
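The naming scheme is simple prefixing: each bucket name is the configured `environment` value joined to a fixed suffix. A sketch as a hypothetical helper (this function is illustrative, not part of the server code):

```python
# Suffixes of the buckets the minio-init container creates, per the table above.
BUCKET_SUFFIXES = ["raw", "raw-plus", "temp", "data", "config"]

def bucket_names(environment: str) -> list[str]:
    """Prefix each bucket suffix with the configured environment name."""
    return [f"{environment}-{suffix}" for suffix in BUCKET_SUFFIXES]

print(bucket_names("oss"))
# With the default environment "oss", this yields
# ['oss-raw', 'oss-raw-plus', 'oss-temp', 'oss-data', 'oss-config']
```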

MongoDB Collections

| Collection | Purpose |
|---|---|
| `{environment}-pipeline` | Pipeline configurations |
| `{environment}-pipeline-status` | Job processing status |
| `{environment}-archived-metadata` | File ingestion metadata |
| `{environment}-file-notifier-message` | Processed message deduplication |
| `{environment}-data-pull` | Database pull scheduling state |