Deploy n8n with Docker Compose: Production-Ready Setup with PostgreSQL and Redis
Master production n8n deployment with Docker Compose, PostgreSQL, Redis queue management, multi-container architecture, security hardening, automated backups, monitoring, and enterprise scaling strategies.
After years of deploying n8n in production environments, I've learned that the difference between a hobby automation server and a mission-critical workflow engine often comes down to the initial architecture decisions you make. Too many teams start with a basic Docker run command and then struggle to scale when their automation needs grow exponentially. This guide walks through building a robust, production-ready n8n deployment using Docker Compose with PostgreSQL and Redis, covering everything from multi-container orchestration to enterprise-grade security configurations.
Understanding the production landscape
When you're running n8n in production, you're not just spinning up a container and calling it a day. You're building an automation infrastructure that needs to handle thousands of workflow executions, manage sensitive credentials securely, recover from failures gracefully, and scale with your organization's growing needs. The architecture we're building uses PostgreSQL as the primary database instead of SQLite, Redis for queue management and caching, and implements proper data persistence with automated backup strategies. Most importantly, we're configuring everything with security and scalability in mind from day one.
The transition from development to production reveals interesting performance characteristics. A single n8n instance can handle up to 220 workflow executions per second under optimal conditions, but most organizations hit performance walls around 5,000-10,000 daily executions without proper architecture. That's where queue mode with Redis becomes essential, allowing you to distribute workflow execution across multiple worker containers while maintaining a responsive UI and webhook processing layer.
Prerequisites and environment preparation
Before diving into the deployment, ensure your system meets the production requirements. You'll need a server with at least 4 CPU cores and 8GB of RAM for a basic production setup, though 16GB or more is recommended for high-volume environments. The host should be running Docker 20.10+ and Docker Compose 2.0+ for optimal compatibility with the latest n8n features. Storage-wise, allocate at least 20GB for the application and databases, preferably on SSD or NVMe drives for better performance.
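Verify the toolchain before going any further; version drift here causes subtle compose-file incompatibilities later:

docker --version            # expect 20.10 or newer
docker compose version      # expect v2.x (the standalone docker-compose binary also works)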
Start by preparing your deployment directory structure. This organization helps maintain clarity as your deployment grows more complex over time:
sudo mkdir -p /opt/n8n/{config,data,postgres,redis,backups,scripts}
sudo chown -R 1000:1000 /opt/n8n/data
sudo chown -R 999:999 /opt/n8n/postgres
sudo chown -R 999:999 /opt/n8n/redis
The ownership settings are crucial here. The n8n container runs as user node with UID 1000, and getting these permissions wrong is one of the most common deployment issues I encounter. PostgreSQL and Redis typically run as UID 999, though this can vary depending on your base images.
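If you're ever unsure which UID an image actually runs as, check it rather than guessing. A quick sketch using throwaway containers:

# Print the runtime user of each image (n8n's entrypoint is overridden,
# since by default it wraps every argument in the n8n CLI)
docker run --rm --entrypoint id docker.n8n.io/n8nio/n8n:latest
docker run --rm postgres:16-alpine id postgres
docker run --rm redis:7-alpine id redis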
Generate your encryption key before proceeding. This key encrypts all credentials stored in n8n and must remain consistent across all instances and upgrades:
openssl rand -hex 32 > /opt/n8n/config/encryption_key.txt
chmod 600 /opt/n8n/config/encryption_key.txt
Building the multi-container architecture
The heart of our production deployment is a carefully orchestrated multi-container setup. Rather than running everything in a single container, we separate concerns to improve scalability, maintainability, and fault tolerance. Let's build this step by step, starting with our Docker Compose configuration.
Navigate to your n8n directory and create the main Docker Compose file:
cd /opt/n8n
sudo nano docker-compose.yml
Start with a basic structure and build it incrementally. Here's our complete production configuration:
# The top-level "version" key is obsolete in Compose v2 and can be omitted
volumes:
postgres_data:
driver: local
driver_opts:
type: none
device: /opt/n8n/postgres
o: bind
n8n_data:
driver: local
driver_opts:
type: none
device: /opt/n8n/data
o: bind
redis_data:
driver: local
driver_opts:
type: none
device: /opt/n8n/redis
o: bind
networks:
n8n-network:
driver: bridge
ipam:
config:
- subnet: 172.20.0.0/16
x-shared-env: &shared-env
  # Database Configuration
  DB_TYPE: postgresdb
  DB_POSTGRESDB_HOST: postgres
  DB_POSTGRESDB_PORT: "5432"
  DB_POSTGRESDB_DATABASE: ${POSTGRES_DB:-n8n}
  DB_POSTGRESDB_USER: ${POSTGRES_NON_ROOT_USER:-n8n}
  DB_POSTGRESDB_PASSWORD_FILE: /run/secrets/postgres_password
  # Queue Configuration
  EXECUTIONS_MODE: queue
  QUEUE_BULL_REDIS_HOST: redis
  QUEUE_BULL_REDIS_PORT: "6379"
  QUEUE_HEALTH_CHECK_ACTIVE: "true"
  # Security and Encryption
  N8N_ENCRYPTION_KEY_FILE: /run/secrets/encryption_key
  N8N_SECURE_COOKIE: "true"
  N8N_SAMESITE_COOKIE: strict
  # Production Settings
  N8N_HOST: ${DOMAIN:-n8n.example.com}
  N8N_PORT: "5678"
  N8N_PROTOCOL: https
  WEBHOOK_URL: https://${DOMAIN:-n8n.example.com}/
  N8N_PROXY_HOPS: "1"
  # Data Management
  N8N_DEFAULT_BINARY_DATA_MODE: filesystem
  EXECUTIONS_DATA_PRUNE: "true"
  EXECUTIONS_DATA_PRUNE_MAX_COUNT: "10000"
  EXECUTIONS_DATA_MAX_AGE: "168"
  # Security Hardening
  N8N_BLOCK_ENV_ACCESS_IN_NODE: "true"
  N8N_BLOCK_FILE_ACCESS_TO_N8N_FILES: "true"
  N8N_RESTRICT_FILE_ACCESS_TO: /data/files:/tmp
  # Performance
  NODE_OPTIONS: --max-old-space-size=4096

x-shared: &shared
  restart: always
  # Pin a specific version tag in production rather than tracking latest
  image: docker.n8n.io/n8nio/n8n:latest
  # YAML merge keys replace nested mappings wholesale, so the environment
  # lives in its own anchor that services re-merge before overriding values
  environment:
    <<: *shared-env
  volumes:
    - n8n_data:/home/node/.n8n
    - ./files:/data/files
  networks:
    - n8n-network
  secrets:
    - encryption_key
    - postgres_password
  depends_on:
    redis:
      condition: service_healthy
    postgres:
      condition: service_healthy
services:
# Main n8n Instance - Handles UI, API, and triggers
n8n:
<<: *shared
container_name: n8n-main
ports:
- "127.0.0.1:5678:5678"
command: ["n8n"]
labels:
- "traefik.enable=true"
- "traefik.http.routers.n8n.rule=Host(`${DOMAIN:-n8n.example.com}`)"
- "traefik.http.routers.n8n.entrypoints=websecure"
- "traefik.http.routers.n8n.tls.certresolver=letsencrypt"
- "traefik.http.services.n8n.loadbalancer.server.port=5678"
# API endpoints - strict rate limiting
- "traefik.http.routers.n8n-api.rule=Host(`${DOMAIN:-n8n.example.com}`) && PathPrefix(`/api/`)"
- "traefik.http.routers.n8n-api.entrypoints=websecure"
- "traefik.http.routers.n8n-api.tls.certresolver=letsencrypt"
- "traefik.http.routers.n8n-api.middlewares=api-ratelimit,security-headers"
# Webhook endpoints - higher rate limits
- "traefik.http.routers.n8n-webhook.rule=Host(`${DOMAIN:-n8n.example.com}`) && PathPrefix(`/webhook/`)"
- "traefik.http.routers.n8n-webhook.entrypoints=websecure"
- "traefik.http.routers.n8n-webhook.tls.certresolver=letsencrypt"
- "traefik.http.routers.n8n-webhook.middlewares=webhook-ratelimit,security-headers"
# Web interface - moderate rate limits
- "traefik.http.routers.n8n.middlewares=web-ratelimit,security-headers"
# Worker Instances - Execute workflows
  n8n-worker:
    <<: *shared
    command: ["worker", "--concurrency=10"]
    deploy:
      replicas: 2
    environment:
      <<: *shared-env
      N8N_CONCURRENCY_PRODUCTION_LIMIT: "10"
      N8N_GRACEFUL_SHUTDOWN_TIMEOUT: "300"
# PostgreSQL Database
postgres:
image: postgres:16-alpine
restart: always
container_name: n8n-postgres
environment:
- POSTGRES_USER=${POSTGRES_USER:-postgres}
- POSTGRES_PASSWORD_FILE=/run/secrets/postgres_root_password
- POSTGRES_DB=${POSTGRES_DB:-n8n}
volumes:
- postgres_data:/var/lib/postgresql/data
      - ./scripts/init-db.sh:/docker-entrypoint-initdb.d/init-db.sh:ro
networks:
- n8n-network
secrets:
- postgres_root_password
- postgres_password
healthcheck:
test: ['CMD-SHELL', 'pg_isready -h localhost -U ${POSTGRES_USER:-postgres} -d ${POSTGRES_DB:-n8n}']
interval: 5s
timeout: 5s
retries: 10
command: >
postgres
-c shared_buffers=256MB
-c effective_cache_size=1GB
-c maintenance_work_mem=64MB
-c max_connections=200
# Redis for Queue Management
redis:
image: redis:7-alpine
restart: always
container_name: n8n-redis
volumes:
- redis_data:/data
- ./config/redis.conf:/usr/local/etc/redis/redis.conf:ro
networks:
- n8n-network
healthcheck:
test: ['CMD', 'redis-cli', '--raw', 'incr', 'ping']
interval: 5s
timeout: 5s
retries: 5
command: redis-server /usr/local/etc/redis/redis.conf
# Traefik Reverse Proxy with rate limiting
traefik:
image: traefik:v3.0
restart: always
container_name: traefik
command:
- "--api.dashboard=true"
- "--providers.docker=true"
- "--providers.docker.exposedbydefault=false"
- "--providers.file.filename=/etc/traefik/dynamic.yml"
- "--providers.file.watch=true"
- "--entrypoints.web.address=:80"
- "--entrypoints.web.http.redirections.entrypoint.to=websecure"
- "--entrypoints.websecure.address=:443"
- "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
- "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
- "--certificatesresolvers.letsencrypt.acme.email=${ACME_EMAIL:-admin@example.com}"
- "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
- "--log.level=INFO"
- "--accesslog=true"
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./letsencrypt:/letsencrypt
- ./config/traefik/dynamic.yml:/etc/traefik/dynamic.yml:ro
networks:
- n8n-network
security_opt:
- no-new-privileges:true
secrets:
encryption_key:
file: ./config/encryption_key.txt
postgres_root_password:
file: ./config/postgres_root_password.txt
postgres_password:
file: ./config/postgres_password.txt
This configuration establishes our multi-container architecture with proper separation of concerns. The x-shared block defines common configuration that both the main instance and the workers inherit. Because YAML merge keys replace nested mappings wholesale rather than merging them, the environment lives in its own x-shared-env anchor; any service that overrides an environment variable must re-merge that anchor first, which is exactly what the worker definition does. Now let's configure the supporting components before we launch everything.
Configuring environment variables and secrets
Before we can launch our containers, we need to set up the environment variables and secure secrets that our deployment requires. This step is crucial for both security and functionality. Let's create the environment configuration first:
sudo nano /opt/n8n/.env
Configure your environment variables with your specific domain and settings:
# Domain Configuration
DOMAIN=n8n.yourdomain.com
ACME_EMAIL=admin@yourdomain.com
# Database Configuration
POSTGRES_USER=postgres
POSTGRES_DB=n8n
POSTGRES_NON_ROOT_USER=n8n
# Authentication note: n8n 1.0+ removed the N8N_BASIC_AUTH_* variables in
# favor of built-in user management - the owner account created on first
# login protects the UI. For an extra layer in front of n8n, add a Traefik
# basicAuth middleware (see below for generating the credential hash).
# Timezone
GENERIC_TIMEZONE=America/New_York
TZ=America/New_York
Now let's generate the secure secrets that our services will use. Never use default passwords in production:
# Generate secure passwords
echo "$(openssl rand -base64 32)" > /opt/n8n/config/postgres_root_password.txt
echo "$(openssl rand -base64 32)" > /opt/n8n/config/postgres_password.txt
chmod 600 /opt/n8n/config/*.txt
If you do add a Traefik basicAuth middleware, generate the user:hash pair it expects with htpasswd:
htpasswd -nbB admin 'your_password_here'
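If htpasswd isn't on your host (on Debian/Ubuntu it ships in the apache2-utils package), the Apache container image provides the same tool with identical output:

docker run --rm httpd:2-alpine htpasswd -nbB admin 'your_password_here'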
The beauty of using Docker secrets with the _FILE suffix is that sensitive values never appear in environment variables or process listings. This approach prevents accidental exposure through logs or debugging tools. When n8n sees an environment variable ending in _FILE, it reads the actual value from the specified file at runtime.
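You can verify the mechanism yourself once the stack is running: the secrets appear as files under /run/secrets inside the container, and no password shows up in the environment:

# Secrets are mounted as files, not exposed via `env` or process listings
docker exec n8n-main ls -l /run/secrets/
docker exec n8n-main sh -c 'env | grep -c PASSWORD= || true'   # prints 0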
Setting up the database initialization script
PostgreSQL needs a dedicated application user and some production tuning before n8n connects. A plain .sql init file cannot read Docker secrets or environment variables, so use a shell script instead; the postgres entrypoint runs any *.sh it finds in /docker-entrypoint-initdb.d during first initialization:
sudo nano /opt/n8n/scripts/init-db.sh
#!/bin/bash
set -e

# Create the non-root n8n user, pulling its password from the Docker secret
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<EOSQL
    CREATE USER n8n WITH ENCRYPTED PASSWORD '$(cat /run/secrets/postgres_password)';
    GRANT ALL PRIVILEGES ON DATABASE ${POSTGRES_DB} TO n8n;
    ALTER DATABASE ${POSTGRES_DB} OWNER TO n8n;

    -- Performance optimizations for n8n workloads (shared_buffers, cache size,
    -- and max_connections are already set via the -c flags on the postgres
    -- command, which take precedence over ALTER SYSTEM)
    ALTER SYSTEM SET checkpoint_completion_target = 0.9;
    ALTER SYSTEM SET wal_buffers = '16MB';
    ALTER SYSTEM SET default_statistics_target = 100;
    SELECT pg_reload_conf();
EOSQL
This script creates a dedicated user for n8n with the password sourced from the Docker secret, so the credential never appears in the script itself, and applies conservative tuning that works well with n8n's database access patterns.
Configuring Redis for queue management
Redis serves as the message broker for queue mode, enabling horizontal scaling of workflow execution. Let's configure Redis for production use before launching our services:
sudo nano /opt/n8n/config/redis.conf
# Memory management
maxmemory 1gb
maxmemory-policy allkeys-lru
# Persistence for recovery
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfsync everysec
# Performance tuning
tcp-keepalive 60
timeout 0
tcp-backlog 511
databases 2
# Security - uncomment to require a password. If you do, also add
# QUEUE_BULL_REDIS_PASSWORD to the shared n8n environment and pass
# -a <password> to every redis-cli call in this guide (healthcheck,
# monitoring, autoscaling). On the isolated Docker network it's optional.
# requirepass your_redis_password_here
# Network and connection settings
bind 0.0.0.0
port 6379
The persistence settings ensure Redis can recover queue data after unexpected restarts. The maxmemory-policy of allkeys-lru allows Redis to evict least recently used keys when memory limits are reached, preventing memory exhaustion while maintaining recent queue data.
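Once Redis is up (we start it in the next section), a quick sanity check confirms the config file was actually loaded; add -a with your password to these calls if you enabled requirepass:

sudo docker-compose exec redis redis-cli CONFIG GET maxmemory-policy
sudo docker-compose exec redis redis-cli INFO persistence | grep aof_enabled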
Setting up Traefik rate limiting
Create the Traefik dynamic configuration for rate limiting and security headers:
sudo mkdir -p /opt/n8n/config/traefik
sudo nano /opt/n8n/config/traefik/dynamic.yml
# Traefik dynamic configuration for rate limiting
http:
middlewares:
# API rate limiting - stricter limits for API endpoints
api-ratelimit:
rateLimit:
burst: 20
average: 10
period: "1m"
sourceCriterion:
ipStrategy:
depth: 1
# Webhook rate limiting - higher limits for webhook endpoints
webhook-ratelimit:
rateLimit:
burst: 100
average: 50
period: "1m"
sourceCriterion:
ipStrategy:
depth: 1
# General rate limiting for web interface
web-ratelimit:
rateLimit:
burst: 50
average: 25
period: "1m"
sourceCriterion:
ipStrategy:
depth: 1
# Security headers
security-headers:
headers:
customRequestHeaders:
X-Forwarded-Proto: "https"
customResponseHeaders:
X-Frame-Options: "DENY"
X-Content-Type-Options: "nosniff"
X-XSS-Protection: "1; mode=block"
Strict-Transport-Security: "max-age=31536000; includeSubDomains"
Creating supporting directories and files
Let's create the remaining directories and placeholder files needed for our deployment:
# Create additional directories
sudo mkdir -p /opt/n8n/{files,letsencrypt}
sudo chown -R 1000:1000 /opt/n8n/files
# Create Traefik acme.json with proper permissions
sudo touch /opt/n8n/letsencrypt/acme.json
sudo chmod 600 /opt/n8n/letsencrypt/acme.json
These directories will store user-uploaded files and SSL certificates. The permissions are critical for security and proper container operation.
Initial deployment and verification
Now that all configuration files are in place, let's start our deployment step by step. This methodical approach helps identify issues early and ensures each component works correctly before adding complexity.
First, let's start just the database and Redis to verify they're working:
cd /opt/n8n
sudo docker-compose up -d postgres redis
Check that both services are healthy:
# Wait for services to be ready
sleep 15
# Verify PostgreSQL is running
sudo docker-compose exec postgres pg_isready -U postgres -d n8n
# Verify Redis is working
sudo docker-compose exec redis redis-cli ping
You should see "accepting connections" from PostgreSQL and "PONG" from Redis. If either service fails, check the logs:
# Check PostgreSQL logs
sudo docker-compose logs postgres
# Check Redis logs
sudo docker-compose logs redis
Next, let's start the main n8n instance to initialize the database schema:
sudo docker-compose up -d n8n
Monitor the n8n startup process to ensure it connects to the database successfully:
# Watch n8n logs during startup
sudo docker-compose logs -f n8n
You should see messages indicating a successful database connection and schema initialization. Once n8n logs its ready message (the exact wording varies by version, along the lines of "n8n ready on ::, port 5678"), press Ctrl+C to stop watching logs.
Now start the worker instances:
sudo docker-compose up -d n8n-worker
Finally, start Traefik for SSL termination and routing:
sudo docker-compose up -d traefik
Verify all services are running:
sudo docker-compose ps
All services should show "Up" status. If any show "Exit" or "Restarting", investigate the logs for that specific service.
Testing your deployment
With all services running, let's verify that our production setup is working correctly. Start by accessing the n8n interface through your configured domain. If you haven't set up DNS yet, you can test locally by adding an entry to your hosts file:
# For local testing only
echo "127.0.0.1 n8n.yourdomain.com" | sudo tee -a /etc/hosts
Open your browser and navigate to https://n8n.yourdomain.com. You should see the n8n interface load over TLS, with the certificate issued by Let's Encrypt via Traefik. On first visit, n8n's setup screen asks you to create the owner account; if you added a Traefik basicAuth middleware, you'll be prompted for those credentials first.
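Keep in mind that the Let's Encrypt HTTP challenge only succeeds once your domain resolves publicly; with the hosts-file shortcut above, Traefik falls back to its self-signed default certificate and your browser will warn. You can check which certificate is actually being served:

# Inspect the served certificate (-k tolerates the self-signed fallback during local tests)
curl -vkI "https://n8n.yourdomain.com" 2>&1 | grep -E "subject:|issuer:"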
Let's test the queue functionality by creating a simple workflow. Log into n8n and create a new workflow with these nodes:
- Manual Trigger - Start the workflow manually
- Code Node - Add some processing delay to simulate work
- HTTP Request - Make a simple API call
Here's a test Code node that simulates processing:
// Simulate some work that takes time
const start = Date.now();
while (Date.now() - start < 5000) {
// Wait 5 seconds
}
return [{ json: { message: "Processed by worker", timestamp: new Date().toISOString() } }];
Execute this workflow multiple times in quick succession. Check that the workers are processing these executions:
# Monitor worker activity
sudo docker-compose logs -f n8n-worker
# Check Redis queue status
sudo docker-compose exec redis redis-cli LLEN bull:n8n:wait
sudo docker-compose exec redis redis-cli LLEN bull:n8n:active
You should see workflows being picked up by different worker instances, demonstrating that queue mode is functioning correctly.
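One caveat before you script against these keys: the Bull key prefix has varied across n8n releases (some versions write bull:jobs:wait rather than bull:n8n:wait, and QUEUE_BULL_PREFIX changes it again). If the counters stay at zero while executions are clearly flowing, list the keys Redis actually holds and adjust the commands here and in the scripts later in this guide:

# Show the queue keys Bull actually created
sudo docker-compose exec redis redis-cli --scan --pattern 'bull:*'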
Test the rate limiting by making rapid requests:
# Test API rate limiting (should get rate limited after 20 requests)
for i in {1..25}; do
curl -s -o /dev/null -w "%{http_code}\n" "https://n8n.yourdomain.com/api/workflows"
sleep 0.1
done
# Rate-limited requests appear as HTTP 429 responses in the Traefik access log
sudo docker-compose logs traefik | grep " 429 "
Implementing data persistence and automated backups
Now that our deployment is running, let's implement comprehensive backup strategies. Data persistence in n8n involves multiple layers: the PostgreSQL database storing workflow definitions and metadata, the filesystem storing binary data and execution logs, and Redis maintaining queue state.
Create an automated backup script that handles all components:
sudo nano /opt/n8n/scripts/backup.sh
#!/bin/bash
set -euo pipefail
# Configuration
BACKUP_DIR="/opt/n8n/backups"
RETENTION_DAYS=30
DB_CONTAINER="n8n-postgres"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
echo "[$(date)] Starting n8n backup..."
# Create timestamped backup directory
CURRENT_BACKUP="$BACKUP_DIR/$TIMESTAMP"
mkdir -p "$CURRENT_BACKUP"
# Backup PostgreSQL database
echo "[$(date)] Backing up PostgreSQL database..."
docker exec "$DB_CONTAINER" pg_dump -U n8n -d n8n -Fc > "$CURRENT_BACKUP/database.dump"
# Backup n8n data directory (workflows, credentials, settings)
echo "[$(date)] Backing up n8n data..."
tar -czf "$CURRENT_BACKUP/n8n_data.tar.gz" -C /opt/n8n/data .
# Export workflows and credentials via n8n CLI
echo "[$(date)] Exporting workflows and credentials..."
# --backup writes one file per item, so the output targets must be directories
docker exec n8n-main mkdir -p /tmp/export/workflows /tmp/export/credentials
docker exec n8n-main n8n export:workflow --backup --output=/tmp/export/workflows/
docker exec n8n-main n8n export:credentials --backup --output=/tmp/export/credentials/
mkdir -p "$CURRENT_BACKUP/export"
docker cp n8n-main:/tmp/export/. "$CURRENT_BACKUP/export/"
docker exec n8n-main rm -rf /tmp/export
# Create backup manifest
cat > "$CURRENT_BACKUP/manifest.json" <<EOF
{
"timestamp": "$TIMESTAMP",
"date": "$(date -Iseconds)",
"version": "$(docker exec n8n-main n8n --version)",
"components": ["database", "data", "workflows", "credentials"]
}
EOF
# Compress entire backup
tar -czf "$BACKUP_DIR/n8n_backup_$TIMESTAMP.tar.gz" -C "$BACKUP_DIR" "$TIMESTAMP"
rm -rf "$CURRENT_BACKUP"
# Cleanup old backups
echo "[$(date)] Cleaning up old backups..."
find "$BACKUP_DIR" -name "n8n_backup_*.tar.gz" -mtime +$RETENTION_DAYS -delete
# Optional: Upload to S3
if [ -n "${S3_BACKUP_BUCKET:-}" ]; then
echo "[$(date)] Uploading to S3..."
aws s3 cp "$BACKUP_DIR/n8n_backup_$TIMESTAMP.tar.gz" \
"s3://$S3_BACKUP_BUCKET/n8n/backups/" \
--storage-class STANDARD_IA
fi
echo "[$(date)] Backup completed successfully"
Make the script executable and schedule it with cron:
sudo chmod +x /opt/n8n/scripts/backup.sh
# Test the backup script
sudo /opt/n8n/scripts/backup.sh
# Schedule daily backups at 2 AM
sudo crontab -e
Add this line to your crontab:
0 2 * * * /opt/n8n/scripts/backup.sh >> /opt/n8n/backups/backup.log 2>&1
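A backup you never test is not a backup. A minimal smoke test of the newest archive catches truncated files early; consider running it right after the cron job:

# Verify the newest archive unpacks cleanly and contains its manifest
LATEST=$(ls -t /opt/n8n/backups/n8n_backup_*.tar.gz | head -1)
tar -tzf "$LATEST" | grep -q manifest.json && echo "OK: $LATEST" || echo "FAILED: $LATEST"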
Create a restoration script for disaster recovery scenarios:
sudo nano /opt/n8n/scripts/restore.sh
#!/bin/bash
set -euo pipefail

# ${1:-} avoids an unbound-variable error under set -u when no argument is passed
BACKUP_FILE="${1:-}"
RESTORE_DIR="/tmp/restore_$$"

if [ -z "$BACKUP_FILE" ]; then
    echo "Usage: $0 <backup_file.tar.gz>"
    exit 1
fi

# Run from the compose project directory
cd /opt/n8n

echo "[$(date)] Starting restoration from $BACKUP_FILE..."
# Stop services gracefully
docker-compose down
# Extract backup
mkdir -p "$RESTORE_DIR"
tar -xzf "$BACKUP_FILE" -C "$RESTORE_DIR"
BACKUP_NAME=$(ls "$RESTORE_DIR")
# Start only the database for restoration and wait for its healthcheck to pass
docker-compose up -d --wait postgres
# Restore PostgreSQL database
echo "[$(date)] Restoring database..."
docker exec -i n8n-postgres psql -U postgres -c "DROP DATABASE IF EXISTS n8n;"
docker exec -i n8n-postgres psql -U postgres -c "CREATE DATABASE n8n OWNER n8n;"
docker exec -i n8n-postgres pg_restore -U n8n -d n8n < "$RESTORE_DIR/$BACKUP_NAME/database.dump"
# Restore n8n data
echo "[$(date)] Restoring n8n data..."
rm -rf /opt/n8n/data/*
tar -xzf "$RESTORE_DIR/$BACKUP_NAME/n8n_data.tar.gz" -C /opt/n8n/data
# Start all services
docker-compose up -d
echo "[$(date)] Restoration completed successfully"
rm -rf "$RESTORE_DIR"
Make the restore script executable:
sudo chmod +x /opt/n8n/scripts/restore.sh
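Usage is a single argument pointing at an archive the backup script produced (the timestamp below is illustrative):

sudo /opt/n8n/scripts/restore.sh /opt/n8n/backups/n8n_backup_20250101_020000.tar.gz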
Performance optimization and scaling strategies
Now that your basic deployment is running, let's optimize it for different workload types and implement scaling strategies. Performance tuning in n8n requires understanding your workflow patterns and adjusting configurations accordingly.
Analyzing your workflow patterns
First, let's determine what type of workflows you're running most frequently. Check your workflow execution patterns:
# Connect to the n8n database to analyze workflow patterns
# (n8n's schema uses camelCase column names, hence the quoted identifiers)
sudo docker-compose exec postgres psql -U n8n -d n8n -c "
SELECT
    \"workflowId\",
    COUNT(*) AS execution_count,
    AVG(EXTRACT(EPOCH FROM (\"stoppedAt\" - \"startedAt\"))) AS avg_duration_seconds
FROM execution_entity
WHERE \"startedAt\" > NOW() - INTERVAL '7 days'
GROUP BY \"workflowId\"
ORDER BY execution_count DESC
LIMIT 10;"
This analysis helps you understand whether your workflows are CPU-intensive (data transformation, complex logic) or I/O-bound (API calls, database operations).
Optimizing for CPU-intensive workflows
If your workflows involve heavy data processing, create a specialized worker configuration. Update your docker-compose.yml file to replace the basic worker configuration:
# Edit your existing docker-compose.yml
sudo nano /opt/n8n/docker-compose.yml
Replace the existing n8n-worker service with these specialized pools. Each one re-merges the shared environment anchor before overriding its own limits; the EXECUTIONS_PROCESS variable that older guides set here was deprecated and removed in n8n v1, so it's omitted:
  # CPU-intensive worker pool
  n8n-worker-cpu:
    <<: *shared
    command: ["worker", "--concurrency=5"]
    environment:
      <<: *shared-env
      N8N_CONCURRENCY_PRODUCTION_LIMIT: "5"
      NODE_OPTIONS: --max-old-space-size=8192
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2.0'
          memory: 10G
        reservations:
          cpus: '1.0'
          memory: 4G

  # I/O-bound worker pool for API-heavy workflows
  n8n-worker-io:
    <<: *shared
    command: ["worker", "--concurrency=20"]
    environment:
      <<: *shared-env
      N8N_CONCURRENCY_PRODUCTION_LIMIT: "20"
      NODE_OPTIONS: --max-old-space-size=2048
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '1.0'
          memory: 4G
        reservations:
          cpus: '0.5'
          memory: 2G
Apply these changes by restarting your deployment:
cd /opt/n8n
sudo docker-compose down
sudo docker-compose up -d
Verify your new worker configuration:
# Check that both worker types are running
sudo docker-compose ps | grep worker
# Monitor resource usage
sudo docker stats --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}"
Implementing automatic scaling based on queue depth
Create an intelligent autoscaling script that monitors your Redis queue and adjusts worker counts automatically:
sudo nano /opt/n8n/scripts/autoscale.sh
#!/bin/bash
set -euo pipefail

# Run from the compose project directory so docker-compose resolves
# the right project when invoked from cron
cd /opt/n8n

# Configuration
MIN_WORKERS=2
MAX_WORKERS=10
SCALE_UP_THRESHOLD=50
SCALE_DOWN_THRESHOLD=10
SCALE_COOLDOWN=300 # 5 minutes between scaling actions
LOGFILE="/opt/n8n/logs/autoscale.log"
# Ensure log directory exists
mkdir -p /opt/n8n/logs
# Function to log with timestamp
log_message() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOGFILE"
}
# Check if we're in cooldown period
COOLDOWN_FILE="/tmp/n8n_autoscale_cooldown"
if [ -f "$COOLDOWN_FILE" ]; then
LAST_SCALE=$(cat "$COOLDOWN_FILE")
CURRENT_TIME=$(date +%s)
TIME_DIFF=$((CURRENT_TIME - LAST_SCALE))
if [ $TIME_DIFF -lt $SCALE_COOLDOWN ]; then
log_message "Still in cooldown period (${TIME_DIFF}s/${SCALE_COOLDOWN}s)"
exit 0
fi
fi
# Get current queue metrics (wait/active are Bull lists; failed is a sorted set)
QUEUE_WAITING=$(docker exec n8n-redis redis-cli LLEN bull:n8n:wait 2>/dev/null || echo "0")
QUEUE_ACTIVE=$(docker exec n8n-redis redis-cli LLEN bull:n8n:active 2>/dev/null || echo "0")
QUEUE_FAILED=$(docker exec n8n-redis redis-cli ZCARD bull:n8n:failed 2>/dev/null || echo "0")
# Get current worker count
CURRENT_CPU_WORKERS=$(docker-compose ps -q n8n-worker-cpu | wc -l)
CURRENT_IO_WORKERS=$(docker-compose ps -q n8n-worker-io | wc -l)
TOTAL_WORKERS=$((CURRENT_CPU_WORKERS + CURRENT_IO_WORKERS))
log_message "Queue status - Waiting: $QUEUE_WAITING, Active: $QUEUE_ACTIVE, Failed: $QUEUE_FAILED"
log_message "Current workers - CPU: $CURRENT_CPU_WORKERS, IO: $CURRENT_IO_WORKERS, Total: $TOTAL_WORKERS"
# Scaling logic
if [ "$QUEUE_WAITING" -gt "$SCALE_UP_THRESHOLD" ] && [ "$TOTAL_WORKERS" -lt "$MAX_WORKERS" ]; then
# Scale up - prioritize CPU workers for complex tasks
NEW_CPU_WORKERS=$((CURRENT_CPU_WORKERS + 1))
log_message "Scaling UP: CPU workers $CURRENT_CPU_WORKERS -> $NEW_CPU_WORKERS (queue depth: $QUEUE_WAITING)"
docker-compose up -d --scale n8n-worker-cpu=$NEW_CPU_WORKERS
date +%s > "$COOLDOWN_FILE"
elif [ "$QUEUE_WAITING" -lt "$SCALE_DOWN_THRESHOLD" ] && [ "$TOTAL_WORKERS" -gt "$MIN_WORKERS" ]; then
# Scale down - reduce CPU workers first
if [ "$CURRENT_CPU_WORKERS" -gt 1 ]; then
NEW_CPU_WORKERS=$((CURRENT_CPU_WORKERS - 1))
log_message "Scaling DOWN: CPU workers $CURRENT_CPU_WORKERS -> $NEW_CPU_WORKERS (queue depth: $QUEUE_WAITING)"
docker-compose up -d --scale n8n-worker-cpu=$NEW_CPU_WORKERS
elif [ "$CURRENT_IO_WORKERS" -gt 1 ]; then
NEW_IO_WORKERS=$((CURRENT_IO_WORKERS - 1))
log_message "Scaling DOWN: IO workers $CURRENT_IO_WORKERS -> $NEW_IO_WORKERS (queue depth: $QUEUE_WAITING)"
docker-compose up -d --scale n8n-worker-io=$NEW_IO_WORKERS
fi
date +%s > "$COOLDOWN_FILE"
else
log_message "No scaling action needed"
fi
# Alert on high failure rate
if [ "$QUEUE_FAILED" -gt 10 ]; then
log_message "WARNING: High failure rate detected - $QUEUE_FAILED failed jobs"
# You can add webhook notification here
fi
Make the script executable:
sudo chmod +x /opt/n8n/scripts/autoscale.sh
Test the autoscaling script manually first:
# Test the script
sudo /opt/n8n/scripts/autoscale.sh
# Check the log output
cat /opt/n8n/logs/autoscale.log
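To watch the scale-up path actually fire, you need real queue depth. One rough way to generate it, assuming you've built and activated a workflow with a Webhook trigger at the hypothetical path load-test:

# Burst requests at a test webhook to build queue depth
# (the webhook rate limit from earlier will throttle part of the burst - itself a useful check)
for i in {1..200}; do
  curl -s -o /dev/null "https://n8n.yourdomain.com/webhook/load-test" &
done
wait
# Then run the autoscaler and watch it react
sudo /opt/n8n/scripts/autoscale.sh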
Automating the autoscaling
Now let's set up automatic execution of the scaling script. Add it to cron to run every 2 minutes:
sudo crontab -e
Add this line to run autoscaling every 2 minutes:
# Run autoscaling check every 2 minutes
*/2 * * * * /opt/n8n/scripts/autoscale.sh
# Daily backup at 2 AM (if you haven't added this already)
0 2 * * * /opt/n8n/scripts/backup.sh >> /opt/n8n/backups/backup.log 2>&1
Creating a performance monitoring dashboard
Create a script to monitor your n8n performance metrics:
sudo nano /opt/n8n/scripts/monitor.sh
#!/bin/bash

# Run from the compose project directory so docker-compose resolves
cd /opt/n8n

echo "=== n8n Performance Monitor ==="
echo "Timestamp: $(date)"
echo
# Queue metrics (wait/active are Bull lists; completed/failed are sorted sets)
echo "=== Queue Status ==="
echo "Waiting: $(docker exec n8n-redis redis-cli LLEN bull:n8n:wait)"
echo "Active: $(docker exec n8n-redis redis-cli LLEN bull:n8n:active)"
echo "Completed: $(docker exec n8n-redis redis-cli ZCARD bull:n8n:completed)"
echo "Failed: $(docker exec n8n-redis redis-cli ZCARD bull:n8n:failed)"
echo
# Worker status
echo "=== Worker Status ==="
echo "CPU Workers: $(docker-compose ps -q n8n-worker-cpu | wc -l)"
echo "IO Workers: $(docker-compose ps -q n8n-worker-io | wc -l)"
echo
# Resource usage
echo "=== Resource Usage ==="
docker stats --no-stream --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}" | grep n8n
# Recent executions
echo
echo "=== Recent Execution Stats (last hour) ==="
docker exec n8n-postgres psql -U n8n -d n8n -t -c "
SELECT
    COUNT(*) AS total_executions,
    COUNT(*) FILTER (WHERE finished) AS completed,
    COUNT(*) FILTER (WHERE \"stoppedAt\" IS NOT NULL AND NOT finished) AS failed,
    ROUND(AVG(EXTRACT(EPOCH FROM (\"stoppedAt\" - \"startedAt\")))::numeric, 2) AS avg_duration_sec
FROM execution_entity
WHERE \"startedAt\" > NOW() - INTERVAL '1 hour';"
Make it executable and test it:
sudo chmod +x /opt/n8n/scripts/monitor.sh
sudo /opt/n8n/scripts/monitor.sh
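For a rolling view while you run load tests, wrap it in watch:

# Refresh the dashboard every 30 seconds (Ctrl+C to exit)
watch -n 30 sudo /opt/n8n/scripts/monitor.sh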
Performance tuning based on your metrics
After running for a few days, analyze your performance data to fine-tune the configuration:
# Check autoscaling effectiveness
tail -50 /opt/n8n/logs/autoscale.log
# Analyze execution patterns (quoted camelCase identifiers match n8n's schema)
sudo docker-compose exec postgres psql -U n8n -d n8n -c "
SELECT
    DATE_TRUNC('hour', \"startedAt\") AS hour,
    COUNT(*) AS executions,
    AVG(EXTRACT(EPOCH FROM (\"stoppedAt\" - \"startedAt\"))) AS avg_duration
FROM execution_entity
WHERE \"startedAt\" > NOW() - INTERVAL '7 days'
GROUP BY hour
ORDER BY hour DESC
LIMIT 24;"
Based on these metrics, you can adjust:
- Concurrency levels: If you see high CPU usage but low throughput, reduce concurrency
- Memory allocation: If containers are getting killed, increase memory limits
- Scaling thresholds: Adjust SCALE_UP_THRESHOLD and SCALE_DOWN_THRESHOLD in the autoscale script
- Worker distribution: Change the replica counts for CPU vs IO workers
This systematic approach ensures your n8n deployment scales efficiently based on actual workload patterns rather than guesswork.
Monitoring and health checks
Production deployments require comprehensive monitoring to detect issues before they impact users. n8n only exposes its /metrics endpoint when metrics are enabled, so first add N8N_METRICS: "true" to the x-shared-env block. Then configure Prometheus metrics collection by adding monitoring services to your docker-compose:
# Add to the docker-compose.yml services section; also declare prometheus_data
# and grafana_data under the top-level volumes key
prometheus:
image: prom/prometheus:latest
container_name: prometheus
volumes:
- ./config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--web.console.libraries=/usr/share/prometheus/console_libraries'
- '--web.console.templates=/usr/share/prometheus/consoles'
networks:
- n8n-network
ports:
- "127.0.0.1:9090:9090"
grafana:
image: grafana/grafana:latest
container_name: grafana
volumes:
- grafana_data:/var/lib/grafana
- ./config/grafana/dashboards:/etc/grafana/provisioning/dashboards:ro
- ./config/grafana/datasources:/etc/grafana/provisioning/datasources:ro
environment:
- GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD:-admin}
- GF_INSTALL_PLUGINS=redis-datasource
networks:
- n8n-network
ports:
- "127.0.0.1:3000:3000"
Create the Prometheus configuration:
sudo nano /opt/n8n/config/prometheus.yml
# /opt/n8n/config/prometheus.yml
global:
  scrape_interval: 30s
  evaluation_interval: 30s

# Requires mounting the alerts file into the prometheus service, e.g.
# ./config/prometheus/alerts.yml:/etc/prometheus/alerts.yml:ro
rule_files:
  - /etc/prometheus/alerts.yml

scrape_configs:
  - job_name: 'n8n'
    static_configs:
      - targets: ['n8n:5678']
    metrics_path: /metrics
  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: 'redis'
    static_configs:
      - targets: ['redis-exporter:9121']
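The postgres and redis scrape jobs assume exporter sidecars that we haven't defined yet. A minimal sketch of the two services, using the commonly used community exporter images; in production, feed the real database password in from a secret rather than inline as shown:

# Hypothetical exporter sidecars for the postgres/redis scrape jobs
postgres-exporter:
  image: quay.io/prometheuscommunity/postgres-exporter:latest
  environment:
    - DATA_SOURCE_NAME=postgresql://postgres:CHANGE_ME@postgres:5432/n8n?sslmode=disable
  networks:
    - n8n-network
redis-exporter:
  image: oliver006/redis_exporter:latest
  environment:
    - REDIS_ADDR=redis://redis:6379
  networks:
    - n8n-network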
Configure health check endpoints for container orchestration:
# Health check configuration in docker-compose
healthcheck:
test: ['CMD', 'wget', '--spider', '-q', 'http://localhost:5678/healthz']
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
Set up alerting rules for critical metrics. Create an alerts configuration:
sudo nano /opt/n8n/config/prometheus/alerts.yml
# /opt/n8n/config/prometheus/alerts.yml
groups:
- name: n8n_alerts
interval: 30s
rules:
- alert: HighErrorRate
expr: rate(n8n_execution_failed_total[5m]) > 0.1
for: 2m
labels:
severity: warning
annotations:
summary: "High error rate detected"
description: "Error rate is {{ $value }} errors per second"
- alert: QueueBacklog
expr: n8n_queue_bull_queue_waiting > 100
for: 5m
labels:
severity: warning
annotations:
summary: "Large queue backlog"
description: "{{ $value }} workflows waiting in queue"
- alert: HighMemoryUsage
expr: n8n_process_resident_memory_bytes > 3221225472
for: 5m
labels:
severity: warning
annotations:
summary: "High memory usage"
description: "Process using {{ $value | humanize }} of memory"
- alert: DatabaseConnectionFailure
expr: up{job="postgres"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "PostgreSQL is down"
description: "Cannot connect to PostgreSQL database"
Advanced security configurations
Security in production extends beyond basic authentication. Implementing defense-in-depth strategies protects your automation infrastructure from various attack vectors. Start with network isolation using Docker networks to prevent unauthorized container communication. The configuration we've built uses a dedicated bridge network, but you can further enhance security with network policies.
Configure fail2ban to protect against brute force attacks:
sudo nano /etc/fail2ban/jail.d/n8n.conf
[n8n-auth]
enabled = true
port = https
filter = n8n-auth
logpath = /opt/n8n/data/logs/n8n.log
maxretry = 5
findtime = 600
bantime = 3600
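The jail references an n8n-auth filter that fail2ban won't have out of the box. Here's a starting-point definition; the failregex is an assumption, so grep your actual n8n.log for a real failed-login line and adapt the pattern before relying on it:

sudo nano /etc/fail2ban/filter.d/n8n-auth.conf

[Definition]
# Assumed log format - verify against a real failed login in your n8n.log
failregex = ^.*Wrong username or password.*$
ignoreregex =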
Enable file logging so fail2ban (and you) can trace authentication failures and sensitive operations. Rather than a separate config file, the cleanest route in this setup is adding n8n's logging variables to the x-shared-env anchor in docker-compose.yml:
# Logging - append to the x-shared-env anchor
N8N_LOG_LEVEL: info
N8N_LOG_OUTPUT: console,file
N8N_LOG_FILE_LOCATION: /home/node/.n8n/logs/n8n.log
N8N_LOG_FILE_SIZE_MAX: "100"
N8N_LOG_FILE_COUNT_MAX: "30"
Because n8n_data is bind-mounted from /opt/n8n/data, the log lands at /opt/n8n/data/logs/n8n.log on the host, which is exactly the path the fail2ban jail above watches.
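For point-in-time security reviews, n8n also ships a built-in audit command that flags risky credential usage, unprotected webhook nodes, and stale workflows:

# Generate a security audit report from the running instance
docker exec n8n-main n8n audit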
Troubleshooting common deployment issues
Even with careful planning, production deployments encounter issues. The most common problem involves permissions, particularly when n8n cannot write to its data directory. The symptom appears as EACCES: permission denied, open '/home/node/.n8n/config' in the logs. The solution involves ensuring the data directory has the correct ownership before starting containers:
# Fix permission issues
docker-compose down
sudo chown -R 1000:1000 /opt/n8n/data
docker-compose up -d
Database connection issues often manifest as workflows failing to execute or the UI becoming unresponsive. Note that the n8n image is Alpine-based, so there is no apt-get and no psql inside it; test connectivity with the tools each container actually has:
# Confirm the n8n container can reach PostgreSQL over the Docker network
docker exec -it n8n-main node -e "require('net').connect(5432,'postgres').on('connect',()=>{console.log('postgres reachable');process.exit(0)}).on('error',(e)=>{console.error(e.message);process.exit(1)})"
# Run an actual query with the client that ships in the postgres container
docker exec -it n8n-postgres psql -U n8n -d n8n -c "SELECT version();"
Memory issues become apparent when handling large datasets or running many concurrent workflows. The Node.js heap exhaustion error FATAL ERROR: Ineffective mark-compacts near heap limit indicates insufficient memory allocation. Increase the heap size through the NODE_OPTIONS environment variable, but remember that container limits must also accommodate the increase:
# Update memory limits in docker-compose.yml
environment:
- NODE_OPTIONS=--max-old-space-size=8192
deploy:
resources:
limits:
memory: 10G
Webhook issues frequently arise from incorrect URL configuration or reverse proxy misconfigurations. Ensure your WEBHOOK_URL environment variable matches your public domain and includes the protocol. For reverse proxy setups, WebSocket support is crucial for real-time updates:
# Traefik WebSocket support is built-in, but verify these labels
labels:
- "traefik.http.routers.n8n.rule=Host(`${DOMAIN}`)"
- "traefik.http.services.n8n.loadbalancer.server.port=5678"
When workflows hang or execute slowly, examine the execution data pruning settings. Without proper pruning, the execution history table grows unbounded, degrading performance:
# Check execution table size
docker exec n8n-postgres psql -U n8n -d n8n -c "
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE tablename LIKE '%execution%'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;"
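If the table has already ballooned, enabling pruning only stops further growth; reclaiming the disk requires a vacuum. VACUUM FULL rewrites and exclusively locks the table, so run it during a maintenance window:

docker exec n8n-postgres psql -U n8n -d n8n -c "VACUUM FULL ANALYZE execution_entity;"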
Conclusion: Building for the long term
Deploying n8n in production with Docker Compose provides a robust foundation for automation infrastructure that can grow with your organization. The architecture we've built separates concerns properly, with PostgreSQL handling persistent data, Redis managing queue operations, and multiple worker instances ensuring scalability. The security configurations protect sensitive workflow data while the monitoring setup provides visibility into system health and performance bottlenecks.
The key to successful production deployment lies in starting with the right architecture rather than trying to retrofit production requirements onto a development setup. By implementing proper secrets management, automated backups, and comprehensive monitoring from the beginning, you avoid the technical debt that often accumulates when systems grow organically. The queue mode configuration with Redis enables horizontal scaling that can handle anything from hundreds to millions of workflow executions, simply by adjusting the worker count.
Remember that this setup is a starting point that you'll refine based on your specific workload patterns. Monitor your metrics, analyze performance bottlenecks, and adjust configurations accordingly. Some workflows benefit from higher concurrency, others from more memory, and understanding these patterns helps you optimize resource utilization. The investment in proper infrastructure pays dividends through improved reliability, easier maintenance, and the confidence that your automation platform can handle whatever demands your organization places on it.