Application Scaling Playbook
Application Scaling Guide for 20k Concurrent Users
This guide explains how to configure the Puma web server and set up load balancing to handle 20k concurrent users.
Puma Configuration
Current Configuration
- Workers: 1 (default)
- Threads: 3 (default)
- Concurrent requests: 3 per server
For 20k Concurrent Users
Single Server Configuration
- Workers: 10 processes
- Threads: 5 per worker
- Concurrent requests: 50 per server (10 × 5)
Multi-Server Configuration (Recommended)
- Servers: 2-4 application servers
- Workers per server: 10
- Threads per worker: 5
- Total concurrent requests: 100-200 (2-4 servers × 50)
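Concurrent users are not the same thing as concurrent requests, since most users are idle between clicks. The sketch below is a rough way to compare the request slots listed above with what 20k users might need, using Little's law. The per-user request rate (3 per minute) and the average response time (100 ms) are assumed figures for illustration, not measurements from this application, and the method names are hypothetical.

```ruby
# In-flight request slots across the fleet: servers × workers × threads.
def request_slots(servers:, workers:, threads:)
  servers * workers * threads
end

# Little's law: requests in flight = arrival rate × average response time.
# Rational keeps the arithmetic exact before rounding up.
def slots_needed(users:, requests_per_user_per_min:, avg_response_ms:)
  Rational(users * requests_per_user_per_min * avg_response_ms, 60_000).ceil
end

available = request_slots(servers: 4, workers: 10, threads: 5)
# Assumed traffic profile: 3 requests per user per minute, 100 ms per request.
needed = slots_needed(users: 20_000, requests_per_user_per_min: 3, avg_response_ms: 100)
puts "available=#{available} needed=#{needed}"
```

With these assumed inputs, four servers provide 200 slots against roughly 100 needed. Substitute measured request rates and latencies before relying on the result.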
Configuration Files
config/puma.rb
# Worker processes for handling 20k concurrent users
workers ENV.fetch("WEB_CONCURRENCY", 10)
# Threads per worker
threads_count = ENV.fetch("RAILS_MAX_THREADS", 5)
threads threads_count, threads_count
# Preload app for better memory efficiency
preload_app!
# Worker boot code
on_worker_boot do
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
# Master process boot code
before_fork do
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end
Environment Variables
# Puma configuration
WEB_CONCURRENCY=10
RAILS_MAX_THREADS=5
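Environment variables always arrive as strings, so it can help to validate them before they reach Puma. The following is a minimal sketch of such a check; `puma_settings` is a hypothetical helper, not part of Puma or Rails, and the defaults simply mirror the values above.

```ruby
# Parse Puma sizing from an environment-like hash, failing fast on bad input.
def puma_settings(env = ENV)
  workers = Integer(env.fetch("WEB_CONCURRENCY", "10"))
  threads = Integer(env.fetch("RAILS_MAX_THREADS", "5"))
  unless workers.positive? && threads.positive?
    raise ArgumentError, "WEB_CONCURRENCY and RAILS_MAX_THREADS must be positive"
  end
  { workers: workers, threads: threads, slots_per_server: workers * threads }
end

p puma_settings({ "WEB_CONCURRENCY" => "10", "RAILS_MAX_THREADS" => "5" })
```

A non-numeric value such as `WEB_CONCURRENCY=ten` raises `ArgumentError` at boot instead of producing a misconfigured server.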
Load Balancing
Architecture
                    ┌─────────────────┐
                    │  Load Balancer  │
                    │ (Nginx/HAProxy) │
                    └────────┬────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
     ┌────▼────┐        ┌────▼────┐        ┌────▼────┐
     │   App   │        │   App   │        │   App   │
     │ Server 1│        │ Server 2│        │ Server 3│
     │ 10×5=50 │        │ 10×5=50 │        │ 10×5=50 │
     └────┬────┘        └────┬────┘        └────┬────┘
          │                  │                  │
          └──────────────────┼──────────────────┘
                             │
                     ┌───────▼───────┐
                     │  PostgreSQL   │
                     │ (Primary DB)  │
                     └───────────────┘
Load Balancer Options
1. Nginx (Recommended)
Configuration (/etc/nginx/sites-available/ecommerce_app):
upstream ecommerce_app {
    # Round-robin load balancing (default)
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;
    # Optional: Weighted load balancing
    # server app1.example.com:3000 weight=3;
    # server app2.example.com:3000 weight=2;
    # server app3.example.com:3000 weight=1;
    # Optional: Health checks
    # server app1.example.com:3000 max_fails=3 fail_timeout=30s;
}
server {
    listen 80;
    server_name yourdomain.com;
    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}
server {
    listen 443 ssl http2;
    server_name yourdomain.com;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss;
    # Proxy settings
    location / {
        proxy_pass http://ecommerce_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 4k;
    }
    # Health check endpoint
    location /health {
        access_log off;
        proxy_pass http://ecommerce_app/health;
    }
}
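To make the round-robin and `weight=` options in the upstream block more concrete, here is a small sketch of smooth weighted round-robin selection. It illustrates the general technique only and is not Nginx's actual code; the class name is hypothetical and the weights mirror the commented-out example above.

```ruby
# Smooth weighted round-robin: every pick raises each server's running score
# by its weight, selects the highest score, then lowers the winner's score by
# the total weight so heavier servers are chosen more often but not in bursts.
class WeightedRoundRobin
  def initialize(weights)
    @weights = weights                       # e.g. { "app1" => 3, "app2" => 2 }
    @scores = weights.transform_values { 0 } # running score per server
    @total = weights.values.sum
  end

  def next_server
    best = nil
    @weights.each do |name, weight|
      @scores[name] += weight
      best = name if best.nil? || @scores[name] > @scores[best]
    end
    @scores[best] -= @total
    best
  end
end

balancer = WeightedRoundRobin.new({ "app1" => 3, "app2" => 2, "app3" => 1 })
picks = Array.new(6) { balancer.next_server }
p picks.tally
```

Over any six consecutive picks the servers are chosen in a 3:2:1 ratio. With equal weights the same code degrades to plain round-robin.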
2. HAProxy
Configuration (/etc/haproxy/haproxy.cfg):
global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms
frontend http_front
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }
frontend https_front
    bind *:443 ssl crt /path/to/cert.pem
    default_backend ecommerce_app_backend
backend ecommerce_app_backend
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server app1 app1.example.com:3000 check
    server app2 app2.example.com:3000 check
    server app3 app3.example.com:3000 check
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
3. Cloud Load Balancers
AWS Application Load Balancer (ALB):
- Use target groups with health checks
- Configure SSL termination
- Sticky sessions are available but not required for this stateless app
Google Cloud Load Balancer:
- Configure backend services
- Set up health checks
- Enable SSL termination
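Azure Load Balancer:
- Configure backend pools
- Set up health probes
- Terminate SSL at Azure Application Gateway or on the app servers (Azure Load Balancer works at layer 4 and does not terminate SSL)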
Session Stickiness
Not Required - This application is stateless:
- No server-side sessions
- JWT tokens for authentication
- Database-backed sessions (if needed)
- Redis for shared state (if needed)
If you need session stickiness for specific features:
# Nginx sticky sessions
upstream ecommerce_app {
    ip_hash; # Route by client IP
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;
}
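The idea behind routing by client IP can be sketched in a few lines: hash a stable part of the address and use it to index the server list. The sketch keys on the first three octets of an IPv4 address, which is how the Nginx documentation describes `ip_hash`, but CRC32 is only a stand-in hash chosen for this example and `server_for` is a hypothetical helper.

```ruby
require "zlib"

SERVERS = %w[app1.example.com app2.example.com app3.example.com].freeze

# Map a client IPv4 address to a backend using its first three octets,
# so every client in the same /24 network reaches the same server.
# CRC32 is a stand-in hash for this sketch, not the function Nginx uses.
def server_for(client_ip, servers = SERVERS)
  key = client_ip.split(".").first(3).join(".")
  servers[Zlib.crc32(key) % servers.size]
end

puts server_for("203.0.113.7") == server_for("203.0.113.200")
```

Because the mapping depends on the number of servers, adding or removing a backend moves many clients to a different server, which is one more reason to keep the application stateless.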
Database Connection Pooling
Option 1: Direct PostgreSQL Connections (Current)
Configuration: Already configured in config/database.yml
- Primary DB: 120 connections
- Queue DB: 90 connections
- Each server: 50 connections (10 workers × 5 threads)
For 4 servers: 4 × 50 = 200 connections needed
- PostgreSQL max_connections should be ≥ 200
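The connection arithmetic above can be checked with a short sketch. It assumes each Puma thread holds at most one connection to a single database, so the queue database would need the same calculation done separately; the `reserved` parameter (connections kept free for migrations, consoles, and monitoring) is an assumption added for illustration, and both method names are hypothetical.

```ruby
# Direct connections opened by the fleet if every Puma thread holds one.
def connections_needed(servers:, workers:, threads:)
  servers * workers * threads
end

# True when PostgreSQL max_connections covers the fleet plus reserved headroom.
def max_connections_ok?(max_connections:, reserved: 0, **fleet)
  connections_needed(**fleet) + reserved <= max_connections
end

fleet = { servers: 4, workers: 10, threads: 5 }
puts connections_needed(**fleet)
puts max_connections_ok?(max_connections: 200, reserved: 10, **fleet)
```

With 4 servers the fleet needs 200 connections, so a `max_connections` of exactly 200 leaves no headroom once any connections are reserved.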
Option 2: PgBouncer (Optional, Recommended for High Load)
Benefits:
- Reduces PostgreSQL connection overhead
- Allows more application connections
- Better connection management
Architecture:
App Servers → (200 connections) → PgBouncer → (20 connections) → PostgreSQL
Installation:
# Ubuntu/Debian
sudo apt-get install pgbouncer
# macOS
brew install pgbouncer
Configuration (/etc/pgbouncer/pgbouncer.ini):
[databases]
ecommerce_app_production = host=postgres.example.com port=5432 dbname=ecommerce_app_production
ecommerce_app_production_queue = host=postgres.example.com port=5432 dbname=ecommerce_app_production_queue
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
# Comments must be on their own lines: pgbouncer.ini does not treat "#" or ";" as a comment marker mid-line
# Connection pooling mode: one server connection per transaction
pool_mode = transaction
# Pool sizes
# Max client connections
max_client_conn = 1000
# Server connections per user/database pair
default_pool_size = 25
# Reserved connections
reserve_pool_size = 5
# Max connections to PostgreSQL per database
max_db_connections = 100
# Timeouts
server_idle_timeout = 600
server_connect_timeout = 15
server_login_retry = 15
# Logging
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
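To show why transaction pooling lets many clients share few server connections, here is a toy simulation in plain Ruby. It is a sketch of the general idea, not of PgBouncer's internals: the `TransactionPool` class, the 1 ms sleep standing in for query time, and the 200/20 figures (taken from the architecture line above) are all illustrative.

```ruby
# Hands out a fixed set of server connections, one transaction at a time.
class TransactionPool
  def initialize(size)
    @idle = Queue.new
    size.times { |i| @idle << "server-conn-#{i}" }
  end

  def transaction
    conn = @idle.pop # blocks until a server connection is free
    yield conn
  ensure
    @idle << conn if conn
  end
end

# Run `clients` threads against the pool and return the peak number of
# server connections in use at the same moment.
def peak_server_connections(clients:, pool_size:)
  pool = TransactionPool.new(pool_size)
  lock = Mutex.new
  in_use = 0
  peak = 0
  threads = Array.new(clients) do
    Thread.new do
      pool.transaction do
        lock.synchronize { in_use += 1; peak = [peak, in_use].max }
        sleep 0.001 # stand-in for query time
        lock.synchronize { in_use -= 1 }
      end
    end
  end
  threads.each(&:join)
  peak
end

puts peak_server_connections(clients: 200, pool_size: 20)
```

However many client threads run, the peak never exceeds the pool size, because a connection goes back to the pool as soon as each transaction finishes.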
Update Rails Configuration (config/database.yml):
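production:
  primary:
    host: pgbouncer.example.com # Point to PgBouncer instead of PostgreSQL
    port: 6432
    pool: <%= ENV.fetch("DB_POOL_SIZE") { 120 } %>
    # Assumption: needed with pool_mode = transaction on PgBouncer versions before 1.21,
    # which cannot track prepared statements across pooled connections
    prepared_statements: false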
User List (/etc/pgbouncer/userlist.txt):
# Generate with: echo "md5"$(echo -n "passwordusername" | md5sum | cut -d' ' -f1)
"ecommerce_app" "md5abc123..."
Health Checks
Application Health Check Endpoint
Create a health check endpoint for the load balancer to poll:
# config/routes.rb
get '/health', to: 'health#check'
# app/controllers/health_controller.rb
class HealthController < ApplicationController
  def check
    # Check database connection
    ActiveRecord::Base.connection.execute('SELECT 1')
    # Check cache
    Rails.cache.write('health_check', 'ok', expires_in: 1.second)
    Rails.cache.read('health_check')
    render json: { status: 'ok', timestamp: Time.current.iso8601 }
  rescue => e
    render json: { status: 'error', error: e.message }, status: 503
  end
end
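The controller above stops at the first failing dependency. One possible variation, sketched here without Rails so it runs on its own, is to run every check and report which ones failed. `health_report` is a hypothetical helper and the lambdas stand in for the real database and cache probes.

```ruby
require "json"

# Run each named check; a check passes unless it raises.
# Returns an HTTP status code and a JSON body listing any failures.
def health_report(checks)
  failures = {}
  checks.each do |name, check|
    check.call
  rescue StandardError => e
    failures[name] = e.message
  end
  status = failures.empty? ? 200 : 503
  body = { status: failures.empty? ? "ok" : "error", failures: failures }
  [status, JSON.generate(body)]
end

status, body = health_report({
  database: -> { true },              # stand-in for SELECT 1
  cache: -> { raise "cache timeout" } # simulated failure
})
puts status
puts body
```

Returning 503 on any failure keeps the load balancer behavior the same as the controller above, while the body tells an operator which dependency is down.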
Monitoring
Key Metrics to Monitor
- Request Rate: Requests per second per server
- Response Time: P50, P95, P99 latencies
- Error Rate: 4xx and 5xx errors
- Connection Pool: Database connection utilization
- Worker Utilization: CPU and memory per worker
- Queue Depth: Background job queue depths
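For the P50, P95, and P99 latencies listed above, the following sketch shows one common way to compute a percentile from raw samples, the nearest-rank method. Monitoring tools may use other interpolation rules, so treat this as an illustration; the `percentile` helper and the sample data are hypothetical.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# `pct` percent of all samples are less than or equal to it.
# `pct` is an integer from 1 to 100.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.size + 99) / 100 # integer ceiling of pct% of n
  sorted[[rank, 1].max - 1]
end

latencies_ms = (1..100).to_a.shuffle # hypothetical response times
[50, 95, 99].each do |pct|
  puts "P#{pct}: #{percentile(latencies_ms, pct)} ms"
end
```

Tracking P95 and P99 alongside P50 matters because a healthy median can hide a slow tail caused by lock contention or connection pool waits.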
Monitoring Tools
- Application Metrics: OrderMetricsService, DatabaseMetricsService
- Server Metrics: Prometheus, Datadog, New Relic
- Load Balancer Metrics: Nginx/HAProxy stats, Cloud provider metrics
Deployment Checklist
Step 1: Configure Puma
# Set environment variables
export WEB_CONCURRENCY=10
export RAILS_MAX_THREADS=5
Step 2: Deploy Application Servers
Deploy the application to 2-4 servers, each running the Puma configuration above.
Step 3: Configure Load Balancer
Set up Nginx or HAProxy with health checks.
Step 4: (Optional) Set Up PgBouncer
If using PgBouncer:
- Install and configure PgBouncer
- Update database.yml to point to PgBouncer
- Restart application servers
Step 5: Verify Configuration
# Check Puma workers
ps aux | grep puma
# Check database connections
rails runner "puts ActiveRecord::Base.connection_pool.size"
# Test load balancer
curl https://yourdomain.com/health
Performance Tuning
For Higher Loads (50k+ users)
- Increase Workers: WEB_CONCURRENCY=20
- Add More Servers: Scale to 4-8 servers
- Use PgBouncer: Reduce PostgreSQL connection overhead
- Enable Caching: Redis for shared cache
- CDN: Use CloudFlare or AWS CloudFront for static assets
Memory Considerations
- Each Puma worker: ~200-500MB RAM
- 10 workers: ~2-5GB RAM per server
- Ensure servers have sufficient RAM (8GB+ recommended)
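The memory figures above translate into a simple sizing calculation, sketched below. The per-worker range comes from this section; the 1000 MB reserved for the operating system and other processes is an assumed figure, and both method names are hypothetical.

```ruby
# Total memory used by Puma workers on one server, in MB.
def worker_memory_mb(workers:, mb_per_worker:)
  workers * mb_per_worker
end

# Largest worker count that fits in RAM after reserving space for the OS.
def max_workers(ram_mb:, mb_per_worker:, reserved_mb: 1000)
  (ram_mb - reserved_mb) / mb_per_worker
end

puts worker_memory_mb(workers: 10, mb_per_worker: 200)
puts worker_memory_mb(workers: 10, mb_per_worker: 500)
puts max_workers(ram_mb: 8000, mb_per_worker: 500)
```

At the upper estimate of 500 MB per worker, an 8 GB server with 1 GB reserved fits 14 workers, so 10 workers leaves some margin. Measure real worker memory under load before raising WEB_CONCURRENCY.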
Troubleshooting
“Too many connections” Error
Solution: Use PgBouncer or increase PostgreSQL max_connections
High Memory Usage
Solution:
- Reduce WEB_CONCURRENCY
- Enable preload_app! in Puma
- Monitor for memory leaks
Slow Response Times
Solution:
- Check database query performance
- Enable query caching
- Use CDN for static assets
- Optimize N+1 queries