# Application Scaling Playbook

## Application Scaling Guide for 20k Concurrent Users

This guide explains how to configure the Puma web server and set up load balancing to handle 20k concurrent users. Note that concurrent users are not the same as concurrent in-flight requests: most users are idle between requests, so the request concurrency figures below are much smaller than 20k. Validate them with a load test against your own traffic pattern.

## Puma Configuration

### Current Configuration

- **Workers**: 1 (default)
- **Threads**: 3 (default)
- **Concurrent requests**: 3 per server

### For 20k Concurrent Users

#### Single Server Configuration

- **Workers**: 10 processes
- **Threads**: 5 per worker
- **Concurrent requests**: 50 per server (10 × 5)

#### Multi-Server Configuration (Recommended)

- **Servers**: 2-4 application servers
- **Workers per server**: 10
- **Threads per worker**: 5
- **Total concurrent requests**: 100-200 (2-4 servers × 50)

### Configuration Files

#### `config/puma.rb`

```ruby
# Worker processes for handling 20k concurrent users
workers Integer(ENV.fetch("WEB_CONCURRENCY", 10))

# Threads per worker (min and max set to the same value)
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count

# Preload app for better memory efficiency (copy-on-write)
preload_app!

# Runs in each worker after it boots: open a fresh database connection
on_worker_boot do
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end

# Runs in the master process before forking workers:
# close connections so they are not shared across forks
before_fork do
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end
```

#### Environment Variables

```bash
# Puma configuration
WEB_CONCURRENCY=10
RAILS_MAX_THREADS=5
```

## Load Balancing

### Architecture

```
               ┌─────────────────┐
               │  Load Balancer  │
               │ (Nginx/HAProxy) │
               └────────┬────────┘
                        │
     ┌──────────────────┼──────────────────┐
     │                  │                  │
┌────▼────┐        ┌────▼────┐        ┌────▼────┐
│   App   │        │   App   │        │   App   │
│ Server 1│        │ Server 2│        │ Server 3│
│ 10×5=50 │        │ 10×5=50 │        │ 10×5=50 │
└────┬────┘        └────┬────┘        └────┬────┘
     │                  │                  │
     └──────────────────┼──────────────────┘
                        │
                ┌───────▼───────┐
                │  PostgreSQL   │
                │ (Primary DB)  │
                └───────────────┘
```

### Load Balancer Options

#### 1. Nginx (Recommended)

**Configuration** (`/etc/nginx/sites-available/ecommerce_app`):

```nginx
upstream ecommerce_app {
    # Round-robin load balancing (default)
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;

    # Optional: Weighted load balancing
    # server app1.example.com:3000 weight=3;
    # server app2.example.com:3000 weight=2;
    # server app3.example.com:3000 weight=1;

    # Optional: Passive health checks (mark a server down after repeated failures)
    # server app1.example.com:3000 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name yourdomain.com;

    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss;

    # Proxy settings
    location / {
        proxy_pass http://ecommerce_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 4k;
    }

    # Health check endpoint
    location /health {
        access_log off;
        proxy_pass http://ecommerce_app/health;
    }
}
```

#### 2. HAProxy

**Configuration** (`/etc/haproxy/haproxy.cfg`):

```haproxy
global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http_front
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

frontend https_front
    bind *:443 ssl crt /path/to/cert.pem
    default_backend ecommerce_app_backend

backend ecommerce_app_backend
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server app1 app1.example.com:3000 check
    server app2 app2.example.com:3000 check
    server app3 app3.example.com:3000 check

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
```

#### 3. Cloud Load Balancers

**AWS Application Load Balancer (ALB)**:
- Use target groups with health checks
- Configure SSL termination
- Sticky sessions are available but not required for a stateless app

**Google Cloud Load Balancer**:
- Configure backend services
- Set up health checks
- Enable SSL termination

**Azure Load Balancer**:
- Configure backend pools
- Set up health probes
- Azure Load Balancer operates at layer 4 and does not terminate SSL; use Azure Application Gateway if you need SSL termination

### Session Stickiness

**Not Required** - This application is stateless:
- No session state held on individual app servers
- JWT tokens for authentication
- Database-backed sessions (if needed)
- Redis for shared state (if needed)

If you need session stickiness for specific features:

```nginx
# Nginx sticky sessions
upstream ecommerce_app {
    ip_hash;  # Route by client IP
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;
}
```

## Database Connection Pooling

### Option 1: Direct PostgreSQL Connections (Current)

**Configuration**: Already configured in `config/database.yml`
- Primary DB: 120 connections
- Queue DB: 90 connections
- Each server: 50 connections (10 workers × 5 threads)

The `pool` value in `database.yml` is an upper bound per Puma worker process, and connections are opened lazily, so a web worker running 5 threads uses at most 5 connections from its pool.

**For 4 servers**: 4 × 50 = 200 connections needed
- PostgreSQL `max_connections` should be ≥ 200, plus headroom for queue database connections and admin sessions

### Option 2: PgBouncer (Optional, Recommended for High Load)

**Benefits**:
- Reduces PostgreSQL connection overhead
- Allows more application connections
- Better connection management

**Architecture**:

```
App Servers       →  PgBouncer  →  PostgreSQL
(200 connections)                  (~25 per database)
```

**Installation**:

```bash
# Ubuntu/Debian
sudo apt-get install pgbouncer

# macOS
brew install pgbouncer
```

**Configuration** (`/etc/pgbouncer/pgbouncer.ini`):

PgBouncer only treats `#` and `;` as comment markers at the start of a line, so each comment below sits on its own line rather than after a value.

```ini
[databases]
ecommerce_app_production = host=postgres.example.com port=5432 dbname=ecommerce_app_production
ecommerce_app_production_queue = host=postgres.example.com port=5432 dbname=ecommerce_app_production_queue

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Connection pooling mode: a server connection is held for one transaction at a time
pool_mode = transaction

# Pool sizes
# Max client connections
max_client_conn = 1000
# Server connections per user/database pair
default_pool_size = 25
# Reserved connections
reserve_pool_size = 5
# Max connections to PostgreSQL per database
max_db_connections = 100

# Timeouts
server_idle_timeout = 600
server_connect_timeout = 15
server_login_retry = 15

# Logging
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
```

**Update Rails Configuration** (`config/database.yml`):

```yaml
production:
  primary:
    host: pgbouncer.example.com  # Point to PgBouncer instead of PostgreSQL
    port: 6432
    pool: <%= ENV.fetch("DB_POOL_SIZE") { 120 } %>
    # Transaction pooling generally requires disabling prepared statements,
    # unless your PgBouncer version supports them
    prepared_statements: false
```

**User List** (`/etc/pgbouncer/userlist.txt`):

```bash
# Generate with: echo "md5"$(echo -n "passwordusername" | md5sum | cut -d' ' -f1)
"ecommerce_app" "md5abc123..."
```

## Health Checks

### Application Health Check Endpoint

Create a health check endpoint for the load balancer:

```ruby
# config/routes.rb
get '/health', to: 'health#check'

# app/controllers/health_controller.rb
class HealthController < ApplicationController
  def check
    # Check database connection
    ActiveRecord::Base.connection.execute('SELECT 1')

    # Check cache (raises if the cache store is unreachable)
    Rails.cache.write('health_check', 'ok', expires_in: 1.second)
    Rails.cache.read('health_check')

    render json: { status: 'ok', timestamp: Time.current.iso8601 }
  rescue StandardError => e
    # Return 503 so the load balancer takes this server out of rotation
    render json: { status: 'error', error: e.message }, status: 503
  end
end
```

## Monitoring

### Key Metrics to Monitor

1. **Request Rate**: Requests per second per server
2. **Response Time**: P50, P95, P99 latencies
3. **Error Rate**: 4xx and 5xx errors
4. **Connection Pool**: Database connection utilization
5. **Worker Utilization**: CPU and memory per worker
6. **Queue Depth**: Background job queue depths

### Monitoring Tools

- **Application Metrics**: `OrderMetricsService`, `DatabaseMetricsService`
- **Server Metrics**: Prometheus, Datadog, New Relic
- **Load Balancer Metrics**: Nginx/HAProxy stats, cloud provider metrics

## Deployment Checklist

### Step 1: Configure Puma

```bash
# Set environment variables
export WEB_CONCURRENCY=10
export RAILS_MAX_THREADS=5
```

### Step 2: Deploy Application Servers

Deploy to 2-4 servers with the Puma configuration above.

### Step 3: Configure Load Balancer

Set up Nginx or HAProxy with health checks.

### Step 4: (Optional) Set Up PgBouncer

If using PgBouncer:
1. Install and configure PgBouncer
2. Update `database.yml` to point to PgBouncer
3. Restart application servers

### Step 5: Verify Configuration

```bash
# Check Puma workers
ps aux | grep puma

# Check the configured connection pool size (not the number of open connections)
rails runner "puts ActiveRecord::Base.connection_pool.size"

# Test load balancer
curl https://yourdomain.com/health
```

## Performance Tuning

### For Higher Loads (50k+ users)

1. **Increase Workers**: `WEB_CONCURRENCY=20`, provided the server has enough CPU cores and RAM
2. **Add More Servers**: Scale to 4-8 servers
3. **Use PgBouncer**: Reduce PostgreSQL connection overhead
4. **Enable Caching**: Redis for shared cache
5. **CDN**: Use Cloudflare or AWS CloudFront for static assets

### Memory Considerations

- Each Puma worker: ~200-500MB RAM
- 10 workers: ~2-5GB RAM per server
- Ensure servers have sufficient RAM (8GB+ recommended)

## Troubleshooting

### "Too many connections" Error

**Solution**: Use PgBouncer or increase PostgreSQL `max_connections`

### High Memory Usage

**Solution**:
- Reduce `WEB_CONCURRENCY`
- Enable `preload_app!` in Puma
- Monitor for memory leaks

### Slow Response Times

**Solution**:
- Check database query performance
- Enable query caching
- Use CDN for static assets
- Optimize N+1 queries

## References

- [Puma Configuration](https://github.com/puma/puma#configuration)
- [Nginx Load Balancing](https://nginx.org/en/docs/http/load_balancing.html)
- [HAProxy Configuration](http://www.haproxy.org/#docs)
- [PgBouncer Documentation](https://www.pgbouncer.org/config.html)
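
## Appendix: Connection Budget Arithmetic

The Database Connection Pooling section works out connection demand as servers × workers × threads (4 × 10 × 5 = 200). As a minimal sketch of that arithmetic, the snippet below computes the same budget and checks it against a PostgreSQL `max_connections` value. `ConnectionBudget` and its `reserved` parameter are hypothetical names introduced here for illustration; they are not part of the application or of any library, and the reserve of 20 is an assumed figure, not a recommendation.

```ruby
# Hypothetical helper for illustration only; not part of the application.
ConnectionBudget = Struct.new(:servers, :workers, :threads, keyword_init: true) do
  # Peak database connections one app server can open (one per Puma thread).
  def per_server
    workers * threads
  end

  # Peak database connections across all app servers.
  def total
    servers * per_server
  end

  # True when the web tier plus a reserve (admin sessions, job workers)
  # fits within PostgreSQL's max_connections.
  def fits?(max_connections, reserved: 0)
    total + reserved <= max_connections
  end
end

budget = ConnectionBudget.new(servers: 4, workers: 10, threads: 5)

puts budget.per_server                  # 50
puts budget.total                       # 200
puts budget.fits?(200)                  # true: exactly at the limit
puts budget.fits?(200, reserved: 20)    # false: no headroom left for anything else
```

The last line shows why a `max_connections` of exactly 200 is tight for four servers: any connection opened outside the web tier pushes the total over the limit, which is the situation PgBouncer is meant to relieve.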