Application Scaling Playbook

## Application Scaling Guide for 20k Concurrent Users

This guide explains how to configure the Puma web server and set up load balancing to handle 20k concurrent users.

## Puma Configuration

### Current Configuration
- **Workers**: 1 (default)
- **Threads**: 3 (default)
- **Concurrent requests**: 3 per server

### For 20k Concurrent Users

#### Single Server Configuration
- **Workers**: 10 processes
- **Threads**: 5 per worker
- **Concurrent requests**: 50 per server (10 × 5)

#### Multi-Server Configuration (Recommended)
- **Servers**: 2-4 application servers
- **Workers per server**: 10
- **Threads per worker**: 5
- **Total concurrent requests**: 100-200 (2-4 servers × 50)
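
Concurrent users and concurrent requests are different quantities: most users are idle between clicks, so 20k users produce far fewer than 20k in-flight requests. The sketch below applies Little's law (in-flight requests = arrival rate × response time) to relate the two. The think time and response time are assumed, illustrative values rather than measurements from this application, and the helper names are hypothetical.

```ruby
# Rough capacity sketch; every input below is an assumed, illustrative value.
def in_flight_requests(users:, think_time_s:, response_time_s:)
  # Each user issues one request per (think time + response time) cycle.
  requests_per_second = users / (think_time_s + response_time_s)
  # Little's law: in-flight requests = arrival rate * time in system.
  (requests_per_second * response_time_s).ceil
end

# Servers required when each one runs workers * threads_per_worker threads.
def servers_needed(threads_needed, workers: 10, threads_per_worker: 5)
  (threads_needed.to_f / (workers * threads_per_worker)).ceil
end

in_flight = in_flight_requests(users: 20_000, think_time_s: 30.0, response_time_s: 0.2)
puts "in-flight requests: #{in_flight}"
puts "servers at 50 threads each: #{servers_needed(in_flight)}"
```

With these assumed inputs the estimate falls inside the 2-4 server range above. Shorter think times or slower responses raise it quickly, so it is worth re-running with measured values.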

### Configuration Files

#### `config/puma.rb`

```ruby
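# Worker processes for handling 20k concurrent users
workers Integer(ENV.fetch("WEB_CONCURRENCY", 10))

# Threads per worker (minimum and maximum set to the same value)
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads threads_count, threads_count

# Preload the app in the master process so workers share memory via copy-on-write
preload_app!

# Runs in each worker after it is forked: open a fresh database connection
on_worker_boot do
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end

# Runs in the master process before workers are forked: close connections
# so they are not shared across processes
before_fork do
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end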
```

#### Environment Variables

```bash
# Puma configuration
WEB_CONCURRENCY=10
RAILS_MAX_THREADS=5
```

## Load Balancing

### Architecture

```
                  ┌─────────────────┐
                  │  Load Balancer  │
                  │ (Nginx/HAProxy) │
                  └────────┬────────┘
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
   ┌────▼────┐        ┌────▼────┐       ┌────▼────┐
   │ App     │        │ App     │       │ App     │
   │ Server 1│        │ Server 2│       │ Server 3│
   │ 10×5=50 │        │ 10×5=50 │       │ 10×5=50 │
   └─────────┘        └─────────┘       └─────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘
                           │
                    ┌──────▼────────┐
                    │  PostgreSQL   │
                    │  (Primary DB) │
                    └───────────────┘
```

### Load Balancer Options

#### 1. Nginx (Recommended)

**Configuration** (`/etc/nginx/sites-available/ecommerce_app`):

```nginx
upstream ecommerce_app {
    # Round-robin load balancing (default)
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;
    
    # Optional: Weighted load balancing
    # server app1.example.com:3000 weight=3;
    # server app2.example.com:3000 weight=2;
    # server app3.example.com:3000 weight=1;
    
    # Optional: Passive health checks (stop routing to a server after repeated failures)
    # server app1.example.com:3000 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name yourdomain.com;

    # Redirect HTTP to HTTPS
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name yourdomain.com;

    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;

    # Gzip compression
    gzip on;
    gzip_vary on;
    gzip_min_length 1024;
    gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss;

    # Proxy settings
    location / {
        proxy_pass http://ecommerce_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        
        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 4k;
    }

    # Health check endpoint
    location /health {
        access_log off;
        proxy_pass http://ecommerce_app/health;
    }
}
```
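
To make the effect of the optional `weight` parameters concrete, here is a minimal Ruby sketch of smooth weighted round-robin selection. It illustrates the general technique only. The class name is hypothetical, and Nginx's internal implementation may differ in detail.

```ruby
# Smooth weighted round-robin over a fixed set of backends.
class WeightedRoundRobin
  def initialize(weights)
    @weights = weights                          # backend name => weight
    @current = weights.transform_values { 0 }   # running score per backend
    @total = weights.values.sum
  end

  # Returns the name of the backend that should receive the next request.
  def next_server
    # Every backend earns its weight on each round.
    @weights.each { |name, weight| @current[name] += weight }
    # The backend with the highest running score is chosen...
    chosen = @current.max_by { |_name, score| score }.first
    # ...and pays back the total weight, which spreads its turns out.
    @current[chosen] -= @total
    chosen
  end
end

balancer = WeightedRoundRobin.new({ "app1" => 3, "app2" => 2, "app3" => 1 })
picks = Array.new(6) { balancer.next_server }
puts picks.tally.inspect
```

Over any six consecutive picks, `app1` receives three requests, `app2` two, and `app3` one, matching the 3:2:1 weights in the commented-out configuration.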

#### 2. HAProxy

**Configuration** (`/etc/haproxy/haproxy.cfg`):

```haproxy
global
    log /dev/log local0
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode http
    option httplog
    option dontlognull
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http_front
    bind *:80
    redirect scheme https code 301 if !{ ssl_fc }

frontend https_front
    bind *:443 ssl crt /path/to/cert.pem
    default_backend ecommerce_app_backend

backend ecommerce_app_backend
    balance roundrobin
    option httpchk GET /health
    http-check expect status 200
    server app1 app1.example.com:3000 check
    server app2 app2.example.com:3000 check
    server app3 app3.example.com:3000 check

listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 30s
```

#### 3. Cloud Load Balancers

**AWS Application Load Balancer (ALB)**:
- Use target groups with health checks
- Configure SSL termination
- Sticky sessions are optional (not required for a stateless app)

**Google Cloud Load Balancer**:
- Configure backend services
- Set up health checks
- Enable SSL termination

**Azure Application Gateway** (Azure Load Balancer works at layer 4 and does not terminate SSL):
- Configure backend pools
- Set up health probes
- Enable SSL termination

### Session Stickiness

**Not Required** - This application is stateless:
- No session state held in application server memory
- JWT tokens for authentication
- Any session data that is needed is stored in the database
- Redis for shared state (if needed)

If you need session stickiness for specific features:
```nginx
# Nginx sticky sessions
upstream ecommerce_app {
    ip_hash;  # Route by client IP
    server app1.example.com:3000;
    server app2.example.com:3000;
    server app3.example.com:3000;
}
```

## Database Connection Pooling

### Option 1: Direct PostgreSQL Connections (Current)

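**Configuration**: Already configured in `config/database.yml`
- Primary DB: pool of 120 connections
- Queue DB: pool of 90 connections
- Each server: up to 50 connections in use per database (10 workers × 5 threads)

The `pool` value is an upper bound for each Puma worker process, not for the whole server. Active Record opens connections lazily, so a worker running 5 threads normally holds about 5 connections per database.

**For 4 servers**: 4 × 50 = 200 connections needed for the primary database
- PostgreSQL `max_connections` should be ≥ 200, plus headroom for the queue database, background job processes, and administrative connections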
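
The connection arithmetic above can be checked with a short script. This is a sketch that assumes each Puma thread holds at most one connection per database. The `headroom` figure and the helper name are illustrative assumptions, not values taken from this application.

```ruby
# Estimates how many PostgreSQL connections a Puma fleet can open.
# Assumes each thread holds at most one connection per database.
def connections_needed(servers:, workers:, threads:, databases: 1, headroom: 0)
  servers * workers * threads * databases + headroom
end

primary_only = connections_needed(servers: 4, workers: 10, threads: 5)
with_queue = connections_needed(servers: 4, workers: 10, threads: 5,
                                databases: 2, headroom: 20)

puts "primary database only: #{primary_only}"
puts "primary + queue + headroom: #{with_queue}"
```

If both databases live on the same PostgreSQL server, the second figure is the one to compare against `max_connections`.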

### Option 2: PgBouncer (Optional, Recommended for High Load)

**Benefits**:
- Reduces PostgreSQL connection overhead
- Allows more application connections
- Better connection management

**Architecture**:
```
App Servers → PgBouncer → PostgreSQL
(200 connections) (20 connections)
```

**Installation**:

```bash
# Ubuntu/Debian
sudo apt-get install pgbouncer

# macOS
brew install pgbouncer
```

**Configuration** (`/etc/pgbouncer/pgbouncer.ini`):

```ini
[databases]
ecommerce_app_production = host=postgres.example.com port=5432 dbname=ecommerce_app_production
ecommerce_app_production_queue = host=postgres.example.com port=5432 dbname=ecommerce_app_production_queue

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Connection pooling mode: a server connection is assigned per transaction
pool_mode = transaction

# Pool sizes
# Max client connections
max_client_conn = 1000
# Server connections per user/database pair
default_pool_size = 25
# Additional reserve connections
reserve_pool_size = 5
# Max connections to PostgreSQL per database
max_db_connections = 100

# Timeouts
server_idle_timeout = 600
server_connect_timeout = 15
server_login_retry = 15

# Logging
logfile = /var/log/pgbouncer/pgbouncer.log
pidfile = /var/run/pgbouncer/pgbouncer.pid
```

**Update Rails Configuration** (`config/database.yml`):

```yaml
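production:
  primary:
    host: pgbouncer.example.com  # Point to PgBouncer instead of PostgreSQL
    port: 6432
    pool: <%= ENV.fetch("DB_POOL_SIZE") { 120 } %>
    # Transaction pooling does not preserve session state between transactions.
    # Prepared statements are disabled here on the assumption that PgBouncer
    # is not configured to track them.
    prepared_statements: false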

**User List** (`/etc/pgbouncer/userlist.txt`):

```bash
# Generate with: echo "md5"$(echo -n "passwordusername" | md5sum | cut -d' ' -f1)
"ecommerce_app" "md5abc123..."
```
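
The same entry can be produced from Ruby, which avoids shell quoting problems with unusual passwords. This sketch mirrors the shell command above: the hash is the MD5 of the password concatenated with the username, prefixed with `md5`. The helper name and the example password are placeholders.

```ruby
require "digest/md5"

# Builds one userlist.txt line: "username" "md5<md5(password + username)>"
def userlist_entry(username, password)
  digest = Digest::MD5.hexdigest(password + username)
  %("#{username}" "md5#{digest}")
end

# "example-password" is a placeholder, not a real credential.
puts userlist_entry("ecommerce_app", "example-password")
```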

## Health Checks

### Application Health Check Endpoint

Create a health check endpoint for the load balancer to poll:

```ruby
# config/routes.rb
get '/health', to: 'health#check'

# app/controllers/health_controller.rb
class HealthController < ApplicationController
  def check
    # Check database connection
    ActiveRecord::Base.connection.execute('SELECT 1')
    
    # Check cache: confirm the value just written can be read back
    Rails.cache.write('health_check', 'ok', expires_in: 1.second)
    raise 'cache read failed' unless Rails.cache.read('health_check') == 'ok'
    
    render json: { status: 'ok', timestamp: Time.current.iso8601 }
  rescue => e
    render json: { status: 'error', error: e.message }, status: 503
  end
end
```
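
For scripted checks outside the load balancer, for example in a deploy pipeline, a small client can apply the same rule: HTTP 200 plus a JSON body whose `status` is `ok`. This is a sketch using only the Ruby standard library. The method names and the `HEALTH_URL` environment variable are assumptions made for this example.

```ruby
require "json"
require "net/http"
require "uri"

# True when the response matches the healthy payload rendered by the endpoint.
def healthy_response?(status_code, body)
  return false unless status_code == 200

  payload = JSON.parse(body)
  payload.is_a?(Hash) && payload["status"] == "ok"
rescue JSON::ParserError
  false
end

# Fetches the endpoint; network errors are treated as unhealthy.
def check_health(url)
  response = Net::HTTP.get_response(URI(url))
  healthy_response?(response.code.to_i, response.body)
rescue StandardError
  false
end

# Runs only when a URL is supplied, e.g. HEALTH_URL=https://yourdomain.com/health
puts check_health(ENV["HEALTH_URL"]) if ENV["HEALTH_URL"]
```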

## Monitoring

### Key Metrics to Monitor

1. **Request Rate**: Requests per second per server
2. **Response Time**: P50, P95, P99 latencies
3. **Error Rate**: 4xx and 5xx errors
4. **Connection Pool**: Database connection utilization
5. **Worker Utilization**: CPU and memory per worker
6. **Queue Depth**: Background job queue depths
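
As a reference for how the P50, P95, and P99 figures are derived, the following sketch computes nearest-rank percentiles from a list of latency samples. Monitoring tools often use interpolated or approximate methods instead, so their numbers can differ slightly. The sample data is made up.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# `pct` percent of all samples are less than or equal to it.
def percentile(samples, pct)
  raise ArgumentError, "samples must not be empty" if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

latencies_ms = (1..100).to_a.shuffle # made-up sample data

[50, 95, 99].each do |pct|
  puts "P#{pct}: #{percentile(latencies_ms, pct)} ms"
end
```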

### Monitoring Tools

- **Application Metrics**: `OrderMetricsService`, `DatabaseMetricsService`
- **Server Metrics**: Prometheus, Datadog, New Relic
- **Load Balancer Metrics**: Nginx/HAProxy stats, Cloud provider metrics

## Deployment Checklist

### Step 1: Configure Puma

```bash
# Set environment variables
export WEB_CONCURRENCY=10
export RAILS_MAX_THREADS=5
```

### Step 2: Deploy Application Servers

Deploy to 2-4 servers with Puma configuration.

### Step 3: Configure Load Balancer

Set up Nginx or HAProxy with health checks.

### Step 4: (Optional) Set Up PgBouncer

If using PgBouncer:
1. Install and configure PgBouncer
2. Update `database.yml` to point to PgBouncer
3. Restart application servers

### Step 5: Verify Configuration

```bash
# Check Puma workers
ps aux | grep puma

# Check the configured connection pool size (per process, not live connections)
rails runner "puts ActiveRecord::Base.connection_pool.size"

# Test load balancer
curl https://yourdomain.com/health
```

## Performance Tuning

### For Higher Loads (50k+ users)

1. **Increase Workers**: `WEB_CONCURRENCY=20`
2. **Add More Servers**: Scale to 4-8 servers
3. **Use PgBouncer**: Reduce PostgreSQL connection overhead
4. **Enable Caching**: Redis for shared cache
5. **CDN**: Use Cloudflare or AWS CloudFront for static assets

### Memory Considerations

- Each Puma worker: ~200-500MB RAM
- 10 workers: ~2-5GB RAM per server
- Ensure servers have sufficient RAM (8GB+ recommended)
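
The memory figures above can be turned into a quick worker-count check. This is a sketch; the amount reserved for the operating system and other processes is an assumed value, and the helper name is hypothetical.

```ruby
# Largest worker count whose combined memory fits in the RAM left after
# reserving some for the operating system and other processes.
def max_workers(total_ram_mb:, per_worker_mb:, reserved_mb:)
  [(total_ram_mb - reserved_mb) / per_worker_mb, 0].max
end

# 8 GB server, worst-case 500 MB per worker, 2 GB reserved (assumed).
puts max_workers(total_ram_mb: 8192, per_worker_mb: 500, reserved_mb: 2048)
```

With these inputs, 10 workers fit on an 8GB server even at the upper end of the per-worker estimate.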

## Troubleshooting

### "Too many connections" Error

**Solution**: Use PgBouncer or increase PostgreSQL `max_connections`

### High Memory Usage

**Solution**: 
- Reduce `WEB_CONCURRENCY`
- Enable `preload_app!` in Puma
- Monitor for memory leaks

### Slow Response Times

**Solution**:
- Check database query performance
- Cache the results of expensive queries
- Use CDN for static assets
- Optimize N+1 queries

## References

- [Puma Configuration](https://github.com/puma/puma#configuration)
- [Nginx Load Balancing](https://nginx.org/en/docs/http/load_balancing.html)
- [HAProxy Configuration](http://www.haproxy.org/#docs)
- [PgBouncer Documentation](https://www.pgbouncer.org/config.html)
