A taxonomy of error handling in a Rails monolith
Error Handling & Resilience Guide
This guide describes the error handling and resilience mechanisms implemented for the order system.
Overview
The system implements multi-layered error handling for:
- Inventory Conflicts - Retry with exponential backoff, user notifications
- Payment Failures - Inventory release, error logging, user notifications, retry support
- Database Deadlocks - Automatic retry, exponential backoff, queue fallback
Error Handling Architecture
OrderErrorHandler Service
Centralized error handling service (app/services/order_error_handler.rb) that provides:
- Consistent error handling across all order operations
- Retry logic with exponential backoff
- User notifications
- Inventory release on failures
- Queue-based fallback for deadlocks
Error Scenarios
1. Inventory Conflicts
Scenario: User tries to order items that are out of stock or that have insufficient inventory available.
Handling:
- Retry with Exponential Backoff
  - Automatic retry, up to 3 attempts
  - Exponential backoff: 0.1s, 0.2s, 0.4s (doubling on each retry)
  - Max delay: 10 seconds
- User Notification
  - Email notification with unavailable items list
  - Clear error messages indicating which items are out of stock
  - Available quantities shown
- Partial Order Fulfillment (if applicable)
  - Create order with available items
  - Mark unavailable items in order metadata
  - Notify user of partial fulfillment
Implementation:
result = OrderErrorHandler.handle_inventory_conflict(
  order: order,
  unavailable_items: [
    { variant_id: 1, variant_name: "Product A", requested: 2, available: 0 }
  ],
  user: user,
  reservations: reservations,
  retry_count: 0,
  max_retries: 3
)
User Experience:
- Clear error message: “Product A: Only 0 available (requested 2)”
- Email notification with details
- Option to retry with updated cart
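The retry behaviour described above could be driven from the calling code roughly as follows. This is a minimal sketch, not the service's actual implementation: `reserve_with_retries`, its `sleeper` parameter, and the block contract are hypothetical names introduced for illustration. Only the `success`, `retry`, and `retry_delay` keys come from the result format used in this guide, and the sketch assumes `max_retries` counts retries after the first attempt.

```ruby
# Hypothetical helper: runs the block and retries while the returned result
# hash asks for it. The block receives the current retry_count and is assumed
# to return a hash shaped like the handler results in this guide.
def reserve_with_retries(max_retries: 3, sleeper: ->(seconds) { sleep(seconds) })
  retry_count = 0
  loop do
    result = yield(retry_count)

    # Stop on success, when the handler says not to retry,
    # or when the retry budget is spent.
    return result if result[:success] || !result[:retry] || retry_count >= max_retries

    sleeper.call(result.fetch(:retry_delay, 0))
    retry_count += 1
  end
end

# Usage: the block stands in for a reservation attempt that succeeds on the third try.
result = reserve_with_retries(sleeper: ->(_seconds) {}) do |retry_count|
  if retry_count < 2
    { success: false, retry: true, retry_delay: 0.1 * (2**retry_count) }
  else
    { success: true }
  end
end
puts result.inspect
```

Passing the sleep function in as a parameter keeps the sketch easy to test without real delays.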
2. Payment Failures
Scenario: Payment gateway returns failure or payment processing error.
Handling:
- Error Logging
  - Detailed error information logged
  - Sentry integration for error tracking
  - Payment gateway error details captured
- Inventory Release
  - Automatic release of inventory reservations
  - Prevents inventory from being held indefinitely
  - Ensures inventory is available for other users
- User Notification
  - Email notification of payment failure
  - Clear message explaining what happened
  - Instructions for retry
- Retry Support
  - Order remains in pending state
  - User can retry payment
  - New payment intent created on retry
Implementation:
result = OrderErrorHandler.handle_payment_failure(
  order: order,
  payment_intent: payment_intent,
  error: payment_error,
  user: user,
  gateway_type: "razorpay"
)
User Experience:
- Email: “Payment Failed for Order #12345”
- Order status: payment_status: :failed
- Inventory released automatically
- Can retry payment from order page
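The payment failure steps (log, release inventory, notify, leave the order retryable) can be sketched in plain Ruby. Everything here is an illustrative assumption rather than the real service: the `Reservation` struct stands in for the application's reservation model, and `handle_payment_failure_sketch`, `log`, and `notifier` are hypothetical names.

```ruby
# Stand-in for an inventory reservation; the real model is not shown in this guide.
Reservation = Struct.new(:variant_id, :quantity, :released) do
  def release!
    self.released = true
  end
end

# Hypothetical sketch of the steps listed above. `log` and `notifier` are
# any callables, so no logging or mail library is needed to run it.
def handle_payment_failure_sketch(order_id:, error:, reservations:, gateway_type:, log:, notifier:)
  log.call("Payment failed for order #{order_id} via #{gateway_type}: #{error.message}")

  # Release every reservation so stock is not held by a failed order.
  reservations.each(&:release!)

  notifier.call(order_id, error.message)

  {
    success: false,
    retry: false, # assumption: the user retries manually from the order page
    released_inventory: true,
    notify_user: true,
    errors: [error.message],
    message: "Payment failed. You can retry from the order page."
  }
end

# Usage with in-memory stand-ins for the logger and mailer.
reservations = [Reservation.new(1, 2, false)]
result = handle_payment_failure_sketch(
  order_id: 12345,
  error: StandardError.new("card declined"),
  reservations: reservations,
  gateway_type: "razorpay",
  log: ->(line) { warn(line) },
  notifier: ->(_order_id, _reason) {}
)
puts result[:message]
```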
3. Database Deadlocks
Scenario: Concurrent transactions cause database deadlocks or serialization failures.
Handling:
- Automatic Retry
  - Up to 3 retry attempts
  - Exponential backoff: 0.1s, 0.2s, 0.4s
  - Tracks retry count and delays
- Exponential Backoff
  - Formula: base_delay * (2 ** retry_count)
  - Base delay: 0.1 seconds
  - Max delay: 10 seconds
  - Helps avoid a thundering herd
- Queue-Based Fallback
  - If retries are exhausted, enqueue a background job
  - Process asynchronously to avoid blocking
  - User notified that request is queued
Implementation:
result = OrderErrorHandler.handle_database_deadlock(
  error: deadlock_error,
  operation: :create_order,
  context: { user_id: user.id, items_count: 5 },
  retry_count: 0,
  max_retries: 3
)
User Experience:
- First 3 attempts: Automatic retry (transparent to user)
- After 3 attempts: “Request queued due to high load. You will be notified when processing completes.”
- Email notification when order is processed
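The retry-then-queue flow for deadlocks can be sketched as a wrapper around the failing operation. This is an illustration under stated assumptions: `DeadlockError` is a stand-in so the example runs without Rails (in a Rails app the rescued class would more likely be `ActiveRecord::Deadlocked`), and `with_deadlock_retry`, `backoff_delay`, `enqueue`, and `sleeper` are hypothetical names.

```ruby
# Stand-in error class so the sketch runs without Rails.
class DeadlockError < StandardError; end

def backoff_delay(retry_count, base_delay: 0.1, max_delay: 10.0)
  [base_delay * (2**retry_count), max_delay].min
end

# Hypothetical wrapper: runs the block, retries on deadlock with exponential
# backoff, and calls `enqueue` (any callable) once the retries are exhausted.
def with_deadlock_retry(enqueue:, max_retries: 3, sleeper: ->(seconds) { sleep(seconds) })
  retry_count = 0
  begin
    value = yield
    { success: true, value: value, retry_count: retry_count, queued: false }
  rescue DeadlockError => e
    if retry_count < max_retries
      sleeper.call(backoff_delay(retry_count))
      retry_count += 1
      retry # re-runs the begin block with the updated retry_count
    end

    # Retries exhausted: hand the work to a background queue.
    enqueue.call
    {
      success: false,
      queued: true,
      fallback_to_queue: true,
      retry_count: retry_count,
      errors: [e.message],
      message: "Request queued due to high load. You will be notified when processing completes."
    }
  end
end

# Usage: an operation that always deadlocks ends up queued after three retries.
delays = []
result = with_deadlock_retry(enqueue: -> {}, sleeper: ->(s) { delays << s }) do
  raise DeadlockError, "deadlock detected"
end
puts result[:message]
puts delays.size
```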
Error Response Format
All error handlers return consistent result hashes:
{
  success: false,                 # true or false
  retry: true,                    # Whether to retry (true or false)
  retry_delay: 0.2,               # Seconds to wait before retry
  retry_count: 1,                 # Current retry attempt
  fallback_to_queue: false,       # Whether to queue for async processing
  notify_user: true,              # Whether user was notified
  released_inventory: true,       # Whether inventory was released
  errors: ["Error message"],      # Array of error messages
  message: "User-friendly message",
  unavailable_items: [...],       # For inventory conflicts
  queued: false                   # Whether request was queued
}
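Because every handler returns the same keys, a caller can branch on the result in one place. The helper below is a hypothetical sketch (the name `next_action` and the returned symbols are not part of the service); it only shows one reasonable precedence among the `success`, `queued`, `fallback_to_queue`, and `retry` flags.

```ruby
# Hypothetical helper: maps a handler result hash onto the caller's next step.
# Missing keys are treated as false.
def next_action(result)
  return :done if result[:success]
  return :wait_for_queue if result[:queued] || result[:fallback_to_queue]
  return :retry if result[:retry]

  :show_errors
end

puts next_action({ success: false, retry: true, retry_delay: 0.2 })
puts next_action({ success: false, queued: true })
```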
Exponential Backoff Algorithm
def calculate_exponential_backoff(retry_count, base_delay: 0.1, max_delay: 10.0)
  delay = base_delay * (2 ** retry_count)
  [delay, max_delay].min
end
Retry Delays:
- Attempt 1: 0.1 seconds
- Attempt 2: 0.2 seconds
- Attempt 3: 0.4 seconds
- Attempt 4: 0.8 seconds
- Max: 10.0 seconds
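As a worked example, the snippet below evaluates the same formula for `retry_count` 0 through 8 with the default settings, to show where the 10-second cap takes effect. The range of counts is chosen only for illustration; the default configuration stops after 3 retries.

```ruby
def calculate_exponential_backoff(retry_count, base_delay: 0.1, max_delay: 10.0)
  delay = base_delay * (2**retry_count)
  [delay, max_delay].min
end

# Delays for retry_count 0..8, rounded to one decimal place for display.
delays = (0..8).map { |n| calculate_exponential_backoff(n).round(1) }
puts delays.inspect
# [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 10.0, 10.0]
```

The uncapped delay first exceeds the maximum at `retry_count` 7 (0.1 * 128 = 12.8 seconds), so from that point on every delay is 10.0 seconds.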
User Notifications
Email Notifications
All error scenarios trigger email notifications:
- Inventory Conflict (OrderMailer.inventory_conflict)
  - Lists unavailable items
  - Shows available quantities
  - Provides link to update cart
- Payment Failure (OrderMailer.payment_failed)
  - Explains payment failure
  - Provides retry instructions
  - Includes order details
- Partial Fulfillment (OrderMailer.partial_fulfillment)
  - Lists fulfilled items
  - Lists unavailable items
  - Explains next steps
Event System
All errors emit events to the Rails event system:
- order.inventory_conflict
- order.payment_failed
- order.partial_fulfillment
- order.deadlock_retry
Metrics & Monitoring
Tracked Metrics
- Inventory Conflicts
  - inventory.reservation.conflicts.total
  - inventory.reservation.conflicts.variant.#{variant_id}
- Payment Failures
  - payments.failed.total
  - payments.failed.gateway.#{gateway_type}
- Database Deadlocks
  - database.deadlock.total
  - database.serialization_failure.total
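The interpolated metric names above can be built by a small helper. This is a sketch under assumptions: `metric_keys` and its `kind` symbols are hypothetical, and the in-memory `Hash` counter only stands in for whatever metrics client the application actually uses.

```ruby
# Hypothetical helper: returns the counter names listed above for one event.
def metric_keys(kind, variant_id: nil, gateway_type: nil)
  case kind
  when :inventory_conflict
    keys = ["inventory.reservation.conflicts.total"]
    keys << "inventory.reservation.conflicts.variant.#{variant_id}" if variant_id
    keys
  when :payment_failure
    keys = ["payments.failed.total"]
    keys << "payments.failed.gateway.#{gateway_type}" if gateway_type
    keys
  when :deadlock
    ["database.deadlock.total"]
  when :serialization_failure
    ["database.serialization_failure.total"]
  else
    raise ArgumentError, "unknown metric kind: #{kind.inspect}"
  end
end

# In-memory counter standing in for a real metrics client.
counters = Hash.new(0)
metric_keys(:payment_failure, gateway_type: "razorpay").each { |key| counters[key] += 1 }
puts counters.inspect
```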
Sentry Integration
All errors are reported to Sentry with:
- Error class and message
- Order context (ID, number, user)
- Payment intent details (if applicable)
- Gateway information
- Retry count
Best Practices
- Always Release Inventory on Failure
  - Prevents inventory from being held indefinitely
  - Ensures fair access for all users
- Notify Users Promptly
  - Email notifications sent immediately
  - Clear, actionable error messages
- Log Detailed Information
  - Full error context for debugging
  - Retry attempts tracked
  - Performance metrics recorded
- Graceful Degradation
  - Queue-based fallback for high load
  - Partial fulfillment when possible
  - Clear user communication
- Monitor Error Rates
  - Track error frequencies
  - Alert on high error rates
  - Analyze error patterns
Testing
Test Error Scenarios
# Test inventory conflict
RSpec.describe "Inventory Conflict Handling" do
  it "retries with exponential backoff" do
    # Mock inventory conflict
    # Verify retry logic
    # Check user notification
  end
end

# Test payment failure
RSpec.describe "Payment Failure Handling" do
  it "releases inventory and notifies user" do
    # Mock payment failure
    # Verify inventory release
    # Check email sent
  end
end

# Test database deadlock
RSpec.describe "Database Deadlock Handling" do
  it "retries and falls back to queue" do
    # Mock deadlock
    # Verify retry logic
    # Check queue fallback
  end
end
Configuration
Retry Settings
# Maximum retry attempts for order creation
ORDER_CREATION_MAX_RETRIES=3
# Base delay for exponential backoff (seconds)
ORDER_RETRY_BASE_DELAY=0.1
# Maximum delay for exponential backoff (seconds)
ORDER_RETRY_MAX_DELAY=10.0
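One way these variables might be read at boot is sketched below. `load_retry_settings` is a hypothetical helper, not part of the service; it assumes the defaults are the values shown above and uses strict conversions so a malformed value fails loudly instead of silently becoming zero.

```ruby
# Hypothetical loader: reads the retry settings from the environment and
# falls back to the defaults shown above when a variable is unset.
# Integer() and Float() raise ArgumentError on malformed input.
def load_retry_settings(env = ENV)
  {
    max_retries: Integer(env.fetch("ORDER_CREATION_MAX_RETRIES", "3")),
    base_delay: Float(env.fetch("ORDER_RETRY_BASE_DELAY", "0.1")),
    max_delay: Float(env.fetch("ORDER_RETRY_MAX_DELAY", "10.0"))
  }
end

# Usage: a plain Hash can stand in for ENV, which is handy in tests.
settings = load_retry_settings({ "ORDER_CREATION_MAX_RETRIES" => "5" })
puts settings.inspect
```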
Troubleshooting
High Inventory Conflict Rate
Symptom: Many inventory conflicts reported
Solution:
- Check inventory levels
- Review reservation expiration times
- Consider increasing inventory
- Analyze concurrent order patterns
High Payment Failure Rate
Symptom: Many payment failures
Solution:
- Check payment gateway status
- Review payment gateway logs
- Verify payment gateway configuration
- Check for gateway-specific issues
Frequent Database Deadlocks
Symptom: Many deadlock errors
Solution:
- Review transaction isolation levels
- Optimize database queries
- Reduce transaction duration
- Consider read replicas for reads