# Scaling

The platform is designed to stay fast and correct at roughly **20,000
concurrent users**. That target shaped three areas: how lists are paginated,
how data is cached, and how work is kept off the request path.

## Cursor pagination, not page numbers

Offset pagination (`LIMIT 20 OFFSET 4000`) has two problems at scale: the
database still walks the skipped rows, and if the underlying data changes
between page loads, items shift — a shopper sees a duplicate or misses a
product.

`CursorPaginator` uses **keyset pagination** instead. Each page carries an
opaque cursor encoding the sort position of the last row:

```
GET /products?after=eyJpZCI6MTQ4Mn0   # "rows after this key"
```

The next query becomes `WHERE (sort_key) < (cursor) ORDER BY sort_key DESC
LIMIT n` for a descending sort (an ascending sort flips the comparison to
`>`). Either way it is an index range scan whose cost does not grow with how
deep the shopper has scrolled. It powers infinite scroll on the catalog, the
[blog](/features/blog-platform/), and the [Product API](/features/product-api/).
The cursor is opaque and integrity-checked, so it cannot be tampered with to
escape a scope.
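
The idea can be sketched in plain Ruby. This is a minimal illustration, not the actual `CursorPaginator`: the helper names (`encode_cursor`, `decode_cursor`, `page_after`), the `payload.signature` token layout, the HMAC signing key, and the choice of `id` as a descending sort key are all assumptions made for the example, and an in-memory array stands in for the database.

```ruby
require "json"
require "openssl"

# Hypothetical signing key; a real deployment would load this from secrets.
CURSOR_SECRET = "example-secret"

# Encode a sort position as an opaque, URL-safe token with an HMAC signature.
def encode_cursor(position)
  payload = [JSON.generate(position)].pack("m0").tr("+/", "-_").delete("=")
  signature = OpenSSL::HMAC.hexdigest("SHA256", CURSOR_SECRET, payload)
  "#{payload}.#{signature}"
end

# Decode a token; returns nil when it is malformed or the signature does not match.
def decode_cursor(token)
  payload, signature = token.to_s.split(".", 2)
  return nil if payload.nil? || signature.nil?

  expected = OpenSSL::HMAC.hexdigest("SHA256", CURSOR_SECRET, payload)
  return nil unless OpenSSL.secure_compare(expected, signature)

  JSON.parse(payload.tr("-_", "+/").unpack1("m"))
end

# Return [page, next_cursor] for rows sorted by id, highest first.
# The array filter stands in for:
#   WHERE id < :cursor_id ORDER BY id DESC LIMIT :limit
def page_after(rows, token, limit)
  position = nil
  if token
    position = decode_cursor(token)
    raise ArgumentError, "invalid cursor" if position.nil?
  end

  sorted = rows.sort_by { |row| -row[:id] }
  sorted = sorted.select { |row| row[:id] < position["id"] } if position
  page = sorted.first(limit)

  # A full page may have more rows behind it, so hand back a cursor for the last row.
  more = page.size == limit && !page.empty?
  next_cursor = more ? encode_cursor({ "id" => page.last[:id] }) : nil
  [page, next_cursor]
end

products = (1..5).map { |id| { id: id } }
first_page, cursor = page_after(products, nil, 2)
second_page, _next = page_after(products, cursor, 2)
```

Because the position travels in the cursor rather than in an offset, inserting or deleting rows between requests does not shift the items on the next page. A token whose payload has been edited fails the signature check and is rejected instead of being trusted.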

## Multi-layer caching

Caching is applied per subsystem rather than globally:

- **Counter caches** keep aggregate counts (wishlist items, comments) on the
  parent row, so common pages never `COUNT(*)`.
- **Fragment caches** — services like `WishlistCacheService`,
  `ProductCacheService`, and `BlogPostCacheService` cache rendered fragments
  and invalidate them precisely on write, rather than relying on time-based
  expiry alone.
- **Cache warming** — `ProductCacheWarmingService` pre-populates hot product
  caches so the first shopper after a deploy does not pay the cold-cache cost.
- **Versioned keys** — `ProductCacheVersion` / `WishlistCacheVersion` make
  bulk invalidation a single version bump instead of a key sweep.
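
To make the last two ideas concrete, here is a small sketch of precise invalidation and versioned keys. It assumes nothing about how `ProductCacheVersion` or the cache services are actually built: `VersionedCache`, its method names, and the `namespace/vN/id` key layout are hypothetical, and a Ruby hash stands in for the real cache store.

```ruby
# Hypothetical in-memory cache; a production store would be Redis or Memcached.
class VersionedCache
  def initialize
    @store = {}
    @versions = Hash.new(1) # every namespace starts at version 1
  end

  # The current version is part of the key, so entries written under an
  # older version become unreachable as soon as the version changes.
  def key_for(namespace, id)
    "#{namespace}/v#{@versions[namespace]}/#{id}"
  end

  # Return the cached value, or compute and store it on a miss.
  def fetch(namespace, id)
    key = key_for(namespace, id)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end

  # Precise invalidation: drop one entry when its record is written.
  def invalidate(namespace, id)
    @store.delete(key_for(namespace, id))
  end

  # Bulk invalidation: a single counter increment, no key sweep.
  def bump(namespace)
    @versions[namespace] += 1
  end
end

cache = VersionedCache.new
cache.fetch("products", 1) { "<li>Product 1</li>" } # miss: renders and stores
cache.fetch("products", 1) { "<li>Product 1</li>" } # hit: block is not called
cache.bump("products")                              # every product fragment is now stale
```

The trade-off of a version bump is that stale entries are never deleted explicitly; the sketch simply leaves them in the hash, whereas a real store would typically reclaim them through expiry or eviction.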

## Keep work off the request path

Anything that does not have to happen before the response is sent, doesn't.
Email, notifications, broadcasts, reconciliation, and cache warming all run as
[background jobs](/architecture/background-jobs/). The request thread does the
minimum and returns.

## Defend the hot endpoints

[Rate limiting](/architecture/rate-limiting/) runs at two levels — HTTP
(Rack::Attack) and application (`OrderRateLimiter`) — so a burst of traffic,
benign or hostile, cannot exhaust capacity on checkout or login.
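
As an illustration of the application-level check, the sketch below implements a fixed-window counter. It is not the actual `OrderRateLimiter`, whose algorithm is not described here: the class name `FixedWindowLimiter`, its parameters, and the injected clock are assumptions for the example, and an in-memory hash stands in for a shared store.

```ruby
# Hypothetical fixed-window limiter: at most `limit` calls per key per `period` seconds.
class FixedWindowLimiter
  # `clock` is injectable so the behaviour can be tested without sleeping.
  def initialize(limit:, period:, clock: -> { Time.now.to_f })
    @limit = limit
    @period = period
    @clock = clock
    @counts = Hash.new(0)
  end

  # Count the call against the current window and report whether it is allowed.
  def allow?(key)
    window = (@clock.call / @period).floor
    bucket = [key, window]
    @counts[bucket] += 1
    @counts[bucket] <= @limit
  end
end

limiter = FixedWindowLimiter.new(limit: 5, period: 60)
if limiter.allow?("checkout:user-123")
  # proceed with the order
else
  # respond with 429 Too Many Requests
end
```

The sketch never discards old windows and keeps its counts in one process, so it only demonstrates the counting logic; a shared store with expiring keys would be needed across multiple application servers.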

## The principle

No single trick makes the platform scale. What does is the discipline of
asking the same question at every layer: *does this work need to happen now,
on this thread, against this row?* Usually the answer is no, and pagination,
caching, and background jobs are how that "no" is enforced.
