# Scaling

The platform is designed to stay fast and correct at roughly **20,000
concurrent users**. That target shaped four areas: how lists are paginated,
how data is cached, how work is kept off the request path, and how hot
endpoints are protected from bursts.

## Cursor pagination, not page numbers

Offset pagination (`LIMIT 20 OFFSET 4000`) has two problems at scale: the
database still walks the skipped rows, and if the underlying data changes
between page loads, items shift — a shopper sees a duplicate or misses a
product.

`CursorPaginator` uses **keyset pagination** instead. Each page carries an
opaque cursor encoding the sort position of the last row:

```
GET /products?after=eyJpZCI6MTQ4Mn0   # "rows after this key"
```

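The next query becomes `WHERE sort_key < :cursor ORDER BY sort_key DESC LIMIT n`
(or `>` with ascending order) for a newest-first list. The sort key must be
unique, or carry a unique tie-breaker such as `id`, so no row is skipped. This is
an index range scan whose cost does not grow with how deep the shopper has scrolled. It powers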
infinite scroll on the catalog, the [blog](/features/blog-platform/), and the
[Product API](/features/product-api/). The cursor is opaque and integrity-checked,
so a client cannot edit it to reach rows outside the query's scope.
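
As a rough illustration of the idea, the sketch below signs a cursor with an HMAC and pages through an in-memory list. It is not the actual `CursorPaginator`: the helper names (`encode_cursor`, `decode_cursor`, `page`), the `SECRET` constant, and the choice of HMAC-SHA256 for the integrity check are all assumptions made for the example.

```ruby
require "base64"
require "json"
require "openssl"

# Hypothetical signing key; a real app would load this from its secrets store.
SECRET = "demo-secret"

# Encode a sort position as an opaque, HMAC-signed, URL-safe token.
def encode_cursor(position)
  payload = Base64.urlsafe_encode64(JSON.generate(position), padding: false)
  signature = OpenSSL::HMAC.hexdigest("SHA256", SECRET, payload)
  "#{payload}.#{signature}"
end

# Decode a token, returning nil if it is malformed or was tampered with.
def decode_cursor(token)
  payload, signature = token.to_s.split(".", 2)
  return nil if payload.nil? || signature.nil?
  expected = OpenSSL::HMAC.hexdigest("SHA256", SECRET, payload)
  return nil unless OpenSSL.secure_compare(expected, signature)
  JSON.parse(Base64.urlsafe_decode64(payload))
rescue ArgumentError, JSON::ParserError
  nil
end

# In-memory stand-in for an indexed table, sorted newest first (id descending).
ROWS = (1..10).map { |id| { "id" => id } }.sort_by { |row| -row["id"] }

# Return one page plus the cursor for the next page (nil when exhausted).
def page(after: nil, limit: 3)
  position = after && decode_cursor(after)
  raise ArgumentError, "invalid cursor" if after && position.nil?
  # Equivalent of: WHERE id < :cursor ORDER BY id DESC LIMIT :limit
  rows = ROWS.select { |row| position.nil? || row["id"] < position["id"] }.first(limit)
  next_cursor = rows.size == limit ? encode_cursor({ "id" => rows.last["id"] }) : nil
  { rows: rows, next_cursor: next_cursor }
end

first = page(limit: 3)                                # ids 10, 9, 8
second = page(after: first[:next_cursor], limit: 3)   # ids 7, 6, 5
```

Because each page is defined by the last key seen rather than by a row count, inserting or deleting rows between requests does not shift the items on later pages.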

## Multi-layer caching

Caching is applied per subsystem rather than globally:

- **Counter caches** keep aggregate counts (wishlist items, comments) on the
  parent row, so common pages never `COUNT(*)`.
- **Fragment caches** — services like `WishlistCacheService`,
  `ProductCacheService`, and `BlogPostCacheService` cache rendered fragments
  and invalidate them precisely on write, rather than relying on time-based
  expiry alone.
- **Cache warming** — `ProductCacheWarmingService` pre-populates hot product
  caches so the first shopper after a deploy does not pay the cold-cache cost.
- **Versioned keys** — `ProductCacheVersion` / `WishlistCacheVersion` make
  bulk invalidation a single version bump instead of a key sweep.
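
To make the difference between precise invalidation and a version bump concrete, here is a minimal sketch using an in-memory hash. `VersionedCache` and its methods are hypothetical names for this example and do not reflect how `ProductCacheVersion` or the cache services are actually implemented.

```ruby
# Hypothetical in-memory cache; a real deployment would use a shared store.
class VersionedCache
  def initialize
    @store = {}
    @versions = Hash.new(1) # every namespace starts at version 1
  end

  # Build a key that embeds the namespace's current version.
  def key(namespace, id)
    "#{namespace}/v#{@versions[namespace]}/#{id}"
  end

  # Return the cached value, computing and storing it on a miss.
  def fetch(namespace, id)
    k = key(namespace, id)
    return @store[k] if @store.key?(k)
    @store[k] = yield
  end

  # Precise invalidation: drop one entry after a write to that record.
  def invalidate(namespace, id)
    @store.delete(key(namespace, id))
  end

  # Bulk invalidation: bump the version so every old key becomes unreachable.
  def bump(namespace)
    @versions[namespace] += 1
  end
end

cache = VersionedCache.new
cache.fetch("product", 42) { "<li>Product 42</li>" } # miss: renders and stores
cache.fetch("product", 42) { "<li>Product 42</li>" } # hit: block is not called
cache.bump("product")                                # all product keys now miss
```

In this sketch the entries written under the old version are never deleted; the assumption is that the backing store evicts unused keys on its own, which is what makes the bump cheaper than a key sweep.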

## Keep work off the request path

Anything that does not have to happen before the response is sent, doesn't.
Email, notifications, broadcasts, reconciliation, and cache warming all run as
[background jobs](/architecture/background-jobs/). The request thread does the
minimum and returns.
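
The pattern can be sketched with a plain in-process queue. Everything here is illustrative: `JOBS`, `SENT_EMAILS`, `place_order`, and `drain_jobs` are hypothetical names, and a real deployment would use the platform's job backend rather than a Ruby `Queue`.

```ruby
# Hypothetical in-process queue standing in for a real job backend.
JOBS = Queue.new
SENT_EMAILS = []

# Request handler: do the minimum, enqueue the rest, return immediately.
def place_order(order_id)
  JOBS << [:send_confirmation, order_id]
  { status: 201, order_id: order_id }
end

# Worker: runs outside the request path and performs the queued jobs.
def drain_jobs
  until JOBS.empty?
    name, order_id = JOBS.pop
    SENT_EMAILS << order_id if name == :send_confirmation
  end
end

response = place_order(7) # returns before any email is sent
drain_jobs                # the confirmation is sent here, off the request path
```

The response is built before the email work runs, so a slow mail provider adds latency to the worker rather than to the shopper's request.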

## Defend the hot endpoints

[Rate limiting](/architecture/rate-limiting/) runs at two levels — HTTP
(Rack::Attack) and application (`OrderRateLimiter`) — so a burst of traffic,
benign or hostile, cannot exhaust capacity on checkout or login.
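
For readers unfamiliar with application-level limiting, the following is a minimal fixed-window sketch. It is not the API of Rack::Attack or of `OrderRateLimiter`; the class name, its parameters, and the fixed-window strategy itself are assumptions chosen to keep the example short.

```ruby
# Hypothetical fixed-window limiter; not the API of Rack::Attack or OrderRateLimiter.
class FixedWindowLimiter
  # limit:  maximum requests allowed per window
  # period: window length in seconds
  # clock:  callable returning the current time in seconds (injectable for tests)
  def initialize(limit:, period:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit
    @period = period
    @clock = clock
    @counts = Hash.new(0) # grows without bound here; a real store would expire old windows
  end

  # Return true if the caller identified by `key` may proceed in this window.
  def allow?(key)
    window = (@clock.call / @period).floor
    bucket = [key, window]
    return false if @counts[bucket] >= @limit
    @counts[bucket] += 1
    true
  end
end

limiter = FixedWindowLimiter.new(limit: 5, period: 60)
limiter.allow?("checkout:user-1") # true until the sixth call in the same window
```

A fixed window is the simplest strategy to reason about, at the cost of allowing a short burst across a window boundary; sliding windows or token buckets smooth that out.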

## The principle

No single trick makes the platform scale. What does is the discipline of asking
the same question at every layer: *does this work need to happen now, on this
thread, against this row?* Usually the answer is no, and pagination, caching,
and background jobs are how that "no" is enforced.