# Scaling

The platform is designed to stay fast and correct at roughly **20,000 concurrent users**. That target shaped three areas: how lists are paginated, how data is cached, and how work is kept off the request path.

## Cursor pagination, not page numbers

Offset pagination (`LIMIT 20 OFFSET 4000`) has two problems at scale: the database still walks the skipped rows, and if the underlying data changes between page loads, items shift, so a shopper sees a duplicate or misses a product.

`CursorPaginator` uses **keyset pagination** instead. Each page carries an opaque cursor encoding the sort position of the last row:

```
GET /products?after=eyJpZCI6MTQ4Mn0   # "rows after this key"
```

The next query becomes `WHERE (sort_key) < (cursor) LIMIT n` (for a descending sort), an index range scan whose cost does not grow with how deep the shopper has scrolled. It powers infinite scroll on the catalog, the [blog](/features/blog-platform/), and the [Product API](/features/product-api/). The cursor is opaque and integrity-checked, so it cannot be tampered with to escape a scope.

## Multi-layer caching

Caching is applied per subsystem rather than globally:

- **Counter caches** keep aggregate counts (wishlist items, comments) on the parent row, so common pages never `COUNT(*)`.
- **Fragment caches**: services like `WishlistCacheService`, `ProductCacheService`, and `BlogPostCacheService` cache rendered fragments and invalidate them precisely on write, rather than relying on time-based expiry alone.
- **Cache warming**: `ProductCacheWarmingService` pre-populates hot product caches so the first shopper after a deploy does not pay the cold-cache cost.
- **Versioned keys**: `ProductCacheVersion` / `WishlistCacheVersion` make bulk invalidation a single version bump instead of a key sweep.

## Keep work off the request path

Anything that does not have to happen before the response is sent, doesn't. Email, notifications, broadcasts, reconciliation, and cache warming all run as [background jobs](/architecture/background-jobs/). The request thread does the minimum and returns.

## Defend the hot endpoints

[Rate limiting](/architecture/rate-limiting/) runs at two levels, HTTP (Rack::Attack) and application (`OrderRateLimiter`), so a burst of traffic, benign or hostile, cannot exhaust capacity on checkout or login.

## The principle

No single trick makes the platform scale. It is the discipline of, at every layer, asking the same question: *does this work need to happen now, on this thread, against this row?* Usually the answer is no, and pagination, caching, and background jobs are how that "no" is enforced.
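
To make the cursor mechanics from "Cursor pagination, not page numbers" concrete, here is a minimal sketch of a signed keyset cursor. It is not the platform's `CursorPaginator`: the helper names (`encode_cursor`, `decode_cursor`, `fetch_page`), the `payload.signature` layout, the HMAC-SHA256 check, and the `CURSOR_SECRET` constant are all assumptions made for illustration, and an in-memory array stands in for the indexed table.

```ruby
require "json"
require "openssl"

# Hypothetical secret; a real application would load this from its credentials.
CURSOR_SECRET = "example-secret"

# HMAC over the payload, so a client cannot change the encoded position.
def sign(payload)
  OpenSSL::HMAC.hexdigest("SHA256", CURSOR_SECRET, payload)
end

# Compare two strings without stopping at the first differing byte.
def secure_equal?(a, b)
  return false unless a.bytesize == b.bytesize
  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Encode a sort position as "<base64url JSON>.<signature>" (assumed layout).
def encode_cursor(position)
  payload = [JSON.generate(position)].pack("m0").tr("+/", "-_").delete("=")
  "#{payload}.#{sign(payload)}"
end

# Return the decoded position, or nil if the cursor is malformed or altered.
def decode_cursor(cursor)
  payload, signature = cursor.to_s.split(".", 2)
  return nil if payload.nil? || signature.nil?
  return nil unless secure_equal?(signature, sign(payload))
  JSON.parse(payload.tr("-_", "+/").unpack1("m"))
rescue JSON::ParserError
  nil
end

# Stand-in for `WHERE id < :cursor ORDER BY id DESC LIMIT :limit`.
def fetch_page(rows, after: nil, limit: 2)
  position = after && decode_cursor(after)
  raise ArgumentError, "invalid cursor" if after && position.nil?

  scoped = position ? rows.select { |row| row["id"] < position["id"] } : rows
  page = scoped.sort_by { |row| -row["id"] }.first(limit)
  # A full page may still be the last one; the following request then returns no rows.
  next_cursor = page.size == limit ? encode_cursor("id" => page.last["id"]) : nil
  [page, next_cursor]
end
```

Calling `fetch_page(rows)` and then passing the returned cursor back as `after:` walks the list one page at a time without ever computing an offset. In this sketch the payload half of the cursor is plain base64url JSON, so `{"id" => 1482}` encodes to the `eyJpZCI6MTQ4Mn0` value used in the request example above; the signature half is what rejects a cursor whose position was edited by the client.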