Under the Hood: Paginating the Batch Rescreen Engine
Release: v0.1.37 · Track: Performance / Data Patterns · Ref: PERF-15
The Problem
As the sanctions screening platform adds users and monitored entities, a handful of internal endpoints have been quietly accumulating technical debt: they issue unbounded database queries (SELECT … FROM people with no LIMIT) and load every row into application memory before doing any work.
For a small dataset this is invisible. At tens of thousands of rows it becomes a memory exhaustion event waiting to happen, and at scale it can take down the nightly rescreen job entirely.
The three areas affected were:
| Area | Risk |
|---|---|
| Batch rescreen engine (batch-rescreen.ts) | Fetches all people across all users in one query |
| /api/matches endpoint | Loads all match records without a page limit |
| Export endpoints | Return full record sets in a single response |
The Fix
Cursor-Based Chunking in the Batch Rescreen Engine
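The nightly rescreen job, which re-evaluates every monitored person against the current OFSI consolidated list, now processes people in chunks of 50–100 rows per iteration. Each chunk is fetched using keyset (cursor-based) pagination on the people.id column rather than OFFSET: every query asks for the next rows whose id is greater than the last id already processed. The example below uses a chunk size of 100 and assumes contiguous ids for readability; in general the cursor is the last id returned by the previous chunk: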
Iteration 1: SELECT … WHERE id > 0 ORDER BY id LIMIT 100
Iteration 2: SELECT … WHERE id > 100 ORDER BY id LIMIT 100
Iteration 3: SELECT … WHERE id > 200 ORDER BY id LIMIT 100
… and so on until no rows are returned
This approach has two important properties:
- Bounded memory usage. At any point in time the engine holds at most 100 people in memory, regardless of how large the dataset grows.
- Progress checkpointing. Because each chunk is self-contained, a crashed or interrupted run does not need to restart from the beginning — the cursor position marks how far work has progressed.
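The loop described above can be sketched as follows. This is a minimal illustration rather than the engine's actual code: the `Person` shape, the `FetchChunk` data-access function, and `rescreenInChunks` are hypothetical names, and ids are assumed to be positive integers. The in-memory `fakeFetch` stands in for the real query so the sketch can run on its own.

```typescript
// Hypothetical row shape; the real people table will have more columns.
interface Person {
  id: number;
  name: string;
}

// Hypothetical data-access function: returns up to `limit` rows with
// id > afterId, ordered by id ascending
// (SELECT … WHERE id > afterId ORDER BY id LIMIT limit).
type FetchChunk = (afterId: number, limit: number) => Promise<Person[]>;

// Walks the table in bounded chunks and returns the final cursor position.
// Passing a previously saved cursor as `startAfter` resumes an interrupted run.
async function rescreenInChunks(
  fetchChunk: FetchChunk,
  handle: (chunk: Person[]) => Promise<void>,
  chunkSize = 100,
  startAfter = 0,
): Promise<number> {
  let cursor = startAfter;
  for (;;) {
    const chunk = await fetchChunk(cursor, chunkSize);
    if (chunk.length === 0) return cursor; // no rows left: the run is complete
    await handle(chunk);
    // Advance to the last id actually seen, so gaps in the id
    // sequence never cause rows to be skipped or repeated.
    cursor = chunk[chunk.length - 1].id;
  }
}

// In-memory stand-in for the database, with gaps in the id sequence.
const rows: Person[] = [1, 2, 5, 9, 10, 42, 43].map((id) => ({
  id,
  name: `person-${id}`,
}));
const fakeFetch: FetchChunk = async (afterId, limit) =>
  rows.filter((p) => p.id > afterId).slice(0, limit);
```

Only one chunk is held in memory at a time, and because the cursor is a column value rather than a row offset, the cost of each query does not grow as the job moves deeper into the table. For the checkpointing property to hold across process restarts, the returned cursor would need to be persisted somewhere durable after each chunk; that storage is outside this sketch.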
Consistent Pagination on /api/matches
The /api/matches endpoint now uses the same page / limit query parameter convention already established on /api/people. Responses follow the existing envelope format:
{
  "data": [ /* up to `limit` match objects */ ],
  "pagination": {
    "page": 1,
    "limit": 50,
    "total": 1240,
    "totalPages": 25
  }
}
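The pagination fields in this envelope are related by simple arithmetic, sketched below. The type and function names (`Page`, `buildPage`, `offsetFor`) are hypothetical and not taken from the codebase, and the sketch assumes that `page` is 1-based, as the example response suggests.

```typescript
interface Pagination {
  page: number;
  limit: number;
  total: number;
  totalPages: number;
}

interface Page<T> {
  data: T[];
  pagination: Pagination;
}

// Builds the response envelope for one page. `total` is the count of all
// matching rows, not just the rows on this page.
function buildPage<T>(data: T[], page: number, limit: number, total: number): Page<T> {
  return {
    data,
    pagination: { page, limit, total, totalPages: Math.ceil(total / limit) },
  };
}

// Row offset for a 1-based page number (hypothetical helper).
function offsetFor(page: number, limit: number): number {
  return (page - 1) * limit;
}

// The figures from the example response: 1240 rows at 50 per page.
const example = buildPage<string>([], 1, 50, 1240);
console.log(example.pagination.totalPages); // 25
```

Rounding up is what makes the final, partially filled page count: 1240 / 50 is 24.8, so the last 40 records land on page 25.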
What You Need to Do
- No action required for the nightly rescreen job — chunking is handled automatically by the engine.
- API consumers of /api/matches that previously received all records in one response should update their clients to iterate through pages using the page and limit parameters. The interface mirrors /api/people exactly.
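For consumers updating a client, one way to iterate through pages is sketched below. `GetPage` and `fetchAllMatches` are hypothetical names, and the transport is injected so that the loop does not assume any particular HTTP library; in a real client, `getPage` would wrap a GET request to /api/matches with the page and limit query parameters.

```typescript
interface MatchPage<T> {
  data: T[];
  pagination: { page: number; limit: number; total: number; totalPages: number };
}

// Hypothetical transport: resolves one page of /api/matches.
type GetPage<T> = (page: number, limit: number) => Promise<MatchPage<T>>;

// Collects every match by requesting pages 1..totalPages in order.
async function fetchAllMatches<T>(getPage: GetPage<T>, limit: number): Promise<T[]> {
  const all: T[] = [];
  let page = 1;
  let totalPages = 1; // replaced by the server's value after the first response
  while (page <= totalPages) {
    const res = await getPage(page, limit);
    all.push(...res.data);
    totalPages = res.pagination.totalPages;
    page += 1;
  }
  return all;
}

// In-memory stand-in for the endpoint, used only to exercise the loop.
const allMatches = [1, 2, 3, 4, 5, 6, 7].map((id) => ({ id }));
const fakeGetPage: GetPage<{ id: number }> = async (page, limit) => ({
  data: allMatches.slice((page - 1) * limit, page * limit),
  pagination: {
    page,
    limit,
    total: allMatches.length,
    totalPages: Math.ceil(allMatches.length / limit),
  },
});
```

Collecting everything into one array reintroduces unbounded memory use on the client side, so a consumer handling large result sets may prefer to process each page as it arrives instead.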
Looking Ahead
With pagination now standardised across people, matches, and export endpoints, all future list-returning routes will follow the same convention from the outset. This forms part of a broader performance track aimed at keeping the platform responsive as monitored entity counts grow into the hundreds of thousands.