Under the Hood: Paginating the Batch Rescreen Engine

Release: v0.1.37 · Track: Performance / Data Patterns · Ref: PERF-15

The Problem

As the sanctions screening platform added users and monitored entities, a handful of internal endpoints quietly accumulated technical debt: they issued unbounded database queries (SELECT … FROM people with no LIMIT) and loaded every row into application memory before doing any work.

For a small dataset this is invisible. At tens of thousands of rows it becomes a memory exhaustion event waiting to happen, and at scale it can take down the nightly rescreen job entirely.

The three areas affected were:

Area	Risk
Batch rescreen engine (`batch-rescreen.ts`)	Fetches all people across all users in one query
`/api/matches` endpoint	Loads all match records without a page limit
Export endpoints	Returns full record sets in a single response

The Fix

Cursor-Based Chunking in the Batch Rescreen Engine

The nightly rescreen job, which re-evaluates every monitored person against the current OFSI consolidated list, now processes people in chunks of 50–100 rows per iteration. Each chunk is fetched using keyset (cursor-based) pagination on the people.id column rather than an OFFSET: the cursor is the highest id seen so far, and the next query asks only for rows beyond it.

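Iteration 1: SELECT … WHERE id > 0                            ORDER BY id LIMIT 100
Iteration 2: SELECT … WHERE id > <last id from iteration 1>   ORDER BY id LIMIT 100
Iteration 3: SELECT … WHERE id > <last id from iteration 2>   ORDER BY id LIMIT 100
… and so on until no rows are returned

The cursor advances to the last id actually returned, so it lands on 100, 200, … only when ids are contiguous. Gaps left by deleted rows do not cause rows to be skipped or fetched twice.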

This approach has two important properties:

Bounded memory usage. At any point in time the engine holds at most 100 people in memory, regardless of how large the dataset grows.
Progress checkpointing. Because each chunk is self-contained, a crashed or interrupted run does not need to restart from the beginning — the cursor position marks how far work has progressed.
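
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in batch-rescreen.ts: the Person shape, fetchChunk, and rescreenAll are hypothetical names, and the database query is replaced by an in-memory array so the example runs on its own.

```typescript
// Hypothetical row shape; only the `id` column is taken from the text above.
interface Person {
  id: number;
  name: string;
}

// Stand-in for the query
//   SELECT … FROM people WHERE id > cursor ORDER BY id LIMIT limit
// run against an in-memory array so the sketch is self-contained.
function fetchChunk(table: Person[], cursor: number, limit: number): Person[] {
  return table
    .filter((p) => p.id > cursor)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit);
}

// Walks the table one chunk at a time and returns the final cursor position.
function rescreenAll(
  table: Person[],
  chunkSize: number,
  screen: (chunk: Person[]) => void,
  startCursor = 0,
): number {
  let cursor = startCursor;
  for (;;) {
    const chunk = fetchChunk(table, cursor, chunkSize);
    if (chunk.length === 0) return cursor; // no rows left, the run is complete
    screen(chunk);
    // Advance to the last id actually seen, not cursor + chunkSize,
    // so gaps in the id sequence are handled correctly.
    cursor = chunk[chunk.length - 1].id;
  }
}

// Usage: ids with gaps, processed four at a time.
const people: Person[] = [1, 2, 5, 9, 10, 42].map((id) => ({ id, name: `person-${id}` }));
const seen: number[] = [];
const finalCursor = rescreenAll(people, 4, (chunk) => {
  seen.push(...chunk.map((p) => p.id));
});
```

Only one chunk is alive at a time, which is where the memory bound comes from. Passing a previously saved cursor as startCursor resumes from that point; whether and where the real engine persists the cursor is not covered here and is an assumption of this sketch.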

Consistent Pagination on `/api/matches`

The /api/matches endpoint now uses the same page / limit query parameter convention already established on /api/people. Responses follow the existing envelope format:

{
  "data": [ /* up to `limit` match objects */ ],
  "pagination": {
    "page": 1,
    "limit": 50,
    "total": 1240,
    "totalPages": 25
  }
}
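
To make the envelope fields concrete, here is a small sketch of how they relate to one another. The paginate helper and its types are hypothetical, page is assumed to be 1-indexed, and the slicing is done on an in-memory array purely for illustration; a real endpoint would more likely apply the limit in the database query and obtain total from a count.

```typescript
interface Pagination {
  page: number;
  limit: number;
  total: number;
  totalPages: number;
}

interface Envelope<T> {
  data: T[];
  pagination: Pagination;
}

// Builds the response envelope for one page of `rows`.
// Assumes `page` starts at 1 and `limit` is a positive integer.
function paginate<T>(rows: T[], page: number, limit: number): Envelope<T> {
  const total = rows.length;
  const totalPages = Math.ceil(total / limit); // the last page may be partial
  const start = (page - 1) * limit;
  return {
    data: rows.slice(start, start + limit),
    pagination: { page, limit, total, totalPages },
  };
}

// Usage: 1240 fake match ids at 50 per page, matching the numbers in the example envelope.
const matchIds = Array.from({ length: 1240 }, (_, i) => i + 1);
const firstPage = paginate(matchIds, 1, 50);
const lastPage = paginate(matchIds, 25, 50);
```

With these numbers, totalPages is the ceiling of 1240 / 50, which is 25, and the final page carries the remaining 40 records.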

What You Need to Do

No action required for the nightly rescreen job — chunking is handled automatically by the engine.
API consumers of /api/matches that previously received all records in one response should update their clients to iterate through pages using the page and limit parameters. The interface mirrors /api/people exactly.
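
For clients making that change, a page-by-page loop might look like the sketch below. The forEachMatchPage helper and the FetchPage type are hypothetical, and the HTTP call is injected as a function so the example stays self-contained; in a real client that function would request /api/matches with the page and limit query parameters and parse the JSON envelope.

```typescript
interface MatchPage<T> {
  data: T[];
  pagination: { page: number; limit: number; total: number; totalPages: number };
}

// Hypothetical transport: resolves one page of the envelope shown earlier.
type FetchPage<T> = (page: number, limit: number) => Promise<MatchPage<T>>;

// Hands each page to `onPage` as it arrives, so the client never holds
// more than `limit` records at once. Returns the number of records handled.
async function forEachMatchPage<T>(
  fetchPage: FetchPage<T>,
  limit: number,
  onPage: (rows: T[]) => void,
): Promise<number> {
  let page = 1;
  let totalPages = 1;
  let handled = 0;
  do {
    const res = await fetchPage(page, limit);
    onPage(res.data);
    handled += res.data.length;
    totalPages = res.pagination.totalPages; // trust the server's latest count
    page += 1;
  } while (page <= totalPages);
  return handled;
}

// Usage with a fake in-memory backend of 120 matches.
const fakeMatches = Array.from({ length: 120 }, (_, i) => ({ id: i + 1 }));
const fakeFetchPage: FetchPage<{ id: number }> = async (page, limit) => ({
  data: fakeMatches.slice((page - 1) * limit, page * limit),
  pagination: {
    page,
    limit,
    total: fakeMatches.length,
    totalPages: Math.ceil(fakeMatches.length / limit),
  },
});
```

Processing each page as it arrives, rather than concatenating everything into one array, keeps the client from reintroducing the unbounded-memory problem on its own side.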

Looking Ahead

With pagination now standardised across people, matches, and export endpoints, all future list-returning routes will follow the same convention from the outset. This forms part of a broader performance track aimed at keeping the platform responsive as monitored entity counts grow into the hundreds of thousands.
