Batch Rescreen Engine — Cursor-based Chunking
Added in v0.1.30 (PERF-15) — pending merge of PR #58.
Overview
The nightly batch rescreen engine screens every monitored person across all tenants against the current OFSI consolidated list. Prior to v0.1.30 it did this with a single unbounded SELECT * FROM people query, loading the entire monitored-persons table into application memory before processing.
At scale this approach causes:
- Out-of-memory crashes in the Node.js process when tens of thousands of rows are loaded at once.
- Postgres statement timeouts on large result sets.
- No partial progress visibility — if the run was interrupted, there was no record of how far it had reached.
v0.1.30 replaces the full-table scan with a cursor-based chunking loop that processes people 100 at a time and checkpoints progress to the database after each chunk.
How it works
Chunk size
A module-level constant controls the batch size:

```ts
const CHUNK_SIZE = 100;
```
Each database query fetches at most 100 rows. This keeps the in-flight memory footprint small and predictable regardless of how many people are being monitored.
Cursor query
A helper function fetches each chunk:

```ts
import { and, ne, gt, asc } from "drizzle-orm";

async function fetchPeopleChunk(afterId: number) {
  return db
    .select()
    .from(people)
    .where(and(ne(people.status, "deleted"), gt(people.id, afterId)))
    .orderBy(asc(people.id))
    .limit(CHUNK_SIZE);
}
```
- `gt(people.id, afterId)` — returns only rows whose `id` is greater than the current cursor position.
- `orderBy(asc(people.id))` — guarantees a stable, index-backed ordering.
- `limit(CHUNK_SIZE)` — caps the result set.
- `ne(people.status, "deleted")` — excludes soft-deleted records, as before.
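To make the cursor semantics concrete, here is a minimal in-memory sketch of the same predicate in plain TypeScript — no database involved. `Person`, `fetchChunkSim`, and the sample rows are illustrative stand-ins, not part of the codebase:

```ts
interface Person {
  id: number;
  status: string;
}

const CHUNK_SIZE_SIM = 3; // small chunk size so the example stays short

// Mirrors the query: status != "deleted" AND id > afterId,
// ordered by id ascending, capped at CHUNK_SIZE_SIM rows.
function fetchChunkSim(rows: Person[], afterId: number): Person[] {
  return rows
    .filter((p) => p.status !== "deleted" && p.id > afterId)
    .sort((a, b) => a.id - b.id)
    .slice(0, CHUNK_SIZE_SIM);
}

// Sparse, non-contiguous ids, one soft-deleted row.
const table: Person[] = [
  { id: 2, status: "active" },
  { id: 5, status: "deleted" },
  { id: 9, status: "active" },
  { id: 14, status: "active" },
  { id: 20, status: "active" },
];

const first = fetchChunkSim(table, 0);   // ids 2, 9, 14 (5 is soft-deleted)
const second = fetchChunkSim(table, 14); // id 20
```

Note that the soft-deleted row is skipped without disturbing the cursor, because the cursor tracks id values rather than row positions.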
Main loop

```ts
let lastId = 0; // cursor — starts before the first possible id
let chunksProcessed = 0;

while (true) {
  const chunk = await fetchPeopleChunk(lastId);
  if (chunk.length === 0) break; // end of table

  for (const person of chunk) {
    try {
      const delta = await processPerson(person, sanctionEntries, settingsCache);
      newMatchesFound += delta.newMatchesFound;
      matchesCleared += delta.matchesCleared;
      totalPeopleScreened++;
    } catch (error) {
      console.error(`Batch rescreen error for person ${person.id}:`, error);
      errorCount++;
    }
  }

  // Advance cursor
  lastId = chunk[chunk.length - 1].id;
  chunksProcessed++;

  // Checkpoint to DB after each chunk
  // (updates rescreenRuns with chunksProcessed, lastCursorId, checkpointedAt)
}
```
The loop terminates naturally when fetchPeopleChunk returns an empty array, meaning the cursor has passed the last row in the table.
Per-person error isolation
Errors thrown while processing a single person are caught inside the inner for loop. The error is logged and errorCount is incremented, but processing continues with the next person. A single bad record cannot abort the entire nightly run.
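The isolation pattern can be sketched independently of the engine. `processItem`, `runBatch`, and the failure condition below are illustrative, not the real processor:

```ts
// Hypothetical per-item processor: fails on one specific input.
async function processItem(id: number): Promise<void> {
  if (id === 3) throw new Error("bad record");
}

async function runBatch(ids: number[]) {
  let processed = 0;
  let errorCount = 0;
  for (const id of ids) {
    try {
      await processItem(id);
      processed++;
    } catch {
      // Log and count, but keep going: one bad record
      // must not abort the whole run.
      errorCount++;
    }
  }
  return { processed, errorCount };
}
```

Running `runBatch([1, 2, 3, 4, 5])` resolves with `processed` 4 and `errorCount` 1: the failure on id 3 is absorbed and the remaining records are still screened.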
Progress checkpointing
After each chunk the engine writes an intermediate checkpoint to the rescreenRuns table:
| Column | Value |
|---|---|
| chunksProcessed | Number of chunks completed so far |
| lastCursorId | The id of the last person processed |
| checkpointedAt | Timestamp of the checkpoint write |
This means the monitoring dashboard shows live progress during long runs rather than showing no activity until the run completes.
The final audit log entry also includes chunksProcessed for post-run observability.
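A minimal sketch of the checkpoint write, using an in-memory object as a stand-in for the rescreenRuns row (the real engine issues a database UPDATE; `RunCheckpoint` and `writeCheckpoint` are illustrative names):

```ts
interface RunCheckpoint {
  chunksProcessed: number;
  lastCursorId: number;
  checkpointedAt: Date;
}

// In-memory stand-in for the rescreenRuns row being updated.
const run: { checkpoint?: RunCheckpoint } = {};

function writeCheckpoint(chunksProcessed: number, lastCursorId: number): void {
  run.checkpoint = {
    chunksProcessed,
    lastCursorId,
    checkpointedAt: new Date(),
  };
}

// Called once per chunk, so a dashboard polling this row sees
// progress advance during the run instead of only at the end.
writeCheckpoint(1, 100);
writeCheckpoint(2, 200);
```

Because each write overwrites the previous checkpoint, the row always reflects the latest completed chunk rather than accumulating history.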
Why cursor pagination instead of OFFSET?
| Approach | Query cost as table grows |
|---|---|
| OFFSET N | O(N) — Postgres must scan and discard N rows per page |
| WHERE id > lastId | O(log n) — resolved via primary key index |
For a table with 50,000 monitored persons, OFFSET 49900 forces Postgres to scan and discard 49,900 rows just to produce the final page. The cursor approach seeks directly via the primary key index, so per-page cost stays effectively constant regardless of how deep into the table the run has progressed.
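The cost difference can be illustrated by counting the rows each strategy must touch per page. This is a toy model: the index descent for a keyset seek is O(log n), treated here as free, and `offsetRowsTouched`/`keysetRowsTouched` are illustrative names:

```ts
const TOTAL_ROWS = 50_000;
const PAGE_SIZE = 100;

// OFFSET pagination: the database materialises and discards `offset`
// rows before returning the page, so cost grows with page position.
function offsetRowsTouched(offset: number): number {
  return offset + PAGE_SIZE;
}

// Keyset pagination: the index seek lands on the first matching row,
// so only the page itself is read (index descent cost ignored here).
function keysetRowsTouched(): number {
  return PAGE_SIZE;
}

const lastPageOffset = TOTAL_ROWS - PAGE_SIZE; // 49,900
const offsetCost = offsetRowsTouched(lastPageOffset); // 50,000 rows touched
const keysetCost = keysetRowsTouched();               // 100 rows touched
```

Summed over a full run, OFFSET pagination touches a quadratic number of rows in total, while keyset pagination touches each row exactly once.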
Behaviour at chunk boundaries
| Scenario | Behaviour |
|---|---|
| Zero people in the table | First chunk is empty → loop exits immediately, run completes with totalPeopleScreened = 0 |
| Number of people < CHUNK_SIZE | Single partial chunk fetched → loop exits after one iteration |
| Number of people = exact multiple of CHUNK_SIZE | Final chunk returns exactly CHUNK_SIZE rows → next iteration returns empty → loop exits |
| Sparse / non-contiguous IDs | Cursor still advances correctly because WHERE id > lastId is id-value based, not offset based |
| New rows inserted mid-run | New rows get ids larger than all current chunks, so they are included in later iterations |
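The boundary cases above can be exercised with a small self-contained simulation of the loop over sorted in-memory id arrays (`fetchIds`, `runLoop`, and `range` are illustrative helpers, not engine code):

```ts
const SIM_CHUNK_SIZE = 100;

// Simulated keyset fetch over an ascending-sorted array of ids.
function fetchIds(ids: number[], afterId: number): number[] {
  return ids.filter((id) => id > afterId).slice(0, SIM_CHUNK_SIZE);
}

// Runs the cursor loop and reports how many fetches and rows it saw.
function runLoop(ids: number[]): { fetches: number; rows: number } {
  let lastId = 0;
  let fetches = 0;
  let rows = 0;
  while (true) {
    const chunk = fetchIds(ids, lastId);
    fetches++;
    if (chunk.length === 0) break;
    rows += chunk.length;
    lastId = chunk[chunk.length - 1];
  }
  return { fetches, rows };
}

const range = (n: number) => Array.from({ length: n }, (_, i) => i + 1);

runLoop([]);          // { fetches: 1, rows: 0 }   empty table exits at once
runLoop(range(42));   // { fetches: 2, rows: 42 }  single partial chunk
runLoop(range(200));  // { fetches: 3, rows: 200 } exact multiple: one extra empty fetch
```

The exact-multiple case shows the only overhead of this termination style: one extra empty fetch at the end, which is a single cheap indexed query.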
Organisation-level tenant isolation
Also introduced in this release: getOrCreateUserProfile() in src/lib/rbac.ts accepts a ProfileOptions object including organisationId. This ensures that user profiles are resolved within the correct tenant context, supporting organisation-level data isolation across the batch run.