Under the Hood: Eliminating SELECT * on High-Frequency Queries

Release: v0.1.89 · PERF-16
Area: Database Performance · Data Patterns

Background

As the volume of screened individuals and monitored entities grows, the cost of each database round-trip compounds. One of the most common — and easily overlooked — sources of unnecessary overhead is the unbounded SELECT * query: fetching every column from a table even when only a handful of fields are consumed by the application.

This post covers the two query paths addressed in v0.1.89.

The Problem

1. Batch Rescreening (`runBatchRescreen`)

The nightly batch rescreen job iterates over every active person record and re-evaluates their match status against the current OFSI consolidated list. The query that seeds this process was:

// Before — fetches all columns
db.select().from(people)

The people table includes columns such as registrationNumber and referenceId that are stored for audit and reference purposes but play no role in the fuzzy-match scoring logic. On a large dataset, transferring these columns on every nightly run adds measurable I/O and memory allocation with no benefit.

2. People POST Route — Sanctioned Entities Lookup

When a new person is submitted for screening, the API immediately queries the sanctionedEntities table to check for matches. This query was also unbounded:

// Before — fetches all columns
db.select().from(sanctionedEntities)

The sanctionedEntities table holds a range of supplementary fields (listing dates, legal basis references, programme codes, etc.) that are only required in the detailed entity view — not during initial match scoring.

The Fix

Both queries now declare explicit projections, requesting only the columns the screening logic actually reads.

`runBatchRescreen()` — updated projection

db.select({
  id: people.id,
  userId: people.userId,
  fullName: people.fullName,
  status: people.status,
  dateOfBirth: people.dateOfBirth,
  nationality: people.nationality,
}).from(people)

People POST — sanctioned entities projection

The projection is now consistent with the one already used inside getActiveSanctionEntries(), ensuring a single canonical set of fields is used for screening across both code paths:

db.select({
  id: sanctionedEntities.id,
  primaryName: sanctionedEntities.primaryName,
  aliases: sanctionedEntities.aliases,
  dateOfBirth: sanctionedEntities.dateOfBirth,
  nationality: sanctionedEntities.nationality,
}).from(sanctionedEntities)

Why It Matters

Query path	Frequency	Change
`runBatchRescreen()`	Nightly + on-demand	Drops unused `registrationNumber`, `referenceId` and other columns from the result set
`people` POST screening	Every new person submission	Aligns with the minimal projection used elsewhere in the screening pipeline

For compliance platforms that process thousands of records, reducing per-query payload size has a compounding effect: less memory allocated per row, less data transferred over the database connection, and faster serialisation through the ORM layer.

No Breaking Changes

This is a pure internal performance optimisation. There are no changes to:

API request or response shapes
Screening logic or match-scoring behaviour
Stored data or schema
Dashboard or monitoring output

Existing integrations require no updates.

Under the Hood: Eliminating SELECT * on High-Frequency Queries

Under the Hood: Eliminating SELECT * on High-Frequency Queries

Background

The Problem

1. Batch Rescreening (runBatchRescreen)

2. People POST Route — Sanctioned Entities Lookup

The Fix

runBatchRescreen() — updated projection

People POST — sanctioned entities projection

Why It Matters

No Breaking Changes

1. Batch Rescreening (`runBatchRescreen`)

`runBatchRescreen()` — updated projection