Under the Hood: Eliminating SELECT * on High-Frequency Queries
Under the Hood: Eliminating SELECT * on High-Frequency Queries
Release: v0.1.89 · PERF-16
Area: Database Performance · Data Patterns
Background
As the volume of screened individuals and monitored entities grows, the cost of each database round-trip compounds. One of the most common — and easily overlooked — sources of unnecessary overhead is the unbounded SELECT * query: fetching every column from a table even when only a handful of fields are consumed by the application.
This post covers the two query paths addressed in v0.1.89.
The Problem
1. Batch Rescreening (runBatchRescreen)
The nightly batch rescreen job iterates over every active person record and re-evaluates their match status against the current OFSI consolidated list. The query that seeds this process was:
// Before — fetches all columns
db.select().from(people)
The people table includes columns such as registrationNumber and referenceId that are stored for audit and reference purposes but play no role in the fuzzy-match scoring logic. On a large dataset, transferring these columns on every nightly run adds measurable I/O and memory allocation with no benefit.
2. People POST Route — Sanctioned Entities Lookup
When a new person is submitted for screening, the API immediately queries the sanctionedEntities table to check for matches. This query was also unbounded:
// Before — fetches all columns
db.select().from(sanctionedEntities)
The sanctionedEntities table holds a range of supplementary fields (listing dates, legal basis references, programme codes, etc.) that are only required in the detailed entity view — not during initial match scoring.
The Fix
Both queries now declare explicit projections, requesting only the columns the screening logic actually reads.
runBatchRescreen() — updated projection
db.select({
id: people.id,
userId: people.userId,
fullName: people.fullName,
status: people.status,
dateOfBirth: people.dateOfBirth,
nationality: people.nationality,
}).from(people)
People POST — sanctioned entities projection
The projection is now consistent with the one already used inside getActiveSanctionEntries(), ensuring a single canonical set of fields is used for screening across both code paths:
db.select({
id: sanctionedEntities.id,
primaryName: sanctionedEntities.primaryName,
aliases: sanctionedEntities.aliases,
dateOfBirth: sanctionedEntities.dateOfBirth,
nationality: sanctionedEntities.nationality,
}).from(sanctionedEntities)
Why It Matters
| Query path | Frequency | Change |
|---|---|---|
runBatchRescreen() | Nightly + on-demand | Drops unused registrationNumber, referenceId and other columns from the result set |
people POST screening | Every new person submission | Aligns with the minimal projection used elsewhere in the screening pipeline |
For compliance platforms that process thousands of records, reducing per-query payload size has a compounding effect: less memory allocated per row, less data transferred over the database connection, and faster serialisation through the ORM layer.
No Breaking Changes
This is a pure internal performance optimisation. There are no changes to:
- API request or response shapes
- Screening logic or match-scoring behaviour
- Stored data or schema
- Dashboard or monitoring output
Existing integrations require no updates.