How We Cut Pipeline Stream DB Load by 80%
Release: v1.0.8 | Category: Performance | File: src/app/api/pipelines/[id]/stream/route.ts
The Problem
Every active pipeline in SaaS Factory streams live status updates to the browser via a Server-Sent Events (SSE) connection at /api/pipelines/[id]/stream. Before v1.0.8, that endpoint executed 3 separate database queries on every 2-second tick — for as long as the pipeline was running:
- Full pipeline run + project JOIN — fetches the top-level run record and its associated project on every tick.
- SELECT * on all agentJobs for the run — returns every column including inputData and outputData, two JSONB blobs that store the complete tool-call history for each agent job. These blobs can be megabytes each.
- New agent logs since last sequence — a log tail query keyed off the last seen sequence number.
For a 30-minute pipeline that's 2,700 individual queries and 900 full agentJobs fetches, even when nothing in the job list has changed. With multiple pipelines running concurrently, the load scales linearly with pipeline count and puts serious pressure on the database.
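The arithmetic behind these figures is straightforward; a quick sketch (variable names are illustrative, not from the route code):

```typescript
// Back-of-envelope check of the "2,700 queries" figure above.
const tickSeconds = 2;
const runMinutes = 30;

// One tick every 2 seconds for 30 minutes.
const ticksPerRun = (runMinutes * 60) / tickSeconds; // 900 ticks

// Three queries per tick: run+project JOIN, full agentJobs SELECT, log tail.
const queriesPerTick = 3;
const totalQueries = ticksPerRun * queriesPerTick;   // 2,700 queries

console.log(ticksPerRun, totalQueries); // 900 2700
```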
The Fix
1. Status-Gated agentJobs Polling (ETag Pattern)
The pipeline run record already carries a status field that only changes when meaningful work happens. v1.0.8 introduces a lightweight status check before executing the expensive agentJobs query:
- On each tick, the poller first reads only the pipeline status (and a last-modified indicator).
- If the status is unchanged from the previous tick, the agentJobs SELECT is skipped entirely — the client already has a fresh copy.
- The full agentJobs query only fires when the pipeline status actually transitions (e.g., running → completed, or when a new agent job becomes active).
This is equivalent to HTTP ETag / If-None-Match cache validation, applied at the database query level.
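A minimal sketch of the gate logic, under assumed names (the actual route implementation is not shown in this note — `shouldFetchAgentJobs` and the `PipelineStatus` shape are illustrative):

```typescript
// What the cheap per-tick read returns: just the status plus a
// last-modified indicator, never the heavy agentJobs rows.
type PipelineStatus = { status: string; updatedAt: number };

// Decide whether the expensive agentJobs query should run this tick,
// comparing the cheap status read against what the previous tick saw —
// the DB-level analogue of ETag / If-None-Match.
function shouldFetchAgentJobs(
  prev: PipelineStatus | null,
  current: PipelineStatus,
): boolean {
  if (prev === null) return true; // first tick: client has nothing yet
  return prev.status !== current.status || prev.updatedAt !== current.updatedAt;
}
```

On a steady `running → running` tick the function returns false and the tick costs only the status read; any transition (or a bumped last-modified indicator for a newly active job) makes it return true and triggers the full fetch.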
2. Lean Column Projection on agentJobs
Even when the agentJobs query does run, it now selects only the columns the streaming client needs:
```sql
SELECT
  id,
  agentType,
  status,
  startedAt,
  completedAt,
  error,
  tokenUsage
FROM agentJobs
WHERE pipelineRunId = $1
```
The inputData and outputData JSONB columns — which hold full tool-call histories and can be multiple megabytes per row — are never included in the polling projection. These fields are only fetched on explicit, user-initiated detail requests, not on every 2-second heartbeat.
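One way to keep the two query shapes honest is to derive both from a single column list; a hedged sketch — `POLL_COLUMNS`, `buildPollQuery`, and `buildDetailQuery` are hypothetical helpers, not names from the route:

```typescript
// Columns the streaming client actually renders on each tick.
const POLL_COLUMNS = [
  'id', 'agentType', 'status', 'startedAt', 'completedAt', 'error', 'tokenUsage',
];

// Lean projection used on every 2-second tick — no JSONB blobs.
function buildPollQuery(): string {
  return `SELECT ${POLL_COLUMNS.join(', ')} FROM agentJobs WHERE pipelineRunId = $1`;
}

// Full row, including the megabyte-scale inputData/outputData blobs,
// issued only on an explicit user-initiated detail request for one job.
function buildDetailQuery(): string {
  const cols = [...POLL_COLUMNS, 'inputData', 'outputData'];
  return `SELECT ${cols.join(', ')} FROM agentJobs WHERE id = $1`;
}
```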
3. Path Forward — Postgres LISTEN/NOTIFY
Polling with a fixed 2-second interval is inherently wasteful; it fires queries even during idle periods between agent transitions. The architecture team is evaluating replacing the polling loop with Postgres LISTEN/NOTIFY:
- The database emits a notification the moment an agentJob row changes.
- The SSE handler wakes up only when there is actual work to report.
- Round-trip latency drops from up to 2 seconds to near-zero.
- DB query count drops from O(pipeline duration) to O(state changes).
This is tracked as a follow-on improvement to v1.0.8.
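The push-based shape under evaluation can be sketched without a database. In a real implementation a dedicated `pg` connection would issue `LISTEN` and an `AFTER UPDATE` trigger would call `pg_notify()`; here a Node `EventEmitter` stands in for that channel so the wake-on-change flow is visible (all names are illustrative assumptions):

```typescript
import { EventEmitter } from 'node:events';

// Shape of the notification payload the trigger would emit per row change.
type JobChange = { pipelineRunId: string; jobId: string; status: string };

// Stand-in for the Postgres notification channel.
const channel = new EventEmitter();

// The SSE handler registers once per stream instead of polling every 2 s,
// and returns an unsubscribe function for connection teardown.
function subscribe(runId: string, send: (change: JobChange) => void): () => void {
  const listener = (change: JobChange) => {
    if (change.pipelineRunId === runId) send(change); // wake only on real work
  };
  channel.on('agent_job_changes', listener);
  return () => channel.off('agent_job_changes', listener);
}

// The producer side: in Postgres this would be NOTIFY from a row trigger.
function notifyChange(change: JobChange): void {
  channel.emit('agent_job_changes', change);
}
```

The query count then tracks state changes rather than wall-clock ticks: nothing fires during idle stretches, and a transition reaches the stream without waiting out the 2-second interval.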
Impact
| Scenario | Queries Before | Queries After | Reduction |
|---|---|---|---|
| 30-min pipeline, single stream | ~2,700 | ≤ 540 | ~80% |
| 30-min pipeline, 10 concurrent streams | ~27,000 | ≤ 5,400 | ~80% |
| JSONB blob bytes transferred per poll | MBs (all jobs) | 0 | 100% |
The improvement scales directly with the number of concurrent active pipelines — the more pipelines running simultaneously, the greater the absolute DB load reduction.
Who Is Affected
- All deployments where pipeline runs are actively streamed to the frontend.
- Deployments with multiple concurrent pipeline runs see the most benefit — the previous query pattern was a compounding bottleneck.
- No changes to the SSE client protocol or event payload shape; this is a server-side-only optimization.