Updated March 25, 2026

Hardening Twilio SMS: Retry Logic & Circuit Breaking in v1.0.92

Version: 1.0.92
Control: SCR-04
Area: API Connection · SMS Notifications


Background

NurtureHub uses Twilio to deliver SMS hot lead alerts and notification messages to property agents. These alerts are time-sensitive — a missed SMS at the moment a prospect clicks through can mean a missed viewing or lost instruction.

As part of our ongoing Supply Chain Resilience (SCR) programme, every external API integration is reviewed for error handling robustness. This release addresses a gap identified in SCR-04: the Twilio SMS sender had no retry logic and silently discarded transient failures.


The Problem

The sendSms() function in src/lib/notifications/sms.ts made a single fetch() call to the Twilio REST API and returned null on any non-success response. This created three compounding failure modes:

1. Silent Failures on Transient Errors

Twilio, like any external API, occasionally returns transient errors:

Status Code | Meaning
429         | Rate limited (too many requests)
500         | Twilio internal server error
503         | Twilio service temporarily unavailable

With no retry logic in place, any of these responses would cause sendSms() to return null. The calling Inngest step would see this as a failed send, increment retryCount, and move on — without actually retrying the Twilio call.

2. Inngest Retries Were Never Triggered

Inngest has a powerful built-in retry mechanism: if a step throws an error, Inngest will automatically retry it with backoff. However, because sendSms() returned null instead of throwing, Inngest had no signal that anything had gone wrong. The retry mechanism was effectively bypassed for all transient Twilio failures.
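
To make the difference concrete, here is a minimal sketch of why a returned null never reaches a retry mechanism that is driven by thrown errors. The runWithRetries helper is a toy stand-in written for this illustration, not Inngest's API, and both send functions are hypothetical.

```typescript
// Toy stand-in for a retrying step runner. This is NOT Inngest's API; it only
// models the rule "retry the step when it throws".
function runWithRetries<T>(
  step: () => T,
  maxAttempts: number,
): { result: T | undefined; attempts: number } {
  let attempts = 0;
  while (attempts < maxAttempts) {
    attempts++;
    try {
      // Any returned value, including null, counts as a completed step.
      return { result: step(), attempts };
    } catch {
      // A thrown error is the only failure signal the runner sees; try again.
    }
  }
  return { result: undefined, attempts };
}

// Old contract (hypothetical): failure is reported as a return value.
const sendReturningNull = (): string | null => null;

// New contract (hypothetical): failure is reported by throwing.
const sendThrowing = (): string => {
  throw new Error("Twilio responded with 503");
};

const silent = runWithRetries(sendReturningNull, 3);
const surfaced = runWithRetries(sendThrowing, 3);

console.log(silent.attempts);   // 1: the null looked like success, so no retry
console.log(surfaced.attempts); // 3: every throw triggered another attempt
```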

3. No Request Timeout

The fetch() call had no timeout configured. A slow or hanging Twilio response could cause the Inngest step to stall indefinitely, blocking the job queue.


The Fix

Three changes were made to src/lib/notifications/sms.ts:

Retry Logic with Exponential Backoff

sendSms() now retries automatically on retryable status codes (429, 500, 503) using exponential backoff before propagating the failure. This handles the majority of transient Twilio blips transparently, without any Inngest step retry being consumed.
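
The retry loop can be sketched roughly as follows. The attempt limit, base delay, and the names sendWithBackoff, doRequest, and wait are assumptions made for illustration; the values actually used in sms.ts are not stated here. Only the retryable status codes (429, 500, 503) come from this release.

```typescript
// Status codes this release treats as retryable.
const RETRYABLE_STATUSES = new Set([429, 500, 503]);

// Hypothetical tuning values; the real ones are not documented here.
const MAX_ATTEMPTS = 3;
const BASE_DELAY_MS = 500;

// Delay before the retry that follows failed attempt `attempt` (1-based):
// 500 ms, 1000 ms, 2000 ms, ... doubling each time.
function backoffDelayMs(attempt: number, baseMs: number = BASE_DELAY_MS): number {
  return baseMs * 2 ** (attempt - 1);
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// `doRequest` stands in for the Twilio fetch() call and resolves to an HTTP
// status. `wait` is injectable so the loop can be exercised without real delays.
async function sendWithBackoff(
  doRequest: () => Promise<number>,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<number> {
  for (let attempt = 1; ; attempt++) {
    const status = await doRequest();
    const outOfAttempts = attempt >= MAX_ATTEMPTS;
    if (!RETRYABLE_STATUSES.has(status) || outOfAttempts) {
      // Success, a non-retryable error, or retries exhausted: hand the final
      // status back to the caller, which decides whether to throw.
      return status;
    }
    await wait(backoffDelayMs(attempt));
  }
}
```

With these assumed values, a send that sees 503, then 429, then 200 waits 500 ms and 1000 ms and succeeds on the third attempt, all inside a single Inngest step run.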

Throw on Failure

When all retries are exhausted (or a non-retryable error is received), sendSms() now throws an error rather than returning null. This surfaces the failure to Inngest, which then applies its own step-level retry policy, so the message gets further delivery attempts under sustained Twilio degradation instead of being dropped silently.
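
A minimal sketch of the "return a value or throw" contract is shown below. SmsSendError and unwrapSendResult are hypothetical names; what sendSms() actually throws or returns is not specified here. The sketch assumes the success body is Twilio's JSON message resource, which carries a sid field.

```typescript
// Hypothetical error type carrying the HTTP status for logging and alerting.
class SmsSendError extends Error {
  readonly status: number;

  constructor(status: number, detail: string) {
    super(`Twilio SMS send failed with HTTP ${status}: ${detail}`);
    this.name = "SmsSendError";
    this.status = status;
  }
}

// Turn a final HTTP status and response body into either a message SID or a
// thrown error. There is no null path, so a failure cannot be ignored silently.
function unwrapSendResult(status: number, body: string): string {
  if (status >= 200 && status < 300) {
    // Assumes a JSON success body; a malformed body would make JSON.parse throw.
    const parsed = JSON.parse(body) as { sid?: unknown };
    if (typeof parsed.sid === "string") {
      return parsed.sid;
    }
  }
  throw new SmsSendError(status, body);
}
```

Because the function throws for every unrecovered failure, the calling step needs no null check; an uncaught error is what lets the step runner see the failure and schedule its own retry.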

Request Timeout via AbortSignal.timeout

A timeout has been added to the fetch() call using the standard AbortSignal.timeout API. This ensures that a slow Twilio response does not block the Inngest worker indefinitely.
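
As an illustration, a request with a timeout might be built along these lines. The 10-second value and the names buildTwilioRequest, isTimeoutError, and postWithTimeout are assumptions for this sketch, and it requires a runtime that provides fetch, btoa, and AbortSignal.timeout (for example a recent Node.js release). The endpoint and form fields follow Twilio's public Messages API.

```typescript
// Hypothetical timeout; the value used in sms.ts is not stated here.
const REQUEST_TIMEOUT_MS = 10000;

// Build the URL and fetch() options for a Twilio "create message" request.
function buildTwilioRequest(
  accountSid: string,
  authToken: string,
  to: string,
  from: string,
  body: string,
  timeoutMs: number = REQUEST_TIMEOUT_MS,
) {
  const url = `https://api.twilio.com/2010-04-01/Accounts/${accountSid}/Messages.json`;
  const init = {
    method: "POST",
    headers: {
      Authorization: "Basic " + btoa(`${accountSid}:${authToken}`),
      "Content-Type": "application/x-www-form-urlencoded",
    },
    body: new URLSearchParams({ To: to, From: from, Body: body }).toString(),
    // The signal aborts itself after timeoutMs, which makes fetch() reject.
    signal: AbortSignal.timeout(timeoutMs),
  };
  return { url, init };
}

// A fetch() aborted by AbortSignal.timeout rejects with an error named "TimeoutError".
function isTimeoutError(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { name?: unknown }).name === "TimeoutError";
}

// Send the request, converting a timeout into a descriptive thrown error.
async function postWithTimeout(
  accountSid: string,
  authToken: string,
  to: string,
  from: string,
  body: string,
): Promise<Response> {
  const { url, init } = buildTwilioRequest(accountSid, authToken, to, from, body);
  try {
    return await fetch(url, init);
  } catch (err) {
    if (isTimeoutError(err)) {
      throw new Error(`Twilio request timed out after ${REQUEST_TIMEOUT_MS} ms`);
    }
    throw err; // network failures and other errors propagate unchanged
  }
}
```

A fresh signal is created for every request because a timeout signal starts counting as soon as it is created and cannot be reset.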


Behaviour Summary

Scenario                 | Before v1.0.92                       | After v1.0.92
Twilio returns 429       | Returns null, message silently lost  | Retries with backoff, then throws for Inngest retry
Twilio returns 503       | Returns null, message silently lost  | Retries with backoff, then throws for Inngest retry
Twilio hangs             | Fetch stalls indefinitely            | Aborted after timeout
Inngest retry triggered  | Never (no throw)                     | Yes, on all unrecovered failures
Transient blip recovery  | No                                   | Yes, via internal backoff retries

What Agents Will Notice

For the vast majority of agents, nothing changes — SMS hot lead alerts continue to arrive as expected. The improvement is in the platform's resilience:

  • Fewer missed SMS alerts during Twilio service blips or rate limit windows.
  • No silent failures — every undelivered SMS is now a visible, retried event in the job queue.
  • Faster recovery from transient Twilio errors without manual intervention.

Files Changed

  • src/lib/notifications/sms.ts
