Features | Calmony Sanctions Monitor | Updated March 12, 2026

Resilience: Circuit Breaker Pattern for External Services

Overview

The platform integrates with three external services at runtime:

  • Twilio — outbound SMS alerts for sanctions matches and compliance notifications
  • Stripe — subscription and billing management
  • OFSI endpoint — nightly sync of the UK consolidated sanctions list

This page describes how the platform handles sustained outages for each of these services.


Background: ERR-12 Audit Finding

An internal resilience audit (control ERR-12) identified that no circuit breaker pattern was in place for any external service. Without a circuit breaker, calls to a failing service are attempted again on every request with no trip threshold, which can cause:

  • Alert spam — continuous failed SMS attempts flooding Twilio's dead endpoint
  • Increased latency — every request waits for a timeout before failing
  • Cascading load — a degraded downstream amplifies load on the platform itself

The isStripeConfigured() check was explicitly reviewed during this audit. It is a static configuration guard (it returns 503 when Stripe credentials are not configured), not a runtime circuit breaker, and does not protect against a Stripe outage that occurs after startup.


Remediation Strategy

A full circuit breaker library was evaluated and deemed over-engineering for the current scale of the platform. A targeted, lightweight approach was adopted per service.

Twilio SMS

A failure counter is used to trip the SMS circuit after sustained failures:

  • Trip threshold: 5 consecutive failures
  • Window: 5 minutes
  • Backend: in-memory (single instance) or Redis-backed (multi-instance / production)
  • Behaviour when tripped: outbound SMS is suppressed; a service-degraded warning is surfaced in the monitoring dashboard
  • Reset: automatic reset after the window elapses with no further failures

This prevents alert spam to a dead Twilio service. Because the tripped state raises a dashboard warning, compliance teams can see that SMS alerts are not being delivered, rather than having genuine alerts dropped silently.
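
The in-memory and Redis-backed options listed above could sit behind one small storage contract. The sketch below is hypothetical: `FailureStore`, `InMemoryFailureStore`, and the `"sms"` key are illustrative names, not taken from the platform. The methods are asynchronous only so that a Redis implementation (for example, one built on the INCR, PEXPIRE, and DEL commands) could satisfy the same interface.

```typescript
// Hypothetical storage contract for the consecutive-failure counter.
interface FailureStore {
  // Adds one failure, restarts the expiry window, and returns the new count.
  increment(key: string, windowMs: number): Promise<number>;
  // Clears the counter, for example after a successful send.
  reset(key: string): Promise<void>;
}

// Single-instance backend: state lives in process memory and is lost on restart.
class InMemoryFailureStore implements FailureStore {
  private entries = new Map<string, { count: number; expiresAt: number }>();
  private now: () => number;

  // The clock is injectable so the expiry behaviour can be tested.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  async increment(key: string, windowMs: number): Promise<number> {
    const t = this.now();
    const current = this.entries.get(key);
    // An expired entry is treated as absent, so counting starts again at 1.
    const count = current && current.expiresAt > t ? current.count + 1 : 1;
    this.entries.set(key, { count, expiresAt: t + windowMs });
    return count;
  }

  async reset(key: string): Promise<void> {
    this.entries.delete(key);
  }
}
```

With several application instances, a shared backend matters because each instance would otherwise count only its own failures and trip at a different time.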

Failure 1  → log warning
Failure 2  → log warning
Failure 3  → log warning
Failure 4  → log warning
Failure 5  → TRIP: suppress SMS, raise dashboard warning
  ...
[5 min window resets] → RESET: resume SMS delivery
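
The timeline above can be sketched as a small in-memory breaker. This is a minimal illustration under stated assumptions, not the platform's implementation: the class and function names are hypothetical, a success is assumed to clear the counter, and the window is assumed to run from the most recent failure.

```typescript
// Hypothetical in-memory sketch of the SMS trip logic; all names are illustrative.
class SmsCircuitBreaker {
  private failures = 0;
  private lastFailureAt = 0;
  private threshold: number;
  private windowMs: number;

  constructor(threshold: number = 5, windowMs: number = 5 * 60 * 1000) {
    this.threshold = threshold;
    this.windowMs = windowMs;
  }

  // True while outbound SMS should be suppressed.
  isTripped(now: number = Date.now()): boolean {
    this.expire(now);
    return this.failures >= this.threshold;
  }

  // Assumption: any successful send ends the run of consecutive failures.
  recordSuccess(): void {
    this.failures = 0;
  }

  recordFailure(now: number = Date.now()): void {
    this.expire(now);
    this.failures += 1;
    this.lastFailureAt = now;
  }

  // Resets the counter once the window has elapsed with no further failures.
  private expire(now: number): void {
    if (this.failures > 0 && now - this.lastFailureAt >= this.windowMs) {
      this.failures = 0;
    }
  }
}

type SendResult = "sent" | "failed" | "suppressed";

// Wraps a send call: skips it while tripped, otherwise records the outcome.
async function sendGuarded(
  breaker: SmsCircuitBreaker,
  send: () => Promise<void>,
): Promise<SendResult> {
  if (breaker.isTripped()) return "suppressed";
  try {
    await send();
    breaker.recordSuccess();
    return "sent";
  } catch (err) {
    console.warn("SMS send failed", err);
    breaker.recordFailure();
    return "failed";
  }
}
```

Because sends are skipped while the circuit is tripped, no new failures are recorded during that time, so under these assumptions delivery resumes five minutes after the fifth failure.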

Stripe

No runtime circuit breaker is implemented. The existing isStripeConfigured() guard returns HTTP 503 when Stripe credentials are absent, which is sufficient for the billing use-case at this scale. Individual Stripe errors are handled gracefully at the call site.
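
The difference between a static configuration guard and call-site error handling can be sketched as follows. Only the name `isStripeConfigured` and the 503 response come from this page; the `STRIPE_SECRET_KEY` variable, the `env` parameter, the `handleBilling` wrapper, and the 502 response are assumptions made for illustration.

```typescript
// Hypothetical sketch: the env variable name and handler shape are assumptions.
type Env = Record<string, string | undefined>;

interface HttpResult {
  status: number;
  body: string;
}

// Static configuration guard: true only when a Stripe secret key is present.
function isStripeConfigured(env: Env): boolean {
  const key = env["STRIPE_SECRET_KEY"];
  return typeof key === "string" && key.trim().length > 0;
}

// Wraps one billing operation with the guard and call-site error handling.
async function handleBilling(
  env: Env,
  callStripe: () => Promise<string>,
): Promise<HttpResult> {
  if (!isStripeConfigured(env)) {
    return { status: 503, body: "Billing is not configured" };
  }
  try {
    return { status: 200, body: await callStripe() };
  } catch (err) {
    // A Stripe outage after startup is handled here on each request; nothing trips.
    const message = err instanceof Error ? err.message : String(err);
    return { status: 502, body: `Billing request failed: ${message}` };
  }
}
```

The guard inspects configuration only, so it passes during a Stripe outage; each failing call is then caught individually at the call site.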

OFSI Endpoint

The OFSI sanctions list sync runs as a nightly background job, fully isolated from the request path. Failures in the sync do not impact real-time screening. Sync failures are logged and surfaced in the monitoring dashboard. No per-request circuit breaker is required.
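
One way to keep the sync isolated from the request path is to have screening read a locally held copy of the list that the nightly job replaces only on success. The sketch below assumes that design; `SanctionsListCache`, `SyncStatus`, and the `fetchList` callback are hypothetical names, and the list is simplified to an array of strings.

```typescript
// Hypothetical sketch: names and the status shape are illustrative.
interface SyncStatus {
  lastSuccessAt: number | null;
  lastError: string | null;
}

class SanctionsListCache {
  private entries: string[] = [];
  // Read by the monitoring dashboard to show the outcome of the last sync.
  readonly status: SyncStatus = { lastSuccessAt: null, lastError: null };

  // Screening reads the last good list and never calls the remote endpoint.
  current(): readonly string[] {
    return this.entries;
  }

  // Nightly job body: swap in the new list on success, keep the old one on failure.
  async sync(
    fetchList: () => Promise<string[]>,
    now: number = Date.now(),
  ): Promise<boolean> {
    try {
      this.entries = await fetchList();
      this.status.lastSuccessAt = now;
      this.status.lastError = null;
      return true;
    } catch (err) {
      this.status.lastError = err instanceof Error ? err.message : String(err);
      console.error("OFSI sync failed:", this.status.lastError);
      return false;
    }
  }
}
```

Under this arrangement a failed sync leaves screening on the previous night's list, which is why no per-request breaker is needed, although the list grows stale if failures persist.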


Monitoring

When the Twilio circuit is tripped, the compliance monitoring dashboard will display a service-degraded warning. Operators should:

  1. Check Twilio service status at https://status.twilio.com
  2. Review recent SMS delivery logs for the specific error
  3. Wait for the circuit to reset automatically; manual intervention is required only if the outage persists

Related Controls

Control    Description
ERR-12     Circuit breaker pattern for external services