v0.1.10 — Closing the Silent Failure Gap: Critical Incident Escalation
Release: v0.1.10
ISO 27001 Control: ISO-05 — Incident Response
The Problem
Until this release, critical system failures could occur with no engineer ever finding out.
The nightly workflow that synchronises the OFSI consolidated sanctions list, the monthly billing workflow, and a range of runtime error conditions — health check degradation, payment processing failures, sync failures — produced no automatic notifications. If a workflow failed at 2 AM, or if sanctions screening silently stopped working, the only way to discover it was to manually inspect logs or notice something wrong downstream.
This is a direct gap under ISO 27001 Control ISO-05 (Incident Response), which requires that information security incidents are detected and that responsible parties are notified in a timely manner.
What Changed in v0.1.10
1. Workflow Failure Notifications
We added failure notification steps to the two most critical GitHub Actions workflows:
- nightly-sync.yml — runs every night to pull the latest OFSI consolidated list and update the screening database.
- monthly-billing.yml — processes subscription billing on a monthly cycle.
When either workflow fails, a notification is now dispatched automatically — via a Slack GitHub Action or GitHub's native email notification — so the on-call engineer is informed immediately.
```yaml
# Example failure step added to nightly-sync.yml
- name: Notify on failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": ":red_circle: nightly-sync failed on `${{ github.ref }}`. See: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    # Selects the incoming-webhook delivery mode of slack-github-action v1
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
```
2. Runtime Critical Error Escalation via Sentry
For failures that happen at runtime — rather than in a scheduled workflow — we have configured Sentry alert rules to escalate high-severity issues to Slack and PagerDuty.
Covered error categories:
| Error Type | Severity | Escalation Target |
|---|---|---|
| Health check degradation | Critical | Slack + PagerDuty |
| Payment processing failure | Critical | Slack + PagerDuty |
| Sanctions sync failure | Critical | Slack + PagerDuty |
This ensures that even runtime errors that don't surface through a CI workflow are captured and escalated within minutes.
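The escalation mapping in the table can be sketched as a small routing helper. The function and category names below are illustrative only; the real routing lives in Sentry alert rules configured through the dashboard, not in application code.

```python
# Illustrative sketch of the escalation routing described above.
# Category names and the escalation_targets helper are hypothetical;
# actual routing is configured as Sentry alert rules.

CRITICAL_CATEGORIES = {
    "health_check_degradation",
    "payment_processing_failure",
    "sanctions_sync_failure",
}

def escalation_targets(category: str, severity: str) -> list:
    """Return the channels an incident should be escalated to."""
    if severity == "critical" and category in CRITICAL_CATEGORIES:
        # Critical runtime errors page the on-call engineer via
        # PagerDuty and post to the incident Slack channel.
        return ["slack", "pagerduty"]
    # Lower-severity issues stay in Sentry for routine triage.
    return []

print(escalation_targets("sanctions_sync_failure", "critical"))
# → ['slack', 'pagerduty']
```

The point of the sketch is that escalation is keyed on both category and severity: a warning-level event in the same category stays in Sentry rather than paging anyone.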
3. On-Call Runbook (RUNBOOK.md)
A new RUNBOOK.md has been added to the repository. It documents the escalation paths and response procedures for each critical failure type covered by this release.
The runbook covers:
- Severity classification — how to assess and categorise an incident
- Escalation contacts — who to contact and through which channel
- Step-by-step response procedures for each failure type (sync failure, billing failure, health degradation)
- Post-incident actions — logging, review, and ISO-05 reporting requirements
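The post-incident actions above imply a structured incident record. The sketch below shows one possible shape for such a record; the field names are hypothetical and meant only to illustrate the kind of evidence ISO-05 reporting expects, with the authoritative field list defined in RUNBOOK.md.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical incident record; the actual required fields are
# documented in RUNBOOK.md, not here.
@dataclass
class IncidentRecord:
    failure_type: str                  # e.g. "sync failure", "billing failure"
    severity: str                      # classification per the runbook
    detected_at: str                   # ISO 8601 timestamp from the alert
    escalated_to: list = field(default_factory=list)
    resolution: str = ""               # filled in during post-incident review

def new_incident(failure_type: str, severity: str) -> IncidentRecord:
    """Open an incident record at detection time."""
    return IncidentRecord(
        failure_type=failure_type,
        severity=severity,
        detected_at=datetime.now(timezone.utc).isoformat(),
    )

incident = new_incident("sync failure", "critical")
incident.escalated_to = ["slack", "pagerduty"]
print(asdict(incident)["failure_type"])  # → sync failure
```

Serialising the record with asdict makes it straightforward to log as JSON alongside the alert that triggered it, keeping detection, escalation, and resolution in one auditable artefact.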
Why This Matters for Compliance Teams
For users of this platform, these changes provide greater confidence that:
- Sanctions screening data stays current. If the nightly OFSI sync fails, your team is notified before screening results become stale.
- Billing disruptions are caught immediately. Payment processing failures surface to engineers rather than silently affecting customer accounts.
- The ISO 27001 audit trail is maintained. Documented escalation paths and alert configurations provide evidence of a functioning incident response process.
Configuration Required
To enable Slack notifications for workflow failures, set the following secret in your GitHub repository:
| Secret | Description |
|---|---|
| SLACK_WEBHOOK_URL | Incoming webhook URL for your Slack notification channel |
For PagerDuty escalation via Sentry, configure a Sentry integration with your PagerDuty service key through the Sentry dashboard under Settings → Integrations → PagerDuty.
Related
- RUNBOOK.md — On-call escalation runbook
- ISO/IEC 27001:2022, Annex A — Control 5.26: Response to information security incidents
- Changelog — v0.1.10