v0.1.10 — Closing the Silent Failure Gap: Critical Incident Escalation
Release: v0.1.10
ISO 27001 Control: ISO-05 — Incident Response
The Problem
Until this release, critical system failures could occur with no engineer ever finding out.
The nightly workflow that synchronises the OFSI consolidated sanctions list, the monthly billing workflow, and a range of runtime error conditions — health check degradation, payment processing failures, sync failures — produced no automatic notifications. If a workflow failed at 2 AM, or if sanctions screening silently stopped working, the only way to discover it was to manually inspect logs or notice something wrong downstream.
This is a direct gap under ISO 27001 Control ISO-05 (Incident Response), which requires that information security incidents are detected and that responsible parties are notified in a timely manner.
What Changed in v0.1.10
1. Workflow Failure Notifications
We added failure notification steps to the two most critical GitHub Actions workflows:
- nightly-sync.yml — runs every night to pull the latest OFSI consolidated list and update the screening database.
- monthly-billing.yml — processes subscription billing on a monthly cycle.
When either workflow fails, a notification is now dispatched automatically — via a Slack GitHub Action or GitHub's native email notification — so the on-call engineer is informed immediately.
```yaml
# Example failure step added to nightly-sync.yml
- name: Notify on failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": ":red_circle: nightly-sync failed on `${{ github.ref }}`. See: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    # Selects the incoming-webhook delivery mode of slack-github-action v1
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
```
2. Runtime Critical Error Escalation via Sentry
For failures that happen at runtime — rather than in a scheduled workflow — we have configured Sentry alert rules to escalate high-severity issues to Slack and PagerDuty.
Covered error categories:
| Error Type | Severity | Escalation Target |
|---|---|---|
| Health check degradation | Critical | Slack + PagerDuty |
| Payment processing failure | Critical | Slack + PagerDuty |
| Sanctions sync failure | Critical | Slack + PagerDuty |
This ensures that even runtime errors that don't surface through a CI workflow are captured and escalated within minutes.
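The escalation mapping in the table can be sketched as a small routing helper. The function and category names below are illustrative only; the real routing lives in Sentry alert rules configured through the dashboard, not in application code.

```python
# Illustrative sketch of the escalation routing described above.
# Category names and the escalation_targets helper are hypothetical;
# actual routing is configured as Sentry alert rules.

CRITICAL_CATEGORIES = {
    "health_check_degradation",
    "payment_processing_failure",
    "sanctions_sync_failure",
}

def escalation_targets(category: str, severity: str) -> list:
    """Return the channels an incident should be escalated to."""
    if severity == "critical" and category in CRITICAL_CATEGORIES:
        # Critical runtime errors page the on-call engineer via
        # PagerDuty and post to the incident Slack channel.
        return ["slack", "pagerduty"]
    # Lower-severity issues stay in Sentry for routine triage.
    return []

print(escalation_targets("sanctions_sync_failure", "critical"))
# → ['slack', 'pagerduty']
```

The point of the sketch is that escalation is keyed on both category and severity: a warning-level event in the same category stays in Sentry rather than paging anyone.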
3. On-Call Runbook (RUNBOOK.md)
A new RUNBOOK.md has been added to the repository. It documents the escalation paths and response procedures for each critical failure type covered by this release.
The runbook covers:
- Severity classification — how to assess and categorise an incident
- Escalation contacts — who to contact and through which channel
- Step-by-step response procedures for each failure type (sync failure, billing failure, health degradation)
- Post-incident actions — logging, review, and ISO-05 reporting requirements
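The post-incident actions above imply a structured incident record. The sketch below shows one possible shape for such a record; the field names are hypothetical and meant only to illustrate the kind of evidence ISO-05 reporting expects, with the authoritative field list defined in RUNBOOK.md.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical incident record; the actual required fields are
# documented in RUNBOOK.md, not here.
@dataclass
class IncidentRecord:
    failure_type: str                  # e.g. "sync failure", "billing failure"
    severity: str                      # classification per the runbook
    detected_at: str                   # ISO 8601 timestamp from the alert
    escalated_to: list = field(default_factory=list)
    resolution: str = ""               # filled in during post-incident review

def new_incident(failure_type: str, severity: str) -> IncidentRecord:
    """Open an incident record at detection time."""
    return IncidentRecord(
        failure_type=failure_type,
        severity=severity,
        detected_at=datetime.now(timezone.utc).isoformat(),
    )

incident = new_incident("sync failure", "critical")
incident.escalated_to = ["slack", "pagerduty"]
print(asdict(incident)["failure_type"])  # → sync failure
```

Serialising the record with asdict makes it straightforward to log as JSON alongside the alert that triggered it, keeping detection, escalation, and resolution in one auditable artefact.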
Why This Matters for Compliance Teams
For users of this platform, these changes provide greater confidence that:
- Sanctions screening data stays current. If the nightly OFSI sync fails, your team is notified before screening results become stale.
- Billing disruptions are caught immediately. Payment processing failures surface to engineers rather than silently affecting customer accounts.
- The ISO 27001 audit trail is maintained. Documented escalation paths and alert configurations provide evidence of a functioning incident response process.
Configuration Required
To enable Slack notifications for workflow failures, set the following secret in your GitHub repository:
| Secret | Description |
|---|---|
| SLACK_WEBHOOK_URL | Incoming webhook URL for your Slack notification channel |
For PagerDuty escalation via Sentry, configure a Sentry integration with your PagerDuty service key through the Sentry dashboard under Settings → Integrations → PagerDuty.
Related
- RUNBOOK.md — On-call escalation runbook
- ISO/IEC 27001:2022, Annex A — Control 5.26: Response to information security incidents
- Changelog — v0.1.10