Calmony Sanctions Monitor · Updated March 12, 2026

SOC2-09: Configuring Critical Failure Alerting

This page documents the SOC 2 control gap identified in v0.1.146 and provides a step-by-step guide for remediating missing critical failure alerting across the platform.

Background

SOC 2 control SOC2-09 requires that the organisation has monitoring and alerting mechanisms in place to notify responsible personnel of critical system failures in a timely manner. An audit identified that the following failure scenarios were not generating any alerts:

Failure Scenario | Where It Occurs | Risk
Nightly OFSI sanctions sync fails | nightly-sync.yml workflow | Stale sanctions data served to compliance users
Monthly billing job errors | monthly-billing.yml workflow | Revenue loss, customer impact
/api/health returns 503 | Production runtime | Application unavailability undetected

No Slack, PagerDuty, email, or webhook destinations were configured for any of these scenarios.


Remediation Steps

1. GitHub Actions — Slack Webhook Notifications

Add a failure notification step to each critical workflow. This step runs only when a preceding step fails, using the if: failure() condition.

Prerequisites:

A Slack incoming webhook URL stored as the SLACK_WEBHOOK_URL repository secret (see Required Secrets below).

Add to nightly-sync.yml and monthly-billing.yml:

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      # ... existing steps ...

      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: |
            {
              "text": "🚨 *${{ github.workflow }}* failed.\nBranch: `${{ github.ref }}`\nRun: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
            }
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

Tip: The same pattern can be used with PagerDuty or any webhook-based incident management tool by replacing the uses action and payload format.
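For PagerDuty specifically, one option is to post directly to the public Events API v2 from a run step instead of using a dedicated action. The sketch below assumes a PAGERDUTY_ROUTING_KEY repository secret (an Events API v2 integration key for the target service); the step name and payload fields are illustrative:

```yaml
      - name: Notify PagerDuty on failure
        if: failure()
        run: |
          # Events API v2: "trigger" opens an incident on the routed service
          curl -sS -X POST https://events.pagerduty.com/v2/enqueue \
            -H 'Content-Type: application/json' \
            -d '{
              "routing_key": "'"$PAGERDUTY_ROUTING_KEY"'",
              "event_action": "trigger",
              "payload": {
                "summary": "${{ github.workflow }} failed on ${{ github.ref }}",
                "source": "github-actions",
                "severity": "critical"
              }
            }'
        env:
          PAGERDUTY_ROUTING_KEY: ${{ secrets.PAGERDUTY_ROUTING_KEY }}
```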


2. Runtime Alerting via Sentry

If Sentry is already integrated for error tracking, configure alert rules to notify on-call engineers:

  1. Navigate to Sentry → Alerts → Create Alert Rule.
  2. Set the condition to trigger on: "Number of events is greater than 0 in 1 minute" for issues with level fatal or error.
  3. Add a notification action targeting the appropriate Slack channel or PagerDuty service.
  4. Scope the rule to the production environment.

3. Uptime Monitoring for /api/health

Configure an external uptime monitor to poll the health endpoint and alert when the application is degraded or unavailable.

Recommended tools: Better Uptime, Checkly, Pingdom

Configuration:

Setting | Value
URL | https://<your-domain>/api/health
Method | GET
Check interval | Every 1 minute
Alert condition | HTTP status != 200 (or >= 500)
Alert channel | Slack / PagerDuty / email
Confirmation period | 2 consecutive failures before alerting
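If a hosted monitor is not available immediately, the same behaviour can be approximated with a small scheduled script. A minimal Python sketch, assuming the endpoint URL is supplied by the caller; the two-consecutive-failures rule mirrors the confirmation period above:

```python
import urllib.error
import urllib.request

FAILURE_THRESHOLD = 2  # consecutive failed checks before alerting


def check_health(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def should_alert(results: list, threshold: int = FAILURE_THRESHOLD) -> bool:
    """Alert only once the last `threshold` checks have all failed,
    suppressing one-off blips (the confirmation period)."""
    if len(results) < threshold:
        return False
    return not any(results[-threshold:])
```

Run on a one-minute schedule (e.g. cron), append each check_health result to a history list, and fire the Slack/PagerDuty hook when should_alert returns True.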

A healthy response from /api/health should return HTTP 200. A 503 response indicates the application or a critical dependency (e.g. database, OFSI sync) is unhealthy.
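The endpoint itself can collapse its dependency checks into that single status code. A framework-agnostic sketch; db_ok and sync_fresh are hypothetical stand-ins for the real database and OFSI sync probes:

```python
def health_response(db_ok: bool, sync_fresh: bool) -> tuple:
    """Map dependency checks to the status code monitors expect:
    200 when all checks pass, 503 when any critical dependency fails."""
    checks = {"database": db_ok, "ofsi_sync": sync_fresh}
    healthy = all(checks.values())
    body = {"status": "ok" if healthy else "degraded", "checks": checks}
    return (200 if healthy else 503, body)
```

Returning the per-check breakdown in the body makes a 503 immediately diagnosable from the uptime monitor's last-response view.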


Required Secrets

The following secrets must be added to your GitHub repository and/or CI environment:

Secret | Description
SLACK_WEBHOOK_URL | Incoming webhook URL for your Slack alerting channel

Compliance Status

Control | Status | Resolved In
SOC2-09 — Critical Failure Alerting | ⚠️ Open | Pending — see remediation above

Once the steps above are implemented and verified, this control can be marked as remediated in the next SOC 2 audit cycle.