# Fleet-wide Auth Rate Limiting
Calmony Pay enforces rate limits on all authentication and sensitive endpoints at the edge. From v1.0.47, these limits are shared atomically across every Vercel edge instance via Upstash Redis, closing a bypass vector that existed with per-instance in-memory counters.
## How It Works
### The Problem with Per-instance Limits
Each Vercel edge/serverless instance has isolated memory. If a 10 req/min limit were stored in-memory on each instance, an attacker could distribute requests across N instances and effectively multiply their allowed throughput by N.
### The Solution: Upstash Redis Atomic Counter
When `UPSTASH_REDIS_REST_URL` and `UPSTASH_REDIS_REST_TOKEN` are configured, the middleware calls `checkRateLimitEdge`, which:
- Computes a fixed-window key in the form `rl:<label>:<windowEpoch>` (e.g. `rl:1.2.3.4:auth:1234567`).
- Sends a single pipelined HTTP request to the Upstash Redis REST API containing:
  - `INCR <key>` — atomically increment the counter.
  - `PEXPIRE <key> <windowMs * 2> NX` — set a TTL on the first write only; keys self-expire, so no cleanup job is needed.
- Compares the returned counter value against the configured limit.
- Returns a `RateLimitResult` with `allowed`, `limit`, `remaining`, and `reset` fields.
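The key scheme and pipelined payload above can be sketched as follows. The `/pipeline` endpoint and command-array shape follow the Upstash Redis REST API; the function names (`windowKey`, `pipelineBody`, `incrementCounter`) are illustrative, not the library's internals.

```typescript
// Fixed window: all requests in the same windowMs bucket share one key.
function windowKey(label: string, windowMs: number, now = Date.now()): string {
  return `rl:${label}:${Math.floor(now / windowMs)}`;
}

// Two commands sent in one pipelined request.
function pipelineBody(key: string, windowMs: number): string[][] {
  return [
    ["INCR", key],                                // atomic increment
    ["PEXPIRE", key, String(windowMs * 2), "NX"], // TTL on first write only
  ];
}

// One HTTP round-trip executing both commands; returns the post-INCR count.
async function incrementCounter(
  url: string,
  token: string,
  key: string,
  windowMs: number
): Promise<number> {
  const res = await fetch(`${url}/pipeline`, {
    method: "POST",
    headers: { Authorization: `Bearer ${token}` },
    body: JSON.stringify(pipelineBody(key, windowMs)),
  });
  const results: Array<{ result: number }> = await res.json();
  return results[0].result;
}
```

Because `INCR` returns the post-increment value, a single round-trip both records the request and tells the caller whether it exceeded the limit.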
All edge instances hit the same Redis key — the limit is a true hard cap fleet-wide.
### Graceful Degradation
| Condition | Behaviour |
|---|---|
| Redis configured and reachable | Fleet-wide atomic counter |
| Redis not configured | Per-instance in-memory sliding window (zero-latency fallback) |
| Redis configured but unreachable | Fails open (request allowed); emits `console.warn` |
Auth traffic is never blocked by a Redis infrastructure failure.
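The fail-open rule in the last table row can be sketched as a try/catch wrapper. `failOpen` and its `checkRedis` parameter are illustrative names, not the actual implementation.

```typescript
type Result = { allowed: boolean; limit: number; remaining: number; reset: number };

// Wrap the Redis check so an infrastructure failure never blocks auth traffic.
async function failOpen(
  checkRedis: () => Promise<Result>,
  limit: number,
  windowMs: number
): Promise<Result> {
  try {
    return await checkRedis();
  } catch (err) {
    // Redis configured but unreachable: allow the request and log a warning.
    console.warn("rate-limit: Redis unreachable, failing open", err);
    return { allowed: true, limit, remaining: limit, reset: Date.now() + windowMs };
  }
}
```

The trade-off is deliberate: a brief window of unenforced limits during a Redis outage is preferable to a hard outage of all sign-in traffic.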
## Protected Endpoints
| Endpoint | Methods | Limit | Purpose |
|---|---|---|---|
| `/api/auth/*` | POST, GET | 10 req/min/IP | NextAuth sign-in / OAuth initiation |
| `/sign-in` | POST | 10 req/min/IP | Credential sign-in page |
| `/sign-up` | POST | 10 req/min/IP | Account registration |
| `/invite/*` | POST, GET | 20 req/min/IP | Invite token enumeration prevention |
| `/api/trpc/invite` | POST | 30 req/min/IP | tRPC invite mutations |
| `/api/trpc/org` | POST | 30 req/min/IP | tRPC org mutations |
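The table above can be expressed as a route-matching map plus a 429 builder. This is an illustrative sketch, not the project's actual middleware source; `ROUTE_LIMITS`, `limitFor`, and `tooManyRequests` are hypothetical names.

```typescript
// Limits from the Protected Endpoints table, keyed by path pattern.
const ROUTE_LIMITS: Array<{ pattern: RegExp; limit: number; windowMs: number }> = [
  { pattern: /^\/api\/auth\//, limit: 10, windowMs: 60_000 },
  { pattern: /^\/sign-in$/, limit: 10, windowMs: 60_000 },
  { pattern: /^\/sign-up$/, limit: 10, windowMs: 60_000 },
  { pattern: /^\/invite\//, limit: 20, windowMs: 60_000 },
  { pattern: /^\/api\/trpc\/invite/, limit: 30, windowMs: 60_000 },
  { pattern: /^\/api\/trpc\/org/, limit: 30, windowMs: 60_000 },
];

// First matching rule wins; unmatched paths are not rate limited.
function limitFor(pathname: string) {
  return ROUTE_LIMITS.find((r) => r.pattern.test(pathname)) ?? null;
}

// Standard 429 with a Retry-After hint derived from the window reset time.
function tooManyRequests(reset: number, limit: number): Response {
  return new Response("Too Many Requests", {
    status: 429,
    headers: {
      "Retry-After": String(Math.max(0, Math.ceil((reset - Date.now()) / 1000))),
      "X-RateLimit-Limit": String(limit),
    },
  });
}
```

In the real middleware, `limitFor`'s result would feed the `limit` and `windowMs` arguments of `checkRateLimitEdge` described under API Reference.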
## Configuration
### Enable Fleet-wide Limiting
Add the following to your environment (`.env.local` for development, environment variables in your Vercel project for production):
```
UPSTASH_REDIS_REST_URL=https://<your-database>.upstash.io
UPSTASH_REDIS_REST_TOKEN=<your-token>
```
Both variables are available from the Upstash console after creating a Redis database. Choose the same AWS/GCP region as your primary Vercel deployment to minimise latency.
Note: If these variables are absent, the system automatically falls back to per-instance in-memory limits. No additional configuration is needed to keep existing deployments working.
## Performance
The Redis check adds a single HTTP round-trip to the middleware on every matched request. On a same-region Upstash instance, this is typically 10–20 ms — negligible for auth-endpoint traffic.
## API Reference
### `checkRateLimitEdge(key, limit, windowMs): Promise<RateLimitResult>`
The primary async rate-limit function used by the middleware.
```ts
import { checkRateLimitEdge } from "@/lib/security/rate-limit";

const result = await checkRateLimitEdge(
  `${ip}:auth`, // key — unique per client + route
  10,           // limit — max requests
  60_000        // windowMs — window duration in ms
);

if (!result.allowed) {
  // return 429
}
```
Parameters:
| Parameter | Type | Description |
|---|---|---|
| `key` | `string` | Unique identifier for this client+route (e.g. `1.2.3.4:auth`) |
| `limit` | `number` | Maximum requests allowed per window |
| `windowMs` | `number` | Window duration in milliseconds |
Returns `RateLimitResult`:
| Field | Type | Description |
|---|---|---|
| `allowed` | `boolean` | Whether the request should be permitted |
| `limit` | `number` | Total requests allowed per window |
| `remaining` | `number` | Requests remaining in the current window |
| `reset` | `number` | Epoch-ms timestamp when the window resets |
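The table above maps directly to a TypeScript interface. The interface name comes from this document; the `toResult` helper, which derives the fields from a raw fixed-window count, is a sketch of the semantics rather than the library's implementation.

```typescript
// Shape of the rate-limit result as documented above.
interface RateLimitResult {
  allowed: boolean;   // whether the request should be permitted
  limit: number;      // total requests allowed per window
  remaining: number;  // requests remaining in the current window
  reset: number;      // epoch-ms timestamp when the window resets
}

// For a fixed-window counter, every field follows from the raw count.
function toResult(
  count: number,
  limit: number,
  windowMs: number,
  now = Date.now()
): RateLimitResult {
  const windowEpoch = Math.floor(now / windowMs);
  return {
    allowed: count <= limit,
    limit,
    remaining: Math.max(0, limit - count),
    reset: (windowEpoch + 1) * windowMs, // start of the next window
  };
}
```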
### `checkRateLimit(key, limit, windowMs): RateLimitResult`
The legacy synchronous in-memory limiter. Retained for backwards compatibility with callers that cannot `await`. Does not provide fleet-wide enforcement — use `checkRateLimitEdge` in all new code.
## Runtime Compatibility
`checkRateLimitEdge` uses the native `fetch` API and has no Node.js-specific imports, making it safe to call from Next.js Edge middleware.