AI Analytics Engine
Availability: This feature is part of release v1.0.35, which is pending merge. The infrastructure described here is not yet deployed. This page will be updated when the release lands on main.
The AI Analytics Engine provides tenant-isolated ML-powered insights, on-demand predictions, and anonymized cross-tenant industry benchmarks across HR, finance, legal, and operations domains.
Architecture overview
┌─────────────────────────────────────────────────────────────┐
│  Your tenant                                                │
│                                                             │
│  tRPC client → aiAnalytics router → prediction_jobs         │
│                        │            insights                │
│                        │                                    │
│                Inngest event bus                            │
│                        │                                    │
│  ┌─────────────────────┼────────────────────────────────┐   │
│  │ ai-prediction-job-processor  (event-driven)          │   │
│  │ ai-daily-metric-calculation  (03:00 UTC daily)       │   │
│  │ ai-weekly-model-retraining   (02:00 UTC Mon)         │   │
│  │ ai-monthly-benchmark-update  (01:00 UTC 1st)         │   │
│  └──────────────────────────────────────────────────────┘   │
│                        │                                    │
│         ai_models (shared, anonymized)                      │
│    benchmark_data (cross-tenant, anonymized)                │
└─────────────────────────────────────────────────────────────┘
Data model
ai_models — ML model registry
Stores versioned model definitions. Models are shared across all tenants and trained on anonymized aggregate data.
| Column | Type | Description |
|---|---|---|
| id | text (UUID) | Primary key |
| name | text | Display name |
| type | enum | turnover_prediction, spend_forecasting, contract_risk, performance_prediction |
| version | text | Semantic version string |
| trainingDataCutoff | timestamp | Latest training data included |
| accuracyMetrics | jsonb | { accuracy, precision, recall, f1, ... }; shape varies by type |
| status | enum | training → active → deprecated / failed |
Unique constraint: (type, version).
prediction_jobs — prediction queue
Records each prediction request made by a tenant.
| Column | Type | Description |
|---|---|---|
| id | text (UUID) | Primary key |
| orgId | text | Owning organization (tenant-scoped) |
| modelId | text | FK → ai_models.id |
| inputDataHash | text | SHA-256 of the serialized input payload |
| status | enum | queued → running → completed / failed |
| scheduledFor | timestamp | Execution target time |
| completedAt | timestamp | Populated on completion |
| results | jsonb | Structured prediction output |
| errorMessage | text | Populated on failure |
Unique constraint: (orgId, modelId, inputDataHash) — prevents duplicate jobs for identical inputs.
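The page does not say how the payload is serialized before hashing. The sketch below assumes a canonical form with sorted object keys, so that two payloads with the same content but different key order produce the same inputDataHash. The names canonicalize and hashInput are hypothetical, and the real serializer may differ.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Assumption: keys are sorted before serialization so key order never changes the hash.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}

// SHA-256 over the canonical string, hex-encoded (64 characters).
function hashInput(input: Json): string {
  return createHash("sha256").update(canonicalize(input)).digest("hex");
}
```

With this approach, `hashInput({ a: 1, b: 2 })` and `hashInput({ b: 2, a: 1 })` return the same digest, which is what the unique constraint needs to treat them as one job.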
insights — AI recommendations
Holds generated recommendations surfaced to users within a tenant.
| Column | Type | Description |
|---|---|---|
| id | text (UUID) | Primary key |
| orgId | text | Owning organization |
| category | enum | hr, finance, legal, operations |
| insightType | text | e.g. high_turnover_risk, budget_overrun |
| title | text | Short human-readable title |
| description | text | Full description with supporting context |
| confidenceScore | decimal(4,3) | Model confidence, from 0.000 to 1.000 |
| dataPoints | jsonb | Supporting data, e.g. { affectedEmployeeCount: 5 } |
| recommendedActions | jsonb[] | Ordered list of { action, priority, estimatedImpact?, resourceUrl? } |
| expiresAt | timestamp | Insight is hidden from queries after this time |
| viewedBy | jsonb | Array of user IDs that have read the insight |
| dismissedAt | timestamp | Set when a user dismisses the insight (soft-delete) |
| dismissedBy | text | FK → users.id |
benchmark_data — cross-tenant benchmarks
Aggregate statistics only — no org IDs or personal data.
| Column | Type | Description |
|---|---|---|
| metricName | text | e.g. employee_turnover_rate |
| industry | text | e.g. technology, healthcare |
| companySizeRange | text | e.g. 50-200, 201-1000 |
| percentile25 | decimal(10,4) | 25th-percentile value |
| percentile50 | decimal(10,4) | Median value |
| percentile75 | decimal(10,4) | 75th-percentile value |
| lastUpdated | timestamp | When this row was last recalculated |
Unique constraint: (metricName, industry, companySizeRange).
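One way a client might use a benchmark row is to place a tenant's own metric value into a quartile band. The sketch below is illustrative only: the Band labels and the bandFor helper are hypothetical, and it assumes the decimal columns have already been parsed into numbers.

```typescript
interface BenchmarkRow {
  metricName: string;
  industry: string;
  companySizeRange: string;
  percentile25: number;
  percentile50: number;
  percentile75: number;
}

// Hypothetical labels for the four bands delimited by the three percentiles.
type Band = "below_p25" | "p25_to_p50" | "p50_to_p75" | "above_p75";

// A value equal to a percentile boundary falls into the band above it.
function bandFor(value: number, row: BenchmarkRow): Band {
  if (value < row.percentile25) return "below_p25";
  if (value < row.percentile50) return "p25_to_p50";
  if (value < row.percentile75) return "p50_to_p75";
  return "above_p75";
}
```

Whether a higher band is good or bad depends on the metric; for employee_turnover_rate, a value above the 75th percentile would usually be a concern.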
tRPC API
All procedures are under the aiAnalytics namespace.
aiAnalytics.getInsights
Returns a paginated list of active insights for the calling tenant. Excludes expired and dismissed insights.
Input filters:
- category: filter by hr, finance, legal, or operations
- insightType: filter by a specific type string
- minConfidence: minimum confidence score threshold (0.0–1.0)
- unviewedOnly: only return insights the current user has not yet seen
- limit / cursor: pagination
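The filtering rules for getInsights can be sketched as a plain in-memory predicate. This is not the router's implementation, which presumably runs as a database query; the visibleInsights function and its parameter names are hypothetical, and pagination is left out.

```typescript
type Category = "hr" | "finance" | "legal" | "operations";

interface Insight {
  id: string;
  orgId: string;
  category: Category;
  insightType: string;
  confidenceScore: number;
  expiresAt: Date;
  viewedBy: string[];
  dismissedAt: Date | null;
}

interface InsightFilters {
  category?: Category;
  insightType?: string;
  minConfidence?: number;
  unviewedOnly?: boolean;
}

// Keeps only insights that belong to the tenant, are not dismissed or expired,
// and pass every filter the caller supplied.
function visibleInsights(
  all: Insight[],
  orgId: string,
  userId: string,
  filters: InsightFilters,
  now: Date,
): Insight[] {
  return all.filter(
    (i) =>
      i.orgId === orgId &&
      i.dismissedAt === null &&
      i.expiresAt.getTime() > now.getTime() &&
      (filters.category === undefined || i.category === filters.category) &&
      (filters.insightType === undefined || i.insightType === filters.insightType) &&
      (filters.minConfidence === undefined || i.confidenceScore >= filters.minConfidence) &&
      (!filters.unviewedOnly || !i.viewedBy.includes(userId)),
  );
}
```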
aiAnalytics.dismissInsight
Soft-deletes an insight for the tenant. Sets dismissedAt and dismissedBy, and writes an audit log entry. The insight will no longer appear in getInsights results.
Input: { insightId: string }
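As a rough sketch of the soft-delete behavior, the function below marks an insight as dismissed and appends an audit entry, using in-memory arrays in place of the real tables. The AuditEntry shape and the "insight.dismissed" action name are assumptions made for illustration, not part of the documented schema.

```typescript
interface DismissableInsight {
  id: string;
  orgId: string;
  dismissedAt: Date | null;
  dismissedBy: string | null;
}

// Hypothetical audit record; the real audit log schema is not described on this page.
interface AuditEntry {
  action: string;
  orgId: string;
  userId: string;
  targetId: string;
  at: Date;
}

// Returns true if the insight was dismissed, false if it was not found for this
// tenant or had already been dismissed.
function dismissInsight(
  insights: DismissableInsight[],
  auditLog: AuditEntry[],
  orgId: string,
  userId: string,
  insightId: string,
  now: Date,
): boolean {
  const insight = insights.find((i) => i.id === insightId && i.orgId === orgId);
  if (!insight || insight.dismissedAt !== null) return false;
  insight.dismissedAt = now; // the row is kept; only these two fields change
  insight.dismissedBy = userId;
  auditLog.push({ action: "insight.dismissed", orgId, userId, targetId: insightId, at: now });
  return true;
}
```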
aiAnalytics.requestPrediction
Enqueues a prediction job. The router resolves the latest active model of the requested type and deduplicates the request using the SHA-256 hash of the input payload — if an identical job already exists for this org + model, the existing job ID is returned instead.
Input: { modelType: AiModelType, inputData: Record<string, unknown> }
Returns: { jobId: string, deduplicated: boolean }
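The resolve-then-deduplicate flow can be sketched with an in-memory map standing in for the prediction_jobs table. Everything here is an assumption for illustration: createPredictionQueue is a hypothetical helper, job IDs are simple counters rather than UUIDs, and JSON.stringify is used as the serializer even though it is sensitive to key order.

```typescript
import { createHash } from "node:crypto";

type AiModelType = "turnover_prediction" | "spend_forecasting" | "contract_risk" | "performance_prediction";

interface AiModel {
  id: string;
  type: AiModelType;
  status: "training" | "active" | "deprecated" | "failed";
}

interface PredictionJob {
  id: string;
  orgId: string;
  modelId: string;
  inputDataHash: string;
  status: "queued";
}

function createPredictionQueue(models: AiModel[]) {
  // Keyed by orgId + modelId + inputDataHash, mirroring the unique constraint.
  const jobsByKey = new Map<string, PredictionJob>();
  let nextId = 1;

  function requestPrediction(
    orgId: string,
    modelType: AiModelType,
    inputData: Record<string, unknown>,
  ): { jobId: string; deduplicated: boolean } {
    const model = models.find((m) => m.type === modelType && m.status === "active");
    if (!model) throw new Error(`No active model of type ${modelType}`);

    // Simplification: a production serializer would need a stable key order.
    const inputDataHash = createHash("sha256").update(JSON.stringify(inputData)).digest("hex");
    const key = `${orgId}:${model.id}:${inputDataHash}`;

    const existing = jobsByKey.get(key);
    if (existing) return { jobId: existing.id, deduplicated: true };

    const job: PredictionJob = { id: `job-${nextId++}`, orgId, modelId: model.id, inputDataHash, status: "queued" };
    jobsByKey.set(key, job);
    return { jobId: job.id, deduplicated: false };
  }

  return { requestPrediction };
}
```

Because the key includes the model ID, a request repeated after a new model version becomes active creates a fresh job rather than returning the old one.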
aiAnalytics.getPredictionJob
Polls the status and results of a prediction job.
Input: { jobId: string }
Returns: Full PredictionJob record including status, results, and errorMessage.
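A client would typically call getPredictionJob repeatedly until the job reaches a terminal status (completed or failed). The helper below is a minimal sketch of that loop; waitForJob and its options are hypothetical, and fetchJob stands in for whatever function wraps the tRPC call in your client.

```typescript
type JobStatus = "queued" | "running" | "completed" | "failed";

interface PredictionJobView {
  id: string;
  status: JobStatus;
  results: unknown;
  errorMessage: string | null;
}

// Polls until the job is completed or failed, or gives up after maxAttempts polls.
async function waitForJob(
  fetchJob: (jobId: string) => Promise<PredictionJobView>,
  jobId: string,
  opts: { intervalMs: number; maxAttempts: number },
): Promise<PredictionJobView> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const job = await fetchJob(jobId);
    if (job.status === "completed" || job.status === "failed") return job;
    // Wait before the next poll so the API is not hammered.
    await new Promise<void>((resolve) => setTimeout(resolve, opts.intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${opts.maxAttempts} polls`);
}
```

A fixed interval keeps the sketch short; a backoff schedule may be a better fit for long-running jobs.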
aiAnalytics.getBenchmarks
Queries industry benchmark data. Results are always anonymized aggregate statistics.
Input: { metrics: string[], industry?: string, companySizeRange?: string }
aiAnalytics.listModels
Admin-only. Returns all entries in the model registry with accuracy metrics and status.
Background jobs
Daily metric calculation
Schedule: 0 3 * * * (03:00 UTC every day)
Runs across all organizations. For each org, computes:
- Turnover rate: flags high-risk employees and emits hr category insights
- Payroll cost spikes: detects anomalous payroll movements and emits finance insights
- Expiring contracts: identifies contracts approaching renewal dates and emits legal insights
Each insight is generated with a model-derived confidence score and a set of recommended actions.
Weekly model retraining
Schedule: 0 2 * * 1 (02:00 UTC every Monday)
Refreshes all four model types (turnover_prediction, spend_forecasting, contract_risk, performance_prediction). The job:
- Marks all current active versions of each model type as deprecated.
- Inserts a new model row with an incremented version and updated training data cutoff.
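The two retraining steps can be sketched against an in-memory registry. Several details are assumptions: the page does not say which part of the semantic version is incremented (the patch number is bumped here), nor the status of the newly inserted row (it is inserted as active here so the type always has a usable model). The retrainModelType helper is hypothetical.

```typescript
import { randomUUID } from "node:crypto";

type ModelType = "turnover_prediction" | "spend_forecasting" | "contract_risk" | "performance_prediction";
type ModelStatus = "training" | "active" | "deprecated" | "failed";

interface ModelRow {
  id: string;
  type: ModelType;
  version: string;
  status: ModelStatus;
  trainingDataCutoff: Date;
}

function parseVersion(v: string): [number, number, number] {
  const [major = 0, minor = 0, patch = 0] = v.split(".").map(Number);
  return [major, minor, patch];
}

function compareVersions(a: string, b: string): number {
  const [a1, a2, a3] = parseVersion(a);
  const [b1, b2, b3] = parseVersion(b);
  return a1 - b1 || a2 - b2 || a3 - b3;
}

function retrainModelType(registry: ModelRow[], type: ModelType, cutoff: Date): ModelRow {
  const sameType = registry.filter((m) => m.type === type);
  // Base the new version on the highest existing one so (type, version) stays unique.
  const versions = sameType.map((m) => m.version).sort(compareVersions);
  const [major, minor, patch] = parseVersion(versions[versions.length - 1] ?? "0.0.0");

  // Step 1: deprecate every currently active version of this type.
  for (const m of sameType) {
    if (m.status === "active") m.status = "deprecated";
  }

  // Step 2: insert the new row (assumed patch bump, assumed initial status).
  const next: ModelRow = {
    id: randomUUID(),
    type,
    version: `${major}.${minor}.${patch + 1}`,
    status: "active",
    trainingDataCutoff: cutoff,
  };
  registry.push(next);
  return next;
}
```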
Monthly benchmark update
Schedule: 0 1 1 * * (01:00 UTC on the 1st of each month)
Upserts benchmark percentile values for 6 metrics × 6 industries × 5 company size ranges = up to 180 rows per run. All values are fully anonymized aggregates.
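The row count follows from iterating over every combination of metric, industry, and size range. The sketch below shows that loop with a Map standing in for the table and its unique key. The upsertBenchmarks function and the compute callback are hypothetical, and returning null from compute is one assumed way a combination could be skipped, which would explain the "up to" in the row count.

```typescript
interface BenchmarkRow {
  metricName: string;
  industry: string;
  companySizeRange: string;
  percentile25: number;
  percentile50: number;
  percentile75: number;
  lastUpdated: Date;
}

type Percentiles = { p25: number; p50: number; p75: number };

// Writes one row per (metric, industry, size range) combination and returns
// how many rows were written in this run.
function upsertBenchmarks(
  table: Map<string, BenchmarkRow>,
  metrics: string[],
  industries: string[],
  sizeRanges: string[],
  compute: (metric: string, industry: string, sizeRange: string) => Percentiles | null,
  now: Date,
): number {
  let written = 0;
  for (const metricName of metrics) {
    for (const industry of industries) {
      for (const companySizeRange of sizeRanges) {
        const p = compute(metricName, industry, companySizeRange);
        if (p === null) continue; // assumption: no aggregate available for this cell
        // Same key as the unique constraint, so a rerun overwrites instead of duplicating.
        table.set(`${metricName}|${industry}|${companySizeRange}`, {
          metricName,
          industry,
          companySizeRange,
          percentile25: p.p25,
          percentile50: p.p50,
          percentile75: p.p75,
          lastUpdated: now,
        });
        written++;
      }
    }
  }
  return written;
}
```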
Prediction job processor
Trigger: Inngest event ai/prediction.requested
Executes queued prediction jobs as they arrive.
- Concurrency: Max 2 simultaneous jobs per org
- Throttle: Max 50 jobs per minute globally
- Permanent failures: Uses NonRetriableError to halt retries on unrecoverable errors (e.g. model not found, malformed input)
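The difference between a transient failure and a permanent one can be illustrated with a toy retry runner. This is not how Inngest schedules retries internally: runWithRetries is a hypothetical stand-in, and the NonRetriableError class is redefined locally so the sketch runs on its own (in an Inngest function it would be imported from the inngest package).

```typescript
// Local stand-in for the class exported by the "inngest" package.
class NonRetriableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "NonRetriableError";
    // Keeps instanceof working even when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Retries the task on ordinary errors, but rethrows immediately on NonRetriableError.
function runWithRetries<T>(task: (attempt: number) => T, maxAttempts: number): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return task(attempt);
    } catch (err) {
      if (err instanceof NonRetriableError) throw err; // unrecoverable: stop now
      lastError = err; // transient: try again
    }
  }
  throw lastError;
}
```

A handler would throw NonRetriableError for cases such as a missing model or malformed input, and a plain Error for anything that might succeed on a later attempt.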
Privacy and security
- Tenant isolation: prediction_jobs and insights are always scoped to orgId. Cross-tenant queries are not possible through the router.
- Benchmark anonymization: benchmark_data contains no org identifiers or PII, only aggregate percentile statistics.
- Input deduplication: Prediction inputs are hashed before storage; raw input payloads are not persisted.
- Insight expiry: All insights carry an expiresAt timestamp. Expired insights are filtered out at the query layer and will never surface to users after their TTL.
- Audit trail: All dismissals and prediction requests produce audit log entries.