Platform Operations

Operator reference for abuse status and the sending gate, the platform-admin roster, content review, org deletion, in-app self-update, dev endpoints, crons, and migrations.

This page documents the operational surface that sits behind the product: the abuse-status lifecycle that gates sending, the platform-admin role and its queries, the content review queue, the organization-deletion walker, system health and in-app self-update, and the dev-mode endpoints, crons, and migrations that keep a deployment maintainable. It is written for operators and contributors working on a self-hosted Owlat instance.

One organization per deployment

Owlat is single-org-per-deployment. The data plane belongs to exactly one organization, and the abuse-status, settings, and stats described here all live on the singleton instanceSettings row. There is no multi-tenant org switcher.

No platform-admin dashboard in this repo

This OSS repository ships one platform-admin-gated page: Settings → System & Updates (/dashboard/settings/system). The richer control-plane UI that consumed the queries below was extracted into a separate private repo. Treat the roster, abuse-status, and content-review functions documented here as a backend/API surface you call from your own tooling or scripts — not as an in-product dashboard.

Abuse status lifecycle and the sending gate

Every deployment carries a single abuse status on instanceSettings.abuseStatus. It drives whether the instance is allowed to send. The four statuses, defined in apps/api/convex/organizations/abuseStatus.ts, form a severity ladder:

| Status | Severity | Sending | Meaning |
|---|---|---|---|
| clean | 0 | Allowed | Normal operation (also the default when no status was ever written) |
| warned | 1 | Allowed | Advisory warning issued; still fully operational |
| suspended | 2 | Blocked | All sending blocked; the account stays accessible |
| banned | 3 | Blocked | Account fully disabled; terminal for internal writers |

The read and write halves are deliberately split into sibling modules (ADR-0011, docs/adr/0011-abuse-status-modules.md):

  • Abuse gate (apps/api/convex/organizations/abuseGate.ts) is the home of the sending predicate. It exports two surfaces: isSendingAllowed(status), the pure predicate called by the transactional dispatch path (transactional/dispatch.ts), and requireSendingAllowed(ctx), a mutation-context helper that fetches instanceSettings and throws on suspended/banned. The campaign send path does not call the helper; its pre-flight query (campaigns/preflight.ts) enforces the same predicate inline by checking abuseStatus against suspended/banned directly. Either way the gate always runs; it is not behind a feature flag.
  • Abuse status (apps/api/convex/organizations/abuseStatus.ts) is the only writer of abuseStatus and its companion fields (abuseStatusReason, abuseStatusChangedAt, abuseStatusChangedBy).
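
The severity ladder and the sending predicate can be sketched as follows. This is a minimal illustration rather than the contents of abuseGate.ts: the SEVERITY map and requireSendingAllowedFor are hypothetical names, and the throwing helper takes the settings row directly instead of a Convex mutation context so that the sketch runs standalone.

```typescript
// Hypothetical mirror of the four statuses and severities from the table above.
type AbuseStatus = "clean" | "warned" | "suspended" | "banned";

const SEVERITY: Record<AbuseStatus, number> = {
  clean: 0,
  warned: 1,
  suspended: 2,
  banned: 3,
};

// Pure predicate: sending is blocked only for suspended and banned.
// An unset status is treated as clean, the documented default.
function isSendingAllowed(abuseStatus: AbuseStatus | undefined): boolean {
  return SEVERITY[abuseStatus ?? "clean"] < SEVERITY.suspended;
}

// Throwing variant modelled on requireSendingAllowed(ctx). Taking the row as
// an argument is an assumption made to avoid a Convex context in this sketch.
function requireSendingAllowedFor(settings: { abuseStatus?: AbuseStatus }): void {
  if (!isSendingAllowed(settings.abuseStatus)) {
    throw new Error(`Sending blocked: instance is ${settings.abuseStatus}`);
  }
}
```

Keeping the predicate pure means both the helper and an inline pre-flight check can share one definition of "blocked".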

Transition rules

There are two write entry points. The internal-writer path enforces severity rules; the admin path bypasses them.

| Entry point | Used by | Rules |
|---|---|---|
| transition (internal mutation) | MTA circuit breaker, reputation auto-enforcement | banned is terminal; downgrades are refused except down to clean (the auto-recover path); a same-state attempt is recorded but not patched |
| adminOverride (internal mutation) | The platform-admin setOrganizationStatus mutation | Bypasses all severity rules: an admin can demote a banned org back to clean to resolve an appeal |

Both paths write an abuse_status_changed audit-log row on every call, including same-state no-ops (so "circuit breaker tripped again while already warned" is observable). A missing instanceSettings row returns { ok: false, reason: 'no_settings_row' } rather than throwing — an early-deployment edge case.
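
The two write paths can be modelled as pure decision functions. The sketch below is an assumption-laden illustration, not the code in abuseStatus.ts: the result shape, the reason strings other than 'no_settings_row', and the order in which the same-state and terminal checks run are all hypothetical. Audit-log writes are omitted.

```typescript
type AbuseStatus = "clean" | "warned" | "suspended" | "banned";

const SEVERITY: Record<AbuseStatus, number> = { clean: 0, warned: 1, suspended: 2, banned: 3 };

// Illustrative result shape; only 'no_settings_row' is named in the text above.
type TransitionResult =
  | { ok: true; patched: boolean }
  | { ok: false; reason: "no_settings_row" | "banned_is_terminal" | "downgrade_refused" };

interface SettingsRow {
  abuseStatus?: AbuseStatus;
}

// Internal-writer path: severity rules apply.
function decideTransition(row: SettingsRow | null, target: AbuseStatus): TransitionResult {
  if (row === null) return { ok: false, reason: "no_settings_row" };
  const current = row.abuseStatus ?? "clean";
  // Same-state attempts are recorded in the audit log but never patched.
  if (current === target) return { ok: true, patched: false };
  if (current === "banned") return { ok: false, reason: "banned_is_terminal" };
  // Downgrades are refused unless they go all the way down to clean.
  const isDowngrade = SEVERITY[target] < SEVERITY[current];
  if (isDowngrade && target !== "clean") return { ok: false, reason: "downgrade_refused" };
  return { ok: true, patched: true };
}

// Admin path: bypasses every severity rule, including the banned terminal state.
// Skipping the patch on a same-state override is an assumption.
function decideAdminOverride(row: SettingsRow | null, target: AbuseStatus): TransitionResult {
  if (row === null) return { ok: false, reason: "no_settings_row" };
  return { ok: true, patched: (row.abuseStatus ?? "clean") !== target };
}
```

Under these rules, decideTransition refuses suspended to warned but accepts suspended to clean, while decideAdminOverride accepts banned to clean.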

What flips the status automatically

| Trigger | Source | Target status |
|---|---|---|
| Reputation reaches critical risk | analytics/sendingReputation.ts, evaluateAutoEnforce (hourly cron) → autoEnforceReputation | suspended |
| Reputation reaches high risk | analytics/sendingReputation.ts, evaluateAutoEnforce (hourly cron) → autoEnforceReputation | warned |
| MTA circuit breaker trips | webhooks/dispatcher.ts (internal.circuit_breaker_tripped) | warned |

Reputation is derived on read (a rolling 30-day window; see Deliverability Infrastructure), but auto-enforcement is deliberately kept off the read hot path. An hourly cron, evaluate reputation auto-enforce (apps/api/convex/crons.ts), calls evaluateAutoEnforce, which summarizes the org window and schedules autoEnforceReputation when risk is high or critical. A separate hourly cron, cleanup sending reputation (recalculateAll), ages out buckets older than 60 days. Because severity downgrades are refused, a later high-risk evaluation (target warned) never silently relaxes an existing suspension.
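
The mapping from risk level to target status, combined with the no-downgrade rule, can be sketched like this. It is a simplified model: the function names are hypothetical, and the "low" and "medium" risk labels are assumptions (the text only names high and critical).

```typescript
type AbuseStatus = "clean" | "warned" | "suspended" | "banned";

// Only "high" and "critical" are named in the text; the other labels are assumed.
type RiskLevel = "low" | "medium" | "high" | "critical";

const SEVERITY: Record<AbuseStatus, number> = { clean: 0, warned: 1, suspended: 2, banned: 3 };

// Which status auto-enforcement asks for, or null when no action is scheduled.
function autoEnforceTarget(risk: RiskLevel): AbuseStatus | null {
  if (risk === "critical") return "suspended";
  if (risk === "high") return "warned";
  return null;
}

// Effective status after one hourly evaluation: the target only applies when
// it is strictly more severe than the current status.
function statusAfterEvaluation(current: AbuseStatus, risk: RiskLevel): AbuseStatus {
  const target = autoEnforceTarget(risk);
  if (target === null) return current;
  return SEVERITY[target] > SEVERITY[current] ? target : current;
}
```

For example, evaluating high risk on an already suspended instance leaves it suspended rather than relaxing it to warned.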

Platform admin role, roster, and operational queries

Platform admins are rows in the platformAdmins table, keyed by BetterAuth user id. Two roles exist: admin and superadmin. The first admin is created with seedPlatformAdmin (an internal mutation that only succeeds while the table is empty and always seeds a superadmin); after that, superadmins manage the roster.

Gating is centralized in apps/api/convex/platformAdmin/platformAdmin.ts:

| Function | Kind | Purpose |
|---|---|---|
| requirePlatformAdmin(ctx) | Helper | Throws FORBIDDEN unless the caller is in platformAdmins; returns { authUserId, email, role } |
| isPlatformAdmin | Public query | Boolean nav helper, safe for anonymous callers |
| isPlatformAdminByUserId | Internal query | Admin check by user id, for HTTP/action contexts that lack a QueryCtx |
| seedPlatformAdmin | Internal mutation | One-shot bootstrap of the first superadmin |

The web app gates the single admin page with the platform-admin route middleware (apps/web/app/middleware/platform-admin.ts), which calls isPlatformAdmin and redirects non-admins to /dashboard.

Operational queries

All of these live in apps/api/convex/platformAdmin/queries.ts and start with requirePlatformAdmin. They read the singleton instanceSettings row plus derived reputation/stats:

| Query | Returns |
|---|---|
| getPlatformStats | Contact count, abuse status, 30-day send/delivery/bounce/complaint aggregates, signups-by-day |
| getDeliveryStats | In-flight sending campaigns, scheduled count, aggregate delivery metrics, last-7-day blocked-email counts by reason |
| getOrganizationDetail | Settings, reputation summary, blocked-email counts (bounced/complained/manual), recent content scans and campaigns |
| listFlaggedOrganizations | The instance entry, but only when warned/suspended/banned or reputation risk is high/critical |
| listAllOrganizations | The instance entry with optional search/statusFilter |
| listAllUsers | User profiles with optional search |
| listAllDomains | Domains with optional statusFilter/search |
| listRecentAbuse | Flagged content scans (suspicious/blocked) plus pending_review campaigns |
| getAdminAuditLog | Up to 200 platform_admin-resource audit-log rows, with admin-email resolution |
| listPlatformAdmins | The full admin roster |

Billing is out of OSS

getBillingOverview exists for backward compatibility and returns empty data — invoice-based billing was removed during the billing simplification and now lives only in the separate private repo. Do not build against it.

Roster + status mutations

These live in apps/api/convex/platformAdmin/mutations.ts:

| Mutation | Effect |
|---|---|
| setOrganizationStatus | Sets abuse status via adminOverride (bypassing severity rules); writes both abuse_status_changed and the legacy platform_admin.org_status_changed audit rows |
| addPlatformAdmin | Adds an admin (superadmin-only; rejects duplicates) |
| removePlatformAdmin | Removes an admin (superadmin-only; cannot remove yourself) |
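
The roster rules (one-shot seed, superadmin-only changes, no duplicates, no self-removal) can be illustrated with an in-memory model. PlatformAdminRoster and its method names are hypothetical, as are the error messages other than FORBIDDEN; the real functions operate on the platformAdmins table through Convex.

```typescript
type Role = "admin" | "superadmin";

interface PlatformAdminRow {
  authUserId: string;
  role: Role;
}

class PlatformAdminRoster {
  private rows: PlatformAdminRow[] = [];

  // Bootstrap: succeeds only while the table is empty and always seeds a superadmin.
  seed(authUserId: string): void {
    if (this.rows.length > 0) throw new Error("platformAdmins is not empty");
    this.rows.push({ authUserId, role: "superadmin" });
  }

  // Mirrors requirePlatformAdmin: FORBIDDEN for callers not on the roster.
  requireAdmin(callerId: string): PlatformAdminRow {
    const row = this.rows.find((r) => r.authUserId === callerId);
    if (row === undefined) throw new Error("FORBIDDEN");
    return row;
  }

  add(callerId: string, authUserId: string, role: Role): void {
    this.requireSuperadmin(callerId);
    if (this.rows.some((r) => r.authUserId === authUserId)) {
      throw new Error("already a platform admin");
    }
    this.rows.push({ authUserId, role });
  }

  remove(callerId: string, authUserId: string): void {
    this.requireSuperadmin(callerId);
    if (callerId === authUserId) throw new Error("cannot remove yourself");
    this.rows = this.rows.filter((r) => r.authUserId !== authUserId);
  }

  private requireSuperadmin(callerId: string): void {
    if (this.requireAdmin(callerId).role !== "superadmin") throw new Error("FORBIDDEN");
  }
}
```

Because a superadmin cannot remove their own row, the roster cannot be emptied through removePlatformAdmin alone in this model.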

Content review queue (approve/reject pending content)

Campaigns and transactional emails can land in a pending_review status (for example when content scanning flags them — see Email Security). The review surface is platform-admin only.

getContentReviewQueue (in platformAdmin/queries.ts) returns the pending items joined with their latest content-scan result, plus the recently-reviewed log:

{
  pending: [{ type: 'campaign' | 'transactional', id, name, subject, scan: { score, level } | null, ... }],
  pendingCount: number,
  recentlyReviewed: [{ action, details, userId, createdAt }],
}

The three review mutations (in platformAdmin/mutations.ts) all require pending_review as the current status and write a platform_admin.content_approved or platform_admin.content_rejected audit row:

| Mutation | Effect on the resource |
|---|---|
| approveCampaign | Campaign → draft (the owner can then send it) |
| approveTransactional | Transactional email → published (sets publishedAt) |
| rejectContent | Campaign → draft, or transactional email → draft; reason is recorded |

Both approval paths and the rejection path move the resource out of the queue; there is no separate "blocked" terminal state for reviewed content — rejection reverts to draft with the reason captured in the audit log.
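
The review outcomes reduce to a small state table. The function below is a hypothetical summary of the three mutations, not their implementation; it only encodes the status and audit-action pairs described above.

```typescript
type ReviewableType = "campaign" | "transactional";
type ReviewDecision = "approve" | "reject";

interface ReviewOutcome {
  nextStatus: "draft" | "published";
  auditAction: "platform_admin.content_approved" | "platform_admin.content_rejected";
}

// Hypothetical helper: computes where a reviewed resource ends up.
function reviewOutcome(
  type: ReviewableType,
  currentStatus: string,
  decision: ReviewDecision,
): ReviewOutcome {
  // All three mutations require the resource to still be in the queue.
  if (currentStatus !== "pending_review") {
    throw new Error(`Expected pending_review, got ${currentStatus}`);
  }
  if (decision === "reject") {
    // Rejection always reverts to draft; the reason lives in the audit log.
    return { nextStatus: "draft", auditAction: "platform_admin.content_rejected" };
  }
  return {
    // Approved campaigns return to draft so the owner can send them;
    // approved transactional emails go straight to published.
    nextStatus: type === "campaign" ? "draft" : "published",
    auditAction: "platform_admin.content_approved",
  };
}
```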

Organization deletion walker

Deleting the organization is a hard, irreversible wipe of the entire data plane. It is owner-only: organizationSettings.remove (in apps/api/convex/organizations/settings.ts) checks session.role === 'owner' and schedules internal.organizations.deletion.walker.start. The walker lives in apps/api/convex/organizations/deletion/walker.ts and implements ADR-0025 (docs/adr/0025-organization-deletion-module-family.md).

Irreversible

remove deletes every per-organization table, including storage blobs, audit logs, and finally the singleton instanceSettings row that owns the org's existence. There is no soft-delete and no recovery short of a database restore.

How it works:

  • Ordered cascade. STEPS is an ordered list of the full per-organization table set, ~90 tables. Children are deleted before parents; storage-bearing tables (e.g. mediaAssets, semanticFiles, mailMessages) purge their blobs before the row delete; auditLogs is second-to-last (it keeps accumulating from delegated lifecycle calls during the wipe); instanceSettings is the terminal step.
  • One module per table. Each entry has a sibling step module under deletion/steps/<table>.ts implementing the deleteBatch(ctx) contract from steps/_common.ts (default batch size 100). A few tables delegate — for example the contacts step sweeps five child tables (contactTopics, contactPropertyValues, contactActivities, contactIdentities, contactRelationships) that are not standalone steps.
  • Self-scheduled hop. runStep runs one batch, re-fires itself while hasMore is true, advances to nextTable when a step drains, and terminates when there is no next step. The table argument is validated against the literal union in _common.ts, so a typo is a compile-time and boot-time error rather than a silent no-op.

Any new per-organization table added to the schema must also add a literal to OrganizationDeletionTable and a sibling step module, or the deletion cascade will leave it orphaned.
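
The batch-and-advance behaviour of the walker can be sketched with in-memory tables. This is a simplified model under stated assumptions: the real runStep re-schedules itself through Convex instead of looping, blob purging is omitted, and the Tables type and runWalker function are hypothetical names.

```typescript
const BATCH_SIZE = 100; // default batch size named in the text

type Tables = Record<string, unknown[]>;

// One batch: delete up to batchSize rows and report whether more remain.
function deleteBatch(tables: Tables, table: string, batchSize = BATCH_SIZE): { hasMore: boolean } {
  const rows = tables[table] ?? [];
  rows.splice(0, batchSize);
  return { hasMore: rows.length > 0 };
}

// Walks the ordered step list. Each loop iteration stands in for one
// self-scheduled hop; the returned array records which table each hop touched.
function runWalker(tables: Tables, steps: readonly string[]): string[] {
  const hops: string[] = [];
  let index = 0;
  while (index < steps.length) {
    const table = steps[index];
    if (table === undefined) break;
    hops.push(table);
    const { hasMore } = deleteBatch(tables, table);
    // Stay on the same step while it still has rows; otherwise advance.
    if (!hasMore) index += 1;
  }
  return hops;
}
```

With 250 rows in a first table and one row in each of two later tables, this model takes three hops on the first table and one on each of the others.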

System health and in-app self-update (updater sidecar)

The System & Updates page (/dashboard/settings/system, platform-admin only) surfaces the current version, container health, the GitHub release check, the update flow, and update history.

Update check

apps/api/convex/systemUpdates.ts polls the GitHub Releases API (https://api.github.com/repos/wolvesdotink/owlat/releases/latest), caches the result in a singleton systemUpdates row (kind: 'latestCheck') with a 1-hour TTL, and computes updateAvailable by comparing the cached latest version to the running OWLAT_VERSION.

| Function | Kind | Notes |
|---|---|---|
| checkForUpdates | Action | Admin-gated; honours the 1-hour cache unless force: true; returns cache on GitHub 403/429 rate-limit |
| getLatestRelease | Query | Reads the cached latest-check row |
| listUpdateHistory | Query | Lists kind: 'updateRun' rows (most recent first; capped at 200) |

A local build reporting OWLAT_VERSION=dev (or a non-semver value) is always treated as "no update available", because Owlat cannot tell whether a dev build is ahead of or behind the release tag.
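
The comparison and cache rules can be sketched as two pure functions. This is an illustration under assumptions, not the logic in systemUpdates.ts: the function names are hypothetical, an optional leading "v" is tolerated, and anything that is not a plain MAJOR.MINOR.PATCH string (including pre-release tags) is treated as non-semver.

```typescript
type Version = readonly [number, number, number];

const CACHE_TTL_MS = 60 * 60 * 1000; // 1-hour TTL from the text

// Returns null for "dev" and any other value that is not MAJOR.MINOR.PATCH.
function parseVersion(raw: string): Version | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(raw.trim());
  if (match === null) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Negative, zero, or positive, like a sort comparator.
function compareVersions(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// A non-semver running or latest version always yields "no update available".
function isUpdateAvailable(running: string, latest: string): boolean {
  const runningV = parseVersion(running);
  const latestV = parseVersion(latest);
  if (runningV === null || latestV === null) return false;
  return compareVersions(latestV, runningV) > 0;
}

// Refetch when forced, when nothing is cached, or when the cache has expired.
function shouldRefetch(checkedAt: number | null, now: number, force = false): boolean {
  if (force || checkedAt === null) return true;
  return now - checkedAt >= CACHE_TTL_MS;
}
```

Comparing numeric components rather than raw strings avoids ordering 1.10.0 before 1.9.9.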

Applying an update

Clicking Update now posts to the Nitro route apps/web/server/api/system/update.post.ts, which:

Verify the caller

The route re-checks platform-admin status via the session cookie, and requires INSTANCE_SECRET to be configured.

Download and verify the pinned compose file

It fetches docker-compose-<version>.yml and its .sha256 manifest from the GitHub Release, verifies the SHA-256 hash, and confirms the body references ghcr.io/wolvesdotink/web:<version>. A hash mismatch aborts the update.

Record the attempt

It records an updateRun row via recordUpdateStart, capturing versionFrom/versionTo.

Dispatch to the updater sidecar

It POSTs the verified compose template to http://updater:3200/update with the X-Instance-Secret header.

Record the outcome

It records recordUpdateFinish with success/failed and the step output. Both record mutations also emit a structured JSON log line for external log sinks.
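
The verification step (hash check, then image-reference check) might look like the sketch below. Several details are assumptions: the manifest is assumed to start with the hex digest (sha256sum style), the hash function is injected by the caller so the example has no runtime dependency, and verifyComposeTemplate is a hypothetical name.

```typescript
// Caller-supplied hash function returning a hex digest of the input.
type Sha256Hex = (input: string) => string;

function verifyComposeTemplate(
  body: string,
  manifest: string,
  version: string,
  sha256Hex: Sha256Hex,
): void {
  // Assumed manifest layout: "<hex digest>  <filename>".
  const expected = (manifest.trim().split(/\s+/)[0] ?? "").toLowerCase();
  if (expected === "") throw new Error("Empty .sha256 manifest");

  // A hash mismatch aborts the update before anything is dispatched.
  if (sha256Hex(body).toLowerCase() !== expected) {
    throw new Error("SHA-256 mismatch: aborting update");
  }

  // The verified body must reference the web image for the target version.
  const image = `ghcr.io/wolvesdotink/web:${version}`;
  if (!body.includes(image)) {
    throw new Error(`Compose file does not reference ${image}`);
  }
}
```

In a Node runtime the injected function could wrap createHash('sha256') from node:crypto. Note that a plain substring check would also accept a longer tag that merely starts with the version, so a stricter match may be preferable.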

The updater sidecar

The updater (apps/updater/src/index.ts, image ghcr.io/wolvesdotink/updater) is the only component that touches Docker. It listens on port 3200 and is exposed only on an internal Docker network with no host port mapping, so only the web container can reach it, at http://updater:3200. It authenticates every request with a timing-safe X-Instance-Secret comparison and reaches Docker through a least-privilege socket proxy: it can pull, recreate, and list containers, but it cannot exec, build, or touch volumes.

| Endpoint | Method | Purpose |
|---|---|---|
| /update | POST | Validate the compose template against an image allowlist + dangerous-mount/privileged-mode rules, write it, docker compose pull, run a one-shot convex-deploy (function deploy before restart; a bad schema aborts here and the old containers keep serving), then docker compose up -d |
| /health | GET | Per-service container state, image tag, and health (auth-required to prevent enumeration) |
| /configure-ip | POST | Attach/detach a floating IPv4 to eth0 and the campaign IP pool, then restart the MTA |
| /rotate-env | POST | Rewrite rotated secrets in .env and force-recreate containers (requires all secret fields; tightest rate limit) |

The System & Updates page reads /health (proxied by apps/web/server/api/internal/updater-health.get.ts) to render the container-health table. The convex-deploy-before-restart ordering is the safety property: an incompatible schema fails the deploy step and the running Web/MTA containers are never restarted against a half-deployed backend.
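
The deploy-before-restart ordering can be modelled as a short pipeline that stops at the first failing step. The step names and the runUpdate function are hypothetical labels for the /update stages described in the table; real steps shell out to Docker and are asynchronous, whereas this sketch uses synchronous stand-ins.

```typescript
type UpdateStep = "validate" | "write" | "pull" | "convex-deploy" | "up";

// Order matters: the function deploy runs before any container is recreated.
const UPDATE_ORDER: readonly UpdateStep[] = ["validate", "write", "pull", "convex-deploy", "up"];

// Each runner returns true on success, false on failure.
type StepRunner = () => boolean;

interface UpdateResult {
  ok: boolean;
  ran: UpdateStep[];
  failedAt: UpdateStep | null;
}

function runUpdate(runners: Record<UpdateStep, StepRunner>): UpdateResult {
  const ran: UpdateStep[] = [];
  for (const step of UPDATE_ORDER) {
    ran.push(step);
    // Abort on the first failure so later steps never run.
    if (!runners[step]()) return { ok: false, ran, failedAt: step };
  }
  return { ok: true, ran, failedAt: null };
}
```

If the convex-deploy runner fails, the up step is never reached, which is the property that keeps the old containers serving.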

Operator-provided distribution

The in-app updater applies images and compose templates from the GitHub Release pipeline; it is the supported self-update path. The desktop app's auto-update key is an empty placeholder, so desktop auto-update is not production-ready and desktop distribution is operator-provided — see Desktop App.

For the operator-facing update playbook (rollback, recovery, automation), see Maintenance & Updates.

Dev-mode endpoints, crons, and migrations

Dev-mode endpoints

A small set of destructive shortcuts exist for local development. They are fail-closed: the guard in apps/api/convex/devShortcuts/_guard.ts treats the deployment as production unless the operator explicitly sets OWLAT_DEV_MODE=true in the Convex backend runtime env (npx convex env set OWLAT_DEV_MODE true). The CLI-side CONVEX_DEPLOYMENT env var is not propagated into the function runtime, so it cannot be used as a security boundary.

| Endpoint | Guard | Purpose |
|---|---|---|
| POST /dev/reset | OWLAT_DEV_MODE + X-Instance-Secret | Wipe the instance back to a blank slate (tenant tables, BetterAuth tables, and local auth tables) so the signup flow can be re-exercised without docker compose down -v; idempotent |
| POST /seed/demo | OWLAT_DEV_MODE + X-Instance-Secret | Seed realistic demo content (idempotent) |
| forceVerifyDomain | OWLAT_DEV_MODE + organization:manage | Force a domain to verified with synthesised DNS results, bypassing live DNS lookups |

Never enable OWLAT_DEV_MODE in production

/dev/reset deletes everything tenant-side — not just seed-tagged rows. Leave OWLAT_DEV_MODE unset on any deployment that holds real data.
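
A fail-closed guard of this kind can be sketched as follows. This is not the code in _guard.ts: the function names are hypothetical, accepting only the exact string "true" is an assumption, and reading the expected secret from an INSTANCE_SECRET variable in the same environment is also an assumption.

```typescript
type RuntimeEnv = Record<string, string | undefined>;

// Fail closed: anything other than an explicit "true" counts as production.
function isDevMode(env: RuntimeEnv): boolean {
  return env["OWLAT_DEV_MODE"] === "true";
}

// Guard for the HTTP shortcuts: dev mode AND a matching instance secret.
function assertDevEndpointAllowed(env: RuntimeEnv, providedSecret: string | undefined): void {
  if (!isDevMode(env)) {
    throw new Error("Dev shortcuts are disabled on this deployment");
  }
  const expected = env["INSTANCE_SECRET"];
  // An unset or empty secret also fails closed. A production-grade check
  // would use a timing-safe comparison instead of !==.
  if (expected === undefined || expected === "" || providedSecret !== expected) {
    throw new Error("Invalid instance secret");
  }
}
```

An unset variable, an empty value, and a typo such as "ture" all leave the shortcuts disabled in this sketch.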

Crons

Scheduled jobs are registered in apps/api/convex/crons.ts. Operationally relevant ones include:

| Cron | Interval | Job |
|---|---|---|
| process scheduled campaigns | 1 min | Backup dispatch for campaigns whose scheduledAt has passed |
| reconcile sending campaigns | 1 min | Advance sending → sent when no queued sends remain |
| process account deletions | 24 h | Process accounts past their 30-day grace period |
| cleanup webhook logs | weekly | Remove delivery logs older than 30 days |
| cleanup sending reputation | 1 h | Age out reputation buckets older than 60 days (both scopes) |
| cleanup soft-deleted contacts | 24 h | Permanently delete contacts past their 30-day retention |
| knowledge graph maintenance | 24 h | Confidence decay + expiry cleanup |
| channel health checks | 5 min | Probe channel connectivity |
| agent metrics rollup | 5 min | Compute queue depth/latency/error rates and evaluate circuit breakers |
| report analytics | 15 min | Report instance analytics to the control plane |

Counter-reconciliation crons (reconcile contact counts, reconcile topic member counts, refresh segment counts, reconcile transactional send counts) keep denormalized counters honest against any drift from partial failures.

Migrations

Schema/data migrations are one-shot internal mutations under apps/api/convex/migrations/, each exporting a run mutation. They follow the pre-prod "atomic breaking change" pattern: each migration is idempotent (re-running is a no-op once every row is rewritten) and, because deployments are single-org and resettable, most are written to run synchronously against .collect().

You invoke one by name through the Convex CLI, for example:

npx convex run migrations/0033_campaign_audience:run

Prefer reset over backfill in pre-prod

Owlat's convention is clean breaking changes plus a data reset rather than two-phase backward-compat migration ceremony. Several migrations note that on a freshly-seeded deployment they see no legacy rows and are a no-op record of intent — if a deployment ever does carry legacy rows that a migration cannot map, the migration throws loudly instead of silently dropping data.
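
The idempotent, throw-on-unmappable pattern can be illustrated with an in-memory example. Everything here is hypothetical: the row shape, the field names, and the mapping are invented for the sketch and do not describe any real Owlat migration, which would read and patch rows through Convex.

```typescript
// Hypothetical row shape, invented for this sketch.
interface CampaignRow {
  id: string;
  legacyListKind?: string; // old field, removed by the migration
  audienceType?: "topic" | "segment"; // new field
}

// Hypothetical mapping from legacy values to the new field.
const KIND_MAP = new Map<string, "topic" | "segment">([
  ["list", "topic"],
  ["filter", "segment"],
]);

// Rewrites one row in place; returns false when it was already migrated.
function migrateRow(row: CampaignRow): boolean {
  if (row.legacyListKind === undefined) return false;
  const mapped = KIND_MAP.get(row.legacyListKind);
  if (mapped === undefined) {
    // Fail loudly instead of silently dropping data.
    throw new Error(`Cannot map legacy value "${row.legacyListKind}" on row ${row.id}`);
  }
  row.audienceType = mapped;
  delete row.legacyListKind;
  return true;
}

// Synchronous pass over the full collection, as in a .collect()-style migration.
function run(rows: CampaignRow[]): { rewritten: number } {
  let rewritten = 0;
  for (const row of rows) {
    if (migrateRow(row)) rewritten += 1;
  }
  return { rewritten };
}
```

Running the migration a second time over the same rows rewrites nothing, and on a freshly seeded deployment with no legacy rows the first run is already a no-op.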