Platform Operations | Owlat Docs

Operator reference for abuse status and the sending gate, the platform-admin roster, content review, org deletion, in-app self-update, dev endpoints, crons, and migrations.

This page documents the operational surface that sits behind the product: the abuse-status lifecycle that gates sending, the platform-admin role and its queries, the content review queue, the organization-deletion walker, system health and in-app self-update, and the dev-mode endpoints, crons, and migrations that keep a deployment maintainable. It is written for operators and contributors working on a self-hosted Owlat instance.

One organization per deployment

Owlat is single-org-per-deployment. The data plane belongs to exactly one organization, and the abuse status, settings, and stats described here all live on the singleton instanceSettings row. There is no multi-tenant org switcher.

No platform-admin dashboard in this repo

This OSS repository ships one platform-admin-gated page: Settings → System & Updates (/dashboard/settings/system). The richer control-plane UI that consumed the queries below was extracted into a separate private repo. Treat the roster, abuse-status, and content-review functions documented here as a backend/API surface you call from your own tooling or scripts — not as an in-product dashboard.

Abuse status lifecycle and the sending gate

Every deployment carries a single abuse status on instanceSettings.abuseStatus. It drives whether the instance is allowed to send. The four statuses, defined in apps/api/convex/organizations/abuseStatus.ts, form a severity ladder:

Status	Severity	Sending	Meaning
`clean`	0	Allowed	Normal operation (also the default when no status was ever written)
`warned`	1	Allowed	Advisory warning issued — still fully operational
`suspended`	2	Blocked	All sending blocked; the account stays accessible
`banned`	3	Blocked	Account fully disabled; terminal for internal writers

The read and write halves are deliberately split into sibling modules (ADR-0011, docs/adr/0011-abuse-status-modules.md):

Abuse gate (apps/api/convex/organizations/abuseGate.ts) is the home of the sending predicate. It exports two surfaces: isSendingAllowed(status), the pure predicate that the live send paths call (the transactional dispatch path in transactional/dispatch.ts), and requireSendingAllowed(ctx), a mutation-context helper that fetches instanceSettings and throws on suspended/banned. The campaign send path enforces the same predicate inline through its pre-flight query (campaigns/preflight.ts, which checks abuseStatus against suspended/banned directly) rather than calling the helper. Either way the gate always runs — it is not behind a feature flag.
Abuse status (apps/api/convex/organizations/abuseStatus.ts) is the only writer of abuseStatus and its companion fields (abuseStatusReason, abuseStatusChangedAt, abuseStatusChangedBy).
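
The ladder and the sending predicate can be sketched as follows. This is a minimal illustration built from the table above, not the contents of abuseGate.ts: isSendingAllowed is the name given in the source, while the SEVERITY lookup and the handling of an undefined status are assumptions made for the example.

```typescript
// Sketch only: the real predicate lives in organizations/abuseGate.ts and may differ.
type AbuseStatus = 'clean' | 'warned' | 'suspended' | 'banned';

// Hypothetical lookup mirroring the Severity column of the status table.
const SEVERITY: Record<AbuseStatus, number> = {
  clean: 0,
  warned: 1,
  suspended: 2,
  banned: 3,
};

// Pure predicate: sending is blocked from `suspended` upward.
// A status that was never written is treated as `clean`, the documented default.
function isSendingAllowed(status: AbuseStatus | undefined): boolean {
  return SEVERITY[status ?? 'clean'] < SEVERITY.suspended;
}
```

Keeping the predicate pure means a send path that already holds the settings row can call it without another database read.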

Transition rules

There are two write entry points. The internal-writer path enforces severity rules; the admin path bypasses them.

Entry point	Used by	Rules
`transition` (internal mutation)	MTA circuit breaker, reputation auto-enforcement	`banned` is terminal; downgrades are refused except down to `clean` (the auto-recover path); a same-state attempt is recorded but not patched
`adminOverride` (internal mutation)	The platform-admin `setOrganizationStatus` mutation	Bypasses all severity rules — an admin can demote a `banned` org back to `clean` to resolve an appeal

Both paths write an abuse_status_changed audit-log row on every call, including same-state no-ops (so "circuit breaker tripped again while already warned" is observable). A missing instanceSettings row returns { ok: false, reason: 'no_settings_row' } rather than throwing — an early-deployment edge case.
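
The internal-writer rules in the table can be read as a small decision function. The sketch below is an assumption-laden illustration: decideInternalTransition, the Decision labels, and the order in which the checks run are hypothetical, and the real transition mutation also writes the audit row, which is omitted here.

```typescript
// Sketch only: names and check ordering are assumptions, not the code in abuseStatus.ts.
type AbuseStatus = 'clean' | 'warned' | 'suspended' | 'banned';
type Decision = 'apply' | 'noop_same_state' | 'refused_terminal' | 'refused_downgrade';

const RANK: Record<AbuseStatus, number> = { clean: 0, warned: 1, suspended: 2, banned: 3 };

function decideInternalTransition(current: AbuseStatus, target: AbuseStatus): Decision {
  // A same-state attempt is recorded in the audit log but the row is not patched.
  if (current === target) return 'noop_same_state';
  // `banned` is terminal for internal writers.
  if (current === 'banned') return 'refused_terminal';
  // Downgrades are refused, except the auto-recover path down to `clean`.
  if (RANK[target] < RANK[current] && target !== 'clean') return 'refused_downgrade';
  return 'apply';
}
```

The admin path skips the terminal and downgrade checks, which is what lets an admin move a banned deployment back to clean.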

What flips the status automatically

Trigger	Source	Target status
Reputation reaches `critical` risk	`analytics/sendingReputation.ts` → `evaluateAutoEnforce` (hourly cron) → `autoEnforceReputation`	`suspended`
Reputation reaches `high` risk	`analytics/sendingReputation.ts` → `evaluateAutoEnforce` (hourly cron) → `autoEnforceReputation`	`warned`
MTA circuit breaker trips	`webhooks/dispatcher.ts` (`internal.circuit_breaker_tripped`)	`warned`

Reputation is derived on read (a rolling 30-day window; see Deliverability Infrastructure), but auto-enforcement is deliberately kept off the read hot path: the hourly "evaluate reputation auto-enforce" cron (apps/api/convex/crons.ts) calls evaluateAutoEnforce, which summarizes the org window and schedules autoEnforceReputation when risk is high or critical. A separate hourly "cleanup sending reputation" cron (recalculateAll) ages out buckets older than 60 days. Because the internal-writer path refuses downgrades other than to clean, a drop in risk from critical to high (target warned) never silently relaxes an existing suspension.
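
The reputation rows of the trigger table reduce to a mapping from risk level to target status. In this sketch only high and critical come from the source; the low and medium levels and the name targetStatusForRisk are assumptions for illustration.

```typescript
// Sketch only: `low` and `medium` are assumed risk levels; the source names only
// `high` and `critical`.
type RiskLevel = 'low' | 'medium' | 'high' | 'critical';

// Returns the status auto-enforcement would request, or null when nothing is scheduled.
function targetStatusForRisk(risk: RiskLevel): 'warned' | 'suspended' | null {
  switch (risk) {
    case 'critical':
      return 'suspended';
    case 'high':
      return 'warned';
    default:
      return null;
  }
}
```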

Platform admin role, roster, and operational queries

Platform admins are rows in the platformAdmins table, keyed by BetterAuth user id. Two roles exist: admin and superadmin. The first admin is created with seedPlatformAdmin (an internal mutation that only succeeds while the table is empty and always seeds a superadmin); after that, superadmins manage the roster.

Gating is centralized in apps/api/convex/platformAdmin/platformAdmin.ts:

Function	Kind	Purpose
`requirePlatformAdmin(ctx)`	Helper	Throws `FORBIDDEN` unless the caller is in `platformAdmins`; returns `{ authUserId, email, role }`
`isPlatformAdmin`	Public query	Boolean nav helper, safe for anonymous callers
`isPlatformAdminByUserId`	Internal query	Admin check by user id, for HTTP/action contexts that lack a `QueryCtx`
`seedPlatformAdmin`	Internal mutation	One-shot bootstrap of the first `superadmin`
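
To make the bootstrap and gating behaviour concrete, here is a minimal in-memory sketch. It borrows the function names from the table but is not the Convex implementation: the roster is a plain array rather than the platformAdmins table, and the error messages other than FORBIDDEN are assumptions.

```typescript
// Sketch only: an in-memory stand-in for the platformAdmins table.
type Role = 'admin' | 'superadmin';
interface PlatformAdminRow {
  authUserId: string;
  email: string;
  role: Role;
}

// One-shot bootstrap: succeeds only while the roster is empty and always seeds a superadmin.
function seedPlatformAdmin(roster: PlatformAdminRow[], authUserId: string, email: string): PlatformAdminRow {
  if (roster.length > 0) throw new Error('platformAdmins is not empty');
  const row: PlatformAdminRow = { authUserId, email, role: 'superadmin' };
  roster.push(row);
  return row;
}

// Gate: throws FORBIDDEN unless the caller is on the roster.
function requirePlatformAdmin(roster: PlatformAdminRow[], authUserId: string | null): PlatformAdminRow {
  const row = authUserId === null ? undefined : roster.find((r) => r.authUserId === authUserId);
  if (!row) throw new Error('FORBIDDEN');
  return row;
}
```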

The web app gates the single admin page with the platform-admin route middleware (apps/web/app/middleware/platform-admin.ts), which calls isPlatformAdmin and redirects non-admins to /dashboard.

Operational queries

All of these live in apps/api/convex/platformAdmin/queries.ts and start with requirePlatformAdmin. They read the singleton instanceSettings row plus derived reputation/stats:

Query	Returns
`getPlatformStats`	Contact count, abuse status, 30-day send/delivery/bounce/complaint aggregates, signups-by-day
`getDeliveryStats`	In-flight `sending` campaigns, `scheduled` count, aggregate delivery metrics, last-7-day blocked-email counts by reason
`getOrganizationDetail`	Settings, reputation summary, blocked-email counts (`bounced`/`complained`/`manual`), recent content scans and campaigns
`listFlaggedOrganizations`	The instance entry, but only when `warned`/`suspended`/`banned` or reputation risk is `high`/`critical`
`listAllOrganizations`	The instance entry with optional `search`/`statusFilter`
`listAllUsers`	User profiles with optional `search`
`listAllDomains`	Domains with optional `statusFilter`/`search`
`listRecentAbuse`	Flagged content scans (`suspicious`/`blocked`) plus `pending_review` campaigns
`getAdminAuditLog`	Up to 200 `platform_admin`-resource audit-log rows, with admin-email resolution
`listPlatformAdmins`	The full admin roster

Billing is out of OSS

getBillingOverview exists for backward compatibility and returns empty data — invoice-based billing was removed during the billing simplification and now lives only in the separate private repo. Do not build against it.

Roster + status mutations

These live in apps/api/convex/platformAdmin/mutations.ts:

Mutation	Effect
`setOrganizationStatus`	Sets abuse status via `adminOverride` (bypassing severity rules); writes both `abuse_status_changed` and the legacy `platform_admin.org_status_changed` audit rows
`addPlatformAdmin`	Adds an admin (`superadmin`-only; rejects duplicates)
`removePlatformAdmin`	Removes an admin (`superadmin`-only; cannot remove yourself)

Content review queue (approve/reject pending content)

Campaigns and transactional emails can land in a pending_review status (for example when content scanning flags them — see Email Security). The review surface is platform-admin only.

getContentReviewQueue (in platformAdmin/queries.ts) returns the pending items joined with their latest content-scan result, plus the recently-reviewed log:

{
  pending: [{ type: 'campaign' | 'transactional', id, name, subject, scan: { score, level } | null, ... }],
  pendingCount: number,
  recentlyReviewed: [{ action, details, userId, createdAt }],
}

The three review mutations (in platformAdmin/mutations.ts) all require pending_review as the current status and write a platform_admin.content_approved or platform_admin.content_rejected audit row:

Mutation	Effect on the resource
`approveCampaign`	Campaign → `draft` (the owner can then send it)
`approveTransactional`	Transactional email → `published` (sets `publishedAt`)
`rejectContent`	Campaign → `draft`, or transactional email → `draft`; reason is recorded

Both approval paths and the rejection path move the resource out of the queue; there is no separate "blocked" terminal state for reviewed content — rejection reverts to draft with the reason captured in the audit log.
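
The status changes in the table above amount to a small transition function. The sketch below is illustrative only: nextStatusAfterReview is a hypothetical name, and the real mutations also write the audit row and, for transactional approvals, set publishedAt.

```typescript
// Sketch only: a pure summary of the review transitions, not the mutations themselves.
type ReviewKind = 'campaign' | 'transactional';
type ReviewAction = 'approve' | 'reject';

function nextStatusAfterReview(kind: ReviewKind, currentStatus: string, action: ReviewAction): string {
  // Every review mutation requires the resource to be in pending_review.
  if (currentStatus !== 'pending_review') {
    throw new Error(`expected pending_review, got ${currentStatus}`);
  }
  // Rejection always reverts to draft; the reason goes to the audit log.
  if (action === 'reject') return 'draft';
  // Approval: campaigns return to draft for the owner to send, transactional emails publish.
  return kind === 'campaign' ? 'draft' : 'published';
}
```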

Organization deletion walker

Deleting the organization is a hard, irreversible wipe of the entire data plane. It is owner-only: organizationSettings.remove (in apps/api/convex/organizations/settings.ts) checks session.role === 'owner' and schedules internal.organizations.deletion.walker.start. The walker lives in apps/api/convex/organizations/deletion/walker.ts and implements ADR-0025 (docs/adr/0025-organization-deletion-module-family.md).

Irreversible

remove deletes every per-organization table, including storage blobs, audit logs, and finally the singleton instanceSettings row that owns the org's existence. There is no soft-delete and no recovery short of a database restore.

How it works:

Ordered cascade. STEPS is an ordered list of the full per-organization table set, ~90 tables. Children are deleted before parents; storage-bearing tables (e.g. mediaAssets, semanticFiles, mailMessages) purge their blobs before the row delete; auditLogs is second-to-last (it keeps accumulating from delegated lifecycle calls during the wipe); instanceSettings is the terminal step.
One module per table. Each entry has a sibling step module under deletion/steps/<table>.ts implementing the deleteBatch(ctx) contract from steps/_common.ts (default batch size 100). A few tables delegate — for example the contacts step sweeps five child tables (contactTopics, contactPropertyValues, contactActivities, contactIdentities, contactRelationships) that are not standalone steps.
Self-scheduled hop. runStep runs one batch, re-fires itself while hasMore is true, advances to nextTable when a step drains, and terminates when there is no next step. The table argument is validated against the literal union in _common.ts, so a typo is a compile-time and boot-time error rather than a silent no-op.

Any new per-organization table added to the schema must also add a literal to OrganizationDeletionTable and a sibling step module, or the deletion cascade will leave it orphaned.
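
The hop logic can be illustrated with an in-memory model. This is a sketch under stated assumptions: the three-table STEPS list stands in for the real ~90-entry list, the Db type, BatchResult shape, and runWalker loop are hypothetical, and the real runStep re-fires itself instead of being driven by a loop.

```typescript
// Sketch only: an in-memory model of the ordered, batched cascade.
type Table = 'contacts' | 'auditLogs' | 'instanceSettings';
type Db = Record<Table, string[]>;
interface BatchResult {
  deleted: number;
  hasMore: boolean;
}

// Children first, auditLogs late, instanceSettings as the terminal step.
const STEPS: readonly Table[] = ['contacts', 'auditLogs', 'instanceSettings'];
const BATCH_SIZE = 100; // documented default batch size

// Delete at most one batch from a table and report whether rows remain.
function deleteBatch(db: Db, table: Table): BatchResult {
  const removed = db[table].splice(0, BATCH_SIZE);
  return { deleted: removed.length, hasMore: db[table].length > 0 };
}

// One hop: returns the table the next hop should process, or null when the walk is done.
function runStep(db: Db, table: Table): Table | null {
  const { hasMore } = deleteBatch(db, table);
  if (hasMore) return table; // same table again
  return STEPS[STEPS.indexOf(table) + 1] ?? null; // advance, or terminate after the last step
}

// Drives the hops in a loop and returns how many hops were needed.
function runWalker(db: Db): number {
  let hops = 0;
  let table: Table | null = STEPS[0] ?? null;
  while (table !== null) {
    table = runStep(db, table);
    hops++;
  }
  return hops;
}
```

Running one bounded batch per hop keeps each unit of work small, which is why a large table takes several hops before the walker advances.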

System health and in-app self-update (updater sidecar)

The System & Updates page (/dashboard/settings/system, platform-admin only) surfaces the current version, container health, the GitHub release check, the update flow, and update history.

Update check

apps/api/convex/systemUpdates.ts polls the GitHub Releases API (https://api.github.com/repos/wolvesdotink/owlat/releases/latest), caches the result in a singleton systemUpdates row (kind: 'latestCheck') with a 1-hour TTL, and computes updateAvailable by comparing the cached latest version to the running OWLAT_VERSION.

Function	Kind	Notes
`checkForUpdates`	Action	Admin-gated; honours the 1-hour cache unless `force: true`; returns the cached result when GitHub rate-limits the request (403/429)
`getLatestRelease`	Query	Reads the cached latest-check row
`listUpdateHistory`	Query	Lists `kind: 'updateRun'` rows (most recent first; capped at 200)

A local build reporting OWLAT_VERSION=dev (or a non-semver value) is always treated as "no update available", because Owlat cannot tell whether a dev build is ahead of or behind the release tag.
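
The comparison and cache rules above can be sketched as two pure functions. Everything here is illustrative: parseSemver, isUpdateAvailable, and isCacheFresh are hypothetical names, and the parser accepts only plain MAJOR.MINOR.PATCH values (optionally prefixed with v), so pre-release tags are treated as non-semver in this sketch.

```typescript
// Sketch only: not the comparison code in systemUpdates.ts.
const CACHE_TTL_MS = 60 * 60 * 1000; // 1-hour TTL

// Parses "1.2.3" or "v1.2.3"; anything else (including "dev") yields null.
function parseSemver(version: string): [number, number, number] | null {
  const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version.trim());
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// True only when both versions parse and the latest release is strictly newer.
function isUpdateAvailable(running: string, latest: string): boolean {
  const a = parseSemver(running);
  const b = parseSemver(latest);
  if (!a || !b) return false; // dev or non-semver builds never report an update
  for (let i = 0; i < 3; i++) {
    const diff = (b[i] ?? 0) - (a[i] ?? 0);
    if (diff !== 0) return diff > 0;
  }
  return false;
}

// True while the cached check is younger than the TTL.
function isCacheFresh(checkedAtMs: number, nowMs: number): boolean {
  return nowMs - checkedAtMs < CACHE_TTL_MS;
}
```

Comparing numeric components rather than strings matters for versions such as 1.10.0, which sorts before 1.9.9 lexically.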

Applying an update

Clicking Update now posts to the Nitro route apps/web/server/api/system/update.post.ts, which runs the following steps in order:

Verify the caller

The route re-checks platform-admin status via the session cookie, and requires INSTANCE_SECRET to be configured.

Download and verify the pinned compose file

It fetches docker-compose-<version>.yml and its .sha256 manifest from the GitHub Release, verifies the SHA-256 hash, and confirms the body references ghcr.io/wolvesdotink/web:<version>. A hash mismatch aborts the update.

Record the attempt

It records an updateRun row via recordUpdateStart, capturing versionFrom/versionTo.

Dispatch to the updater sidecar

It POSTs the verified compose template to http://updater:3200/update with the X-Instance-Secret header.

Record the outcome

It calls recordUpdateFinish with a success or failed status and the step output. Both record mutations also emit a structured JSON log line for external log sinks.
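
The integrity check in the download step can be sketched with Node's built-in crypto module. This is an illustration, not the route's code: verifyCompose is a hypothetical name, and the manifest is assumed to use the common sha256sum layout (hex digest, whitespace, filename), which the source does not specify.

```typescript
import { createHash } from 'node:crypto';

// Sketch only: throws when the compose body fails either documented check.
function verifyCompose(body: string, manifest: string, version: string): void {
  // Assumption: the .sha256 manifest starts with the hex digest, sha256sum-style.
  const expected = manifest.trim().split(/\s+/)[0]?.toLowerCase();
  const actual = createHash('sha256').update(body, 'utf8').digest('hex');
  if (!expected || expected !== actual) {
    throw new Error('sha256 mismatch: aborting update');
  }
  // The body must reference the web image pinned to the requested version.
  if (!body.includes(`ghcr.io/wolvesdotink/web:${version}`)) {
    throw new Error('compose file does not reference the expected web image');
  }
}
```

Verifying the hash before anything is recorded or dispatched means a tampered or truncated download never reaches the updater.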

The updater sidecar

The updater (apps/updater/src/index.ts, image ghcr.io/wolvesdotink/updater) is the only component that touches Docker. It listens on port 3200 and is exposed only on an internal docker network with no host port mapping, so only the web container can reach it, at http://updater:3200. It authenticates every request with a timing-safe X-Instance-Secret compare and reaches Docker through a least-privilege socket proxy: it can pull, recreate, and list containers but cannot exec, build, or touch volumes.
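
One common way to implement a timing-safe header compare in Node is shown below. It is a sketch of the general technique, not the updater's code: secretsMatch is a hypothetical name, and hashing both values first is a choice made here so that the buffers passed to timingSafeEqual always have equal length.

```typescript
import { createHash, timingSafeEqual } from 'node:crypto';

// Sketch only: constant-time comparison of a presented secret against the configured one.
function secretsMatch(presented: string | undefined, expected: string): boolean {
  // Fail closed when the header is missing or no secret is configured.
  if (!presented || !expected) return false;
  // timingSafeEqual throws on unequal lengths, so compare fixed-size digests instead.
  const a = createHash('sha256').update(presented, 'utf8').digest();
  const b = createHash('sha256').update(expected, 'utf8').digest();
  return timingSafeEqual(a, b);
}
```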

Endpoint	Method	Purpose
`/update`	POST	Validate the compose template against an image allowlist + dangerous-mount/privileged-mode rules, write it, `docker compose pull`, run a one-shot `convex-deploy` (function deploy before restart — a bad schema aborts here and the old containers keep serving), then `docker compose up -d`
`/health`	GET	Per-service container state, image tag, and health (auth-required to prevent enumeration)
`/configure-ip`	POST	Attach/detach a floating IPv4 to `eth0` and the campaign IP pool, then restart the MTA
`/rotate-env`	POST	Rewrite rotated secrets in `.env` and force-recreate containers (requires all secret fields; tightest rate limit)

The System & Updates page reads /health (proxied by apps/web/server/api/internal/updater-health.get.ts) to render the container-health table. The convex-deploy-before-restart ordering is the safety property: an incompatible schema fails the deploy step and the running Web/MTA containers are never restarted against a half-deployed backend.

Operator-provided distribution

The in-app updater applies images and compose templates from the GitHub Release pipeline; it is the supported self-update path. The desktop app's auto-update key is an empty placeholder, so desktop auto-update is not production-ready and desktop distribution is operator-provided — see Desktop App.

For the operator-facing update playbook (rollback, recovery, automation), see Maintenance & Updates.

Dev-mode endpoints, crons, and migrations

Dev-mode endpoints

A small set of destructive shortcuts exist for local development. They are fail-closed: the guard in apps/api/convex/devShortcuts/_guard.ts treats the deployment as production unless the operator explicitly sets OWLAT_DEV_MODE=true in the Convex backend runtime env (npx convex env set OWLAT_DEV_MODE true). The CLI-side CONVEX_DEPLOYMENT env var is not propagated into the function runtime, so it cannot be used as a security boundary.

Endpoint	Guard	Purpose
`POST /dev/reset`	`OWLAT_DEV_MODE` + `X-Instance-Secret`	Wipe the instance back to a blank slate (tenant tables, BetterAuth tables, and local auth tables) so the signup flow can be re-exercised without `docker compose down -v`; idempotent
`POST /seed/demo`	`OWLAT_DEV_MODE` + `X-Instance-Secret`	Seed realistic demo content (idempotent)
`forceVerifyDomain`	`OWLAT_DEV_MODE` + `organization:manage`	Force a domain to `verified` with synthesised DNS results, bypassing live DNS lookups

Never enable OWLAT_DEV_MODE in production

/dev/reset deletes everything tenant-side — not just seed-tagged rows. Leave OWLAT_DEV_MODE unset on any deployment that holds real data.
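
The fail-closed behaviour of the guard can be sketched as follows. This is not the code in _guard.ts: isDevModeEnabled and assertDevMode are hypothetical names, the environment is passed in explicitly for testability, and treating only the exact string true as enabled is an assumption.

```typescript
// Sketch only: fail-closed dev-mode check over an explicit env object.
type Env = Record<string, string | undefined>;

// Anything other than an explicit OWLAT_DEV_MODE=true counts as production.
function isDevModeEnabled(env: Env): boolean {
  return env['OWLAT_DEV_MODE'] === 'true';
}

// Call at the top of every dev shortcut before doing anything destructive.
function assertDevMode(env: Env): void {
  if (!isDevModeEnabled(env)) {
    throw new Error('dev shortcuts are disabled on this deployment');
  }
}
```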

Crons

Scheduled jobs are registered in apps/api/convex/crons.ts. Operationally relevant ones include:

Cron	Interval	Job
process scheduled campaigns	1 min	Backup dispatch for campaigns whose `scheduledAt` has passed
reconcile sending campaigns	1 min	Advance `sending → sent` when no queued sends remain
process account deletions	24 h	Process accounts past their 30-day grace period
cleanup webhook logs	weekly	Remove delivery logs older than 30 days
cleanup sending reputation	1 h	Age out reputation buckets older than 60 days (both scopes)
cleanup soft-deleted contacts	24 h	Permanently delete contacts past their 30-day retention
knowledge graph maintenance	24 h	Confidence decay + expiry cleanup
channel health checks	5 min	Probe channel connectivity
agent metrics rollup	5 min	Compute queue depth/latency/error rates and evaluate circuit breakers
report analytics	15 min	Report instance analytics to the control plane

Counter-reconciliation crons (reconcile contact counts, reconcile topic member counts, refresh segment counts, reconcile transactional send counts) keep denormalized counters honest against any drift from partial failures.

Migrations

Schema/data migrations are one-shot internal mutations under apps/api/convex/migrations/, each exporting a run mutation. They follow the pre-prod "atomic breaking change" pattern: each migration is idempotent (re-running is a no-op once every row is rewritten) and, because deployments are single-org and resettable, most simply load the affected rows with .collect() and rewrite them synchronously.

You invoke one by name through the Convex CLI, for example:

npx convex run migrations/0033_campaign_audience:run

Prefer reset over backfill in pre-prod

Owlat's convention is clean breaking changes plus a data reset rather than two-phase backward-compat migration ceremony. Several migrations note that on a freshly-seeded deployment they see no legacy rows and are a no-op record of intent — if a deployment ever does carry legacy rows that a migration cannot map, the migration throws loudly instead of silently dropping data.
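
The idempotent, throw-on-unmappable pattern can be illustrated with an in-memory sketch. The row shape, the legacyKind and kind fields, and runMigration are all hypothetical and do not describe any specific migration in the repository.

```typescript
// Sketch only: hypothetical row shape with a legacy field being replaced by a new one.
interface Row {
  id: string;
  legacyKind?: string;
  kind?: 'all' | 'segment';
}

// Rewrites every legacy row in place and returns how many rows changed.
function runMigration(rows: Row[]): number {
  let rewritten = 0;
  for (const row of rows) {
    const legacy = row.legacyKind;
    if (legacy === undefined) continue; // already migrated: re-running is a no-op
    const kind = legacy === 'all' ? 'all' : legacy === 'segment' ? 'segment' : null;
    if (kind === null) {
      // Fail loudly instead of silently dropping data that cannot be mapped.
      throw new Error(`row ${row.id}: cannot map legacy value "${legacy}"`);
    }
    row.kind = kind;
    delete row.legacyKind;
    rewritten++;
  }
  return rewritten;
}
```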

Maintenance & Updates

The operator playbook for updating, rollback, and recovery from a failed update.

Deliverability Infrastructure

How sending reputation is derived and how risk levels drive abuse auto-enforcement.

Convex Backend

The backend conventions these operational modules are built on.

Feature flags

How runtime features map to docker profiles on a self-hosted deployment.