Campaign Internals

How the campaign backend works: two status machines, send pre-flight, the send orchestrator, emailSends records, and the priority workpools.

This page is the developer reference for the campaign send path on the Convex backend — what happens between a user clicking Send and a recipient's mailbox. It covers the two lifecycle state machines, the send pre-flight gate, the orchestrator pipeline, the per-recipient emailSends records, and the rate-limited workpools. For the user-facing walkthrough see Send & Monitor a Campaign; for how the audience is resolved into recipients see Audience Internals.

Two-machine design

A campaign row carries two independent status columns: status (the campaign lifecycle) and abTestStatus (the A/B-test lifecycle). Each has its own legal-edges graph and its own single-writer module. This split is recorded in docs/adr/0017-campaign-lifecycle-modules.md.

Campaign lifecycle state machine

The campaign's status field is written by exactly one module — apps/api/convex/campaigns/lifecycle.ts. Its public transition mutation is the only writer of campaigns.status and the companion fields it patches alongside it (sentAt, cancelledAt, scheduledAt, contentBlockReason, and the stats-zero block on send). Direct ctx.db.patch of status anywhere else is a layering violation.

The six statuses

| Status | Meaning |
| --- | --- |
| draft | Editable; not queued. The default for a new campaign. |
| scheduled | A future send time is set; the orchestrator is scheduled to fire then. |
| sending | The orchestrator is running (or its sends are in flight). |
| sent | Terminal. Every queued send has left the queue. |
| cancelled | Terminal. The user cancelled before or during sending. |
| pending_review | The content scanner flagged the send as suspicious; held pending review. |

The reducer rejects any transition not in LEGAL_EDGES. There is no exception thrown — an illegal call returns { ok: false, reason: 'illegal_edge' }, and a transition out of a terminal status returns reason: 'terminal'. Callers translate the outcome into a user response.

| From | Legal to |
| --- | --- |
| draft | scheduled, sending |
| scheduled | draft, cancelled, sending |
| sending | sent, draft, pending_review |
| pending_review | sending, draft |
| sent | (terminal) |
| cancelled | (terminal) |

A self-loop (from === to) is idempotent: it writes an audit-log row, returns applied: 'recorded', and emits no patch, no scheduler hop, and no PostHog event. This is what makes a re-fired scheduler tick safe.
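The edge rules above can be sketched as a small pure check. This is an illustration only, not the code in lifecycle.ts: the function name checkEdge, the EdgeCheck type, and the 'patched' label for a normal transition are hypothetical, and the sketch assumes the self-loop check runs before the terminal check.

```typescript
type CampaignStatus =
  | 'draft' | 'scheduled' | 'sending' | 'sent' | 'cancelled' | 'pending_review';

// Legal edges copied from the table above; terminal statuses have no outgoing edges.
const LEGAL_EDGES: Record<CampaignStatus, readonly CampaignStatus[]> = {
  draft: ['scheduled', 'sending'],
  scheduled: ['draft', 'cancelled', 'sending'],
  sending: ['sent', 'draft', 'pending_review'],
  pending_review: ['sending', 'draft'],
  sent: [],
  cancelled: [],
};

// 'recorded' comes from the text above; 'patched' is a hypothetical label.
type EdgeCheck =
  | { ok: true; applied: 'recorded' | 'patched' }
  | { ok: false; reason: 'illegal_edge' | 'terminal' };

function checkEdge(from: CampaignStatus, to: CampaignStatus): EdgeCheck {
  // Assumption: a self-loop is treated as idempotent before any other check.
  if (from === to) return { ok: true, applied: 'recorded' };
  // No outgoing edges means the status is terminal.
  if (LEGAL_EDGES[from].length === 0) return { ok: false, reason: 'terminal' };
  if (!LEGAL_EDGES[from].includes(to)) return { ok: false, reason: 'illegal_edge' };
  return { ok: true, applied: 'patched' };
}
```

Returning a result object instead of throwing lets each caller decide how to phrase the failure to the user.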

Review approve/reject is not yet wired

The content scanner can move a suspicious campaign into pending_review, but the two edges out of it (pending_review → sending for approve, pending_review → draft for reject) have no product caller today: there is no review-queue UI and no approve/reject mutation in this OSS repo. The edges exist in the legal-edges graph and the audit actions (campaign.review_approved / campaign.review_rejected) are defined, but an operator would have to invoke the internal lifecycle.transition mutation by hand to release a held campaign.

Effects

The reducer is pure: given the loaded campaign, the typed input, and a userId, it returns a patch plus a list of effects. A separate runner applies them in order, atomically with the row patch. Four effect kinds exist:

| Effect | When it fires |
| --- | --- |
| audit_log | Every transition (including idempotent self-loops). |
| schedule_campaign_send_orchestrator | On → scheduled (delay = scheduledAt - at) and → sending (delay = 0). Consumer is campaigns.send.startCampaignSend. |
| track_event | On user-driven → scheduled / → sending / → cancelled only. Captured to PostHog. |
| start_ab_test_if_enabled | Cross-machine kickoff on → sending when isABTest. Calls the A/B-test lifecycle's → testing. |

The userId argument discriminates user-driven from system-source transitions. User-facing mutations pass session.userId; internal callers pass a 'system:<source>' tag (e.g. 'system:scheduler_tick', 'system:content_scan', 'system:orchestrator', 'system:send_completion'). The audit log records the tag verbatim, and the track_event effect is suppressed for any system:-prefixed caller — so background transitions never pollute PostHog.

"Campaign sent" semantics

The campaign_sent PostHog event fires on the → sending edge, not → sent. For the user, "the campaign was sent" means "the send was kicked off." The pending_review → sending edge is excluded from the event — it would be a review release, not a user-initiated send (and, as noted above, it has no caller yet anyway).
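The track_event rules described in this section can be condensed into one predicate. This is a sketch under the stated rules, not the reducer's actual code; isSystemCaller and shouldTrackEvent are hypothetical names.

```typescript
type CampaignStatus =
  | 'draft' | 'scheduled' | 'sending' | 'sent' | 'cancelled' | 'pending_review';

// Internal callers pass a 'system:<source>' tag instead of a real user id.
function isSystemCaller(userId: string): boolean {
  return userId.startsWith('system:');
}

// Hypothetical helper: decides whether a transition should emit a PostHog event.
function shouldTrackEvent(
  from: CampaignStatus,
  to: CampaignStatus,
  userId: string,
): boolean {
  if (from === to) return false;            // idempotent self-loop: audit log only
  if (isSystemCaller(userId)) return false; // background transitions are suppressed
  // A review release is not a user-initiated send, so it is excluded.
  if (from === 'pending_review' && to === 'sending') return false;
  return to === 'scheduled' || to === 'sending' || to === 'cancelled';
}
```

For example, shouldTrackEvent('scheduled', 'sending', 'system:scheduler_tick') is false, while the same edge driven by a real user id is true.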

Reaching sent: batch completion

→ sent is not driven by the user. Each per-send workpool callback advances its own emailSends row, but the campaign itself only completes when its last queued send leaves the queue. tryCompleteCampaign (in the same module) is the shared guard, called from two places:

  • reconcileCampaignCompletion: the per-send entry point, invoked from the workpool completion callback after every send reaches a terminal status. A cheap no-op until the campaign is genuinely done.
  • reconcileSendingCampaigns: a safety-net cron sweep, registered as reconcile sending campaigns, that runs every minute (the same cadence as the process scheduled campaigns cron) and reconciles every campaign still in sending. It catches callbacks that errored and final sends that were transitioned by a provider webhook rather than the workpool.

tryCompleteCampaign advances to sent only when all of these hold:

  • status is sending;
  • the checkpointed send walker has finished streaming (no campaignSendJobs row is still in phase resolving);
  • an A/B test, if any, has reached winner_selected (otherwise the second-wave remainder send would be skipped);
  • at least one emailSends row exists;
  • no emailSends row remains in queued.

queued is the sole non-terminal send status, so its absence means the campaign is done.
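The completion conditions can be read as a single predicate over a snapshot of the campaign. The sketch below is illustrative: CompletionSnapshot and canCompleteCampaign are hypothetical names, and the real tryCompleteCampaign reads these values from the database rather than from a prepared object. It also assumes the isABTest flag is what identifies an A/B campaign.

```typescript
// Hypothetical snapshot of the facts the guard needs.
interface CompletionSnapshot {
  status: string;
  walkerStillResolving: boolean; // true if any campaignSendJobs row is in phase 'resolving'
  isABTest: boolean;
  abTestStatus?: string;
  totalSends: number;            // emailSends rows for this campaign
  queuedSends: number;           // emailSends rows still in 'queued'
}

function canCompleteCampaign(s: CompletionSnapshot): boolean {
  if (s.status !== 'sending') return false;
  if (s.walkerStillResolving) return false;
  // Completing before a winner exists would skip the remainder send.
  if (s.isABTest && s.abTestStatus !== 'winner_selected') return false;
  if (s.totalSends === 0) return false;
  return s.queuedSends === 0;
}
```

Because every check is a cheap early return, calling the guard after each send costs little until the final send leaves the queue.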

A/B test lifecycle and the remainder-send guarantee

The abTestStatus column has its own machine in apps/api/convex/campaigns/abTestLifecycle.ts — same row, different column, separate graph. Its transition mutation is the only writer of abTestStatus and its companions (abTestConfig, abWinner, abWinnerSelectedAt, and the variant-stat reset block on disable).

| From | Legal to | Trigger |
| --- | --- | --- |
| (none) | pending | enableABTest mutation |
| pending | testing | Cross-machine: the campaign lifecycle's start_ab_test_if_enabled effect on → sending |
| testing | winner_selected | declareABTestWinner (manual) or autoDeclareWinner (criteria-driven) |
| * | none | disableABTest (full reset) |

Two effects guarantee the held-back audience is never orphaned:

  • schedule_auto_winner fires on → testing when winnerCriteria is not manual and testDuration (hours) is set. It schedules campaigns.abTest.autoDeclareWinner after the test window. Without it, an open_rate / click_rate campaign (the wizard default) would sit in testing forever and the 40–60% remainder audience would never be sent. Manual criteria instead rely on the user clicking a "choose winner" button, which the report page surfaces only for manual.
  • schedule_winner_remainder fires on → winner_selected. It schedules campaigns.send.sendCampaignWinnerToRemainder (the second-phase orchestrator, below) to deliver the winning variant's content to everyone who was held back.

What splitPercentage means

splitPercentage (validated 10–50) is the percentage per variant of the test cohort. The cohort is therefore 2 × splitPercentage % of the audience — e.g. 20 produces a 40% test cohort (20% A, 20% B) and a 60% held-back remainder.
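The arithmetic is easy to get wrong, so here is a small worked sketch. splitBreakdown is a hypothetical helper written for this page, not a function in the repo; it only restates the rule above.

```typescript
interface SplitBreakdown {
  variantA: number;   // percent of the audience
  variantB: number;   // percent of the audience
  testCohort: number; // variantA + variantB
  remainder: number;  // held back until a winner is declared
}

// Hypothetical helper: expands splitPercentage into audience percentages.
function splitBreakdown(splitPercentage: number): SplitBreakdown {
  if (splitPercentage < 10 || splitPercentage > 50) {
    throw new RangeError('splitPercentage must be between 10 and 50');
  }
  const testCohort = 2 * splitPercentage;
  return {
    variantA: splitPercentage,
    variantB: splitPercentage,
    testCohort,
    remainder: 100 - testCohort,
  };
}
```

With splitPercentage = 20 this yields a 40% test cohort and a 60% remainder; at the upper bound of 50 the whole audience is in the test and the remainder is 0%.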

Send pre-flight validation

Before any caller transitions a campaign to scheduled or sending, it runs the pre-flight in apps/api/convex/campaigns/preflight.ts. The lifecycle reducer trusts its input — it does not re-validate readiness — so pre-flight is the single gate that the four send/schedule entry points (and the orchestrator at fire time) share.

validateReadyToSend returns a PreflightResult union; the first failing check wins. The ordered checks are:

| reason | Condition |
| --- | --- |
| no_template | emailTemplateId is unset |
| no_audience | audience is unset |
| no_from_email | fromEmail is unset |
| sending_not_allowed | The instance's abuseStatus is suspended or banned |
| no_delivery_provider | No email delivery provider (EMAIL_PROVIDER + credentials, or a provider route) is configured. A connected external IMAP mailbox does not satisfy this. |
| domain_not_verified | The from-address domain is not verified |
| scheduled_in_past | scheduledAt is in the past (only checked when scheduledAt is supplied) |

The validateReadyToSendQuery internal query wraps the same logic so the orchestrator can re-run pre-flight at fire time — catching state that drifted between the original schedule call and the scheduler tick (org went suspended, template deleted, domain verification expired).
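The "first failing check wins" rule can be sketched as an ordered list of predicates. The shape of PreflightResult, the PreflightInput fields hasDeliveryProvider, fromDomainVerified, and now, and the function name are all assumptions made for illustration; the real validateReadyToSend loads this state itself.

```typescript
type PreflightReason =
  | 'no_template' | 'no_audience' | 'no_from_email' | 'sending_not_allowed'
  | 'no_delivery_provider' | 'domain_not_verified' | 'scheduled_in_past';

// Assumed shape of the result union.
type PreflightResult = { ok: true } | { ok: false; reason: PreflightReason };

// Hypothetical, pre-resolved view of the state the checks look at.
interface PreflightInput {
  emailTemplateId?: string;
  audience?: string;
  fromEmail?: string;
  abuseStatus: string;
  hasDeliveryProvider: boolean;
  fromDomainVerified: boolean;
  scheduledAt?: number; // epoch ms, only present when scheduling
  now: number;          // epoch ms
}

// Order matters: it mirrors the table above.
const CHECKS: ReadonlyArray<[PreflightReason, (c: PreflightInput) => boolean]> = [
  ['no_template', (c) => !c.emailTemplateId],
  ['no_audience', (c) => !c.audience],
  ['no_from_email', (c) => !c.fromEmail],
  ['sending_not_allowed', (c) => c.abuseStatus === 'suspended' || c.abuseStatus === 'banned'],
  ['no_delivery_provider', (c) => !c.hasDeliveryProvider],
  ['domain_not_verified', (c) => !c.fromDomainVerified],
  ['scheduled_in_past', (c) => c.scheduledAt !== undefined && c.scheduledAt < c.now],
];

function validateReadyToSendSketch(c: PreflightInput): PreflightResult {
  for (const [reason, fails] of CHECKS) {
    if (fails(c)) return { ok: false, reason };
  }
  return { ok: true };
}
```

Keeping the checks in one ordered table means the schedule-time call and the fire-time re-run report the same reason for the same state.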

The send orchestrator

campaigns.send.startCampaignSend (apps/api/convex/campaigns/send.ts) is the single live action that takes a campaign from scheduled | sending through the full prep pipeline. It is fired by three producers: the processScheduledCampaigns cron tick, the lifecycle's schedule_campaign_send_orchestrator effect, and a direct reschedule in campaigns/scheduling.ts. It is the only writer of emailSends.abVariant and the only first-phase caller of enqueueCampaignEmails.

Status-race guard

If the campaign has been cancelled, reverted to draft (unscheduled), or already sent, the orchestrator returns skipped: true with a reason. It does not skip on sending: the sendNow path arrives already flipped to sending, and same-state transitions don't re-fire the orchestrator effect.

Re-run pre-flight

validateReadyToSendQuery runs again. A failure returns skipped with the pre-flight message — no recipients are touched.

Flip to sending

If the campaign is still scheduled, the orchestrator transitions it to sending via the lifecycle (userId system:scheduler_tick).

Content scan gate

The subject and rendered HTML are scanned for spam, phishing, and prohibited content. When a GOOGLE_SAFE_BROWSING_API_KEY is configured, link URLs are also checked against Google Safe Browsing (a failure of that check does not block the send). The combined score classifies the send: ≥ 40 is blocked, ≥ 15 is suspicious, anything lower is clean. A non-clean result is persisted to contentScanResults. A blocked send reverts the campaign to draft with a contentBlockReason (userId system:content_scan); a suspicious send transitions it to pending_review. Both short-circuit the orchestrator.
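The thresholds and their consequences can be written down as two small functions. This is a sketch of the rule as stated here; classifyScanScore and statusAfterScan are hypothetical names, and how the combined score is computed is out of scope.

```typescript
type ScanVerdict = 'blocked' | 'suspicious' | 'clean';

// Thresholds from the paragraph above: >= 40 blocked, >= 15 suspicious.
function classifyScanScore(score: number): ScanVerdict {
  if (score >= 40) return 'blocked';
  if (score >= 15) return 'suspicious';
  return 'clean';
}

// Campaign status the orchestrator moves to (or stays in) after the scan.
function statusAfterScan(verdict: ScanVerdict): 'draft' | 'pending_review' | 'sending' {
  switch (verdict) {
    case 'blocked':
      return 'draft';          // reverted, with a contentBlockReason
    case 'suspicious':
      return 'pending_review'; // held for review
    case 'clean':
      return 'sending';        // the pipeline continues
  }
}
```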

Archive snapshot

If archiving is enabled (per-campaign archiveEnabled, falling back to the resolved campaigns.archive feature flag) and SITE_URL is set, the orchestrator generates a public archive HTML snapshot, stores it with a 24-char token, and computes a viewInBrowserUrl that is threaded into each send.

Freeze the audience

freezeCampaignAudience snapshots a segment audience's filters at send time (ADR-0033), so the campaign reproduces the exact audience it targeted even if the segment is later edited. Topic audiences and already-frozen segments pass through unchanged.

Open a send-job checkpoint and stream recipients

The orchestrator no longer resolves the whole audience inline. It opens a campaignSendJobs checkpoint (createSendJob, in campaigns/sendJob.ts) and schedules resolveCampaignPage (campaigns/send.ts:611), a self-rescheduling walker that streams the frozen audience one bounded page at a time via resolveRecipientPage. Each page applies the single eligibility predicate — live contact → email present → not suppressed → double-opt-in confirmed (topic audiences only — segment audiences are never DOI-gated) — then groups by language and buckets each contact into the A/B split (below), enqueueing the page before fetching the next. The walker re-schedules itself until the audience is exhausted. The empty-audience → sent fast-path lives in the walker's last-page branch (campaigns/send.ts:884-895, userId system:orchestrator).

Recipients are grouped by their language preference (falling back to the template's defaultLanguage) for i18n. Within each language group, the A/B fanout (if isABTest and abTestStatus === 'testing') buckets each contact by hash into the test cohort or the held-back remainder, then variants A and B are enqueued separately. Non-A/B groups enqueue everyone with variant A's content, untagged.
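The per-page eligibility predicate and the language grouping might look roughly like this. The Recipient fields and both function names are hypothetical stand-ins; the real walker works on contact documents and suppression lookups rather than on pre-computed booleans.

```typescript
// Hypothetical flattened view of a contact for one page of the walker.
interface Recipient {
  id: string;
  email?: string;
  deleted: boolean;
  suppressed: boolean;
  doiConfirmed: boolean;
  language?: string;
}

type AudienceKind = 'topic' | 'segment';

// Order follows the text: live contact, email present, not suppressed, DOI (topics only).
function isEligible(c: Recipient, audienceKind: AudienceKind): boolean {
  if (c.deleted) return false;
  if (!c.email) return false;
  if (c.suppressed) return false;
  if (audienceKind === 'topic' && !c.doiConfirmed) return false;
  return true;
}

// Groups eligible recipients by language, falling back to the template default.
function groupByLanguage(
  recipients: readonly Recipient[],
  defaultLanguage: string,
): Record<string, Recipient[]> {
  const groups: Record<string, Recipient[]> = {};
  for (const r of recipients) {
    const lang = r.language ?? defaultLanguage;
    (groups[lang] ??= []).push(r);
  }
  return groups;
}
```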

A/B fanout details

resolveAbFanout (campaigns/sendVariantSplit.ts) returns a fanout only when the campaign isABTest, abTestStatus === 'testing', and a config is present — so a winner_selected or pending campaign never first-phase-splits. The split is a deterministic FNV-1a hash of (campaignId:contactId) (hashFraction) compared against testFraction = 2 × splitPercentage / 100: h < testFraction is the test cohort (sub-bucketed A/B at the cohort midpoint via variantForHash), and h >= testFraction is the held-back remainder. Recipients stream page-by-page through the checkpointed send walker (resolveCampaignPage) — they are never shuffled or materialized in full.
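A minimal sketch of the deterministic bucketing follows. It uses the standard 32-bit FNV-1a constants, but the exact variant, the string encoding, and which half of the cohort maps to A are assumptions; the function bodies here are illustrative and are not copied from sendVariantSplit.ts. bucketFor is a hypothetical name.

```typescript
// Standard 32-bit FNV-1a. Assumes ASCII input (ids), so char codes equal bytes.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Maps (campaignId:contactId) to a stable fraction in [0, 1).
function hashFraction(campaignId: string, contactId: string): number {
  return fnv1a32(`${campaignId}:${contactId}`) / 0x100000000;
}

type Bucket = 'A' | 'B' | 'remainder';

// Hypothetical helper combining the cohort test and the midpoint sub-bucket.
function bucketFor(h: number, splitPercentage: number): Bucket {
  const testFraction = (2 * splitPercentage) / 100;
  if (h >= testFraction) return 'remainder';
  // Assumption: the lower half of the cohort is variant A.
  return h < testFraction / 2 ? 'A' : 'B';
}
```

Because the bucket depends only on the two ids, a contact lands in the same bucket on every page and on every retry, which is what allows the walker to stream pages without holding the full audience in memory.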

The held-back remainder is not sent in the first phase. After a winner is declared, the second-phase action campaigns.send.sendCampaignWinnerToRemainder resolves the audience again, excludes every contact that already has an emailSends row for this campaign (covering the test cohort and guarding against double-sends), and enqueues the winning variant's content to the rest. Both phases share the per-variant enqueueVariantBatch helper — the single site that creates emailSends rows and schedules enqueueCampaignEmails.
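The exclusion step of the second phase reduces to a set difference. The helper below is a hypothetical illustration that works on plain id arrays; the real action resolves the audience and reads emailSends rows from the database.

```typescript
// Hypothetical helper: keeps only contacts with no emailSends row for this campaign.
function remainderRecipients(
  audienceContactIds: readonly string[],
  alreadySentContactIds: ReadonlySet<string>,
): string[] {
  return audienceContactIds.filter((id) => !alreadySentContactIds.has(id));
}
```

Filtering on existing rows rather than re-deriving the hash buckets means a re-run of the second phase cannot double-send, even if it is triggered twice.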

emailSends records and "ever-reached" stat semantics

Each recipient of a campaign gets one emailSends row (apps/api/convex/schema/campaigns.ts). The contact's email, first name, and last name are snapshotted at send time and never updated — the row is the audit trail of what was actually sent, not a live view of the contact. status is a single field with eight values; queued is the only non-terminal one.

| status | Meaning |
| --- | --- |
| queued | Enqueued, not yet handed to the provider. |
| sent | Accepted by the provider. |
| failed | The workpool action errored (distinct from a bounce). |
| delivered | Provider confirmed delivery. |
| opened | An open was tracked. |
| clicked | A link click was tracked. |
| bounced | Provider accepted but the receiver rejected. bounceType records hard / soft. |
| complained | Recipient marked it as spam. |

Status writes go through one writer — internal.delivery.sendLifecycle.transition with a SendRef { kind: 'campaign', id }. The legacy per-event mutations in delivery/sends.ts (the emailSends-table mutation module) were removed; that module now only reads, creates (single via create and batched via createBatch), and deletes.

Stats are "ever-reached", not current-status

getStatsByCampaign does not count delivered / opened / clicked by current status. Because a row's status advances as new events arrive (a delivered row becomes opened, an opened row can later become bounced), counting by status would silently drop any recipient who progressed past a bucket and break every rate denominator. Instead, those buckets are derived from monotonic timestamps: delivered counts any row carrying a deliveredAt, openedAt, or clickedAt; opened counts any row with an openedAt; clicked counts any row with a clickedAt or non-empty clickedLinks. The same "ever delivered" denominator is used for the per-variant A/B stats, so the two surfaces can't drift.
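The timestamp-based counting can be sketched as follows. SendRow lists only the fields named above and everReachedStats is a hypothetical name; the real query also bounds its scan and returns more buckets.

```typescript
// Only the fields the "ever-reached" buckets depend on.
interface SendRow {
  status: string;
  deliveredAt?: number;
  openedAt?: number;
  clickedAt?: number;
  clickedLinks?: string[];
}

function everReachedStats(rows: readonly SendRow[]) {
  let delivered = 0;
  let opened = 0;
  let clicked = 0;
  for (const r of rows) {
    const wasClicked = r.clickedAt !== undefined || (r.clickedLinks?.length ?? 0) > 0;
    const wasOpened = r.openedAt !== undefined;
    // An open or click implies the message was delivered at some point.
    const wasDelivered =
      r.deliveredAt !== undefined || wasOpened || r.clickedAt !== undefined;
    if (wasDelivered) delivered++;
    if (wasOpened) opened++;
    if (wasClicked) clicked++;
  }
  return { delivered, opened, clicked };
}
```

A row whose current status is bounced but which carries an openedAt still counts as delivered and opened, which is exactly the case that counting by status would lose.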

getStatsByCampaign and the other per-send aggregate reads bound their scans (typically take(10_000)); campaigns larger than that should rely on the denormalized stats* counters on the campaign row, which the send lifecycle bumps per recipient.

Campaign vs transactional workpools and rate limiting

Email sends run through two separate Convex workpools defined in apps/api/convex/delivery/workpool.ts:

| Pool | maxParallelism | Used for |
| --- | --- | --- |
| transactionalEmailPool | 30/sec | Time-sensitive transactional emails (Transactional API) |
| campaignEmailPool | 20/sec | Bulk marketing campaign sends |

Two pools keep transactional mail from being blocked behind a long campaign queue. The combined ~50/sec stays under provider rate ceilings with a safety margin. Both pools set retryActionsByDefault but maxAttempts: 1 — exactly one worker run and no pool-level retry. The send-side retry loop is owned solely by the dispatch helper (lib/sendProviders/dispatch.ts, ADR-0020); a pool retry would re-run the whole worker and risk duplicate sends. The 1s/base-2 backoff fields exist but are never exercised at maxAttempts: 1.

The orchestrator's enqueueVariantBatch schedules enqueueCampaignEmails, which calls campaignEmailPool.enqueueAction once per recipient, targeting internal.delivery.worker.sendSingleEmail. Each enqueue wires the completion callback to internal.delivery.sendCompletion.completeSend and carries a typed sendRef in the workpool context so the completion module can translate worker outcomes into send-lifecycle transitions uniformly. Standard sends are scheduled immediately in chunks of 50; timezone-aware sends (useRecipientTimezone plus scheduledHour / scheduledMinute) are grouped by IANA timezone and delayed so each zone lands at the recipient's local time (DST-correct, not offset-based).
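The chunking of standard sends can be illustrated with a generic helper. chunk is a hypothetical utility written for this page; only the chunk size of 50 comes from the text, and the scheduling of each chunk is not shown.

```typescript
// Splits a list into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

const SEND_CHUNK_SIZE = 50; // chunk size stated above
```

For 120 recipients, chunk(ids, SEND_CHUNK_SIZE) produces three chunks of 50, 50, and 20.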