Code Tasks

Queue coding-agent tasks, watch them move from queued through review, and run the code-worker sidecar that opens the pull requests.

Code Tasks is a queue for the AI coding agent. A task carries a plain-language description of a feature or bug fix; the code-worker sidecar is designed to pick it up, clone your repository, write the change with a coding agent, run the test suite, and open a pull request for you to review. The dashboard at Inbox -> Code Tasks shows every task and its current stage. This is an admin surface — only owners and admins can create or cancel tasks.

The backend queue and the dashboard work today, and the code-worker sidecar authenticates with the deployment admin key, claims queued tasks, and reports progress back. The one real deployment gap is that the bundled VPS Compose template does not yet pass CONVEX_ADMIN_KEY to the code-worker service (see Setup and Current limitations).

What code tasks are

The feature has two halves: the queue that lives in the Owlat backend (the codeWorkTasks table), and the code-worker, a separate Docker container that does the actual coding work. The dashboard only reads and cancels tasks; the heavy lifting happens out-of-process in the sidecar.

Code Tasks is gated behind the Extract code tasks from inbox feature flag (inbox.codeTasks), which is off by default. Because the flag requires both inbox and ai.agent, enabling it also turns those on, and turning ai.agent back off cascades Code Tasks off again. When the flag is on, a Code Tasks entry appears in the Inbox section of the sidebar; when it's off, the sidebar entry disappears and the page itself is gated by the inbox.codeTasks feature requirement. See Feature flags for how to toggle it.
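
The dependency behaviour described above can be modelled with a small resolver. This is a minimal sketch for illustration only: the `Flags` type, the `REQUIRES` map, and both functions are hypothetical and are not Owlat's actual implementation. Only the flag names and the rule that inbox.codeTasks requires inbox and ai.agent come from this page; leaving inbox on when ai.agent is turned off is an assumption.

```typescript
// Hypothetical model of flag dependencies. Only the flag names and the
// "inbox.codeTasks requires inbox and ai.agent" rule come from the docs.
type FlagName = "inbox" | "ai.agent" | "inbox.codeTasks";
type Flags = Record<FlagName, boolean>;

const REQUIRES: Record<FlagName, FlagName[]> = {
  "inbox": [],
  "ai.agent": [],
  "inbox.codeTasks": ["inbox", "ai.agent"],
};

// Enabling a flag also enables every flag it requires.
function enableFlag(flags: Flags, flag: FlagName): Flags {
  let next: Flags = { ...flags };
  next[flag] = true;
  for (const dep of REQUIRES[flag]) {
    next = enableFlag(next, dep);
  }
  return next;
}

// Disabling a flag also disables every enabled flag that depends on it.
function disableFlag(flags: Flags, flag: FlagName): Flags {
  let next: Flags = { ...flags };
  next[flag] = false;
  for (const other of Object.keys(REQUIRES) as FlagName[]) {
    if (next[other] && REQUIRES[other].some((dep) => dep === flag)) {
      next = disableFlag(next, other);
    }
  }
  return next;
}
```

In this model, enabling inbox.codeTasks from an all-off state turns all three flags on, and disabling ai.agent afterwards turns inbox.codeTasks off while leaving inbox untouched.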

Automatic creation from inbound mail

When inbox.codeTasks is enabled, inbound messages classified as a feature request are automatically turned into queued code tasks via createFromInbound. The call is idempotent per inbound message, so the same message never files two tasks. Manual creation via the codeWorkTasks.create mutation remains available as well. See Current limitations.
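
Idempotency per inbound message can be illustrated with an in-memory stand-in for the queue. This is a sketch under stated assumptions: the `InMemoryQueue` class, the `inboundMessageId` field name, and the generated task ids are hypothetical. Only the createFromInbound name and the "one task per inbound message" guarantee come from this page.

```typescript
// Hypothetical in-memory stand-in for the codeWorkTasks table.
interface QueuedTask {
  id: string;
  description: string;
  stage: "queued";
  inboundMessageId?: string; // assumed field name
}

class InMemoryQueue {
  private tasks: QueuedTask[] = [];

  // Returns the task already filed for this inbound message, if any;
  // otherwise files a new queued task. Safe to call repeatedly.
  createFromInbound(inboundMessageId: string, description: string): QueuedTask {
    const existing = this.tasks.find(
      (t) => t.inboundMessageId === inboundMessageId,
    );
    if (existing) return existing;

    const task: QueuedTask = {
      id: `task_${this.tasks.length + 1}`, // illustrative id scheme
      description,
      stage: "queued",
      inboundMessageId,
    };
    this.tasks.push(task);
    return task;
  }

  count(): number {
    return this.tasks.length;
  }
}
```

Calling createFromInbound twice with the same message id returns the same task and leaves the queue length unchanged.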

Creating and cancelling a task

A task needs one thing: a description of what you want done. Write it the way you'd write a ticket — "Add a CSV export button to the contacts table", "Fix the off-by-one in the digest date range". The clearer the description, the better the coding agent's output, because that text is passed verbatim to the agent and becomes the pull-request summary.

Creating a task requires the organization:manage permission (owners and admins). The new task starts in the queued stage and waits for the code-worker to pick it up.

Creating a task in the dashboard

The Code Tasks dashboard has a New code task button — in the header and in the empty state — that opens a description modal and files the task via the codeWorkTasks.create mutation (the same mutation you can call from the Convex dashboard, a script, or your own tooling). The description needs at least 10 characters. The lifecycle below applies regardless of how the task was filed.
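
The 10-character minimum can be checked before a task is filed. The helper below is a minimal sketch: the function name and error text are hypothetical, and trimming whitespace before counting is an assumption rather than documented behaviour.

```typescript
// From the docs: the description needs at least 10 characters.
const MIN_DESCRIPTION_LENGTH = 10;

// Returns an error message, or null when the description is acceptable.
// Trimming before counting is an assumption, not documented behaviour.
function validateDescription(description: string): string | null {
  const trimmed = description.trim();
  if (trimmed.length < MIN_DESCRIPTION_LENGTH) {
    return `Description must be at least ${MIN_DESCRIPTION_LENGTH} characters`;
  }
  return null;
}
```

A caller would run this check first and only invoke codeWorkTasks.create when it returns null.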

To cancel, open the Code Tasks dashboard and use the Cancel button on a task card. The dashboard's Cancel button only appears on queued or running tasks, but the underlying codeWorkTasks.cancel mutation rejects only merged tasks — testing and review tasks are still cancellable via the backend. Cancelling marks the task failed with the message "Cancelled by user" — there is no separate "cancelled" stage.
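
The difference between the dashboard rule and the backend rule can be sketched as two small functions. The names `showsCancelButton` and `cancelTask`, the `Task` shape, and the rejection message are hypothetical; the stage names, the "only merged is rejected" rule, and the "Cancelled by user" message come from this page.

```typescript
type Stage = "queued" | "running" | "testing" | "review" | "merged" | "failed";

interface Task {
  stage: Stage;
  error?: string;
}

// Dashboard rule: the Cancel button only appears on queued or running tasks.
function showsCancelButton(task: Task): boolean {
  return task.stage === "queued" || task.stage === "running";
}

// Backend rule: looser than the dashboard, only merged tasks are rejected.
// There is no "cancelled" stage, so the task becomes failed with a message.
function cancelTask(task: Task): Task {
  if (task.stage === "merged") {
    throw new Error("Cannot cancel a merged task"); // hypothetical wording
  }
  return { ...task, stage: "failed", error: "Cancelled by user" };
}
```

A task in review therefore has no Cancel button in the dashboard, yet cancelTask still accepts it.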

Task lifecycle and the PR/test/cost surface

A task moves through a fixed set of stages. The code-worker advances it as it works and the dashboard reflects each transition in real time.

Stage | Badge | What it means
queued | Queued | Filed and waiting for the code-worker to claim it.
running | Running (pulsing) | The worker has claimed the task, set up a branch, and is running the coding agent.
testing | Testing | The change is committed; the worker is running the test suite.
review | Review | A pull request has been opened. The change is ready for a human to review and merge.
merged | Merged | The pull request was merged.
failed | Failed | The task failed, was cancelled, or produced no changes. The error is shown on the card.
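
The stages above can be read as a small state machine. The transition map below is inferred from this page (the forward path plus failure from any unfinished stage) and is a simplification, not the backend's actual validation logic; `ALLOWED` and `canTransition` are hypothetical names.

```typescript
type Stage = "queued" | "running" | "testing" | "review" | "merged" | "failed";

// Inferred transitions: the forward path, plus "failed" from any stage
// that has not finished. Treating merged and failed as terminal here is
// a simplification of the documented behaviour.
const ALLOWED: Record<Stage, Stage[]> = {
  queued: ["running", "failed"],
  running: ["testing", "failed"],
  testing: ["review", "failed"],
  review: ["merged", "failed"],
  merged: [],
  failed: [],
};

function canTransition(from: Stage, to: Stage): boolean {
  return ALLOWED[from].some((stage) => stage === to);
}
```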

Each task card surfaces the work product as it becomes available:

Field | When it appears
Branch | Once the worker creates the feature branch (named code-worker/<task-id>).
Pull request | A clickable link to the PR, once the worker opens one.
Test results | A tail of the test-runner output captured during the testing stage.
Error | On failed tasks only: the reason the task stopped.
Cost | The LLM spend for the run, shown to four decimal places (e.g. $0.0123). The field exists in the schema and UI, but the code-worker does not currently report llmCost (it is never sent on completeWithPR/markFailed), so in practice Cost stays empty for worker-run tasks.
Created | A relative timestamp ("just now", "12m ago", "3d ago").
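
The Cost and Created formats can be reproduced with two small helpers. This is a sketch, not the dashboard's code: the function names are hypothetical, and the cut-offs between "just now", minutes, hours, and days are assumptions (the page only shows the three example strings).

```typescript
// Cost to four decimal places; empty when the worker reported nothing.
function formatCost(llmCost: number | undefined): string {
  return llmCost === undefined ? "" : `$${llmCost.toFixed(4)}`;
}

// Relative timestamp. The unit thresholds below are assumptions.
function formatRelative(createdAtMs: number, nowMs: number): string {
  const minutes = Math.floor((nowMs - createdAtMs) / 60_000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```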

Reaching the merged stage

The worker opens the pull request and moves the task to review. To advance it to merged automatically, configure a GitHub webhook pointed at POST /webhooks/github with the secret GITHUB_WEBHOOK_SECRET. When the PR is merged, GitHub notifies Owlat and the task moves to merged via markMergedByPrUrl. The merged stage is reachable in production — review is not terminal.
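
For orientation, here is how a receiver for GitHub's webhook typically validates and interprets a delivery. This is a minimal sketch and not Owlat's handler: the function names are hypothetical, and matching on the PR's html_url is an assumption about what markMergedByPrUrl expects. The X-Hub-Signature-256 header format and the pull_request payload fields are GitHub's documented behaviour.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Checks GitHub's X-Hub-Signature-256 header ("sha256=<hex digest>")
// against the raw request body using the shared webhook secret.
function verifySignature(
  rawBody: string,
  signatureHeader: string,
  secret: string,
): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

interface PullRequestEvent {
  action?: string;
  pull_request?: { merged?: boolean; html_url?: string };
}

// Returns the merged PR's URL for a pull_request "closed" event that was
// actually merged, otherwise null. eventName is the X-GitHub-Event header.
function mergedPrUrl(
  eventName: string,
  payload: PullRequestEvent,
): string | null {
  if (eventName !== "pull_request") return null;
  if (payload.action !== "closed" || !payload.pull_request?.merged) return null;
  return payload.pull_request.html_url ?? null;
}
```

A closed pull request that was not merged yields null, so only real merges would advance a task.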

Setup: the code-worker service and its environment

The code-worker ships as a Docker sidecar. In the bundled Compose stack it runs under the inbox-codetasks profile, which the inbox.codeTasks feature flag turns on. It connects to your Owlat deployment over the Convex client, polls for the next queued task (every 10 seconds by default), and processes one task at a time.
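
The poll-and-process behaviour can be sketched as a loop that awaits each task before polling again. Everything here is illustrative: the `WorkerDeps` interface, `pollLoop`, and `pickOldest` are hypothetical names, and sleeping after every cycle (including cycles that processed a task) is an assumption. The 10-second default and the one-task-at-a-time rule come from this page.

```typescript
interface QueuedTask {
  id: string;
  createdAt: number;
}

// Hypothetical dependencies, injected so the loop can be exercised in tests.
interface WorkerDeps {
  getNextQueued(): Promise<QueuedTask | null>;
  processTask(task: QueuedTask): Promise<void>;
  sleep(ms: number): Promise<void>;
}

const DEFAULT_POLL_INTERVAL_MS = 10_000;

// Model of what the backend is expected to hand out: the oldest queued task.
function pickOldest(queued: QueuedTask[]): QueuedTask | null {
  if (queued.length === 0) return null;
  return queued.reduce((oldest, task) =>
    task.createdAt < oldest.createdAt ? task : oldest,
  );
}

// Runs a fixed number of poll cycles and returns how many tasks ran.
// Awaiting processTask before the next poll gives one task at a time.
async function pollLoop(
  deps: WorkerDeps,
  cycles: number,
  intervalMs: number = DEFAULT_POLL_INTERVAL_MS,
): Promise<number> {
  let processed = 0;
  for (let i = 0; i < cycles; i++) {
    const task = await deps.getNextQueued();
    if (task) {
      await deps.processTask(task);
      processed++;
    }
    await deps.sleep(intervalMs);
  }
  return processed;
}
```

A real worker would loop indefinitely; the cycle count exists only to keep the sketch finite.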

VPS Compose template omits CONVEX_ADMIN_KEY

The worker authenticates with the deployment admin key (CONVEX_ADMIN_KEY), exactly like the imap and mail-sync sidecars, which lets it call the internal queries and mutations it drives the task with (getNextQueued, claim, updateBranch, markTesting, completeWithPR, markFailed). The dev Compose stack supplies this var, but the bundled VPS Compose template currently does not pass CONVEX_ADMIN_KEY to the code-worker service block — so on a stock VPS deploy the worker throws CONVEX_ADMIN_KEY environment variable is required at startup until you add it. See Current limitations.

For each task, the worker works through these steps:

Claim the task

It atomically claims the oldest queued task, moving it to running so no other poll picks it up.

Set up a workspace and branch

It clones your repository (shallow, on the base branch) into a per-task workspace and checks out a new code-worker/<task-id> branch.

Run the coding agent

It runs the coding agent (OpenCode by default) with your task description as the prompt, with a 10-minute timeout. If the agent fails or produces no file changes, the task is marked failed.

Commit and test

It commits the changes, moves the task to testing, and runs the test suite (vitest) with a 5-minute timeout, capturing the output tail.

Push and open a PR

It pushes the branch and, when GitHub credentials are configured, opens a pull request whose body includes your description and the test summary. The task moves to review with the PR URL and test results attached. Cost is not attached, because the worker does not currently report it (see Current limitations).
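
The steps above can be sketched as one orchestration function with the two timeouts applied. This is an illustration under assumptions, not the worker's source: the `Steps` interface, `runTask`, `withTimeout`, and the error strings are hypothetical. The sketch also assumes that a task still moves to review when PR creation is skipped and that test output is recorded without failing the task, neither of which this page confirms.

```typescript
const AGENT_TIMEOUT_MS = 10 * 60 * 1000; // 10-minute coding-agent timeout
const TEST_TIMEOUT_MS = 5 * 60 * 1000; // 5-minute test-suite timeout

function branchNameFor(taskId: string): string {
  return `code-worker/${taskId}`;
}

// Rejects if `work` has not settled within `ms`. It does not stop the
// underlying work; a real worker would also kill the child process.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Hypothetical step implementations, injected by the caller.
interface Steps {
  cloneAndBranch(branch: string): Promise<void>;
  runAgent(prompt: string): Promise<{ changedFiles: number }>;
  commit(): Promise<void>;
  runTests(): Promise<string>; // resolves with the test output tail
  pushAndOpenPr(branch: string, body: string): Promise<string | undefined>;
}

type Outcome =
  | { stage: "review"; prUrl?: string; testResults: string }
  | { stage: "failed"; error: string };

async function runTask(taskId: string, description: string, steps: Steps): Promise<Outcome> {
  const branch = branchNameFor(taskId);
  try {
    await steps.cloneAndBranch(branch);
    const result = await withTimeout(steps.runAgent(description), AGENT_TIMEOUT_MS, "coding agent");
    if (result.changedFiles === 0) {
      return { stage: "failed", error: "Agent produced no changes" };
    }
    await steps.commit();
    const testResults = await withTimeout(steps.runTests(), TEST_TIMEOUT_MS, "test suite");
    const prUrl = await steps.pushAndOpenPr(branch, `${description}\n\n${testResults}`);
    return { stage: "review", prUrl, testResults };
  } catch (err) {
    return { stage: "failed", error: err instanceof Error ? err.message : String(err) };
  }
}
```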

The container reads its configuration from environment variables:

Variable | Purpose
CONVEX_URL | The Owlat backend the worker polls and reports to. Required.
CONVEX_ADMIN_KEY | Deployment admin key the worker authenticates with to call the internal queue queries and mutations. Required: the worker throws at startup if it is unset.
GIT_REPO_URL | The clone URL of the repository the agent works in.
GIT_BASE_BRANCH | Branch to clone and target the PR against (defaults to main).
GITHUB_TOKEN | GitHub token used to open the pull request.
GITHUB_OWNER / GITHUB_REPO | The repo the PR is opened against. If both are unset, the worker pushes the branch but skips PR creation.
LLM_BASE_URL / LLM_API_KEY | LLM configuration for the coding agent. These are the only two LLM vars the worker child actually reads.
LLM_PROVIDER / LLM_MODEL | Passed through Compose env for convenience, but not read by the worker itself.
OPENCODE_BIN | Path to the OpenCode binary (defaults to opencode).
POLL_INTERVAL_MS | How often to poll for queued tasks, in milliseconds (defaults to 10000).
WORKSPACE_ROOT | Where per-task clones live inside the container (defaults to /workspace).
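
The table above maps naturally onto a config loader with defaults and a startup check. This is a sketch, not the worker's actual code: `WorkerConfig`, `loadConfig`, and `required` are hypothetical names, and using the same error wording for CONVEX_URL as for CONVEX_ADMIN_KEY is an assumption. The variable names, defaults, and the CONVEX_ADMIN_KEY error message come from this page.

```typescript
interface WorkerConfig {
  convexUrl: string;
  convexAdminKey: string;
  gitRepoUrl?: string;
  gitBaseBranch: string;
  githubToken?: string;
  githubOwner?: string;
  githubRepo?: string;
  llmBaseUrl?: string;
  llmApiKey?: string;
  opencodeBin: string;
  pollIntervalMs: number;
  workspaceRoot: string;
}

type Env = Record<string, string | undefined>;

// Throws at startup when a required variable is missing or empty.
function required(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`${name} environment variable is required`);
  return value;
}

function loadConfig(env: Env): WorkerConfig {
  return {
    convexUrl: required(env, "CONVEX_URL"),
    convexAdminKey: required(env, "CONVEX_ADMIN_KEY"),
    gitRepoUrl: env["GIT_REPO_URL"],
    gitBaseBranch: env["GIT_BASE_BRANCH"] ?? "main",
    githubToken: env["GITHUB_TOKEN"],
    githubOwner: env["GITHUB_OWNER"],
    githubRepo: env["GITHUB_REPO"],
    llmBaseUrl: env["LLM_BASE_URL"],
    llmApiKey: env["LLM_API_KEY"],
    opencodeBin: env["OPENCODE_BIN"] ?? "opencode",
    pollIntervalMs: Number(env["POLL_INTERVAL_MS"] ?? "10000"),
    workspaceRoot: env["WORKSPACE_ROOT"] ?? "/workspace",
  };
}
```

In a Node process the loader would be called as loadConfig(process.env). LLM_PROVIDER and LLM_MODEL are deliberately absent because the worker does not read them.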

The worker has write access to your repo

The code-worker clones, commits, pushes, and opens pull requests with the credentials you give it, and runs an autonomous coding agent against your codebase. Scope GITHUB_TOKEN to a single repository, point GIT_BASE_BRANCH at a branch you protect with required reviews, and treat every generated pull request as untrusted until you've read it.

For where these variables live in the deployment and how to enable the inbox-codetasks profile, see Self-Hosting Configuration and Feature flags.

Current limitations

Code Tasks is an early feature. Keep these in mind:

  • The VPS Compose template omits CONVEX_ADMIN_KEY. The worker authenticates with the deployment admin key, but the bundled VPS Compose template does not yet pass CONVEX_ADMIN_KEY to the code-worker service, so on a stock VPS deploy the worker won't start until you add that var.
  • Cost is not reported. The code-worker never sends llmCost, so the Cost field stays empty for worker-run tasks even though the schema and UI support it.
  • One task at a time, in order. The worker processes the oldest queued task per poll; there's no parallelism or prioritisation.
  • No automatic retry. A failed task stays failed. File a new task to try again.

For the broader picture of how Owlat's agent handles inbound conversations and drafts, see AI Agent & Autonomy and the Team Inbox.