Code Tasks
Queue coding-agent tasks, watch them move from queued through review, and run the code-worker sidecar that opens the pull requests.
Code Tasks is a queue for the AI coding agent. A task carries a plain-language description of a feature or bug fix; the code-worker sidecar is designed to pick it up, clone your repository, write the change with a coding agent, run the test suite, and open a pull request for you to review. The dashboard at Inbox -> Code Tasks shows every task and its current stage. This is an admin surface — only owners and admins can create or cancel tasks.
The backend queue and the dashboard work today, and the code-worker sidecar authenticates with the deployment admin key, claims queued tasks, and reports progress back. The one real deployment gap is that the bundled VPS Compose template does not yet pass CONVEX_ADMIN_KEY to the code-worker service (see Setup and Current limitations).
What code tasks are
The feature has two halves: the queue that lives in the Owlat backend (the codeWorkTasks table), and the code-worker, a separate Docker container that does the actual coding work. The dashboard only reads and cancels tasks; the heavy lifting happens out-of-process in the sidecar.
Code Tasks is gated behind the Extract code tasks from inbox feature flag (inbox.codeTasks), which is off by default. Because the flag requires both inbox and ai.agent, enabling it also turns those on, and turning ai.agent back off cascades Code Tasks off again. When the flag is on, a Code Tasks entry appears in the Inbox section of the sidebar; when it's off, the sidebar entry disappears and the page itself is gated by the inbox.codeTasks feature requirement. See Feature flags for how to toggle it.
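The cascade described above can be pictured as a small dependency graph. The following is a minimal sketch, not Owlat's flag implementation: the flag names come from the text, while the REQUIRES map and the enable and disable helpers are hypothetical names introduced for illustration. It also assumes that turning off inbox cascades the same way as turning off ai.agent, which the requirement implies but the text does not state.

```typescript
// Hypothetical flag graph. Only the inbox.codeTasks requirement is taken from the docs.
type Flag = "inbox" | "ai.agent" | "inbox.codeTasks";

const REQUIRES: Record<Flag, Flag[]> = {
  "inbox": [],
  "ai.agent": [],
  "inbox.codeTasks": ["inbox", "ai.agent"],
};

// Enabling a flag also enables everything it requires.
function enable(active: Set<Flag>, flag: Flag): Set<Flag> {
  const next = new Set<Flag>(active);
  const visit = (f: Flag): void => {
    next.add(f);
    REQUIRES[f].forEach(visit);
  };
  visit(flag);
  return next;
}

// Disabling a flag also disables every flag that requires it, directly or transitively.
function disable(active: Set<Flag>, flag: Flag): Set<Flag> {
  const next = new Set<Flag>(active);
  next.delete(flag);
  let changed = true;
  while (changed) {
    changed = false;
    for (const f of Array.from(next)) {
      if (REQUIRES[f].some((dep) => !next.has(dep))) {
        next.delete(f);
        changed = true;
      }
    }
  }
  return next;
}
```

Enabling inbox.codeTasks on an empty set yields all three flags; disabling ai.agent afterwards leaves only inbox.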
When inbox.codeTasks is enabled, inbound messages classified as a feature request are automatically turned into queued code tasks via createFromInbound. The call is idempotent per inbound message, so the same message never files two tasks. Manual creation via the codeWorkTasks.create mutation remains available alongside it. See Current limitations.
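To illustrate what "idempotent per inbound message" means in practice, here is a minimal in-memory sketch. It is an assumption-laden stand-in: the real createFromInbound is a backend mutation whose arguments and lookup mechanism are not described here, and the InMemoryQueue class, its fields, and the generated task IDs are hypothetical.

```typescript
interface QueuedTask {
  id: string;
  description: string;
  inboundMessageId: string;
  stage: "queued";
}

// Hypothetical in-memory stand-in for the codeWorkTasks table.
class InMemoryQueue {
  private tasks: QueuedTask[] = [];
  private taskIdByInbound = new Map<string, string>();

  // Returns the existing task ID when this inbound message already filed a task.
  createFromInbound(inboundMessageId: string, description: string): string {
    const existing = this.taskIdByInbound.get(inboundMessageId);
    if (existing !== undefined) return existing;

    const id = `task_${this.tasks.length + 1}`;
    this.tasks.push({ id, description, inboundMessageId, stage: "queued" });
    this.taskIdByInbound.set(inboundMessageId, id);
    return id;
  }

  size(): number {
    return this.tasks.length;
  }
}
```

Calling createFromInbound twice with the same message ID returns the same task ID and leaves a single task in the queue.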
Creating and cancelling a task
A task needs one thing: a description of what you want done. Write it the way you'd write a ticket — "Add a CSV export button to the contacts table", "Fix the off-by-one in the digest date range". The clearer the description, the better the coding agent's output, because that text is passed verbatim to the agent and becomes the pull-request summary.
Creating a task requires the organization:manage permission (owners and admins). The new task starts in the queued stage and waits for the code-worker to pick it up.
The Code Tasks dashboard has a New code task button — in the header and in the empty state — that opens a description modal and files the task via the codeWorkTasks.create mutation (the same mutation you can call from the Convex dashboard, a script, or your own tooling). The description needs at least 10 characters. The lifecycle below applies regardless of how the task was filed.
To cancel, open the Code Tasks dashboard and use the Cancel button on a task card. The dashboard's Cancel button only appears on queued or running tasks, but the underlying codeWorkTasks.cancel mutation rejects only merged tasks — testing and review tasks are still cancellable via the backend. Cancelling marks the task failed with the message "Cancelled by user" — there is no separate "cancelled" stage.
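The difference between what the dashboard offers and what the backend accepts can be sketched as two small functions. This is an illustration of the rules stated above, not the actual codeWorkTasks.cancel code: the function names, the Task shape, and the error text thrown for merged tasks are hypothetical.

```typescript
type Stage = "queued" | "running" | "testing" | "review" | "merged" | "failed";

interface Task {
  stage: Stage;
  error?: string;
}

// Dashboard rule: the Cancel button is only rendered for queued or running tasks.
function showsCancelButton(stage: Stage): boolean {
  return stage === "queued" || stage === "running";
}

// Backend rule: only merged tasks are rejected. Every other stage becomes failed.
function cancel(task: Task): Task {
  if (task.stage === "merged") {
    // The wording of this error is an assumption.
    throw new Error("Cannot cancel a merged task");
  }
  return { ...task, stage: "failed", error: "Cancelled by user" };
}
```

A task in review therefore has no Cancel button on its card, yet calling the mutation directly would still move it to failed.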
Task lifecycle and the PR/test/cost surface
A task moves through a fixed set of stages. The code-worker advances it as it works and the dashboard reflects each transition in real time.
| Stage | Badge | What it means |
|---|---|---|
| queued | Queued | Filed and waiting for the code-worker to claim it. |
| running | Running (pulsing) | The worker has claimed the task, set up a branch, and is running the coding agent. |
| testing | Testing | The change is committed; the worker is running the test suite. |
| review | Review | A pull request has been opened. The change is ready for a human to review and merge. |
| merged | Merged | The pull request was merged. |
| failed | Failed | The task failed, was cancelled, or produced no changes. The error is shown on the card. |
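Read as a state machine, the stages in the table form a single forward path with failed as an exit. The sketch below is one reading of the table, not the backend's validation logic: the NEXT map and canTransition helper are hypothetical, and allowing failed from every non-terminal stage is an assumption based on the cancellation rules.

```typescript
type Stage = "queued" | "running" | "testing" | "review" | "merged" | "failed";

// Assumed transition map: one forward path, plus failed from any non-terminal stage.
const NEXT: Record<Stage, Stage[]> = {
  queued: ["running", "failed"],
  running: ["testing", "failed"],
  testing: ["review", "failed"],
  review: ["merged", "failed"],
  merged: [],
  failed: [],
};

function canTransition(from: Stage, to: Stage): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```

Under this reading, canTransition("review", "merged") holds, while a task can never skip from queued straight to review.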
Each task card surfaces the work product as it becomes available:
| Field | When it appears |
|---|---|
| Branch | Once the worker creates the feature branch (named code-worker/<task-id>). |
| Pull request | A clickable link to the PR, once the worker opens one. |
| Test results | A tail of the test-runner output captured during the testing stage. |
| Error | On failed tasks only — the reason the task stopped. |
| Cost | The LLM spend for the run, shown to four decimal places (e.g. $0.0123). The field exists in the schema and UI, but the code-worker does not currently report llmCost (it is never sent on completeWithPR/markFailed), so in practice Cost stays empty for worker-run tasks. |
| Created | A relative timestamp ("just now", "12m ago", "3d ago"). |
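The Cost and Created formats in the table can be reproduced with two small helpers. This is a sketch under assumptions, not the dashboard's code: the function names are hypothetical, the thresholds between "just now", minutes, and days are guesses, and the hour form ("h ago") is not shown in the text at all.

```typescript
// Cost is shown to four decimal places. An unreported cost renders as empty.
function formatCost(usd: number | undefined): string {
  return usd === undefined ? "" : `$${usd.toFixed(4)}`;
}

// Relative timestamp. The cut-over points and the hour form are assumptions.
function formatRelative(createdAtMs: number, nowMs: number): string {
  const minutes = Math.floor((nowMs - createdAtMs) / 60000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return `${minutes}m ago`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```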
The worker opens the pull request and moves the task to review. To advance it to merged automatically, configure a GitHub webhook pointed at POST /webhooks/github with the secret GITHUB_WEBHOOK_SECRET. When the PR is merged, GitHub notifies Owlat and the task moves to merged via markMergedByPrUrl. The merged stage is reachable in production — review is not terminal.
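GitHub signs each webhook delivery with an HMAC-SHA256 of the raw request body, sent in the X-Hub-Signature-256 header as sha256= followed by the hex digest. The sketch below shows how a receiver could verify that signature and pull the merged PR URL out of a pull_request event. It is not Owlat's handler: the function names are hypothetical, and matching on the PR's html_url is an assumption suggested by the name markMergedByPrUrl.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compares the X-Hub-Signature-256 header against an HMAC of the raw body.
function verifyGithubSignature(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

// Returns the PR URL when the event reports a merged pull request, otherwise null.
function mergedPrUrl(rawBody: string): string | null {
  const event = JSON.parse(rawBody);
  const pr = event.pull_request;
  if (event.action === "closed" && pr && pr.merged === true && typeof pr.html_url === "string") {
    return pr.html_url;
  }
  return null;
}
```

A handler would reject any delivery whose signature does not verify against GITHUB_WEBHOOK_SECRET before looking at the payload. A closed but unmerged pull request yields null and leaves the task in review.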
Setup: the code-worker service and its environment
The code-worker ships as a Docker sidecar. In the bundled Compose stack it runs under the inbox-codetasks profile, which the inbox.codeTasks feature flag turns on. It connects to your Owlat deployment over the Convex client, polls for the next queued task (every 10 seconds by default), and processes one task at a time.
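The poll-and-process behaviour can be sketched as a sequential loop. This is an illustrative outline, not the worker's source: WorkerDeps, runWorker, and the maxPolls parameter (added only so the loop can terminate in a test) are hypothetical, and claimNext stands in for the combination of getNextQueued and claim.

```typescript
interface WorkerDeps<T> {
  claimNext(): Promise<T | null>;     // stand-in for getNextQueued + claim
  process(task: T): Promise<void>;    // the per-task pipeline
  sleep(ms: number): Promise<void>;   // injected so tests need no real timers
}

// Polls once per interval and handles at most one task per poll.
async function runWorker<T>(
  deps: WorkerDeps<T>,
  pollIntervalMs: number = 10000,
  maxPolls: number = Infinity,
): Promise<number> {
  let processed = 0;
  for (let poll = 0; poll < maxPolls; poll++) {
    const task = await deps.claimNext();
    if (task !== null) {
      // Awaited before the next poll, so two tasks never run concurrently.
      await deps.process(task);
      processed += 1;
    }
    await deps.sleep(pollIntervalMs);
  }
  return processed;
}
```

Because each task is awaited to completion before the next poll, a long-running task delays everything queued behind it.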
The worker authenticates with the deployment admin key (CONVEX_ADMIN_KEY), exactly like the imap and mail-sync sidecars, which lets it call the internal queries and mutations it drives the task with (getNextQueued, claim, updateBranch, markTesting, completeWithPR, markFailed). The dev Compose stack supplies this var, but the bundled VPS Compose template currently does not pass CONVEX_ADMIN_KEY to the code-worker service block — so on a stock VPS deploy the worker throws CONVEX_ADMIN_KEY environment variable is required at startup until you add it. See Current limitations.
For each claimed task, the worker runs these steps in order:

1. Claim the task. It atomically claims the oldest queued task, moving it to running so no other poll picks it up.
2. Set up a workspace and branch. It clones your repository (shallow, on the base branch) into a per-task workspace and checks out a new code-worker/<task-id> branch.
3. Run the coding agent. It runs the coding agent (OpenCode by default) with your task description as the prompt, with a 10-minute timeout. If the agent fails or produces no file changes, the task is marked failed.
4. Commit and test. It commits the changes, moves the task to testing, and runs the test suite (vitest) with a 5-minute timeout, capturing the output tail.
5. Push and open a PR. It pushes the branch and, when GitHub credentials are configured, opens a pull request whose body includes your description and the test summary. The task moves to review with the PR URL and test results attached; cost is not currently reported (see Current limitations).
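The steps above can be sketched as one function with the side effects injected. This is a rough outline under assumptions, not the worker's implementation: the Steps and Reporter interfaces and their argument shapes are hypothetical (only the mutation names updateBranch, markTesting, completeWithPR, and markFailed come from the text), the failure messages are invented, and the sketch always continues to the push after the test run because the text does not say whether failing tests block the pull request.

```typescript
interface Task { id: string; description: string }

interface Steps {
  cloneAndBranch(branch: string): Promise<void>;
  runAgent(prompt: string, timeoutMs: number): Promise<{ changedFiles: number }>;
  commit(): Promise<void>;
  runTests(timeoutMs: number): Promise<string>; // resolves to the output tail
  push(branch: string): Promise<void>;
  openPr(description: string, testSummary: string): Promise<string | undefined>;
}

interface Reporter {
  updateBranch(id: string, branch: string): Promise<void>;
  markTesting(id: string): Promise<void>;
  completeWithPR(id: string, prUrl: string | undefined, testResults: string): Promise<void>;
  markFailed(id: string, error: string): Promise<void>;
}

const AGENT_TIMEOUT_MS = 10 * 60 * 1000; // 10-minute agent timeout
const TEST_TIMEOUT_MS = 5 * 60 * 1000;   // 5-minute test timeout

async function processTask(task: Task, steps: Steps, report: Reporter): Promise<void> {
  const branch = `code-worker/${task.id}`;
  try {
    await steps.cloneAndBranch(branch);
    await report.updateBranch(task.id, branch);

    const { changedFiles } = await steps.runAgent(task.description, AGENT_TIMEOUT_MS);
    if (changedFiles === 0) {
      await report.markFailed(task.id, "No changes produced"); // message is an assumption
      return;
    }

    await steps.commit();
    await report.markTesting(task.id);
    const testResults = await steps.runTests(TEST_TIMEOUT_MS);

    await steps.push(branch);
    // openPr resolves to undefined when PR creation is skipped (assumed behaviour).
    const prUrl = await steps.openPr(task.description, testResults);
    await report.completeWithPR(task.id, prUrl, testResults);
  } catch (err) {
    await report.markFailed(task.id, err instanceof Error ? err.message : String(err));
  }
}
```

Injecting the steps keeps the ordering and failure handling testable without a real repository, agent binary, or GitHub token.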
The container reads its configuration from environment variables:
| Variable | Purpose |
|---|---|
| CONVEX_URL | The Owlat backend the worker polls and reports to. Required. |
| CONVEX_ADMIN_KEY | Deployment admin key the worker authenticates with to call the internal queue queries and mutations. Required — the worker throws at startup if it is unset. |
| GIT_REPO_URL | The clone URL of the repository the agent works in. |
| GIT_BASE_BRANCH | Branch to clone and target the PR against (defaults to main). |
| GITHUB_TOKEN | GitHub token used to open the pull request. |
| GITHUB_OWNER / GITHUB_REPO | The repo the PR is opened against. If both are unset, the worker pushes the branch but skips PR creation. |
| LLM_BASE_URL / LLM_API_KEY | LLM configuration for the coding agent. These are the only two LLM vars the worker child actually reads. |
| LLM_PROVIDER / LLM_MODEL | Passed through Compose env for convenience, but not read by the worker itself. |
| OPENCODE_BIN | Path to the OpenCode binary (defaults to opencode). |
| POLL_INTERVAL_MS | How often to poll for queued tasks (defaults to 10000). |
| WORKSPACE_ROOT | Where per-task clones live inside the container (defaults to /workspace). |
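As a way to read the table, the sketch below loads these variables into a typed config with the documented defaults. It is a minimal illustration, not the worker's startup code: loadConfig and the WorkerConfig shape are hypothetical, the CONVEX_URL error text is assumed to mirror the documented CONVEX_ADMIN_KEY message, and requiring token, owner, and repo together before opening a PR is an assumption that goes beyond the "both unset" rule in the table.

```typescript
type Env = Record<string, string | undefined>;

interface WorkerConfig {
  convexUrl: string;
  convexAdminKey: string;
  gitRepoUrl?: string;
  gitBaseBranch: string;
  opencodeBin: string;
  pollIntervalMs: number;
  workspaceRoot: string;
  canOpenPr: boolean;
}

function loadConfig(env: Env): WorkerConfig {
  // Required variables fail fast at startup.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) throw new Error(`${name} environment variable is required`);
    return value;
  };

  return {
    convexUrl: required("CONVEX_URL"),
    convexAdminKey: required("CONVEX_ADMIN_KEY"),
    gitRepoUrl: env.GIT_REPO_URL,
    gitBaseBranch: env.GIT_BASE_BRANCH ?? "main",
    opencodeBin: env.OPENCODE_BIN ?? "opencode",
    pollIntervalMs: Number(env.POLL_INTERVAL_MS ?? "10000"),
    workspaceRoot: env.WORKSPACE_ROOT ?? "/workspace",
    // Assumption: a PR needs the token plus both repo coordinates.
    canOpenPr: Boolean(env.GITHUB_TOKEN && env.GITHUB_OWNER && env.GITHUB_REPO),
  };
}
```

Passing an environment without CONVEX_ADMIN_KEY reproduces the startup failure described for the stock VPS Compose template.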
The code-worker clones, commits, pushes, and opens pull requests with the credentials you give it, and runs an autonomous coding agent against your codebase. Scope GITHUB_TOKEN to a single repository, point GIT_BASE_BRANCH at a branch you protect with required reviews, and treat every generated pull request as untrusted until you've read it.
For where these variables live in the deployment and how to enable the inbox-codetasks profile, see Self-Hosting Configuration and Feature flags.
Current limitations
Code Tasks is an early feature. Keep these in mind:
- The VPS Compose template omits CONVEX_ADMIN_KEY. The worker authenticates with the deployment admin key, but the bundled VPS Compose template does not yet pass CONVEX_ADMIN_KEY to the code-worker service, so on a stock VPS deploy the worker won't start until you add that var.
- Cost is not reported. The code-worker never sends llmCost, so the Cost field stays empty for worker-run tasks even though the schema and UI support it.
- One task at a time, in order. The worker processes the oldest queued task per poll; there's no parallelism or prioritisation.
- No automatic retry. A failed task stays failed. File a new task to try again.
For the broader picture of how Owlat's agent handles inbound conversations and drafts, see AI Agent & Autonomy and the Team Inbox.