Desktop App & Advanced Agents
Technical architecture for the Owlat desktop app, visualization agent, graduated autonomy, and coding agents.
The final phase of the roadmap brings Owlat to the desktop as a native communication channel, adds specialized agents for visualization and coding, and introduces graduated autonomy where organizations fine-tune how much decision-making they delegate to AI.
Desktop App
Why Tauri
The desktop app is built with Tauri v2:
| | Tauri v2 | Electron |
|---|---|---|
| Binary size | ~5–10 MB | ~150–200 MB |
| Memory usage | Lower (shared OS webview) | Higher (bundled Chromium per app) |
| License | MIT | MIT |
| Backend | Rust | Node.js |
| Auto-update | Built-in updater | electron-updater |
| System tray | Native API | Tray API |
Tauri uses the OS native webview (WebKit on macOS, WebView2 on Windows, WebKitGTK on Linux) — no bundled Chromium. The app shell wraps the existing Nuxt web application, reusing ~95% of the UI code through packages/ui.
Architecture
```
apps/desktop/
  src-tauri/            # Rust backend
    src/
      main.rs           # Tauri app setup, window management
      tray.rs           # System tray with unread count badge
      notifications.rs  # Native OS notifications
      updater.rs        # Auto-update via Tauri updater plugin
    Cargo.toml
  src/                  # Frontend (shares web app code)
    main.ts             # Tauri-specific entry, Convex client setup
  tauri.conf.json       # Window config, permissions, deep links
  package.json
```
Key features
System tray — persistent tray icon showing unread count from the verification queue. Clicking opens the app to the inbox view.
Native notifications — when a new item enters the verification queue or a colleague sends a chat message, the app raises a native OS notification. Notifications are driven by Convex reactive queries: the desktop app subscribes to unread-count changes.
Deep links — owlat://thread/{threadId} opens a specific conversation. Deep links work from email notifications, browser bookmarks, and external tools.
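A minimal sketch of how the app might turn an incoming deep link into a route. The helper name `parseOwlatLink` and the return shape are assumptions for illustration; only the `owlat://thread/{threadId}` scheme comes from the text above.

```typescript
// Hypothetical deep-link parser — not the actual implementation.
type DeepLink = { kind: 'thread'; threadId: string } | null;

function tryParseUrl(url: string): URL | null {
  try {
    return new URL(url);
  } catch {
    return null; // not a valid URL at all
  }
}

function parseOwlatLink(url: string): DeepLink {
  const parsed = tryParseUrl(url);
  if (!parsed || parsed.protocol !== 'owlat:') return null;
  // owlat://thread/{threadId} → hostname 'thread', pathname '/{threadId}'
  if (parsed.hostname === 'thread') {
    const threadId = parsed.pathname.replace(/^\//, '');
    return threadId ? { kind: 'thread', threadId } : null;
  }
  return null; // unknown link kind
}
```

Rejecting unknown schemes and empty IDs up front keeps the handler safe when links arrive from untrusted sources such as email clients.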
Owlat as a channel — the desktop app is a first-class channel adapter in the same architecture. Internal chat messages flow through the same unifiedMessages table and the same agent pipeline as email or SMS. The chat adapter is native to Convex — messages are Convex mutations, delivery is real-time subscriptions. No WebSocket server needed beyond what Convex already provides.
Internal chat
Team communication within the desktop app:
- Direct messages — one-to-one conversations between organization members
- Channels — topic-based group conversations (similar to Slack channels)
- Thread-linked — every conversation can reference a conversationThread, linking internal discussion to customer communication
Internal chat messages are unifiedMessages with channel: 'chat' and memberId instead of contactId. The Knowledge Graph extracts knowledge from internal conversations the same way it does from customer emails.
Quick queries
Organization members can ask the system questions directly from the desktop app:
- "What is our current MRR?" → queries billing data
- "When did we last talk to Acme Corp?" → queries conversation threads
- "Show me the contract we signed with them" → semantic file search
Quick queries route through the Agent Pipeline with a specialized "query" classification. The agent retrieves context from the Knowledge Graph and file system, generates an answer with source citations, and renders it inline.
Visualization Agent
A specialized agent that takes data and builds interactive visualizations. It operates within the same Agent Pipeline — its outputs are artifacts that can land in the verification queue, render in conversations, or be pinned to dashboards.
How it works
User: "Show me our email delivery rates for the last 30 days"
→ Agent Pipeline classifies as visualization request
→ Visualization agent queries emailSends for the time range
→ Generates self-contained HTML/CSS/JS
→ Frontend renders in a sandboxed iframe
→ User can interact: hover for details, filter, change time range
The visualization agent generates raw HTML, CSS, and JavaScript — giving it full creative flexibility to produce any visual output. Unlike constrained charting libraries, this approach lets the agent build exactly what the data needs: charts, dashboards, data tables, animated progress trackers, interactive maps, or completely custom visualizations.
- Full flexibility — the agent writes HTML/CSS/JS directly, not limited to a charting library's vocabulary
- Interactive — JavaScript enables hover tooltips, click filtering, animated transitions, real-time updates
- Portable — visualizations are self-contained HTML bundles that can be saved, shared, embedded in reports, or pinned to dashboards
- Sandboxed — rendered in a sandboxed `<iframe>` with `sandbox="allow-scripts"` — no access to the parent page, Convex client, cookies, or navigation. The iframe communicates only via `postMessage` for resize events
Agent-generated code runs in a sandboxed iframe with no access to the host application. The sandbox attribute blocks top-navigation, form submission, popups, and same-origin access. Only allow-scripts is enabled so the visualization's own JavaScript can execute. This prevents any injected code from accessing user sessions, Convex data, or the DOM of the parent application.
Schema
```typescript
visualizations: defineTable({
  organizationId: v.string(),
  title: v.string(),
  description: v.optional(v.string()),
  html: v.string(), // Self-contained HTML document (HTML + CSS + JS)
  dataQuery: v.optional(v.string()), // Convex query to refresh data
  pinned: v.boolean(), // Pinned to dashboard
  createdBy: v.string(), // User or agent ID
  threadId: v.optional(v.id('conversationThreads')),
  createdAt: v.number(),
  updatedAt: v.number(),
})
  .index('by_organization', ['organizationId'])
  .index('by_organization_pinned', ['organizationId', 'pinned'])
```
Rendering
The frontend renders visualizations via a sandboxed iframe:
```html
<iframe
  :srcdoc="visualization.html"
  sandbox="allow-scripts"
  referrerpolicy="no-referrer"
  style="width: 100%; border: none;"
/>
```
The iframe uses srcdoc (no network fetch) and sandbox="allow-scripts" (JS executes, but no DOM access to the parent). A postMessage listener handles resize events so the iframe height adapts to content.
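The parent-side half of that resize handshake could look like the sketch below. The message shape (`{ type: 'owlat:resize', height }`) and the 4000px clamp are assumptions, not the documented protocol; the point is that messages from agent-generated code are validated before they touch layout.

```typescript
// Hypothetical resize-message validator for the postMessage listener.
type ResizeMessage = { type: 'owlat:resize'; height: number };

function parseResizeMessage(data: unknown): ResizeMessage | null {
  if (typeof data !== 'object' || data === null) return null;
  const msg = data as Record<string, unknown>;
  if (msg.type !== 'owlat:resize') return null;
  if (typeof msg.height !== 'number' || !Number.isFinite(msg.height) || msg.height < 0) {
    return null; // reject malformed or hostile payloads
  }
  // Clamp so a misbehaving visualization can't blow up the page layout
  return { type: 'owlat:resize', height: Math.min(msg.height, 4000) };
}

// In the browser this would be wired up roughly as:
// window.addEventListener('message', (e) => {
//   const msg = parseResizeMessage(e.data);
//   if (msg) iframe.style.height = `${msg.height}px`;
// });
```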
Adaptive Dashboard
The dashboard is not a static grid of widgets. It adapts to what the user needs right now — different in the morning than in the evening, different on Monday than on Friday, different for a support lead than for a marketing manager.
Context signals
The dashboard assembles itself from context signals:
| Signal | What it tells us | Example effect |
|---|---|---|
| Time of day | Morning = planning, evening = review | Morning: today's scheduled campaigns, overnight inbound queue. Evening: today's performance summary, pending items for tomorrow |
| Day of week | Monday = catch-up, Friday = wrap-up | Monday: weekend inbound backlog, week's campaign schedule. Friday: weekly metrics, unresolved threads |
| Role | What the user is responsible for | Support lead sees queue depth and SLA status. Marketing manager sees campaign performance and audience growth |
| Recent activity | What the user has been working on | If you spent the last hour on a campaign, the dashboard surfaces its real-time delivery stats |
| Pending items | What needs attention | Verification queue items, campaigns waiting for approval, threads assigned to you |
| Anomalies | What's unusual right now | Bounce rate spike, unusual inbound volume, delivery issues with a specific ISP |
How it works
The dashboard is composed of cards — each card is a self-contained unit that fetches its own data via Convex reactive queries. The dashboard layout engine decides which cards to show, in what order, based on the context signals above.
```typescript
dashboardLayouts: defineTable({
  organizationId: v.string(),
  memberId: v.string(), // Per-user layout
  // Context-driven layout rules
  rules: v.array(v.object({
    condition: v.object({
      timeRange: v.optional(v.object({ // e.g., { start: '06:00', end: '12:00' }
        start: v.string(),
        end: v.string(),
      })),
      dayOfWeek: v.optional(v.array(v.number())), // 0 = Sun, 1 = Mon, etc.
      role: v.optional(v.string()),
    }),
    cards: v.array(v.object({
      type: v.string(), // 'verification_queue', 'campaign_performance', 'inbound_summary', etc.
      size: v.union(v.literal('small'), v.literal('medium'), v.literal('large')),
      config: v.optional(v.string()), // Card-specific config (JSON)
    })),
    priority: v.number(), // Higher-priority rules override lower ones
  })),
  // Pinned cards always show regardless of context
  pinnedCards: v.optional(v.array(v.object({
    type: v.string(),
    size: v.union(v.literal('small'), v.literal('medium'), v.literal('large')),
    config: v.optional(v.string()),
  }))),
  updatedAt: v.number(),
})
  .index('by_member', ['organizationId', 'memberId'])
```
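One plausible shape for the layout engine's rule matching, using simplified versions of the schema's types. The function names and the "highest priority wins" tie-break are assumptions; the condition fields (time range, day of week, role) mirror the schema.

```typescript
// Illustrative rule matcher — a sketch, not the actual layout engine.
type Card = { type: string; size: 'small' | 'medium' | 'large' };
type Rule = {
  condition: { timeRange?: { start: string; end: string }; dayOfWeek?: number[]; role?: string };
  cards: Card[];
  priority: number;
};

function matchesCondition(rule: Rule, now: Date, role: string): boolean {
  const { timeRange, dayOfWeek, role: wantRole } = rule.condition;
  if (timeRange) {
    // 'HH:MM' strings compare correctly lexicographically
    const hhmm = now.toTimeString().slice(0, 5);
    if (hhmm < timeRange.start || hhmm >= timeRange.end) return false;
  }
  if (dayOfWeek && !dayOfWeek.includes(now.getDay())) return false;
  if (wantRole && wantRole !== role) return false;
  return true;
}

function selectCards(rules: Rule[], pinned: Card[], now: Date, role: string): Card[] {
  const matched = rules
    .filter((r) => matchesCondition(r, now, role))
    .sort((a, b) => b.priority - a.priority)[0]; // highest-priority matching rule wins
  return [...pinned, ...(matched?.cards ?? [])]; // pinned cards always show
}
```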
Card types
| Card | What it shows |
|---|---|
| verification_queue | Pending items count, oldest item age, category breakdown |
| campaign_performance | Active/recent campaign metrics (opens, clicks, delivery rate) |
| inbound_summary | Inbound volume, auto-resolved vs human-reviewed, avg response time |
| audience_growth | Contact growth trend, topic subscription changes |
| anomaly_alert | Bounce spikes, delivery issues, unusual patterns |
| scheduled_campaigns | Upcoming sends with countdown timers |
| thread_assignments | Open threads assigned to this user, by priority |
| weekly_summary | Week-over-week comparison of key metrics |
| knowledge_recent | Recently extracted knowledge entries (new facts, decisions) |
| visualization | A pinned visualization (renders the sandboxed HTML/CSS/JS) |
Smart defaults
New users get a sensible default layout generated from their role and the organization's active features. The agent can also suggest layout changes:
"You check campaign performance every morning but it's at the bottom of your dashboard. Want me to move it to the top for your morning view?"
Users can also manually drag, resize, pin, or remove cards — and create multiple named layouts they switch between. The adaptive engine learns from usage patterns: cards the user always expands get promoted, cards they always collapse get demoted or hidden.
Agent-generated dashboard cards
The Visualization Agent can produce cards that appear on the dashboard. When a user asks "Show me weekly churn by segment" and pins the result, it becomes a live dashboard card — re-querying the data on each load and re-rendering in its sandboxed iframe.
Agent Health & Monitoring
As the agent pipeline processes messages across multiple organizations concurrently, the system needs centralized monitoring — not just for debugging, but for safety. A misbehaving LLM provider, a spike in low-confidence classifications, or a surge in rejections is a signal that requires an automated response.
Metrics
The monitoring system tracks per-organization metrics via Convex scheduled functions:
| Metric | What it measures | How it is used |
|---|---|---|
| Queue depth | Unprocessed inbound messages | Alerts when backlog grows beyond threshold |
| Processing latency | Time from message receipt to draft ready | Detects LLM provider slowdowns |
| Classification accuracy | Human corrections vs auto-classifications | Tracks model quality over time |
| Auto-approve ratio | Auto-approved vs human-reviewed | Shows autonomy adoption |
| Rejection rate | Drafts rejected by humans, by category | Identifies categories needing improvement |
| LLM cost | Token usage per organization, per step | Budget tracking and alerting |
| Error rate | Failed pipeline runs, by step | Detects systemic issues |
Circuit breakers
Automated safety mechanisms that activate when metrics cross thresholds:
LLM provider failure — if the LLM error rate exceeds 20% over a 5-minute window, the pipeline pauses auto-responses and queues all messages for human review. The circuit breaker resets after 5 successful calls. This prevents cascading failures from sending garbled responses.
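The failure breaker above could be sketched roughly as follows. The class name and internals are illustrative; only the numbers (20% error rate, 5-minute window, 5-success reset) come from the description.

```typescript
// Hypothetical circuit breaker for LLM provider failures — a sketch only.
class LlmCircuitBreaker {
  private events: { at: number; ok: boolean }[] = [];
  private tripped = false;
  private successStreak = 0;

  constructor(
    private windowMs = 5 * 60_000,       // rolling 5-minute window
    private errorRateThreshold = 0.2,    // trip above 20% errors
    private resetAfterSuccesses = 5,     // close again after 5 good calls
  ) {}

  record(ok: boolean, now = Date.now()): void {
    this.events.push({ at: now, ok });
    this.events = this.events.filter((e) => now - e.at <= this.windowMs);
    if (this.tripped) {
      // While open, only consecutive successes can close the breaker
      this.successStreak = ok ? this.successStreak + 1 : 0;
      if (this.successStreak >= this.resetAfterSuccesses) {
        this.tripped = false;
        this.successStreak = 0;
      }
      return;
    }
    const errors = this.events.filter((e) => !e.ok).length;
    if (this.events.length > 0 && errors / this.events.length > this.errorRateThreshold) {
      this.tripped = true; // pause auto-responses; queue everything for human review
      this.successStreak = 0;
    }
  }

  isOpen(): boolean {
    return this.tripped; // open = auto-responses paused
  }
}
```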
Confidence degradation — if the average classification confidence drops below 0.6 for an organization over the last 50 messages, the system alerts the admin. Common cause: the organization's communication patterns have shifted and the agent needs updated context or knowledge entries.
Rejection spike — if humans reject more than 40% of drafts in a category over the last 24 hours, the system automatically tightens the auto-approval threshold for that category and surfaces a recommendation: "Agent drafts for billing questions are being rejected frequently — consider adding more billing context to the knowledge base."
Rate limiting — per-organization caps on daily LLM calls prevent runaway costs. Configurable in agentConfig:
```typescript
rateLimits: v.optional(v.object({
  maxDailyLLMCalls: v.number(), // e.g., 1000
  maxConcurrentPipelines: v.number(), // e.g., 5
  alertThresholdPercent: v.number(), // e.g., 80 — alert at 80% of daily cap
})),
```
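A minimal sketch of how the pipeline could enforce that config before each LLM call. The field names match the `rateLimits` fragment above; the function and the three-way decision are assumptions.

```typescript
// Illustrative rate-limit gate — not the actual enforcement code.
type RateLimits = {
  maxDailyLLMCalls: number;
  maxConcurrentPipelines: number;
  alertThresholdPercent: number;
};
type RateDecision = 'allow' | 'allow_with_alert' | 'block';

function checkRateLimit(limits: RateLimits, callsToday: number, activePipelines: number): RateDecision {
  if (callsToday >= limits.maxDailyLLMCalls) return 'block';       // daily cap hit
  if (activePipelines >= limits.maxConcurrentPipelines) return 'block'; // concurrency cap hit
  const usedPercent = (callsToday / limits.maxDailyLLMCalls) * 100;
  // Past the alert threshold, still allow but notify the admin
  return usedPercent >= limits.alertThresholdPercent ? 'allow_with_alert' : 'allow';
}
```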
Dashboard integration
Agent health surfaces as dashboard cards in the Adaptive Dashboard:
| Card | What it shows |
|---|---|
| agent_health | Pipeline status (healthy/degraded/paused), active circuit breakers, error rate |
| processing_queue | Current queue depth, average processing time, oldest unprocessed message |
| cost_breakdown | LLM token usage by step (classification, drafting, extraction), daily/weekly trend |
| accuracy_trend | Classification accuracy over time, rejection rate by category, confidence distribution |
These cards are available to all roles but prioritized for admin users by the adaptive layout engine.
Graduated Autonomy
Organizations control how much decision-making they delegate to agents. The system earns trust incrementally.
Per-category rules
```typescript
autonomyRules: defineTable({
  organizationId: v.string(),
  category: v.string(), // "support", "sales", "billing", etc.
  autoApproveThreshold: v.number(), // Confidence threshold (0–1)
  maxDailyAutoActions: v.number(), // Safety cap
  requiresHumanAbove: v.optional(v.number()), // e.g., dollar amount
  enabled: v.boolean(),
  createdAt: v.number(),
  updatedAt: v.number(),
})
  .index('by_organization', ['organizationId'])
  .index('by_organization_and_category', ['organizationId', 'category'])
```
Example configuration
| Category | Threshold | Daily Cap | Notes |
|---|---|---|---|
| Simple acknowledgments | 0.95 | 50 | "Thanks, we'll look into it" |
| Support FAQ | 0.90 | 30 | Standard answers with data lookup |
| Billing questions | 0.85 | 20 | Account-specific responses |
| Sales inquiries | — | — | Always human review |
| Complaints | — | — | Always human review + escalation |
The Agent Pipeline's routing step (Step 5) consults these rules instead of a single global threshold. As confidence in the system grows, organizations expand the boundaries — enabling auto-approval for more categories and lowering thresholds.
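The routing decision could be sketched as below. The rule shape follows the `autonomyRules` schema; the function name, the decision enum, and the "no rule means human review" default are assumptions consistent with the example table (sales and complaints have no threshold, so they always route to a human).

```typescript
// Illustrative Step-5 routing under per-category autonomy rules.
type AutonomyRule = {
  category: string;
  autoApproveThreshold: number;
  maxDailyAutoActions: number;
  requiresHumanAbove?: number; // e.g., dollar amount
  enabled: boolean;
};

function routeDraft(
  rule: AutonomyRule | undefined,
  confidence: number,
  autoActionsToday: number,
  amount?: number, // monetary value involved, if any
): 'auto_approve' | 'human_review' {
  if (!rule || !rule.enabled) return 'human_review'; // no rule = always human
  if (autoActionsToday >= rule.maxDailyAutoActions) return 'human_review'; // safety cap
  if (rule.requiresHumanAbove !== undefined && amount !== undefined && amount > rule.requiresHumanAbove) {
    return 'human_review'; // high-value action needs a person
  }
  return confidence >= rule.autoApproveThreshold ? 'auto_approve' : 'human_review';
}
```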
Feedback loop
When a human rejects an agent draft:
- The rejection reason is stored
- The agent's confidence calibration adjusts for similar future messages
- Recurring rejections for a category automatically tighten the threshold
- The system surfaces patterns: "Agent drafts for billing questions are rejected 30% of the time — consider adding more context to the billing knowledge base"
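The automatic tightening in that loop could be as simple as the sketch below. The 40% trigger matches the rejection-spike breaker described earlier; the 0.05 step size and the cap at 1.0 are assumptions for illustration.

```typescript
// Hypothetical threshold adjustment on a rejection spike — a sketch only.
function tightenThreshold(
  currentThreshold: number,
  rejectionRate: number,
  spikeThreshold = 0.4, // tighten when >40% of drafts are rejected
  step = 0.05,          // assumed step size
): number {
  if (rejectionRate <= spikeThreshold) return currentThreshold;
  // Raise the confidence bar; at 1.0 effectively everything goes to a human
  return Math.min(1, currentThreshold + step);
}
```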
Coding Agents
The most experimental phase: agents that take feature requests and produce working code.
Architecture
Coding agents run as a Docker sidecar — not within Convex — because they need file system access and git operations:
```
Feature request classified by Agent Pipeline
  → agentActions entry with type 'code_request'
  → Code worker picks up the task:
      1. Creates a branch
      2. AI SDK generates code with tool calling (read files, write files, run tests)
      3. Runs test suite
      4. Creates a PR with full context
  → PR link posted to verification queue
  → Developer reviews, improves, merges
```
The code worker communicates with Convex via the client SDK — reading task state, updating progress, posting results. It is an optional service in the Docker Compose configuration:
```yaml
code-worker:
  build: ./apps/code-worker
  volumes:
    - workspace:/workspace
  environment:
    - CONVEX_URL=http://convex:3210
    - LLM_PROVIDER=${LLM_PROVIDER}
    - LLM_BASE_URL=${LLM_BASE_URL}
    - LLM_API_KEY=${LLM_API_KEY}
    - LLM_MODEL=${LLM_MODEL}
  profiles:
    - dev # Only enabled for development-focused deployments
```
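The worker's interaction with Convex could be sketched as a polling loop. Everything here is an assumption: the task shape, the interface method names (`claimNextTask`, etc.), and the injection of the client as an interface so the loop logic stays testable without a running Convex deployment.

```typescript
// Hypothetical code-worker loop — illustrative only, not the real service.
type CodeTask = { id: string; request: string };

interface TaskClient {
  claimNextTask(): Promise<CodeTask | null>;
  reportProgress(id: string, status: string): Promise<void>;
  postResult(id: string, prUrl: string): Promise<void>;
}

// One iteration: claim a task, generate a PR, report back. Returns
// false when the queue is empty so a caller can back off and retry.
async function runOnce(
  client: TaskClient,
  generatePr: (t: CodeTask) => Promise<string>, // branch → codegen → tests → PR
): Promise<boolean> {
  const task = await client.claimNextTask();
  if (!task) return false; // nothing to do
  await client.reportProgress(task.id, 'generating');
  const prUrl = await generatePr(task);
  await client.postResult(task.id, prUrl); // surfaces in the verification queue
  return true;
}
```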
Coding agents are the furthest-out part of the vision. The architecture is designed to support them, but the implementation will evolve significantly based on advances in AI code generation capabilities.