Desktop App & Advanced Agents

Technical architecture for the Owlat desktop app, visualization agent, graduated autonomy, and coding agents.

The final phase of the roadmap brings Owlat to the desktop as a native communication channel, adds specialized agents for visualization and coding, and introduces graduated autonomy where organizations fine-tune how much decision-making they delegate to AI.

Desktop App

Why Tauri

The desktop app is built with Tauri v2:

| | Tauri v2 | Electron |
| --- | --- | --- |
| Binary size | ~5–10 MB | ~150–200 MB |
| Memory usage | Native webview | Bundled Chromium |
| License | MIT | MIT |
| Backend | Rust | Node.js |
| Auto-update | Built-in updater | electron-updater |
| System tray | Native API | Tray API |

Tauri uses the OS native webview (WebKit on macOS, WebView2 on Windows, WebKitGTK on Linux) — no bundled Chromium. The app shell wraps the existing Nuxt web application, reusing ~95% of the UI code through packages/ui.

Architecture

apps/desktop/
  src-tauri/                 # Rust backend
    src/
      main.rs                # Tauri app setup, window management
      tray.rs                # System tray with unread count badge
      notifications.rs       # Native OS notifications
      updater.rs             # Auto-update via Tauri updater plugin
    Cargo.toml
  src/                       # Frontend (shares web app code)
    main.ts                  # Tauri-specific entry, Convex client setup
  tauri.conf.json            # Window config, permissions, deep links
  package.json

Key features

System tray — persistent tray icon showing unread count from the verification queue. Clicking opens the app to the inbox view.

Native notifications — when a new item enters the verification queue or a colleague sends a chat message, a native OS notification fires. Notifications are driven by Convex reactive queries — the desktop app subscribes to unread count changes.

Deep links — owlat://thread/{threadId} opens a specific conversation. Deep links work from email notifications, browser bookmarks, and external tools.

Owlat as a channel — the desktop app is a first-class channel adapter in the same architecture. Internal chat messages flow through the same unifiedMessages table and the same agent pipeline as email or SMS. The chat adapter is native to Convex — messages are Convex mutations, delivery is real-time subscriptions. No WebSocket server needed beyond what Convex already provides.
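
A minimal sketch of the chat adapter's message shape, as plain TypeScript rather than a full Convex mutation. `channel: 'chat'` and `memberId` follow the design described here; the remaining field names (`body`, `threadId`, `createdAt`) are illustrative assumptions:

```typescript
// Hypothetical shape of an internal chat message as a unifiedMessages row.
interface ChatMessageInput {
  organizationId: string;
  memberId: string;   // sender is an org member, not an external contact
  threadId?: string;  // optional link to a conversationThread
  body: string;
}

interface UnifiedChatMessage extends ChatMessageInput {
  channel: 'chat';
  createdAt: number;
}

// Normalizes a chat input into the record a Convex mutation would insert.
function toUnifiedChatMessage(
  input: ChatMessageInput,
  now: number = Date.now(),
): UnifiedChatMessage {
  if (!input.body.trim()) throw new Error('empty message body');
  return { ...input, channel: 'chat', createdAt: now };
}
```

In a real adapter this record would be inserted by a Convex mutation, and delivery would ride on the reactive subscriptions Convex already provides.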

Internal chat

Team communication within the desktop app:

  • Direct messages — one-to-one conversations between organization members
  • Channels — topic-based group conversations (similar to Slack channels)
  • Thread-linked — every conversation can reference a conversationThread, linking internal discussion to customer communication

Internal chat messages are unifiedMessages with channel: 'chat' and memberId instead of contactId. The Knowledge Graph extracts knowledge from internal conversations the same way it does from customer emails.

Quick queries

Organization members can ask the system questions directly from the desktop app:

  • "What is our current MRR?" → queries billing data
  • "When did we last talk to Acme Corp?" → queries conversation threads
  • "Show me the contract we signed with them" → semantic file search

Quick queries route through the Agent Pipeline with a specialized "query" classification. The agent retrieves context from the Knowledge Graph and file system, generates an answer with source citations, and renders it inline.
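
The routing above could be sketched as a simple dispatch from classification to data source. The classification labels and source names below are illustrative assumptions, not the pipeline's actual vocabulary:

```typescript
// Hypothetical routing table for quick queries, mirroring the examples above.
type QuerySource = 'billing' | 'threads' | 'files' | 'knowledge';

const QUERY_ROUTES: Record<string, QuerySource> = {
  revenue_metric: 'billing',    // "What is our current MRR?"
  contact_history: 'threads',   // "When did we last talk to Acme Corp?"
  document_lookup: 'files',     // "Show me the contract we signed with them"
};

// Falls back to the Knowledge Graph when no specific route matches.
function routeQuickQuery(classification: string): QuerySource {
  return QUERY_ROUTES[classification] ?? 'knowledge';
}
```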

Visualization Agent

A specialized agent that takes data and builds interactive visualizations. It operates within the same Agent Pipeline — its outputs are artifacts that can land in the verification queue, render in conversations, or be pinned to dashboards.

How it works

User: "Show me our email delivery rates for the last 30 days"
  → Agent Pipeline classifies as visualization request
  → Visualization agent queries emailSends for the time range
  → Generates self-contained HTML/CSS/JS
  → Frontend renders in a sandboxed iframe
  → User can interact: hover for details, filter, change time range

The visualization agent generates raw HTML, CSS, and JavaScript — giving it full creative flexibility to produce any visual output. Unlike constrained charting libraries, this approach lets the agent build exactly what the data needs: charts, dashboards, data tables, animated progress trackers, interactive maps, or completely custom visualizations.

  • Full flexibility — the agent writes HTML/CSS/JS directly, not limited to a charting library's vocabulary
  • Interactive — JavaScript enables hover tooltips, click filtering, animated transitions, real-time updates
  • Portable — visualizations are self-contained HTML bundles that can be saved, shared, embedded in reports, or pinned to dashboards
  • Sandboxed — rendered in a sandboxed <iframe> with sandbox="allow-scripts" — no access to the parent page, Convex client, cookies, or navigation. The iframe communicates only via postMessage for resize events

Sandboxing is critical

Agent-generated code runs in a sandboxed iframe with no access to the host application. The sandbox attribute blocks top-navigation, form submission, popups, and same-origin access. Only allow-scripts is enabled so the visualization's own JavaScript can execute. This prevents any injected code from accessing user sessions, Convex data, or the DOM of the parent application.

Schema

visualizations: defineTable({
  organizationId: v.string(),
  title: v.string(),
  description: v.optional(v.string()),
  html: v.string(),                // Self-contained HTML document (HTML + CSS + JS)
  dataQuery: v.optional(v.string()), // Convex query to refresh data
  pinned: v.boolean(),             // Pinned to dashboard
  createdBy: v.string(),           // User or agent ID
  threadId: v.optional(v.id('conversationThreads')),
  createdAt: v.number(),
  updatedAt: v.number(),
})
  .index('by_organization', ['organizationId'])
  .index('by_organization_pinned', ['organizationId', 'pinned'])

Rendering

The frontend renders visualizations via a sandboxed iframe:

<iframe
  :srcdoc="visualization.html"
  sandbox="allow-scripts"
  referrerpolicy="no-referrer"
  style="width: 100%; border: none;"
/>

The iframe uses srcdoc (no network fetch) and sandbox="allow-scripts" (JS executes, but no DOM access to the parent). A postMessage listener handles resize events so the iframe height adapts to content.
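
The parent-side handling of those resize messages might look like this sketch. The `{ type: 'resize', height }` payload shape and the height cap are assumptions; the doc only specifies that resize events travel over postMessage:

```typescript
// Validates a message from the sandboxed iframe and returns the new height.
interface ResizeMessage { type: 'resize'; height: number; }

// Cap so a malicious visualization cannot blow up the host layout (assumed limit).
const MAX_HEIGHT = 2000;

// Returns the new iframe height in px, or null if the message is not a valid resize.
function handleFrameMessage(data: unknown): number | null {
  if (typeof data !== 'object' || data === null) return null;
  const msg = data as Partial<ResizeMessage>;
  if (msg.type !== 'resize') return null;
  if (typeof msg.height !== 'number' || !Number.isFinite(msg.height) || msg.height < 0) return null;
  return Math.min(Math.round(msg.height), MAX_HEIGHT);
}
```

In the browser this would be wired up via `window.addEventListener('message', ...)`, checking `event.source` against the iframe's `contentWindow` before applying the height.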

Adaptive Dashboard

The dashboard is not a static grid of widgets. It adapts to what the user needs right now — different in the morning than in the evening, different on Monday than on Friday, different for a support lead than for a marketing manager.

Context signals

The dashboard assembles itself from context signals:

| Signal | What it tells us | Example effect |
| --- | --- | --- |
| Time of day | Morning = planning, evening = review | Morning: today's scheduled campaigns, overnight inbound queue. Evening: today's performance summary, pending items for tomorrow |
| Day of week | Monday = catch-up, Friday = wrap-up | Monday: weekend inbound backlog, week's campaign schedule. Friday: weekly metrics, unresolved threads |
| Role | What the user is responsible for | Support lead sees queue depth and SLA status. Marketing manager sees campaign performance and audience growth |
| Recent activity | What the user has been working on | If you spent the last hour on a campaign, the dashboard surfaces its real-time delivery stats |
| Pending items | What needs attention | Verification queue items, campaigns waiting for approval, threads assigned to you |
| Anomalies | What's unusual right now | Bounce rate spike, unusual inbound volume, delivery issues with a specific ISP |

How it works

The dashboard is composed of cards — each card is a self-contained unit that fetches its own data via Convex reactive queries. The dashboard layout engine decides which cards to show, in what order, based on the context signals above.

dashboardLayouts: defineTable({
  organizationId: v.string(),
  memberId: v.string(),           // Per-user layout
  // Context-driven layout rules
  rules: v.array(v.object({
    condition: v.object({
      timeRange: v.optional(v.object({    // e.g., { start: '06:00', end: '12:00' }
        start: v.string(),
        end: v.string(),
      })),
      dayOfWeek: v.optional(v.array(v.number())),  // 0=Sun, 1=Mon, etc.
      role: v.optional(v.string()),
    }),
    cards: v.array(v.object({
      type: v.string(),           // 'verification_queue', 'campaign_performance', 'inbound_summary', etc.
      size: v.union(v.literal('small'), v.literal('medium'), v.literal('large')),
      config: v.optional(v.string()),  // Card-specific config (JSON)
    })),
    priority: v.number(),         // Higher priority rules override lower ones
  })),
  // Pinned cards always show regardless of context
  pinnedCards: v.optional(v.array(v.object({
    type: v.string(),
    size: v.union(v.literal('small'), v.literal('medium'), v.literal('large')),
    config: v.optional(v.string()),
  }))),
  updatedAt: v.number(),
})
  .index('by_member', ['organizationId', 'memberId'])
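
Given the schema above, the rule matching in the layout engine might look like the following sketch. Time comparison relies on zero-padded 'HH:MM' strings ordering lexicographically; the context shape is an assumption:

```typescript
// Mirrors the dashboardLayouts rule shape above (simplified).
interface LayoutRule {
  condition: {
    timeRange?: { start: string; end: string };  // e.g., { start: '06:00', end: '12:00' }
    dayOfWeek?: number[];                        // 0=Sun, 1=Mon, etc.
    role?: string;
  };
  cards: { type: string; size: 'small' | 'medium' | 'large' }[];
  priority: number;
}

interface LayoutContext { time: string; dayOfWeek: number; role: string; }

// A rule matches when every condition it specifies holds; omitted conditions always match.
function matches(rule: LayoutRule, ctx: LayoutContext): boolean {
  const { timeRange, dayOfWeek, role } = rule.condition;
  if (timeRange && (ctx.time < timeRange.start || ctx.time >= timeRange.end)) return false;
  if (dayOfWeek && !dayOfWeek.includes(ctx.dayOfWeek)) return false;
  if (role && role !== ctx.role) return false;
  return true;
}

// Highest-priority matching rule wins.
function selectCards(rules: LayoutRule[], ctx: LayoutContext) {
  const candidates = rules.filter(r => matches(r, ctx)).sort((a, b) => b.priority - a.priority);
  return candidates[0]?.cards ?? [];
}
```

Pinned cards would then be merged in on top of whatever `selectCards` returns.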

Card types

| Card | What it shows |
| --- | --- |
| verification_queue | Pending items count, oldest item age, category breakdown |
| campaign_performance | Active/recent campaign metrics (opens, clicks, delivery rate) |
| inbound_summary | Inbound volume, auto-resolved vs human-reviewed, avg response time |
| audience_growth | Contact growth trend, topic subscription changes |
| anomaly_alert | Bounce spikes, delivery issues, unusual patterns |
| scheduled_campaigns | Upcoming sends with countdown timers |
| thread_assignments | Open threads assigned to this user, by priority |
| weekly_summary | Week-over-week comparison of key metrics |
| knowledge_recent | Recently extracted knowledge entries (new facts, decisions) |
| visualization | A pinned visualization (renders the sandboxed HTML/CSS/JS) |

Smart defaults

New users get a sensible default layout generated from their role and the organization's active features. The agent can also suggest layout changes:

"You check campaign performance every morning but it's at the bottom of your dashboard. Want me to move it to the top for your morning view?"

Users can also manually drag, resize, pin, or remove cards — and create multiple named layouts they switch between. The adaptive engine learns from usage patterns: cards the user always expands get promoted, cards they always collapse get demoted or hidden.

Agent-generated dashboard cards

The Visualization Agent can produce cards that appear on the dashboard. When a user asks "Show me weekly churn by segment" and pins the result, it becomes a live dashboard card — re-querying the data on each load and re-rendering in its sandboxed iframe.

Agent Health & Monitoring

As the agent pipeline processes messages across multiple organizations concurrently, the system needs centralized monitoring — not just for debugging, but for safety. A misbehaving LLM provider, a spike in low-confidence classifications, or a surge in rejections are signals that require automated response.

Metrics

The monitoring system tracks per-organization metrics via Convex scheduled functions:

| Metric | What it measures | How it is used |
| --- | --- | --- |
| Queue depth | Unprocessed inbound messages | Alerts when backlog grows beyond threshold |
| Processing latency | Time from message receipt to draft ready | Detects LLM provider slowdowns |
| Classification accuracy | Human corrections vs auto-classifications | Tracks model quality over time |
| Auto-approve ratio | Auto-approved vs human-reviewed | Shows autonomy adoption |
| Rejection rate | Drafts rejected by humans, by category | Identifies categories needing improvement |
| LLM cost | Token usage per organization, per step | Budget tracking and alerting |
| Error rate | Failed pipeline runs, by step | Detects systemic issues |

Circuit breakers

Automated safety mechanisms that activate when metrics cross thresholds:

LLM provider failure — if the LLM error rate exceeds 20% over a 5-minute window, the pipeline pauses auto-responses and queues all messages for human review. The circuit breaker resets after 5 successful calls. This prevents cascading failures from sending garbled responses.
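
The provider breaker described above could be sketched as follows. The true implementation would track a 5-minute time window; this sketch simplifies to a fixed sample count, and the minimum-sample guard is an assumption:

```typescript
// Trips when the error rate over a sliding window exceeds the threshold;
// resets after a run of consecutive successful calls (5, per the design above).
class CircuitBreaker {
  private results: boolean[] = [];  // true = successful LLM call
  private consecutiveSuccesses = 0;
  private tripped = false;

  constructor(
    private windowSize = 50,
    private errorThreshold = 0.2,
    private resetAfter = 5,
  ) {}

  record(success: boolean): void {
    this.results.push(success);
    if (this.results.length > this.windowSize) this.results.shift();
    if (this.tripped) {
      // While open, only a run of successes closes the breaker again.
      this.consecutiveSuccesses = success ? this.consecutiveSuccesses + 1 : 0;
      if (this.consecutiveSuccesses >= this.resetAfter) {
        this.tripped = false;
        this.results = [];
      }
      return;
    }
    const errors = this.results.filter(r => !r).length;
    // Require a few samples before tripping so one early failure doesn't open the breaker.
    if (this.results.length >= 5 && errors / this.results.length > this.errorThreshold) {
      this.tripped = true;
      this.consecutiveSuccesses = 0;
    }
  }

  // When open, auto-responses pause and all messages queue for human review.
  get isOpen(): boolean { return this.tripped; }
}
```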

Confidence degradation — if the average classification confidence drops below 0.6 for an organization over the last 50 messages, the system alerts the admin. Common cause: the organization's communication patterns have shifted and the agent needs updated context or knowledge entries.

Rejection spike — if humans reject more than 40% of drafts in a category over the last 24 hours, the system automatically tightens the auto-approval threshold for that category and surfaces a recommendation: "Agent drafts for billing questions are being rejected frequently — consider adding more billing context to the knowledge base."

Rate limiting — per-organization caps on daily LLM calls prevent runaway costs. Configurable in agentConfig:

rateLimits: v.optional(v.object({
  maxDailyLLMCalls: v.number(),       // e.g., 1000
  maxConcurrentPipelines: v.number(), // e.g., 5
  alertThresholdPercent: v.number(),  // e.g., 80 — alert at 80% of daily cap
})),
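
The daily-cap check against this config could look like the sketch below; the decision labels are assumptions:

```typescript
// Mirrors the rateLimits config fields above.
interface RateLimits {
  maxDailyLLMCalls: number;
  maxConcurrentPipelines: number;
  alertThresholdPercent: number;
}

type RateDecision = 'proceed' | 'proceed_with_alert' | 'block';

// Decides what the pipeline should do before making another LLM call today.
function checkRateLimit(callsToday: number, limits: RateLimits): RateDecision {
  if (callsToday >= limits.maxDailyLLMCalls) return 'block';
  const usedPercent = (callsToday / limits.maxDailyLLMCalls) * 100;
  return usedPercent >= limits.alertThresholdPercent ? 'proceed_with_alert' : 'proceed';
}
```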

Dashboard integration

Agent health surfaces as dashboard cards in the Adaptive Dashboard:

| Card | What it shows |
| --- | --- |
| agent_health | Pipeline status (healthy/degraded/paused), active circuit breakers, error rate |
| processing_queue | Current queue depth, average processing time, oldest unprocessed message |
| cost_breakdown | LLM token usage by step (classification, drafting, extraction), daily/weekly trend |
| accuracy_trend | Classification accuracy over time, rejection rate by category, confidence distribution |

These cards are available to all roles but prioritized for admin users by the adaptive layout engine.

Graduated Autonomy

Organizations control how much decision-making they delegate to agents. The system earns trust incrementally.

Per-category rules

autonomyRules: defineTable({
  organizationId: v.string(),
  category: v.string(),           // "support", "sales", "billing", etc.
  autoApproveThreshold: v.number(),  // Confidence threshold (0–1)
  maxDailyAutoActions: v.number(),   // Safety cap
  requiresHumanAbove: v.optional(v.number()),  // e.g., dollar amount
  enabled: v.boolean(),
  createdAt: v.number(),
  updatedAt: v.number(),
})
  .index('by_organization', ['organizationId'])
  .index('by_organization_and_category', ['organizationId', 'category'])

Example configuration

| Category | Threshold | Daily cap | Notes |
| --- | --- | --- | --- |
| Simple acknowledgments | 0.95 | 50 | "Thanks, we'll look into it" |
| Support FAQ | 0.90 | 30 | Standard answers with data lookup |
| Billing questions | 0.85 | 20 | Account-specific responses |
| Sales inquiries | n/a | n/a | Always human review |
| Complaints | n/a | n/a | Always human review + escalation |

The Agent Pipeline's routing step (Step 5) consults these rules instead of a single global threshold. As confidence in the system grows, organizations expand the boundaries — enabling auto-approval for more categories and lowering thresholds.
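
The per-category decision in the routing step might look like this sketch, assuming the autonomyRules fields above; `amount` models the requiresHumanAbove check (e.g., a refund value):

```typescript
// Mirrors the autonomyRules fields above (per-category).
interface AutonomyRule {
  autoApproveThreshold: number;
  maxDailyAutoActions: number;
  requiresHumanAbove?: number;  // e.g., dollar amount
  enabled: boolean;
}

function routeDraft(
  confidence: number,
  rule: AutonomyRule | undefined,
  autoActionsToday: number,
  amount?: number,
): 'auto_approve' | 'human_review' {
  if (!rule || !rule.enabled) return 'human_review';  // no rule = always review
  if (autoActionsToday >= rule.maxDailyAutoActions) return 'human_review';  // safety cap hit
  if (rule.requiresHumanAbove !== undefined && amount !== undefined && amount > rule.requiresHumanAbove)
    return 'human_review';
  return confidence >= rule.autoApproveThreshold ? 'auto_approve' : 'human_review';
}
```

Every guard fails closed to human review, so missing or disabled rules can never widen autonomy by accident.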

Feedback loop

When a human rejects an agent draft:

  1. The rejection reason is stored
  2. The agent's confidence calibration adjusts for similar future messages
  3. Recurring rejections for a category automatically tighten the threshold
  4. The system surfaces patterns: "Agent drafts for billing questions are rejected 30% of the time — consider adding more context to the billing knowledge base"
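
Step 3 could be sketched as below, tied to the 40% rejection-spike trigger from the circuit breakers; the step size and ceiling are illustrative assumptions:

```typescript
// Tightens a category's auto-approval threshold when its rejection rate
// climbs past the trigger (0.4, per the rejection-spike breaker above).
function tightenThreshold(currentThreshold: number, rejectionRate: number): number {
  if (rejectionRate <= 0.4) return currentThreshold;  // within tolerance, leave as-is
  const tightened = currentThreshold + 0.05;          // require more confidence (assumed step)
  return Math.min(tightened, 0.99);                   // ceiling short of disabling auto-approval entirely
}
```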

Coding Agents

The most experimental phase: agents that take feature requests and produce working code.

Architecture

Coding agents run as a Docker sidecar — not within Convex — because they need file system access and git operations:

Feature request classified by Agent Pipeline
  → agentActions entry with type 'code_request'
  → Code worker picks up the task:
    1. Creates a branch
    2. AI SDK generates code with tool calling (read files, write files, run tests)
    3. Runs test suite
    4. Creates a PR with full context
  → PR link posted to verification queue
  → Developer reviews, improves, merges

The code worker communicates with Convex via the client SDK — reading task state, updating progress, posting results. It is an optional service in the Docker Compose:

code-worker:
  build: ./apps/code-worker
  volumes:
    - workspace:/workspace
  environment:
    - CONVEX_URL=http://convex:3210
    - LLM_PROVIDER=${LLM_PROVIDER}
    - LLM_BASE_URL=${LLM_BASE_URL}
    - LLM_API_KEY=${LLM_API_KEY}
    - LLM_MODEL=${LLM_MODEL}
  profiles:
    - dev  # Only enabled for development-focused deployments

Experimental

Coding agents are the furthest-out part of the vision. The architecture is designed to support them, but the implementation will evolve significantly based on advances in AI code generation capabilities.