# ADR-007: Pluggable LLM Provider
Why Owlat uses the Vercel AI SDK with a provider abstraction layer instead of hardcoding a single LLM vendor.
- Status: Accepted
- Date: 2026-03-24
## Context
The Agent Pipeline, Knowledge Graph, and semantic file system all require LLM capabilities — text generation, structured output, embeddings, and tool calling. Owlat needs to support multiple deployment scenarios:
- Cloud users who want the best available models (GPT-4o, Claude, etc.) via API keys
- Self-hosters who need fully offline operation with local models (via Ollama, vLLM, or similar)
- Enterprise users who route through internal API gateways with custom endpoints
The options considered:
- Hardcode OpenAI — simplest, but locks out self-hosters who cannot or will not use cloud APIs
- LangChain/LlamaIndex — heavy frameworks with large dependency trees, complex abstractions, and features Owlat does not need (chains, memory management, vector store adapters)
- Vercel AI SDK with provider abstraction — lightweight, already in the dependency tree (`@ai-sdk/openai`), provider-agnostic, supports structured output and tool calling natively
## Decision
Use the Vercel AI SDK as the LLM orchestration layer, wrapped in a thin provider abstraction configured via environment variables.
```bash
LLM_PROVIDER=openai        # or: anthropic, ollama, custom
LLM_BASE_URL=              # for ollama: http://localhost:11434/v1
LLM_API_KEY=               # not needed for ollama
LLM_MODEL=gpt-4o           # or: claude-sonnet-4-20250514, llama3, etc.
LLM_EMBEDDING_MODEL=       # optional, defaults to the provider's default
```
The AI SDK's `createOpenAI()` factory accepts a `baseURL` parameter, which means any OpenAI-compatible API (Ollama, vLLM, LiteLLM, Azure OpenAI) works without additional provider code. For Anthropic, the AI SDK has a dedicated `@ai-sdk/anthropic` provider.
All LLM calls go through a single `getLLMProvider()` function in `apps/api/convex/lib/llmProvider.ts` that reads these environment variables and returns a configured provider instance.
## Consequences
Enables:
- Self-hosters run Ollama locally for fully offline, zero-cost AI features
- Cloud users choose their preferred provider (OpenAI, Anthropic, or any compatible API)
- Enterprise users point at internal gateways or proxy endpoints via `LLM_BASE_URL`
- Single configuration surface — five environment variables control all LLM behavior
- AI SDK is already a dependency (`@ai-sdk/openai` in both `apps/api` and `apps/web`)
Trade-offs:
- Quality varies significantly between providers — local models may produce lower-quality classifications and drafts than GPT-4o or Claude
- Embedding dimensions differ across models — vector indexes need to be configured for the chosen embedding model's dimensions
- No built-in RAG chains — retrieval-augmented generation is implemented as explicit Convex function steps (query vector index, pass results to prompt), which is more verbose but more debuggable
- AI SDK updates may introduce breaking changes, though the abstraction layer isolates application code
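The embedding-dimension trade-off shows up concretely in the Convex schema: a vector index is declared with a fixed dimension count that must match the configured embedding model's output size. An illustrative fragment (table and field names are assumptions, not Owlat's actual schema):

```typescript
// convex/schema.ts — illustrative fragment only.
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  notes: defineTable({
    body: v.string(),
    embedding: v.array(v.float64()),
  }).vectorIndex("by_embedding", {
    vectorField: "embedding",
    // Must match LLM_EMBEDDING_MODEL's output size, e.g. 1536 for
    // OpenAI's text-embedding-3-small or 768 for nomic-embed-text.
    dimensions: 1536,
  }),
});
```

Switching embedding models with a different dimensionality therefore requires updating the index definition and re-embedding existing documents.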