Email Security
Content scanning, attachment validation, URL reputation checking, and malware detection for outbound emails.
All outbound email passes through multi-layered security scanning before delivery. The @owlat/email-scanner package provides all scanning logic as a shared library, consumed by both apps/api (Convex) and apps/mta.
Content Scanning
Analyzes email subject and HTML body for malicious or unwanted content. All scanners are pure TypeScript with zero dependencies, safe for the Convex serverless runtime.
Spam Keywords
40+ weighted patterns detect common spam phrases:
| Category | Examples | Severity |
|---|---|---|
| Financial scams | "free money", "million dollars", "wire transfer" | High (20 pts) |
| Urgency tactics | "act now", "limited time", "expires today" | Medium (10 pts) |
| Suspicious phrases | "click here", "no obligation", "satisfaction guaranteed" | Low (3 pts) |
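The weighted matching in the table above could be sketched roughly as follows. The names `SpamPattern`, `SAMPLE_PATTERNS`, and `scanSpamKeywords` are hypothetical, and the pattern list is only a small sample of the phrases shown; this is not the package's actual implementation.

```typescript
// Hypothetical sketch: names and the pattern list are illustrative only.
type Severity = "high" | "medium" | "low";

interface SpamPattern {
  pattern: RegExp;
  severity: Severity;
}

interface KeywordFlag {
  match: string;
  severity: Severity;
  points: number;
}

// Point values taken from the severity column in the table above.
const SEVERITY_POINTS: Record<Severity, number> = { high: 20, medium: 10, low: 3 };

// A small sample of the listed phrases; the real list has 40+ entries.
const SAMPLE_PATTERNS: SpamPattern[] = [
  { pattern: /\bfree money\b/i, severity: "high" },
  { pattern: /\bwire transfer\b/i, severity: "high" },
  { pattern: /\bact now\b/i, severity: "medium" },
  { pattern: /\blimited time\b/i, severity: "medium" },
  { pattern: /\bclick here\b/i, severity: "low" },
];

// Returns one flag per pattern that occurs at least once in the text.
function scanSpamKeywords(text: string): KeywordFlag[] {
  const flags: KeywordFlag[] = [];
  for (const { pattern, severity } of SAMPLE_PATTERNS) {
    const found = pattern.exec(text);
    if (found) {
      flags.push({ match: found[0], severity, points: SEVERITY_POINTS[severity] });
    }
  }
  return flags;
}
```

Counting each pattern once, rather than once per occurrence, is a design assumption here; it keeps a single repeated phrase from dominating the score.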
Phishing URL Detection
- URL shortener detection (bit.ly, t.co, goo.gl, etc.)
- Anchor/href domain mismatch (link text says paypal.com but links to evil.com)
- Suspicious URL patterns (IP addresses in URLs, excessive subdomains)
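As a rough illustration of the shortener and anchor/href checks, the sketch below extracts anchors with a regular expression (an assumption, chosen because it needs no DOM) and compares the domain shown in the link text with the real target host. `inspectLinks`, `LinkFinding`, and the three-entry shortener list are hypothetical.

```typescript
// Hypothetical sketch; the regex-based extraction is an assumption.
interface LinkFinding {
  kind: "mismatch" | "shortener";
  shown?: string;
  actual: string;
}

const SHORTENERS = new Set(["bit.ly", "t.co", "goo.gl"]);
const DOMAIN_RE = /((?:[a-z0-9-]+\.)+[a-z]{2,})/i;

function normalizeHost(host: string): string {
  return host.toLowerCase().replace(/^www\./, "");
}

function inspectLinks(html: string): LinkFinding[] {
  const findings: LinkFinding[] = [];
  // Created per call so lastIndex state is never shared between scans.
  const anchorRe = /<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["'][^>]*>([\s\S]*?)<\/a>/gi;
  let m: RegExpExecArray | null;
  while ((m = anchorRe.exec(html)) !== null) {
    let actual: string;
    try {
      actual = normalizeHost(new URL(m[1]).hostname);
    } catch {
      continue; // relative or malformed href: nothing to compare
    }
    if (SHORTENERS.has(actual)) findings.push({ kind: "shortener", actual });

    // Strip nested tags, then look for a domain-like token in the link text.
    const text = m[2].replace(/<[^>]+>/g, "");
    const shownMatch = DOMAIN_RE.exec(text);
    if (!shownMatch) continue;
    const shown = normalizeHost(shownMatch[1]);
    // Subdomains of the displayed domain are treated as matching.
    if (actual !== shown && !actual.endsWith("." + shown)) {
      findings.push({ kind: "mismatch", shown, actual });
    }
  }
  return findings;
}
```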
Homoglyph / Unicode Spoofing
Detects mixed-script characters used to impersonate legitimate domains:
- ~50 confusable character mappings (Cyrillic а U+0430 vs Latin a U+0061, Greek ο U+03BF vs Latin o, etc.)
- Mixed-script detection in link text and URL hostnames
- Severity: high (20 pts) — homoglyph spoofing is a strong phishing indicator
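A minimal sketch of mixed-script detection, assuming a check based on Unicode block ranges and a tiny subset of the confusable mappings. `isMixedScript`, `toAsciiSkeleton`, and the six-entry `CONFUSABLES` table are hypothetical names for illustration.

```typescript
// Hypothetical sketch: a six-entry subset of the ~50 confusable mappings.
const CONFUSABLES: Record<string, string> = {
  "\u0430": "a", // Cyrillic a
  "\u0435": "e", // Cyrillic ie
  "\u043E": "o", // Cyrillic o
  "\u0440": "p", // Cyrillic er
  "\u0441": "c", // Cyrillic es
  "\u03BF": "o", // Greek omicron
};

type Script = "latin" | "cyrillic" | "greek";

// Classifies a character by Unicode block; digits, dots, and hyphens return null.
function scriptOf(ch: string): Script | null {
  const cp = ch.codePointAt(0) ?? 0;
  if ((cp >= 0x41 && cp <= 0x5a) || (cp >= 0x61 && cp <= 0x7a)) return "latin";
  if (cp >= 0x0400 && cp <= 0x04ff) return "cyrillic";
  if (cp >= 0x0370 && cp <= 0x03ff) return "greek";
  return null;
}

// True when a hostname or link text mixes letters from more than one script.
function isMixedScript(label: string): boolean {
  const scripts = new Set<Script>();
  for (const ch of Array.from(label)) {
    const script = scriptOf(ch);
    if (script) scripts.add(script);
  }
  return scripts.size > 1;
}

// Maps known confusables to their Latin look-alikes, to reveal the imitated domain.
function toAsciiSkeleton(label: string): string {
  return Array.from(label, (ch) => CONFUSABLES[ch] ?? ch).join("");
}
```

For example, `isMixedScript("p\u0430ypal.com")` is true because the second letter is Cyrillic, and `toAsciiSkeleton` turns the same string into `paypal.com`.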
Prohibited Content
Pattern matching for high-severity scam content:
- Advance fee fraud patterns ("beneficiary", "next of kin", "unclaimed funds")
- Credential phishing ("verify your account", "confirm your password", "update your payment")
Subject Line Analysis
- ALL CAPS abuse (>50% uppercase characters)
- Excessive punctuation (3+ consecutive ! or ?)
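The two subject checks might look like the sketch below. Two details are assumptions: the uppercase ratio is measured against ASCII letters only, and a mixed run such as `?!?` counts as consecutive punctuation. `analyzeSubject` and the flag names are hypothetical.

```typescript
// Hypothetical sketch of the two subject-line checks.
type SubjectFlag = "all_caps" | "excessive_punctuation";

function analyzeSubject(subject: string): SubjectFlag[] {
  const flags: SubjectFlag[] = [];

  // Assumption: the ratio is computed over ASCII letters, ignoring digits and symbols.
  const letters = subject.replace(/[^A-Za-z]/g, "");
  const upperCount = letters.replace(/[^A-Z]/g, "").length;
  if (letters.length > 0 && upperCount / letters.length > 0.5) {
    flags.push("all_caps");
  }

  // Three or more consecutive "!" or "?" characters, in any mix.
  if (/[!?]{3,}/.test(subject)) {
    flags.push("excessive_punctuation");
  }
  return flags;
}
```

A production check would probably also require a minimum subject length, so that short subjects such as "OK" are not flagged as ALL CAPS.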
Scoring
All flags contribute to a composite score:
| Severity | Points | Example |
|---|---|---|
| High | 20 | Homoglyph spoofing, credential phishing |
| Medium | 10 | URL shorteners, urgency-style spam keywords |
| Low | 3 | Excessive punctuation, minor spam phrases |
Thresholds:
| Score | Level | Action |
|---|---|---|
| 0–14 | Clean | Allowed |
| 15–39 | Suspicious | Allowed with warning stored |
| 40+ | Blocked | Send rejected |
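The point values and thresholds above reduce to a small pure function. The sketch below uses hypothetical names (`scoreFlags`, `Level`) and assumes flags are simply summed.

```typescript
// Hypothetical sketch: sums severity points and maps the total to a level.
type Severity = "high" | "medium" | "low";
type Level = "clean" | "suspicious" | "blocked";

const POINTS: Record<Severity, number> = { high: 20, medium: 10, low: 3 };

function scoreFlags(flags: Array<{ severity: Severity }>): { score: number; level: Level } {
  const score = flags.reduce((sum, flag) => sum + POINTS[flag.severity], 0);
  // Thresholds from the table: 0-14 clean, 15-39 suspicious, 40+ blocked.
  const level: Level = score >= 40 ? "blocked" : score >= 15 ? "suspicious" : "clean";
  return { score, level };
}
```

With these values, a single high-severity flag (20 points) makes an email suspicious, and two high-severity flags (40 points) block the send.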
File Validation
Validates attachments and media uploads before storage or sending. Pure TypeScript, no external dependencies.
Magic Bytes Detection
Identifies the real file type from the binary header (the first 16 bytes), regardless of the file extension:
| File Type | Magic Bytes |
|---|---|
| PE executable (.exe, .dll) | 4D 5A (MZ) |
| ELF binary | 7F 45 4C 46 |
| MSI installer | D0 CF 11 E0 (OLE compound) |
| PDF | 25 50 44 46 (%PDF) |
| PNG | 89 50 4E 47 |
| JPEG | FF D8 FF |
| ZIP/DOCX/XLSX | 50 4B 03 04 (PK) |
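Signature matching against the table above can be sketched as a prefix comparison. `detectFileType` and the type labels are hypothetical; the byte sequences come from the table.

```typescript
// Hypothetical sketch: prefix-matches a buffer against the signatures listed above.
interface Signature {
  type: string;
  bytes: number[];
}

const SIGNATURES: Signature[] = [
  { type: "elf", bytes: [0x7f, 0x45, 0x4c, 0x46] },
  // OLE compound files cover MSI installers and also legacy Office formats.
  { type: "ole", bytes: [0xd0, 0xcf, 0x11, 0xe0] },
  { type: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] },
  { type: "png", bytes: [0x89, 0x50, 0x4e, 0x47] },
  { type: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] },
  { type: "jpeg", bytes: [0xff, 0xd8, 0xff] },
  { type: "pe", bytes: [0x4d, 0x5a] },
];

// Returns the detected type label, or null when no signature matches.
function detectFileType(data: Uint8Array): string | null {
  for (const { type, bytes } of SIGNATURES) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return type;
    }
  }
  return null;
}
```

A caller would typically compare the detected type with the type implied by the file extension and reject the file on a mismatch, for example a `.pdf` whose header reports `pe`.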
Double Extension Detection
Catches attacks that hide executable extensions after document extensions:
- invoice.pdf.exe: detected and blocked
- report.docx.js: detected and blocked
- photo.jpg.scr: detected and blocked
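One way to express this check is to look at the last two extensions of the filename. Both extension sets below are illustrative assumptions, as is the name `hasDoubleExtension`.

```typescript
// Hypothetical sketch; both extension sets are illustrative, not the package's lists.
const EXECUTABLE_EXTENSIONS = new Set(["exe", "js", "scr", "bat", "cmd", "com", "vbs", "msi", "dll"]);
const DOCUMENT_EXTENSIONS = new Set(["pdf", "doc", "docx", "xls", "xlsx", "jpg", "jpeg", "png", "txt"]);

// True when an executable extension directly follows a document-like extension.
function hasDoubleExtension(filename: string): boolean {
  const parts = filename.toLowerCase().split(".");
  if (parts.length < 3) return false; // needs a name plus two extensions
  const [previous, last] = parts.slice(-2);
  return EXECUTABLE_EXTENSIONS.has(last) && DOCUMENT_EXTENSIONS.has(previous);
}
```

Requiring a document-like extension in the second-to-last position keeps legitimate names such as `archive.tar.gz` or `release.v2.pdf` from being flagged.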
Extension Allowlist
Permitted file types (everything else is blocked):
| Category | Extensions |
|---|---|
| Images | .jpg, .jpeg, .png, .gif, .webp, .svg, .ico, .bmp, .tiff |
| Documents | .pdf, .doc, .docx, .odt, .rtf, .txt |
| Spreadsheets | .xls, .xlsx, .csv, .ods |
| Archives | .zip, .gz, .tar |
MIME Type Allowlist
Validates declared content types against an allowlist of safe MIME types (e.g., image/*, application/pdf, text/plain).
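The two allowlists can be combined in a single policy check, sketched below. The extension set mirrors the table above; the MIME list contains only the three examples given in this section, so the real allowlist is presumably longer. `checkFilePolicy` is a hypothetical name.

```typescript
// Hypothetical sketch of an allowlist check for extension and declared MIME type.
const ALLOWED_EXTENSIONS = new Set([
  "jpg", "jpeg", "png", "gif", "webp", "svg", "ico", "bmp", "tiff", // images
  "pdf", "doc", "docx", "odt", "rtf", "txt", // documents
  "xls", "xlsx", "csv", "ods", // spreadsheets
  "zip", "gz", "tar", // archives
]);

// Only the examples named in the text; the full list is not documented here.
const ALLOWED_MIME_PATTERNS = ["image/*", "application/pdf", "text/plain"];

function mimeMatches(mimeType: string, pattern: string): boolean {
  // "image/*" matches any subtype under "image/".
  if (pattern.endsWith("/*")) return mimeType.startsWith(pattern.slice(0, -1));
  return mimeType === pattern;
}

function checkFilePolicy(filename: string, mimeType: string): { allowed: boolean; reason?: string } {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot + 1).toLowerCase() : "";
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    return { allowed: false, reason: `extension "${ext}" is not on the allowlist` };
  }
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mime = mimeType.split(";")[0].trim().toLowerCase();
  if (!ALLOWED_MIME_PATTERNS.some((pattern) => mimeMatches(mime, pattern))) {
    return { allowed: false, reason: `MIME type "${mime}" is not on the allowlist` };
  }
  return { allowed: true };
}
```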
Integration Points
- emailWorker.ts: validates each attachment buffer before sending
- mediaAssets.ts: validates uploads before storing to Convex file storage
URL Reputation
Checks URLs in email content against the Google Safe Browsing API v4.
How It Works
- Extract all URLs from the email HTML content
- Normalize and hash URLs (SHA-256)
- Check cache (urlReputationCache table) for known verdicts
- Batch-check uncached URLs against Safe Browsing API (up to 500 per request)
- Cache results (24h for clean, 1h for flagged)
- Convert flagged URLs to ContentFlag entries with severity 'high'
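The hash, cache, and batch steps above could be wired together roughly as follows. This is a sketch under assumptions: `ReputationCache`, `checkUrls`, and the injected `lookupBatch` callback are hypothetical, normalization is reduced to URL parsing plus fragment removal, and the Node.js `crypto` module stands in for whatever hashing the package uses.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of the cache-then-batch flow; all names are illustrative.
type Verdict = "clean" | "flagged";

interface CacheEntry {
  verdict: Verdict;
  expiresAt: number; // epoch milliseconds
}

interface ReputationCache {
  get(hash: string): Promise<CacheEntry | undefined>;
  set(hash: string, entry: CacheEntry): Promise<void>;
}

const HOUR_MS = 60 * 60 * 1000;
const TTL_MS: Record<Verdict, number> = { clean: 24 * HOUR_MS, flagged: HOUR_MS };
const BATCH_SIZE = 500;

// Simplified normalization: parse the URL and drop the fragment before hashing.
function hashUrl(url: string): string {
  const parsed = new URL(url);
  parsed.hash = "";
  return createHash("sha256").update(parsed.href).digest("hex");
}

async function checkUrls(
  urls: string[],
  cache: ReputationCache,
  lookupBatch: (batch: string[]) => Promise<Set<string>>, // resolves to the flagged URLs
  now: number = Date.now(),
): Promise<Map<string, Verdict>> {
  const verdicts = new Map<string, Verdict>();
  const uncached: string[] = [];

  for (const url of Array.from(new Set(urls))) {
    const entry = await cache.get(hashUrl(url));
    if (entry && entry.expiresAt > now) verdicts.set(url, entry.verdict);
    else uncached.push(url);
  }

  for (let i = 0; i < uncached.length; i += BATCH_SIZE) {
    const batch = uncached.slice(i, i + BATCH_SIZE);
    const flagged = await lookupBatch(batch);
    for (const url of batch) {
      const verdict: Verdict = flagged.has(url) ? "flagged" : "clean";
      verdicts.set(url, verdict);
      await cache.set(hashUrl(url), { verdict, expiresAt: now + TTL_MS[verdict] });
    }
  }
  return verdicts;
}
```

The shorter lifetime for flagged verdicts means a site that has been cleaned up is rechecked within an hour, while clean results are reused for a full day.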
Threat Types
| Threat | Description |
|---|---|
| MALWARE | Sites hosting malicious software |
| SOCIAL_ENGINEERING | Phishing and deceptive sites |
| UNWANTED_SOFTWARE | Sites distributing unwanted software |
| POTENTIALLY_HARMFUL_APPLICATION | Mobile app threats |
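For reference, a request covering these four threat types uses the Safe Browsing v4 `threatMatches:find` method (a POST to `https://safebrowsing.googleapis.com/v4/threatMatches:find?key=...`). The sketch below builds the request body and reads the response; the function names and the `clientId`/`clientVersion` values are placeholders, not the package's actual values.

```typescript
// Sketch of a threatMatches:find request body; client values are placeholders.
const THREAT_TYPES = [
  "MALWARE",
  "SOCIAL_ENGINEERING",
  "UNWANTED_SOFTWARE",
  "POTENTIALLY_HARMFUL_APPLICATION",
];

const MAX_URLS_PER_REQUEST = 500;

function buildThreatMatchesRequest(urls: string[]) {
  if (urls.length > MAX_URLS_PER_REQUEST) {
    throw new Error(`at most ${MAX_URLS_PER_REQUEST} URLs per request`);
  }
  return {
    client: { clientId: "example-client", clientVersion: "0.0.0" }, // placeholder values
    threatInfo: {
      threatTypes: THREAT_TYPES,
      platformTypes: ["ANY_PLATFORM"],
      threatEntryTypes: ["URL"],
      threatEntries: urls.map((url) => ({ url })),
    },
  };
}

// The API returns an empty object when nothing matches, so "matches" may be absent.
interface ThreatMatchesResponse {
  matches?: Array<{ threatType: string; threat: { url: string } }>;
}

function flaggedUrlsFrom(response: ThreatMatchesResponse): Map<string, string> {
  const flagged = new Map<string, string>();
  for (const match of response.matches ?? []) {
    flagged.set(match.threat.url, match.threatType);
  }
  return flagged;
}
```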
Campaign vs Transactional
| Email Type | Behavior |
|---|---|
| Campaign sends | Blocking gate — flagged URLs prevent the campaign from sending |
| Transactional sends | Graceful — flags stored for review, delivery not blocked |
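The difference between the two send types comes down to whether flags gate delivery. A minimal sketch, with hypothetical names (`applyUrlFlags`, `UrlFlag`):

```typescript
// Hypothetical sketch: campaign sends are gated by flags, transactional sends are not.
type EmailType = "campaign" | "transactional";

interface UrlFlag {
  url: string;
  threatType: string;
}

function applyUrlFlags(emailType: EmailType, flags: UrlFlag[]): { send: boolean; storedFlags: UrlFlag[] } {
  // Transactional email is always delivered; a campaign is sent only when no URL is flagged.
  const send = emailType === "transactional" || flags.length === 0;
  // Flags are kept in both cases so they can be reviewed later.
  return { send, storedFlags: flags };
}
```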
Configuration
Requires the GOOGLE_SAFE_BROWSING_API_KEY environment variable to be set in the Convex dashboard. The free tier allows 10,000 requests per day. When the API key is not configured, URL reputation checking is silently skipped.
ClamAV Malware Scanning
Scans attachment binary data for known malware signatures using ClamAV running as a Docker sidecar alongside the MTA.
Architecture
Convex emailWorker
│
│ POST /scan/attachment
│ (binary data + X-Filename header)
▼
MTA scan endpoint (src/routes/scan.ts)
│
├─ File type validation (magic bytes, extensions)
│
└─ ClamAV scan (TCP INSTREAM protocol)
│
▼
clamd (port 3310)
INSTREAM Protocol
The @owlat/email-scanner ClamAV client communicates with clamd over TCP using the INSTREAM protocol:
- Send zINSTREAM\0
- Send chunked binary data (4-byte big-endian length prefix + data)
- Send zero-length terminator
- Read verdict: stream: OK\0 or stream: <virus_name> FOUND\0
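The framing in these steps can be shown without opening a socket. The sketch below builds the byte sequences a client would write and parses the reply string; `buildInstreamFrames`, `parseVerdict`, and the 64 KiB default chunk size are assumptions for illustration, not the package's client code.

```typescript
// Hypothetical sketch of INSTREAM framing; socket handling is intentionally left out.
function buildInstreamFrames(data: Uint8Array, chunkSize: number = 64 * 1024): Uint8Array[] {
  // Command first: "zINSTREAM" followed by a NUL terminator.
  const frames: Uint8Array[] = [new TextEncoder().encode("zINSTREAM\0")];

  for (let offset = 0; offset < data.length; offset += chunkSize) {
    const chunk = data.subarray(offset, offset + chunkSize);
    const header = new Uint8Array(4);
    // 4-byte length prefix in big-endian (network) byte order.
    new DataView(header.buffer).setUint32(0, chunk.length, false);
    frames.push(header, chunk);
  }

  // A zero-length chunk tells clamd that the stream is complete.
  frames.push(new Uint8Array(4));
  return frames;
}

function parseVerdict(reply: string): { clean: boolean; virus?: string } {
  const text = reply.replace(/\0+$/, "").trim();
  if (text === "stream: OK") return { clean: true };
  const found = /^stream: (.+) FOUND$/.exec(text);
  if (found) return { clean: false, virus: found[1] };
  // Anything else (for example an ERROR reply) is treated as a failed scan.
  throw new Error(`Unexpected clamd reply: ${text}`);
}
```

A client would write each frame to the TCP socket in order and then pass the accumulated reply to `parseVerdict`.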
Fail-Open Design
If ClamAV is unavailable (container not running, network error, timeout), the scan returns { clean: true, skipped: true } and logs a warning. This prevents ClamAV outages from blocking all email delivery.
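A fail-open wrapper along these lines might look like the following sketch. `scanFailOpen`, its parameters, and the 10-second default timeout are assumptions; only the `{ clean: true, skipped: true }` result shape comes from the text above.

```typescript
// Hypothetical sketch of a fail-open wrapper around an arbitrary scan function.
interface ScanOutcome {
  clean: boolean;
  skipped?: boolean;
  virus?: string;
}

async function scanFailOpen(
  scan: () => Promise<ScanOutcome>,
  timeoutMs: number = 10000,
  warn: (message: string) => void = console.warn,
): Promise<ScanOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("ClamAV scan timed out")), timeoutMs);
  });

  try {
    // A completed scan is returned as-is, including { clean: false } for detected malware.
    return await Promise.race([scan(), timeout]);
  } catch (err) {
    // Connection errors and timeouts are logged and treated as "skipped".
    const reason = err instanceof Error ? err.message : String(err);
    warn(`ClamAV unavailable, skipping scan: ${reason}`);
    return { clean: true, skipped: true };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Only infrastructure failures fall through to the skipped result; a detection reported by a successful scan still blocks the attachment. Carrying the `skipped` flag lets callers tell "scanned and clean" apart from "not scanned".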
Health Check
GET /scan/health returns the ClamAV connection status:
{ "clamav": "connected", "version": "ClamAV 1.3.0" }
or
{ "clamav": "unavailable", "error": "Connection refused" }
Configuration
See MTA System > ClamAV Sidecar for Docker setup and environment variables.
Feedback Loops
Spam complaints from ISP feedback loops are linked back to campaign content scan results:
- Resend/MTA webhook delivers a complaint event
- resendWebhook.ts processes the complaint and looks up the campaign's content scan result
- Complaint count is incremented on the contentScanResults record
- This data enables future pattern learning and complaint rate tracking per campaign
Integration Summary
| Scanner | Where Called | Blocking? | On Failure |
|---|---|---|---|
| Content (spam + homoglyphs) | emails.ts, transactionalEmails.ts | Yes | N/A (pure TS, always runs) |
| File type validation | emailWorker.ts, mediaAssets.ts | Yes | Block (safe default) |
| URL reputation (Safe Browsing) | emails.ts | Campaigns: yes, Transactional: no | Allow, skip silently |
| ClamAV malware scan | emailWorker.ts via MTA /scan/attachment | Yes | Allow, log warning (fail-open) |
Package Structure
packages/email-scanner/src/
├── content/ # Content analysis (pure TS, Convex-safe)
│ ├── index.ts # scanContent() orchestrator
│ ├── spamKeywords.ts # 40+ weighted spam patterns
│ ├── phishingUrls.ts # URL shorteners, anchor/href mismatch
│ ├── homoglyphs.ts # Unicode spoofing detection
│ ├── prohibitedContent.ts # Advance fee fraud, credential phishing
│ └── subjectAnalysis.ts # ALL CAPS, excessive punctuation
├── files/ # File type validation (pure TS)
│ ├── index.ts # validateFile() orchestrator
│ ├── magicBytes.ts # Binary header detection
│ ├── doubleExtension.ts # invoice.pdf.exe detection
│ └── filePolicy.ts # Allowlist/blocklist engine
├── urls/ # URL reputation (uses fetch)
│ ├── index.ts # checkUrlReputation() orchestrator
│ ├── safeBrowsing.ts # Google Safe Browsing API v4 client
│ └── cache.ts # Abstract cache interface
├── clamav/ # ClamAV TCP client (Node.js net, MTA only)
│ ├── index.ts # createClamClient() factory
│ ├── client.ts # clamd INSTREAM protocol implementation
│ └── pool.ts # Connection pooling
├── types.ts # Shared types (ContentFlag, ScanResult, etc.)
└── index.ts # Barrel export (excludes clamav/)
The clamav/ module uses Node.js net for TCP connections and is only importable from the MTA. The content/, files/, and urls/ modules are pure TS and work in both Convex and Node.js environments.
Key Files
| File | Purpose |
|---|---|
| packages/email-scanner/src/content/index.ts | scanContent(): main content scanning orchestrator |
| packages/email-scanner/src/files/index.ts | validateFile(): file validation orchestrator |
| packages/email-scanner/src/urls/index.ts | checkUrlReputation(): URL reputation orchestrator |
| packages/email-scanner/src/clamav/index.ts | createClamClient(): ClamAV client factory |
| apps/api/convex/lib/contentScanner.ts | Thin re-export wrapper from @owlat/email-scanner |
| apps/api/convex/emailWorker.ts | Attachment validation + ClamAV scan before sending |
| apps/api/convex/mediaAssets.ts | Upload file validation (extension + MIME type) |
| apps/api/convex/emails.ts | Content scanning + URL reputation in campaign send flow |
| apps/api/convex/schema.ts | urlReputationCache table, extended contentScanResults |
| apps/api/convex/resendWebhook.ts | Complaint feedback loop integration |
| apps/mta/src/routes/scan.ts | MTA /scan/attachment and /scan/health endpoints |
| apps/mta/docker-compose.yml | ClamAV sidecar configuration |