Sales Intelligence System Architecture 🏗️

From 100 to 10,000 accounts with AI-powered insights and real-time signals

October 16, 2025
💼 Sales  🏗️ Architecture  📊 Scalable  🤖 AI-Powered

From prompts to production sales intelligence.

Monday: 3 core prompts for account enrichment, signal detection, and insight generation. Tuesday: automated code with CRM sync. Wednesday: team workflows for SDRs, AEs, and RevOps. Thursday: complete technical architecture with multi-agent orchestration, ML pipelines, and enterprise-grade scaling patterns.

Key Assumptions

  • Monitor 10-1,000 target accounts initially, scaling to 10K+
  • Hourly signal detection for tier-1 accounts, daily for tier-2/3
  • GDPR/CCPA compliant data handling with PII redaction
  • Salesforce or HubSpot as primary CRM
  • Budget: $500-5K/month depending on scale
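
The refresh cadence assumed above (hourly for tier-1, daily for tier-2/3) can be sketched as a simple due-check. This is a minimal illustration, not the system's actual scheduler: the `Account` class and `is_due` function are hypothetical names, and the intervals only mirror the assumption list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Intervals taken from the assumptions: hourly for tier 1, daily for tiers 2 and 3.
REFRESH_INTERVAL = {
    1: timedelta(hours=1),
    2: timedelta(days=1),
    3: timedelta(days=1),
}


@dataclass
class Account:  # hypothetical record shape
    account_id: str
    tier: int
    last_checked: datetime


def is_due(account: Account, now: datetime) -> bool:
    """Return True when the account's signal-check interval has elapsed."""
    return now - account.last_checked >= REFRESH_INTERVAL[account.tier]
```

A scheduler would call `is_due` on each tick and enqueue only the accounts that return True, which keeps tier-2/3 accounts from consuming hourly API budget.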

System Requirements

Functional

  • Account enrichment from ZoomInfo, LinkedIn, Clearbit, news APIs
  • Real-time signal detection (funding, hiring, tech stack changes)
  • AI-generated insights with confidence scores and citations
  • Bi-directional CRM sync (Salesforce, HubSpot)
  • Role-based dashboards (SDR, AE, RevOps)
  • Webhook notifications for high-priority signals
  • Historical trend analysis and pattern recognition

Non-Functional (SLOs)

  • Latency (p95): 3,000 ms
  • Freshness: 60 min
  • Availability: 99.5%
  • Enrichment accuracy: 95%
  • Signal detection recall: 92%

💰 Cost Targets: $5 per account per month, $0.15 per enrichment, $0.08 per insight

Agent Layer

planner

L4

Orchestrate enrichment workflow, select data sources, prioritize signals

🔧 account_lookup, source_selector, cost_estimator

⚡ Recovery: Fall back to cached plan if planning fails; use default source set if selector is unavailable

enrichment_executor

L3

Execute data fetching from ZoomInfo, LinkedIn, Clearbit, news APIs

🔧 zoominfo_api, linkedin_api, clearbit_api, news_api, cache_lookup

⚡ Recovery: Retry with exponential backoff (3x); use cached data if API fails; skip source if timeout > 5s
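
The executor's recovery policy can be sketched as a retry wrapper. This is an assumption-laden sketch: `fetch_with_recovery`, its parameters, and the 0.5 s base delay are hypothetical, and the 5 s timeout is assumed to be enforced inside the `fetch` callable by whatever HTTP client is in use.

```python
import time
from typing import Callable, Optional


def fetch_with_recovery(
    fetch: Callable[[], dict],
    cached: Optional[dict] = None,
    attempts: int = 3,
    base_delay_s: float = 0.5,  # assumed starting delay, not from the source
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `fetch` up to `attempts` times, doubling the delay after each failure.

    If every attempt fails, return `cached`. A None result means the caller
    should skip this source.
    """
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # Back off only if another attempt remains.
            if attempt < attempts - 1:
                sleep(base_delay_s * 2 ** attempt)
    return cached
```

Passing `sleep` as a parameter keeps the function testable without real delays.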

signal_detector

L3

Detect buying signals: funding, hiring, tech changes, executive moves

🔧 pattern_matcher, llm_classifier, trend_analyzer, priority_ranker

⚡ Recovery: Use rule-based fallback if LLM fails; return empty signals with an alert if all detection fails
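
The rule-based fallback could look roughly like the following. The keyword lists, `RULES`, `rule_based_signals`, and `detect_signals` are all hypothetical and exist only to illustrate the pattern; a production rule set would be far more careful about false matches.

```python
from typing import Callable, Dict, List

# Hypothetical keyword rules, one entry per signal type named above.
RULES: Dict[str, List[str]] = {
    "funding": ["raised", "series a", "series b", "funding round"],
    "hiring": ["hiring", "job opening", "headcount"],
    "tech_change": ["migrated to", "adopted", "switched to"],
    "executive_move": ["appointed", "joins as", "steps down"],
}


def rule_based_signals(text: str) -> List[str]:
    """Return every signal type with at least one keyword present in the text."""
    lowered = text.lower()
    return [
        signal
        for signal, keywords in RULES.items()
        if any(keyword in lowered for keyword in keywords)
    ]


def detect_signals(text: str, llm_classify: Callable[[str], List[str]]) -> dict:
    """Prefer the LLM classifier; fall back to keyword rules if it raises."""
    try:
        return {"signals": llm_classify(text), "method": "llm"}
    except Exception:
        return {"signals": rule_based_signals(text), "method": "rules"}
```

Recording the `method` alongside the signals lets dashboards show how often the fallback path is taken.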

insight_generator

L3

Generate actionable sales insights with citations and next steps

🔧 llm_api, citation_extractor, action_recommender

⚡ Recovery: Use template-based insights if LLM fails; return signals without insights as a last resort

evaluator

L2

Validate data quality, check confidence thresholds, flag low-quality outputs

🔧 schema_validator, confidence_checker, citation_verifier, hallucination_detector

⚡ Recovery: Flag for human review if validation fails; block sync to CRM if quality < threshold

guardrail

L1

PII redaction, policy compliance, rate limiting, safety filters

🔧 pii_detector, policy_engine, rate_limiter, audit_logger

⚡ Recovery: Block request if PII detection fails; return 429 if rate limit exceeded; alert security team on policy violation
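
The "block request if PII detection fails" rule is a fail-closed design. A minimal sketch follows, using two simple regular expressions as a stand-in for a real PII detector; the patterns, `redact_pii`, and `guard` are hypothetical and will miss many real-world formats.

```python
import re
from typing import Callable

# Deliberately simple stand-in patterns, not a production PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def guard(text: str, detector: Callable[[str], str] = redact_pii) -> dict:
    """Fail closed: if the detector raises, block the request instead of passing raw text."""
    try:
        return {"allowed": True, "text": detector(text)}
    except Exception:
        return {"allowed": False, "text": None}
```

Failing closed trades availability for safety: a detector outage stops requests rather than letting unredacted data reach the LLM or the CRM.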

ML Layer

Feature Store

Update: Hourly for tier-1, daily for tier-2/3

  • account_engagement_score (0-100)
  • signal_velocity (signals/week)
  • enrichment_staleness_days
  • tech_stack_similarity (0-1)
  • buying_intent_score (0-100)
  • historical_conversion_rate
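
Two of the features above are simple enough to sketch directly. The 4-week averaging window is an assumption (the source only gives the unit, signals/week), and the function signatures are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def signal_velocity(
    signal_times: List[datetime], now: datetime, window_weeks: int = 4
) -> float:
    """Average signals per week over the trailing window (window length is assumed)."""
    cutoff = now - timedelta(weeks=window_weeks)
    recent = [t for t in signal_times if cutoff <= t <= now]
    return len(recent) / window_weeks


def enrichment_staleness_days(last_enriched: datetime, now: datetime) -> int:
    """Whole days elapsed since the account was last enriched."""
    return (now - last_enriched).days
```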

Model Registry

Strategy: Semantic versioning with A/B testing

  • signal_classifier
  • priority_ranker
  • insight_generator

Observability

Metrics

  • 📊 enrichment_success_rate
  • 📊 signal_detection_recall
  • 📊 insight_relevance_score
  • 📊 llm_latency_p95_ms
  • 📊 api_error_rate
  • 📊 crm_sync_lag_seconds
  • 📊 cost_per_account_usd
  • 📊 cache_hit_rate

Dashboards

  • 📈 ops_dashboard
  • 📈 ml_dashboard
  • 📈 sales_kpi_dashboard
  • 📈 cost_dashboard

Traces

✅ Enabled

Deployment Variants

🚀 Startup

Infrastructure:

  • Vercel/Netlify for frontend + API routes
  • Supabase (Postgres + Auth + Storage)
  • Upstash Redis (serverless cache)
  • Direct LLM API calls (OpenAI/Anthropic)
  • No Kubernetes, no Kafka

Deploy in 1 day

Cost: $200-500/mo

Scales to 1K accounts

Manual monitoring with Vercel Analytics

🏢 Enterprise

Infrastructure:

  • Kubernetes (EKS/GKE) with auto-scaling
  • Kafka + Schema Registry for event streaming
  • Aurora Global Database (multi-region)
  • Private VPC with VPN/Direct Connect
  • SSO/SAML (Okta/Azure AD)
  • BYO KMS/HSM for encryption keys
  • Multi-LLM routing (OpenAI + Anthropic + self-hosted)
  • Dedicated support + SLA

Multi-region deployment (US + EU)

Cost: $12K+/mo

Scales to 100K+ accounts

99.9% SLA with auto-failover

SOC2 Type II + GDPR compliant

📈 Migration: Start with startup stack. At 1K accounts, migrate to queue-based. At 5K accounts, add agent orchestration. At 10K accounts, move to Kubernetes + multi-region. Total migration time: 6-9 months.

Risks & Mitigations

⚠️ LinkedIn scraping detected, account banned

Medium

✓ Mitigation: Use official Sales Navigator API where possible. Rotate proxy IPs. Implement rate limiting (10 req/min). Have fallback to ZoomInfo for employee data. Budget for multiple LinkedIn accounts.

⚠️ LLM API costs spiral out of control

High

✓ Mitigation: Set per-account cost caps ($0.50 max). Use cheaper models for low-priority accounts. Cache aggressively (24hr TTL). Monitor cost per enrichment daily. Alert if > $0.20/account.
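
The cap and alert thresholds in this mitigation can be checked before each LLM call. A minimal sketch, assuming spend is tracked per account in USD; `check_spend` and its return shape are hypothetical.

```python
COST_CAP_USD = 0.50         # per-account cap from the mitigation above
ALERT_THRESHOLD_USD = 0.20  # alert threshold from the mitigation above


def check_spend(spent_usd: float, next_call_usd: float) -> dict:
    """Decide whether the next LLM call fits under the cap and whether to alert."""
    projected = spent_usd + next_call_usd
    return {
        "allow": projected <= COST_CAP_USD,
        "alert": projected > ALERT_THRESHOLD_USD,
    }
```

When `allow` is False, the caller could switch to a cheaper model or serve a cached insight instead of failing the request.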

⚠️ Signal detection false positives annoy sales team

Medium

✓ Mitigation: Require 2+ signals for high-priority alerts. Show confidence scores. Allow feedback (thumbs up/down). Retrain monthly on feedback data. Target < 10% false positive rate.
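
One reading of the "2+ signals" rule is to require two distinct signal types above a confidence floor. That interpretation, the 0.7 floor, and the `is_high_priority` function are assumptions made for this sketch.

```python
from typing import Dict, List


def is_high_priority(
    signals: List[Dict], min_signals: int = 2, min_confidence: float = 0.7
) -> bool:
    """True when enough distinct signal types clear the (assumed) confidence floor."""
    strong_types = {s["type"] for s in signals if s["confidence"] >= min_confidence}
    return len(strong_types) >= min_signals
```

Counting distinct types rather than raw signals prevents three articles about the same funding round from triggering an alert on their own.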

⚠️ CRM sync conflicts overwrite manual updates

Medium

✓ Mitigation: Use last-modified timestamp for conflict resolution. Never overwrite manually edited fields. Show diff before sync. Allow rollback within 24 hours. Log all changes to audit trail.
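
The conflict rules above can be sketched as a merge function that returns both the merged record and a diff for review. The record shape, the `manual_fields` set, and `merge_record` are hypothetical; a real integration would read last-modified timestamps and field history from the CRM.

```python
from datetime import datetime
from typing import Dict, Set, Tuple


def merge_record(
    crm: Dict,
    enriched: Dict,
    manual_fields: Set[str],
    crm_modified: datetime,
    enriched_at: datetime,
) -> Tuple[Dict, Dict]:
    """Merge enriched values into a CRM record without touching manual edits.

    Returns (merged_record, diff), where diff maps field -> (old, new).
    """
    merged = dict(crm)
    diff: Dict = {}
    # Last-modified check: if the CRM record is newer, keep it unchanged.
    if enriched_at <= crm_modified:
        return merged, diff
    for field, value in enriched.items():
        if field in manual_fields:
            continue  # never overwrite manually edited fields
        if crm.get(field) != value:
            diff[field] = (crm.get(field), value)
            merged[field] = value
    return merged, diff
```

The returned `diff` can back both the "show diff before sync" step and the audit trail entry.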

⚠️ GDPR violation from storing EU customer data in US

Low

✓ Mitigation: Deploy EU region (eu-west-1) for EU customers. Data residency checks in API gateway. No cross-region data transfer. Annual GDPR audit. DPA with all vendors.

⚠️ Key engineer leaves, system becomes unmaintainable

Medium

✓ Mitigation: Document architecture (this doc!). Use standard frameworks (LangGraph, not custom). Automated tests (80%+ coverage). Pair programming on critical paths. Hire 2+ engineers familiar with stack.

⚠️ Enrichment sources change APIs, break integrations

High

✓ Mitigation: Version all API clients. Monitor for breaking changes (webhooks). Test against sandbox environments. Have fallback sources. Budget 1 day/month for API maintenance.

Evolution Roadmap

Phase 1: MVP (0-3 months)

Weeks 1-12
  • Launch with 10-100 accounts
  • Prove signal detection accuracy (> 90%)
  • Get 5 sales team advocates
  • Achieve < $5/account cost

Phase 2: Scale (3-6 months)

Weeks 13-24
  • Scale to 1,000 accounts
  • Automate CRM sync (Salesforce API)
  • Add insight generation with LLM
  • Reduce cost to < $3/account

Phase 3: Enterprise (6-12 months)

Weeks 25-52
  • Scale to 10,000+ accounts
  • Multi-region deployment (US + EU)
  • 99.9% SLA with auto-failover
  • SOC2 Type II certified

Complete Systems Architecture

9-layer architecture from presentation to security

Presentation

  • Sales Dashboard (React)
  • Mobile App (React Native)
  • Slack/Teams Bots

API Gateway

  • Load Balancer (ALB/Cloud LB)
  • Rate Limiter (Redis)
  • Auth Service (Auth0/Cognito)
  • API Gateway (Kong/AWS API Gateway)

Agent Layer

  • Planner Agent
  • Enrichment Executor
  • Signal Detector
  • Insight Generator
  • Evaluator Agent
  • Guardrail Agent

ML Layer

  • Feature Store (Feast/Tecton)
  • Model Registry (MLflow)
  • Inference Service
  • Evaluation Pipeline
  • Prompt Store

Integration

  • CRM Adapter (Salesforce/HubSpot)
  • Data Enrichment APIs
  • News/Social APIs
  • Webhook Manager

Data

  • PostgreSQL (accounts, signals)
  • Redis (cache, queue)
  • S3/GCS (raw data, logs)
  • Vector DB (embeddings)

External

  • Salesforce API
  • ZoomInfo API
  • LinkedIn Sales Navigator
  • News APIs (AlphaVantage, NewsAPI)
  • LLM APIs (OpenAI, Anthropic)

Observability

  • Metrics (Prometheus/Datadog)
  • Logs (CloudWatch/Loki)
  • Traces (Jaeger/Honeycomb)
  • Dashboards (Grafana)

Security

  • KMS/HSM (secrets)
  • WAF
  • PII Redaction Service
  • Audit Logger
  • RBAC Service

Request Flow - Account Enrichment

Participants: SDR, API Gateway, Planner Agent, Enrichment Executor, Signal Detector, Evaluator, CRM Sync

Messages, in order:

  1. POST /enrich {account_id}
  2. Route to enrichment workflow
  3. Execute enrichment plan
  4. Fetch company data
  5. Fetch employee data
  6. Fetch recent news
  7. Analyze for signals
  8. Generate insights
  9. Validate quality
  10. Sync to Salesforce
  11. 200 OK + enriched data

Sales Intelligence - Hub Orchestration

7 Components (4 capabilities each): Orchestrator, Data Fetcher, Signal Detector, Insight Generator, Quality Validator, Guardrail Service, CRM Sync

Messages:

  • [RPC] Account enrichment request → [Response] Raw account data
  • [RPC] Signal analysis request → [Response] Detected signals
  • [RPC] Generate insights → [Response] Actionable insights
  • [RPC] Validate output → [Response] Quality metrics
  • [RPC] Safety check → [Response] Compliance status
  • [REST] Enriched account data

Sales Intelligence - Feedback & Refinement Network

7 Components (4 capabilities each): Data Fetcher, Signal Detector, Insight Generator, Quality Validator, Guardrail Service, Feedback Loop, CRM Sync

Messages:

  • [Stream] Raw account data
  • [Event] Signal candidates
  • [Feedback] Refinement needed
  • [Event] Validated signals
  • [Stream] Draft insights
  • [Feedback] Compliance issues
  • [Event] Filtered output
  • [REST] Approved insights
  • [Event] Sales outcomes
  • [Feedback] Model updates
  • [Feedback] Effectiveness metrics
  • [Event] Source quality metrics
  • [Feedback] Source prioritization

Data Flow - End-to-End

From SDR request to CRM sync in under 8 seconds

Step | Component | Time | Action | Output
1 | SDR | 0s | Clicks 'Enrich Account' | account_id
2 | API Gateway | 50ms | Authenticates, rate limits | JWT token
3 | Guardrail Agent | 200ms | Policy check, PII scan | Sanitized request
4 | Planner Agent | 100ms | Selects data sources, creates plan | Execution plan
5 | Enrichment Executor | 2.5s | Fetches from ZoomInfo, LinkedIn, news | Raw data JSON
6 | Signal Detector | 1.5s | Analyzes for buying signals | Signals array
7 | Insight Generator | 2s | Generates insights + citations | Insight text
8 | Evaluator Agent | 300ms | Validates quality, confidence | Quality score
9 | CRM Sync | 800ms | Updates Salesforce record | SFDC object
10 | Webhook | 200ms | Notifies Slack if high-priority | Slack message
11 | SDR | 100ms | Sees enriched account + insights | UI update
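
As a quick check on the "under 8 seconds" claim, the per-step times listed above can be summed. The sketch assumes the steps run strictly in sequence; the variable names are hypothetical.

```python
# Per-step times from the data flow above, in milliseconds.
STAGE_LATENCY_MS = [
    ("SDR click", 0),
    ("API Gateway", 50),
    ("Guardrail Agent", 200),
    ("Planner Agent", 100),
    ("Enrichment Executor", 2500),
    ("Signal Detector", 1500),
    ("Insight Generator", 2000),
    ("Evaluator Agent", 300),
    ("CRM Sync", 800),
    ("Webhook", 200),
    ("SDR UI update", 100),
]

BUDGET_MS = 8000

total_ms = sum(ms for _, ms in STAGE_LATENCY_MS)
within_budget = total_ms < BUDGET_MS
print(f"total={total_ms} ms, within budget: {within_budget}")
```

The steps sum to 7,750 ms, leaving 250 ms of headroom. The three slowest stages (enrichment, signal detection, insight generation) account for 6,000 ms, so they are the natural targets for caching or parallel fetches.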

Scaling Patterns

10-100 accounts: Serverless Monolith

  • Architecture: Next.js API routes; Vercel/Netlify hosting; Supabase (Postgres + Auth); Upstash Redis (cache); direct API calls to enrichment sources
  • Cost: $200/mo
  • Latency: 5-8s

100-1,000 accounts: Queue-Based Workers

  • Architecture: API server (Node/Python); Redis queue (BullMQ/Celery); worker pool (3-5 workers); PostgreSQL (managed); S3/GCS for raw data
  • Cost: $800/mo
  • Latency: 3-5s

1,000-10,000 accounts: Multi-Agent Orchestration

  • Architecture: Load balancer; agent framework (LangGraph/CrewAI); SQS/Kafka message bus; Lambda/Cloud Run functions; RDS Multi-AZ; Vector DB (Pinecone/Weaviate)
  • Cost: $3,500/mo
  • Latency: 2-4s

10,000+ accounts: Enterprise Multi-Region

  • Architecture: Kubernetes (EKS/GKE); Kafka + Schema Registry; multi-LLM routing; Aurora Global Database; CDN (CloudFront/Fastly); private VPC, SSO/SAML
  • Cost: $12K+/mo
  • Latency: 1-3s

Key Integrations

Salesforce CRM

Protocol: REST API (SOAP for legacy)
Enrich account data
Map to SFDC Account/Opportunity objects
Upsert via Bulk API (batch) or REST (real-time)
Handle conflicts with last-modified timestamp

ZoomInfo

Protocol: REST API v2
Search by domain or company name
Fetch company profile + contacts
Parse technographics, firmographics
Cache for 24 hours
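
The 24-hour cache could be as simple as a dictionary with timestamps. This in-memory `TTLCache` class is a hypothetical sketch; a deployed system would more likely use Redis key expiry, as the stack elsewhere in this document suggests.

```python
from typing import Any, Dict, Optional, Tuple

DAY_S = 24 * 60 * 60  # 24-hour TTL from the integration notes above


class TTLCache:
    """Minimal in-memory cache whose entries expire after `ttl_s` seconds."""

    def __init__(self, ttl_s: float = DAY_S) -> None:
        self.ttl_s = ttl_s
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, now: float) -> None:
        self._store[key] = (now, value)

    def get(self, key: str, now: float) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if now - stored_at >= self.ttl_s:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value
```

Keying the cache by company domain lets repeat lookups within a day skip the paid API call entirely.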

LinkedIn Sales Navigator

Protocol: Unofficial API (scraping with Playwright)
Search for company employees
Scrape profile data (title, tenure, posts)
Detect hiring signals from job changes
Respect rate limits (10 req/min)
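
The 10 req/min limit can be enforced client-side with a sliding-window counter. `SlidingWindowLimiter` is a hypothetical name, and timestamps are passed in explicitly so the behavior is easy to test.

```python
from collections import deque
from typing import Deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_s`-second window."""

    def __init__(self, max_requests: int = 10, window_s: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_s = window_s
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False
```

In production, `now` would come from `time.monotonic()`, and a rejected call would be delayed or queued rather than dropped.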

News APIs

Protocol: REST (NewsAPI, AlphaVantage)
Query by company name + keywords
Filter by date, relevance
Extract funding, partnership, product launch signals
Deduplicate articles
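
Deduplication can start with normalized titles. This is a rough sketch: the `title` field and the `dedupe_articles` function are hypothetical, and exact-match on normalized titles will not catch rewritten headlines about the same event.

```python
import re
from typing import Dict, List


def _title_key(title: str) -> str:
    """Lowercase the title and collapse punctuation so near-identical titles match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def dedupe_articles(articles: List[Dict]) -> List[Dict]:
    """Keep the first article for each normalized title, preserving order."""
    seen = set()
    unique = []
    for article in articles:
        key = _title_key(article["title"])
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique
```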

Security & Compliance

Failure Modes & Recovery

Failure | Fallback | Impact | SLA
ZoomInfo API down | Use Clearbit + LinkedIn as backup sources | Slightly lower data quality, 90% coverage maintained | 99.5%
LLM API timeout | Use cached insights from similar accounts, or template-based insights | Degraded insight quality, but not blocked | 99.0%
Salesforce sync fails | Queue for retry (3 attempts over 1 hour) | Delayed sync, eventual consistency | 99.5%
Signal detection low confidence | Flag for human review, show raw data to SDR | Quality maintained, manual effort required | 100% (safety first)
Database connection lost | Read from replica, queue writes | Read-only mode for 5-10 min | 99.9%

Advanced ML Patterns

Production ML engineering beyond basic LLM calls

RAG vs Fine-Tuning

Hallucination Detection

Evaluation Framework

Dataset Curation

Agentic RAG

Tech Stack Summary

  • Frontend: Next.js 14 (App Router), React, TailwindCSS, shadcn/ui
  • Backend: Node.js (Express/Fastify) or Python (FastAPI)
  • LLMs: OpenAI GPT-4, Anthropic Claude 3.5 Sonnet, DeepSeek (self-hosted fallback)
  • Orchestration: LangGraph or CrewAI
  • Database: PostgreSQL (Aurora/RDS), Redis (ElastiCache/Upstash)
  • Vector DB: Pinecone or Weaviate
  • Message Queue: Redis (BullMQ) for startup, Kafka for enterprise
  • Compute: Vercel/Netlify (startup), Lambda/Cloud Run (scale), EKS/GKE (enterprise)
  • Monitoring: Datadog or Grafana + Prometheus
  • Security: Auth0/Cognito (auth), AWS KMS (secrets), Presidio (PII detection)
  • CRM Integration: jsforce (Salesforce), HubSpot Node SDK
  • Data Enrichment: ZoomInfo API, Clearbit API, Playwright (LinkedIn scraping)