
Patient Intake System Architecture

Production-grade design: HIPAA-compliant agents, FHIR integration, 100-10K patients/day

April 17, 2025
21 min read
Healthcare | Architecture | Scalable | HIPAA | Multi-Agent

This Week's Journey

From prompts to production-grade patient intake system.

Monday showed 3 core prompts. Tuesday automated the workflow. Wednesday mapped team roles. Today: complete technical architecture. Multi-agent orchestration, FHIR integration, HIPAA compliance, and scaling from 100 to 10,000 patients per day. This is the production system design that powers modern healthcare automation.


Key Assumptions

1. Patient volume: 100-10,000 intakes per day depending on deployment tier
2. Compliance scope: HIPAA, HITECH, state privacy laws (CCPA where applicable)
3. EHR systems: Epic, Cerner, or Allscripts with FHIR R4 API access
4. Data residency: US-based healthcare data centers with BAA coverage
5. Integration timeline: 3-6 months for full production deployment

System Requirements

Functional

  • Extract 47+ structured fields from free-text patient narratives
  • Validate completeness against EHR requirements (demographics, insurance, medical history)
  • Generate contextual follow-up questions for missing critical data
  • Detect and redact PHI before LLM processing (names, SSNs, MRNs)
  • Transform extracted data to FHIR R4 bundles for EHR ingestion
  • Maintain audit trail for all PHI access (7-year retention)
  • Support multi-language intake (English, Spanish minimum)

Non-Functional (SLOs)

  • Latency (p95): 5000 ms
  • Freshness: 2 min
  • Availability: 99.9%
  • Extraction accuracy: 99%
  • PHI detection recall: 99.99%

Cost Targets: $0.15 per intake; $500/month infrastructure (startup); $5,000/month infrastructure (enterprise)

Agent Layer

planner

L3

Decomposes intake request into subtasks, selects appropriate tools and agents

Tools: task_decomposer, agent_selector, context_analyzer

Recovery:
  • If task decomposition fails: fallback to simple sequential flow
  • If agent selection uncertain: route to human review queue
  • Retry with simplified plan (max 2 retries)

executor

L2

Extracts structured data from patient narrative using LLM

Tools: claude_api, gpt4_api, schema_validator, rag_retriever

Recovery:
  • If LLM API fails: switch to backup LLM (Claude → GPT-4 → Gemini)
  • If extraction confidence < 0.7: flag for human review
  • If schema validation fails: retry with clarified prompt
  • Max 3 retries with exponential backoff

evaluator

L3

Validates completeness and quality of extracted data against EHR requirements

Tools: field_validator, completeness_checker, clinical_rule_engine, icd10_validator

Recovery:
  • If validation rules fail: use fallback rule set
  • If critical field missing: escalate to urgent review
  • If quality score < 0.8: trigger re-extraction

guardrail

L4

PHI detection, redaction, safety checks, policy enforcement

Tools: comprehend_medical, phi_detector, safety_classifier, policy_engine

Recovery:
  • If PHI detection fails: block processing entirely (fail-safe)
  • If safety score < threshold: escalate to compliance team
  • If policy violation detected: halt and audit
  • Zero tolerance for PHI leakage

question_generator

L3

Generates contextual follow-up questions for missing critical data

Tools: gpt4_api, clinical_context_retriever, question_ranker

Recovery:
  • If question generation fails: use template-based questions
  • If no clinical context: generate generic questions
  • Max 5 questions per iteration

orchestrator

L4

Coordinates all agents, manages workflow state, handles routing decisions

Tools: state_manager, routing_engine, decision_tree, retry_handler

Recovery:
  • If agent fails: route to backup agent or human queue
  • If workflow stuck: timeout after 30s and escalate
  • If loop detected: break after 3 iterations
  • Maintain workflow state for resume on failure

ML Layer

Feature Store

Update: Real-time for online features, daily batch for historical aggregates

  • patient_intake_history_count
  • avg_extraction_confidence
  • missing_fields_frequency
  • question_answer_rate
  • session_completion_time_seconds
  • phi_entity_density
  • clinical_complexity_score
  • language_detected
  • source_channel

Model Registry

Strategy: Semantic versioning with A/B testing for production rollout

  • extraction_model
  • phi_detector
  • question_generator

Observability Stack

Real-time monitoring, tracing & alerting

Pipeline: Sources (apps, services, infra) → Collection (11 metrics) → Processing (aggregate & transform) → Dashboards (5 views) → Alerts (enabled)

Signals: Metrics (11), Logs (structured), Traces (distributed)

Metrics shown:

  • intake_requests_total
  • extraction_latency_p95_ms
  • validation_pass_rate
  • question_generation_latency_ms
  • ehr_integration_success_rate
  • phi_detection_latency_ms

Deployment Variants

Startup Architecture

Fast to deploy, cost-efficient, scales to 100 patients/day

Infrastructure

  • Vercel (frontend + API routes)
  • Supabase (PostgreSQL + Auth)
  • Redis Cloud (cache)
  • Anthropic API (Claude)
  • AWS Comprehend Medical
  • Resend (email notifications)

  → Serverless-first for low operational overhead
  → Managed services to avoid DevOps complexity
  → Pay-as-you-go pricing
  → Quick deployment (< 2 weeks)
  → Good for 0-500 patients/day
  → Single-tenant, single-region

Risks & Mitigations

⚠️ PHI leakage to LLM provider

Low

βœ“ Mitigation: Mandatory PHI redaction before LLM processing. Fail-safe: block all processing if PHI detection fails. AWS Comprehend Medical for detection. Audit trail for all PHI access. Regular penetration testing.

⚠️ LLM hallucination causes medical error

Medium

βœ“ Mitigation: Multi-layer validation: confidence thresholds, drug database cross-reference, logical consistency checks, human review queue. Current hallucination rate: 0.3%, 100% caught before EHR write. Shadow mode testing before production rollout.

⚠️ EHR integration downtime

Medium

βœ“ Mitigation: Multi-LLM failover. Retry queue with exponential backoff. Alerting within 5 minutes. Manual override process. SLA: 99.0% (allows 7.2 hours/month downtime).

⚠️ Cost overrun from LLM usage

Medium

βœ“ Mitigation: Cost per intake target: $0.15. Monitoring and alerting at $0.20. Automatic throttling at $0.25. Optimize prompts to reduce token usage. Use cheaper models for non-critical tasks (e.g., GPT-3.5 for question generation).
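
The three cost thresholds map naturally onto a small guard function. The sketch below assumes the per-intake cost is computed as total LLM spend divided by intake count over some window; the function names and the window are illustrative, only the dollar thresholds come from the mitigation above.

```python
TARGET_USD = 0.15    # cost-per-intake target
ALERT_USD = 0.20     # monitoring and alerting threshold
THROTTLE_USD = 0.25  # automatic throttling threshold


def cost_per_intake(total_llm_spend_usd: float, intakes: int) -> float:
    """Average LLM cost per intake over a window (window choice is an assumption)."""
    if intakes <= 0:
        raise ValueError("intakes must be positive")
    return total_llm_spend_usd / intakes


def cost_action(cost_usd: float) -> str:
    """Map a per-intake cost onto the action named in the mitigation."""
    if cost_usd >= THROTTLE_USD:
        return "throttle"
    if cost_usd >= ALERT_USD:
        return "alert"
    return "ok"


if __name__ == "__main__":
    # Example: $15 of LLM spend across 100 intakes is exactly on target.
    print(cost_action(cost_per_intake(15.0, 100)))
```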

⚠️ Compliance audit failure

Low

βœ“ Mitigation: 7-year audit trail retention. Regular internal audits. Third-party HIPAA audit annually. SOC 2 Type II certification. Dedicated compliance officer. Incident response plan tested quarterly.

⚠️ Agent loop / infinite retry

Low

βœ“ Mitigation: Circuit breaker after 3 iterations. Timeout after 30 seconds. Monitoring for loop detection. Automatic escalation to human review. Workflow state persistence for recovery.

⚠️ Data residency violation (cross-border transfer)

Low

βœ“ Mitigation: US-based AWS regions only (us-east-1, us-west-2). No cross-region replication. Patient consent for data processing. Regular audits of data flows. DLP (Data Loss Prevention) policies.

🧬

Evolution Roadmap

Progressive transformation from MVP to scale

🌱
Phase 1Months 0-3

Phase 1: MVP (0-3 months)

1
Deploy startup architecture (Vercel + Supabase)
2
Implement 3 core agents (Intake, Validation, Question)
3
Integrate with 1 EHR (Epic FHIR)
4
Achieve 95% extraction accuracy
5
Process 100 patients/day
Complexity Level
β–Ό
🌿
Phase 2Months 3-6

Phase 2: Scale (3-6 months)

1
Migrate to queue-based architecture (SQS + Lambda)
2
Add multi-LLM failover
3
Integrate with 2 more EHRs (Cerner, Allscripts)
4
Achieve 99% extraction accuracy
5
Process 1,000 patients/day
Complexity Level
β–Ό
🌳
Phase 3Months 6-12

Phase 3: Enterprise (6-12 months)

1
Migrate to EKS multi-region
2
Add 6 specialized agents (Planner, Evaluator, Guardrail, etc.)
3
Implement agentic RAG
4
Achieve 99.5% extraction accuracy
5
Process 10,000+ patients/day
6
SOC 2 Type II certification
Complexity Level
πŸš€Production Ready
πŸ—οΈ

Complete Systems Architecture

9-layer production architecture from patient portal to EHR

1. Presentation Layer (4 components)

  • Patient Portal (React/Next.js)
  • Tablet Kiosk App (iPad)
  • SMS/Text Intake (Twilio)
  • Voice Intake (Deepgram + Whisper)

2. API Gateway Layer (4 components)

  • API Gateway (Kong/AWS API Gateway)
  • Rate Limiter (Redis-based)
  • Auth Service (OAuth 2.0 + SMART on FHIR)
  • Load Balancer (ALB/NLB)

3. Agent Layer (6 components)

  • Planner Agent (Task decomposition)
  • Intake Agent (Data extraction)
  • Validation Agent (Completeness check)
  • Question Agent (Follow-up generation)
  • Guardrail Agent (PHI detection, safety)
  • Orchestrator Agent (Workflow coordination)

4. ML Layer (5 components)

  • Feature Store (Tecton/Feast)
  • Model Registry (MLflow)
  • Prompt Store (Versioned prompts)
  • Evaluation Engine (LangSmith/Phoenix)
  • Embedding Service (text-embedding-3-large)

5. Integration Layer (4 components)

  • FHIR Adapter (HAPI FHIR)
  • HL7 v2 Adapter (Mirth Connect)
  • PHI Handler (AWS Comprehend Medical)
  • EHR Connectors (Epic, Cerner, Allscripts)

6. Data Layer (5 components)

  • Primary Database (PostgreSQL with pgvector)
  • Cache Layer (Redis Cluster)
  • Vector Store (Pinecone/Weaviate)
  • Audit Log Store (S3 + Glacier)
  • Message Queue (SQS/RabbitMQ)

7. External Services (6 components)

  • LLM APIs (Anthropic Claude, OpenAI GPT-4)
  • Epic FHIR API
  • Cerner FHIR API
  • AWS Comprehend Medical
  • Twilio (SMS)
  • Deepgram (Voice)

8. Observability Layer (5 components)

  • Metrics (Prometheus + Grafana)
  • Logs (ELK Stack / CloudWatch)
  • Traces (Jaeger / X-Ray)
  • Alerting (PagerDuty / Opsgenie)
  • ML Monitoring (Arize / WhyLabs)

9. Security Layer (5 components)

  • WAF (CloudFlare / AWS WAF)
  • Secrets Manager (AWS Secrets Manager / Vault)
  • KMS (AWS KMS / Azure Key Vault)
  • IAM / RBAC Service
  • SIEM (Splunk / Datadog Security)

Complete Request Flow - Patient Intake to EHR

Automated data flow every hour

Participants: Patient, API Gateway, Guardrail Agent, Planner Agent, Intake Agent, Validation Agent, Question Agent, Orchestrator, FHIR Adapter, Epic EHR

1. POST /intake with free-text narrative
2. Pre-process: PHI detection scan
3. Sanitized text + PHI flags
4. Task: Extract 47 fields from text
5. Extracted JSON (name, age, symptoms, meds, allergies, insurance...)
6. Gap analysis: 5 missing fields (insurance ID, emergency contact, medication dosage, allergy severity, primary care physician)
7. Decision: Incomplete → Generate follow-up questions
8. 5 contextual questions generated
9. Return questions via API Gateway
10. POST /intake/answers with responses
11. Merge answers into existing JSON
12. Updated JSON (now complete)
13. Complete: All 47 fields present
14. Transform to FHIR R4 Bundle
15. POST /Patient, /Condition, /Observation, /AllergyIntolerance, /MedicationStatement
16. 201 Created + Patient MRN
17. Success response with confirmation

End-to-End Data Flow

Patient text → Sanitized → Extracted → Validated → Questions → FHIR → EHR

1. Patient Portal (0 ms): User submits intake form → Free-text narrative (500-2000 words)
2. API Gateway (50 ms): Authenticates and rate limits → Validated request
3. Guardrail Agent (800 ms): Scans for PHI entities → PHI flags + sanitized text
4. Planner Agent (150 ms): Creates task plan → Agent sequence + tool selection
5. Intake Executor (3200 ms): Extracts structured data via Claude → JSON with 47 fields + confidence scores
6. Validation Evaluator (1800 ms): Checks completeness against EHR schema → Missing fields array + quality score
7. Orchestrator (50 ms): Routes based on completeness → Decision: complete OR incomplete
8. Question Generator (2500 ms): Generates 5 contextual questions (if incomplete) → Questions + expected answer formats
9. Patient Portal (100 ms): Displays questions to patient → Interactive form
10. Intake Executor (1200 ms): Merges answers into existing JSON → Updated JSON (now complete)
11. FHIR Adapter (800 ms): Transforms to FHIR R4 bundle → FHIR Bundle (Patient, Condition, Observation, AllergyIntolerance, MedicationStatement)
12. Epic FHIR API (2200 ms): Persists to EHR → 201 Created + Patient MRN
13. Audit Logger (100 ms): Records all PHI access → Audit trail entry
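
To see where the time goes, the per-step figures above can be summed for the two paths (with and without the follow-up question round). This is only arithmetic on the listed numbers. The article does not say whether they are typical or p95 values, or whether steps run strictly in sequence, so treating them as an additive budget is an assumption.

```python
# (step number, component, latency in ms) as listed in the data flow above.
STEPS = [
    (1, "Patient Portal", 0),
    (2, "API Gateway", 50),
    (3, "Guardrail Agent", 800),
    (4, "Planner Agent", 150),
    (5, "Intake Executor", 3200),
    (6, "Validation Evaluator", 1800),
    (7, "Orchestrator", 50),
    (8, "Question Generator", 2500),
    (9, "Patient Portal", 100),
    (10, "Intake Executor", 1200),
    (11, "FHIR Adapter", 800),
    (12, "Epic FHIR API", 2200),
    (13, "Audit Logger", 100),
]
FOLLOW_UP_STEPS = {8, 9, 10}  # only run when the intake is incomplete


def total_latency_ms(include_follow_up: bool) -> int:
    """Sum step latencies, assuming strictly sequential execution."""
    return sum(
        ms for step, _name, ms in STEPS
        if include_follow_up or step not in FOLLOW_UP_STEPS
    )


def slowest_steps(n: int = 3) -> list[str]:
    """Names of the n slowest steps, the first candidates for optimization."""
    ranked = sorted(STEPS, key=lambda s: s[2], reverse=True)
    return [name for _step, name, _ms in ranked[:n]]


if __name__ == "__main__":
    print("complete on first pass:", total_latency_ms(False), "ms")
    print("with follow-up round:  ", total_latency_ms(True), "ms")
    print("slowest:", slowest_steps())
```

The LLM extraction call and the EHR write account for most of the single-pass total, so those are the steps to compare against the latency SLO first.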
1. Serverless Monolith
Volume: 0-100 patients/day
Architecture:
  • Next.js API routes on Vercel
  • Supabase PostgreSQL
  • Redis Cloud (cache)
  • Direct LLM API calls
  • AWS Comprehend Medical
Cost & Performance: $200/month, 5-8 seconds p95

2. Queue + Workers
Volume: 100-1,000 patients/day
Architecture:
  • API server (Node.js/Express)
  • SQS message queue
  • Lambda workers (agent execution)
  • RDS PostgreSQL
  • ElastiCache Redis
  • S3 for audit logs
Cost & Performance: $800/month, 3-5 seconds p95

3. Multi-Agent Orchestration (Recommended)
Volume: 1,000-10,000 patients/day
Architecture:
  • ECS Fargate containers
  • LangGraph orchestration
  • SQS + SNS for event routing
  • Aurora PostgreSQL (Multi-AZ)
  • Redis Cluster
  • Pinecone vector store
  • CloudWatch + X-Ray
Cost & Performance: $3,500/month, 2-4 seconds p95

4. Enterprise Multi-Region
Volume: 10,000+ patients/day
Architecture:
  • EKS Kubernetes clusters (multi-region)
  • Kafka event streaming
  • Aurora Global Database
  • Redis Enterprise (multi-region)
  • Multi-LLM failover (Claude, GPT-4, Gemini)
  • Datadog full-stack monitoring
  • Private VPC with Transit Gateway
  • AWS PrivateLink for EHR connections
Cost & Performance: $12,000+/month, 1-3 seconds p95

Key Integrations

Epic FHIR API

Protocol: HL7 FHIR R4
Obtain OAuth token from Epic authorization server
Transform extracted JSON to FHIR resources
POST Patient resource (demographics)
POST Condition resources (diagnoses)
POST Observation resources (vitals, labs)
POST AllergyIntolerance resources
POST MedicationStatement resources
Receive Epic MRN in response
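
The transformation step might look roughly like the sketch below, which builds a FHIR R4 transaction Bundle containing a Patient and its AllergyIntolerance resources. The intake field names (`family_name`, `given_name`, `birth_date`, `allergies`) are hypothetical, and only two resource types are shown. Whether a given EHR endpoint accepts a transaction Bundle or expects one POST per resource, as in the steps above, depends on the deployment.

```python
import uuid


def to_fhir_bundle(intake: dict) -> dict:
    """Build a minimal FHIR R4 transaction Bundle from extracted intake JSON.

    The intake keys used here are illustrative, not the article's schema.
    """
    # A urn:uuid fullUrl lets other entries reference the Patient before
    # the server has assigned it an id.
    patient_url = f"urn:uuid:{uuid.uuid4()}"
    entries = [{
        "fullUrl": patient_url,
        "resource": {
            "resourceType": "Patient",
            "name": [{
                "family": intake["family_name"],
                "given": [intake["given_name"]],
            }],
            "birthDate": intake["birth_date"],  # YYYY-MM-DD
        },
        "request": {"method": "POST", "url": "Patient"},
    }]
    for allergy in intake.get("allergies", []):
        entries.append({
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": {
                "resourceType": "AllergyIntolerance",
                "patient": {"reference": patient_url},
                "code": {"text": allergy},  # free text; coding omitted
            },
            "request": {"method": "POST", "url": "AllergyIntolerance"},
        })
    return {"resourceType": "Bundle", "type": "transaction", "entry": entries}


if __name__ == "__main__":
    bundle = to_fhir_bundle({
        "family_name": "Doe", "given_name": "Jane",
        "birth_date": "1980-01-01", "allergies": ["penicillin"],
    })
    print(len(bundle["entry"]), "entries")
```

Condition, Observation, and MedicationStatement entries would follow the same pattern, each pointing back at the Patient through its `subject` reference.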

Cerner FHIR API

Protocol: HL7 FHIR R4
Similar to Epic but uses Cerner-specific extensions
Different authorization endpoint
Cerner-specific patient identifier format
Additional validation rules for Cerner

AWS Comprehend Medical

Protocol: AWS SDK (boto3)
Send patient text to DetectPHI API
Receive entity list (NAME, SSN, MRN, etc.)
Redact entities before LLM processing
Log all PHI detections to audit trail
Use DetectEntitiesV2 for medical terms
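
A minimal sketch of the detect-then-redact step with boto3 follows. It calls the `detect_phi` operation of the `comprehendmedical` client and replaces each returned span with its entity type. The `PHIDetectionError` class, the placeholder format, and the injectable `client` parameter are assumptions for illustration. AWS credentials and region configuration, and audit logging of detections, are left out.

```python
class PHIDetectionError(RuntimeError):
    """Raised when PHI detection is unavailable; callers must stop processing."""


def redact(text: str, entities: list[dict]) -> str:
    """Replace each entity span with a [TYPE] placeholder.

    Spans are processed right to left so earlier offsets stay valid.
    """
    out = text
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        out = out[:ent["BeginOffset"]] + f"[{ent['Type']}]" + out[ent["EndOffset"]:]
    return out


def detect_and_redact(text: str, client=None) -> str:
    """Detect PHI with Comprehend Medical and return redacted text.

    Fails closed: any detection error raises instead of returning raw text.
    """
    if client is None:
        import boto3  # imported lazily so the pure redact() has no AWS dependency
        client = boto3.client("comprehendmedical")
    try:
        response = client.detect_phi(Text=text)
    except Exception as exc:
        raise PHIDetectionError("PHI detection failed; blocking processing") from exc
    return redact(text, response["Entities"])
```

Raising rather than returning the original text is what keeps the "no PHI to the LLM" rule intact when the detection service is down.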

HL7 v2 (Legacy Systems)

Protocol: HL7 v2.5.1 (ADT, ORM messages)
Transform FHIR bundle to HL7 v2 messages
Send ADT^A04 (register patient)
Send ORM^O01 (order message for labs)
Receive ACK acknowledgment
Handle NACK and retry
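
For the legacy path, a registration message and acknowledgment check could be sketched as below. The sending and receiving application names, the `CLINIC` assigning authority, and the patient dictionary keys are placeholders. The sketch does not escape delimiter characters inside field values, which a real adapter (Mirth Connect, for example) would handle.

```python
SEGMENT_SEP = "\r"  # HL7 v2 segments are separated by carriage returns


def build_adt_a04(patient: dict, control_id: str, timestamp: str) -> str:
    """Assemble a minimal ADT^A04 (register patient) message.

    Field values are placeholders; no delimiter escaping is performed.
    """
    msh = "|".join([
        "MSH", "^~\\&", "INTAKE", "CLINIC", "EHR", "HOSPITAL",
        timestamp, "", "ADT^A04", control_id, "P", "2.5.1",
    ])
    evn = "|".join(["EVN", "A04", timestamp])
    pid = "|".join([
        "PID", "1", "",
        f"{patient['id']}^^^CLINIC^MR", "",                   # PID-3 identifier
        f"{patient['family_name']}^{patient['given_name']}",  # PID-5 name
        "", patient["birth_date"], patient["sex"],            # PID-7, PID-8
    ])
    pv1 = "|".join(["PV1", "1", "O"])  # outpatient class, as an example
    return SEGMENT_SEP.join([msh, evn, pid, pv1])


def ack_accepted(ack: str) -> bool:
    """True if the MSA segment carries an accept code (AA or CA)."""
    for segment in ack.split(SEGMENT_SEP):
        fields = segment.split("|")
        if fields[0] == "MSA":
            return len(fields) > 1 and fields[1] in {"AA", "CA"}
    return False  # no MSA segment: treat as not acknowledged
```

A negative acknowledgment (AE or AR in MSA-1) or a missing MSA segment would then feed the retry handling named in the last step.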

Security & Compliance Architecture

Failure Modes & Recovery

Failure | Fallback | Impact | SLA
LLM API down (Anthropic outage) | Automatic failover to GPT-4, then Gemini. If all fail, queue for manual processing. | Degraded performance (slower), not broken | 99.5% (allows 3.6 hours/month downtime)
Extraction confidence < 0.7 | Flag for human review, send to intake coordinator queue | Quality maintained, slight delay | 99.9% (manual review within 2 hours)
EHR API timeout (Epic/Cerner down) | Retry 3x with exponential backoff. If fails, store in retry queue with 5-minute interval. | Eventual consistency (data arrives late) | 99.0% (allows 7.2 hours/month)
PHI detection service fails | BLOCK ALL PROCESSING. Fail-safe mode. No PHI to LLM under any circumstance. | System unavailable until PHI detection restored | 100% (zero tolerance for PHI leakage)
Database unavailable | Switch to read replica for read operations. Write operations queued. | Read-only mode, writes delayed | 99.9%
Agent loop detected (infinite retry) | Circuit breaker after 3 iterations. Escalate to human. | Prevents resource exhaustion | N/A (safety mechanism)
FHIR validation fails | Log error, retry with corrected mapping. If fails 3x, human review. | Data quality maintained | 99.5%
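
The downtime allowances quoted in the SLA column follow from a 30-day (720-hour) month. A short check, with the 720-hour month as the stated assumption:

```python
def allowed_downtime_hours(sla_percent: float, hours_per_month: float = 720.0) -> float:
    """Monthly downtime budget implied by an availability SLA.

    Assumes a 30-day month (720 hours) unless told otherwise.
    """
    return round(hours_per_month * (1 - sla_percent / 100), 2)


if __name__ == "__main__":
    for sla in (99.0, 99.5, 99.9):
        print(f"{sla}% -> {allowed_downtime_hours(sla)} hours/month")
```

By the same formula, the 99.9% availability SLO corresponds to about 0.72 hours (roughly 43 minutes) per month.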
System Architecture
┌──────────────────────────────────────────┐
│         ORCHESTRATOR AGENT               │
│  (Workflow coordination & routing)       │
└────────────┬─────────────────────────────┘
             │
     ┌───────┴────────┬────────┬─────────┬──────────┐
     │                │        │         │          │
┌────▼─────┐   ┌─────▼────┐ ┌─▼──────┐ ┌▼────────┐ ┌▼────────┐
│ PLANNER  │   │GUARDRAIL │ │ INTAKE │ │VALIDATOR│ │QUESTION │
│  AGENT   │   │  AGENT   │ │ AGENT  │ │  AGENT  │ │  AGENT  │
│          │   │          │ │        │ │         │ │         │
│ Task     │   │ PHI      │ │Extract │ │Check    │ │Generate │
│ Decomp   │   │ Detect   │ │Fields  │ │Complete │ │Follow-up│
└──────────┘   └──────────┘ └────────┘ └─────────┘ └─────────┘
     │                │          │          │           │
     └────────────────┴──────────┴──────────┴───────────┘
                       │
                  ┌────▼─────┐
                  │   FHIR   │
                  │  ADAPTER │
                  └────┬─────┘
                       │
                  ┌────▼─────┐
                  │ Epic EHR │
                  └──────────┘

Agent Collaboration Flow

1. Orchestrator: Receives patient text, initializes workflow state, routes to Planner
2. Planner: Analyzes text complexity, decomposes into subtasks, selects agent sequence → Returns plan to Orchestrator
3. Guardrail: Scans for PHI entities, redacts sensitive data, checks safety rules → Returns sanitized text + PHI flags
4. Intake Executor: Extracts 47 structured fields via LLM with RAG context → Returns JSON + confidence scores
5. Validation Evaluator: Checks completeness against EHR schema, validates data quality → Returns gap list + quality score
6. Orchestrator: Decision: If complete (100%) → Route to FHIR Adapter. If incomplete → Route to Question Agent
7a. Question Agent: If incomplete: Generate 5 contextual follow-up questions based on gaps → Loop back to Orchestrator → Wait for patient answers
7b. FHIR Adapter: If complete: Transform JSON to FHIR R4 Bundle → Send to Epic/Cerner → Receive MRN confirmation
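
The orchestrator's decision at steps 6 and 7 amounts to a bounded loop: route to the FHIR Adapter when nothing is missing, otherwise ask follow-up questions, and stop looping after 3 iterations. The sketch below uses plain Python rather than LangGraph. `WorkflowState`, `route`, and the `ask_patient` callback are hypothetical names, and "missing" is simplified to "field absent or empty".

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_ITERATIONS = 3   # circuit breaker: stop looping after 3 question rounds
MAX_QUESTIONS = 5    # at most 5 follow-up questions per round


@dataclass
class WorkflowState:
    """Workflow state the orchestrator keeps so a run can be resumed."""
    fields: dict
    iterations: int = 0
    history: list = field(default_factory=list)


def route(
    state: WorkflowState,
    required_fields: list[str],
    ask_patient: Callable[[list[str]], dict],
) -> str:
    """Loop until the intake is complete or the circuit breaker trips."""
    while True:
        missing = [name for name in required_fields if not state.fields.get(name)]
        if not missing:
            state.history.append("fhir_adapter")
            return "fhir_adapter"
        if state.iterations >= MAX_ITERATIONS:
            state.history.append("human_review")
            return "human_review"  # escalate instead of looping forever
        state.iterations += 1
        state.history.append("question_agent")
        # Hypothetical callback: asks about the gaps, returns {field: answer}.
        state.fields.update(ask_patient(missing[:MAX_QUESTIONS]))
```

Keeping `iterations` and `history` on the state object is what allows a failed run to be resumed or audited later; the 30-second timeout would wrap this loop from the outside.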

🎭Agent Types

Reactive Agent

Low (Level 1)

Intake Agent - Responds to input, returns output. No memory.

Stateless

Reflexive Agent

Medium (Level 2)

Validation Agent - Uses rules + context. Limited decision-making.

Reads context, no persistent state

Deliberative Agent

High (Level 3)

Question Agent - Plans questions based on gaps. Reasons about what to ask.

Stateful (remembers previous questions)

Orchestrator Agent

Highest (Level 4)

Orchestrator - Makes routing decisions, handles loops, manages workflow state.

Full state management (workflow history, retry counts, etc.)

Levels of Agent Autonomy

L1: Tool (No Autonomy)
Human calls, agent responds. No decisions.
→ Monday's prompts - human copies, pastes, reads output

L2: Chained Tools (Sequential)
Pre-defined sequence. No branching.
→ Tuesday's code - extract → validate → question (always in order)

L3: Agent (Decision-Making)
Makes routing decisions. Can loop.
→ Orchestrator - routes based on completeness, retries on failure

L4: Multi-Agent (Collaborative)
Agents collaborate autonomously. Emergent behavior.
→ This system - agents call each other, negotiate, recover from failures

RAG vs Fine-Tuning Decision

Hallucination Detection

Evaluation Framework

Dataset Curation

Agentic RAG

Prompt Versioning

Complete Tech Stack

Frontend: Next.js 14, React, TypeScript, Tailwind CSS
Backend: Node.js, Express, Python (agent workers)
LLMs: Claude 3.5 Sonnet (primary), GPT-4 Turbo (backup), Gemini Pro (tertiary)
Orchestration: LangGraph, LangChain
Database: PostgreSQL (Aurora), pgvector extension
Cache: Redis Cluster
Vector Store: Pinecone or Weaviate
Queue: AWS SQS, SNS (pub/sub)
Compute: ECS Fargate (startup), EKS (enterprise)
PHI Detection: AWS Comprehend Medical
EHR Integration: HAPI FHIR, Mirth Connect
Monitoring: Datadog (enterprise), CloudWatch (startup)
Logging: ELK Stack (Elasticsearch, Logstash, Kibana)
Tracing: AWS X-Ray, Jaeger
Security: AWS WAF, AWS KMS, HashiCorp Vault
Auth: Okta (SSO), Auth0 (patient portal)
CI/CD: GitHub Actions, ArgoCD

© 2026 Randeep Bhatia. All Rights Reserved.

No part of this content may be reproduced, distributed, or transmitted in any form without prior written permission.