
Investment Analysis System Architecture 🏗️

From 100 to 10,000 deals/day with multi-agent ML orchestration

September 4, 2025
19 min read
🏢 Real Estate Tech · 🏗️ Production Architecture · 🤖 Multi-Agent AI · 📊 ML Pipelines
🎯 This Week's Journey

From prompts to production investment platform.

Monday: 3 core prompts for deal analysis. Tuesday: automation code. Wednesday: team workflows. Thursday: complete system architecture, covering multi-agent orchestration, ML pipelines, data engineering, and scaling from 100 deals/day to enterprise volume (10,000+) with compliance.

📋

Key Assumptions

1. Analyze 100-10,000 deals per day across residential, commercial, and multi-family properties
2. Integrate with MLS feeds, public records, and 3rd-party data providers (Zillow, CoStar, etc.)
3. Support multiple user roles: analysts, underwriters, portfolio managers, investors
4. Compliance with SOC2, data privacy regulations, and financial services standards
5. Real-time analysis required for competitive markets; batch acceptable for portfolio review

System Requirements

Functional

  • Ingest property data from MLS, public records, APIs, and manual uploads
  • Extract 50+ features: location metrics, financials, market trends, risk factors
  • Generate investment memos with comps analysis, cash flow projections, and risk assessment
  • Support multi-property portfolio analysis with correlation and diversification metrics
  • Provide explainable AI outputs with source citations and confidence scores
  • Enable human-in-the-loop review workflows with approval gates
  • Track analysis history and model versioning for audit trails

Non-Functional (SLOs)

  • Latency (p95): 5,000 ms
  • Freshness: 15 min
  • Availability: 99.5%
  • Accuracy target: 0.92
  • Hallucination rate (max): 0.01

💰 Cost Targets: $0.50 per deal analysis; $200 per user per month; infrastructure $800/month (startup) or $5,000/month (enterprise)

Agent Layer

planner

L4

Decompose deal analysis into sub-tasks and orchestrate agent execution

🔧 task_decomposer, dependency_resolver, agent_router

⚡ Recovery: If task decomposition fails → use default template; if agent unavailable → mark task as pending, retry with backoff
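
The planner's dependency resolution and its "use default template" recovery path can be sketched with the standard library. This is a minimal illustration, not the article's implementation: the task names and their dependencies in `DEFAULT_TEMPLATE` are assumptions that mirror the agents listed in this section.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical default template: each task maps to the tasks it depends on.
DEFAULT_TEMPLATE = {
    "data_ingestion": set(),
    "analysis": {"data_ingestion"},
    "comps": {"data_ingestion"},
    "risk": {"data_ingestion"},
    "memo": {"analysis", "comps", "risk"},
    "evaluate": {"memo"},
    "guardrail": {"evaluate"},
}


def plan(task_graph=None):
    """Return an execution order that respects task dependencies."""
    try:
        return list(TopologicalSorter(task_graph or DEFAULT_TEMPLATE).static_order())
    except CycleError:
        # Recovery path: the decomposition is unusable, so use the default template.
        return list(TopologicalSorter(DEFAULT_TEMPLATE).static_order())


order = plan()
print(order)
```

Tasks with no mutual dependency (analysis, comps, risk in this sketch) could be dispatched in parallel once data ingestion finishes.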

data_ingestion

L2

Fetch property data from MLS, public records, and 3rd-party APIs

🔧 mls_adapter, public_records_scraper, zillow_api, costar_api, data_validator

⚡ Recovery: If API timeout → retry 3x with exponential backoff; if data incomplete → flag for manual review, proceed with available data; if all sources fail → return error, do not proceed
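
A minimal sketch of this recovery policy, assuming each source is wrapped in a zero-argument callable. `SourceTimeout`, `fetch_with_retry`, and `ingest` are hypothetical names, and the 0.5 s base delay is an illustrative choice rather than a value from the article.

```python
import time


class SourceTimeout(Exception):
    """Raised by a (hypothetical) source adapter when a call times out."""


def fetch_with_retry(fetch, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(); on timeout retry up to `retries` times, doubling the delay."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except SourceTimeout:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


def ingest(sources, sleep=time.sleep):
    """Query every source; keep partial data, fail only if all sources fail."""
    data, failed = {}, []
    for name, fetch in sources.items():
        try:
            data[name] = fetch_with_retry(fetch, sleep=sleep)
        except SourceTimeout:
            failed.append(name)
    if not data:
        raise RuntimeError("all data sources failed")
    # Incomplete data is flagged for manual review but analysis proceeds.
    return {"data": data, "failed": failed, "needs_manual_review": bool(failed)}
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting.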

analysis_executor

L3

Run financial analysis: cash flow, ROI, cap rate, IRR projections

🔧 cash_flow_calculator, roi_calculator, cap_rate_calculator, irr_calculator, sensitivity_analyzer

⚡ Recovery: If calculation error → use conservative default assumptions; if missing data → impute with market averages, flag uncertainty
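
The calculators named above reduce to a few standard formulas. The sketch below is illustrative only; the function names are assumptions, and the IRR solver uses simple bisection, which presumes the net present value changes sign exactly once between `lo` and `hi`.

```python
def cap_rate(noi, price):
    """Capitalization rate: net operating income divided by purchase price."""
    return noi / price


def cash_on_cash(annual_pre_tax_cash_flow, cash_invested):
    """Cash-on-cash return: annual pre-tax cash flow over cash invested."""
    return annual_pre_tax_cash_flow / cash_invested


def npv(rate, cash_flows):
    """Net present value; cash_flows[0] is the initial (usually negative) outlay."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, lo=-0.99, hi=1.0, tol=1e-7):
    """Internal rate of return by bisection on NPV(rate) = 0."""
    if npv(lo, cash_flows) * npv(hi, cash_flows) > 0:
        raise ValueError("IRR is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(lo, cash_flows) * npv(mid, cash_flows) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Example: a $500,000 property with $35,000 NOI has a 7% cap rate.
print(round(cap_rate(35_000, 500_000), 4))
```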

comp_agent

L3

Find comparable properties using vector similarity search

🔧 vector_db_search, embedding_generator, comp_ranker

⚡ Recovery: If no comps found → expand search radius 2x; if vector DB unavailable → fallback to SQL-based search (slower)
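
The radius-expansion recovery can be sketched as a loop around any search callable. `find_comps`, the 1-mile starting radius, and the 16-mile cap are assumptions made for illustration; the article only specifies doubling the radius.

```python
def find_comps(search, radius_mi=1.0, min_comps=1, max_radius_mi=16.0):
    """Double the search radius until enough comps are found or the cap is hit.

    `search` is any callable taking a radius in miles and returning a list.
    """
    while True:
        comps = search(radius_mi)
        if len(comps) >= min_comps or radius_mi >= max_radius_mi:
            return comps, radius_mi
        radius_mi *= 2


# Toy search: candidate comps identified only by their distance in miles.
distances = [3.5, 5.0, 12.0]
comps, used_radius = find_comps(lambda r: [d for d in distances if d <= r])
print(comps, used_radius)  # [3.5] 4.0
```

Returning the radius actually used lets the memo state how far the search had to reach, which feeds the confidence score.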

risk_agent

L3

Assess market, property, and financial risks with severity scoring

🔧 market_risk_analyzer, property_risk_analyzer, financial_risk_analyzer, risk_aggregator

⚡ Recovery: If risk model unavailable → use rule-based fallback; if data missing → flag as 'insufficient data' risk

memo_generator

L3

Generate investment memo using LLM with structured template

🔧 llm_api (GPT-4/Claude), template_engine, citation_generator

⚡ Recovery: If LLM API down → queue for retry, return cached template; if hallucination detected → regenerate with stricter prompt

evaluator

L3

Quality check: validate accuracy, completeness, and consistency

🔧 accuracy_checker, completeness_checker, consistency_checker, confidence_scorer

⚡ Recovery: If quality score < 0.7 → flag for human review; if critical issue found → block output, return error

guardrail

L4

Compliance checks: PII redaction, policy violations, safety filters

🔧 pii_detector, pii_redactor, policy_checker, safety_filter

⚡ Recovery: If PII detected → redact and log; if policy violation → block output, alert admin
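
A rough sketch of the "redact and log" step using regular expressions. The three patterns (email, US SSN, US phone) are simplified assumptions for illustration; a production guardrail would need broader coverage, and names or addresses typically require an entity-recognition model rather than regexes.

```python
import re

# Simplified, US-centric patterns; illustrative only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text):
    """Replace matches with a placeholder and return (text, findings) for logging."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[REDACTED {label}]", text)
        if count:
            findings.append((label, count))
    return text, findings


memo = "Owner: jane@example.com, 512-555-0199, SSN 123-45-6789"
clean, findings = redact(memo)
print(clean)
```

The `findings` list carries only labels and counts, so the audit log itself does not store the sensitive values.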

ML Layer

Feature Store

Update: Daily batch + real-time on-demand

  • location_walkability_score
  • neighborhood_crime_rate
  • school_district_rating
  • property_age_years
  • square_feet_per_dollar
  • days_on_market
  • price_per_sqft_vs_market
  • rental_yield_estimate
  • appreciation_rate_5yr
  • vacancy_rate_area
  • property_tax_rate
  • hoa_fees_monthly
  • renovation_cost_estimate
  • flood_zone_risk
  • earthquake_zone_risk
  • market_liquidity_score
  • comparable_sales_count_6mo
  • price_trend_12mo
  • employment_growth_rate_area
  • population_growth_rate_area
  • median_income_area
  • inventory_months_supply
  • absorption_rate
  • cap_rate_market_avg
  • cash_on_cash_return
  • debt_service_coverage_ratio
  • loan_to_value_ratio
  • internal_rate_of_return_5yr
  • net_operating_income
  • gross_rent_multiplier
  • price_to_rent_ratio
  • property_condition_score
  • energy_efficiency_rating
  • parking_spaces_count
  • lot_size_sqft
  • zoning_type
  • permits_issued_count_5yr
  • new_construction_nearby
  • transit_accessibility_score
  • retail_density_1mi
  • restaurant_density_1mi
  • hospital_distance_mi
  • airport_distance_mi
  • highway_access_distance_mi
  • waterfront_flag
  • view_quality_score
  • noise_level_estimate
  • air_quality_index
  • property_tax_trend_5yr
  • insurance_cost_estimate

Model Registry

Strategy: Semantic versioning with A/B testing for major updates

  • property_value_estimator
  • risk_classifier
  • comp_reranker
  • memo_generator
  • hallucination_detector

Observability Stack

Real-time monitoring, tracing & alerting

  • Sources: Apps, Services, Infra
  • Collection: 18 Metrics
  • Processing: Aggregate & Transform
  • Dashboards: 4 Views
  • Alerts: Enabled

📊 Metrics (18) · 📝 Logs (Structured) · 🔗 Traces (Distributed)

  • api_requests_total
  • api_latency_p50_ms
  • api_latency_p95_ms
  • api_latency_p99_ms
  • api_errors_total
  • deals_analyzed_total

Deployment Variants

🚀

Startup Architecture

Fast to deploy, cost-efficient, scales to 100 competitors

Infrastructure

Serverless-first (Lambda/Cloud Functions)
Managed services (RDS, ElastiCache, S3)
Single region (us-east-1 or us-west-2)
Pinecone managed vector DB
OpenAI API (no self-hosting)
CloudWatch/Datadog for observability
Fast to deploy (1-2 weeks)
Low operational overhead
Pay-per-use pricing
Good for 0-1,000 deals/day
Limited customization

Risks & Mitigations

⚠️ LLM API cost explosion (10x spike in usage)

Medium

✓ Mitigation: Set hard rate limits (1000 req/hour per user). Alert at 80% of budget. Use cheaper models (Gemini) for non-critical tasks. Cache aggressively (24hr TTL).
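
The per-user limit and the 80% budget alert can be sketched as follows. This is an in-memory, single-process illustration with hypothetical names; the architecture described here would more plausibly keep these counters in Redis.

```python
from collections import defaultdict, deque


class HourlyRateLimiter:
    """Sliding-window limiter: at most `limit` requests per user per window."""

    def __init__(self, limit=1000, window_s=3600):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id, now):
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def budget_alert(spend_usd, budget_usd, threshold=0.8):
    """True once spend reaches the alert threshold (80% of budget by default)."""
    return spend_usd >= threshold * budget_usd
```

Timestamps are passed in explicitly (`now`) so the limiter behaves deterministically in tests.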

⚠️ Data source reliability (Zillow API changes, MLS feed goes down)

High

✓ Mitigation: Multi-source redundancy (Zillow + Redfin + public records). Cache stale data (24hr). Graceful degradation (proceed with available data, flag uncertainty).

⚠️ Hallucination in high-stakes deal (LLM invents fake comp, user loses $100K)

Low

✓ Mitigation: Multi-layer hallucination detection (5 layers). Human review for deals >$500K. Insurance policy for AI errors. Clear disclaimers ('AI-assisted, not financial advice').

⚠️ Compliance violation (PII leak, GDPR breach)

Low

✓ Mitigation: Guardrail agent (100% coverage). PII redaction before LLM. Immutable audit logs. SOC2 Type II certification. Annual penetration testing.

⚠️ Model drift (market changes, model becomes stale)

High

✓ Mitigation: Monitor feature distributions weekly. Retrain quarterly. A/B test new models. Shadow mode for validation. Automated rollback if quality drops.
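
One common way to monitor a feature distribution is the Population Stability Index (PSI). The article does not say which drift metric it uses, so treat this as one possible approach; the 10 equal-width bins and the frequently quoted 0.25 alert level are conventions, not values from the source.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline and a recent sample.

    Bins are equal-width over the baseline range; out-of-range values fall
    into the first or last bin. Empty bins are floored to avoid log(0).
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = list(range(100))            # e.g. last quarter's days_on_market
shifted = [v + 30 for v in baseline]   # the same feature after a market shift
print(psi(baseline, baseline), psi(baseline, shifted) > 0.25)
```

A weekly job could compute this per feature and open a retraining ticket when the value crosses the chosen threshold.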

⚠️ Vendor lock-in (OpenAI changes pricing, deprecates API)

Medium

✓ Mitigation: Multi-LLM architecture (OpenAI + Anthropic + Gemini). Abstraction layer (easy to swap providers). Regularly test failover. Budget for migration.

⚠️ Scaling bottleneck (database can't handle 10K deals/day)

Medium

✓ Mitigation: Load testing at 2x expected volume. Database sharding by org_id. Read replicas for analytics. Auto-scaling workers. Circuit breakers to prevent cascading failures.
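
A circuit breaker stops calling a failing dependency for a cool-down period instead of piling retries onto it. The sketch below is a minimal single-threaded illustration with assumed names and thresholds, not the article's implementation.

```python
class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None  # time the breaker tripped, or None if closed

    def call(self, fn, now):
        if self.opened_at is not None:
            if now - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("dependency marked unhealthy")
            # Cool-down elapsed: half-open, allow a single trial call.
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now
            raise
        self.failures = 0  # any success closes the breaker fully
        return result
```

Wrapping database writes or provider calls in `breaker.call(...)` lets workers fail fast while the dependency recovers.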

🧬

Evolution Roadmap

Progressive transformation from MVP to scale

🌱 Phase 1: MVP (0-3 months; weeks 1-12)

1. Launch with 3 core agents (data, analysis, memo)
2. Support 100 deals/day
3. Single-tenant (1 organization)
4. Basic evaluation (human feedback)

🌿 Phase 2: Scale (3-6 months; months 4-6)

1. Add 3 more agents (comp, risk, evaluator)
2. Support 1,000 deals/day
3. Multi-tenant (10 organizations)
4. Automated evaluation pipeline

🌳 Phase 3: Enterprise (6-12 months; months 7-12)

1. Support 10,000+ deals/day
2. Multi-region deployment (US + EU)
3. Advanced ML (ensemble, agentic RAG)
4. 99.9% uptime SLA

🚀 Production Ready
🏗️

Complete Systems Architecture

9-layer architecture from presentation to security

1. Presentation (3 components)
  • Web Dashboard (React)
  • Mobile App (React Native)
  • API Clients (Python SDK)

2. API Gateway (4 components)
  • Load Balancer (ALB/NGINX)
  • Rate Limiter (Redis)
  • Auth Gateway (OAuth 2.0/OIDC)
  • API Versioning (v1, v2)

3. Agent Layer (8 components)
  • Planner Agent (Task Decomposition)
  • Data Ingestion Agent (Extract)
  • Analysis Agent (Executor)
  • Comp Agent (Market Comps)
  • Risk Agent (Risk Assessment)
  • Memo Agent (Report Generation)
  • Evaluator Agent (Quality Check)
  • Guardrail Agent (Compliance)

4. ML Layer (6 components)
  • Feature Store (50+ metrics)
  • Model Registry (LLMs, Rerankers)
  • Offline Training (Batch)
  • Online Inference (Real-time)
  • Evaluation Pipeline
  • Prompt Store (Versioned)

5. Integration (5 components)
  • MLS Adapter (RETS/Web API)
  • Public Records Scraper
  • 3rd-Party APIs (Zillow, CoStar)
  • Document Parser (PDFs, Images)
  • Webhook Handler

6. Data (5 components)
  • Transactional DB (PostgreSQL)
  • Time-Series DB (TimescaleDB)
  • Vector DB (Pinecone/Weaviate)
  • Data Lake (S3/GCS)
  • Cache (Redis)

7. External (4 components)
  • LLM APIs (OpenAI, Anthropic)
  • Data Providers (Zillow, CoStar)
  • Document Storage (S3)
  • Email/Notifications (SendGrid)

8. Observability (5 components)
  • Metrics (Prometheus/Datadog)
  • Logs (CloudWatch/ELK)
  • Traces (Jaeger/Honeycomb)
  • Dashboards (Grafana)
  • Alerts (PagerDuty)

9. Security (5 components)
  • IAM (RBAC)
  • Secrets Manager (Vault/KMS)
  • Audit Logs (Immutable)
  • PII Redaction
  • WAF/DDoS Protection

🔄

Request Flow - Single Deal Analysis

Automated data flow every hour

Participants: User, API Gateway, Planner Agent, Data Agent, Analysis Agent, Comp Agent, Risk Agent, Memo Agent, Evaluator, Guardrail

1. POST /analyze {address, deal_type}
2. Decompose task into steps
3. Fetch property data (MLS, records, APIs)
4. Compute 50+ derived features
5. Run financial analysis (cash flow, ROI)
6. Find comparable properties (vector search)
7. Assess risks (market, property, financial)
8. Generate investment memo (LLM)
9. Quality check (accuracy, completeness)
10. Compliance check (PII, policy violations)
11. 200 OK {memo, metrics, confidence}

Data Flow - Deal Analysis Pipeline

From property address to investment memo in 5 seconds

| Step | Component | Time | Action | Output |
|------|-----------|------|--------|--------|
| 1 | User | 0ms | Submits deal request | Address + deal type |
| 2 | API Gateway | 50ms | Authenticates and routes | Validated request |
| 3 | Planner Agent | 100ms | Decomposes into tasks | Task graph |
| 4 | Data Agent | 1600ms | Fetches from MLS, Zillow, CoStar | Raw property data (JSONB) |
| 5 | Feature Store | 2100ms | Computes 50+ derived metrics | Feature vector |
| 6 | Analysis Agent | 2600ms | Runs financial models | Cash flow, ROI, IRR |
| 7 | Comp Agent | 3100ms | Vector search for comps | Top 5 comparables |
| 8 | Risk Agent | 3600ms | Assesses risks | Risk factors + severity |
| 9 | Memo Agent | 5600ms | Generates memo (LLM) | Investment memo (markdown) |
| 10 | Evaluator | 6300ms | Quality check | Quality score + issues |
| 11 | Guardrail | 6600ms | Compliance check | Redacted memo |
| 12 | Database | 6800ms | Saves analysis + audit log | Persisted |
| 13 | User | 7000ms | Receives memo | 200 OK + memo |
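
To see where the time goes, the timings in the pipeline above can be turned into per-stage durations. This sketch assumes the millisecond values are cumulative elapsed times and that stages run strictly in sequence; neither is stated explicitly in the article.

```python
# Timings copied from the pipeline steps; read here as cumulative elapsed ms.
CUMULATIVE_MS = [
    ("User", 0), ("API Gateway", 50), ("Planner Agent", 100),
    ("Data Agent", 1600), ("Feature Store", 2100), ("Analysis Agent", 2600),
    ("Comp Agent", 3100), ("Risk Agent", 3600), ("Memo Agent", 5600),
    ("Evaluator", 6300), ("Guardrail", 6600), ("Database", 6800),
    ("User (response)", 7000),
]
P95_TARGET_MS = 5000  # latency SLO from the non-functional requirements

# Duration of each stage = its timestamp minus the previous stage's timestamp.
stage_ms = {
    name: t - prev_t
    for (name, t), (_, prev_t) in zip(CUMULATIVE_MS[1:], CUMULATIVE_MS[:-1])
}
slowest = max(stage_ms, key=stage_ms.get)
total_ms = CUMULATIVE_MS[-1][1]
over_budget_ms = total_ms - P95_TARGET_MS

print(slowest, stage_ms[slowest], total_ms, over_budget_ms)
# Memo Agent 2000 7000 2000
```

Under this reading, the LLM memo step (2,000 ms) and the data fetch (1,500 ms) dominate, and the 7,000 ms sequential total sits 2,000 ms above the 5,000 ms p95 target. Running the analysis, comp, and risk stages in parallel is one way to narrow that gap.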

1. Volume: 0-100 deals/day
Pattern: Serverless Monolith
🏗️ Architecture
  • API Gateway (AWS API Gateway or Cloud Run)
  • Serverless functions (Lambda/Cloud Functions)
  • Managed PostgreSQL (RDS/Cloud SQL)
  • Redis cache (ElastiCache/Memorystore)
  • S3/GCS for document storage
Cost & Performance: $200-400/month, 5-8 seconds p95

2. Volume: 100-1,000 deals/day
Pattern: Queue + Workers
🏗️ Architecture
  • Load balancer (ALB/Cloud Load Balancing)
  • API servers (ECS/Cloud Run)
  • Message queue (SQS/Pub/Sub)
  • Worker processes (ECS tasks/Cloud Run jobs)
  • PostgreSQL + read replicas
  • Redis for caching + rate limiting
  • Vector DB (Pinecone managed)
Cost & Performance: $800-1,200/month, 3-5 seconds p95

3. Volume: 1,000-10,000 deals/day
Pattern: Multi-Agent Orchestration
🏗️ Architecture
  • Container orchestration (ECS/GKE)
  • Agent framework (LangGraph/CrewAI)
  • Event streaming (Kafka/Kinesis)
  • Distributed cache (Redis cluster)
  • PostgreSQL sharded by org_id
  • Vector DB (self-hosted Weaviate)
  • Feature store (Feast/Tecton)
  • Model registry (MLflow)
Cost & Performance: $3,000-5,000/month, 2-4 seconds p95
Recommended

4. Volume: 10,000+ deals/day
Pattern: Enterprise Multi-Region
🏗️ Architecture
  • Kubernetes (EKS/GKE) multi-region
  • Global load balancer with geo-routing
  • Kafka cluster (MSK/Confluent)
  • PostgreSQL multi-region replication
  • Redis Cluster (multi-AZ)
  • Vector DB replicated (Weaviate cluster)
  • Multi-LLM failover (OpenAI + Anthropic + Azure)
  • Private VPC with VPN/Direct Connect
  • Dedicated KMS/HSM for secrets
Cost & Performance: $10,000+/month, 1-3 seconds p95

Key Integrations

MLS (Multiple Listing Service)

Protocol: RETS (Real Estate Transaction Standard) or Web API
Authenticate with MLS provider
Query by address or MLS ID
Parse RETS XML or JSON response
Extract property details, photos, history
Store in database with source attribution

Zillow API

Protocol: REST API
GET /GetSearchResults (address → Zillow ID)
GET /GetZestimate (valuation)
GET /GetComps (comparable properties)
Parse XML response
Cache for 24 hours (rate limit: 1000 calls/day)

CoStar API (Commercial)

Protocol: REST API
Authenticate with client credentials
POST /properties/search (filters)
GET /properties/{id} (detailed data)
Parse JSON response
Store commercial property data

Public Records (County Assessor)

Protocol: Web scraping or API (varies by county)
Scrape county assessor website (Selenium/Playwright)
Extract property tax, deed history, permits
Parse HTML tables or PDFs
Validate data quality
Store with source URL

Vector DB (Pinecone/Weaviate)

Protocol: gRPC or REST API
Generate property embedding (OpenAI ada-002)
Upsert to vector DB with metadata
Query for similar properties (cosine similarity)
Rerank results with neural reranker
Return top 5-10 comps
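
The similarity query at the heart of this flow can be illustrated without any vector database. The sketch below does a brute-force cosine-similarity ranking over an in-memory dict; the property IDs and two-dimensional vectors are toy stand-ins for real embeddings, and a managed vector DB would replace the linear scan with an approximate index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=5):
    """Return the IDs of the k vectors most similar to `query`."""
    scored = [(cosine(query, vec), pid) for pid, vec in index.items()]
    return [pid for _, pid in sorted(scored, reverse=True)[:k]]


# Toy index: property ID -> embedding (real embeddings have many more dimensions).
index = {"prop_a": [1.0, 0.0], "prop_b": [0.9, 0.1], "prop_c": [0.0, 1.0]}
print(top_k([1.0, 0.0], index, k=2))  # ['prop_a', 'prop_b']
```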

Security & Compliance

Failure Modes & Recovery

| Failure | Fallback | Impact | SLA |
|---------|----------|--------|-----|
| LLM API down (OpenAI outage) | Failover to Anthropic Claude API → If both down, queue requests for retry | Degraded: 5-10 min delay, not broken | 99.5% (tolerates 3.6 hours/month downtime) |
| Data source unavailable (Zillow API rate limit) | Use cached data (24hr stale) + flag as 'stale' → Proceed with available sources | Partial data, lower confidence score | 99.0% |
| Vector DB unavailable (Pinecone outage) | Fallback to SQL-based comp search (slower, less accurate) → Flag as 'degraded' | Slower comps (5s → 15s), lower similarity scores | 99.0% |
| Feature computation timeout | Use cached features (if available) → If not, use market averages + flag uncertainty | Lower accuracy, higher uncertainty | 99.5% |
| Quality score < 0.7 (low confidence) | Route to human review queue → Notify analyst → Block auto-publish | Manual review required (SLA: 4 hours) | 100% (safety first) |
| Guardrail detects PII in memo | Auto-redact PII → Log violation → Alert security team | Redacted output (e.g., 'Owner: [REDACTED]') | 100% (compliance critical) |
| Database write failure | Retry 3x → If fails, write to S3 (eventual consistency) → Background job syncs to DB | Analysis available but not searchable immediately | 99.9% |
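
The first failure mode (provider failover, then queueing) can be sketched as an ordered list of callables. Everything here is a stand-in: `generate_with_failover` is a hypothetical helper, and the provider functions would wrap real SDK calls in practice.

```python
def generate_with_failover(prompt, providers, retry_queue):
    """Try each (name, call) provider in order; queue the prompt if all fail."""
    for name, call in providers:
        try:
            return {"status": "ok", "provider": name, "text": call(prompt)}
        except Exception:
            continue  # provider unavailable, fall through to the next one
    retry_queue.append(prompt)
    return {"status": "queued", "provider": None, "text": None}


def primary_down(prompt):
    raise ConnectionError("simulated outage")


queue = []
result = generate_with_failover(
    "Summarize deal",
    [("primary", primary_down), ("secondary", lambda p: f"memo for: {p}")],
    queue,
)
print(result["provider"], result["status"])  # secondary ok
```

Catching every exception keeps the sketch short; a real system would likely fail over only on timeouts and 5xx responses, so that bad requests are not retried against every provider.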

RAG vs Fine-Tuning

Real estate data changes daily (new listings, price updates, market trends). RAG allows real-time updates without retraining. Fine-tuning would require weekly retrains ($5K+ each).
✅ RAG (Chosen)
Cost: $200/month (vector DB + embeddings)
Update: Real-time (new data indexed hourly)
How: Embed new listings → upsert to Pinecone
❌ Fine-Tuning
Cost: $5,000/month (weekly retrains)
Update: Weekly batch
How: Retrain GPT-4 on 100K examples
Implementation: Vector DB (Pinecone) with property embeddings (OpenAI ada-002). Retrieved during comp search and memo generation. Reranker (Cohere) improves relevance.

Hallucination Detection

LLMs hallucinate property details (fake comps, incorrect metrics, made-up risks)
  • L1 Confidence scoring: LLM returns confidence (0-1) for each claim
  • L2 Cross-reference: Validate metrics against source data (e.g., cap rate = NOI / price)
  • L3 Fact-checking: Query vector DB for cited comps, verify they exist
  • L4 Fine-tuned classifier: Trained on 5K labeled examples (hallucination vs factual)
  • L5 Human review: If hallucination score > 0.3, flag for manual review
Hallucination rate: 0.8% (target <1%). 100% of high-risk hallucinations caught before user sees them.
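
The deterministic layers (cross-reference, comp fact-checking, and the review threshold) are simple to sketch. The function names and the 2% relative tolerance are assumptions for illustration; the 0.3 review threshold comes from the list above.

```python
def cap_rate_consistent(claimed_cap_rate, noi, price, rel_tol=0.02):
    """Cross-reference: does the memo's cap rate match NOI / price?"""
    expected = noi / price
    return abs(claimed_cap_rate - expected) <= rel_tol * abs(expected)


def missing_comps(cited_comp_ids, known_comp_ids):
    """Fact-check: return cited comp IDs that do not exist in the index."""
    known = set(known_comp_ids)
    return [cid for cid in cited_comp_ids if cid not in known]


def needs_human_review(hallucination_score, threshold=0.3):
    """Final gate: scores above the threshold go to manual review."""
    return hallucination_score > threshold


# A memo claiming a 9% cap rate on $35,000 NOI and a $500,000 price fails the check.
print(cap_rate_consistent(0.09, 35_000, 500_000))  # False
```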

Evaluation Framework

  • Financial Accuracy: 96.2% (target: 95%+)
  • Comp Relevance (NDCG): 0.91 (target: 0.90+)
  • Memo Quality (Human Eval): 4.2/5.0 (target: 4.0/5.0)
  • Hallucination Rate: 0.8% (target: <1%)
  • User Feedback (Thumbs Up): 87% (target: 85%+)
Testing: Shadow mode: 500 deals analyzed in parallel (LLM vs human). Compare outputs. LLM matches human 94% of the time. Disagreements reviewed by senior underwriter.
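
For reference, NDCG (the comp relevance metric above) compares the ranking produced against the ideal ordering of the same items. This sketch uses the linear-gain form of DCG; another common variant uses a gain of 2^rel - 1, and the article does not say which one it applies.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with linear gain and a log2 position discount."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """NDCG@k for graded relevances listed in the order the system ranked them."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    best = dcg(ideal)
    return dcg(ranked) / best if best else 0.0


# Relevance grades assigned by an underwriter to the top comps, in ranked order.
print(ndcg([3, 2, 1]), ndcg([1, 2, 3]) < 1.0)  # 1.0 True
```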

Dataset Curation

1. Collect: 50K historical deals - Internal deal database + public records
2. Clean: 42K usable - Remove duplicates, incomplete data, outliers
3. Label: 10K labeled - ($50K)
4. Augment: +5K synthetic - Generate edge cases (distressed properties, unusual financing, niche markets)
5. Validate: 15K final dataset - Inter-rater reliability (Cohen's Kappa: 0.89)
15K high-quality labeled examples for training and evaluation. Updated quarterly with new deals.
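
Cohen's kappa, the inter-rater reliability statistic cited above, corrects raw agreement between two annotators for agreement expected by chance. A minimal sketch with toy labels (the label values are illustrative):

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items where both annotators agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[l] * count_b[l] for l in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)


rater_1 = ["factual", "hallucination", "factual", "hallucination"]
rater_2 = ["factual", "hallucination", "factual", "hallucination"]
print(cohens_kappa(rater_1, rater_2))  # 1.0
```

Values near 1 indicate agreement well beyond chance; a value of 0 means agreement no better than chance.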

Agentic RAG

Agent iteratively retrieves based on reasoning (not one-shot)
User queries 'multi-family in Austin' → Agent reasons 'need market trends' → RAG retrieves Austin market data → Agent reasons 'need comps' → RAG retrieves similar properties → Agent reasons 'need risk factors' → RAG retrieves crime, flood zones → Agent synthesizes memo with all context
💡 Agent decides what to retrieve next based on what it learned. More accurate than single retrieval. Handles complex queries better.
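
The control flow of this loop can be sketched with stubs. In a real system `decide_next` would be an LLM call that reasons over the context gathered so far, and each retriever would query the vector DB or a data API; here both are hard-coded stand-ins, and all names are hypothetical.

```python
def agentic_rag(query, retrievers, decide_next, max_steps=5):
    """Retrieve iteratively: ask what is needed next, fetch it, repeat."""
    context = {}
    for _ in range(max_steps):
        topic = decide_next(query, context)
        if topic is None:  # the agent judges it has enough context
            break
        context[topic] = retrievers[topic](query)
    return context


# Stub "reasoning": walk a fixed plan, skipping topics already retrieved.
PLAN = ["market_trends", "comps", "risk_factors"]


def decide_next(query, context):
    return next((t for t in PLAN if t not in context), None)


# Stub retrievers: return a string describing what would have been fetched.
retrievers = {t: (lambda q, t=t: f"{t} for {q}") for t in PLAN}

context = agentic_rag("multi-family in Austin", retrievers, decide_next)
print(list(context))  # ['market_trends', 'comps', 'risk_factors']
```

The `max_steps` cap bounds cost and latency if the agent never decides it has enough context.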

Multi-Model Ensemble

Tech Stack Summary

LLMs
OpenAI (GPT-4), Anthropic (Claude), Google (Gemini)
Orchestration
LangGraph, CrewAI, or custom Python framework
Database
PostgreSQL (transactional), TimescaleDB (time-series)
Vector DB
Pinecone (managed) or Weaviate (self-hosted)
Cache
Redis (ElastiCache/Memorystore)
Queue
AWS SQS, Google Pub/Sub, or RabbitMQ
Streaming
Apache Kafka (MSK/Confluent)
Compute
AWS Lambda, ECS, or GKE
Storage
S3/GCS (documents, logs), CloudFront/CDN (assets)
Monitoring
Datadog, Prometheus + Grafana, or CloudWatch
Security
AWS KMS, HashiCorp Vault, AWS WAF
ML Ops
MLflow (model registry), Feast (feature store), Weights & Biases (experiment tracking)

© 2026 Randeep Bhatia. All Rights Reserved.

No part of this content may be reproduced, distributed, or transmitted in any form without prior written permission.