Real Estate Listing System Architecture 🏗️

From MLS data to 50+ platform variations in 90 seconds with AI agents

July 10, 2025
🏡 Real Estate · 🏗️ Architecture · 🤖 Multi-Agent · 📊 Scalable

From MLS feed to multi-platform listings in under 2 minutes.

Earlier this week: Monday covered 3 prompts for listing variations, Tuesday an automated code pipeline, and Wednesday agent/broker workflows. Thursday covers the complete technical architecture: agent orchestration, MLS integration, variation generation, and distribution to 50+ platforms with quality checks and compliance.

Key Assumptions

  • Process 100-10,000 listings per day per brokerage
  • MLS data updates hourly via RETS/RESO API
  • Support 50+ distribution platforms (Zillow, Realtor.com, social, email, print)
  • Fair Housing Act compliance required (no discriminatory language)
  • Multi-tenant SaaS: 1,000+ brokerages, each with 10-500 agents
  • 99.9% uptime SLA for listing publication
  • GDPR/CCPA compliance for consumer data

System Requirements

Functional

  • Ingest MLS data via RETS/RESO API (hourly sync)
  • Generate 50+ platform-specific variations (web, social, email, print)
  • Validate Fair Housing compliance (no discriminatory terms)
  • Distribute to platforms via APIs (Zillow, Realtor.com, etc.)
  • Track performance metrics (views, leads, conversions)
  • Support custom branding per agent/brokerage
  • Handle image optimization and watermarking

Non-Functional (SLOs)

  • Latency (p95): 90,000 ms
  • Freshness: 60 min
  • Availability: 99.9%
  • Quality score (min): 0.95

💰 Cost Targets: $0.15 per listing, $0.003 per variation, $0.023 per GB-month of storage
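
The per-listing and per-variation targets line up when a listing gets 50 variations. A minimal sketch of that arithmetic follows; the constant and function names are illustrative, and it assumes variation generation is the only cost counted against the per-listing target.

```python
from decimal import Decimal

# Targets from the cost section above; names are illustrative.
PER_LISTING_USD = Decimal("0.15")
PER_VARIATION_USD = Decimal("0.003")


def variation_cost(variation_count: int) -> Decimal:
    """Cost of generating `variation_count` variations for one listing."""
    return PER_VARIATION_USD * variation_count


def within_listing_budget(variation_count: int) -> bool:
    """True if variation spend alone stays within the per-listing target."""
    return variation_cost(variation_count) <= PER_LISTING_USD
```

With these targets, 50 variations use the whole per-listing budget, so any other per-listing spend would need a looser target or fewer variations.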

Agent Layer

planner

L4

Decompose listing publication into tasks, select tools, route to specialized agents

🔧 task_decomposer, platform_selector, agent_router

⚡ Recovery: Retry with backoff (3x), Fallback to default plan, Alert ops team
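
One way to picture the planner's task decomposition is a dependency graph with one generate → guardrail → evaluate → publish chain per platform. The sketch below is an assumption about how such a plan could be represented, not the system's actual planner; the task naming scheme is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_plan(platforms: list[str]) -> dict[str, set[str]]:
    """Map each task to the set of tasks it depends on (hypothetical naming)."""
    graph: dict[str, set[str]] = {}
    for platform in platforms:
        graph[f"generate:{platform}"] = set()
        graph[f"guardrail:{platform}"] = {f"generate:{platform}"}
        graph[f"evaluate:{platform}"] = {f"guardrail:{platform}"}
        graph[f"publish:{platform}"] = {f"evaluate:{platform}"}
    return graph


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())
```

Because the chains for different platforms share no dependencies, a scheduler is free to run them in parallel.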

executor

L3

Execute primary workflow: generate variations, orchestrate sub-agents

🔧 template_engine, llm_api (GPT-4), image_optimizer, watermarker

⚡ Recovery: Partial success (publish what succeeded), Retry failed variations, Queue for manual review
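
The executor's recovery path (publish what succeeded, retry failures, queue the rest for manual review) can be sketched as below. The `generate` callable and the `RunResult` type are hypothetical stand-ins, and the retry count of 2 is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class RunResult:
    published: list[tuple[str, str]] = field(default_factory=list)
    manual_review: list[str] = field(default_factory=list)


def run_with_partial_success(
    platforms: list[str],
    generate: Callable[[str], str],
    max_attempts: int = 2,
) -> RunResult:
    """Keep every variation that succeeds; queue the rest for manual review."""
    result = RunResult()
    for platform in platforms:
        for _ in range(max_attempts):
            try:
                result.published.append((platform, generate(platform)))
                break  # success: stop retrying this platform
            except Exception:
                continue  # failure: try again if attempts remain
        else:
            # Loop ended without a break, so every attempt failed.
            result.manual_review.append(platform)
    return result
```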

evaluator

L2

Validate output quality, check completeness, score variations

🔧 quality_scorer, completeness_checker, style_validator

⚡ Recovery: Flag low-quality variations, Trigger regeneration, Human review queue

guardrail

L1

Policy checks, Fair Housing compliance, PII redaction, safety filters

🔧 fair_housing_checker, pii_detector, profanity_filter, legal_validator

⚡ Recovery: Block publication on violation, Alert compliance team, Log for audit
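
The rule-based part of the guardrail could look roughly like this. The phrase list is a tiny illustrative sample, not a legal or complete Fair Housing term list, and the function names are hypothetical; a real deployment would pair this with the ML classifier and human review described under Risks.

```python
import re

# Illustrative sample only; NOT a complete or authoritative term list.
ILLUSTRATIVE_FLAGGED_PHRASES = [
    "no children",
    "adults only",
    "ideal for singles",
]


def find_violations(text: str) -> list[str]:
    """Return the flagged phrases that appear as whole words in the text."""
    lowered = text.lower()
    return [
        phrase
        for phrase in ILLUSTRATIVE_FLAGGED_PHRASES
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    ]


def may_publish(text: str) -> bool:
    """Block publication when any flagged phrase is present."""
    return not find_violations(text)
```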

template

L2

Generate platform-specific variations using templates and LLMs

🔧 template_renderer, llm_api (GPT-4), image_selector, seo_optimizer

⚡ Recovery: Fallback to base template, Use cached similar listing, Generate generic version
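
The "fallback to base template" recovery can be illustrated with the standard library's `string.Template`. The template text, field names, and the `twitter` key are assumptions made for the example.

```python
from string import Template

# "$$" is a literal dollar sign; "${price}" is the placeholder.
BASE_TEMPLATE = Template("$beds bed, $baths bath home at $address. Listed at $$${price}.")

# Hypothetical per-platform overrides.
PLATFORM_TEMPLATES = {
    "twitter": Template("New listing: $address, $beds bd / $baths ba, $$${price}"),
}


def render(platform: str, listing: dict) -> str:
    """Render the platform template, falling back to the base template."""
    template = PLATFORM_TEMPLATES.get(platform, BASE_TEMPLATE)
    try:
        return template.substitute(listing)
    except KeyError:
        # A field the platform template needs is missing: use the base
        # template and leave any unknown placeholders untouched.
        return BASE_TEMPLATE.safe_substitute(listing)
```

For example, `render("unknown_platform", {...})` uses the base template because no override is registered for that platform.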

distribution

L3

Publish approved variations to 50+ platforms via APIs

🔧 zillow_api, realtor_api, social_api (FB, IG, Twitter), email_api (SendGrid), crm_sync

⚡ Recovery: Retry with exponential backoff, Queue failed platforms, Alert agent on critical failures
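
Retry with exponential backoff, as named in the recovery line above, can be sketched like this. The `publish` callable stands in for any platform client, and the 3 attempts and 1-second base delay are assumptions; `sleep` is injectable so the logic can be tested without waiting.

```python
import time
from typing import Any, Callable


def publish_with_backoff(
    publish: Callable[[Any], Any],
    payload: Any,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call publish(payload), doubling the wait after each failed attempt."""
    for attempt in range(max_attempts):
        try:
            return publish(payload)
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller queue the platform
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

A caller would typically catch the final exception and move the platform onto the failed-platform queue.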

ML Layer

Feature Store

Update: Hourly for listings, daily for neighborhood stats

  • listing_price_zscore (normalized price)
  • days_on_market
  • price_per_sqft
  • neighborhood_avg_price
  • school_rating
  • walkability_score
  • recent_sales_velocity
  • agent_performance_score
  • listing_quality_score
  • image_quality_metrics
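
Two of the features above are simple enough to sketch directly. This assumes `listing_price_zscore` is normalized against a set of neighborhood prices using the population standard deviation; the source does not say which reference set or estimator is used.

```python
from statistics import mean, pstdev


def price_per_sqft(price: float, sqft: float) -> float:
    """List price divided by living area."""
    return price / sqft


def listing_price_zscore(price: float, neighborhood_prices: list[float]) -> float:
    """Standard score of a price relative to neighborhood prices."""
    sigma = pstdev(neighborhood_prices)
    if sigma == 0:
        return 0.0  # all reference prices identical: no spread to normalize by
    return (price - mean(neighborhood_prices)) / sigma
```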

Model Registry

Strategy: Semantic versioning, shadow mode for 7 days before promotion

  • gpt-4-turbo
  • quality-classifier
  • fair-housing-detector
  • image-ranker
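
The promotion rule (semantic versioning plus 7 days in shadow mode) might be checked as follows. The function names are hypothetical, and the sketch assumes plain `MAJOR.MINOR.PATCH` versions without pre-release tags.

```python
from datetime import date, timedelta

SHADOW_PERIOD = timedelta(days=7)


def parse_semver(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def can_promote(candidate: str, current: str, shadow_started: date, today: date) -> bool:
    """Promote only a newer version that has shadowed for the full period."""
    is_newer = parse_semver(candidate) > parse_semver(current)
    shadowed_long_enough = today - shadow_started >= SHADOW_PERIOD
    return is_newer and shadowed_long_enough
```

Comparing parsed tuples rather than raw strings matters: as strings, "1.10.0" would sort before "1.9.3".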

Observability

Metrics

  • 📊 listing_ingestion_rate
  • 📊 variation_generation_time_p95_ms
  • 📊 quality_score_avg
  • 📊 fair_housing_violation_rate
  • 📊 platform_publish_success_rate
  • 📊 llm_latency_p95_ms
  • 📊 llm_tokens_per_listing
  • 📊 cost_per_listing_usd
  • 📊 agent_satisfaction_score

Dashboards

  • 📈 ops_dashboard
  • 📈 ml_dashboard
  • 📈 cost_dashboard
  • 📈 agent_performance_dashboard

Traces

✅ Enabled

Deployment Variants

🚀 Startup

Infrastructure:

  • AWS Lightsail or Heroku (simple deploy)
  • Managed PostgreSQL (RDS or Heroku Postgres)
  • Redis Cloud (free tier)
  • OpenAI API (pay-as-you-go)
  • S3 + CloudFront (images)
  • Serverless workers (Lambda or Cloud Run)

Single region (us-east-1)

Single tenant (1 brokerage)

Manual onboarding

Basic monitoring (CloudWatch)

Cost: $150-500/month for 100-1K listings/day

🏢 Enterprise

Infrastructure:

  • EKS/ECS (container orchestration)
  • Aurora PostgreSQL (multi-AZ, read replicas)
  • ElastiCache Redis (cluster mode)
  • Multi-LLM (OpenAI + Anthropic + Azure fallback)
  • Dedicated VPC per tenant
  • Private networking (VPC peering)
  • BYO KMS/HSM for encryption
  • SSO/SAML integration
  • Audit trail (7-year retention)
  • Multi-region (US + EU)

Multi-tenant with tenant isolation

White-label branding

Custom SLA (99.9%+)

Dedicated support

Data residency compliance (GDPR, CCPA)

Cost: $8K-20K/month for 10K+ listings/day

📈 Migration: Start with startup stack. At 1K listings/day, migrate to queue-based workers. At 5K listings/day, containerize and add multi-tenant support. At 10K listings/day, move to Kubernetes with multi-region and enterprise features.

Risks & Mitigations

⚠️ Fair Housing violation (discriminatory language)

Medium

✓ Mitigation: Multi-layer guardrails: rule-based filter + ML classifier + human review queue. 100% compliance required before publication. Regular audits of published listings.

⚠️ LLM hallucination (fake property features)

High

✓ Mitigation: Cross-reference all factual claims with MLS data. Validate neighborhood stats against public APIs. Flag low-confidence outputs (<0.9) for human review. A/B test variations to measure accuracy.
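
Cross-referencing factual claims against MLS data can start with the numeric fields. The sketch below pulls bed, bath, and square-footage claims out of generated text with regular expressions and compares them with the MLS record; the patterns and field names are assumptions and will miss many phrasings.

```python
import re

# Hypothetical patterns for a few numeric claims.
CLAIM_PATTERNS = {
    "beds": re.compile(r"(\d+)[\s-]*bed(?:room)?s?\b", re.IGNORECASE),
    "baths": re.compile(r"(\d+)[\s-]*bath(?:room)?s?\b", re.IGNORECASE),
    "sqft": re.compile(r"(\d[\d,]*)\s*(?:sq\.?\s*ft|sqft|square feet)", re.IGNORECASE),
}


def extract_claims(text: str) -> dict[str, int]:
    """Return the first numeric claim found for each known field."""
    claims = {}
    for name, pattern in CLAIM_PATTERNS.items():
        match = pattern.search(text)
        if match:
            claims[name] = int(match.group(1).replace(",", ""))
    return claims


def find_mismatches(text: str, mls_record: dict[str, int]) -> dict[str, tuple[int, int]]:
    """Map each contradicted field to (claimed value, MLS value)."""
    return {
        name: (claimed, mls_record[name])
        for name, claimed in extract_claims(text).items()
        if name in mls_record and mls_record[name] != claimed
    }
```

Any non-empty result would route the variation to regeneration or human review rather than publication.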

⚠️ MLS API downtime (no new listings)

Low

✓ Mitigation: Retry logic with exponential backoff. Queue for manual sync. Cache recent listings for 24 hours. Alert ops team on prolonged outage.

⚠️ Platform API rate limits (Zillow, Realtor.com)

Medium

✓ Mitigation: Implement rate limiting (5 req/sec per platform). Queue excess requests. Spread publications over time. Monitor rate limit headers. Escalate to enterprise API tier if needed.
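
A token bucket is one common way to enforce the 5 requests/second per-platform limit mentioned above. This is a minimal single-threaded sketch; the class name and the burst capacity of 5 are assumptions, and the clock is injectable for testing.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `rate_per_sec` requests per second, with bursts up to `capacity`."""

    def __init__(
        self,
        rate_per_sec: float = 5.0,
        capacity: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; otherwise the caller should queue the request."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per platform keeps a burst on one API from delaying publication to the others.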

⚠️ Cost overrun (LLM API costs)

Medium

✓ Mitigation: Set cost caps per listing ($0.15 target). Monitor token usage. Cache common variations. Use cheaper models for drafts (GPT-3.5), expensive models for final (GPT-4). Alert on cost spikes.

⚠️ Quality degradation over time (model drift)

Medium

✓ Mitigation: Weekly quality monitoring (score distribution, agent feedback). A/B test new prompts/models. Retrain classifiers quarterly. Human review sample (10%) for ground truth.

⚠️ Data breach (agent/buyer PII exposed)

Low

✓ Mitigation: Encrypt all data at rest (AES-256) and in transit (TLS 1.3). PII redaction before LLM processing. Role-based access control. Regular security audits. Incident response plan.
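
PII redaction before LLM processing could begin with pattern-based masking like the sketch below. It covers only email addresses and US-style phone numbers, and the patterns are simplified assumptions; a production `pii_detector` would need broader coverage.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Simplified US phone pattern: 555-123-4567, (555) 123-4567, 555.123.4567
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)
```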

Evolution Roadmap

Phase 1: MVP (months 0-3)

  • Launch with 1 brokerage, 10 agents
  • Support 5 platforms (Zillow, Realtor.com, email, Facebook, Instagram)
  • Generate 10 variations per listing
  • Achieve 0.90+ quality score
  • Process 100 listings/day

Phase 2: Scale (months 3-6)

  • Onboard 10 brokerages, 100 agents
  • Support 20 platforms (add Twitter, LinkedIn, SMS, print)
  • Generate 30 variations per listing
  • Achieve 0.95+ quality score
  • Process 1,000 listings/day

Phase 3: Enterprise (months 6-12)

  • Onboard 100+ brokerages, 1,000+ agents
  • Support 50+ platforms (all major real estate, social, email, print)
  • Generate 50+ variations per listing
  • Achieve 0.98+ quality score
  • Process 10,000+ listings/day
  • Multi-region (US + EU)

Complete Systems Architecture

9-layer architecture from presentation to security

  • Presentation: Agent Dashboard, Broker Admin Portal, Mobile App, Public Listing Pages
  • API Gateway: Load Balancer (ALB), Rate Limiter (per tenant), Auth (OAuth 2.0 + JWT), API Versioning
  • Agent Layer: Planner Agent, Executor Agent, Evaluator Agent, Guardrail Agent, Template Agent, Distribution Agent
  • ML Layer: Feature Store, Model Registry, Prompt Store, Evaluation Pipeline, Drift Detector
  • Integration: MLS Adapter (RETS/RESO), Platform Connectors (Zillow, etc.), CRM Sync, Image CDN
  • Data: PostgreSQL (listings, agents), Redis (cache, queue), S3 (images, documents), Vector DB (embeddings)
  • External: OpenAI API (GPT-4), MLS APIs (RETS/RESO), Zillow/Realtor.com APIs, SendGrid (email), Cloudinary (images)
  • Observability: CloudWatch Metrics, DataDog APM, Sentry Errors, ELK Logs, Grafana Dashboards
  • Security: AWS WAF, KMS (secrets), IAM Roles, Audit Trail (7yr), PII Redaction

Sequence Diagram - Listing Publication Flow

Participants: MLS, API Gateway, Planner Agent, Template Agent, Guardrail Agent, Evaluator Agent, Distribution Agent, Platforms

Messages, in order:

  1. New listing webhook
  2. Route to planner
  3. Generate 50 variations
  4. Check Fair Housing
  5. Validate quality
  6. Approved variations
  7. Publish to 50 platforms
  8. Confirmation

Real Estate Listing System - Agent Orchestration

6 Components: Master Orchestrator (4 capabilities), Primary Workflow Agent (3 capabilities), Quality Validator (4 capabilities), Compliance Agent (4 capabilities), Content Generator (4 capabilities), Distribution Agent (4 capabilities)

Messages:

  • [RPC] Listing task
  • [RPC] Generate variations
  • [Event] Variations ready
  • [RPC] Validate compliance
  • [RPC] Check quality
  • [Event] Compliance results
  • [Event] Quality scores
  • [Event] Approved variations
  • [Event] Publication status

Real Estate Listing System - External Integrations

10 Components: Core Listing System (4 capabilities), MLS Platforms (4 capabilities), Zillow (3 capabilities), Realtor.com (3 capabilities), Social Media Platforms (4 capabilities), Agent Portal (4 capabilities), CRM Systems (4 capabilities), Property Websites (4 capabilities), Email Marketing (4 capabilities), Image CDN (4 capabilities)

Messages:

  • [REST] Property data
  • [HTTP] Listing submission
  • [WebSocket] Real-time status
  • [REST] Listing variations
  • [REST] Property feed
  • [REST] Social posts
  • [REST] Syndicated listings
  • [REST] Lead data
  • [Webhook] Contact updates
  • [Event] Campaign triggers
  • [HTTP] Media assets
  • [REST] Upload images

Data Flow

MLS → 50 platforms in 90 seconds

  1. MLS API (0s): Webhook: new listing → JSON (address, price, beds, baths, sqft, images)
  2. API Gateway (0.05s): Auth + rate limit → Validated request
  3. Planner Agent (0.5s): Generate task plan → DAG (50 platform tasks)
  4. Executor Agent (1s): Orchestrate workflow → Parallel task queue
  5. Template Agent (45s): Generate 50 variations → 50 text + image sets
  6. Guardrail Agent (2s): Fair Housing check → Compliance flags
  7. Evaluator Agent (3s): Quality scoring → Scores (0.92-0.98)
  8. Distribution Agent (20s): Publish to platforms → API calls (50 parallel)
  9. Analytics Store (1s): Log metrics → Success/failure per platform
  10. Agent Dashboard (90s total): Notify agent → Publication report
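
As a quick check of the latency budget, the per-step durations listed above can be summed and compared with the stated 90-second total. The step keys are illustrative; the numbers are the ones given in the data flow.

```python
from math import fsum

# Per-step durations in seconds, as listed in the data flow.
STEP_SECONDS = {
    "mls_api": 0.0,
    "api_gateway": 0.05,
    "planner_agent": 0.5,
    "executor_agent": 1.0,
    "template_agent": 45.0,
    "guardrail_agent": 2.0,
    "evaluator_agent": 3.0,
    "distribution_agent": 20.0,
    "analytics_store": 1.0,
}
STATED_TOTAL_SECONDS = 90.0

attributed = fsum(STEP_SECONDS.values())         # about 72.55 s
unattributed = STATED_TOTAL_SECONDS - attributed  # about 17.45 s
slowest_step = max(STEP_SECONDS, key=STEP_SECONDS.get)
```

The listed steps account for about 72.55 s, so roughly 17 s of the 90 s total is not attributed to any step; the source does not say where it goes (queueing and network overhead are plausible). Variation generation dominates the budget, which makes it the first place to look for latency savings.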

Scaling Patterns

0-100 listings/day: Monolith + Queue

  • Architecture: Single API server (Node.js), Redis queue, PostgreSQL, OpenAI API, S3 for images
  • Cost: $150/month
  • Processing time: 90-120s per listing

100-1,000 listings/day: Queue + Workers

  • Architecture: API server (load balanced), SQS queue, Worker pool (3-5 instances), RDS PostgreSQL, ElastiCache Redis, CloudFront CDN
  • Cost: $500/month
  • Processing time: 60-90s per listing

1,000-10,000 listings/day: Multi-Agent Orchestration

  • Architecture: ECS Fargate (auto-scale), EventBridge + Lambda, Aurora PostgreSQL (multi-AZ), ElastiCache cluster, S3 + CloudFront, LangGraph orchestration
  • Cost: $2,000/month
  • Processing time: 45-60s per listing

10,000+ listings/day: Enterprise Multi-Region

  • Architecture: EKS Kubernetes (multi-region), Kafka event streaming, Aurora Global Database, Redis Enterprise, Multi-LLM failover (OpenAI, Anthropic, Azure), Dedicated VPC per tenant
  • Cost: $8,000+/month
  • Processing time: 30-45s per listing
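
The volume tiers above can be turned into a simple lookup. The listed ranges share their boundary values (100 appears in two tiers), so this sketch assumes a volume exactly on a boundary stays in the lower tier; that choice is an assumption, not something the source specifies.

```python
from bisect import bisect_left

# Upper bounds (listings/day) of the first three tiers; the last tier is open-ended.
TIER_UPPER_BOUNDS = [100, 1_000, 10_000]
TIER_PATTERNS = [
    "Monolith + Queue",
    "Queue + Workers",
    "Multi-Agent Orchestration",
    "Enterprise Multi-Region",
]


def scaling_pattern(listings_per_day: int) -> str:
    """Return the scaling pattern for a daily listing volume."""
    if listings_per_day < 0:
        raise ValueError("listings_per_day must be non-negative")
    return TIER_PATTERNS[bisect_left(TIER_UPPER_BOUNDS, listings_per_day)]
```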

Key Integrations

MLS Integration (RETS/RESO)

Protocol: RETS 1.8 or RESO Web API
Poll MLS API hourly for new/updated listings
Parse RETS/RESO XML/JSON
Map to internal schema
Trigger listing pipeline
Update status back to MLS
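
The "map to internal schema" step might look like the sketch below for a RESO Web API JSON record. The left-hand field names come from the RESO Data Dictionary; the internal field names on the right are assumptions chosen to match the fields in the data flow.

```python
# RESO Data Dictionary field -> hypothetical internal field name.
RESO_TO_INTERNAL = {
    "ListingKey": "listing_id",
    "ListPrice": "price",
    "BedroomsTotal": "beds",
    "BathroomsTotalInteger": "baths",
    "LivingArea": "sqft",
    "UnparsedAddress": "address",
}


def map_listing(reso_record: dict) -> dict:
    """Translate a RESO record into the internal schema, rejecting incomplete records."""
    missing = [name for name in RESO_TO_INTERNAL if name not in reso_record]
    if missing:
        raise ValueError(f"missing RESO fields: {', '.join(missing)}")
    return {internal: reso_record[reso] for reso, internal in RESO_TO_INTERNAL.items()}
```

Rejecting incomplete records at ingestion keeps missing fields from surfacing later as gaps in generated listings.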

Zillow/Realtor.com APIs

Protocol: REST API
Format listing per platform schema
POST to platform API
Receive external listing ID
Store mapping (internal ID ↔ external ID)
Poll for status updates

CRM Sync (Salesforce, HubSpot)

Protocol: REST API + webhooks
Sync agent/brokerage data
Log lead conversions
Update listing status
Trigger follow-up workflows

Image CDN (Cloudinary)

Protocol: REST API + upload SDK
Upload original images to Cloudinary
Apply transformations (resize, watermark, format)
Generate CDN URLs
Use in variations

Failure Modes & Fallbacks

Failure: OpenAI API down
  • Fallback: Switch to Anthropic Claude (failover LLM)
  • Impact: Slight quality variation, no downtime
  • SLA: 99.9%

Failure: MLS API timeout
  • Fallback: Retry 3x with exponential backoff, then queue for manual sync
  • Impact: Delayed listing ingestion (up to 1 hour)
  • SLA: 99.5%

Failure: Fair Housing violation detected
  • Fallback: Block publication, alert agent, suggest edits
  • Impact: Listing not published until fixed
  • SLA: 100% compliance

Failure: Platform API rate limit (Zillow)
  • Fallback: Queue for later, spread requests over time
  • Impact: Delayed publication (up to 30 min)
  • SLA: 99.0%

Failure: Database unavailable
  • Fallback: Read from replica, queue writes
  • Impact: Read-only mode, delayed updates
  • SLA: 99.9%

Failure: Quality score too low (<0.9)
  • Fallback: Regenerate variation, escalate to human review
  • Impact: Delayed publication, quality maintained
  • SLA: 99.5% quality

Failure: Image optimization fails
  • Fallback: Use original images, skip watermark
  • Impact: Lower image quality, no branding
  • SLA: 99.0%
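
The LLM failover row can be sketched as an ordered list of providers tried in turn. The provider callables here are hypothetical stand-ins for real client wrappers; no vendor SDK calls are shown.

```python
from typing import Callable


def generate_with_failover(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) provider in order; return (provider name, output)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Returning the provider name alongside the output lets the evaluator track the "slight quality variation" per model.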

Advanced ML/AI Patterns

Production ML beyond basic API calls

  • RAG vs Fine-Tuning
  • Hallucination Detection
  • Evaluation Framework
  • Dataset Curation
  • Agentic RAG
  • Multi-Model Ensemble

Tech Stack Summary

  • LLMs: OpenAI GPT-4 (primary), Anthropic Claude (failover), Azure OpenAI (enterprise)
  • Orchestration: LangGraph (agent framework), Temporal (workflow engine), AWS Step Functions
  • Database: PostgreSQL (listings, agents), Redis (cache, queue), Pinecone (vector DB)
  • Queue: SQS (startup), Kafka (enterprise), Redis Streams
  • Compute: ECS Fargate (startup), EKS (enterprise), Lambda (event-driven)
  • Storage: S3 (images, documents), CloudFront (CDN), Cloudinary (image optimization)
  • Monitoring: CloudWatch (metrics, logs), DataDog (APM), Sentry (errors), Grafana (dashboards)
  • Security: AWS WAF, KMS (encryption), Secrets Manager, IAM, Cognito (auth)
  • MLS Integration: RETS client (rets-client npm), RESO SDK, cron or webhooks
  • Platform APIs: Zillow API, Realtor.com API, Facebook Graph API, Twitter API, SendGrid