From prompts to production content pipeline.
Monday: 3 prompts for product descriptions. Tuesday: automated generation code. Wednesday: team workflows for content ops. Thursday: complete technical architecture. Agents, ML pipeline, SEO optimization, and CMS integration for 100K+ products daily.
Key Assumptions
- Catalog size: 1K-500K SKUs, growing 5-10% monthly
- Update frequency: New products daily, refreshes quarterly
- SEO requirements: Target keywords, readability scores, meta tags
- Brand consistency: Voice guidelines, prohibited terms, templates
- Integration: Shopify/Magento/BigCommerce or custom CMS API
System Requirements
Functional
- Generate product descriptions from attributes (title, specs, images)
- SEO optimization: keyword density, meta descriptions, alt text
- Brand voice enforcement: tone, style, prohibited words
- Bulk processing: 1K+ products in a single batch
- CMS integration: Push to Shopify, Magento, or custom API
- A/B testing: Multiple variants per product
- Quality scoring: Readability, uniqueness, keyword coverage
Non-Functional (SLOs)
Cost Targets: $0.02 per product, $15 per 1K-product batch, $500/month infrastructure
Agent Layer
planner
Autonomy level L3. Decompose the product description task into steps: template selection, content generation, SEO optimization, quality check
Tools: template_selector, keyword_research_api, brand_policy_checker
Recovery: If no template is found → fall back to the generic template. If the keyword API fails → use cached keywords from the last 24h.
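The planner's two recovery rules can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: `KeywordCache`, `select_template`, `get_keywords`, and the `"generic"` template key are hypothetical names, and the keyword fetcher is passed in as a plain callable rather than a real API client.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # "cached keywords from the last 24h"


class KeywordCache:
    """Hypothetical in-memory cache; a real system would likely use Redis."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[float, list[str]]] = {}

    def put(self, category: str, keywords: list[str], now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._store[category] = (stored_at, keywords)

    def get_fresh(self, category: str, now: Optional[float] = None) -> Optional[list[str]]:
        entry = self._store.get(category)
        if entry is None:
            return None
        stored_at, keywords = entry
        current = time.time() if now is None else now
        # Entries older than 24h are treated as missing.
        return keywords if current - stored_at <= CACHE_TTL_SECONDS else None


def select_template(category: str, templates: dict[str, str]) -> str:
    # Recovery rule 1: no category template -> generic template.
    return templates.get(category, templates["generic"])


def get_keywords(
    category: str,
    fetch: Callable[[str], list[str]],
    cache: KeywordCache,
    now: Optional[float] = None,
) -> list[str]:
    # Recovery rule 2: keyword API failure -> cached keywords, if still fresh.
    try:
        keywords = fetch(category)
    except Exception:
        cached = cache.get_fresh(category, now)
        return cached if cached is not None else []
    cache.put(category, keywords, now)
    return keywords
```

Returning an empty list when the cache is also stale is an assumption made here; the text does not say what happens in that case.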
executor
Autonomy level L2. Execute the plan: generate description, apply template, call LLM, format output
Tools: openai_api, template_engine, image_analyzer
Recovery: If the LLM times out → retry 3x with exponential backoff. If generation fails → queue for human review.
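A sketch of the executor's retry rule, assuming "retry 3x" means one initial attempt plus three retries and that timeouts surface as Python's built-in `TimeoutError`. The `call_llm` callable, the `review_queue` list, and the one-second base delay are placeholders for illustration.

```python
import time
from typing import Callable, Optional


def generate_with_retry(
    call_llm: Callable[[str], str],
    prompt: str,
    review_queue: list[str],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try the LLM call, retrying timeouts with exponential backoff.

    If every attempt times out, the prompt is queued for human review
    and None is returned.
    """
    for attempt in range(max_retries + 1):
        try:
            return call_llm(prompt)
        except TimeoutError:
            if attempt == max_retries:
                break
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** attempt)
    review_queue.append(prompt)
    return None
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Production code would usually add jitter to the delay so that parallel workers do not retry in lockstep.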
seo_agent
Autonomy level L3. Optimize description for search: keyword density, readability, meta tags, structured data
Tools: semrush_api, readability_scorer, keyword_density_analyzer
Recovery: If the SEMrush API is down → use cached keyword data. If density is too low → regenerate with keyword boost.
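The "density too low" check might look something like the sketch below. Density is defined here as keyword occurrences divided by total words, times 100; other definitions exist, so treat this as one assumption among several. The 2% floor borrows the lower end of the 2-3% range mentioned in the risk section, and `keyword_density` and `needs_keyword_boost` are illustrative names.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return keyword occurrences per 100 words (single- or multi-word keyword)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = keyword.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    # Count positions where the keyword's word sequence appears.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits / len(words)


def needs_keyword_boost(text: str, keyword: str, min_pct: float = 2.0) -> bool:
    # Assumed threshold: below min_pct the description is regenerated
    # with extra emphasis on the keyword.
    return keyword_density(text, keyword) < min_pct
```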
evaluator
Autonomy level L4. Validate output quality: brand voice, readability, uniqueness, keyword coverage
Tools: plagiarism_checker, brand_voice_classifier, readability_api
Recovery: If score < 80 → trigger regeneration with feedback. If plagiarism is detected → block and flag.
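One way to picture the evaluator's loop is shown below. The 80-point threshold comes from the recovery rule above; the cap of two regeneration rounds, the `Evaluation` record, and the three status strings are assumptions added for the sketch.

```python
from dataclasses import dataclass
from typing import Callable

QUALITY_THRESHOLD = 80  # from "If score < 80"


@dataclass
class Evaluation:
    score: float
    feedback: str
    plagiarized: bool = False


def evaluate_and_regenerate(
    draft: str,
    evaluate: Callable[[str], Evaluation],
    regenerate: Callable[[str, str], str],
    max_rounds: int = 2,  # assumed cap so the loop always terminates
) -> tuple[str, str]:
    """Return (status, final_draft); status is approved, blocked, or needs_review."""
    for round_no in range(max_rounds + 1):
        result = evaluate(draft)
        if result.plagiarized:
            return "blocked", draft  # block and flag, never regenerate
        if result.score >= QUALITY_THRESHOLD:
            return "approved", draft
        if round_no < max_rounds:
            # Feed the evaluator's feedback into the next generation attempt.
            draft = regenerate(draft, result.feedback)
    return "needs_review", draft
```

Capping the rounds matters for cost: without it, a product whose attributes cannot support an 80+ score would regenerate indefinitely.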
guardrail
Autonomy level L4. Policy enforcement: prohibited terms, legal compliance, safety filters
Tools: prohibited_terms_checker, legal_compliance_api, toxicity_detector
Recovery: If a violation is found → block publish and queue for review. If the legal API is down → default to conservative blocking.
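The guardrail's "conservative blocking" behavior is a fail-closed pattern: when the legal check cannot run, nothing is published. A minimal sketch follows, with `find_prohibited_terms`, `guardrail_decision`, and the result dictionary shape all invented for illustration; `legal_check` stands in for whatever the legal compliance API client would be.

```python
import re
from typing import Callable


def find_prohibited_terms(text: str, blocklist: list[str]) -> list[str]:
    """Return blocklist terms that appear in text as whole words, case-insensitively."""
    found = []
    for term in blocklist:
        # Word boundaries avoid false hits such as "cure" inside "secure".
        # Note: \b only works well for terms that start and end with word characters.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def guardrail_decision(
    text: str,
    blocklist: list[str],
    legal_check: Callable[[str], bool],
) -> dict:
    violations = find_prohibited_terms(text, blocklist)
    if violations:
        return {"publish": False, "reason": "prohibited_terms", "terms": violations}
    try:
        legal_ok = legal_check(text)
    except Exception:
        # Fail closed: an unavailable legal API blocks publication.
        return {"publish": False, "reason": "legal_api_unavailable", "terms": []}
    if not legal_ok:
        return {"publish": False, "reason": "legal_violation", "terms": []}
    return {"publish": True, "reason": "ok", "terms": []}
```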
template_agent
Autonomy level L2. Select and populate category-specific templates with product data
Tools: template_db, variable_extractor, style_matcher
Recovery: If there is no category template → use the generic one. If variables are missing → mark them as optional.
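"Mark as optional" could be read as: render the template anyway, leave missing fields empty, and report which ones were skipped. The sketch below takes that reading, using Python's `str.format` placeholder syntax as a stand-in for whatever template engine is actually used; `populate_template` is a hypothetical helper.

```python
from string import Formatter


def populate_template(template: str, data: dict[str, str]) -> tuple[str, list[str]]:
    """Fill {placeholders} from data; missing ones render empty and are reported."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    fields = list(dict.fromkeys(
        name for _, name, _, _ in Formatter().parse(template) if name
    ))
    missing = [f for f in fields if data.get(f) in (None, "")]
    values = {f: ("" if f in missing else data[f]) for f in fields}
    # Collapse whitespace left behind by empty placeholders.
    text = " ".join(template.format(**values).split())
    return text, missing
```

The returned `missing` list gives downstream agents (for example the evaluator) a way to lower the quality score or route the product for review when too many fields were absent.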
ML Layer
Feature Store
Update: Daily batch + real-time on product update
- product_attribute_embeddings
- historical_conversion_rate
- category_avg_quality_score
- brand_voice_vector
- competitor_keyword_density
- seasonal_keyword_trends
Model Registry
Strategy: Blue-green deployment with 10% canary
- gpt-4o-mini
- brand_voice_classifier
- readability_scorer
- keyword_ranker
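A 10% canary is commonly implemented by hashing a stable identifier into 100 buckets and sending the lowest buckets to the new model version. The sketch below assumes routing by product ID; `routes_to_canary` is an illustrative name, and the text does not say what key the real system routes on.

```python
import hashlib


def routes_to_canary(product_id: str, canary_percent: int = 10) -> bool:
    """Deterministically assign roughly canary_percent of IDs to the canary model."""
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < canary_percent
```

Hashing rather than random sampling means a given product always sees the same model version during a rollout, which keeps A/B comparisons and regeneration behavior consistent.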
Observability
Metrics
- generation_success_rate
- llm_latency_p95_ms
- quality_score_avg
- approval_rate_percent
- regeneration_rate_percent
- cost_per_product_usd
- seo_score_avg
- keyword_density_avg
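Several of these metrics can be derived from per-request records. The sketch below computes four of them, using the nearest-rank method for p95 and dividing total cost by successful generations; both choices, and the record field names (`latency_ms`, `success`, `regenerated`, `cost_usd`), are assumptions rather than the system's actual definitions.

```python
import math


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(records: list[dict]) -> dict:
    """Aggregate per-request records into dashboard metrics."""
    total = len(records)
    successes = sum(1 for r in records if r["success"])
    regenerated = sum(1 for r in records if r["regenerated"])
    return {
        "generation_success_rate": successes / total,
        "llm_latency_p95_ms": percentile_nearest_rank(
            [r["latency_ms"] for r in records], 95
        ),
        "regeneration_rate_percent": 100.0 * regenerated / total,
        # Failed attempts still cost money, so they count toward the numerator.
        "cost_per_product_usd": sum(r["cost_usd"] for r in records) / max(1, successes),
    }
```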
Dashboards
- ops_dashboard
- ml_dashboard
- cost_dashboard
- quality_dashboard
Traces
Enabled
Deployment Variants
Startup
Infrastructure:
- Vercel/Netlify for frontend
- Serverless functions (Lambda/Cloud Run)
- Managed PostgreSQL (Supabase/Neon)
- Redis Cloud
- OpenAI API (pay-as-you-go)
- Shopify integration (OAuth app)
✓ Quick to ship, low upfront cost
✓ Auto-scaling with serverless
✓ Managed services reduce ops burden
✓ Cost: ~$100-500/mo depending on volume
Enterprise
Infrastructure:
- Kubernetes (EKS/GKE) for control plane
- VPC isolation + private subnets
- Aurora PostgreSQL (multi-AZ)
- Redis Cluster (ElastiCache)
- BYO LLM (self-hosted or Azure OpenAI)
- KMS/HSM for encryption
- SSO/SAML integration
- Multi-region deployment
- Dedicated support + SLA
✓ Full control over infrastructure
✓ Data residency and compliance support (GDPR, SOC 2)
✓ Private networking with no public internet exposure
✓ Cost: $5K-20K/mo depending on scale
Migration: Start with the startup stack. Migrate to enterprise when: (1) volume exceeds 10K products/day, (2) data residency is needed, (3) a 99.9% SLA is required, (4) a custom LLM or private deployment is needed. Migration path: lift-and-shift to containers → VPC setup → multi-region replication → BYO LLM.
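The four migration criteria can be encoded as a simple check. This sketch assumes that meeting any single criterion is enough to start planning the move, which the text does not state explicitly; the `Workload` fields and `migration_triggers` function are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    products_per_day: int
    needs_data_residency: bool = False
    required_sla_percent: float = 99.0
    needs_private_llm: bool = False


def migration_triggers(w: Workload) -> list[str]:
    """Return the enterprise-migration criteria this workload meets."""
    reasons = []
    if w.products_per_day > 10_000:
        reasons.append("volume > 10K products/day")
    if w.needs_data_residency:
        reasons.append("data residency")
    if w.required_sla_percent >= 99.9:
        reasons.append("99.9% SLA")
    if w.needs_private_llm:
        reasons.append("custom LLM or private deployment")
    return reasons
```

An empty list means the startup stack is still adequate under these criteria.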
Risks & Mitigations
LLM generates off-brand content
Risk level: Medium. Mitigation: multi-layer validation with a brand voice classifier (94% accuracy), a human review queue for low-confidence outputs, and regular fine-tuning on approved content.
Hallucinated product features
Risk level: Medium. Mitigation: attribute validation against the product DB, a fact-checking layer, confidence scoring, and 100% human review for high-value products (>$500).
SEO keyword stuffing (Google penalty)
Risk level: Low. Mitigation: keyword density limits (2-3%), readability scoring (Flesch Reading Ease > 60), and A/B testing against a human-written baseline.
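A readability threshold above 60 only makes sense on the 0-100 Flesch Reading Ease scale, so the sketch below assumes that is the score meant. The formula is the standard one; the syllable counter is a crude vowel-group heuristic chosen for brevity, so scores will differ somewhat from dedicated readability libraries.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def passes_readability(text: str, minimum: float = 60.0) -> bool:
    return flesch_reading_ease(text) > minimum
```

Higher scores mean easier text, so short sentences with short words pass and dense, polysyllabic copy fails.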
API rate limits (OpenAI, Shopify)
Risk level: High. Mitigation: multi-LLM failover, exponential backoff, queue-based retry, a rate limiter (10 req/sec), and caching for 24h.
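The 10 req/sec limiter could be a token bucket, sketched below. The choice of algorithm and the burst capacity of 10 are assumptions; the text only gives the rate. The clock is injectable so the behavior can be checked without real waiting.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` requests in a burst, refilled at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float = 10.0,
        capacity: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

A caller that gets `False` would typically push the request back onto the retry queue rather than spin. Note that this instance is per-process; workers sharing one API key would need a shared limiter, for example in Redis.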
Cost overruns (LLM API costs)
Risk level: Medium. Mitigation: cost tracking per product, alerts at $1K/day, auto-throttle at $5K/day, monthly budget caps, and cheaper models for low-priority products.
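The alert and throttle thresholds might be enforced with a small daily accumulator like the one below. `DailyCostGuard` and its status strings are hypothetical, and resetting the counter at the day boundary is left to the caller.

```python
ALERT_USD_PER_DAY = 1_000.0     # "alerts at $1K/day"
THROTTLE_USD_PER_DAY = 5_000.0  # "auto-throttle at $5K/day"


class DailyCostGuard:
    """Track one day's LLM spend and report ok, alert, or throttle."""

    def __init__(self) -> None:
        self.spent_usd = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a request's cost and return the resulting status."""
        self.spent_usd += cost_usd
        return self.status()

    def status(self) -> str:
        if self.spent_usd >= THROTTLE_USD_PER_DAY:
            return "throttle"
        if self.spent_usd >= ALERT_USD_PER_DAY:
            return "alert"
        return "ok"

    def reset(self) -> None:
        # Assumed to be called once per day by a scheduler.
        self.spent_usd = 0.0
```

For scale: at the stated $0.02 per product target, the $1K alert corresponds to roughly 50,000 products in one day, so hitting it at lower volumes suggests runaway regeneration or a pricing change.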
Data privacy violation (PII in descriptions)
Risk level: Low. Mitigation: PII detection and redaction, no customer data in prompts, audit logs (2-year retention), and privacy-by-design.
Competitor trademark infringement
Risk level: Medium. Mitigation: trademark database check, prohibited terms blocklist (500+ brands), legal review for high-risk categories, and guardrail agent enforcement.
Evolution Roadmap
Phase 1: MVP (0-3 months)
Q1 2025
- Launch with 3 core agents (Executor, Evaluator, Guardrail)
- Shopify integration only
- 100-500 products/day capacity
- 90% quality score target
Phase 2: Scale (3-6 months)
Q2 2025
- Add Planner, SEO, Template agents
- Multi-platform support (Magento, BigCommerce)
- 1K-10K products/day capacity
- A/B testing framework
- 95% approval rate
Phase 3: Enterprise (6-12 months)
Q3-Q4 2025
- 10K-100K products/day capacity
- Multi-region deployment
- 99.9% SLA
- Custom LLM support
- Enterprise security (SSO, RBAC, audit)
Complete Systems Architecture
End-to-end layer view
Sequence Diagram - Product Description Flow
E-commerce Product Description - Agent Orchestration
7 Components
E-commerce Product Description - External Integrations
9 Components
Data Flow - Product to Published Description
End-to-end flow in 4.5 seconds
Scaling Patterns
Key Integrations
Shopify API
SEMrush API
OpenAI API
Image Analysis (AWS Rekognition)
Security & Compliance
Failure Modes & Fallbacks
Failure | Fallback | Impact | SLA |
---|---|---|---|
OpenAI API down | Switch to backup LLM (Anthropic Claude) or queue for retry | Degraded performance, 10% slower | 99.5% |
SEMrush API timeout | Use cached keywords from last 24h or generic keywords | Slightly lower SEO optimization | 99.0% |
Quality score < 80 | Regenerate with feedback or queue for human review | Delayed publication, maintains quality | 95% auto-approval |
Guardrail detects policy violation | Block publication, flag for review | Safety maintained, no bad content published | 100% enforcement |
Shopify API rate limit | Exponential backoff, queue remaining products | Delayed sync, eventual consistency | 99.0% |
Database connection loss | Read from replica, queue writes | Read-only mode for up to 5 minutes | 99.9% |
Template not found for category | Use generic fallback template | Less customized output | 100% coverage |
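The first row of the table (primary LLM down, switch to a backup, otherwise queue for retry) can be sketched as an ordered provider chain. The provider callables here are placeholders rather than real OpenAI or Anthropic client calls, and `generate_with_failover` is an illustrative name.

```python
from typing import Callable, Optional


def generate_with_failover(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
    retry_queue: list[str],
) -> tuple[Optional[str], Optional[str]]:
    """Try providers in order; return (provider_name, text) or queue the prompt.

    `providers` is an ordered list such as [("primary", fn), ("backup", fn)].
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            # Any provider error moves on to the next entry in the chain.
            continue
    retry_queue.append(prompt)
    return None, None
```

Returning the provider name alongside the text lets the metrics layer track how often the backup path is used, which is what the "10% slower" impact figure would be measured against.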
Multi-Agent Architecture
6 specialized agents collaborate autonomously
Planner (orchestrates all agents)
  |
  +-- Template Agent
  +-- Executor
  +-- SEO Agent
  +-- Evaluator
  +-- Guardrail
  |
  v
Shopify Adapter
Agent Collaboration Flow
Reactive Agent
Reflexive Agent
Deliberative Agent
Orchestrator Agent
Levels of Autonomy
Advanced ML/AI Patterns
Production ML engineering beyond basic API calls