The Problem
On Monday you tested the 3-prompt framework in ChatGPT and saw how transaction analysis → risk scoring → alert generation works. But here's reality: your fraud team is drowning. Each analyst reviews maybe 50 transactions per day, at about 6 minutes per transaction. With 5,000 daily transactions, you would need 100 analysts working full-time. The math doesn't work. Manual review also misses patterns: one analyst can't remember that weird $47.23 charge from 3 weeks ago that matches today's suspicious activity. Your false positive rate is 60% because humans are guessing, so real fraud slips through while your team chases ghosts.
See It Work
Watch the 3 prompts chain together automatically. This is what you'll build.
The Code
Three levels: start simple, add reliability, then scale to real-time. Pick where you are.
When to Level Up
Simple API Calls
- Basic prompt chaining (extract → risk → alert)
- Manual review of all flagged transactions
- Simple logging to files
- No caching or optimization
- Single-threaded processing
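A minimal sketch of what basic prompt chaining could look like at this level. The prompt wording, `run_chain`, and `fake_llm` are illustrative assumptions, not the framework's actual prompts or any vendor's API; `call_llm` stands in for whichever client you use.

```python
import json
from typing import Callable

# Hypothetical prompt templates; swap in the wording from your own 3-prompt framework.
EXTRACT_PROMPT = "Extract the key fields from this transaction as JSON:\n{txn}"
RISK_PROMPT = "Score the fraud risk (0-100) for these fields. Reply with a number only:\n{fields}"
ALERT_PROMPT = "Write a one-line analyst alert for risk score {score}:\n{fields}"


def run_chain(txn: dict, call_llm: Callable[[str], str]) -> dict:
    """Run extract -> risk -> alert, feeding each output into the next prompt."""
    fields = call_llm(EXTRACT_PROMPT.format(txn=json.dumps(txn)))
    score = int(call_llm(RISK_PROMPT.format(fields=fields)).strip())
    alert = call_llm(ALERT_PROMPT.format(score=score, fields=fields))
    return {"fields": fields, "risk_score": score, "alert": alert}


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call, so the sketch runs without an API key."""
    if prompt.startswith("Extract"):
        return '{"amount": 47.23, "country": "US"}'
    if prompt.startswith("Score"):
        return "82"
    return "Review: unusual $47.23 charge"


result = run_chain({"amount": 47.23, "country": "US"}, fake_llm)
```

Passing the model call in as a function keeps the chain testable offline and makes it easy to wrap with retries or caching later.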
With Error Handling & Caching
- Retry logic with exponential backoff
- Redis caching (1 hour TTL)
- Rate limiting to prevent API overload
- Structured logging (Winston)
- Rule-based fallback if AI fails
- Prometheus metrics
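Two items from this list, retry with exponential backoff and a rule-based fallback, can be sketched together. The rules and thresholds in `rule_based_score` are made-up placeholders, and `ai_score` is whatever function wraps your model call.

```python
import random
import time
from typing import Callable, Tuple


def rule_based_score(txn: dict) -> int:
    """Hypothetical fallback rules, used only when the AI call keeps failing."""
    score = 0
    if txn.get("amount", 0) > 1000:
        score += 40
    if txn.get("country") != txn.get("home_country"):
        score += 30
    return score


def score_with_retry(
    txn: dict,
    ai_score: Callable[[dict], int],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Try the AI scorer with exponential backoff, then fall back to rules."""
    for attempt in range(max_attempts):
        try:
            return ai_score(txn), "ai"
        except Exception:
            if attempt < max_attempts - 1:
                # Delay doubles on each retry (0.5s, 1s, ...) plus a little jitter.
                sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    return rule_based_score(txn), "rules"
```

Returning the source of the score (`"ai"` or `"rules"`) alongside the number makes it possible to count fallbacks in your metrics.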
Stream Processing Pipeline
- Kafka stream processing
- PostgreSQL for audit trail
- Circuit breaker pattern
- Async processing (100+ concurrent)
- Automated alert routing
- Historical pattern matching
- Real-time dashboards
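The circuit breaker pattern named in this list can be sketched in a few lines. This is a simplified, single-threaded illustration with assumed defaults (3 failures, 30-second cool-down), not a production implementation.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, func: Callable, *args, fallback=None):
        # While open, skip the call entirely until the cool-down has passed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0  # a success closes the circuit again
        return result
```

In a fraud pipeline the `fallback` would typically be a rule-based score, so transactions keep flowing while the AI service recovers.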
Multi-Agent System
- Multi-region deployment
- Specialized agents (amount, geo, behavior, merchant)
- ML model ensemble
- Auto-scaling based on load
- 99.9% uptime SLA
- Regulatory compliance automation
- Advanced analytics and reporting
- A/B testing for model improvements
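One way to picture the specialized agents is as independent scorers whose outputs are combined with weights. The two agents, their scaling, and the weights below are purely illustrative assumptions; real agents would call models rather than one-line rules.

```python
from typing import Callable, Dict

Agent = Callable[[dict], float]  # each agent returns a risk score in [0, 100]


def amount_agent(txn: dict) -> float:
    # Hypothetical scaling: $5,000 or more maps to the maximum score.
    return min(100.0, txn.get("amount", 0) / 50)


def geo_agent(txn: dict) -> float:
    # Hypothetical rule: a country mismatch is treated as high risk.
    return 80.0 if txn.get("country") != txn.get("home_country") else 10.0


def combine(txn: dict, agents: Dict[str, Agent], weights: Dict[str, float]) -> float:
    """Weighted average of all agent scores."""
    total_weight = sum(weights[name] for name in agents)
    return sum(agents[name](txn) * weights[name] for name in agents) / total_weight


agents = {"amount": amount_agent, "geo": geo_agent}
weights = {"amount": 1.0, "geo": 3.0}
score = combine({"amount": 2500, "country": "FR", "home_country": "US"}, agents, weights)
```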
Banking-Specific Gotchas
Real challenges from production fraud detection systems. Learn from our mistakes.
PCI-DSS Compliance: Never Log Full Card Numbers
Mask sensitive data BEFORE logging. Use last 4 digits only.
# ❌ WRONG - Logs full card number
logger.error(f"Transaction failed: {transaction.card_number}")
# ✅ RIGHT - Mask before logging
def mask_card(card_number: str) -> str:
    return f"****{card_number[-4:]}"
logger.error(f"Transaction failed: {mask_card(transaction.card_number)}")
Real-Time vs Batch: Don't Block Legitimate Transactions
Use async processing with timeout fallback. Approve low-risk immediately, queue high-risk for review.
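A rough sketch of the fast-path idea, assuming a cheap local pre-check (`quick_score`, hypothetical) and a plain list as the review queue. Approving on timeout and queueing for later review is one possible policy, not the only one; the status strings and the 200 ms timeout are also assumptions.

```python
import asyncio
from typing import Awaitable, Callable, List


def quick_score(txn: dict) -> int:
    """Hypothetical cheap, local pre-check that needs no model call."""
    return 10 if txn.get("amount", 0) < 100 else 60


async def process_transaction(
    txn: dict,
    analyze_fraud: Callable[[dict], Awaitable[int]],
    review_queue: List[dict],
    timeout_s: float = 0.2,
) -> str:
    # Fast path: obviously low-risk transactions never wait on the model.
    if quick_score(txn) < 30:
        return "APPROVED"
    try:
        risk_score = await asyncio.wait_for(analyze_fraud(txn), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Timeout fallback: don't block the customer, let an analyst look later.
        review_queue.append(txn)
        return "APPROVED_PENDING_REVIEW"
    if risk_score > 70:
        review_queue.append(txn)
        return "HELD_FOR_REVIEW"
    return "APPROVED"
```

The customer-facing path is bounded by `timeout_s` regardless of how long the model takes.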
# ❌ WRONG - Blocks customer while AI thinks
async def process_transaction(txn):
    risk_score = await analyze_fraud(txn)  # Takes 5-10 seconds
    if risk_score > 70:
        return "BLOCKED"
    return "APPROVED"
# ✅ RIGHT - Fast-path for low-risk, async for high-risk
False Positives Kill Customer Trust
Tune thresholds based on REAL fraud rates. Track false positive rate as closely as fraud detection rate.
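As a sketch of what a graduated response might look like: the thresholds (60 and 85), the adjustment for long-standing customers, and the action names are placeholder assumptions to be tuned on your own data. The false positive rate here is defined as flagged legitimate transactions divided by all legitimate transactions.

```python
def determine_action(risk_score: int, customer_history: dict) -> str:
    """Map a 0-100 risk score to a graduated action instead of a hard block."""
    review_at, block_at = 60, 85  # hypothetical defaults, tune on real fraud data
    # Long-standing customers get a bit more headroom before friction kicks in.
    if customer_history.get("years_active", 0) > 5:
        review_at += 10
        block_at += 5
    if risk_score >= block_at:
        return "BLOCK"
    if risk_score >= review_at:
        return "STEP_UP_AUTH"
    return "APPROVE"


def false_positive_rate(flagged_legit: int, total_legit: int) -> float:
    """Share of legitimate transactions that were flagged."""
    return flagged_legit / total_legit if total_legit else 0.0
```

For example, a score of 65 triggers step-up authentication for a one-year customer but is approved for an eight-year customer under these assumed thresholds.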
# ❌ WRONG - Blocks too aggressively
if risk_score > 50:  # 60% of transactions flagged
    return "BLOCKED"
# ✅ RIGHT - Graduated response with customer context
def determine_action(risk_score, customer_history):
    # Adjust thresholds based on customer
    if customer_history.get('years_active', 0) > 5:
Geographic Patterns Change: Don't Hardcode Risk Lists
Use ML-based geographic risk scoring. Update patterns weekly based on actual fraud data.
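A deliberately simple stand-in for the idea: rather than a trained ML model, this sketch computes a smoothed fraud rate per country over a rolling window of labelled outcomes. The window length, the prior values, and the method names are all assumptions.

```python
from datetime import date, timedelta


class GeographicRiskModel:
    """Estimate per-country risk from recent labelled outcomes, not a fixed list."""

    def __init__(self, window_days: int = 7, prior_rate: float = 0.01,
                 prior_weight: int = 100):
        self.window = timedelta(days=window_days)
        self.prior_rate = prior_rate      # assumed baseline fraud rate
        self.prior_weight = prior_weight  # how many "virtual" transactions back the prior
        self.events = []                  # (day, country, was_fraud)

    def record(self, day: date, country: str, was_fraud: bool) -> None:
        self.events.append((day, country.lower(), was_fraud))

    def risk(self, country: str, today: date) -> float:
        """Smoothed fraud rate for the country within the rolling window."""
        cutoff = today - self.window
        total = fraud = 0
        for day, c, was_fraud in self.events:
            if c == country.lower() and cutoff < day <= today:
                total += 1
                fraud += was_fraud
        return (fraud + self.prior_rate * self.prior_weight) / (total + self.prior_weight)
```

The prior keeps countries with only a handful of transactions from swinging to extreme scores, and old events age out of the window on their own.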
# ❌ WRONG - Hardcoded risk countries
HIGH_RISK_COUNTRIES = ['nigeria', 'ghana', 'philippines']
if transaction.country.lower() in HIGH_RISK_COUNTRIES:
    risk_score += 50  # Huge penalty
# ✅ RIGHT - Dynamic risk scoring based on recent fraud
class GeographicRiskModel:
Regulatory Reporting: Automate SAR Filing
Automate SAR generation and filing. Track deadlines. Never miss a report.
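Deadline tracking is the piece that is easiest to sketch without touching any filing system. The example below keeps cases in memory and assumes a 30-calendar-day filing window from detection; confirm the deadline that applies to your institution with your compliance team. `SARCase` and `SARTracker` are hypothetical names, and the threshold of 85 mirrors this section's example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List

# Assumption: 30 calendar days from detection. Verify with compliance.
FILING_WINDOW = timedelta(days=30)


@dataclass
class SARCase:
    txn_id: str
    detected_on: date
    filed: bool = False

    @property
    def due_on(self) -> date:
        return self.detected_on + FILING_WINDOW


@dataclass
class SARTracker:
    cases: List[SARCase] = field(default_factory=list)

    def open_case(self, txn_id: str, risk_score: int, detected_on: date) -> bool:
        """Open a case automatically for high-risk transactions."""
        if risk_score > 85:
            self.cases.append(SARCase(txn_id, detected_on))
            return True
        return False

    def due_within(self, today: date, days: int = 7) -> List[SARCase]:
        """Unfiled cases whose deadline is within `days` days or already passed."""
        limit = today + timedelta(days=days)
        return [c for c in self.cases if not c.filed and c.due_on <= limit]
```

Running `due_within` on a daily schedule and alerting on its output is one way to make sure no open case goes unnoticed.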
# ❌ WRONG - Manual SAR filing
if risk_score > 85:
    print("High risk transaction - someone should file a SAR")
    # (Nobody does)
# ✅ RIGHT - Automated SAR workflow
class SARManager:
    def __init__(self, db, fincen_api):
© 2026 Randeep Bhatia. All Rights Reserved.
No part of this content may be reproduced, distributed, or transmitted in any form without prior written permission.