The Problem
On Monday you tested the 3 prompts in ChatGPT. Nice! You saw how extraction → validation → risk scoring works. But here's the reality: you can't ask your associates to copy-paste 50 times per day. One lawyer spending 3 hours a day manually reviewing contracts? That's $450/day in lost billable time (at $150/hour). Multiply that across a mid-size firm and you're looking at $135,000/year just on contract admin. On top of that come the copy-paste errors that lead to missed clauses and compliance risks.
See It Work
Watch the 3 prompts chain together automatically. This is what you'll build.
The Code
Three levels: start simple, add reliability, then scale to production. Pick where you are.
When to Level Up
0-50 contracts/day
- Basic extraction → validation → questions
- Simple error handling
- Local file storage
- Manual retry on failures
50-500 contracts/day
- Exponential backoff retries
- S3 document storage
- Winston/Python logging
- Timeout handling
- Version tracking
500-2000 contracts/day
- Redis result caching (24hr TTL)
- Async/concurrent processing
- Queue-based ingestion
- Duplicate detection
- Performance monitoring
2000+ contracts/day
- Specialized extraction agents per contract type
- Multi-model routing (GPT-4 for complex, Claude for speed)
- Load balancing across API keys
- Real-time analytics dashboard
- Custom fine-tuned models for domain-specific clauses
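The "exponential backoff retries" item in the 50-500 tier can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: `retry_with_backoff`, `flaky_extract`, and the default delays are hypothetical names and values, and a real client would catch only transient errors such as timeouts or rate limits.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # assumption: a real client would catch only transient errors
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            jitter = random.uniform(0, 0.1 * base_delay)
            sleep(base_delay * (2 ** attempt) + jitter)


# Demo with a hypothetical flaky call and a fake sleep, so nothing actually waits.
calls = {'count': 0}
delays = []


def flaky_extract():
    calls['count'] += 1
    if calls['count'] < 3:
        raise TimeoutError("simulated API timeout")
    return {'status': 'ok'}


result = retry_with_backoff(flaky_extract, sleep=delays.append)
```

Passing `sleep` as a parameter keeps the helper testable: the demo records the delays instead of waiting for them.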
Legal-Specific Gotchas
Edge cases and compliance issues unique to contract automation
Version Control & Redlining
Store every version in S3 with timestamps. Use diff algorithms to highlight clause changes. Track who made edits.
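# Track contract versions
import difflib
from datetime import datetime

def track_version(contract_id, new_text, previous_text):
    version = {
        'contract_id': contract_id,
        'timestamp': datetime.now().isoformat(),
        # The source snippet is cut off here; the 'diff' field is an assumed completion.
        'diff': list(difflib.unified_diff(
            previous_text.splitlines(), new_text.splitlines(), lineterm='')),
    }
    return version
Jurisdiction-Specific Language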
Build jurisdiction-specific validation rules. Flag when governing law doesn't match expected clause language.
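The flagging step might look something like the sketch below. It assumes the extraction step returns a dict with a `governing_law` string and a `clauses` mapping of clause name to text; that shape, and the function name `validate_jurisdiction`, are hypothetical. The Delaware entries are copied from this section's rule table.

```python
def validate_jurisdiction(extracted, rules):
    """Return a list of human-readable flags for one extracted contract."""
    law = extracted.get('governing_law')
    rule = rules.get(law)
    if rule is None:
        return [f"No rules configured for governing law: {law}"]

    flags = []
    clauses = extracted.get('clauses', {})
    # Flag every clause the jurisdiction expects but the contract lacks.
    for clause in rule['required_clauses']:
        if clause not in clauses:
            flags.append(f"Missing required clause: {clause}")
    # Flag a force majeure clause that uses none of the expected keywords.
    fm_text = clauses.get('force_majeure', '').lower()
    if fm_text and not any(k in fm_text for k in rule['force_majeure_keywords']):
        flags.append("Force majeure clause lacks expected keywords")
    return flags


rules = {
    'Delaware': {
        'required_clauses': ['indemnification', 'limitation_liability'],
        'force_majeure_keywords': ['act of god', 'governmental action'],
    },
}

# Hypothetical extraction output for one contract.
extracted = {
    'governing_law': 'Delaware',
    'clauses': {
        'indemnification': 'Each party shall indemnify the other.',
        'force_majeure': 'Neither party is liable for delays caused by strikes.',
    },
}
flags = validate_jurisdiction(extracted, rules)
```

Returning plain strings keeps the flags easy to drop into a review report; a production version would more likely return structured records.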
# Jurisdiction-aware validation
jurisdiction_rules = {
    'Delaware': {
        'required_clauses': ['indemnification', 'limitation_liability'],
        'force_majeure_keywords': ['act of god', 'governmental action'],
        'liability_cap_max': 'unlimited'  # Delaware allows
    },
    'California': {
        # The source snippet is cut off here; California's rules are not shown.
    },
}
Multi-Party Contracts
Explicitly prompt for all parties and their relationships. Create party graph showing who owes what to whom.
# Extract party relationships
extraction_prompt = """Extract ALL parties from this contract.
For each party, identify:
- name
- role (e.g., 'Prime Contractor', 'Client', 'Subcontractor')
- entity_type
- obligations_to (list of other party names)
- receives_from (list of other party names)
"""
# The source prompt is cut off here.
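Once the model returns the parties, the "who owes what to whom" graph could be built along these lines. This is a sketch under assumptions: the input is a list of dicts using the field names from the prompt, and `build_party_graph`, `find_inconsistencies`, and the sample parties are hypothetical.

```python
from collections import defaultdict


def build_party_graph(parties):
    """Map each party name to the set of parties it owes obligations to."""
    graph = defaultdict(set)
    for party in parties:
        graph[party['name']].update(party.get('obligations_to', []))
    return dict(graph)


def find_inconsistencies(parties):
    """Return (debtor, creditor) pairs where the creditor does not list the debtor in receives_from."""
    receives = {p['name']: set(p.get('receives_from', [])) for p in parties}
    issues = []
    for p in parties:
        for creditor in p.get('obligations_to', []):
            if p['name'] not in receives.get(creditor, set()):
                issues.append((p['name'], creditor))
    return issues


# Hypothetical extraction output for a three-party contract.
parties = [
    {'name': 'Acme Corp', 'role': 'Client',
     'obligations_to': ['BuildCo'], 'receives_from': ['BuildCo']},
    {'name': 'BuildCo', 'role': 'Prime Contractor',
     'obligations_to': ['Acme Corp', 'PipeWorks'], 'receives_from': ['Acme Corp', 'PipeWorks']},
    {'name': 'PipeWorks', 'role': 'Subcontractor',
     'obligations_to': ['BuildCo'], 'receives_from': []},
]
graph = build_party_graph(parties)
issues = find_inconsistencies(parties)
```

The consistency check is a cheap way to catch extractions where the model listed an obligation on one side but missed it on the other.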
Nested Definitions & Cross-References
First pass: extract all defined terms. Second pass: resolve references. Build term glossary.
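# Two-pass extraction with term resolution
import re

def extract_defined_terms(contract_text):
    """Extract all defined terms from contract"""
    # Pattern: "Term" ("Defined Term") or "Term" (the "Defined Term")
    pattern = r'"([^"]+)"\s*\((?:the\s+)?"([^"]+)"\)'
    # The source snippet is cut off here; returning the matched pairs is an assumed completion.
    return re.findall(pattern, contract_text)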
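To make the two passes concrete, here is a small self-contained sketch. It assumes definitions written in the common `"Term" means ...` style, which is a different pattern from the one in the snippet above; `build_glossary` and the sample text are hypothetical.

```python
import re

# Assumed definition style: "Term" means <definition>.
DEFINITION = re.compile(r'"([^"]+)"\s+means\s+([^.]+)\.')


def build_glossary(contract_text):
    # First pass: collect every defined term and its raw definition.
    raw = {term: body.strip() for term, body in DEFINITION.findall(contract_text)}
    # Second pass: record which other defined terms each definition relies on.
    glossary = {}
    for term, body in raw.items():
        refs = [other for other in raw
                if other != term and re.search(r'\b' + re.escape(other) + r'\b', body)]
        glossary[term] = {'definition': body, 'references': sorted(refs)}
    return glossary


sample = (
    '"Services" means the consulting work described in Exhibit A. '
    '"Deliverables" means the reports produced under the Services. '
    '"Fees" means the amounts payable for the Services and the Deliverables.'
)
glossary = build_glossary(sample)
```

With the references recorded, a later step can expand nested definitions or warn when a term is used but never defined.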
Confidential Information Handling
Pre-process to redact PII and sensitive terms. Use placeholder tokens. Post-process to restore for final report.
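The redact-then-restore round trip might look like the sketch below. The two regular expressions (emails and US-style SSNs) are illustrative assumptions rather than a complete PII detector, and `redact` and `restore` are hypothetical helper names.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL = re.compile(r'[\w.+-]+@[\w-]+\.[\w.-]+')
SSN = re.compile(r'\b\d{3}-\d{2}-\d{4}\b')


def redact(text):
    """Replace sensitive values with placeholder tokens; return (redacted_text, mapping)."""
    mapping = {}

    def make_replacer(label):
        def replacer(match):
            token = f"[{label}_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token
        return replacer

    text = EMAIL.sub(make_replacer('EMAIL'), text)
    text = SSN.sub(make_replacer('SSN'), text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into text returned by the LLM."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


original = "Notices go to jane.doe@example.com (SSN 123-45-6789) within 30 days."
redacted, mapping = redact(original)
restored = restore(redacted, mapping)
```

Only `redacted` is sent to the model; `mapping` stays on your side and is applied to the model's output when building the final report.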
# Redact sensitive info before sending to LLM
import re
from typing import Dict, Tuple

class ContractRedactor:
    def __init__(self):
        self.redaction_map = {}
        self.counter = 0
Adjust Your Numbers
[Interactive calculator: ❌ Manual Process vs. ✅ AI-Automated, with the difference shown as You Save]
2026 Randeep Bhatia. All Rights Reserved.
No part of this content may be reproduced, distributed, or transmitted in any form without prior written permission.