AI Production Readiness Checklist
A systematic checklist for evaluating whether an AI feature is ready for production deployment. Covers safety and trust, quality evaluation, monitoring, user experience, and operational readiness.
Before shipping any AI feature to production, evaluate it against these dimensions.
1. Safety & Trust
- Harm potential assessed — What happens when the AI is wrong? (High/Medium/Low)
- Escalation path defined — Users can always reach a human
- Confidence thresholds set — AI acts only when its confidence is above the threshold
- Failure mode documented — Team knows what failure looks like and how to detect it
- Content guardrails tested — Prompt injection, prompt leaking, and harmful outputs tested
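The harm assessment, confidence threshold, and escalation items above can be combined into one routing rule. The following is a minimal sketch, assuming the model exposes a confidence score between 0 and 1. The `Prediction` type, the `route` function, and the threshold values are hypothetical and would need tuning against real data.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the higher the harm potential, the more
# confident the model must be before it acts without a human.
THRESHOLDS = {"low": 0.60, "medium": 0.80, "high": 0.95}


@dataclass
class Prediction:
    answer: str
    confidence: float  # assumed to be a score in [0, 1]


def route(prediction: Prediction, harm_level: str) -> str:
    """Return "ai" if the model may act, otherwise "human" (escalate)."""
    threshold = THRESHOLDS[harm_level]
    if prediction.confidence >= threshold:
        return "ai"
    return "human"
```

With these assumed values, a prediction at 0.85 confidence is handled by the AI for a medium-harm feature but escalated to a human for a high-harm one.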
2. Quality Evaluation
- Real conversations reviewed — Not just synthetic test cases
- Edge cases tagged — Unusual requests, multilingual, sarcasm, etc.
- Quality scoring defined — What does “good” look like? Rubric exists.
- Baseline measured — Current non-AI performance documented
- A/B test plan ready — Statistical significance threshold and rollback criteria defined
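One way to make the significance and rollback criteria concrete is a two-proportion z-test on a success metric such as resolution rate. This is a sketch under stated assumptions: the metric is a simple success/failure count per variant, samples are large enough for the normal approximation, and the function names and the 0.05 significance level are illustrative choices, not part of the checklist.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    success_a: int, n_a: int, success_b: int, n_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for variant B versus baseline A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variance: all outcomes are identical")
    z = (p_b - p_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


def should_roll_back(z: float, p_value: float, alpha: float = 0.05) -> bool:
    """Roll back when B (the AI variant) is significantly worse than A."""
    return z < 0 and p_value < alpha
```

Here A is the measured non-AI baseline and B is the AI variant. A negative `z` with a p-value below `alpha` means the AI variant is performing significantly worse, which is one reasonable rollback trigger.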
3. Monitoring & Observability
- Metrics pipeline live — Latency, error rate, confidence distribution
- Alert thresholds set — Team gets paged when quality degrades
- Audit trail exists — Every AI decision is logged for review
- Feedback loop active — Users can flag bad outputs
- Dashboard accessible — Non-technical stakeholders can check health
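The metrics and alert-threshold items above could be checked over a recent window of requests. The sketch below assumes each request is logged with latency, an error flag, and a confidence score. The record fields, the threshold values, and the `check_window` function are hypothetical, and a real setup would usually express these rules in the team's monitoring system instead.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestRecord:
    latency_ms: float
    is_error: bool
    confidence: float  # assumed model confidence in [0, 1]


@dataclass
class AlertThresholds:
    # Hypothetical limits; set them from the measured baseline.
    max_error_rate: float = 0.02
    max_p95_latency_ms: float = 2000.0
    min_mean_confidence: float = 0.70


def p95(values: List[float]) -> float:
    """95th percentile by the nearest-rank method."""
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def check_window(records: List[RequestRecord], limits: AlertThresholds) -> List[str]:
    """Return the names of the metrics that breach their thresholds."""
    if not records:
        return ["no_traffic"]
    alerts = []
    error_rate = sum(r.is_error for r in records) / len(records)
    if error_rate > limits.max_error_rate:
        alerts.append("error_rate")
    if p95([r.latency_ms for r in records]) > limits.max_p95_latency_ms:
        alerts.append("p95_latency")
    mean_confidence = sum(r.confidence for r in records) / len(records)
    if mean_confidence < limits.min_mean_confidence:
        alerts.append("mean_confidence")
    return alerts
```

An empty window is reported as its own alert, since a silent metrics pipeline is also a failure worth paging on.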
4. User Experience
- Expectations set — Users know they’re interacting with AI
- Transparency built in — Users can see why the AI made a decision
- Opt-out available — Users can choose the non-AI path
- Loading states handled — AI latency doesn’t break the UX
- Graceful degradation — Feature works when AI service is down
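Graceful degradation can be sketched as a wrapper that falls back to the non-AI path when the AI call fails. This is an illustration only: `ai_call` and `fallback` are hypothetical callables supplied by the caller, and the sketch assumes the AI client signals an outage by raising the built-in `TimeoutError` or `ConnectionError`.

```python
from typing import Callable, Tuple


def answer_with_fallback(
    question: str,
    ai_call: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the AI path first; return (answer, source)."""
    try:
        return ai_call(question), "ai"
    except (TimeoutError, ConnectionError):
        # AI service is slow or down: serve the non-AI path instead.
        return fallback(question), "fallback"
```

Returning the source alongside the answer lets the UI tell users which path served them and lets monitoring count how often the fallback is used.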
5. Operational Readiness
- Rollback plan documented — Can the release be reverted in under 5 minutes?
- Cost model understood — Per-request cost, monthly projection
- Rate limits in place — Abuse protection, cost ceiling
- On-call knows the feature — Runbook exists, team briefed
- Legal review complete — Data handling, privacy, compliance checked
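The cost model and rate limit items above lend themselves to a short worked example. The sketch below assumes token-based pricing quoted per 1,000 tokens and uses a token bucket for rate limiting. All names, prices, and limits are hypothetical placeholders, not figures from any provider.

```python
def monthly_cost(
    requests_per_day: int,
    input_tokens: int,
    output_tokens: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
    days: int = 30,
) -> float:
    """Project monthly spend from an average per-request token profile."""
    per_request = (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )
    return per_request * requests_per_day * days


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now: float) -> bool:
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the timestamp in explicitly keeps the limiter easy to test; in production the caller would pass something like `time.monotonic()`. Comparing the projected monthly cost against a budget gives a simple cost ceiling check.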
When to Use This
Run this checklist:
- Before any AI feature reaches GA
- Before expanding AI to new user segments
- After any significant model or prompt change
- Quarterly as part of AI feature health review
Related
- Signal Scorecard — For evaluating customer signal quality
- RICE/DRICE — For prioritising which AI features to build