Part VII: Practice and Teaching Kit
Appendix E

Architecture Checklists

Objective: Provide pre-launch architecture review checklists for AI products.

E.1 Pre-Launch AI Architecture Review

Pre-Launch Architecture Checklist

Functional requirements: all priority user stories have been implemented, the core AI capability has been verified against requirements, edge cases have been identified and handled, and graceful degradation has been tested.

Performance requirements: latency meets SLA targets at the 95th and 99th percentiles, throughput has been validated under load, cost per request is within budget, and resource utilization has been optimized.

Reliability requirements: failover mechanisms have been tested, circuit breakers have been implemented, timeout handling has been verified, and error rates are within the acceptable range.

Scalability requirements: horizontal scaling has been tested, database connection pooling has been configured, a caching strategy has been implemented, and rate limiting is in place.

Observability requirements: structured logging has been implemented, a metrics dashboard has been deployed, alerts have been configured with thresholds, and distributed tracing has been enabled.
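The reliability items (circuit breakers, timeout handling, graceful degradation) can be illustrated with a minimal sketch. The class name `CircuitBreaker`, its parameters, and the default thresholds are assumptions made for illustration, not a prescribed implementation; a production system would more likely rely on a vetted resilience library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Illustrative breaker: opens after N consecutive failures, then cools down."""

    def __init__(
        self,
        failure_threshold: int = 3,      # assumed default, tune per service
        reset_after_s: float = 30.0,     # assumed cool-down period
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], fallback: Callable[[], T]) -> T:
        # While open, skip the primary call until the cool-down has elapsed.
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                return fallback()
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            # Any failure (including a timeout raised by fn) counts toward opening.
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            return fallback()
        self._failures = 0
        return result
```

Passing a degraded response (a cached answer or a static message) as `fallback` is one way to exercise the graceful degradation item during testing.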

E.2 Security and Privacy Assessment

Security Checklist for AI Systems

Data protection: PII has been identified and classified, data is encrypted at rest and in transit, logging of sensitive data is prohibited, and data retention policies are enforced.

Access control: authentication has been implemented, authorization is properly scoped, API keys are rotated regularly, and the principle of least privilege is followed.

AI-specific threats: prompt injection mitigations are in place, input validation and sanitization have been implemented, output filtering has been implemented, and rate limiting prevents abuse.

Compliance: GDPR requirements have been addressed if operating in the EU, HIPAA requirements have been addressed if operating in healthcare, data residency requirements have been met, and an audit trail has been implemented.
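One way to enforce the rule that sensitive data must not be logged is a redaction filter on the logger. The sketch below uses Python's standard `logging` module; the two regular expressions (an email pattern and a hypothetical `sk-` style API key pattern) are illustrative assumptions and would need to be replaced with the classifications from your own PII inventory.

```python
import logging
import re

# Illustrative patterns only; real deployments need a reviewed PII catalogue.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
API_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")  # hypothetical key format


def redact(text: str) -> str:
    """Replace email addresses and key-like tokens with placeholders."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return API_KEY_RE.sub("[REDACTED_KEY]", text)


class RedactingFilter(logging.Filter):
    """Redact the fully formatted message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True


logger = logging.getLogger("ai_service")  # hypothetical logger name
logger.addFilter(RedactingFilter())
```

Pattern-based redaction is a backstop, not a substitute for keeping sensitive fields out of log calls in the first place.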

E.3 Cost and Latency Evaluation

| Metric | Target | Measurement |
|---|---|---|
| LLM cost per 1K tokens | [Define based on unit economics] | Production monitoring |
| Avg tokens per request | [Optimize for quality/cost] | Request logging |
| Cost per user session | [Below willingness to pay] | User cohort analysis |
| P50 latency | [< 500ms for interactive] | APM monitoring |
| P95 latency | [< 1s for most interactions] | APM monitoring |
| P99 latency | [Define acceptable max] | APM monitoring |

Cost and latency targets template
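The targets in the template can be checked mechanically against logged samples. The following sketch computes nearest-rank percentiles and a per-request cost; the function names and the split into input and output token prices are assumptions for illustration, and the default latency limits simply mirror the template's 500 ms and 1 s placeholders.

```python
import math
from typing import Dict, Optional, Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cost_per_request(
    prompt_tokens: int,
    completion_tokens: int,
    price_in_per_1k: float,   # hypothetical price per 1K input tokens
    price_out_per_1k: float,  # hypothetical price per 1K output tokens
) -> float:
    """Cost of one request given per-1K-token prices."""
    return (
        prompt_tokens / 1000 * price_in_per_1k
        + completion_tokens / 1000 * price_out_per_1k
    )


def check_latency(
    samples_ms: Sequence[float],
    p50_max_ms: float = 500.0,
    p95_max_ms: float = 1000.0,
    p99_max_ms: Optional[float] = None,  # left undefined in the template
) -> Dict[str, bool]:
    """Return pass/fail per percentile; P99 is checked only if a limit is set."""
    results = {
        "p50": percentile(samples_ms, 50) < p50_max_ms,
        "p95": percentile(samples_ms, 95) < p95_max_ms,
    }
    if p99_max_ms is not None:
        results["p99"] = percentile(samples_ms, 99) < p99_max_ms
    return results
```

Nearest-rank is only one of several percentile definitions; an APM tool may interpolate, so small differences from its dashboard values are expected.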

E.4 Observability Readiness

Observability Checklist

Metrics to capture: request volume (total, by endpoint, and by user); latency at the 50th, 95th, and 99th percentiles by endpoint; error rates (total, by type, and by severity); AI-specific metrics, including token usage, model latency, and retrieval quality; and business metrics, including conversion, engagement, and satisfaction.

Logging requirements: structured JSON logging, request correlation IDs, user ID and session tracking, an AI decision audit trail, and verification that no sensitive data appears in logs.

Alerting rules: error rate spike alerts, latency degradation alerts, cost anomaly alerts, AI quality regression alerts, and a defined on-call escalation procedure.
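The logging requirements (structured JSON, correlation IDs, AI-specific fields) might be sketched with the standard library as follows. The field names `endpoint`, `prompt_tokens`, `completion_tokens`, and `model_latency_ms` are hypothetical choices for this example, as is the helper `start_request`.

```python
import contextvars
import json
import logging
import uuid

# Holds the correlation ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")

# Hypothetical AI-specific fields copied into the JSON payload when present.
EXTRA_FIELDS = ("endpoint", "prompt_tokens", "completion_tokens", "model_latency_ms")


def start_request() -> str:
    """Generate and store a correlation ID at the start of a request."""
    cid = uuid.uuid4().hex
    correlation_id.set(cid)
    return cid


class JsonFormatter(logging.Formatter):
    """Emit each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        # Fields passed via logger.info(..., extra={...}) appear as attributes.
        for key in EXTRA_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)
```

A call such as `logger.info("completion served", extra={"prompt_tokens": 120})` would then carry the token count alongside the correlation ID, assuming a handler configured with `JsonFormatter` is attached to the logger.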

E.5 Compliance Verification

Compliance Verification Checklist

Pre-launch compliance gates: legal review has been completed, a privacy impact assessment has been done, a security penetration test has been conducted, an accessibility audit has been passed, and documentation for regulators has been prepared.

Ongoing compliance: data retention audits occur quarterly, access log reviews occur monthly, model bias assessments occur quarterly, the incident response plan has been tested, and compliance training is current.

E.6 Rollback Procedures

Rollback Readiness Checklist

Feature flags must be in place for all major changes, the rollback procedure must be documented, rollback must be tested in staging, database migration rollback scripts must be ready, the communication template must be prepared, the on-call runbook must be updated, and customer support must be briefed.
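The feature flag item can be pictured as a kill switch that routes traffic back to the stable code path. This is a minimal in-memory sketch; the class `FeatureFlags`, the flag name `new_reranker`, and the function `answer` are hypothetical, and a real deployment would typically read flags from a managed flag service or configuration store.

```python
from typing import Callable, Dict


class FeatureFlags:
    """In-memory flag store; illustrative stand-in for a real flag service."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._flags = dict(defaults)

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so a missing entry behaves like a rollback.
        return self._flags.get(name, False)

    def enable(self, name: str) -> None:
        self._flags[name] = True

    def disable(self, name: str) -> None:
        self._flags[name] = False


def answer(
    query: str,
    flags: FeatureFlags,
    new_path: Callable[[str], str],
    stable_path: Callable[[str], str],
) -> str:
    """Route to the new code path only while its flag is on."""
    if flags.is_enabled("new_reranker"):  # hypothetical flag name
        return new_path(query)
    return stable_path(query)
```

Rolling back then means calling `flags.disable("new_reranker")`, which is the step a staging rollback test would exercise.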