## Infrastructure Overview
**Cloud**: AWS (primary: us-west-2, DR: us-east-1)
**Orchestration**: EKS 1.28 with Karpenter autoscaling
**Service Mesh**: Istio 1.20
## Services
| Service | Criticality | Team | Notes |
|---------|-------------|------|-------|
| payments | P0 | checkout | PCI compliant, separate VPC |
| cart | P1 | checkout | Redis for session |
| catalog | P2 | inventory | Read-heavy, uses caching |
| analytics | P3 | data | Batch, can tolerate delays |
## Data Sources
- **Logs**: Coralogix (primary), CloudWatch (backup)
- **Metrics**: Grafana Cloud (Prometheus)
- **Traces**: Datadog APM
- **Alerts**: PagerDuty → Slack
- **Enrichment**: Snowflake (historical data)
## Key Dashboards
- Production Overview: https://grafana.acme.com/d/prod-overview
- Error Rates: https://grafana.acme.com/d/errors
- Database Performance: https://grafana.acme.com/d/rds