Build, manage, and operate APIs with Amazon API Gateway (REST, HTTP, and WebSocket). Triggers on phrases like: API Gateway, REST API, HTTP API, WebSocket API, custom domain, Lambda authorizer, usage plan, throttling, CORS, VPC link, private API. Also covers troubleshooting API Gateway errors (4xx, 5xx, timeout, CORS failures) and IaC templates containing API Gateway resources. For general REST API design unrelated to AWS, do not trigger.
Expert guidance for building, managing, governing, and operating APIs with Amazon API Gateway. Covers REST APIs (v1), HTTP APIs (v2), and WebSocket APIs.
When answering API Gateway questions:
- references/sam-cloudformation.md or references/sam-service-integrations.md and provide complete, working SAM/CloudFormation YAML

Choose the right API type first. This decision affects every downstream choice.
REST API is the full-featured API management platform for enterprises. It provides the governance, security, monetization, and operational controls that organizations need to build, publish, and manage APIs at scale, including usage plans with per-consumer throttling and quotas, API keys, request validation, WAF integration, resource policies, caching, canary deployments, and private endpoints.
HTTP API is the lightweight, low-cost proxy optimized for simpler API workloads. It offers ~70% lower cost and lower latency but trades away the API management features. Choose HTTP API when you need a fast, lightweight proxy to Lambda or HTTP backends and don't require the enterprise controls above.
| Factor | REST API (v1) | HTTP API (v2) | WebSocket API |
|---|---|---|---|
| Positioning | Full API management | Low-cost proxy | Real-time bidirectional |
| Cost | Higher | ~70% cheaper | Per-message pricing |
| Latency | Higher | Lower | Persistent connection |
| Max timeout | 50ms-29s (up to 300s Regional/Private) | 30s hard limit | 29s |
| Payload | 10 MB | 10 MB | 128 KB message / 32 KB frame |
| API Management | | | |
| Usage plans/API keys | Yes | No | No |
| Request validation | Yes (JSON Schema draft 4) | No | No |
| Caching | Yes (0.5-237 GB) | No | No |
| Custom gateway responses | Yes | No | No |
| VTL mapping templates | Yes | No (parameter mapping only) | Yes |
| Security & Governance | | | |
| WAF | Yes | No (use CloudFront + WAF) | No |
| Resource policies | Yes | No | No |
| Private endpoints | Yes | No | No |
| mTLS | Yes (Regional custom domain only) | Yes (Regional custom domain only) | Via CloudFront viewer mTLS |
| Auth | | | |
| Lambda authorizer | Yes (TOKEN + REQUEST) | Yes (REQUEST only, simple + IAM policy format) | Yes (REQUEST on $connect only) |
| JWT authorizer | No (use Cognito authorizer) | Yes (native) | No |
| Cognito authorizer | Yes (native) | Use JWT authorizer | No |
| Operations | | | |
| Canary deployments | Yes | No | No |
| Response streaming | Yes | No | No |
| X-Ray tracing | Yes | No | No |
| Execution logging | Yes | No | Yes |
| Custom domain sharing | Not with WebSocket | Not with WebSocket | Not with REST/HTTP |
Use REST API when: you are building APIs for external consumers, partners, or multi-tenant platforms; need to enforce per-consumer rate limits and quotas; require request validation, caching, or WAF at the API layer; need private endpoints, resource policies, or canary deployments; or are building an API product with monetization and governance requirements.
Use HTTP API when: you are building lightweight APIs or simple backend proxies; cost and latency are the primary concerns; you don't need per-consumer throttling, request validation, caching, or WAF at the API layer; and native JWT authorization with OIDC/OAuth 2.0 meets your auth needs. Accept the hard 30s timeout and lack of API management features. For WAF, edge caching, or edge compute, place a CloudFront distribution in front of the HTTP API.
Use WebSocket API when: you need persistent bidirectional connections for real-time use cases (chat, notifications, live dashboards).
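The selection guidance above can be sketched as a small decision function. This is a minimal illustration, not an AWS API: the `ApiRequirements` flags and the `chooseApiType` name are hypothetical, and the rules only encode the criteria listed in this section (real-time needs first, then any REST-only feature, otherwise HTTP API).

```typescript
// Hypothetical requirement flags for illustration; these names are not part of any AWS SDK.
interface ApiRequirements {
  bidirectionalRealtime?: boolean; // chat, notifications, live dashboards
  perConsumerThrottling?: boolean; // usage plans, API keys, quotas
  requestValidation?: boolean;
  caching?: boolean;
  wafAtApiLayer?: boolean; // WAF attached to the API itself, not via CloudFront
  privateEndpoint?: boolean;
  resourcePolicy?: boolean;
  canaryDeployments?: boolean;
  timeoutOver30s?: boolean; // HTTP API has a hard 30s limit
}

type ApiType = "REST" | "HTTP" | "WEBSOCKET";

function chooseApiType(req: ApiRequirements): ApiType {
  // Persistent bidirectional connections are only offered by WebSocket APIs.
  if (req.bidirectionalRealtime) {
    return "WEBSOCKET";
  }
  // Any one of these features is listed above as REST-only.
  const needsRestOnlyFeature = [
    req.perConsumerThrottling,
    req.requestValidation,
    req.caching,
    req.wafAtApiLayer,
    req.privateEndpoint,
    req.resourcePolicy,
    req.canaryDeployments,
    req.timeoutOver30s,
  ].some(Boolean);
  // Otherwise the lower-cost, lower-latency HTTP API is the default.
  return needsRestOnlyFeature ? "REST" : "HTTP";
}

// Example usage
const simpleProxy = chooseApiType({});
const partnerApi = chooseApiType({ perConsumerThrottling: true, requestValidation: true });
```

A real decision usually weighs cost and the CloudFront workaround for WAF as well, so treat the function as a checklist rather than a rule engine.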
Before implementation, gather requirements systematically. Consult references/requirements-gathering.md for the full requirements workflow covering endpoints, auth, data models, performance, security, and deployment needs.
Key design decisions:
- references/authentication.md for the decision tree

Consult these references based on what you're building:
- references/architecture-patterns.md: topology, multi-tenant SaaS, hybrid workloads, private APIs, multi-region, streaming
- references/websocket.md: route selection, @connections management, session management, client resilience, SAM templates, limits, multi-region
- references/service-integrations.md: direct AWS service integrations (EventBridge, SQS, SNS, DynamoDB, Kinesis, Step Functions, S3), HTTP proxy, mock, VTL mapping templates, binary media types, Lambda sync/async invocation
- references/custom-domains-routing.md: base path mappings, routing rules, header-based versioning
- references/security.md: mTLS (API Gateway native + CloudFront viewer mTLS), TLS policies, resource policies, WAF, HttpOnly cookies, CRL checks
- references/sam-cloudformation.md: IaC patterns, OpenAPI extensions, VTL reference, binary data
- references/sam-service-integrations.md: EventBridge, SQS, DynamoDB CRUD, Kinesis, Step Functions (REST + WebSocket) templates
- references/performance-scaling.md

Always configure access logging. For REST and WebSocket APIs, also enable execution logging (ERROR level for production, INFO only for debugging). HTTP API does not support execution logging; use access logs with enhanced observability variables instead.
Consult the observability references based on what you need:
- references/observability-logging.md
- references/observability-metrics-alarms.md
- references/observability-analytics.md

references/deployment.md for detailed patterns

For organization-wide API standards, see references/governance.md covering:
When responding to API Gateway questions, structure your answer as:
- references/pitfalls.md

When diagnosing API Gateway errors, consult references/troubleshooting.md for detailed resolution steps. Here are the most common issues:
| Error | Most Common Cause | Quick Fix |
|---|---|---|
| 400 Bad Request | Protocol mismatch (HTTP/HTTPS) with ALB | Match protocol to listener type |
| 401 Unauthorized | Wrong token type (ID vs access) or missing identity sources | Check token type matches scope config; verify all identity sources sent |
| 403 Missing Auth Token | Stage name in URL when using custom domain | Remove stage name from URL path |
| 403 from VPC | Private DNS on VPC endpoint intercepts ALL API calls | Use custom domain names for public APIs |
| 403 Access Denied | Resource policy + auth type mismatch or missing redeployment | Review policy, check auth type, redeploy API |
| 403 mTLS | Certificate issuer not in truststore or weak signature algorithm | Verify CA in truststore, use SHA-256+ |
| 429 Too Many Requests | Account/stage/method throttle limits exceeded | Implement jittered exponential backoff; request limit increase |
| 500 Internal Error | Missing Lambda invoke permission (especially with stage variables) | Add resource-based policy to Lambda function |
| 502 Bad Gateway | Lambda response not in required proxy format | Return {statusCode, headers, body} from Lambda |
| 504 Timeout | Backend exceeds 29s (REST, increasable) or 30s (HTTP, hard). HTTP API body says "Service Unavailable" but status is 504 | Optimize backend, request timeout increase (REST Regional/Private), or switch to async invocation |
| CORS errors | Missing CORS headers on Gateway Responses (4XX/5XX) | Add CORS headers to DEFAULT_4XX and DEFAULT_5XX gateway responses |
| SSL/PKIX errors | Incomplete certificate chain on backend | Provide full cert chain; use insecureSkipVerification only for testing |
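For the 429 row in the table above, a client-side retry with jittered exponential backoff might look like the following sketch. It assumes the "full jitter" variant (delay drawn uniformly between 0 and an exponentially growing ceiling); the names `backoffDelayMs`, `callWithBackoff`, and `HttpResult`, along with the 100 ms base and 20 s cap, are illustrative choices and not values taken from API Gateway.

```typescript
// Hypothetical response shape for illustration.
interface HttpResult {
  status: number;
  body: string;
}

// Full-jitter backoff: uniform in [0, min(capMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 100,
  capMs: number = 20000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Sends the request, then retries only on 429 until maxAttempts total sends are used.
async function callWithBackoff(
  send: () => Promise<HttpResult>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<HttpResult> {
  let result = await send();
  for (let attempt = 0; attempt < maxAttempts - 1 && result.status === 429; attempt++) {
    await sleep(backoffDelayMs(attempt));
    result = await send();
  }
  // May still be a 429 if every attempt was throttled; the caller decides what to do next.
  return result;
}
```

Retrying only on 429 is deliberate in this sketch: other 4xx errors in the table are configuration problems that a retry will not fix.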
- {"message":"Service Unavailable"} while Lambda continues
- /ping and /sping are reserved paths. Do not use for API resources
- REQUEST_TOO_LARGE is the only gateway response that cannot be customized. Use DEFAULT_4XX as a catch-all to add CORS headers for all 4xx errors including 413
- maxItems/minItems not validated in REST API request validation
- security in OpenAPI is ignored. Must set per-operation

For additional pitfalls (header handling, URL encoding, caching charges, canary deployments, usage plans), see references/pitfalls.md.
Default: CDK TypeScript
Override syntax:
When not specified, ALWAYS use CDK TypeScript.
See references/service-limits.md for the complete table. Most numeric quotas below are default values and adjustable; check with your AWS account team and the latest quotas page before using them for architectural decisions. Key limits:
| Resource | REST API | HTTP API | WebSocket |
|---|---|---|---|
| Payload size | 10 MB | 10 MB | 128 KB |
| Integration timeout | 50ms-29s (up to 300s Regional/Private) | 30s hard | 29s |
| APIs per region | 600 Regional/Private; 120 Edge-optimized | 600 | 600 |
| Stages per API | 10 | 10 | 10 |
| Routes/resources per API | 300 | 300 | 300 |
| Custom domains (public) | 120 | 120 | 120 |
| Account throttle | 10,000 rps / 5,000 burst | Same | Same (shared quota) |
| API keys per region | 10,000 | N/A | N/A |
| Usage plans per region | 300 | N/A | N/A |
| Cache sizes | 0.5 GB - 237 GB | N/A | N/A |
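The rate and burst pair in the account throttle row can be read through the token bucket model that the API Gateway documentation describes for throttling: the burst is the bucket capacity and the rate is how fast tokens refill. The sketch below is a simplified local simulation for reasoning about those two numbers, not API Gateway's implementation; `createTokenBucket` and its parameters are hypothetical names, and time is passed in explicitly to keep the behavior deterministic.

```typescript
// Simplified token bucket: `burst` is the capacity, `ratePerSec` the refill rate.
// Returns a function that reports whether a request at `nowMs` would be admitted.
function createTokenBucket(ratePerSec: number, burst: number, startMs: number = 0) {
  let tokens = burst; // the bucket starts full
  let lastMs = startMs;
  return function tryAcquire(nowMs: number): boolean {
    // Refill in proportion to elapsed time, never beyond the burst capacity.
    const elapsedSec = Math.max(0, nowMs - lastMs) / 1000;
    tokens = Math.min(burst, tokens + elapsedSec * ratePerSec);
    lastMs = Math.max(lastMs, nowMs);
    if (tokens >= 1) {
      tokens -= 1; // one token per admitted request
      return true;
    }
    return false; // in API Gateway terms, this request would receive a 429
  };
}

// Example usage with the default account-level values from the table above.
const accountThrottle = createTokenBucket(10000, 5000);
const firstRequestAdmitted = accountThrottle(0);
```

Under this model, a client that sends more than the burst capacity at the same instant is throttled even if its average rate is well below the steady-state limit.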