Beyond the Basics: Production-Grade Load Testing of AWS API Gateway with Locust and Terraform
by Gary Worthington, More Than Monkeys

When you deploy APIs to AWS, you inevitably reach the question: what happens when traffic spikes?
API Gateway is marketed as a “scales for you” service, but scaling is only part of the story. Every request may trigger a Lambda, fetch from DynamoDB, verify a Cognito token, or traverse a VPC link. Each of those has limits, and under heavy load you’ll only discover them if you test.
In this guide, we’ll build a complete load testing setup with Locust that works at three levels:
- A minimal local project that hits https://example.com — so you can see Locust working in minutes.
- A production-style ECS Fargate project that runs master and workers as containers, with Cognito + API Key auth helpers.
- A Terraform deployment that provisions ECS, IAM, networking, Cloud Map, autoscaling, and CloudWatch logging.
Everything you need is bundled in one download:
locust-complete-guide.zip
1. Why Locust Is a Good Fit
Locust is a Python-based load testing tool. Instead of describing traffic in XML or a GUI, you write Python classes that behave like “users.”
Advantages:
- Python code: you can loop, branch, randomise, or import libraries.
- Composable: define multiple user types in one test run.
- Distributed: master/worker architecture scales horizontally.
- Interactive: web UI for ad-hoc runs, headless mode for CI.
This flexibility makes it easy to model real workloads, like “80% public GET requests, 20% authenticated writes.”
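As a minimal sketch of that split using task weights (the /articles and /comments paths here are hypothetical), a 4:1 weighting gives roughly the 80/20 mix:
from locust import FastHttpUser, task, between

class MixedUser(FastHttpUser):
    wait_time = between(1, 2)

    @task(4)  # roughly 80% of this user's traffic: public reads
    def read_articles(self):
        self.client.get("/articles", name="GET /articles")

    @task(1)  # roughly 20%: authenticated writes (auth omitted in this sketch)
    def post_comment(self):
        self.client.post("/comments", json={"text": "load test"}, name="POST /comments")
Locust picks tasks in proportion to their weights, so the ratio holds regardless of how many users you spawn.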
2. Why Load Test API Gateway
API Gateway itself rarely fails under traffic; AWS runs it at scale. The bottlenecks are usually downstream:
- Cognito adds latency for token validation.
- Lambda cold starts cause spikes in P99 latency.
- DynamoDB can throttle when provisioned capacity is exceeded.
- VPC links saturate network interfaces under burst load.
- Downstream APIs may time out or return errors under stress.
By load testing, you can catch these issues before your users do.
3. Start Simple: Local Example
The locust-example/ folder in the zip is a self-contained starter. It hits https://example.com so you don’t need to configure AWS to see Locust in action.
Step 1: Install Locust
cd locust-example
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
This creates a virtual environment and installs Locust.
Step 2: Explore locustfile.py
from locust import FastHttpUser, task, between
from utils.headers import build_base_headers

class ExampleUser(FastHttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        self.client.base_url = "https://example.com"
        self.base_headers = build_base_headers()

    @task
    def load_homepage(self):
        self.client.get("/", headers=self.base_headers, name="GET /")
- on_start sets the base URL and headers.
- @task marks methods as load test actions.
- between(1, 2) makes each simulated user pause for 1–2 seconds between tasks.
Step 3: Run Locust
locust -f locustfile.py
Visit http://localhost:8089, start a test, and watch requests flow.
This proves your environment is set up correctly.
4. Moving to Real APIs: Adding Auth
Most APIs require authentication. The locust-ecs-fargate/ project adds two helpers:
- auth/apikey.py for API Key headers.
- auth/cognito.py for Cognito user pool login.
API Key User
from locust import FastHttpUser, task, between
from auth.apikey import api_key_headers

class PublicUser(FastHttpUser):
    wait_time = between(0.5, 2.0)

    @task(3)
    def list_articles(self):
        self.client.get("/articles", headers=api_key_headers(), name="GET /articles")
This reads API_KEY from an environment variable.
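The helper itself is small. A minimal sketch of what auth/apikey.py might look like (the packaged version may differ), assuming API Gateway's standard x-api-key header:
# auth/apikey.py (sketch)
import os

def api_key_headers() -> dict:
    # API Gateway usage plans expect the key in the x-api-key header
    return {"x-api-key": os.environ["API_KEY"]}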
Cognito User
from locust import FastHttpUser, task, between
from auth.cognito import CognitoAuth

class AuthUser(FastHttpUser):
    wait_time = between(1.0, 3.0)

    def on_start(self):
        self.cognito = CognitoAuth(region="eu-west-1")

    @task(2)
    def post_comment(self):
        token = self.cognito.get_token()
        self.client.post("/comments", json={"text": "load test"}, headers={"Authorization": f"Bearer {token}"})
Here CognitoAuth logs in once, caches the ID token, and refreshes when needed.
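A minimal sketch of how a helper like that can be built with boto3's USER_PASSWORD_AUTH flow; the packaged auth/cognito.py may differ in detail:
# auth/cognito.py (sketch)
import os
import time
import boto3

class CognitoAuth:
    def __init__(self, region: str):
        self.client = boto3.client("cognito-idp", region_name=region)
        self.token = None
        self.expires_at = 0

    def get_token(self) -> str:
        # Reuse the cached ID token until shortly before it expires
        if self.token and time.time() < self.expires_at - 60:
            return self.token
        resp = self.client.initiate_auth(
            ClientId=os.environ["COGNITO_CLIENT_ID"],
            AuthFlow="USER_PASSWORD_AUTH",
            AuthParameters={
                "USERNAME": os.environ["COGNITO_USERNAME"],
                "PASSWORD": os.environ["COGNITO_PASSWORD"],
            },
        )
        result = resp["AuthenticationResult"]
        self.token = result["IdToken"]
        self.expires_at = time.time() + result["ExpiresIn"]
        return self.token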
5. Scaling Up: ECS Fargate
Local runs are fine for small user counts. To simulate thousands of users, you need distributed workers. ECS Fargate is perfect because you don’t manage servers.
The locust-ecs-fargate/ folder gives you:
- Dockerfile — builds a Locust container.
- locustfile.py — defines both API Key and Cognito users.
- ecs/task-master.json and ecs/task-worker.json — sample task definitions.
- Makefile — for ECR build and push.
Step 1: Build and Push Image
cd locust-ecs-fargate
make all AWS_REGION=eu-west-1 REPO=locust-runner
This builds the Docker image and pushes it to ECR.
Step 2: Secrets in SSM
Store API key and Cognito credentials:
aws ssm put-parameter --name /locust/API_KEY --type SecureString --value "abcd123" --overwrite
aws ssm put-parameter --name /locust/COGNITO_CLIENT_ID --type SecureString --value "clientid" --overwrite
aws ssm put-parameter --name /locust/COGNITO_USERNAME --type SecureString --value "user@example.com" --overwrite
aws ssm put-parameter --name /locust/COGNITO_PASSWORD --type SecureString --value "s3cr3t" --overwrite
The ECS task role will read these at runtime.
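The simplest pattern is to inject the parameters as container environment variables via the task definition's secrets block, but you can also fetch them in code. A sketch with boto3, assuming the task role has permission to read the parameters:
# Sketch: read a SecureString parameter at runtime
import boto3

ssm = boto3.client("ssm", region_name="eu-west-1")

def get_secret(name: str) -> str:
    resp = ssm.get_parameter(Name=name, WithDecryption=True)
    return resp["Parameter"]["Value"]

# e.g. api_key = get_secret("/locust/API_KEY")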
Step 3: Run Master + Workers
- Master runs with --master, exposing UI on 8089 and control ports 5557–5558.
- Workers run with --worker --master-host master.locust.local.
You can run one master service and scale the worker service to as many tasks as you need.
6. Automating with Terraform
Manually wiring ECS services is tedious. The terraform/ folder contains a ready-made module.
Step 1: Variables
Edit variables.tf or pass at apply time:
- vpc_id — your VPC.
- private_subnet_ids — worker subnets (with NAT).
- public_subnet_ids — master subnets if exposing UI.
- api_target_url — your API Gateway stage URL.
- image_uri — your pushed ECR image URI.
Step 2: IAM and Security
- Execution role lets tasks pull from ECR and log to CloudWatch.
- Task role allows reading SSM secrets and calling Cognito.
- Master SG: allows 8089 for the web UI, and 5557–5558 for worker connections.
- Worker SG: outbound only.
Step 3: Service Discovery
Cloud Map provides internal DNS:
- Namespace: locust.local
- Service: master.locust.local
Workers resolve this name to reach the master.
Step 4: ECS Services
- Master service: 1 task, optionally behind NLB for UI.
- Worker service: N tasks, auto-scaled on CPU.
Step 5: Apply
cd terraform
terraform init
terraform apply \
-var='aws_region=eu-west-1' \
-var='vpc_id=vpc-xxxxxxxx' \
-var='private_subnet_ids=["subnet-a","subnet-b"]' \
-var='public_subnet_ids=["subnet-c","subnet-d"]' \
-var='api_target_url=https://abcd.execute-api.eu-west-1.amazonaws.com/prod' \
-var='image_uri=123456789012.dkr.ecr.eu-west-1.amazonaws.com/locust-runner:latest'
Terraform creates everything: IAM, SGs, Cloud Map, ECS services, autoscaling, and (if enabled) a public NLB.
The output master_ui_url tells you where to access the UI.
7. Interpreting Results
A test is only useful if you read the right signals:
- P95 latency from Locust vs. CloudWatch API Gateway Latency.
- 5XX errors — often mean downstream capacity issues.
- Throughput vs. usage plan limits — ensure you don’t exceed quotas.
- Scaling behaviour — check Lambda concurrency or DynamoDB auto scaling triggers.
Cross-reference Locust’s CSV reports with CloudWatch metrics and X-Ray traces.
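To make that cross-check repeatable, you can pull the matching window from CloudWatch with boto3. A sketch; the API name, stage, and time range are placeholders for your own test run:
# Sketch: fetch p95 API Gateway latency for the test window
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="eu-west-1")

end = datetime.now(timezone.utc)
start = end - timedelta(minutes=30)  # match your test duration

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/ApiGateway",
    MetricName="Latency",
    Dimensions=[
        {"Name": "ApiName", "Value": "my-api"},  # hypothetical API name
        {"Name": "Stage", "Value": "prod"},
    ],
    StartTime=start,
    EndTime=end,
    Period=60,
    ExtendedStatistics=["p95"],
)

for point in sorted(resp["Datapoints"], key=lambda d: d["Timestamp"]):
    print(point["Timestamp"], point["ExtendedStatistics"]["p95"], "ms")
Plot these alongside Locust's CSV output and any divergence points you straight at where latency is being added.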
8. Cleaning Up
To avoid costs:
- terraform destroy to tear down ECS, SGs, IAM, Cloud Map.
- Delete ECR images if unused.
- Remove SSM parameters if they were test-only.
9. Complete Example Download
You don’t need to copy-paste from this article. Everything is packaged in locust-complete-guide.zip. Inside:
- locust-example/ — quick local test against example.com.
- locust-ecs-fargate/ — Dockerfile, Locust file, auth helpers, ECS tasks, Makefile.
- terraform/ — production-ready deployment with IAM, SGs, Cloud Map, ECS services, autoscaling, logging.
10. Conclusion
Load testing is about understanding your system’s limits, not breaking it for sport. With Locust you can define realistic scenarios, scale them out on ECS Fargate, and codify the entire setup with Terraform.
This gives you a repeatable, automated way to validate that your API Gateway-backed services will survive the traffic they’re designed for.
Gary Worthington is a software engineer, delivery consultant, and agile coach who helps teams move fast, learn faster, and scale when it matters. He writes about modern engineering, product thinking, and helping teams ship things that matter.
Through his consultancy, More Than Monkeys, Gary helps startups and scaleups improve how they build software — from tech strategy and agile delivery to product validation and team development.
Visit morethanmonkeys.co.uk to learn how we can help you build better, faster.
Follow Gary on LinkedIn for practical insights into engineering leadership, agile delivery, and team performance.