Project Brief — Build an API Health-Check Monitoring Script

You've spent seven chapters on the language and the tools. Now you're going to build something a real QA team would use. The project is api-monitor — a small Python script that checks the health of a list of API endpoints, validates the responses against a contract, and produces a report. It's the kind of utility QA teams kick off every morning before manual testing begins, or wire into CI as a smoke gate before the heavier suites run. Building it pulls together everything we've covered: file I/O, HTTP, dataclasses, error handling, modules, type hints, and pytest. This lesson sets the brief; the next walks through the implementation.

The scenario

Your team's staging environment hosts a dozen APIs. Tests on top of them assume the APIs are up, fast, and returning the expected shape. When one of those assumptions fails, the entire downstream test suite churns out red — but the real problem is upstream, and the team wastes thirty minutes diagnosing it. You've been asked to build a small Python script that runs in under ten seconds, hits each endpoint, and reports a clear pass/fail picture before the heavy suite starts.

The brief from your team lead is one paragraph:

"Read a config file listing endpoints. For each one, make the request, check the status code, the response time, and any expected JSON fields. Print a coloured summary so a human can scan it. Save a JSON report so CI can attach it as an artefact. Exit non-zero if anything failed so the build can fail fast. Keep it small — under 200 lines of code."

That's the whole scope. Small, well-defined, immediately useful. Exactly the kind of script Python is good at.

What the script does

End to end, api-monitor:

  1. Reads a config file (config/endpoints.json) that lists API endpoints to check. Each endpoint has a URL, an expected status code, a maximum response time, and an optional list of fields the JSON body must contain.
  2. Makes an HTTP request to each endpoint — requests.get(url, timeout=...).
  3. Validates the response against three contracts: the status code must match what's expected, the round-trip time must be under the threshold, and any required JSON fields must be present.
  4. Collects a result per endpoint — passed or failed, with a useful message when it failed.
  5. Generates a report to the console (with colour markers for fast scanning) and to a JSON file (for CI, dashboards, or replay).
  6. Returns an exit code — 0 if every check passed, 1 if any failed. CI then fails the job, blocking the rest of the pipeline.

That's the entire workflow. No browser, no database, no UI. A single CLI script that's done in ten seconds.
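
The three contracts in step 3 can be sketched as a pure function. This is a minimal sketch, not the walkthrough's implementation: the name `validate` and its parameters are hypothetical, and it assumes the status code, elapsed time, and parsed JSON body have already been pulled out of the HTTP response.

```python
from typing import Any, Optional


def validate(
    status: int,
    elapsed_ms: float,
    body: Any,
    expected_status: int,
    max_time_ms: int,
    expected_fields: Optional[list[str]] = None,
) -> list[str]:
    """Return a list of failure messages; an empty list means the check passed."""
    problems: list[str] = []

    # Contract 1: the status code must match exactly (not merely be 2xx).
    if status != expected_status:
        problems.append(f"expected {expected_status}, got {status}")

    # Contract 2: the round-trip time must not exceed the threshold.
    if elapsed_ms > max_time_ms:
        problems.append(f"took {elapsed_ms:.0f}ms, limit {max_time_ms}ms")

    # Contract 3: every required field must be a key of the JSON object.
    if expected_fields:
        if not isinstance(body, dict):
            problems.append("body is not a JSON object")
        else:
            missing = [name for name in expected_fields if name not in body]
            if missing:
                problems.append(f"missing fields: {', '.join(missing)}")

    return problems
```

Returning every problem rather than stopping at the first one means a slow endpoint that also returns the wrong status reports both facts in one run.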

Example config

The config file is plain JSON. A small but realistic example:

{
  "endpoints": [
    {
      "name": "health",
      "url": "https://api.staging.example.com/health",
      "expected_status": 200,
      "max_time_ms": 1000
    },
    {
      "name": "users-list",
      "url": "https://api.staging.example.com/users",
      "expected_status": 200,
      "max_time_ms": 2000,
      "expected_fields": ["users", "total"]
    },
    {
      "name": "admin-locked",
      "url": "https://api.staging.example.com/admin",
      "expected_status": 401,
      "max_time_ms": 500
    }
  ]
}

Three checks, three different failure modes the script needs to express:

  • /health — must be 200 and fast. The classic liveness check.
  • /users — must be 200, fast, and return a JSON object with users and total keys.
  • /admin — must return 401 (anonymous shouldn't reach it). A 200 here is a security failure, not just a bug.

The third case is interesting — sometimes "expected status" is not a 2xx. Your monitor should handle that without baking in the assumption that 200 = success.
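
One way the config entries above might map onto a dataclass is sketched below. The field names mirror the JSON keys from the example; the helper `parse_endpoints` is a hypothetical name, and `expected_fields` defaults to an empty list because the config treats it as optional.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    name: str
    url: str
    expected_status: int  # may be any status, e.g. 401 for a locked route
    max_time_ms: int
    # Optional in the config, so default to an empty list.
    expected_fields: list[str] = field(default_factory=list)


def parse_endpoints(raw_json: str) -> list[Endpoint]:
    """Turn the JSON text of the config into Endpoint objects.

    Unknown keys in an entry raise TypeError, which surfaces typos early.
    """
    data = json.loads(raw_json)
    return [Endpoint(**entry) for entry in data["endpoints"]]
```

Because `expected_status` is plain data, the 401 case needs no special handling: the monitor compares the actual status to whatever the entry says.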

Skills the project draws on

Every chapter contributes:

Chapter   What you'll use
1         Setting up Python, venv, pip (the project lives in its own venv)
2         if/elif/else, for, list comprehensions for filtering results
3         Lists, dicts, sets, JSON (config loading, response validation)
4         File I/O (with open), CSV (optional report), requests, response parsing
5         Dataclasses for Endpoint and CheckResult, classes for the monitor
6         try/except per endpoint, custom exceptions, modules across src/
7         pytest tests for the check_endpoint function (stretch goal)

If you've done the lessons, you have all the parts. The capstone is what assembling them looks like.

Project structure

A small but clean layout — modules separated by responsibility:

api-monitor/
├── config/
│   └── endpoints.json              # input: what to check
├── src/
│   ├── __init__.py
│   ├── models.py                   # Endpoint, CheckResult dataclasses
│   ├── config_loader.py            # read and validate config
│   ├── monitor.py                  # the actual checking logic
│   └── reporter.py                 # console + JSON output
├── output/                          # generated artefacts (created at run time)
│   └── report.json
├── tests/
│   ├── __init__.py
│   └── test_monitor.py             # stretch goal: pytest tests with mocked responses
├── main.py                         # entry point — parse args, kick off
├── requirements.txt
├── pyproject.toml
├── .gitignore                      # venv/, __pycache__/, output/
└── README.md

Reading top-down: config is the input contract; src/ is the implementation, split by single-responsibility module; output/ is where reports land at run time; tests/ is the optional pytest suite; main.py is the thin CLI entry. Every file does one thing — exactly the structure chapter 6 prepared you for.
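
As an illustration of the reporter's JSON side, here is a minimal sketch under stated assumptions: the `CheckResult` fields and the report layout (`total`, `failed`, `results`) are hypothetical choices, not a format the brief prescribes. It creates output/ on demand, since that folder only exists at run time.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class CheckResult:
    name: str
    passed: bool
    status: Optional[int]  # None when the request never got a response
    elapsed_ms: float
    message: str


def write_report(results: list[CheckResult], path: str = "output/report.json") -> None:
    """Serialise the results to a JSON file, creating the folder if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "total": len(results),
        "failed": sum(not r.passed for r in results),
        "results": [asdict(r) for r in results],  # dataclass -> plain dict
    }
    out.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```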

Acceptance criteria

The done-list. Tick each one before calling the project finished:

  • Reads config/endpoints.json. Reports a clear error if the file is missing or malformed.
  • Hits each endpoint with requests.get(url, timeout=...).
  • One endpoint failing (timeout, DNS error, exception) does not stop the others.
  • Validates status, response time, and optional expected_fields.
  • Prints a console summary with one line per endpoint, plus a totals line at the end.
  • Writes a JSON report to output/report.json.
  • Exits with status 0 if everything passed, status 1 if anything failed.
  • Separate custom exception classes for config problems and for check failures.
  • Type hints on every public function (parameters and return value) and on every dataclass field.
  • requirements.txt pins requests to an exact version.
  • README.md explains how to run the script in five lines.

The "200 lines of code" target is generous — a clean implementation lands closer to 150. Don't pad; don't skip.
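
The first criterion and the custom-exception criterion could be met together along these lines. This is a sketch only: `ConfigError` and `load_raw_config` are hypothetical names, and the shape check is deliberately shallow.

```python
import json
from pathlib import Path


class ConfigError(Exception):
    """Raised when the config file is missing or malformed."""


def load_raw_config(path: str) -> list[dict]:
    """Read the config file and return its list of endpoint entries."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        raise ConfigError(f"config file not found: {path}") from exc

    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ConfigError(f"config file is not valid JSON: {exc}") from exc

    # Shallow shape check: a top-level object holding an "endpoints" list.
    if not isinstance(data, dict) or not isinstance(data.get("endpoints"), list):
        raise ConfigError('config must be an object with an "endpoints" list')
    return data["endpoints"]
```

Translating low-level errors into one exception type lets main.py catch `ConfigError` in a single place and print a clear message instead of a traceback.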

Stretch goals

Once the core is working, five upgrades each teach something extra:

  1. Retry logic: if an endpoint fails on a transient error (timeout, 5xx), retry up to twice before marking it failed, with exponential back-off between attempts (for example time.sleep(0.5 * 2 ** attempt)). Reuses the pattern from chapter 6's try/except lesson.
  2. Email notifications: if any check fails, send an email summary (use the standard library's smtplib, no extra packages). Useful but optional; many teams point CI at Slack instead.
  3. Historical tracking: append each run's totals to output/history.csv (timestamp, total, passed, failed). After ten runs, the file shows a trend you can graph.
  4. Pytest tests: write a small test suite for monitor.check_endpoint. Use unittest.mock.patch("requests.get") to fake responses; assert on the resulting CheckResult. Practises chapter 7 directly.
  5. Parallel checking: replace the sequential for loop with concurrent.futures.ThreadPoolExecutor so the endpoints are checked concurrently. Production health-checkers commonly do this, because total run time then approaches that of the slowest endpoint rather than the sum of all of them. Ten endpoints that each take about a second finish in roughly 1 second instead of 10.
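
For the retry stretch goal, a minimal sketch of exponential back-off follows. The helper name `with_retries` and its parameters are hypothetical. For brevity it retries on any exception; a real monitor would probably narrow this to transient errors such as timeouts. The `sleep` parameter exists so tests can avoid real waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    retries: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action(); if it raises, retry up to `retries` more times."""
    for attempt in range(retries + 1):
        try:
            return action()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: let the caller record the failure
            # Exponential back-off: the delay doubles each time (0.5s, 1s, ...).
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable unless retries is negative")
```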

Pick one or two stretch goals once the core works. They're not graded; they're for your portfolio.
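
If you pick parallel checking, the core of it might look like the sketch below. `check_all` is a hypothetical name, and the `check` callable stands in for whatever function performs one endpoint check. It assumes `check` handles its own exceptions, because `pool.map` re-raises the first exception when results are consumed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def check_all(
    urls: list[str],
    check: Callable[[str], bool],
    max_workers: int = 8,
) -> dict[str, bool]:
    """Run check(url) for every URL on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        outcomes = list(pool.map(check, urls))
    return dict(zip(urls, outcomes))
```

Threads suit this job because each check spends most of its time waiting on the network rather than using the CPU.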

How to run it (the target shape)

By the end, the experience for a teammate using your script should look like this:

git clone <your-repo> && cd api-monitor
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
 
python main.py config/endpoints.json

Output:

api-monitor — checking 3 endpoints
✅ health         status=200  time=  87ms  ok
✅ users-list     status=200  time= 312ms  ok
❌ admin-locked   status=200  time=  45ms  expected 401, got 200

3 checks: 2 passed, 1 failed
report written to output/report.json
exit 1

One line per endpoint, one totals line, a hint at where the JSON went, an honest exit code. That's the deliverable.
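
The aligned console lines above can be produced with f-string format specifiers. The sketch below is one possible approach; the function names and the column widths (13 for the name, 4 for the time) are assumptions chosen to match the sample output.

```python
def format_line(name: str, passed: bool, status: int, elapsed_ms: float, message: str) -> str:
    """Build one summary line: marker, padded name, status, time, message."""
    mark = "✅" if passed else "❌"
    # <13 left-aligns the name; >4.0f right-aligns the time with no decimals.
    return f"{mark} {name:<13}  status={status}  time={elapsed_ms:>4.0f}ms  {message}"


def format_totals(outcomes: list[bool]) -> str:
    """Build the closing totals line from a list of pass/fail flags."""
    passed = sum(outcomes)
    failed = len(outcomes) - passed
    return f"{len(outcomes)} checks: {passed} passed, {failed} failed"
```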

The end-to-end flow

Nine boxes, four files, one CLI command. The next lesson takes each block and writes the code that fills it.

⚠️ Common pitfalls before you start

A few things to avoid:

  • Hardcoding the config path. Take it from sys.argv[1] or argparse. CI will pass a different path than your laptop.
  • Letting one endpoint kill the whole script. Wrap each check_endpoint call in try/except — a DNS error on endpoint 1 must not skip endpoints 2-N.
  • Silent failures. Always print the reason a check failed (status mismatch, timeout, missing field). "❌ failed" is useless; "❌ expected 401, got 200" is actionable.
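
All three pitfalls can be avoided with a small amount of structure, sketched here. The function names are hypothetical, and `check` stands in for the real per-endpoint check, which is assumed to raise an exception when something goes wrong.

```python
import argparse
from typing import Callable, Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Take the config path from the command line instead of hardcoding it."""
    parser = argparse.ArgumentParser(prog="api-monitor")
    parser.add_argument("config", help="path to the endpoints JSON file")
    return parser.parse_args(argv)


def run_checks(names: list[str], check: Callable[[str], None]) -> dict[str, str]:
    """Run check(name) for each endpoint; map name to 'ok' or the failure reason."""
    results: dict[str, str] = {}
    for name in names:
        try:
            check(name)
            results[name] = "ok"
        except Exception as exc:  # one bad endpoint must not stop the rest
            # Keep the reason so the report is actionable, never just "failed".
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def exit_code(results: dict[str, str]) -> int:
    """Return 0 if every check passed, 1 if any failed."""
    return 0 if all(reason == "ok" for reason in results.values()) else 1
```

In main.py the last line would then be something like `sys.exit(exit_code(results))`, which is what lets CI fail the job.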

🚀 Get started

Set up the skeleton before reading the next lesson:

  1. Create the project folder and the directory structure above.
  2. Make a venv and activate it.
  3. Create requirements.txt with requests==2.32.3 and run pip install -r requirements.txt.
  4. Add .gitignore covering venv/, __pycache__/, output/, *.pyc.
  5. Create the empty files: src/__init__.py, src/models.py, src/config_loader.py, src/monitor.py, src/reporter.py, main.py, config/endpoints.json, tests/__init__.py, tests/test_monitor.py.
  6. Drop the example config from above into config/endpoints.json. (Pick a real public API like JSONPlaceholder for testing if you don't have a staging environment to point at.)
  7. Sketch each module's responsibility in one comment at the top of the file. Don't write code yet — just the headers. This is the "read the question carefully before answering" step.

When the skeleton is in place, move on to the walkthrough.
