JSON is the data format most web APIs speak and most test fixtures use. Python's standard library ships with a json module, so you don't need to install anything: import json and you're working. JSON's six types map almost one-to-one onto Python's built-in types, which is why moving data between an API and a test script feels frictionless. This lesson covers parsing JSON strings, serialising Python data, reading and writing JSON files, navigating nested structures, and the small set of errors that come up in practice.
The four functions you'll actually use
import json

Four functions cover everything:
| Function | Direction | Mnemonic |
|---|---|---|
| json.loads() | JSON string → Python | "load string" |
| json.dumps() | Python → JSON string | "dump string" |
| json.load() | JSON file → Python | (no s) works on a file object |
| json.dump() | Python → JSON file | (no s) writes to a file object |
The s literally means "string." loads/dumps deal with strings; load/dump deal with file objects. Mixing them up is one of the most common mistakes when working with JSON in Python.
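To see the string/file split side by side, here is a minimal sketch that round-trips one dict through both pairs. It uses io.StringIO as an in-memory stand-in for a real file (an assumption made only so the example runs without touching disk), and the variable names are illustrative.

```python
import io
import json

record = {"id": 7, "name": "Alice"}

# String pair: dumps produces a str, loads parses a str.
text = json.dumps(record)
from_text = json.loads(text)

# File pair: dump writes to a file object, load reads from one.
# io.StringIO behaves like an open text file held in memory.
buffer = io.StringIO()
json.dump(record, buffer)
buffer.seek(0)  # rewind so load() starts reading from the beginning
from_file = json.load(buffer)

print(type(text).__name__)
print(from_text == record, from_file == record)
```

Both routes end at an equal dict; the only difference is whether the JSON text lives in a string or behind a file object.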
Parsing a JSON string — json.loads
import json
raw = '{"id": 7, "name": "Alice", "roles": ["admin", "tester"]}'
data = json.loads(raw)
print(type(data)) # <class 'dict'>
print(data["name"]) # "Alice"
print(data["roles"][0]) # "admin"

json.loads parses a JSON string and returns Python objects: a dict here, with a nested list. Once parsed, you index it like any dict or list.
Serialising — json.dumps
The other direction:
import json
result = {
    "name": "login_test",
    "status": "PASS",
    "duration_ms": 1240,
    "tags": ["smoke", "auth"]
}
print(json.dumps(result))

Output (one line):
{"name": "login_test", "status": "PASS", "duration_ms": 1240, "tags": ["smoke", "auth"]}
For a human-readable form, pass indent=2 (or 4):
print(json.dumps(result, indent=2))

Output:

{
  "name": "login_test",
  "status": "PASS",
  "duration_ms": 1240,
  "tags": [
    "smoke",
    "auth"
  ]
}
That's the format you'd save to disk for a test fixture or a CI artefact.
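Indentation changes only the text, not the data. The sketch below checks that claim by reusing the result dict from this section: both forms should parse back to the same Python object.

```python
import json

result = {
    "name": "login_test",
    "status": "PASS",
    "duration_ms": 1240,
    "tags": ["smoke", "auth"],
}

compact = json.dumps(result)
pretty = json.dumps(result, indent=2)

# The compact form is a single line; the pretty form spans several.
print(len(compact.splitlines()), len(pretty.splitlines()))

# Both parse back to an equal dict, so indent is purely cosmetic.
print(json.loads(compact) == json.loads(pretty) == result)
```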
Reading a JSON file — json.load
The same with open() pattern you'd use for any file:
import json
with open("fixtures/users.json", "r") as f:
    users = json.load(f)
print(f"Loaded {len(users)} users")

json.load(f) reads the entire file and parses it in one step. Notice it's load, not loads: the function operates on the file object, not on a string.
You could read the file into a string yourself and call loads, but load saves a step:
# Equivalent, but unnecessary
with open("fixtures/users.json", "r") as f:
    raw = f.read()
users = json.loads(raw)

The with statement runs open() as a context manager, which guarantees the file is closed when the block ends, even if the JSON parser raises. Always use with open(...) for files; we'll cover context managers properly in chapter 4.
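The "closed even if the parser raises" guarantee is easy to check. This is a small sketch, assuming a throwaway temporary file with deliberately malformed content; the handle and parse_failed names are only there for the demonstration.

```python
import json
import os
import tempfile

# Write a deliberately malformed JSON file to a temporary location.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    tmp.write("{not valid json}")
    path = tmp.name

handle = None
try:
    with open(path, "r") as f:
        handle = f      # keep a reference so we can inspect it afterwards
        json.load(f)    # raises json.JSONDecodeError
except json.JSONDecodeError:
    parse_failed = True
else:
    parse_failed = False

# The parse failed, yet the with block still closed the file.
print(parse_failed, handle.closed)

os.remove(path)  # clean up the temporary file
```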
Writing a JSON file — json.dump
import json
results = [
    {"name": "login", "status": "PASS"},
    {"name": "checkout", "status": "FAIL"}
]
with open("output/results.json", "w") as f:
    json.dump(results, f, indent=2)

Same with shape, mode "w" for write. json.dump(obj, f) writes the JSON straight into the file. indent=2 prettifies it; without it everything goes on one line.
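A quick way to convince yourself the write worked is to read the file straight back. The sketch below does that in a temporary directory (an assumption, so it runs anywhere without needing an output/ folder) and counts the lines that indent=2 produced.

```python
import json
import tempfile
from pathlib import Path

results = [
    {"name": "login", "status": "PASS"},
    {"name": "checkout", "status": "FAIL"},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "results.json"

    # Write with json.dump, then read back with json.load.
    with open(path, "w") as f:
        json.dump(results, f, indent=2)

    with open(path, "r") as f:
        reloaded = json.load(f)

    line_count = len(path.read_text().splitlines())

print(reloaded == results)  # the data survives the round trip
print(line_count)           # more than one line, thanks to indent=2
```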
JSON ↔ Python type mapping
Every JSON type has a clear Python counterpart. This is the table you'll memorise within a week:
| JSON | Python |
|---|---|
| object: { "k": "v" } | dict: { "k": "v" } |
| array: [ 1, 2, 3 ] | list: [ 1, 2, 3 ] |
| string: "hello" | str: "hello" |
| number: 42 or 1.5 | int: 42, or float: 1.5 |
| true / false | True / False (note the capitals) |
| null | None |
The two columns line up directly. Once you load JSON, you're working with normal Python collections — no special "JsonNode" class to navigate (looking at you, Java's Jackson).
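The mapping can be confirmed in a few lines. This sketch parses one made-up document containing every JSON type and records the Python type each value arrives as; the key names are invented for the example.

```python
import json

raw = (
    '{"obj": {"k": "v"}, "arr": [1, 2, 3], "text": "hello", '
    '"whole": 42, "frac": 1.5, "flag": true, "nothing": null}'
)
data = json.loads(raw)

# Map each key to the name of the Python type its value was parsed into.
types = {key: type(value).__name__ for key, value in data.items()}

for key, type_name in types.items():
    print(f"{key:8} -> {type_name}")
```

Note that a JSON number becomes an int or a float depending on whether it has a fractional part, and null shows up as NoneType, the type of None.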
Navigating nested data
JSON from an API is usually nested several levels deep. Each ["…"] step drills one more level; [i] indexes into a list:
import json
raw = '''
{
  "user": {
    "id": 7,
    "name": "Alice",
    "roles": ["admin", "tester"],
    "preferences": {"theme": "dark"}
  }
}
'''
data = json.loads(raw)
name = data["user"]["name"] # "Alice"
first_role = data["user"]["roles"][0] # "admin"
theme = data["user"]["preferences"]["theme"] # "dark"
print(name, first_role, theme)

Output:
Alice admin dark
That's the same shape you'd get from any REST API: a dict at the top, with keys that may point at sub-dicts, lists, or scalars. The whole tree is just nested dicts and lists.
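Because the tree is only dicts and lists, a short recursive function can list every value together with the bracket path that reaches it. This is an illustrative sketch, not part of the json module; walk is a hypothetical helper name.

```python
import json

def walk(node, path="data"):
    """Yield (path, value) pairs for every scalar in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, f'{path}["{key}"]')
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from walk(value, f"{path}[{index}]")
    else:
        yield path, node

raw = (
    '{"user": {"id": 7, "name": "Alice", '
    '"roles": ["admin", "tester"], "preferences": {"theme": "dark"}}}'
)
data = json.loads(raw)

leaves = list(walk(data))
for path, value in leaves:
    print(path, "=", value)
```

Running it on an unfamiliar API response is a handy way to discover which bracket chain leads to the field you want to assert on.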
Safe access — .get() chaining
Bracket access raises KeyError the moment a key is missing. Sometimes you want that, sometimes you don't. For optional paths, chain .get() calls with sensible defaults:
data = json.loads('{"user": {}}')
name = data.get("user", {}).get("name", "Unknown")
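print(name) # "Unknown"

The trick is the {} mid-chain. Without it, data.get("user") would return None when the key is missing, and the next .get() would fail with 'NoneType' object has no attribute 'get'. Default each level to an empty dict so the next call still works. One caveat: the default only applies when the key is absent. If the JSON contains "user": null, .get("user", {}) still returns None, so guard with (data.get("user") or {}) when nulls are possible.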
For deeper paths, this pattern grows ugly. Tools like glom (third-party) or dataclasses (chapter 5) tame deep access; for now, two or three levels of .get(..., {}) are fine.
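If you'd rather not pull in a third-party tool yet, a small helper can stand in for long .get() chains. The sketch below is one possible shape, not a standard-library function: dig and its default parameter are hypothetical names, and the sample JSON is made up.

```python
import json

def dig(data, *keys, default=None):
    """Follow keys (dict keys or list indexes) into nested data.

    Returns default as soon as any step is missing or has the wrong type.
    """
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current

data = json.loads('{"user": {"roles": ["admin"], "preferences": null}}')

first_role = dig(data, "user", "roles", 0)
theme = dig(data, "user", "preferences", "theme", default="light")
sixth_role = dig(data, "user", "roles", 5, default="none")
name = dig(data, "account", "name", default="Unknown")

print(first_role, theme, sixth_role, name)
```

Catching TypeError is what lets the helper survive a null in the middle of the path, a case plain .get(..., {}) chaining does not cover.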
The errors you'll meet
Three errors come up in practice:
data = json.loads('{"name": "Alice"}')
# 1. KeyError — wrong / missing key
data["email"] # KeyError: 'email'
# 2. TypeError — indexing the wrong type
data["name"]["first"] # TypeError: string indices must be integers
# 3. JSONDecodeError — malformed JSON
json.loads("{invalid}") # json.decoder.JSONDecodeError: …

Two practical habits help:
- Wrap risky parsing in try / except (chapter 6 covers exception handling fully).
- Use .get() for optional keys; reserve [] for keys you're confident must exist, so that a loud failure is an alarm bell, not noise.
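Both habits fit in one short sketch. parse_or_none is a hypothetical helper written for this example; msg, lineno and colno are attributes that json.JSONDecodeError provides.

```python
import json

def parse_or_none(raw):
    """Return the parsed JSON, or None (after printing the reason) if raw is malformed."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as e:
        print(f"Invalid JSON: {e.msg} (line {e.lineno}, column {e.colno})")
        return None

good = parse_or_none('{"name": "Alice"}')
bad = parse_or_none("{invalid}")

# .get() for the optional key, [] for the key that must exist.
email = good.get("email", "no email on record")
name = good["name"]

print(name, email, bad)
```

Whether to return None, re-raise, or fail the test outright depends on the caller; returning None here is just one option.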
A QA example — filter a fixture file
A complete, runnable script that reads users from a JSON file, picks out the admins, and writes them to a new file:
import json

# fixtures/users.json contains an array of users
with open("fixtures/users.json", "r") as f:
    users = json.load(f)

admins = [u for u in users if u.get("role") == "admin"]

with open("output/admins.json", "w") as f:
    json.dump(admins, f, indent=2)

print(f"Wrote {len(admins)} admin users to output/admins.json")

Three steps (read, filter, write) and you have a tiny ETL pipeline. The list comprehension from chapter 2 does the filtering; everything else is json.load and json.dump. No JSON library to install, no schema to declare upfront, no pom.xml to update.
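One thing the script takes for granted is that the output/ directory already exists: open(..., "w") creates the file but not its parent folders, and raises FileNotFoundError if they are missing. Here is a hedged sketch of the same read, filter, write flow wrapped in a function that creates the folder first. filter_admins and its parameters are hypothetical names, and the demo uses made-up users in a temporary directory so it is self-contained.

```python
import json
import tempfile
from pathlib import Path

def filter_admins(source, target):
    """Read users from source, keep the admins, write them to target; return the count."""
    with open(source, "r") as f:
        users = json.load(f)

    admins = [u for u in users if u.get("role") == "admin"]

    # Create the output folder (and any parents) if it is missing.
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    with open(target, "w") as f:
        json.dump(admins, f, indent=2)
    return len(admins)

# Demo with made-up sample data in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    source = Path(tmp_dir) / "fixtures" / "users.json"
    target = Path(tmp_dir) / "output" / "admins.json"
    source.parent.mkdir(parents=True)
    source.write_text(json.dumps([
        {"name": "Alice", "role": "admin"},
        {"name": "Bob", "role": "tester"},
        {"name": "Cara"},  # no role key: u.get("role") returns None
    ]))

    count = filter_admins(source, target)
    written = json.loads(target.read_text())

print(count, written)
```

Using u.get("role") rather than u["role"] means a user without a role is skipped instead of crashing the run.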
Tip: if you need to inspect a JSON file without writing a script, the JSON Formatter utility on qa.codes takes raw JSON and returns a formatted, copyable view. Useful when an API response in your terminal is a single 2 KB line.
Comparing to Java and JavaScript
| Task | Python | JavaScript | Java (Jackson) |
|---|---|---|---|
| Parse a string | json.loads(s) | JSON.parse(s) | mapper.readValue(s, Map.class) |
| Stringify a value | json.dumps(obj) | JSON.stringify(obj) | mapper.writeValueAsString(obj) |
| Read a file | json.load(open(...)) | JSON.parse(fs.readFileSync(...)) | mapper.readValue(file, Map.class) |
| Pretty-print | dumps(obj, indent=2) | JSON.stringify(obj, null, 2) | writerWithDefaultPrettyPrinter() |
Python's API is the most uniform of the three: the same load/dump pair covers both cases, with a trailing s for "string" and no s for "file." JavaScript is also concise but works on strings only, so file reading is two steps. Java is the most ceremonial.
⚠️ Common mistakes
- Mixing loads and load. json.loads takes a string; json.load takes a file object. Swap them and you get AttributeError ("'str' object has no attribute 'read'") or "TypeError: the JSON object must be str, bytes or bytearray." Remember: the trailing s is for strings.
- Forgetting indent when writing for humans. json.dump(data, f) writes the entire object on a single line: fine for machine-to-machine, miserable to read in a fixture file. Pass indent=2 for any file a human will inspect.
- Trying to dump types JSON doesn't know. json.dumps({"set": {1, 2, 3}}) raises TypeError: Object of type set is not JSON serializable. JSON has no set type. Convert sets to lists (list(my_set)), datetimes to ISO strings, etc., before serialising.
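The last mistake has two common fixes, sketched below with made-up sample data. The first converts values by hand, as the bullet suggests. The second uses the default parameter of json.dumps, which is called for any object the encoder cannot handle; to_jsonable is a hypothetical name for that fallback.

```python
import json
from datetime import datetime

report = {
    "tags": {"smoke"},
    "started": datetime(2024, 1, 15, 9, 30),  # sample timestamp, not real data
}

# Option 1: convert by hand before serialising.
clean = {
    "tags": sorted(report["tags"]),            # set -> sorted list
    "started": report["started"].isoformat(),  # datetime -> ISO string
}
manual = json.dumps(clean)

# Option 2: let json.dumps call a fallback for types it cannot encode.
def to_jsonable(value):
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Object of type {type(value).__name__} is not JSON serializable")

via_default = json.dumps(report, default=to_jsonable)

print(manual)
print(via_default)
```

Sorting the set is a choice made here so the output order is stable between runs; plain list(my_set) also works when order doesn't matter.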
🎯 Practice task
Read, transform, and write JSON. 25-30 minutes.
- Create fixtures/users.json containing an array of at least six user objects, each with name, email, role (mix of "admin", "tester", "viewer"), and active (true or false).
- Create filter_users.py. Use json.load to read the file into users.
- Print len(users) and the names of the first two using indexing.
- Use a list comprehension (chapter 2) to build admins = [u for u in users if u["role"] == "admin"].
- Use json.dump(admins, f, indent=2) to write them to output/admins.json. Confirm the file is readable and pretty-printed.
- Print the parsed dict in two ways: json.dumps(users[0]) (one line) and json.dumps(users[0], indent=2) (multi-line). Notice the difference.
- Demonstrate safe access: data.get("missing", "fallback") for a missing key, and the chained pattern users[0].get("metadata", {}).get("source", "unknown").
- Demonstrate one error: try json.loads("{nope}") inside a try / except json.JSONDecodeError as e: and print e.msg.
- Stretch: write a function def summarise(users: list) -> dict: that returns a dict like {"total": …, "admins": …, "active": …, "inactive": …}. Use json.dumps(summary, indent=2) to print the result. Bonus: persist it to output/summary.json.
You now have the full toolkit for working with the JSON every API and every fixture file uses. The next chapter zooms out from in-memory data to files and APIs — opening text and CSV files, calling HTTP endpoints with requests, and parsing the responses you get back.