You've seen what GraphQL is and how queries, mutations, and subscriptions are shaped. This lesson is about turning that knowledge into a concrete test plan: which assertions to write, what to negative-test, and the GraphQL-specific bugs you should actively look for. The mental model carries over from REST testing — auth, validation, errors, performance — but each shifts in subtle ways that, if you miss them, leave gaps.
The three-step assertion pattern
Every GraphQL test should answer three questions in this order:
- Did the HTTP request succeed? (Status code 200.)
- Did the GraphQL operation succeed? (The errors array is absent or empty.)
- Is the data correct? (Specific field assertions on data.)
Skipping step 2 is the single biggest GraphQL testing mistake. Many test suites assert on the data without ever checking errors, so they happily report success on a partially-failed response.
A reusable helper covers the first two steps:

    import requests

    # GRAPHQL_URL and token are assumed to come from your test configuration.
    def gql(query: str, variables: dict | None = None):
        response = requests.post(
            GRAPHQL_URL,
            json={"query": query, "variables": variables or {}},
            headers={"Authorization": f"Bearer {token}"},
            timeout=5,
        )
        # Step 1: the HTTP request itself succeeded.
        assert response.status_code == 200, f"HTTP {response.status_code}: {response.text}"
        body = response.json()
        # Step 2: the GraphQL operation reported no errors.
        if body.get("errors"):
            raise AssertionError(f"GraphQL errors: {body['errors']}")
        return body["data"]

Tests then call data = gql("{ user(id: 42) { email } }") and assert against the parsed data.
Testing queries
A query test should cover:
- Happy path: valid arguments, expected fields populated.
- Field selection: request a subset of fields, verify only those come back.
- Nested data: request user { orders { items } }, verify deep structure.
- Arguments: user(id: 42) returns user 42; user(id: 43) returns user 43.
- Empty results: user(id: 999999) returns data.user: null (not an error in most schemas).
- Invalid field: { user(id: 42) { nonExistentField } } → 400 or 200 with errors describing the unknown field.
- Wrong argument type: user(id: true) → validation error.
- Required argument missing: user { name } (no id) → validation error.
- Authentication: anonymous request to a protected query → error with code: UNAUTHENTICATED.
A subtlety: data.user: null and an errors entry mean different things. In GraphQL, a null with no matching errors entry means "the field resolved successfully and the value happens to be null." An entry in errors means the field couldn't be resolved. Treat them differently in your assertions.
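One way to keep the two cases apart in assertions is a small classifier over the raw response body. This is a minimal sketch, not part of any library: the name field_outcome is hypothetical, and it assumes the server fills in the path key on each error as in the examples in this lesson.

```python
def field_outcome(body: dict, field: str) -> str:
    """Classify a top-level field as 'value', 'null', or 'error'."""
    errors = body.get("errors") or []
    # An error whose path starts with the field name means that field
    # (or something beneath it) failed to resolve.
    if any((err.get("path") or [None])[0] == field for err in errors):
        return "error"
    data = body.get("data") or {}
    return "null" if data.get(field) is None else "value"


# Sample bodies, written by hand for illustration.
not_found = {"data": {"user": None}}
failed = {
    "data": {"user": None},
    "errors": [{"message": "boom", "path": ["user"]}],
}
assert field_outcome(not_found, "user") == "null"
assert field_outcome(failed, "user") == "error"
```

A test for the "empty results" case would then assert the outcome is "null", while a test for a failing resolver would assert "error".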
Testing mutations
Mutations need the same rigour as REST POST/PUT/DELETE endpoints:
- Happy path: valid input → mutation succeeds, returned fields match.
- Missing required input → validation error before the mutation runs.
- Invalid input values → resolver-level error (e.g. duplicate email → code: CONFLICT).
- Authentication: no token → unauthenticated error.
- Authorisation: token with wrong scope/role → forbidden error.
- Idempotency: call the same mutation twice. Does it create two records, or detect the duplicate?
- Side effects: verify the change actually happened (DB read or follow-up query).
The "follow-up query" pattern is not unique to GraphQL, but it is especially convenient here:

    data = gql(
        "mutation Create($input: UserInput!) { createUser(input: $input) { id } }",
        variables={"input": {"name": "Alice", "email": "alice@test.com"}},
    )
    new_id = data["createUser"]["id"]
    data = gql("query Get($id: ID!) { user(id: $id) { email } }", {"id": new_id})
    assert data["user"]["email"] == "alice@test.com"

Two operations, end-to-end verification, all over the same /graphql endpoint.
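The idempotency check from the list above can be sketched the same way. The code below is an illustration under stated assumptions: the Executor type, second_create_codes, and the in-memory fake_execute server are all hypothetical names, and the fake assumes the API rejects a duplicate email with a CONFLICT code. Against a real API you would pass a function that posts to the endpoint and returns the raw body.

```python
from typing import Callable

# Hypothetical transport: takes (query, variables), returns the raw response body.
Executor = Callable[[str, dict], dict]

CREATE_USER = "mutation Create($input: UserInput!) { createUser(input: $input) { id } }"


def second_create_codes(execute: Executor, user_input: dict) -> list:
    """Run createUser twice with identical input; return the second call's error codes."""
    first = execute(CREATE_USER, {"input": user_input})
    assert not first.get("errors"), f"first create failed: {first.get('errors')}"
    second = execute(CREATE_USER, {"input": user_input})
    return [(e.get("extensions") or {}).get("code") for e in second.get("errors") or []]


def fake_execute(seen_emails: set) -> Executor:
    """In-memory stand-in for a server that rejects duplicate emails."""
    def execute(query: str, variables: dict) -> dict:
        email = variables["input"]["email"]
        if email in seen_emails:
            return {
                "data": {"createUser": None},
                "errors": [{
                    "message": "Email taken",
                    "path": ["createUser"],
                    "extensions": {"code": "CONFLICT"},
                }],
            }
        seen_emails.add(email)
        return {"data": {"createUser": {"id": str(len(seen_emails))}}}
    return execute


codes = second_create_codes(fake_execute(set()), {"name": "Alice", "email": "alice@test.com"})
assert codes == ["CONFLICT"]
```

If the second call returns no error codes at all, the mutation created a second record, which is worth raising with the team even if it turns out to be intended.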
Errors in the response body
A typical GraphQL error response:

    {
      "data": { "user": null },
      "errors": [
        {
          "message": "User not found",
          "path": ["user"],
          "extensions": { "code": "NOT_FOUND" }
        }
      ]
    }

Each error has a message, a path indicating which field in the query failed, and extensions holding structured metadata (often an error code). When asserting on errors, prefer extensions.code over the human-readable message: codes are stable, while message wording changes.

    errors = body.get("errors", [])
    assert len(errors) == 1
    assert errors[0]["extensions"]["code"] == "NOT_FOUND"

A frequent cause of confusion: a GraphQL response can have both data and errors populated. If your query asks for ten things and three fail, the response includes the seven that succeeded plus three error entries. Test for this partial success explicitly when it matters.
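A partial-success assertion needs the raw body rather than a helper that raises on any error. The following is a rough sketch: split_partial is a hypothetical name, the sample body is hand-written, and the "UNKNOWN" fallback is an assumption for servers that omit extensions.code.

```python
def split_partial(body: dict) -> tuple[set[str], dict[str, str]]:
    """Return (top-level fields with no errors, {failed path: error code})."""
    failed = {}
    for err in body.get("errors") or []:
        path = ".".join(str(p) for p in err.get("path") or [])
        failed[path] = (err.get("extensions") or {}).get("code", "UNKNOWN")
    data = body.get("data") or {}
    # A top-level field counts as failed if any error path starts with it.
    failed_roots = {path.split(".")[0] for path in failed}
    ok = {name for name in data if name not in failed_roots}
    return ok, failed


# Two aliased lookups in one request: one succeeds, one fails.
body = {
    "data": {"alice": {"email": "alice@test.com"}, "ghost": None},
    "errors": [
        {"message": "User not found", "path": ["ghost"], "extensions": {"code": "NOT_FOUND"}}
    ],
}
ok, failed = split_partial(body)
assert ok == {"alice"}
assert failed == {"ghost": "NOT_FOUND"}
```

Asserting on both halves makes the test fail if a field silently moves from one group to the other.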
Introspection
GraphQL servers expose a meta-query that returns the entire schema:

    query {
      __schema {
        types { name kind }
      }
    }

Useful in development; risky in production. Many teams disable introspection on production to make API surface reconnaissance harder for attackers. Worth a test:

    response = requests.post(prod_url, json={"query": "{ __schema { types { name } } }"})
    assert response.json().get("errors"), "Introspection should be disabled in production"

In staging or development, the opposite assertion may apply: confirm introspection works so the team can debug schema issues.
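Both directions of the check can live in one helper that takes the environment as a parameter. This is a sketch only: the function names and the environment strings are assumptions, and it checks for the presence of schema data rather than for an errors entry, since a server could in principle return both.

```python
def introspection_enabled(body: dict) -> bool:
    """True if the response to an introspection query actually contains a schema."""
    data = body.get("data") or {}
    return bool(data.get("__schema"))


def check_introspection_policy(body: dict, environment: str) -> None:
    """Assert the policy described above; 'production' is an assumed label."""
    enabled = introspection_enabled(body)
    if environment == "production":
        assert not enabled, "Introspection should be disabled in production"
    else:
        assert enabled, f"Introspection should work in {environment}"


# Hand-written sample responses.
blocked = {"errors": [{"message": "Introspection is disabled"}]}
open_schema = {"data": {"__schema": {"types": [{"name": "User"}]}}}
check_introspection_policy(blocked, "production")
check_introspection_policy(open_schema, "staging")
```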
N+1 query risk
GraphQL's flexibility lets a client ask for users { posts { comments } } in one request. A naive backend implementation issues:
- 1 query to fetch the users.
- 1 query per user to fetch their posts.
- 1 query per post to fetch its comments.
For 100 users with 10 posts each, that's 1 + 100 + 1,000 = 1,101 database queries to satisfy a single GraphQL request. Backend developers typically defend against this with a batching layer (DataLoader). Tests can detect when the defence is missing:
- Run the query against a test database with logging enabled.
- Count the SQL queries triggered.
- Assert "fewer than N" — typically 5-10 — for a query that should fan out widely.
If you don't have DB-level instrumentation, response time is a usable proxy: an N+1 explosion shows up as a 5-30× latency increase on nested queries.
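The arithmetic and the "fewer than N" assertion can be reproduced without a real database. The sketch below uses a hypothetical in-memory CountingDB that only counts calls; the naive and batched resolvers are simplified stand-ins for a backend without and with a batching layer, not DataLoader itself.

```python
class CountingDB:
    """Hypothetical in-memory stand-in that counts every query it serves."""

    def __init__(self, users: int, posts_per_user: int):
        self.users = users
        self.posts_per_user = posts_per_user
        self.queries = 0

    def fetch_users(self) -> list[int]:
        self.queries += 1
        return list(range(self.users))

    def fetch_posts(self, user_ids: list[int]) -> list[tuple[int, int]]:
        self.queries += 1  # one query, however many ids are passed
        return [(u, p) for u in user_ids for p in range(self.posts_per_user)]

    def fetch_comments(self, posts: list[tuple[int, int]]) -> list[str]:
        self.queries += 1
        return [f"comment on post {p} of user {u}" for u, p in posts]


def resolve_naive(db: CountingDB) -> None:
    # One posts query per user, one comments query per post.
    for user in db.fetch_users():
        for post in db.fetch_posts([user]):
            db.fetch_comments([post])


def resolve_batched(db: CountingDB) -> None:
    # One query per level, regardless of how many rows each level holds.
    users = db.fetch_users()
    posts = db.fetch_posts(users)
    db.fetch_comments(posts)


naive, batched = CountingDB(100, 10), CountingDB(100, 10)
resolve_naive(naive)
resolve_batched(batched)
assert naive.queries == 1101  # 1 + 100 + 1,000
assert batched.queries < 10   # the "fewer than N" assertion
```

In a real suite the counter would come from database query logging or an ORM hook rather than a fake, but the assertion at the end keeps the same shape.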
Query depth and complexity limits
A malicious or buggy client can send a deeply nested query:

    {
      user {
        friends {
          friends {
            friends {
              friends { id name }
            }
          }
        }
      }
    }

Without limits, the server traverses an exponentially growing set. A defence layer (graphql-depth-limit, query complexity calculators) should reject deep or expensive queries before they run. As QA, the test is:
- Send a deeply nested query past the documented limit.
- Expect an error (typically QUERY_TOO_COMPLEX or similar) and a fast response (the server doesn't actually execute the query).
If the server runs the deep query to completion, you've found a denial-of-service vector worth flagging.
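To send a query just past the documented limit, it helps to generate it rather than type it out. The sketch below is illustrative: both function names are hypothetical, and the depth is measured by counting braces, which ignores braces inside string arguments and may not match how a particular server counts depth.

```python
def nested_friends_query(levels: int) -> str:
    """Build { user { friends { ... { id name } } } } with `levels` friends fields."""
    inner = "id name"
    for _ in range(levels):
        inner = f"friends {{ {inner} }}"
    return f"{{ user {{ {inner} }} }}"


def query_depth(query: str) -> int:
    """Maximum brace nesting, not counting the outer operation braces."""
    depth = deepest = 0
    for ch in query:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return max(deepest - 1, 0)


# Four levels of friends, the same shape as the query shown above.
example = nested_friends_query(4)
assert example.count("friends") == 4
assert query_depth(example) == 5  # user + four nested friends selections
```

With a documented limit in hand, a test can build nested_friends_query(limit + 1), send it, and assert on both the error code and the response time.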
A worked test plan
For a User type with a createUser mutation and a user(id) query, the standing test set looks like:
Query — user(id):
✓ Valid id → data.user with all fields
✓ Subset selection → only requested fields
✓ Nested orders → deep shape
✓ Non-existent id → data.user is null, no errors
✓ Missing id arg → validation error
✓ Wrong type id → validation error
✓ Anonymous → UNAUTHENTICATED error
Mutation — createUser:
✓ Valid input → returns id
✓ Created user retrievable via user(id) query
✓ Missing email → validation error
✓ Duplicate email → CONFLICT error
✓ Anonymous → UNAUTHENTICATED
✓ Insufficient role → FORBIDDEN
Schema/security:
✓ Introspection disabled in production
✓ Excessive depth rejected with depth-limit error
✓ Response time on nested user.orders.items query under threshold
About fifteen tests per type. Parameterise where possible to keep maintenance low.
⚠️ Common mistakes
- Asserting only on the data field. When a response has data: null and an errors array, a naive assert data["user"]["email"] == ... test does fail, but with a TypeError or KeyError whose message says nothing about the real cause. Always check errors first.
- Skipping introspection tests in production. A leaked schema makes attacks easier. Verify it's disabled where it should be.
- Accepting any extensions.code as fine. The server may return a generic INTERNAL_SERVER_ERROR for what should be a specific NOT_FOUND or VALIDATION_ERROR. Assert on the correct code.
🎯 Practice task
Build a small GraphQL test suite. 30-40 minutes.
- Pick a public GraphQL API: Countries, SpaceX, or GitHub GraphQL. Use one that doesn't require auth so you can iterate fast (GitHub's requires a token).
- Write a gql() helper in your favourite language that posts a query, checks HTTP 200, raises on errors, and returns data.
- Write three positive tests: a simple query, a query with variables, and a query with nested data.
- Write three negative tests: unknown field, wrong argument type, missing required argument. Assert on the errors array's extensions.code where available.
- Try an introspection query ({ __schema { types { name } } }). Note whether it works on this API.
- Stretch: time a single-level query and a deeply-nested query. The nested one should be slower, sometimes dramatically. That's the N+1 signal.
You can now write meaningful tests against any GraphQL API. The final lesson of this chapter catalogues the GraphQL-specific bugs and pitfalls that surprise even experienced testers.