JSON Schema Validation


A field-level test asserts that name equals "Alice". A schema-level test asserts that name is present at all, is a string, and is non-empty. The first test catches a wrong value; the second catches a renamed field, a removed field, a field whose type changed, or a new field nobody told you about. The API Testing Masterclass lesson on JSON Schema introduced the concept; this lesson is how to wire matchesJsonSchemaInClasspath(...) into a Rest Assured test and use it as a contract guard the rest of your suite leans on.

Why schema validation earns its keep

A field-level matcher tells you known fields have known values. It can't tell you the API just stopped sending the email field, or started sending it as null, or renamed it to emailAddress. Those are the bugs that escape into prod because every existing test is asserting on name and id, and nobody noticed the email field quietly disappear.

Schema validation is the single test that catches all four shape regressions:

  1. A required field is missing
  2. A field's type changed (string → number, string → object)
  3. A new field appeared (with additionalProperties: false)
  4. A value escaped its allowed range or enum

One line in the chain. Worth its weight.

The dependency you already have

The course pom.xml from Chapter 1 already includes:

<dependency>
    <groupId>io.rest-assured</groupId>
    <artifactId>json-schema-validator</artifactId>
    <version>5.4.0</version>
    <scope>test</scope>
</dependency>

The static import you'll need:

import static io.restassured.module.jsv.JsonSchemaValidator.matchesJsonSchemaInClasspath;

That single matcher does all the work.

A schema for one user

Drop the schema under src/test/resources/schemas/user-schema.json:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["id", "name", "email", "role"],
  "properties": {
    "id":        { "type": "integer", "minimum": 1 },
    "name":      { "type": "string", "minLength": 1 },
    "email":     { "type": "string", "format": "email" },
    "role":      { "type": "string", "enum": ["admin", "tester", "viewer"] },
    "createdAt": { "type": "string", "format": "date-time" }
  },
  "additionalProperties": false
}

Key parts:

  • required lists the fields that must be present. Missing any of them = schema failure.
  • properties declares each field's expected type and constraints.
  • enum enforces a closed set of values.
  • format: email and format: date-time are JSON Schema's built-in string validators.
  • additionalProperties: false rejects any field not listed in properties — the strictest setting, and the one that catches new-field regressions.
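
To see these keywords at work without a running API, the sketch below validates a few in-memory payloads against a trimmed, inline copy of the schema. It assumes the json-schema-validator dependency is on the classpath and uses matchesJsonSchema(String), the string-based sibling of the classpath matcher. The class name, the conforms helper, and the sample payloads are hypothetical, and the format keywords are left out to keep it short.

```java
import static io.restassured.module.jsv.JsonSchemaValidator.matchesJsonSchema;

public class ShapeRegressionDemo {

    // Trimmed inline copy of user-schema.json (illustrative; email and createdAt omitted).
    static final String SCHEMA = "{"
        + "\"type\": \"object\","
        + "\"required\": [\"id\", \"name\", \"role\"],"
        + "\"properties\": {"
        + "  \"id\":   { \"type\": \"integer\", \"minimum\": 1 },"
        + "  \"name\": { \"type\": \"string\", \"minLength\": 1 },"
        + "  \"role\": { \"type\": \"string\", \"enum\": [\"admin\", \"tester\", \"viewer\"] }"
        + "},"
        + "\"additionalProperties\": false"
        + "}";

    /** Returns true when the JSON document satisfies SCHEMA. */
    public static boolean conforms(String json) {
        // The matcher is a Hamcrest matcher over the JSON text, so matches() gives a plain boolean.
        return matchesJsonSchema(SCHEMA).matches(json);
    }

    public static void main(String[] args) {
        String[] payloads = {
            "{\"id\":1,\"name\":\"Alice\",\"role\":\"admin\"}",                     // conforming
            "{\"id\":1,\"role\":\"admin\"}",                                        // required name missing
            "{\"id\":\"1\",\"name\":\"Alice\",\"role\":\"admin\"}",                 // id changed type
            "{\"id\":1,\"name\":\"Alice\",\"role\":\"admin\",\"createdBy\":\"x\"}", // unexpected field
            "{\"id\":1,\"name\":\"Alice\",\"role\":\"superadmin\"}"                 // value outside enum
        };
        for (String payload : payloads) {
            System.out.println(conforms(payload) + "  " + payload);
        }
    }
}
```

Only the first payload should conform; each of the others breaks exactly one rule from the list above.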

Validating a response

@Test
public void getUserOneMatchesSchema() {
    given()
    .when()
        .get("/users/1")
    .then()
        .statusCode(200)
        .body(matchesJsonSchemaInClasspath("schemas/user-schema.json"));
}

That's the entire integration. The validator loads the schema, walks the response, and reports the first violation (or every violation, depending on configuration) with a JSON Pointer to the offending field. A failure looks roughly like this (the exact wording varies with the validator version):

Schema validation failed:
  /role: instance value ("superadmin") not found in enum (admin, tester, viewer)
  /email: required key not found

That's a debugging dream compared to a generic 200 OK passing while the body has rotted.

A schema for an array of users

When the response is [{...}, {...}, ...], the schema describes the array, then references the per-item schema:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": { "$ref": "user-schema.json" },
  "minItems": 1,
  "maxItems": 100
}

$ref resolves relative to the schema file's directory — keep both files in the same folder. minItems and maxItems capture the contract that the endpoint always returns at least one and never more than a hundred users.
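
A wrong classpath path tends to surface as a confusing error rather than a clear "file not found", so it can help to confirm that both schema files resolve before debugging the schemas themselves. The helper below is a minimal sketch using only the JDK. It assumes the schemas are looked up through the thread's context class loader, and the class and method names are hypothetical.

```java
import java.net.URL;

public class SchemaResources {

    /** Returns true when the classpath-relative path resolves to a resource. */
    public static boolean isOnClasspath(String path) {
        URL url = Thread.currentThread().getContextClassLoader().getResource(path);
        return url != null;
    }

    public static void main(String[] args) {
        // Paths are relative to the classpath root, which is src/test/resources in a Maven layout.
        String[] paths = {"schemas/user-schema.json", "schemas/users-array-schema.json"};
        for (String path : paths) {
            System.out.println(path + " found: " + isOnClasspath(path));
        }
    }
}
```

Note that the path has no leading slash and no src/test/resources prefix; it is the same string you pass to matchesJsonSchemaInClasspath(...).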

What a schema test catches

Field-level matchers vs schema validation — what each one catches

Field matchers only

  • name == "Alice"

    fails on wrong value

  • email contains @

    fails on malformed email

  • Field renamed (email → emailAddress)

    passes silently — name still matches

  • New field added (createdBy)

    passes silently

  • Type changed (id: 1 → "1")

    passes silently unless some matcher compares id against a typed value

Schema validation

  • Required field missing

    FAILS with /email: required key not found

  • Type changed (integer → string)

    FAILS with /id: instance type (string) does not match (integer)

  • Field renamed

    FAILS — old field missing, new field forbidden by additionalProperties

  • Value outside enum

    FAILS with /role: not found in enum

  • Specific values like name == "Alice"

    DOESN'T check — that's still the field matcher's job

The two are complements, not competitors. Schema validation defines the contract; field matchers verify specific business invariants. Run both: schema as the regression net, matchers for the meaningful values.
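
Running both in one chain could look like the sketch below. It assumes JUnit 5, a base URI configured elsewhere in the suite, and a fixture in which user 1 is Alice with the admin role; the role value and the class name are hypothetical.

```java
import static io.restassured.RestAssured.given;
import static io.restassured.module.jsv.JsonSchemaValidator.matchesJsonSchemaInClasspath;
import static org.hamcrest.Matchers.equalTo;

import org.junit.jupiter.api.Test;

public class UserContractTest {

    @Test
    public void getUserOneMatchesSchemaAndValues() {
        given()
        .when()
            .get("/users/1")
        .then()
            .statusCode(200)
            // Shape: required fields, types, enum membership, no unexpected fields.
            .body(matchesJsonSchemaInClasspath("schemas/user-schema.json"))
            // Values: the specific expectations for this fixture (assumed data).
            .body("name", equalTo("Alice"))
            .body("role", equalTo("admin"));
    }
}
```

Putting the schema assertion first means a shape regression is reported as a contract failure instead of a confusing value mismatch further down the chain.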

Constraints worth remembering

A few JSON Schema features that come up constantly in API tests:

{
  "properties": {
    "username":   { "type": "string", "minLength": 3, "maxLength": 30, "pattern": "^[a-z0-9_]+$" },
    "age":        { "type": "integer", "minimum": 0, "maximum": 150 },
    "tags":       { "type": "array", "items": { "type": "string" }, "uniqueItems": true },
    "status":     { "type": "string", "enum": ["pending", "active", "closed"] },
    "metadata":   { "type": "object", "additionalProperties": { "type": "string" } },
    "deletedAt":  { "type": ["string", "null"], "format": "date-time" }
  }
}

["string", "null"] is the canonical way to say "string or null", which matters when the API legitimately returns null, as deletedAt does for a record that has not been soft-deleted. uniqueItems is the keyword you'll be glad you knew when an array starts duplicating because of a JOIN bug.
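
Before committing a pattern to a schema, it can be worth trying it against a few sample values. The sketch below mirrors the username constraints with the JDK's regex engine. JSON Schema patterns follow ECMA 262 syntax, which agrees with Java for an expression this simple but not for every regex, so treat it as a quick sanity check and not a substitute for the schema test. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class UsernamePatternCheck {

    // Same expression as the "pattern" keyword on username in the schema above.
    private static final Pattern USERNAME = Pattern.compile("^[a-z0-9_]+$");

    /** Mirrors minLength 3, maxLength 30 and the pattern constraint on username. */
    public static boolean satisfiesUsernameConstraints(String candidate) {
        // JSON Schema counts string length in characters, so count code points, not UTF-16 units.
        int length = candidate.codePointCount(0, candidate.length());
        // pattern is a search and not a full match, so find() is the closer analogue;
        // the ^ and $ anchors in the expression are what force the whole value to match.
        return length >= 3 && length <= 30 && USERNAME.matcher(candidate).find();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"alice_01", "al", "Alice", "bob smith"}) {
            System.out.println(candidate + " -> " + satisfiesUsernameConstraints(candidate));
        }
    }
}
```

Of the four samples, only alice_01 passes: al is too short, Alice contains an uppercase letter, and bob smith contains a space. Dropping the anchors from the pattern would let all but al through, which is an easy mistake to make in a schema.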

When to write a schema

The honest rule: one schema per resource type, validated on every response that returns that resource. A /users/1 response and a /users array response both reference user-schema.json. A /orders/{id} response references order-schema.json. The schemas live next to the tests; updating them is part of any API contract change.

The work compounds: a schema written today catches every breaking change for that endpoint forever. Refusing to write one means relying on field-level matchers to notice missing fields — which they don't.

Generating schemas instead of writing them by hand

For a fast start, paste a known-good response into a JSON-to-Schema generator (several online tools and schema-inference libraries exist) and edit the result. The generated schema usually needs the required list trimmed and the additionalProperties flag flipped, but it gets you 80% of the way without writing braces by hand.

A schema test with a rich failure message

@Test
public void usersResponseStructureIsValid() {
    given()
    .when()
        .get("/users")
    .then()
        .statusCode(200)
        .body(matchesJsonSchemaInClasspath("schemas/users-array-schema.json"));
}

When the response shape changes, the failure is precise: which user (index 3 in the array), which field (email), what was expected (a required string), and what it got (nothing, or null). Compare that to a field-level test that just says "expected non-null but was null" with no hint of where in the array the problem sits.

⚠️ Common mistakes

  • Skipping additionalProperties: false. Without it, the API can sprout new fields and your schema test silently passes. The whole point of the schema test is to fail when the contract drifts. Default to strict; relax it only with deliberate intent.
  • Putting test-specific values in the schema. Schemas describe the shape, not the content. Asserting "role": { "const": "admin" } couples the schema to one test case — when the next test logs in as a tester, the schema fails. Keep specific values in field-level matchers.
  • Letting schemas drift from the API. A schema that hasn't been updated since 2022 produces false failures, which trains the team to ignore schema failures, which defeats the test. When the API contract changes, the schema is part of the change — review it like code.

🎯 Practice task

Wire schema validation into the suite you've been growing. 25–35 minutes against JSONPlaceholder.

  1. Create src/test/resources/schemas/user-schema.json matching the lesson's example, but tuned to JSONPlaceholder's /users/1 response (it has fields like username, phone, website, plus nested address and company objects).
  2. Write getUserOneMatchesSchema() and run it green.
  3. Force a schema failure. Add a required field that doesn't exist ("hometown"). Run the test, read the message — note that the message names the field. Remove the bad entry.
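  4. Catch a renamed field. Set additionalProperties: false. In the schema, rename email to emailAddress in both required and properties, which simulates a contract where the field was renamed. The test should report two violations: one for the missing emailAddress, one for the unexpected email. Restore.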
  5. Validate the array endpoint. Create users-array-schema.json referencing user-schema.json via $ref. Add minItems: 10, maxItems: 10 (JSONPlaceholder always returns 10). Validate GET /users against it.
  6. Type validation. Write a smaller schema for /posts/1 with "id": { "type": "integer" } and run it green. Then force it to fail by changing the type to "string". Read the failure.
  7. Pattern validation. Add "pattern": "^[\\d-]+$" to address.zipcode. Run green. Tighten to "pattern": "^\\d{5}$" and watch it fail on JSONPlaceholder's hyphenated zip codes.
  8. Stretch: add a schema for /posts/1 and validate. Then add "const": "Bret" somewhere it doesn't apply, run, read the terrible failure message, and remove it. This is why schemas should describe shape, not values.

Next lesson: XML responses — when the server doesn't speak JSON and you need XmlPath, XPath, and XSD validation to assert on a SOAP-flavoured world.
