How to Build SKILL.md for QA Agent Skills

SKILL.md anatomy in depth

Every SKILL.md has two parts: a YAML frontmatter block and a Markdown instruction body.

---
# ── FRONTMATTER ──────────────────────────────────────────────────────────────
name: api-test-generator              # required: kebab-case, unique
description: |                        # required: the activation trigger
  Generates API test cases (using Playwright, Supertest, or raw fetch) from
  an OpenAPI spec, Postman collection, or endpoint description. Use when the
  user asks to write or generate API tests, HTTP endpoint tests, or REST
  client test suites. Do NOT use for UI/browser tests or contract tests.
version: "1.2.0"                      # optional but recommended
metadata:                             # optional — any key-value pairs
  author: QA Platform Team
  tags: [api-testing, playwright, supertest]
  last-reviewed: "2025-11"
---

<!-- ── INSTRUCTION BODY ───────────────────────────────────────────────────── -->

## When to use
- User provides an OpenAPI spec, route definition, or endpoint description
- User asks to "write API tests", "test this endpoint", or "generate REST tests"

## Inputs
- API spec file path or endpoint description
- Base URL and auth mechanism (bearer, API key, session cookie, none)
- Expected response schemas

## Instructions
1. Parse the provided spec and identify all distinct endpoints.
2. For each endpoint, generate: happy path, validation error, and auth error cases.
3. Use the framework indicated by the user; default to Playwright's APIRequestContext.
4. Organise tests by resource (e.g. /users → users.api.spec.ts).
5. Include assertions for status code, Content-Type header, and response body shape.

## Output format
Full test file(s) + a brief coverage summary table (endpoint | cases | gaps).
Flag any spec ambiguities as assumptions.

## Anti-patterns
- Do not embed secrets or real API keys — use environment variables.
- Do not use hard-coded base URLs; read from process.env.BASE_URL.

## Safety
- Validate inputs before running scripts.
- Never write generated tests directly to disk without user confirmation.
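
The two-part layout shown above can be checked mechanically before a skill is committed. The following is a minimal sketch, not an official parser: it assumes the frontmatter is delimited by two `---` lines and that required keys sit at the top level with no indentation. The function names `split_skill_md` and `missing_required_keys` are hypothetical, and a real check would more likely use a proper YAML library.

```python
def split_skill_md(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) from the text of a SKILL.md file."""
    lines = text.splitlines()
    # The file must open with a '---' line that starts the frontmatter.
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter delimiter")
    # The next '---' line closes the frontmatter; everything after it is the body.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter is never closed with a second '---' line")


def missing_required_keys(frontmatter: str, required=("name", "description")) -> list[str]:
    """List required top-level keys that do not appear in the frontmatter."""
    # Top-level keys start in column 0; indented lines belong to a nested value.
    present = {
        line.split(":", 1)[0].strip()
        for line in frontmatter.splitlines()
        if ":" in line and not line.startswith((" ", "\t", "#"))
    }
    return [key for key in required if key not in present]


sample = """---
name: api-test-generator
description: |
  Generates API test cases from an OpenAPI spec. Use when the user asks
  to write API tests. Do NOT use for UI/browser tests.
version: "1.2.0"
---

## When to use
- User provides an OpenAPI spec
"""

frontmatter, body = split_skill_md(sample)
print(missing_required_keys(frontmatter))  # prints []
print(body.splitlines()[0])                # prints ## When to use
```

A check like this only confirms the shape of the file. It says nothing about whether the description is a good activation trigger.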

Writing a good description

Of everything in the skill, only the frontmatter's name and description are read during the Discovery stage, and the description carries the meaning: it is the activation trigger. If it is vague, the agent will either miss the skill entirely or activate it when it should not.

A good description answers three questions in 2–5 sentences:

1. What does this skill do? (specific, not generic)
2. When should the agent use it? (trigger conditions: what user request or context matches)
3. When should the agent NOT use it? (exclusions: these prevent false positives)

Vague — avoid

description: Helps with testing.

Specific — do this

description: |
  Generates Playwright TypeScript end-to-end tests from a feature description,
  acceptance criteria, or user story. Use when asked to write, scaffold, or
  generate end-to-end or browser-level tests using Playwright. Do NOT use for
  unit tests, API-level tests, or test frameworks other than Playwright.

Description checklist

Names the specific tool or framework (Playwright, Supertest, Jest…)
States the input type (user story, spec file, endpoint description…)
Includes explicit trigger phrases ("Use when asked to…")
Includes explicit exclusion phrases ("Do NOT use for…")
Is 2–5 sentences — not a one-liner, not a paragraph
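
Most of the checklist above can be turned into a rough automated lint. The sketch below is illustrative only: `KNOWN_TOOLS` is a hypothetical list you would replace with your own, the sentence count is a naive split on punctuation, and the "states the input type" item is left out because it is hard to check mechanically.

```python
import re

# Hypothetical list of tool names; extend it with whatever your team uses.
KNOWN_TOOLS = ("Playwright", "Supertest", "Jest")


def check_description(description: str) -> list[str]:
    """Return the checklist items that the description fails."""
    problems = []
    text = " ".join(description.split())  # collapse YAML block-scalar line breaks

    if not any(tool.lower() in text.lower() for tool in KNOWN_TOOLS):
        problems.append("names no specific tool or framework")
    if not re.search(r"\buse when\b", text, re.IGNORECASE):
        problems.append("has no trigger phrase ('Use when ...')")
    if not re.search(r"\bdo not use\b", text, re.IGNORECASE):
        problems.append("has no exclusion phrase ('Do NOT use ...')")

    # Rough sentence count: split after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    if not 2 <= len(sentences) <= 5:
        problems.append(f"has {len(sentences)} sentence(s); aim for 2 to 5")
    return problems


vague = "Helps with testing."
specific = """
Generates Playwright TypeScript end-to-end tests from a feature description,
acceptance criteria, or user story. Use when asked to write, scaffold, or
generate end-to-end or browser-level tests using Playwright. Do NOT use for
unit tests, API-level tests, or test frameworks other than Playwright.
"""

print(len(check_description(vague)))  # prints 4
print(check_description(specific))    # prints []
```

A passing lint does not guarantee correct activation; it only catches the omissions the checklist names.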

Folder structure

Start minimal and add directories only when you need them. A skill with just a SKILL.md is a valid, complete skill.

.agents/skills/                  ← skill root; the path depends on the agent (Claude Code uses .claude/skills/)
└── playwright-test-gen/         ← one directory per skill
    ├── SKILL.md                 ← required
    ├── references/              ← add when you have style guides or specs to reference
    │   └── playwright-conventions.md
    ├── templates/               ← add when you want consistent output structure
    │   └── spec-template.ts
    ├── examples/                ← add when worked examples help the agent
    │   └── login.spec.ts
    └── scripts/                 ← add when the skill needs to run commands
        └── scaffold.sh
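
The minimal layout (one directory containing only SKILL.md) could be scaffolded as follows. This is a sketch under assumptions: `scaffold_skill`, `KEBAB_CASE`, and the stub text in `SKILL_TEMPLATE` are hypothetical, and the demonstration writes to a temporary directory rather than to a real repository.

```python
import re
import tempfile
from pathlib import Path

KEBAB_CASE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

SKILL_TEMPLATE = """---
name: {name}
description: |
  TODO: say what the skill does, when to use it, and when NOT to use it.
---

## When to use

## Inputs

## Instructions

## Output format

## Anti-patterns
"""


def scaffold_skill(skills_root: Path, name: str) -> Path:
    """Create <skills_root>/<name>/SKILL.md and return the path to the file."""
    if not KEBAB_CASE.match(name):
        raise ValueError(f"skill name must be lowercase and hyphenated: {name!r}")
    skill_dir = skills_root / name
    # exist_ok=False so an existing skill is never overwritten silently.
    skill_dir.mkdir(parents=True, exist_ok=False)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(SKILL_TEMPLATE.format(name=name), encoding="utf-8")
    return skill_file


# Demonstration in a throwaway directory rather than a real repository.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp) / ".agents" / "skills", "playwright-test-gen")
    print(created.relative_to(tmp).as_posix())
    # prints .agents/skills/playwright-test-gen/SKILL.md
```

The optional directories are deliberately not created here, in keeping with the advice to add them only when they are needed.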

Security note

Scripts in the scripts/ directory can be executed by the agent. Only include scripts you wrote and reviewed. Prefer read-only references over runnable scripts unless execution is genuinely needed.

Adding references, scripts, examples, and templates

references/

Documents the agent reads during Execution — coding conventions, API specs, project-specific rules.

Keep each file focused on one topic.
Link from SKILL.md instructions: "Read references/playwright-conventions.md before writing tests."
Use plain Markdown or text. The agent reads these files as text; it does not execute them.

templates/

Skeleton files the agent fills in — test file structure, bug report format, test plan outline.

Use placeholder comments: /* INSERT TEST CASES HERE */.
Instruct the agent to copy the template before filling it: "Use templates/spec-template.ts as the starting point."
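
To make the placeholder idea concrete, here is a rough sketch of what "filling in" a template amounts to. The template text stands in for a hypothetical templates/spec-template.ts, and `fill_template` is an illustrative helper that assumes the placeholder sits on a line of its own; in practice the agent performs this step itself.

```python
PLACEHOLDER = "/* INSERT TEST CASES HERE */"

# Hypothetical contents of templates/spec-template.ts.
TEMPLATE = """import { test, expect } from '@playwright/test';

test.describe('FEATURE', () => {
  /* INSERT TEST CASES HERE */
});
"""


def fill_template(template: str, test_cases: list[str]) -> str:
    """Return a copy of the template with the placeholder line replaced."""
    out, filled = [], False
    for line in template.splitlines():
        if line.strip() == PLACEHOLDER:
            # Reuse the placeholder's indentation so inserted lines line up.
            indent = line[: len(line) - len(line.lstrip())]
            out.extend(indent + case for case in test_cases)
            filled = True
        else:
            out.append(line)
    if not filled:
        raise ValueError("template has no placeholder line")
    return "\n".join(out) + "\n"


filled = fill_template(TEMPLATE, ["test('logs in', async ({ page }) => {});"])
print(filled.splitlines()[3])  # prints "  test('logs in', async ({ page }) => {});"
```

Because the original template is never modified, the same skeleton can be reused for every generated file.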

examples/

Worked examples the agent consults for style and structure.

Real, working code from your codebase is more useful than invented examples.
Name files descriptively: login-happy-path.spec.ts, not example1.ts.

scripts/

Shell or Python scripts the agent can run as part of the workflow.

Pin dependencies and versions in the script header.
Never include secrets — read from environment variables.
Validate inputs inside the script; do not trust agent-generated values blindly.
Require user confirmation before writing to disk or making network calls.
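
The guidelines above might look like this inside a Python helper script. It is a sketch, not a prescribed structure: the function names are hypothetical, BASE_URL is borrowed from the anti-patterns in the sample skill, and the confirmation prompt is injected as a parameter so it can be replaced in tests.

```python
# Requires: Python 3.9+, standard library only (no third-party dependencies).
import os
import re
from pathlib import Path

SAFE_NAME = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def validate_feature_name(name: str) -> str:
    """Reject anything that is not a plain kebab-case name (blocks '../' tricks)."""
    if not SAFE_NAME.match(name):
        raise ValueError(f"unsafe feature name: {name!r}")
    return name


def read_base_url(env=os.environ) -> str:
    """Read configuration from the environment instead of hard-coding it."""
    base_url = env.get("BASE_URL", "")
    if not base_url.startswith(("http://", "https://")):
        raise ValueError("BASE_URL must be set to an http(s) URL")
    return base_url


def write_with_confirmation(path: Path, content: str, ask=input) -> bool:
    """Write the file only if the user answers 'y'; return whether it was written."""
    answer = ask(f"Write {path}? [y/N] ")
    if answer.strip().lower() != "y":
        return False
    path.write_text(content, encoding="utf-8")
    return True
```

Defaulting to "no" on any answer other than `y` means that an empty or malformed reply never results in a write.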

Step-by-step: build a QA skill from scratch

1. Identify the repeatable workflow
   Pick one QA task you perform regularly with an AI agent. Good candidates: Playwright test generation, API test authoring, bug report structuring, test case generation from acceptance criteria (ACs).
2. Create the directory
   Create a new directory under your skills root (e.g. .agents/skills/playwright-test-gen/). Use a lowercase, hyphenated name that matches the skill's purpose.
3. Write the frontmatter
   Fill in name, description, and optionally version. Spend most of your time on description, using the checklist from the section above.
4. Write the instruction body
   Use h2 headings for "When to use", "Inputs", "Instructions", "Output format", and "Anti-patterns". Keep instructions numbered and sequential, because the agent follows them in order.
5. Add supporting files (optional)
   Add references/, templates/, or examples/ as needed. Only add scripts/ if the workflow genuinely requires running code.
6. Commit to version control
   Commit the entire skill directory. Treat it like production code: PR review, changelog entries, and semantic versioning if your team shares skills.
7. Test it
   Open a new agent session and ask for the task in natural language. Observe: did the agent activate the skill? Did it follow the instructions? If not, refine the description or instructions and repeat.

Testing and iterating

Skills are software. They need testing. The main failure modes are:

Missed activation

Cause: Description too vague — agent doesn't recognise the task as a match.

Fix: Add more specific trigger phrases to the description.

False activation

Cause: Description too broad — agent activates the skill for unrelated tasks.

Fix: Strengthen the "Do NOT use for…" exclusions in the description.

Instruction drift

Cause: Agent follows some steps but skips or reorders others.

Fix: Use numbered steps; make each step atomic and unambiguous.

Output mismatch

Cause: Output format section is under-specified.

Fix: Add a worked example of the expected output (or reference one in examples/).

Stale references

Cause: Files in references/ describe APIs or patterns that have since changed.

Fix: Update references alongside the codebase; add a last-reviewed date to metadata.
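
The first two failure modes lend themselves to a small regression log. The sketch below assumes you record by hand, after each test session, whether the skill activated; `ActivationCheck`, `classify`, and the sample prompts are hypothetical and do not call any agent API.

```python
from dataclasses import dataclass


@dataclass
class ActivationCheck:
    prompt: str            # what you typed into a fresh agent session
    should_activate: bool  # whether the skill is meant to handle this prompt
    did_activate: bool     # what you observed (recorded by hand)


def classify(check: ActivationCheck) -> str:
    """Map one observation onto the activation failure modes named above."""
    if check.should_activate and not check.did_activate:
        return "missed activation"
    if check.did_activate and not check.should_activate:
        return "false activation"
    return "ok"


# Hypothetical observations for the api-test-generator skill.
checks = [
    ActivationCheck("Write API tests for /users", True, True),
    ActivationCheck("Test this endpoint: POST /orders", True, False),
    ActivationCheck("Write a Playwright browser test for login", False, True),
]
for check in checks:
    print(f"{classify(check):18} {check.prompt}")
```

Here the second prompt would be reported as a missed activation and the third as a false activation, pointing to the trigger phrases and the exclusions respectively. Re-running the same prompts after each description change shows whether a fix for one failure mode introduced the other.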

Keep it maintainable

One skill per workflow — resist the urge to combine unrelated tasks into one SKILL.md.
Bump the version field whenever the instructions change, and bump the major version when the change is breaking.
Add a last-reviewed date in metadata and schedule periodic reviews (quarterly is a good cadence for active projects).
Document why, not what — if a constraint in the instructions seems arbitrary, add a comment explaining the reasoning.
Remove skills that are no longer used — unused skills add noise at the Discovery stage.
Never put secrets, credentials, or tokens in SKILL.md or any file in the skill directory.
If you use skills from outside your organisation, review them and pin a version or commit hash before using them. Third-party skills can carry prompt-injection risks.
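
The periodic review can be enforced with a small check, for example in CI. This sketch assumes the `last-reviewed` value uses the `YYYY-MM` form from the sample frontmatter and treats three months as "quarterly"; the function names and the threshold are illustrative.

```python
from datetime import date


def months_since_review(last_reviewed: str, today: date) -> int:
    """Whole months between a 'YYYY-MM' review stamp and today's month."""
    year, month = (int(part) for part in last_reviewed.split("-"))
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month in {last_reviewed!r}")
    return (today.year - year) * 12 + (today.month - month)


def is_review_due(last_reviewed: str, today: date, max_months: int = 3) -> bool:
    """True when the skill has gone max_months or more without a review."""
    return months_since_review(last_reviewed, today) >= max_months


print(is_review_due("2025-11", date(2026, 1, 15)))  # prints False (2 months)
print(is_review_due("2025-11", date(2026, 2, 1)))   # prints True (3 months)
```

Passing `today` explicitly keeps the check deterministic, which makes it straightforward to test.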

