Blog

#ai-testing.

7 articles tagged "ai-testing".


Tutorials · 13 June 2026 · 9 min read

How I evaluate an AI chatbot before release

A practical evaluation pass for AI chat features: hallucinations, refusals, prompt injection, and the cases with no single right answer.

ai-testing · llm · evaluation

Tutorials · 13 June 2026 · 9 min read

How to review AI-written Playwright tests

AI writes plausible Playwright tests that pass for the wrong reasons. Here is the review checklist that catches them.

ai-testing · playwright · review

Deep dives · 13 June 2026 · 9 min read

Prompt injection testing for QA engineers

LLMs can't reliably separate instructions from data, so user input can hijack the model. Covers direct and indirect injection, what to check for, and how to report findings in a QA-safe way.

ai-testing · security-testing · prompt-injection · llm

Tutorials · 13 June 2026 · 8 min read

What QA should log when testing AI features

A screenshot isn't a repro when outputs vary. Capture the full assembled prompt, retrieved context, model version, and parameters so an AI bug is actually reproducible.

ai-testing · observability · llm

Tutorials · 13 June 2026 · 9 min read

The hallucination test cases I run on AI features

Concrete test cases for AI hallucination — unanswerable questions, false premises, invented entities, citations — and how to judge answers with no 'correct' value.

ai-testing · llm · hallucination · test-cases

Tutorials · 13 June 2026 · 8 min read

How to use Claude Code for QA without breaking your repo

Get the speed of an AI agent on your test repo without the mess: work on a branch, review every change like a junior's PR, and make tests fail first to catch assert-nothing tests.

ai-testing · claude-code · ai-tools · automation

Opinions · 13 June 2026 · 8 min read

AI test case generation: where it helps and where it fails

AI covers the expected cases fast and misses the suspicion-driven ones that catch bugs. The division of labour: let it handle the breadth of predictable cases while you handle the unexpected.

ai-testing · test-cases · ai-tools · opinion