Why mobile bugs escape web-first QA teams
Web-first teams carry assumptions that quietly break on mobile — permissions, offline state, lifecycle, and updates.
Blog
Strong, dated takes from working QA engineers. Disagree freely; we change our minds in public.
The average response time is the metric most likely to make a slow system look fine. Here is what to watch instead.
Not yes or no, but which coding and for what. Reading code and light scripting help every tester; automation is where the roles are. Coding extends testing; it doesn't replace the judgement.
Automate the mechanical (axe/lint: alt, labels, contrast) and spot-check the obvious in a PR; route keyboard, focus, and screen-reader testing to QA on a real build.
Not a purity contest — emulators for functional/UI/CI, real devices for performance, sensors, network, and sign-off. Decide per test whether the check needs real hardware.
AI covers the expected cases fast and misses the suspicion-driven ones that catch bugs. Division of labour: let it handle breadth of the predictable; you handle the unexpected.
The Page Object Model (POM) was a Selenium-era solution to a Selenium-era problem. In modern Cypress and Playwright, custom commands and locator helpers cover 90% of what POM was supposed to give you.
Cucumber and Gherkin make sense when non-technical stakeholders write tests. They don't make sense when engineers write tests for engineers. Here's the pragmatic test: who actually reads your tests?
There's a take going around that data-testid 'couples tests to implementation.' It's exactly backwards — data-testid is the only selector explicitly decoupled from implementation.
Flaky tests don't cost you in CI minutes. They cost you in developer trust. And the compound interest on lost trust is the most expensive tax in engineering.
AI writes 80% of a test 80% of the way, and the remaining 20% is exactly the part that makes it a test. Where AI saves time, where it's a trap, and the distinction that separates the two.
A Lighthouse accessibility score of 100 doesn't mean your site is accessible. The score is a smoke alarm: useful, but not a test. Here's what it actually measures, and what you still need to check manually.
The Cohn test pyramid has been gospel since 2009. It was a useful heuristic for a 2009 monolithic Java app. It has been quoted unchanged ever since, and most modern stacks don't fit its shape.
The 40-page IEEE 829 test plan: written once at kickoff, opened twice during the project, abandoned after release. There's a single-page replacement that teams actually update.
Wiki pages about APIs go stale in three months. The API test suite gets opened every single day. Write tests that read like documentation — and stop writing the wiki.
The pitch: 'run load tests on every PR.' The reality: you'll have flaky thresholds in three days and disabled tests in two weeks. Here's the four-tier strategy that actually survives.
What automation replaced was regression checks — running the same path repeatedly. What it didn't replace, and can't replace, is human intuition trying to break a product.