Walkthrough Part 2 — Accessibility and Compatibility Plans

This lesson continues the ShopRight plan walkthrough. Part 1 covered Performance and Security. Here the focus is on Accessibility, Compatibility, Localisation, and Reliability — followed by a worked Resources section that consolidates every tool, environment, and team dependency.

Accessibility testing plan

Regulatory obligation

For ShopRight, WCAG 2.1 Level AA is a requirement, not a recommendation: the UK Equality Act 2010 obliges service providers to make reasonable adjustments for disabled users, and WCAG 2.1 AA is the benchmark commonly used to demonstrate that. The accessibility plan must be designed to produce evidence of compliance, not just to fix the most obvious issues. Evidence matters: if a complaint is filed post-launch, ShopRight needs documentation showing systematic testing was conducted.

Automated scanning (Month 1 onwards)

axe-core is integrated into the Playwright end-to-end test suite from Month 1. Every automated E2E test run includes an axe scan of the pages it traverses. Any new critical or serious violation introduced by a pull request fails the build.

Automated scanning covers roughly 30–40% of WCAG 2.1 AA criteria (a commonly cited estimate), specifically the machine-detectable ones: missing alt text, insufficient colour contrast, form inputs without labels, a missing language attribute, and invalid ARIA attributes. The remaining 60–70%, including keyboard traps, focus order, and whether alt text is actually meaningful, requires manual testing.
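The build gate described above can be sketched as a small filter over scan results. This is a minimal sketch under assumptions: the `AxeViolation` shape is a simplified view modelled on the `violations` array that axe-core reports (a rule `id` plus an `impact`), the `page` field and the helper names `blockingViolations` and `shouldFailBuild` are hypothetical, and comparing against a stored baseline is only one possible way to interpret "new" violations.

```typescript
// Assumed, simplified shape of one entry in an axe-core `violations` array.
type Impact = 'minor' | 'moderate' | 'serious' | 'critical';

interface AxeViolation {
  id: string;            // rule id, e.g. "color-contrast"
  impact: Impact | null; // impact may be null when it is unknown
  page: string;          // hypothetical field: the page the scan ran on
}

// Only these impact levels block a merge.
const BLOCKING: ReadonlySet<Impact> = new Set<Impact>(['serious', 'critical']);

// Key used to decide whether a violation already existed in the baseline.
const keyOf = (v: AxeViolation): string => `${v.page}::${v.id}`;

// Violations that are serious or critical AND absent from the baseline.
function blockingViolations(
  current: AxeViolation[],
  baseline: AxeViolation[] = [],
): AxeViolation[] {
  const known = new Set(baseline.map(keyOf));
  return current.filter(
    v => v.impact !== null && BLOCKING.has(v.impact) && !known.has(keyOf(v)),
  );
}

// True when the pull request should fail the build.
function shouldFailBuild(
  current: AxeViolation[],
  baseline: AxeViolation[] = [],
): boolean {
  return blockingViolations(current, baseline).length > 0;
}
```

Keeping the gate as a pure function makes it easy to unit test separately from the browser run that produces the scan results.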

Manual WCAG audit (Month 2)

A structured manual audit is conducted against all user-facing screens in Month 2, following the WCAG 2.1 AA success criteria checklist. Priority screens:

Homepage and navigation
Product listing and search results
Product detail page
Basket and checkout (all steps)
Account creation and login
Order history and tracking
Mobile app equivalent screens (post-agency handover)

The audit uses the WebAIM WCAG 2.1 checklist as a reference. Each success criterion is marked Pass, Fail, or Not Applicable. Fails are filed as bugs with severity aligned to the WCAG impact: Level A failures are Critical; Level AA failures are High.
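The severity rule above is mechanical enough to encode, which keeps bug filing consistent between auditors. The following is an illustrative sketch only: the `AuditResult` and `BugDraft` shapes and the function names are hypothetical, not part of any audit tool.

```typescript
type WcagLevel = 'A' | 'AA';
type AuditOutcome = 'Pass' | 'Fail' | 'Not Applicable';
type Severity = 'Critical' | 'High';

// One row of the audit checklist for one screen (hypothetical shape).
interface AuditResult {
  criterion: string; // e.g. "1.1.1 Non-text Content"
  level: WcagLevel;
  screen: string;
  outcome: AuditOutcome;
}

interface BugDraft {
  title: string;
  severity: Severity;
}

// Level A failures are Critical; Level AA failures are High.
const severityFor = (level: WcagLevel): Severity =>
  level === 'A' ? 'Critical' : 'High';

// Turn every failed criterion into a bug draft with the aligned severity.
function bugsFromAudit(results: AuditResult[]): BugDraft[] {
  return results
    .filter(r => r.outcome === 'Fail')
    .map(r => ({
      title: `WCAG ${r.criterion} fails on ${r.screen}`,
      severity: severityFor(r.level),
    }));
}
```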

Screen reader testing (Month 2)

Two screen reader combinations are tested: NVDA with Firefox on Windows (a widely used free screen reader pairing) and VoiceOver with Safari on macOS and iOS.

The test protocol is journey-based, not criteria-based — a screen reader user must be able to complete the same three journeys used in performance testing (browse and search, checkout, account actions) without sighted assistance. Any journey that cannot be completed end-to-end is a Critical finding.

Acceptance criteria

Criterion	Target
axe-core Critical or Serious violations in CI	Zero (blocks merge from Month 1)
WCAG Level A violations at launch	Zero
WCAG Level AA violations at launch	Zero
Screen reader: checkout journey completable (NVDA)	Yes
Screen reader: checkout journey completable (VoiceOver iOS)	Yes
Colour contrast ratio (normal text)	≥ 4.5:1
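
The 4.5:1 target in the table comes from the WCAG contrast ratio formula, which compares the relative luminance of the foreground and background colours. As a worked illustration, the sketch below implements that formula for `#rrggbb` colours; the function names are hypothetical, and in practice axe-core performs this check during scans.

```typescript
// Parse "#rrggbb" into [r, g, b] channels in the 0-255 range.
function parseHex(hex: string): [number, number, number] {
  const m = /^#?([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!m) throw new Error(`Unsupported colour: ${hex}`);
  return [parseInt(m[1], 16), parseInt(m[2], 16), parseInt(m[3], 16)];
}

// Linearise one sRGB channel as defined by WCAG 2.x.
function linearise(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance: 0 for black, 1 for white.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearise(r) + 0.7152 * linearise(g) + 0.0722 * linearise(b);
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// Normal-size text passes Level AA at a ratio of at least 4.5:1.
function meetsAaNormalText(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

For example, `#767676` text on a white background passes at roughly 4.54:1, while the marginally lighter `#777777` fails at roughly 4.48:1, which is why contrast is checked by calculation rather than by eye.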

Compatibility testing plan

Browser and device matrix

The ShopRight user analytics are not available yet (the legacy platform did not capture detailed browser data). The plan defaults to StatCounter UK data as a proxy until real analytics are available post-launch.

Based on UK market data, the tier-1 browsers are Chrome (desktop and Android), Safari (macOS and iOS), and Edge. Firefox and Samsung Internet are tier-2.

Test execution strategy — three tiers of coverage

Smoke (Every PR)

Chrome latest (desktop)
Safari latest (macOS)
iOS Safari (iPhone 15 class)
3 critical paths only
Login → Browse → Checkout
Runs in ~5 minutes via BrowserStack

Regression (Every PR, Chromium)

Chrome latest (desktop)
Chrome on Android (Pixel 8 class)
Full Playwright test suite
All user journeys
All form interactions
Fast — one browser only

Full Matrix (Weekly)

Chrome, Edge, Firefox (desktop)
Safari macOS, Safari iOS
Android Chrome, Samsung Internet
Tablet: iPad, Galaxy Tab
Viewport drag: 320px → 1920px
All Playwright tests × all browsers

This tiered strategy prevents the matrix from becoming a bottleneck. Running the full browser matrix on every pull request would slow the pipeline unacceptably. Smoke tests on the three highest-priority targets (Chrome desktop, Safari on macOS, and iOS Safari) give fast feedback, while the weekly full matrix catches regressions before they accumulate.
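
One way to keep the three tiers from drifting apart is to hold them as data and derive the browser targets for each pipeline trigger from that single source. The sketch below is a hypothetical model of the tiers listed above; the type names, target labels, and helper functions are assumptions, and mapping each label onto a real Playwright project or BrowserStack capability is left to the project's own configuration.

```typescript
type Trigger = 'pull_request' | 'weekly';

interface Tier {
  name: string;
  trigger: Trigger;
  targets: string[]; // human-readable labels, not real capability names
  scope: string;
}

const TIERS: Tier[] = [
  {
    name: 'Smoke',
    trigger: 'pull_request',
    targets: ['Chrome desktop', 'Safari macOS', 'Safari iOS'],
    scope: 'critical paths only',
  },
  {
    name: 'Regression',
    trigger: 'pull_request',
    targets: ['Chrome desktop', 'Chrome Android'],
    scope: 'full Playwright suite',
  },
  {
    name: 'Full Matrix',
    trigger: 'weekly',
    targets: [
      'Chrome desktop', 'Edge desktop', 'Firefox desktop',
      'Safari macOS', 'Safari iOS',
      'Chrome Android', 'Samsung Internet',
      'iPad', 'Galaxy Tab',
    ],
    scope: 'full Playwright suite',
  },
];

// Tiers that run for a given pipeline trigger.
function tiersFor(trigger: Trigger): Tier[] {
  return TIERS.filter(t => t.trigger === trigger);
}

// De-duplicated list of browser targets needed for a trigger.
function targetsFor(trigger: Trigger): string[] {
  const seen = new Set<string>();
  for (const tier of tiersFor(trigger)) {
    for (const target of tier.targets) seen.add(target);
  }
  return Array.from(seen);
}
```

With this model a pull request needs four distinct targets, while the weekly run needs nine, which makes the cost difference between the tiers explicit.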

Responsive testing protocol

Responsive testing is incorporated into the Playwright suite using per-test viewport configuration. Three viewport configurations are tested in the full regression suite:

Mobile: 375×812 (iPhone 13 mini equivalent)
Tablet: 768×1024 (iPad portrait)
Desktop: 1280×800

In addition to the automated viewport tests, manual responsive testing is conducted during Month 2 using Chrome DevTools: dragging the viewport from 1400px to 320px slowly on each priority screen to catch layout breaks at transition points.
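
The manual drag can be approximated in automation by stepping through widths and checking each one for horizontal overflow. The sketch below covers only the pure logic and is an assumption-laden illustration: `sweepWidths` and `hasHorizontalOverflow` are hypothetical helpers, and the 40px step is arbitrary. In a Playwright test, each width would typically be applied with `page.setViewportSize` and the content width read from `document.documentElement.scrollWidth`.

```typescript
interface Viewport {
  width: number;
  height: number;
}

// The three automated configurations from the regression suite.
const VIEWPORTS: Record<string, Viewport> = {
  mobile: { width: 375, height: 812 },
  tablet: { width: 768, height: 1024 },
  desktop: { width: 1280, height: 800 },
};

// Widths for a sweep from `from` down to `to` (both inclusive).
function sweepWidths(from: number, to: number, step: number): number[] {
  if (step <= 0 || from < to) {
    throw new Error('expected from >= to and step > 0');
  }
  const widths: number[] = [];
  for (let w = from; w > to; w -= step) widths.push(w);
  widths.push(to); // always finish on the narrowest width
  return widths;
}

// A page overflows horizontally when its content is wider than the viewport.
function hasHorizontalOverflow(scrollWidth: number, viewportWidth: number): boolean {
  return scrollWidth > viewportWidth;
}

// Widths to visit when reproducing the 1400px to 320px drag in 40px steps.
const dragWidths = sweepWidths(1400, 320, 40);
```

A stepped sweep will not catch every transition point, so it complements the manual drag rather than replacing it.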

Acceptance criteria

Criterion	Target
Critical paths pass on Chrome latest	Yes
Critical paths pass on Safari latest (macOS)	Yes
Critical paths pass on iOS Safari	Yes
No horizontal scroll at 375px viewport	Yes
Touch targets meet 44×44px minimum	Yes
No content hidden behind iOS safe area insets	Yes

Localisation testing plan

Year 1 scope: i18n foundation only

ShopRight launches in English only. No translations exist in Year 1. The Year 1 localisation plan is entirely about i18n foundation — building the internationalisation infrastructure so that Year 2 translations can be added without touching the React components.

This is the right approach. Retrofitting i18n into a codebase that shipped hardcoded strings requires changing every string in the application. Starting from day one with externalised strings costs one engineer a few weeks. Doing it later costs the entire team several months.

i18n foundation requirements (verified in testing):

All user-visible strings fetched from translation files via t() calls — no hardcoded English in component JSX
Dates formatted via Intl.DateTimeFormat with locale parameter — never new Date().toLocaleDateString() without locale
Numbers and currencies formatted via Intl.NumberFormat — never manual decimal/comma insertion
Currency display passes ISO 4217 codes (for example GBP) to the formatting API rather than hardcoding the £ symbol
RTL layout support is not required for Year 1 (none of the Year 2 locales are RTL), but CSS logical properties should be used in new components from Month 1
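
The date and currency requirements above can be illustrated with two thin wrappers over the built-in `Intl` APIs. This is a minimal sketch: `formatDate` and `formatMoney` are hypothetical helper names, and the time zone is pinned to UTC only to make the example deterministic; a real implementation would choose the time zone appropriate to the user or the order.

```typescript
// Format a date for display; the locale is always passed explicitly.
function formatDate(date: Date, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    day: '2-digit',
    month: '2-digit',
    year: 'numeric',
    timeZone: 'UTC', // pinned for a reproducible example
  }).format(date);
}

// Format an amount using an ISO 4217 currency code, never a hardcoded symbol.
function formatMoney(amount: number, currency: string, locale: string): string {
  return new Intl.NumberFormat(locale, { style: 'currency', currency }).format(amount);
}

const dispatched = new Date(Date.UTC(2025, 2, 4)); // 4 March 2025

const ukDate = formatDate(dispatched, 'en-GB');      // "04/03/2025"
const deDate = formatDate(dispatched, 'de-DE');      // "04.03.2025"
const ukPrice = formatMoney(1234.5, 'GBP', 'en-GB'); // "£1,234.50"
const dePrice = formatMoney(1234.5, 'EUR', 'de-DE'); // "1.234,50 €" (non-breaking space before the symbol)
```

The same call produces correct separators, symbol placement, and date order for each locale, which is why manual string assembly is ruled out.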

Pseudo-localisation sweep (Month 2)

A pseudo-localisation pass is run in Month 2: the English translation file is processed by a script that replaces ASCII characters with accented variants and pads strings by 40%. The result is loaded into the staging environment and every screen is reviewed for:

Strings still appearing in unmodified English (hardcoded, not using t())
Buttons, labels, or navigation items that clip or overflow at expanded string length
Translation keys appearing as raw strings (user.profile.title instead of a displayable string)
Encoding errors: accented characters not rendering correctly

This test does not require any French, German, or Spanish content. It catches i18n infrastructure bugs now, before translations are commissioned.
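
The pseudo-localisation script described above can be quite small. The following is a sketch under stated assumptions: it accents vowels only, pads with `~` characters to add at least 40% length, wraps each string in brackets so clipping is easy to spot, and leaves `{{placeholder}}` tokens untouched. The placeholder syntax and the function names are assumptions; the real translation library may use a different interpolation format.

```typescript
// Accented stand-ins for ASCII vowels (kept deliberately small).
const ACCENTS: Record<string, string> = {
  a: 'á', e: 'é', i: 'í', o: 'ó', u: 'ú',
  A: 'Á', E: 'É', I: 'Í', O: 'Ó', U: 'Ú',
};

// Pseudo-localise one string: accent it, pad it, and bracket it.
function pseudoLocalise(source: string, expansion: number = 0.4): string {
  // Split on {{placeholders}}; the capturing group keeps them at odd indexes.
  const parts = source.split(/(\{\{[^}]*\}\})/);
  const accented = parts
    .map((part, index) =>
      index % 2 === 1
        ? part // placeholder: leave untouched so interpolation still works
        : Array.from(part, ch => ACCENTS[ch] ?? ch).join(''),
    )
    .join('');
  const padLength = Math.ceil(source.length * expansion);
  return `[${accented}${'~'.repeat(padLength)}]`;
}

// Apply the transformation to a flat key-to-string translation catalogue.
function pseudoLocaliseCatalogue(
  catalogue: Record<string, string>,
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of Object.keys(catalogue)) {
    result[key] = pseudoLocalise(catalogue[key]);
  }
  return result;
}

const sample = pseudoLocalise('Add to basket'); // "[Ádd tó báskét~~~~~~]"
```

Any on-screen string that appears without brackets and accents was not routed through the translation layer, and any string missing its closing bracket has been clipped.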

Year 2 preview: what testing will look like

When fr-FR, de-DE, and es-ES are added in Year 2, the localisation test plan expands to include:

Native speaker review of translations before any locale ships (machine translation is not acceptable for release)
Date and currency format verification for each locale (de-DE inverts decimal and thousands separators)
German string expansion: German strings run 30–40% longer than English; any layout that is tight in English will overflow in German
Email template localisation: transactional emails (order confirmation, shipping notification, password reset) must be localised alongside the UI — these are commonly missed

Acceptance criteria (Year 1)

Criterion	Target
Hardcoded English strings in production build	Zero
Pseudo-localisation: strings failing to render via t()	Zero
Pseudo-localisation: elements clipping at 140% length	Zero
Date formatting using Intl.DateTimeFormat	All date displays
Currency formatting using Intl.NumberFormat	All currency displays

Reliability testing plan

Failover and the RTO/RPO targets

ShopRight has defined an RTO of 30 minutes and an RPO of 15 minutes. These targets must be validated by actual tests, not assumed from the AWS Multi-AZ architecture documentation.

Multi-AZ is configured and theoretically failover should be automatic. The word "theoretically" is the reason for testing.

Failover test 1 (RDS primary instance failure): Force a failover of the primary RDS instance using AWS Fault Injection Service (FIS, formerly Fault Injection Simulator), for example with a reboot-with-failover action. Measure the time from fault injection to the standby promotion completing and the application accepting writes. Acceptance criterion: the application recovers and write traffic is processed within 30 minutes.

Failover test 2 (availability zone failure simulation): Using AWS FIS, simulate an AZ becoming unavailable. Observe that traffic routes to instances in the remaining two AZs, that health checks detect the failure, and that user-facing error rates during the failover event do not exceed 1% for longer than 5 minutes.

Failover test 3 (application server failure): Stop two-thirds of the running ECS tasks simultaneously. Observe that the remaining tasks continue handling traffic, that ECS replaces the stopped tasks, and that capacity is restored within 3 minutes.
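
Both timing criteria above (writes resuming within 30 minutes, and error rates above 1% for no longer than 5 minutes) can be evaluated from data recorded during the experiment. The sketch below assumes a hypothetical probe that attempts a write at regular intervals and a hypothetical series of error-rate samples; the shapes and function names are illustrative, and both inputs are assumed to be sorted by time.

```typescript
// One write probe; `at` is seconds since the fault was injected.
interface Probe {
  at: number;
  ok: boolean;
}

// One error-rate reading; `errorRate` is a fraction between 0 and 1.
interface RateSample {
  at: number;
  errorRate: number;
}

// Seconds from fault injection to the first success after the last failure.
// Returns 0 if no probe failed and null if the service never recovered.
function secondsToRecover(probes: Probe[]): number | null {
  let lastFailure = -1;
  probes.forEach((probe, index) => {
    if (!probe.ok) lastFailure = index;
  });
  if (lastFailure === -1) return 0;
  const next = probes[lastFailure + 1];
  return next ? next.at : null;
}

// Longest continuous run, in seconds, with the error rate above the threshold.
function longestBreachSeconds(samples: RateSample[], threshold: number = 0.01): number {
  let longest = 0;
  let breachStart: number | null = null;
  for (const sample of samples) {
    if (sample.errorRate > threshold) {
      if (breachStart === null) breachStart = sample.at;
      longest = Math.max(longest, sample.at - breachStart);
    } else {
      breachStart = null;
    }
  }
  return longest;
}

const RTO_SECONDS = 30 * 60;       // 30-minute recovery target
const MAX_BREACH_SECONDS = 5 * 60; // error rate may exceed 1% for at most 5 minutes
```

Measuring to the first success after the last failure, rather than to the first success overall, avoids declaring recovery while the service is still flapping.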

Backup restoration test

The RPO of 15 minutes requires point-in-time recovery (PITR) to be enabled on the RDS instance and validated. In Month 2, a backup restoration test is run:

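Take note of a specific data state (a known order ID, a known user record) and the time it was written
Choose a restore time shortly after that write and no more than 15 minutes before the simulated failure
Restore to that point in time on a separate temporary RDS instance
Verify the known data state is present and consistent in the restored instance
Verify that the gap between the restore point and the simulated failure is within the RPO (15 minutes), and record the restoration duration against the RTO (30 minutes)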

This is the most commonly skipped reliability test and the most important. A backup that has never been restored is unverified.
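
When recording the drill, it helps to keep the two targets separate: RPO bounds how much recent data can be lost, while RTO bounds how long the restoration takes. The sketch below evaluates both from four timestamps. The `RestoreDrill` shape and `evaluateDrill` are hypothetical; on RDS, the newest restorable point is reported per instance as the latest restorable time.

```typescript
const RPO_MINUTES = 15;
const RTO_MINUTES = 30;

// Timestamps captured during one restoration drill (hypothetical shape).
interface RestoreDrill {
  failureAt: Date;          // simulated moment of failure
  latestRestorableAt: Date; // newest point in time the restore could reach
  restoreStartedAt: Date;
  restoreVerifiedAt: Date;  // known records confirmed in the restored instance
}

interface DrillOutcome {
  dataLossMinutes: number;
  restoreMinutes: number;
  meetsRpo: boolean;
  meetsRto: boolean;
}

const minutesBetween = (earlier: Date, later: Date): number =>
  (later.getTime() - earlier.getTime()) / 60000;

// Compare the data-loss window with the RPO and the restore duration with the RTO.
function evaluateDrill(drill: RestoreDrill): DrillOutcome {
  const dataLossMinutes = minutesBetween(drill.latestRestorableAt, drill.failureAt);
  const restoreMinutes = minutesBetween(drill.restoreStartedAt, drill.restoreVerifiedAt);
  return {
    dataLossMinutes,
    restoreMinutes,
    meetsRpo: dataLossMinutes <= RPO_MINUTES,
    meetsRto: restoreMinutes <= RTO_MINUTES,
  };
}
```

A drill can meet one target and miss the other, so both results belong in the test report.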

Soak test connection to reliability

The 8-hour soak test defined in the performance plan also serves as a reliability test. Metrics monitored during the soak specifically for reliability signals: RDS connection pool utilisation, Lambda/ECS memory growth over time, CloudWatch error rate trends, and queue depth if SQS is used for order processing. Any upward trend in these metrics that does not plateau within 2 hours is a reliability finding.
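
"Upward trend that does not plateau within 2 hours" can be made measurable by fitting a line to the samples taken after the 2-hour mark and checking its slope. The sketch below uses an ordinary least-squares slope. The sample shape, the function names, and the tolerance of 1 unit per hour are assumptions to be tuned per metric.

```typescript
// One metric reading; `hours` is time since the soak test started.
interface Sample {
  hours: number;
  value: number; // e.g. memory in MB or connections in use
}

// Ordinary least-squares slope of value over time, in units per hour.
function slopePerHour(samples: Sample[]): number {
  const n = samples.length;
  if (n < 2) return 0;
  let sumX = 0;
  let sumY = 0;
  for (const s of samples) {
    sumX += s.hours;
    sumY += s.value;
  }
  const meanX = sumX / n;
  const meanY = sumY / n;
  let numerator = 0;
  let denominator = 0;
  for (const s of samples) {
    numerator += (s.hours - meanX) * (s.value - meanY);
    denominator += (s.hours - meanX) * (s.hours - meanX);
  }
  return denominator === 0 ? 0 : numerator / denominator;
}

// Flag a metric that is still climbing after the plateau window.
function isReliabilityFinding(
  samples: Sample[],
  plateauAfterHours: number = 2,
  tolerancePerHour: number = 1, // assumed tolerance; tune per metric
): boolean {
  const settled = samples.filter(s => s.hours >= plateauAfterHours);
  return slopePerHour(settled) > tolerancePerHour;
}
```

A metric that rises during warm-up and then flattens passes this check, while a steady leak of even a few megabytes per hour is flagged.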

Acceptance criteria

Criterion	Target
RDS failover: writes resume within	30 minutes
AZ failure: error rate spike duration	< 5 minutes above 1%
Backup restoration test: data integrity verified	Yes
Backup restoration test: data loss within RPO (15 min)	Yes
8-hour soak: memory growth in API service	< 200 MB
8-hour soak: error rate trend	Flat (no upward trend)

Resources summary

A plan without a resources section is incomplete. The table below collects every tool and environment dependency across all eight plan areas:

Resource	Purpose	When needed	Owner
k6 Cloud subscription	Load and spike tests	Month 2–3	DevOps engineer
BrowserStack Automate	Cross-browser/device testing	Month 1 onwards	QA lead
OWASP ZAP	DAST scanning	Month 2	QA lead
SonarQube	SAST in CI	Month 1	DevOps engineer
Dependabot	Dependency CVE scanning	Month 1	DevOps engineer
External pen test firm	Professional penetration test	Month 2 week 3	QA lead (coordinate)
AWS FIS	Fault injection for failover tests	Month 2	DevOps engineer
NVDA (Windows VM)	Screen reader testing	Month 2	QA lead
Staging environment (production-equivalent)	All load, security, and failover tests	Available Month 2	DevOps engineer
Seeded test database (100K products)	Realistic load test data	Month 2 before load tests	DevOps engineer

The two highest-risk dependencies: the production-equivalent staging environment (without it, Month 2 tests cannot run as designed) and the seeded database (without realistic data volume, load test results are misleading). Both dependencies have the DevOps engineer as owner — verify availability before Month 2 begins, not after.
