startsWith, contains, and equals cover most String comparisons. When the rule is about shape (any three digits, anything that looks like an email, every UUID in this log), you reach for a regular expression. Java's regex API lives in java.util.regex: Pattern (the compiled rule) and Matcher (the engine that walks it across an input). The core regex syntax is largely the same as in JavaScript and Python, with one Java-specific wrinkle: every backslash in the pattern has to be doubled in the source code.
Two ways to use regex
Two entry points, two use cases:
// 1) String.matches — does the WHOLE string match this pattern?
boolean ok = "alice@test.com".matches(".*@.*\\.com"); // true
// 2) Pattern + Matcher — find one or many matches inside a larger string
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Pattern p = Pattern.compile("\\d{3}");
Matcher m = p.matcher("Status: 200 OK");
if (m.find()) {
    System.out.println(m.group()); // "200"
}
String.matches(...) is the shortcut for "does the entire input fit?" It returns true only if the whole string matches; there's an implicit ^...$ around your pattern. Useful for validation: "is this email-shaped?", "is this a UUID?".
Pattern.compile(...).matcher(input) is the workhorse: it gives you a Matcher that you drive with find(), group(), and start()/end(). Use this when you need to extract substrings or scan for every occurrence.
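To make the difference between the two entry points concrete, here is a minimal sketch. The class name MatchesVsFind and both helper methods are hypothetical, chosen only for illustration; the calls themselves (String.matches, Pattern.compile, Matcher.find) are the ones described above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsFind {
    // Hypothetical helper: true only if the ENTIRE input is exactly three digits.
    static boolean isStatusCode(String input) {
        return input.matches("\\d{3}");
    }

    // Hypothetical helper: true if three consecutive digits appear anywhere in the input.
    static boolean containsStatusCode(String input) {
        Matcher m = Pattern.compile("\\d{3}").matcher(input);
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(isStatusCode("200"));                  // true
        System.out.println(isStatusCode("Status: 200 OK"));       // false
        System.out.println(containsStatusCode("Status: 200 OK")); // true
    }
}
```

Same pattern, different question: the first helper validates a whole value, the second searches inside a larger string.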
Double backslashes — the Java tax
In a raw regex, \d means "any digit." In a Java string literal, \ is itself an escape character, so to write a literal backslash you need \\. To get a regex \d inside a Java string, you write "\\d".
The cheat sheet:
| Regex (what the engine sees) | Java string literal |
|---|---|
| \d | "\\d" |
| \s | "\\s" |
| \. | "\\." |
| \\ | "\\\\" |
| [0-9]+ | "[0-9]+" (no backslashes needed) |
Every time you see \\ in a Java pattern, the engine sees a single \. This catches absolutely every newcomer once. Your IDE may underline literals with valid regex or warn you about a malformed one — pay attention to those squiggles.
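A small sketch can make the doubling visible. The class and helper names below are hypothetical. Pattern.quote is a standard method that was not mentioned above; it is shown here as one option for matching fixed text without hand-escaping it.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // Hypothetical helper: does the text contain a literal backslash?
    // Four backslashes in source become two in the string, which the engine reads as one literal backslash.
    static boolean containsBackslash(String text) {
        return Pattern.compile("\\\\").matcher(text).find();
    }

    // Hypothetical helper: whole-string comparison where the second argument is
    // treated as plain text, so '.' and other metacharacters lose their regex meaning.
    static boolean equalsLiteral(String input, String literal) {
        return input.matches(Pattern.quote(literal));
    }

    public static void main(String[] args) {
        String digit = "\\d";                                   // two backslashes in source
        System.out.println(digit);                              // prints \d (what the engine receives)
        System.out.println(digit.length());                     // 2: one backslash plus 'd'

        System.out.println(containsBackslash("C:\\temp"));      // true (the string is C:\temp)
        System.out.println(equalsLiteral("v1.2", "v1.2"));      // true
        System.out.println(equalsLiteral("v1x2", "v1.2"));      // false
        System.out.println("v1x2".matches("v1.2"));             // true: an unescaped dot matches any character
    }
}
```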
Common patterns for QA work
Patterns you'll meet day-to-day:
String email = "\\w+@\\w+\\.\\w+"; // alice@test.com
String status = "\\d{3}"; // 200, 404, 500
String uuid = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
String isoDate = "\\d{4}-\\d{2}-\\d{2}"; // 2026-05-06
String url = "https?://[\\w.-]+(/[\\w.-]*)*"; // http or https URL
A bigger character set:
- \d digit, \D non-digit, \s whitespace, \S non-whitespace, \w word char ([A-Za-z0-9_]), \W non-word
- . any character (use \\. for a literal dot)
- * zero or more, + one or more, ? zero or one, {n} exactly n, {n,m} n to m
- ^ start, $ end (matters with Matcher.find(); String.matches already anchors)
- [abc] any of a, b, c; [^abc] any except a, b, c
- (group) capture group, (?:group) non-capturing
We're not going deep into regex syntax — qa.codes has a dedicated Regex for Testers cheat sheet and the Regex Tester utility for live experimentation. The lesson is about how to use regex from Java; the patterns themselves transfer between languages.
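As a rough illustration of using two of these patterns for validation from Java, here is a sketch. The class ShapeChecks and its helper names are hypothetical. Note that these patterns check shape only, which the example output makes explicit.

```java
import java.util.regex.Pattern;

public class ShapeChecks {
    // Same UUID and ISO date patterns as listed above, compiled once.
    private static final Pattern UUID_SHAPE = Pattern.compile(
            "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");
    private static final Pattern ISO_DATE_SHAPE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Matcher.matches() requires the whole input to match, like String.matches.
    static boolean looksLikeUuid(String s) {
        return UUID_SHAPE.matcher(s).matches();
    }

    static boolean looksLikeIsoDate(String s) {
        return ISO_DATE_SHAPE.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeUuid("123e4567-e89b-12d3-a456-426614174000")); // true
        System.out.println(looksLikeUuid("123E4567-E89B-12D3-A456-426614174000")); // false: pattern is lowercase only
        System.out.println(looksLikeIsoDate("2026-05-06"));                        // true
        System.out.println(looksLikeIsoDate("2026-99-99"));                        // true: right shape, not a real date
        System.out.println(looksLikeIsoDate("06/05/2026"));                        // false
    }
}
```

If a test needs a real calendar check, a date parser is a better fit than a regex; the pattern is only a first filter.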
A real example — extracting a status code
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class ExtractStatus {
    public static void main(String[] args) {
        String response = "HTTP/1.1 502 Bad Gateway";
        Matcher m = Pattern.compile("HTTP/\\d\\.\\d (\\d{3}) (.+)").matcher(response);
        if (m.find()) {
            String code = m.group(1);
            String reason = m.group(2);
            System.out.println("code = " + code);
            System.out.println("reason = " + reason);
        }
    }
}
Output:
code = 502
reason = Bad Gateway
The pattern has two capture groups in parentheses: (\\d{3}) and (.+). After a successful find(), m.group(0) (or just m.group()) is the whole match; m.group(1) is the first capture; m.group(2) is the second. Capture groups are how you pull parts of a match out of the input.
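Numbered groups get hard to track as a pattern grows. Java also supports named capture groups, written (?<name>...) and read with m.group("name"). The sketch below is a variation on the status-line example; the class name, the describe helper, and the group names code and reason are all illustrative choices, not part of the lesson's original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NamedGroups {
    // Same status-line pattern, with the two captures given names.
    private static final Pattern STATUS_LINE =
            Pattern.compile("HTTP/\\d\\.\\d (?<code>\\d{3}) (?<reason>.+)");

    // Hypothetical helper: summarise a status line, or report that it did not match.
    static String describe(String response) {
        Matcher m = STATUS_LINE.matcher(response);
        if (!m.find()) {
            return "no match";
        }
        return m.group("code") + " -> " + m.group("reason");
    }

    public static void main(String[] args) {
        System.out.println(describe("HTTP/1.1 404 Not Found")); // 404 -> Not Found
        System.out.println(describe("garbage"));                // no match
    }
}
```

Named groups still have numbers, so m.group(1) keeps working; the name is an additional, more readable handle.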
Finding all occurrences
find() returns one match at a time. Call it in a loop to walk every occurrence:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class CountNumbers {
    public static void main(String[] args) {
        String summary = "Total: 32, Passed: 28, Failed: 4";
        Matcher m = Pattern.compile("\\d+").matcher(summary);
        int count = 0;
        while (m.find()) {
            System.out.println("found " + m.group() + " at index " + m.start());
            count++;
        }
        System.out.println("total numbers: " + count);
    }
}
Output:
found 32 at index 7
found 28 at index 19
found 4 at index 31
total numbers: 3
m.start() returns the index where the current match begins; m.end() is one past the end. Useful when you need to know where in the original string a match was, not just what it matched.
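One way to put start() and end() to work is to rebuild the input with each match marked. The sketch below wraps every number in square brackets; the class name and the highlightNumbers helper are hypothetical, and the bracket style is an arbitrary choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HighlightMatches {
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Hypothetical helper: copy the input, wrapping each run of digits in [ ].
    static String highlightNumbers(String input) {
        Matcher m = NUMBER.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;                              // end of the previous match
        while (m.find()) {
            out.append(input, last, m.start());    // text between matches, unchanged
            out.append('[').append(m.group()).append(']');
            last = m.end();                        // one past the end of this match
        }
        out.append(input, last, input.length());   // trailing text after the final match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlightNumbers("Total: 32, Passed: 28, Failed: 4"));
        // Total: [32], Passed: [28], Failed: [4]
    }
}
```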
replaceAll with regex — masking and rewriting
String.replaceAll(regex, replacement) rewrites every match. The replacement can reference capture groups with $1, $2, etc.:
public class MaskNumbers {
    public static void main(String[] args) {
        String log = "User 42 logged in from IP 10.0.0.7 with id 12345";
        // Mask 4+ digit numbers
        String masked = log.replaceAll("\\d{4,}", "****");
        System.out.println(masked);
        // Reformat key=value pairs into key: value
        String pairs = "env=staging timeout=10 retries=3";
        String niceFormat = pairs.replaceAll("(\\w+)=(\\w+)", "$1: $2");
        System.out.println(niceFormat);
    }
}
Output:
User 42 logged in from IP 10.0.0.7 with id ****
env: staging timeout: 10 retries: 3
Two distinct uses: pure replacement ($1, $2 not used) and capture-and-rewrite ($1, $2 reorganise the match). For QA reporting, masking PII (account numbers, tokens, emails) before logs leave the test agent is a typical use.
String.replaceFirst(...) is the same but stops after the first match; String.replace(literal, literal) does a non-regex replacement (cheaper and safer when the search is fixed text).
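Two further cases come up often in masking work, sketched here under assumptions: keeping part of a match via a group reference, and inserting replacement text that itself contains a $ sign. The class and helper names are hypothetical. Matcher.quoteReplacement is a standard method that marks the replacement as literal text, so $ is not read as a group reference.

```java
import java.util.regex.Matcher;

public class MaskingExamples {
    // Hypothetical helper: hide the local part of each email, keep the domain via $1.
    // Uses the simple email shape from this lesson, so it only covers word-character addresses.
    static String maskEmails(String text) {
        return text.replaceAll("\\w+@(\\w+\\.\\w+)", "***@$1");
    }

    // Hypothetical helper: replace "<digits> USD" with caller-supplied text.
    // quoteReplacement keeps '$' and '\' in newPrice from being interpreted.
    static String replacePrice(String text, String newPrice) {
        return text.replaceAll("\\d+ USD", Matcher.quoteReplacement(newPrice));
    }

    public static void main(String[] args) {
        System.out.println(maskEmails("Contact alice@test.com or bob@example.org"));
        // Contact ***@test.com or ***@example.org
        System.out.println(replacePrice("Total: 15 USD", "$20"));
        // Total: $20
    }
}
```

Without quoteReplacement, a replacement such as "$20" would be treated as a group reference rather than literal text.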
Compile once, reuse many
Pattern.compile(...) is not free — the regex engine builds a state machine. If a pattern is used in a loop, compile it once outside the loop and reuse the Pattern:
private static final Pattern STATUS_CODE = Pattern.compile("\\d{3}");
public static String extractCode(String line) {
    Matcher m = STATUS_CODE.matcher(line);
    return m.find() ? m.group() : null;
}
String.matches, String.replaceAll, etc. compile the pattern internally on every call: fine for one-off uses, wasteful in a tight loop. The static final field idiom is the standard way to compile-once-use-many.
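A precompiled Pattern also plugs into stream pipelines. The sketch below assumes Java 8 or later and uses Pattern.asPredicate(), which tests each element with find() semantics. The class name, the SERVER_ERROR pattern, and the sample log lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ErrorFilter {
    // Compiled once, reused for every line that passes through the filter.
    private static final Pattern SERVER_ERROR = Pattern.compile("Status 5\\d{2}");

    // Hypothetical helper: keep only lines that mention a 5xx status.
    static List<String> serverErrors(List<String> lines) {
        return lines.stream()
                .filter(SERVER_ERROR.asPredicate()) // true when find() succeeds on the line
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "INFO Login OK",
                "ERROR Status 502 from /checkout",
                "ERROR Status 404 from /missing");
        System.out.println(serverErrors(lines)); // [ERROR Status 502 from /checkout]
    }
}
```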
How regex matching flows
The flow has four steps: compile the rule, attach it to the input, repeatedly call find() to advance, pull out matches with group(). That four-step rhythm covers nearly every regex use case in QA tooling.
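The four steps can be sketched in one small method. This example collects every UUID-shaped substring from a log line, reusing the UUID pattern from earlier in the lesson; the class name, the findUuids helper, and the sample log text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UuidScan {
    // Step 1: compile the rule (once, as a constant).
    private static final Pattern UUID_SHAPE = Pattern.compile(
            "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    static List<String> findUuids(String log) {
        List<String> found = new ArrayList<>();
        Matcher m = UUID_SHAPE.matcher(log); // Step 2: attach the rule to the input
        while (m.find()) {                   // Step 3: advance to the next match
            found.add(m.group());            // Step 4: pull out the matched text
        }
        return found;
    }

    public static void main(String[] args) {
        String log = "req 123e4567-e89b-12d3-a456-426614174000 ok; "
                + "req 9b2d1f3a-0c4e-4f5a-8b6c-7d8e9f0a1b2c failed";
        System.out.println(findUuids(log).size()); // 2
    }
}
```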
Tip: qa.codes/utilities/regex-tester is a sandbox for prototyping a pattern against real input before pasting it into your Java code. Most regex bugs come from authoring the pattern in your head; testing it interactively first saves a lot of recompiles.
⚠️ Common mistakes
- Single backslash in a Java pattern. Pattern.compile("\d{3}") is a compile error (\d is an unknown escape). The fix is "\\d{3}". Watch for \., \s, \b and the literal backslash: every regex backslash needs to be doubled in the source. Some single-backslash forms compile anyway but mean something else: "\b" is a backspace character, not a word boundary.
- Confusing String.matches and Matcher.find. matches requires the entire input to match; find looks for any match anywhere. "abc 200 def".matches("\\d{3}") returns false (the whole string isn't three digits); "abc 200 def".matches(".*\\d{3}.*") returns true. Pick the right one for what you're asking.
- Recompiling the same pattern in a tight loop. for (...) { Pattern.compile("...").matcher(s).find(); } rebuilds the state machine every iteration. Hoist the Pattern.compile(...) out, usually as a static final field.
🎯 Practice task
Extract structured data from log lines. 25-30 minutes.
- Create LogScanner.java. Add import java.util.regex.Matcher; and import java.util.regex.Pattern;.
- Define a sample input array of log lines. Example:
  String[] lines = {
      "2026-05-06 09:00:01 INFO Login OK in 1450ms",
      "2026-05-06 09:00:03 ERROR Status 502 from /checkout",
      "2026-05-06 09:00:05 INFO Search OK in 820ms",
      "2026-05-06 09:00:07 ERROR Status 500 from /export"
  };
- Build a static final Pattern STATUS = Pattern.compile("Status (\\d{3}) from (/\\w+)");. Loop over the lines and use STATUS.matcher(line).find() to find any line that matches. Print code and path from groups 1 and 2.
- Use line.matches("\\d{4}-\\d{2}-\\d{2}.*") (or just check startsWith if you prefer) to confirm every line begins with an ISO date.
- Use String.replaceAll("\\d{1,5}ms", "***ms") to mask all duration values in the lines. Print before and after.
- Use a single regex with two capture groups (one for the key, one for the value) to parse key/value input like env=staging timeout=10 retries=3. Walk every match and print each key/value.
- Stretch: open the Regex Tester on qa.codes, paste a real Selenium error log into it, and find the pattern that captures every "element not found" CSS selector. Then paste that pattern into your Java code (with the \\ doubling) and confirm it produces the same captures. Crossing between an interactive tester and Java's escape rules is a useful skill on its own.
You can now extract and rewrite text by shape, not just by literal. Lesson 3 introduces lambdas — the syntax that powers the next-generation collection processing in lesson 4.