Mobile apps are built around touch. Swipes dismiss notifications, pinch gestures zoom maps, long presses reveal context menus, and drag-and-drop reorders lists. If your test suite can only tap and type, it cannot test a significant portion of most apps. This lesson covers the W3C Actions API — the modern, Appium-2-compliant way to execute every gesture — plus platform-specific shortcuts where they exist.
The W3C Actions API
Appium 2.x removed all legacy gesture methods (swipe(), pinch(), zoom(), longPress()). The replacement is the W3C Actions API, which models input events as sequences of pointer and keyboard actions.
The core classes:
import org.openqa.selenium.interactions.PointerInput;
import org.openqa.selenium.interactions.Sequence;
import org.openqa.selenium.interactions.Pause;
// Also used by the examples in this lesson
import java.time.Duration;
import java.util.List;
A gesture is one or more Sequence objects. Each Sequence belongs to one input device (a finger). Multi-touch gestures use multiple sequences executed simultaneously.
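To make "executed simultaneously" concrete: in the W3C WebDriver model, the n-th action of every sequence is dispatched in the same tick, and a tick lasts as long as its longest action. The sketch below is a plain-Java model of that rule, not Appium code. TickModel and totalDurationMs are hypothetical names introduced for illustration, and the model ignores device and network latency.

```java
import java.util.List;

// Hypothetical helper: models how long a multi-sequence gesture takes
// when actions are aligned tick by tick.
final class TickModel {
    /**
     * Each array holds the duration in ms of every action in one sequence,
     * in order (0 for pointerDown and pointerUp).
     */
    static long totalDurationMs(List<long[]> sequences) {
        // The gesture has as many ticks as the longest sequence has actions.
        int ticks = 0;
        for (long[] sequence : sequences) {
            ticks = Math.max(ticks, sequence.length);
        }
        long total = 0;
        for (int tick = 0; tick < ticks; tick++) {
            // A tick lasts as long as the slowest action scheduled in it.
            long tickMs = 0;
            for (long[] sequence : sequences) {
                if (tick < sequence.length) {
                    tickMs = Math.max(tickMs, sequence[tick]);
                }
            }
            total += tickMs;
        }
        return total;
    }
}
```

Two fingers that each perform move (0 ms), down, move (600 ms), up give a modelled total of 600 ms, not 1200 ms, because the two moves share a tick. This is also why both fingers of a multi-touch gesture should list their actions in the same order: an extra action in one sequence shifts everything after it into a later tick.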
Swipe
A swipe is a press-move-release on one finger:
private void swipe(int startX, int startY, int endX, int endY, int durationMs) {
    PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
    Sequence swipe = new Sequence(finger, 1)
        .addAction(finger.createPointerMove(
            Duration.ZERO, PointerInput.Origin.viewport(), startX, startY))
        .addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
        .addAction(finger.createPointerMove(
            Duration.ofMillis(durationMs), PointerInput.Origin.viewport(), endX, endY))
        .addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    driver.perform(List.of(swipe));
}
Usage examples:
// Swipe up (scroll down — content moves up to reveal more below)
swipe(540, 1400, 540, 400, 500);
// Swipe down (scroll up — content moves down to reveal top)
swipe(540, 400, 540, 1400, 500);
// Swipe left (carousel next slide)
swipe(900, 700, 100, 700, 400);
// Swipe right (carousel previous slide, or drawer open)
swipe(100, 700, 900, 700, 400);
The coordinates are absolute viewport pixels. Get the device screen size to calculate relative positions:
Dimension size = driver.manage().window().getSize();
int centerX = size.width / 2;
int startY = (int)(size.height * 0.8);
int endY = (int)(size.height * 0.2);
swipe(centerX, startY, centerX, endY, 500);
Pinch (zoom out)
A pinch uses two simultaneous touch sequences converging inward:
private void pinch(WebElement element) {
    Dimension size = element.getSize();
    Point location = element.getLocation();
    int centerX = location.getX() + size.width / 2;
    int centerY = location.getY() + size.height / 2;
    int startX1 = centerX - 200;
    int startX2 = centerX + 200;
    PointerInput finger1 = new PointerInput(PointerInput.Kind.TOUCH, "finger1");
    PointerInput finger2 = new PointerInput(PointerInput.Kind.TOUCH, "finger2");
    Sequence seq1 = new Sequence(finger1, 1)
        .addAction(finger1.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), startX1, centerY))
        .addAction(finger1.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
        .addAction(finger1.createPointerMove(Duration.ofMillis(600), PointerInput.Origin.viewport(), centerX - 50, centerY))
        .addAction(finger1.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    Sequence seq2 = new Sequence(finger2, 1)
        .addAction(finger2.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), startX2, centerY))
        .addAction(finger2.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
        .addAction(finger2.createPointerMove(Duration.ofMillis(600), PointerInput.Origin.viewport(), centerX + 50, centerY))
        .addAction(finger2.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    driver.perform(List.of(seq1, seq2));
}
Zoom (pinch out)
The same as pinch but the fingers move outward from the centre:
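// Same setup as pinch(); only each finger's start and end positions are swapped
Sequence seq1 = new Sequence(finger1, 1)
    .addAction(finger1.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), centerX - 50, centerY))
    .addAction(finger1.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
    .addAction(finger1.createPointerMove(Duration.ofMillis(600), PointerInput.Origin.viewport(), startX1, centerY))
    .addAction(finger1.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
Sequence seq2 = new Sequence(finger2, 1)
    .addAction(finger2.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), centerX + 50, centerY))
    .addAction(finger2.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
    .addAction(finger2.createPointerMove(Duration.ofMillis(600), PointerInput.Origin.viewport(), startX2, centerY))
    .addAction(finger2.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
driver.perform(List.of(seq1, seq2));
Long press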
Hold the pointer down for an extended duration before releasing:
private void longPress(WebElement element, Duration holdDuration) {
    // getRect().getPoint() is the element's top-left corner; add half the size to reach the centre
    Point topLeft = element.getRect().getPoint();
    int x = topLeft.getX() + element.getSize().getWidth() / 2;
    int y = topLeft.getY() + element.getSize().getHeight() / 2;
    PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
    Sequence longPress = new Sequence(finger, 1)
        .addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(), x, y))
        .addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
        .addAction(new Pause(finger, holdDuration)) // keep the finger down for the hold duration
        .addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    driver.perform(List.of(longPress));
}
// Usage: long press a list item for 2 seconds to trigger context menu
longPress(driver.findElement(AppiumBy.accessibilityId("message_item")), Duration.ofSeconds(2));
Drag and drop
Move an element to a new position:
private void dragAndDrop(WebElement source, WebElement target) {
    Point sourceCenter = getCenter(source);
    Point targetCenter = getCenter(target);
    PointerInput finger = new PointerInput(PointerInput.Kind.TOUCH, "finger");
    Sequence drag = new Sequence(finger, 1)
        .addAction(finger.createPointerMove(Duration.ZERO, PointerInput.Origin.viewport(),
            sourceCenter.getX(), sourceCenter.getY()))
        .addAction(finger.createPointerDown(PointerInput.MouseButton.LEFT.asArg()))
        .addAction(new Pause(finger, Duration.ofMillis(500))) // hold briefly to register
        .addAction(finger.createPointerMove(Duration.ofMillis(800), PointerInput.Origin.viewport(),
            targetCenter.getX(), targetCenter.getY()))
        .addAction(finger.createPointerUp(PointerInput.MouseButton.LEFT.asArg()));
    driver.perform(List.of(drag));
}
// Helper assumed by dragAndDrop (a minimal sketch): midpoint of the element's bounding box
private Point getCenter(WebElement element) {
    Point topLeft = element.getLocation();
    Dimension size = element.getSize();
    return new Point(topLeft.getX() + size.getWidth() / 2, topLeft.getY() + size.getHeight() / 2);
}
Platform shortcuts
For common gestures, platform-specific mobile: commands are often simpler than building full W3C sequences:
Android:
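// Swipe inside a specific element
// elementId must be the WebDriver element id, not the resource-id attribute
// (RemoteWebElement is org.openqa.selenium.remote.RemoteWebElement)
driver.executeScript("mobile: swipeGesture", Map.of(
    "elementId", ((RemoteWebElement) element).getId(),
    "direction", "left",
    "percent", 0.75
));
// Scroll down within a screen region (here the full 1080x2400 screen)
driver.executeScript("mobile: scrollGesture", Map.of(
    "left", 0, "top", 0, "width", 1080, "height", 2400,
    "direction", "down",
    "percent", 3.0
));
iOS: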
// Swipe
driver.executeScript("mobile: swipe", Map.of("direction", "left"));
// Pinch
driver.executeScript("mobile: pinch", Map.of(
    "scale", 0.5,    // < 1 to zoom out, > 1 to zoom in
    "velocity", 1.0  // speed of the gesture
));
// Scroll element into view
// toVisible is its own scroll strategy and needs the target element's id;
// do not combine it with direction
driver.executeScript("mobile: scroll", Map.of(
    "elementId", ((RemoteWebElement) element).getId(),
    "toVisible", true
));
The mobile: commands are faster to write and less brittle for simple cases. Use W3C Actions when you need precise control over timing, distance, or multi-touch coordination.
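When W3C Actions are the right tool, hardcoded values such as 540 and 1400 in the earlier swipe examples tie a test to one screen size. The following is a minimal sketch of a direction-based coordinate calculator that works from the screen dimensions instead. SwipePoints, Direction, and forDirection are hypothetical names introduced for illustration, and the margin is given as a whole-number percentage to keep the arithmetic in integers.

```java
// Hypothetical helper: derives swipe start and end points from the screen size.
final class SwipePoints {
    enum Direction { UP, DOWN, LEFT, RIGHT }

    /**
     * Returns {startX, startY, endX, endY} for a swipe through the screen centre.
     * marginPercent is how far from each edge the finger starts and stops
     * (for example 20 means the swipe spans 20% to 80% of the screen).
     */
    static int[] forDirection(int width, int height, Direction direction, int marginPercent) {
        if (marginPercent < 0 || marginPercent >= 50) {
            throw new IllegalArgumentException("marginPercent must be in [0, 50)");
        }
        int centerX = width / 2;
        int centerY = height / 2;
        int nearX = width * marginPercent / 100;
        int farX = width - nearX;
        int nearY = height * marginPercent / 100;
        int farY = height - nearY;
        switch (direction) {
            case UP:    // finger moves bottom to top, content scrolls down
                return new int[] {centerX, farY, centerX, nearY};
            case DOWN:  // finger moves top to bottom, content scrolls up
                return new int[] {centerX, nearY, centerX, farY};
            case LEFT:  // finger moves right to left
                return new int[] {farX, centerY, nearX, centerY};
            case RIGHT: // finger moves left to right
                return new int[] {nearX, centerY, farX, centerY};
            default:
                throw new IllegalArgumentException("Unknown direction: " + direction);
        }
    }
}
```

With the size returned by driver.manage().window().getSize(), the result can be passed straight to the swipe helper from this lesson: int[] p = SwipePoints.forDirection(size.width, size.height, SwipePoints.Direction.UP, 20); swipe(p[0], p[1], p[2], p[3], 500);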