Feature Spec: Add Shell Command Tools to Android MCP
Background / Problem
Current tools (Snapshot → Click) require multiple round-trips to launch an app:
- Snapshot to read screen
- Find target coordinates or selectors
- Click
- Snapshot again to verify
This is slow, fragile (coordinate/layout-dependent), and requires the target element to be visible on screen.
Current Tool Inventory (android-mcp-sanjar)
| Tool | Description |
| -- | -- |
| ListDevices | List connected ADB devices |
| ConnectDevice(serial) | Connect to a device by serial |
| Device(action) | list / connect / disconnect |
| Snapshot(use_vision, use_annotation) | Screenshot + accessibility tree |
| Click(x, y) | Tap at coordinates |
| LongClick(x, y) | Long press at coordinates |
| ClickBySelector(text, resourceId, className, description) | Tap by UI element attributes |
| Press(button) | Hardware button press (home, back, etc.) |
| Type | Type text input |
| Swipe | Swipe gesture |
| Drag | Drag gesture |
| Notification | Read device notifications |
| Wait | Fixed time wait |
| WaitForElement(...) | Wait for element to appear |
Clarification on "adb shell" method: This runs adb shell am start from the laptop/host machine (where adb is installed, e.g. ~/Library/Android/sdk/platform-tools/adb), communicating to the phone over ADB (USB or Wireless Debugging). It does not execute a shell on the device itself in the MCP sense — the command originates on the host. This was validated using the Macos:Shell MCP tool to invoke the local adb binary directly.
The goal of this spec is to bring this capability natively into the android-mcp server, so callers don't need Macos:Shell or a local adb binary exposed to Claude as a separate escape hatch.
Workaround (Current State)
Without a ShellCommand tool, the only way to do fast app launching today is via Macos:Shell calling the host's adb binary:
~/Library/Android/sdk/platform-tools/adb -s <serial> shell am start -n com.whatsapp/.HomeActivity
This works but has two problems:
- It requires Macos:Shell (or equivalent host shell access) to be available as an MCP tool, which is a separate, unrelated server
- It leaks host machine details and is not portable across setups
Requested Features
1. ShellCommand tool — executes adb shell <command> from the MCP host machine against the connected device.
ShellCommand(command: str) -> stdout: str, exit_code: int
Enables: am start, monkey, input, dumpsys, pm list packages, etc.
2. LaunchApp tool (convenience wrapper) — launches an app by package name.
LaunchApp(package: str, activity: str = None) -> success: bool
If activity is omitted, resolves the default launcher activity automatically via monkey.
3. OpenDeepLink tool (convenience wrapper) — launches directly into a specific screen via URI.
OpenDeepLink(uri: str) -> success: bool
Example: blinkit://search?q=popcorn
Priority: ShellCommand is foundational — the other two are thin wrappers on top of it.
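To illustrate how thin the two wrappers could be, here is a minimal sketch that builds the device-side command strings and hands them to a `run` callable standing in for ShellCommand. The helper names (`launch_app_command`, `deep_link_command`, `launch_app`, `open_deep_link`), the `Runner` type, and the success heuristic are assumptions made for illustration; none of them exist in the current server.

```python
import shlex
from typing import Callable, Optional

# Hypothetical stand-in for ShellCommand: takes a device-side command string
# and returns {"stdout": str, "exit_code": int}.
Runner = Callable[[str], dict]


def launch_app_command(package: str, activity: Optional[str] = None) -> str:
    """Build the device-side command for LaunchApp."""
    if activity:
        # Explicit component: am start -n <package>/<activity>
        return f"am start -n {shlex.quote(package + '/' + activity)}"
    # No activity given: have monkey send one LAUNCHER intent to the package.
    return f"monkey -p {shlex.quote(package)} -c android.intent.category.LAUNCHER 1"


def deep_link_command(uri: str) -> str:
    """Build the device-side command for OpenDeepLink."""
    # Quote the URI so characters such as '?' and '&' survive the device shell.
    return f"am start -a android.intent.action.VIEW -d {shlex.quote(uri)}"


def _succeeded(result: dict) -> bool:
    # Heuristic (assumption): zero exit code and no "Error" text in the output.
    # Real am/monkey output should be checked on the target devices.
    return result["exit_code"] == 0 and "Error" not in result["stdout"]


def launch_app(run: Runner, package: str, activity: Optional[str] = None) -> bool:
    return _succeeded(run(launch_app_command(package, activity)))


def open_deep_link(run: Runner, uri: str) -> bool:
    return _succeeded(run(deep_link_command(uri)))
```

Keeping command construction separate from execution means the strings can be unit-tested without a device attached, and the same builders work whether `run` is the MCP tool or a local test double.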
Implementation sketch
The MCP server already has adb wired internally (it's how Snapshot, Click etc. work). Adding ShellCommand is a ~10 line change:
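import subprocess

@mcp.tool()
def shell_command(command: str) -> dict:
    """Run an adb shell command on the connected device."""
    # Pass the command as a single argument: adb joins its arguments with
    # spaces and hands the result to the device shell, so command.split()
    # would only collapse whitespace inside quoted strings.
    result = subprocess.run(
        ["adb", "-s", DEVICE_SERIAL, "shell", command],
        capture_output=True, text=True
    )
    return {
        "stdout": result.stdout.strip(),
        "exit_code": result.returncode
    }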
Failure Modes
1. Package name must be known a priori
adb shell am start requires the exact package name (e.g. com.whatsapp). For popular apps this works from model memory. For less common, enterprise, or region-specific apps, the model may not know the package name. Without ShellCommand, there's no way to run pm list packages to discover it dynamically, which makes this approach unreliable for the general case. Once ShellCommand exists, this failure mode disappears since the model can enumerate packages first.
2. Accessibility/description tree dumping is unreliable
Using uiautomator dump or the accessibility tree (as surfaced by Snapshot) to find elements by description is fragile. Many apps set empty, generic, or non-deterministic content descriptions. System UI elements, custom views, and WebView-rendered content are particularly bad offenders. ClickBySelector with description= will silently time out or match the wrong element in these cases. This makes selector-based navigation an unreliable fallback when shell access is unavailable.
3. Multiple devices require explicit serial targeting
As encountered in testing, adb fails with "more than one device/emulator" when multiple devices are connected. The MCP server must either: (a) always pass -s <serial> using the currently active device, or (b) expose a SetActiveDevice tool. Currently ConnectDevice exists but it's unclear whether it sets a global active serial for subsequent commands.
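The first and third failure modes could be handled with a few small helpers, sketched here under stated assumptions. `parse_packages` reads the `package:<name>` lines that pm list packages prints, and `resolve_serial` picks the -s target from adb devices output, refusing to guess when more than one device is attached. The function names and the error-handling policy are hypothetical and not taken from the existing server.

```python
from typing import List, Optional


def parse_packages(pm_output: str) -> List[str]:
    """Extract package names from `pm list packages` output."""
    prefix = "package:"
    return [
        line.strip()[len(prefix):]
        for line in pm_output.splitlines()
        if line.strip().startswith(prefix)
    ]


def parse_devices(adb_devices_output: str) -> List[str]:
    """Return serials whose state is `device` in `adb devices` output."""
    serials = []
    for line in adb_devices_output.splitlines():
        parts = line.split()
        # Header, blank, offline, and unauthorized lines fail this check.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def resolve_serial(adb_devices_output: str, active: Optional[str] = None) -> str:
    """Choose the serial to pass to `adb -s`, or raise if it is ambiguous."""
    serials = parse_devices(adb_devices_output)
    if active is not None:
        if active not in serials:
            raise ValueError(f"active device {active!r} is not connected")
        return active
    if len(serials) == 1:
        return serials[0]
    raise ValueError(
        f"expected exactly one device, found {len(serials)}; pass a serial"
    )
```

Raising on ambiguity instead of picking the first device keeps a command from landing on the wrong phone, which matters once ShellCommand can run arbitrary commands.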