Speed up app start/open


* * *

Feature Spec: Add Shell Command Tools to Android MCP
----------------------------------------------------

### Background / Problem

Current tools (Snapshot → Click) require multiple round-trips to launch an app:

*   Snapshot to read screen
*   Find target coordinates or selectors
*   Click
*   Snapshot again to verify

This is slow, fragile (coordinate/layout-dependent), and requires the target element to be visible on screen.

* * *

### Current Tool Inventory (`android-mcp-sanjar`)

Tool | Description
-- | --
ListDevices | List connected ADB devices
ConnectDevice(serial) | Connect to a device by serial
Device(action) | list / connect / disconnect
Snapshot(use\_vision, use\_annotation) | Screenshot + accessibility tree
Click(x, y) | Tap at coordinates
LongClick(x, y) | Long press at coordinates
ClickBySelector(text, resourceId, className, description) | Tap by UI element attributes
Press(button) | Hardware button press (home, back, etc.)
Type | Type text input
Swipe | Swipe gesture
Drag | Drag gesture
Notification | Read device notifications
Wait | Fixed time wait
WaitForElement(...) | Wait for element to appear

> **Clarification on "adb shell" method:** This runs `adb shell am start` from the **laptop/host machine** (where `adb` is installed, e.g. `~/Library/Android/sdk/platform-tools/adb`), communicating with the phone over ADB (USB or Wireless Debugging). The command originates on the host rather than from a shell opened on the device through MCP. This was validated using the `Macos:Shell` MCP tool to invoke the local `adb` binary directly.
> 
> **The goal of this spec** is to bring this capability natively into the android-mcp server, so callers don't need `Macos:Shell` or a local `adb` binary exposed to Claude as a separate escape hatch.

* * *

### Workaround (Current State)

Without a `ShellCommand` tool, the only way to do fast app launching today is via `Macos:Shell` calling the host's `adb` binary:

    ~/Library/Android/sdk/platform-tools/adb -s <serial> shell am start -n com.whatsapp/.HomeActivity
    

This works but has two problems:

1.  It requires `Macos:Shell` (or equivalent host shell access) to be available as an MCP tool — a separate, unrelated server
2.  It leaks host machine details and is not portable across setups

* * *

### Requested Features

**1\. ShellCommand tool** — executes `adb shell <command>` from the MCP host machine against the connected device.

    ShellCommand(command: str) -> stdout: str, exit_code: int
    

Enables: `am start`, `monkey`, `input`, `dumpsys`, `pm list packages`, etc.

**2\. LaunchApp tool** (convenience wrapper) — launches an app by package name.

    LaunchApp(package: str, activity: str = None) -> success: bool
    

If activity is omitted, resolves the default launcher activity automatically via `monkey`.
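
A minimal sketch of how `LaunchApp` could assemble its `adb` invocation, assuming the server knows the active serial. The helper names `build_launch_args` and `launch_app` are hypothetical and not part of the existing server. The `monkey -p <package> -c android.intent.category.LAUNCHER 1` form fires a single launcher event, which is the usual way to start an app without knowing its activity.

```python
import subprocess
from typing import List, Optional


def build_launch_args(serial: str, package: str,
                      activity: Optional[str] = None) -> List[str]:
    """Build the adb argv for launching an app (hypothetical helper)."""
    base = ["adb", "-s", serial, "shell"]
    if activity:
        # Explicit component: am start -n <package>/<activity>
        return base + ["am", "start", "-n", f"{package}/{activity}"]
    # No activity given: let monkey send one LAUNCHER-category event,
    # which starts the package's default launcher activity.
    return base + ["monkey", "-p", package,
                   "-c", "android.intent.category.LAUNCHER", "1"]


def launch_app(serial: str, package: str,
               activity: Optional[str] = None) -> bool:
    """Run the launch command and report success (heuristic)."""
    result = subprocess.run(
        build_launch_args(serial, package, activity),
        capture_output=True, text=True, timeout=30,
    )
    # Assumption: a zero exit code with no "Error" text means success.
    # Some tool versions may print an error while still exiting 0.
    output = result.stdout + result.stderr
    return result.returncode == 0 and "Error" not in output
```

Splitting argument construction from execution keeps the command shape easy to unit-test without a device attached.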

**3\. OpenDeepLink tool** (convenience wrapper) — launches directly into a specific screen via URI.

    OpenDeepLink(uri: str) -> success: bool
    

Example: `blinkit://search?q=popcorn`
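
One possible shape for the `OpenDeepLink` command, sketched under the assumption that it maps to `am start -a android.intent.action.VIEW -d <uri>`. The helper name `build_deep_link_args` is hypothetical. Deep-link URIs often contain `?` and `&`, which the device shell would otherwise interpret, so the sketch quotes the URI before handing the command to `adb shell`.

```python
import shlex
from typing import List


def build_deep_link_args(serial: str, uri: str) -> List[str]:
    """Build the adb argv that opens a URI via a VIEW intent (hypothetical helper)."""
    # Quote the URI for the device-side shell so characters such as
    # '?' and '&' reach `am` unchanged instead of being interpreted.
    remote_command = "am start -a android.intent.action.VIEW -d " + shlex.quote(uri)
    # The whole device-side command is passed as one argument to `adb shell`.
    return ["adb", "-s", serial, "shell", remote_command]
```

For the example URI above, the device-side command becomes `am start -a android.intent.action.VIEW -d 'blinkit://search?q=popcorn'`.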

Priority: `ShellCommand` is foundational — the other two are thin wrappers on top of it.

### Implementation sketch

The MCP server already has `adb` wired internally (it's how `Snapshot`, `Click` etc. work). Adding `ShellCommand` is a ~10 line change:

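    import subprocess

    # `mcp` and DEVICE_SERIAL are assumed to be defined by the server already.
    @mcp.tool()
    def shell_command(command: str) -> dict:
        """Run an adb shell command on the connected device."""
        # Pass the command as one argument so quoting and whitespace survive;
        # command.split() would collapse spaces inside quoted arguments.
        result = subprocess.run(
            ["adb", "-s", DEVICE_SERIAL, "shell", command],
            capture_output=True, text=True,
        )
        return {
            "stdout": result.stdout.strip(),
            "exit_code": result.returncode,
        }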
    

* * *

### Failure Modes

**1\. Package name must be known a priori** `adb shell am start` requires the exact package name (e.g. `com.whatsapp`). For popular apps this works from model memory. For less common, enterprise, or region-specific apps, the model may not know the package name. Without `ShellCommand`, there's no way to run `pm list packages` to discover it dynamically — making this approach unreliable for the general case. Once `ShellCommand` exists, this failure mode disappears since the model can enumerate packages first.
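
The package-discovery step could look roughly like the following: a sketch that parses the `package:<name>` lines printed by `pm list packages` and filters them by keyword. The function names are hypothetical, and the sample output in the usage note is illustrative rather than captured from a real device.

```python
from typing import List


def parse_package_list(pm_output: str) -> List[str]:
    """Extract package names from `pm list packages` output."""
    packages = []
    for line in pm_output.splitlines():
        line = line.strip()
        # Each relevant line has the form "package:<name>".
        if line.startswith("package:"):
            packages.append(line[len("package:"):])
    return packages


def find_packages(pm_output: str, keyword: str) -> List[str]:
    """Return installed packages whose name contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [p for p in parse_package_list(pm_output) if needle in p.lower()]
```

With `ShellCommand` available, the model would feed the `stdout` of `pm list packages` into `find_packages(stdout, "whatsapp")` and pass the match on to the launch step.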

**2\. Accessibility/description tree dumping is unreliable** Using `uiautomator dump` or the accessibility tree (as surfaced by `Snapshot`) to find elements by description is fragile. Many apps set empty, generic, or non-deterministic content descriptions. System UI elements, custom views, and WebView-rendered content are particularly bad offenders. `ClickBySelector` with `description=` will silently time out or match the wrong element in these cases. This makes selector-based navigation an unreliable fallback when shell access is unavailable.

**3\. Multiple devices require explicit serial targeting** As encountered in testing, `adb` fails with "more than one device/emulator" when multiple devices are connected. The MCP server must either: (a) always pass `-s <serial>` using the currently active device, or (b) expose a `SetActiveDevice` tool. Currently `ConnectDevice` exists but it's unclear if it sets a global active serial for subsequent commands.
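
Option (a) could be sketched as follows: parse the output of `adb devices`, then resolve which serial to pass to `-s`. The function names and the error-handling policy are assumptions for illustration, not the server's current behavior.

```python
from typing import List, Optional


def parse_adb_devices(output: str) -> List[str]:
    """Return serials of devices that `adb devices` reports in the 'device' state."""
    serials = []
    for line in output.splitlines():
        parts = line.split()
        # Ready devices appear as "<serial>\tdevice"; the header line and
        # entries such as "unauthorized" or "offline" are skipped.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def resolve_serial(output: str, active: Optional[str] = None) -> str:
    """Pick the serial to use with `adb -s`, refusing to guess when ambiguous."""
    serials = parse_adb_devices(output)
    if active is not None:
        if active in serials:
            return active
        raise ValueError(f"active device {active!r} is not connected")
    if len(serials) == 1:
        return serials[0]
    raise ValueError(f"expected exactly one device, found {len(serials)}")
```

Raising on ambiguity, rather than picking the first device, avoids sending commands to the wrong phone when several are attached.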
