Skip to content

Bump VERSION to 1.1.0 for release#13

Merged
daniel-pittman merged 1 commit into
developfrom
chore/bump-1.1.0
May 27, 2026
Merged

Bump VERSION to 1.1.0 for release#13
daniel-pittman merged 1 commit into
developfrom
chore/bump-1.1.0

Conversation

@daniel-pittman
Copy link
Copy Markdown
Owner

Summary

VERSION bump 1.0.1 → 1.1.0 in preparation for the v1.1.0 release.

Scope of v1.1.0

(Captured in the release notes; PRs #2#12 merged since v1.0.1.)

@daniel-pittman daniel-pittman merged commit a642baa into develop May 27, 2026
3 checks passed
@daniel-pittman daniel-pittman deleted the chore/bump-1.1.0 branch May 27, 2026 23:41
@claude
Copy link
Copy Markdown

claude Bot commented May 27, 2026

Claude finished @daniel-pittman's task in 25s —— View job


Code Review — PR #13

  • Read PR context
  • Inspect VERSION change
  • Verify scope matches title/body
  • Post review

Summary

This PR is a single-line version bump in VERSION: 1.0.11.1.0, in preparation for the v1.1.0 release. The scope matches the title and body exactly.

- 1.0.1
+ 1.1.0

Assessment

  • Correctness — The new value follows SemVer and is a valid MINOR bump from 1.0.1, appropriate for an accumulation of feature PRs (Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) #2Fix bb prs bash-3.2 incompat + macOS install docs #12) without breaking changes. The file remains a single line with a trailing newline (consistent with prior convention).
  • Scope — Strictly the version bump. No collateral edits, no stray formatting changes. Good hygiene for a release-prep PR.
  • Parity rule — Per CLAUDE.md, bash and Python sides must stay in lockstep. VERSION is a single source of truth read by both, so no parallel update is required here. Verified bb and mcp_server.py do not embed a hard-coded version string that would need to be bumped alongside this.
  • Tagging — As a follow-up after merge, ensure a matching v1.1.0 git tag and GitHub release are created so consumers pinning by tag stay aligned with VERSION. (Out of scope for this PR.)
  • Changelog — The PR body defers release notes to the GitHub release. That's fine, but if a CHANGELOG.md exists or is added later, this would be the natural place to enumerate Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) #2Fix bb prs bash-3.2 incompat + macOS install docs #12.

Findings

No blocking issues. No security, correctness, or style concerns in the diff.

Verdict

LGTM — ship it. Status notes this is already MERGED, so this review is informational. Confirm the v1.1.0 tag/release is pushed to finalize the release.


• Branch: claude/pr-13-20260527-2341

daniel-pittman added a commit that referenced this pull request May 27, 2026
* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops

Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops
function into a FastMCP tool registry, with a self-bootstrap venv
mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total.

mcp_server.py — 30 tools:

  PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps,
    pipeline_trigger, pipeline_stop, pipeline_logs
  PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create,
    pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff,
    pr_comments_list, pr_comment_add
  REPOS / METADATA (7): repos_list, repo_show, branches_list,
    branch_show, vars_list, downloads_list, commits_list
  GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo,
    git_recent_commits, git_uncommitted_changes
  META (1): whoami

Self-bootstrap pattern (mirrors zenhub-cli):

  - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10
    (preferring the launching interpreter, falling back to pyenv /
    Homebrew / system locations), runs `python -m venv`, installs the
    `mcp` package, then `os.execv`s into the venv.

  - Bootstrap is stdlib-only (uses subprocess + sys, no third-party
    imports until after the venv is active). Means any python3 on PATH
    can launch the server.

  - `mcp` is the only venv dep — no heavy ML libs (no torch /
    sentence-transformers / numpy / huggingface_hub). bb doesn't have
    semantic-similarity workflows like zenhub-cli does.

  - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets
    pytest exercise tool wiring and result-dict shapes without
    pulling in mcp. Production NEVER sets this — without real
    FastMCP, .run() raises.

Repo resolution — every Bitbucket tool takes an optional `repo`:

  - ""              → auto-detect via `git remote get-url origin`
                      from BB_DEFAULT_REPO_PATH (or cwd)
  - "myrepo"        → use BB_WORKSPACE from config + "myrepo"
  - "acme/myrepo"   → use "acme" workspace + "myrepo" slug (overrides
                      BB_WORKSPACE for this call)

Validated in `_resolve_repo` with ValueError on malformed input
("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network
call burns API budget).

Error envelope — every tool returns either:

  {"ok": True, "workspace": ..., "repo": ..., <result fields>}
  {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>}

The error dict carries `status`/`url`/`body` for BBApiError,
`returncode`/`stderr` for GitOpError, so the agent can branch on
`kind == "BBApiError" and status == 404` without parsing the message.

pr_create auto-detects source_branch via git_current_branch when
empty (matches bash `bb pr-create`).

whoami reports user / workspace / api_base + best-effort git context
(branch, workspace, repo from remote). Token is NEVER echoed; the
test explicitly asserts the token string is absent from the response.

Test discipline (same as bb_api / bb_ops / git_ops):

  - Tool count + names pinned exactly (test_all_expected_tools_registered)
    — a future rename or accidental drop fails the suite.
  - Per-tool wiring: patch the bb_ops / git_ops function at the module
    level, call the tool, assert (a) the underlying function received
    the right args, (b) the response dict has the expected shape.
    Doesn't re-test bb_ops logic (already covered comprehensively).
  - Error envelope tested for each exception class
    (BBApiError, BBOpNotFound, GitOpError, ValueError).
  - whoami's no-token guarantee asserted as a security property.
  - FastMCP stub's .run() raises a clear error so a test that
    accidentally calls it fails loud instead of hanging on stdio.

CI / packaging:

  pyproject.toml py-modules += "mcp_server".
  ci.yml syntax-check += "mcp_server.py".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 review fixes

Round 1 review on PR #7 flagged 15 findings, with two explicitly called
out as production hazards ("will bite real users on first deploy on
common Linux topologies"). Addressed 11 of 15; deferring 4 with
rationale.

Production hazards (must-fix per reviewer):

  #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)`
    instead of `realpath(sys.executable) == realpath(_VENV_PY)`.
    `python -m venv` on Linux/macOS defaults to --symlinks, so
    realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical
    path as the builder interpreter (e.g. /usr/bin/python3.12) — the
    realpath comparison was claiming "already under venv" while we
    were actually still under the system interpreter, skipping the
    execv and dying on `from mcp.server.fastmcp import FastMCP`.
    sys.prefix is set per-interpreter from the venv layout and is the
    authoritative signal.

  #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C,
    OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing.
    The old guard skipped reinstall on every subsequent launch and
    died on import with no self-repair path. Now writes a
    `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds;
    the venv create step is also idempotent (only runs if _VENV_PY
    is absent) so retries pick up where they left off.

  #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned
    "mcp" would silently install a future breaking mcp 2.x on the
    next /tmp wipe. Matches the pyproject [mcp] extra.

Correctness:

  #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch
    returns the literal "HEAD" for detached/unborn state; bb_ops only
    checks non-empty/non-whitespace, so the call would have reached
    Bitbucket with source.branch.name="HEAD" and surfaced as an
    opaque API error. Now raises ValueError locally with a clear
    "pass source_branch=" hint.

  #5 _resolve_repo strips whitespace before parsing. A sloppy paste
    like "  acme/widget  " no longer slips through as
    workspace="  acme" and fails deep in the API layer.

 #11 pr_create source_branch auto-detect now triggers on whitespace
    (not just empty string), symmetric with #5.

 #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths
    git_ops._run_git doesn't wrap explicitly (IsADirectoryError,
    BlockingIOError, ConnectionResetError), and catches
    os.getcwd() on a deleted cwd inside _default_repo_path() which
    fires BEFORE any wrapped git call.

Surface improvements:

  #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on"
    (newest first); agent can pass "created_on" for oldest-first or
    other sort keys.

  #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default
    120s matches bb_ops; agent can bump for pipelines with very
    large log payloads or oversized PR diffs that would otherwise
    cap out.

  #8 git_status payload renamed from `status` to `working_tree`. The
    `status` key collided with the HTTP-status field _error_dict
    uses for BBApiError. Today the collision can't fire (git_status
    doesn't raise BBApiError), but the rename pre-empts a future
    broadening hazard.

 #10 branch_show echoes the stripped name in the response. bb_ops
    strips before encoding the URL, so the response now matches what
    Bitbucket actually resolved rather than the unstripped input.

Deferred (with rationale):

  #9  Response-key naming consistency (approval/result/pr). Reviewer
    suggested standardizing all mutation tools to "result", but the
    descriptive keys carry useful semantic info — pr_approve returns
    an approval record, pr_merge returns the merged PR, etc. The
    docstrings explain the shape. Skipping.

 #12 _client_cache concurrency + invalidation. Real concern for
    long-lived servers with rotated tokens, but stdio MCP is
    single-threaded and token rotation is rare; defer to Phase 4.6+
    if a real bug surfaces.

 #14 CI smoke step that actually imports the real mcp package.
    Larger CI workflow change; defer.

 #15 README MCP documentation. Explicitly Phase 4.8 scope.

Test count: 335 -> 341. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 round-2 review fixes

Round 2 surfaced 15 new findings including a real security bug. Fixed
all 15.

Security:

  #1 BBApiError.url now redacted in _error_dict. pipeline_logs and
    pr_diff follow Bitbucket's 307 to a presigned S3 URL via
    fetch_redirected_text; on a downstream non-3xx (S3 clock skew,
    expired signature, redirect cap), BBApiError(url=<signed S3 URL>)
    carried AWS-credential-bearing query parameters
    (X-Amz-Signature, X-Amz-Credential) verbatim into the agent
    error dict and downstream logs. New _redact_url helper strips
    `user:token@` AND replaces signed-URL query strings with
    `[redacted-signed-url-params]` while preserving the path so
    the agent knows what host was called.

Correctness:

  #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null
    from an MCP client deserialises to Python None; `.strip()` on
    None crashes uncaught. Also pre-normalised every stringy arg
    path through `(value or "").strip()` so None and "" funnel to
    the same auto-detect path uniformly.

  #3 _default_repo_path no longer evaluates os.getcwd() eagerly.
    `os.environ.get("KEY", os.getcwd())` runs the default even when
    the env var is set — the override never protected against a
    deleted cwd. Now uses `... or os.getcwd()` so the env var is
    the lazy short-circuit. whoami also wraps cwd resolution in a
    try block so a deleted cwd surfaces as a structured cwd_error
    field instead of escaping the function.

  #4 mcp import wrapped with helpful recovery message. If `mcp/`
    gets deleted post-bootstrap (manual `pip uninstall`, partial
    /tmp cleanup that wiped the package dir but spared the
    sentinel), the ImportError now tells the user to
    `rm .bbenv-ready` or `rm -rf /tmp/bbenv`.

  #5 _resolve_repo now strips inner slug parts. "acme/ widget"
    no longer slips through as workspace="acme", slug=" widget"
    and 404s on `/repositories/acme/%20widget`.

  #6 repos_list workspace stripped. ` acme` / `acme ` no longer
    slip through.

  #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was
    contradicting the file's stated philosophy). A refactor that
    renames a bb_ops kwarg now crashes visibly during dev/test
    instead of being silently reported back to the agent as a
    fake Bitbucket failure.

  #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS
    (not narrow GitOpError) AND store the full structured
    _error_dict (not str(e)). An agent branching on
    err.returncode for git-127 vs not-a-git-repo-128 vs
    timeout-(-1000) has the same shape available as every other
    tool.

  #10 _opt_str() helper introduced. (branch or None) became
    `_opt_str(branch)` everywhere — "" / "   " / None all funnel
    to None at the bb_ops boundary. Without this, "  " on a
    branch filter posted `?target.ref_name=%20%20`, on a pattern
    raised a confusing two-layers-down ValueError, on commits
    raised a different "non-empty, non-whitespace" error.
    Inconsistent surfaces for the same agent typo.

  #12 sys.prefix venv check now resolves both sides through
    Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink
    doesn't trigger an infinite execv loop.

  #13 Every tool that takes an identifier (pr_id / number /
    step_index) now threads it into the error dict via
    _error_dict_with. An agent fanning out parallel
    pipeline_logs(number=42, step_index=5) and
    pipeline_logs(number=42, step_index=6) can correlate
    failures to originating requests without substring-matching
    the message.

  #14 _resolve_repo validates the repo arg shape BEFORE calling
    _get_client(). A fresh-machine user without config + a
    malformed slug now sees the actual ValueError, not the
    BBConfigError that previously masked the real cause.

  #15 git_recent_commits echoes the stripped ref in the response
    (matches what `git log` actually ran). Consistent with
    branch_show.

UX:

  #11 pip install no longer --quiet. Now uses subprocess.run
    capture_output=True and re-raises with the captured stderr
    inlined so a network blip / mirror 503 / version yank /
    SSL cert error / proxy block surfaces with pip's actual
    diagnostic instead of bare `CalledProcessError: returned
    non-zero exit status 1`.

Deferred:

  #9 Test recorder accepts any kwarg name (false sense of
    wiring pinned). Real concern but the fix (functools.wraps
    or create_autospec) is invasive across the test suite;
    defer to a focused refactor.

Test count: 341 -> 348. All passing locally on Python 3.11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 round-3 review fixes (security regression + edges)

Round 3 caught a real security regression of round 2's fix, plus a
batch of consistency / hardening items.

Security regression fixed:

  #1 _error_dict's `message` field now also routed through URL
    redaction. Round 2 redacted `out["url"]` but left
    `out["message"]` carrying the raw URL — BBApiError.__str__
    constructs the message via
    `f"HTTP {status} from {url}: {body[:500]}"`, so str(e) ALWAYS
    contains the original URL.

    For the S3 presigned case (pipeline_logs / pr_diff → 307 → S3
    non-3xx), the error dict was shipping:
      {"url": "...?[redacted-signed-url-params]",   # fine
       "message": "HTTP 403 from https://.../?X-Amz-Signature=secret..."}
                                              ^^^^^^ secret leaked

    The existing test_bbapierror_redacts_signed_s3_url only asserted
    on d["url"], so the regression was invisible to CI. The fix adds
    a `_redact_message` helper that scans for `http(s)://...`
    substrings and runs each through `_redact_url`. Tests now assert
    redaction on BOTH `url` AND `message` for every cred-bearing
    fixture.

  #9 Case-insensitive signed-URL indicator matching. MinIO / R2 /
    Backblaze / mixed-case AWS variants would have slipped past the
    case-sensitive substring check. Now lowercases the query part
    before membership. Tests added for lowercase variant + plain
    `Signature=` form (used by some non-AWS S3-compatible services).

Correctness:

  #2 pipeline_trigger now validates + strips `branch`. Was the only
    branch-accepting tool that forwarded the arg verbatim — every
    other tool funnels through _opt_str, but pipeline_trigger
    couldn't (branch is required, not optional). Now explicitly
    validates non-empty/non-whitespace and strips, so `"  main  "`
    becomes `"main"` and `"   "` raises a clean local ValueError
    instead of POSTing whitespace and getting an opaque 4xx.

  #3 _resolve_repo bare-slug path now validates `repo not in (".", "..")`
    before _get_client(). Round 2 fixed the slash-containing branch
    but the bare-slug fallback still hit _get_client() first, so a
    fresh-machine user with `repo='.'` saw BBConfigError masking
    the real ValueError. Parametrized regression test added.

  #5 pr_create now rejects explicit `source_branch="HEAD"` (not just
    the auto-detect path). Same severity class as round-1 #4 — just
    on the other entry point. Bitbucket would have silently created
    a degenerate PR named after the literal HEAD ref.

  #6 pr_create error path now uses _error_dict_with(e, title=...).
    Threads the title into the error dict so parallel pr_create
    fan-outs (e.g. agent creating one PR per stacked branch in a
    train) can correlate failures with originating requests.

Defensive:

  #10 os.execv argv now uses Path(__file__).resolve(). Today's
    documented launch pattern uses absolute paths so the issue
    can't fire; this hardens against a future bootstrap enhancement
    or sitecustomize.py that introduces a chdir between launch and
    execv.

Deferred (carried forward with rationale):

  #4 _recorder accepts any kwarg name (false sense of wiring
    pinned). Fixing requires either create_autospec or
    inspect.signature.bind across the test suite — bigger refactor
    than this commit. Notable enough to track separately.

  #7 _client_cache invalidation. Long-lived MCP holds the original
    BBClient for its lifetime; rotating BB_TOKEN requires restart.
    Architectural, defer.

  #8 git_recent_commits silently defaults None ref to "HEAD".
    Cosmetic UX, defer.

Test count: 348 -> 356. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 round-4 review fixes (structural redaction)

Round 4 caught a third iteration of the same credential-leak pattern.
This commit closes it structurally so the class can't regress.

Critical security fix (structural):

  #1 _error_dict now routes EVERY string field through _safe_text:
    message, body (BBApiError), stderr (GitOpError). The previous
    rounds whack-a-mole'd one field at a time:
      - Round 2: redacted `url` → message still leaked
      - Round 3: redacted `message` → body still leaked
      - Round 4: redacted `body` → stderr would leak the moment
        Phase 4.7 wraps `git fetch` / `git push` (whose stderr
        commonly contains `fatal: unable to access 'https://x-token-
        auth:TOKEN@bb.org/...'`).

    Unified `_safe_text()` helper applies URL + SCP-cred redaction
    to free-form text. Tests now assert redaction across ("url",
    "message", "body") for the API case AND ("message", "stderr")
    for the git case, plus a new test_giterror_stderr_redacted that
    covers the Phase 4.7 hazard pre-emptively.

  #2 SECURITY: signed-URL indicator list broadened beyond AWS.
    Added: Azure SAS (`sig=`, lowercase only — the trailing `=`
    blocks `signature=` substring matching), GCP X-Goog-Signature /
    -Credential, bearer-token-in-URL forms (`access_token=`,
    `api_key=`). Plus the existing AWS / generic `Signature=`.
    Tests added for Azure SAS and bearer-token shapes.

  #8 SECURITY: free-form-text URL redaction broadened to any
    scheme. The previous `https?://` regex missed `ssh://`,
    `git+ssh://`, etc. The new `_ANY_URL_PATTERN` matches any URL
    scheme; combined with the existing SCP-style pattern, free-form
    git stderr / log output gets redacted regardless of remote URL
    shape. Test added: ssh:// URL with embedded passphrase redacted
    by _safe_text.

Correctness:

  #3 pr_create normalises every string arg at the wrapper boundary
    (source_branch, destination_branch, title, description). The
    wrapper previously only used `.strip()` for the empty-check
    guard, then forwarded the unstripped value to bb_ops which
    wrote it verbatim into the JSON payload. `"feat/widget "`
    became a 404, `"Hi "` became a PR title with trailing
    whitespace.

  #4 pr_create handles JSON null for source_branch. Previously
    `None.strip()` crashed uncaught with AttributeError (round 2
    fixed this class for _resolve_repo but missed pr_create).
    Title required validation also added — empty/whitespace title
    raises ValueError with the rejected value echoed.

  #6 _resolve_repo slash-branch now validates `.` / `..` in either
    workspace OR repo segment BEFORE _get_client(). Round 3 only
    landed the check on the bare-slug fallback; `repo="./foo"`
    still surfaced as BBConfigError on a config-less machine.
    Parametrized regression test added.

  #9 git_recent_commits ref normalisation handles None and
    whitespace-only uniformly. `(ref or "").strip() or "HEAD"` —
    None and "   " both fall back to HEAD, consistent with the
    empty-string success path.

Test hermeticity:

  #12 _reset_state fixture now scrubs BB_USER / BB_TOKEN /
    BB_WORKSPACE / BB_API_BASE before each test. A developer
    running pytest locally with these exported in their shell
    no longer leaks ambient config into the tests — the suite
    is hermetic against dev env.

Deferred (carried forward with rationale):

  #5 Identifier-threading inconsistency on remaining tools
    (pipeline_trigger / branches_list / branch_show / commits_list
    / repos_list / repo_show / vars_list / downloads_list). Real
    but stylistic; pick this up in a focused symmetry pass.

  #7 Stderr redaction proactive defense for future git wrappers —
    addressed structurally by routing stderr through _safe_text in
    #1 above (so technically not deferred).

  #10 _client_cache invalidation. Architectural; documented in
    earlier commit messages, defer.

  #11 pr_approve/pr_unapprove key naming inconsistency. Pure
    naming; intentional descriptive names per round-1 #9 deferral.

  #13 _resolve_repo error path lacks workspace/repo/path. Style;
    bundle with #5 in a symmetry pass.

Test count: 356 -> 360. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daniel-pittman added a commit that referenced this pull request May 27, 2026
* Phase 4.7: bash parity — 5 new commands + 8 parity bugfixes

Extends bb with the commands the Python ops exposed but bash didn't,
and fixes the parity bugs Python tests surfaced. Also removes a
personal-data leak in the help text.

NEW COMMANDS

  bb pr-unapprove [repo] <id>          DELETE /pullrequests/{id}/approve.
  bb pr-comment [repo] <id> <body>     POST /pullrequests/{id}/comments
                                       with {"content": {"raw": "..."}}.
  bb branch [repo] <name>              GET single branch detail; URL-encodes
                                       slashes in name (feat/widget →
                                       feat%2Fwidget).
  bb commits [repo] [branch] [count]   GET /commits or /commits/{branch}
                                       with branch URL-encoding + count.
  bb whoami                            Resolved config + git context +
                                       /user reachability probe. NEVER
                                       echoes BB_TOKEN.

PARITY BUGFIXES

  (a) cmd_pipeline_trigger no longer drops `variables` when `pattern`
      is empty. Previously the no-pattern branch built the payload
      WITHOUT the variables array, silently discarding any VAR=value
      pairs the user passed.

  (b) cmd_pipeline_stop captures the bb_post exit code instead of
      `> /dev/null`-ing it. A stop failure (already stopped,
      permission denied, 4xx) now surfaces with a clear error and
      non-zero exit instead of being masked by the unconditional
      success print.

  (c) build_number → uuid lookup bumped 50 → 100 (Bitbucket's max
      pagelen) in cmd_pipeline / cmd_pipeline_stop / cmd_logs.
      Older pipelines beyond 100 still unfindable in bash; full
      pagination is the Python-side improvement (2000-scan cap).

  (d) repo_path now validates workspace + repo at the boundary:
      rejects empty/whitespace, embedded `/`, `.`, and `..`.
      Mirrors bb_api.repo_path's Python validation so both
      surfaces enforce the same contract — no more silent
      `/repositories//foo` or `/repositories/../foo`.

  (e) cmd_pipeline_trigger empty-variables payload now omits the
      key entirely (was inconsistent: pattern-set path emitted
      `variables: []`, no-pattern path dropped variables). Matches
      Python's omit-when-empty contract.

  (f) cmd_pr_approve / cmd_pr_decline now check the bb_post exit
      code instead of discarding it — symmetric with (b). Approve /
      decline failures (already approved, declined PR can't be
      re-declined, etc.) now surface to the user.

  (h) cmd_pr_create omits `description` from the payload when
      empty/whitespace. Matches the Python omit-when-empty
      contract.

  (j) cmd_pr_diff now uses `curl -sfL` (was `-sf`) so a future
      redirect on this endpoint doesn't fail silently. `curl -L -u`
      does NOT resend credentials to a different host by default,
      so the cross-host credential-leak concern is already
      mitigated by curl's own semantics — no manual auth-stripping
      required here.

  Removed personal-data leak: help text previously named a specific
  personal workspace as the default. Generic example only now.

DEFERRED

  (g) cmd_pr_create hardcodes close_source_branch=true. Both
      surfaces share this default; exposing as a flag would be a
      coordinated bash + Python change. Defer to a future PR.

  (i) cmd_pr_merge uses PUT; modern Bitbucket REST docs use POST.
      Bitbucket accepts both for this endpoint and bash's PUT is
      the verified-working contract. Python also uses PUT for
      parity. Coordinated investigation deferred.

  (k) cmd_pr_merge unhashable-strategy raises TypeError instead
      of ValueError. n/a in bash — bash doesn't enforce types this
      way.

  (l) Bash repos_list / branches_list / vars_list / downloads_list
      don't expose BBQL `q` query filter or `count` parameters.
      Agent gets richer access via the MCP surface; bash retains
      its simpler human-driven shape. Defer.

Smoke tests run locally:
  - bash -n bb (clean)
  - bb help (all 5 new commands appear)
  - Python suite: 360/360 still green

DISPATCHER

  Added case-statement entries for: pr-unapprove (alias `prua`),
  pr-comment, branch, commits, whoami. Help text reorganised to
  show the new commands grouped with their family.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.7 review fixes (security + real bugs in new code)

Round 1 review surfaced 15 findings. Folding 8 — the critical-and-mine
items plus a few cheap symmetries; deferring 7 pre-existing items that
weren't directly touched.

Security:

  #1 cmd_whoami strips `user:token@` from origin URL before echoing.
    `git remote get-url origin` returns the raw URL including any
    embedded credentials (a common pattern for token-based git auth).
    whoami output is exactly the kind of thing users paste into bug
    reports and screenshots — can't have secrets in it. Matches the
    `_redact_url` helper in mcp_server.py / git_ops.py.

  #6 cmd_whoami probes `/repositories/{workspace}?pagelen=1` instead
    of `/user`. Atlassian's workspace-scoped tokens (the modern
    recommended type) reject /user with 401/403 while still serving
    /repositories/{workspace} correctly — the old probe would have
    told users with valid tokens to rotate them.

Real bugs in code this PR added or touched:

  #2 cmd_watch pagelen bumped 50 → 100. I missed this when bumping
    cmd_pipeline / cmd_pipeline_stop / cmd_logs in the same PR. A
    pipeline at positions 51-100 in the recent list would never
    match here and the watch loop would spin forever printing blanks.

  #3 cmd_pipeline_trigger variables built via jq instead of string
    concatenation. The previous shape produced invalid JSON when
    any value contained `"`, `\`, newline, or tab — the `jq -n
    --argjson vars` would then refuse to parse and abort after the
    user-visible `Triggering pipeline...` had already printed. JSON
    injection vector for any agent-supplied variable value.

  #4 cmd_pipeline_trigger `shift 3 || true` no longer masks the
    under-3-args case. Previously `bb trigger myrepo` (1 arg) left
    "myrepo" in $@ and the var-loop parsed it as a VAR=VALUE pair,
    sending `{"key":"myrepo","value":"myrepo"}` as a pipeline
    variable. Now guards on `$# -ge 3` and explicitly shifts the
    available count.

  #15 cmd_pipeline_trigger now rejects detached HEAD. `git rev-parse
    --abbrev-ref HEAD` exits 0 with the literal "HEAD" in detached
    state, so the `|| echo "main"` fallback didn't fire. Bitbucket
    400s on `ref_name=HEAD`. Clean local error instead.

Validation parity (symmetric with Python):

  #5 repo_path whitespace check now uses `[[:space:]]+` regex
    instead of `${var// }` (which stripped only ASCII space).
    Tabs / newlines / CRs now correctly rejected — true parity
    with Python's `.strip()`.

  #9 repo_path now uses `kill -TERM $$` on validation failure
    instead of `exit 1`. `exit 1` inside a `$(repo_path ...)` only
    terminates the subshell; the caller proceeded with an empty
    path and ate a confusing layered failure. `kill -TERM $$`
    terminates the parent script so validation actually halts.

  #7 + #14 cmd_pr_merge captures exit code AND validates strategy.
    Mirror of the (b)/(f) capture pattern this PR established for
    cmd_pr_approve / cmd_pr_unapprove / cmd_pr_decline /
    cmd_pipeline_stop. A merge conflict / failing-required-build /
    typo-strategy now surfaces with a labelled error including
    likely causes instead of `set -e` silently aborting.

Deferred (pre-existing, not touched by this PR's scope):

  #8 cmd_pr_comment_add stdin fallback — nice-to-have parity with
    cmd_pr_create's pattern. Defer.

  #10 build_number numeric validation in cmd_pipeline / cmd_logs /
    cmd_pipeline_stop — pre-existing. Defer.

  #11 cmd_branch_show jq null-safety on .target.hash / .target.date.
    Real but narrow (annotated-tag refs). Defer.

  #12 cmd_pr_diff curl failures print nothing — pre-existing
    asymmetry. Defer.

  #13 cmd_commits count validation — minor. Defer.

Smoke tests: bash -n bb clean, bb help renders, Python 360/360 still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.7 round-2 review fixes

Round 2 caught a regression I introduced in round 1 + several real
items the first pass missed.

Critical:

  #1 cmd_whoami origin URL fallback now actually fires when git
    fails. My round-1 fix piped through sed for credential redaction
    but the `||` attached to sed (which always exits 0 on empty
    input) instead of git. So `bb whoami` in a non-git directory
    printed `Origin:` blank instead of the promised `(no origin
    remote)` placeholder. Refactored to an `if remote=$(...); then
    ... else ... fi` block so the fallback branches on git's
    exit status.

  #2 cmd_pipeline_trigger variables parser switched to NUL delimiter
    (was newline). My round-1 fix replaced unsafe string-concat
    with `printf '%s\n' "$@" | jq -Rs 'split("\n") ...'` — but
    that still fragments any VAR=value where value contains a
    newline. `VAR=$'line1\nline2'` became `{VAR:line1}` plus a
    ghost `{line2:""}`. NUL is the safe delimiter; printf's
    `%s\0` + jq's `split(" ")` round-trip cleanly.

  #3 SECURITY: build_number now validated numeric in every command
    that splices it into a jq filter (cmd_pipeline, cmd_watch,
    cmd_logs, cmd_pipeline_stop). The previous raw interpolation
    let jq treat a non-numeric value as an undefined-function
    reference (typo path) OR — confirmed by the reviewer — as a
    jq-injection vector via `$ENV` exfil (e.g. `1) | $ENV.BB_TOKEN,
    .uuid` would land the token in pipeline_uuid → next URL path).
    New `_require_build_number` helper enforces `^[0-9]+$` at the
    boundary; rejected input never reaches jq.

  #4 cmd_pipeline_trigger `Variables: $*` echo now masks values.
    Previously emitted the user's raw VAR=value args to stdout —
    secrets in terminal scrollback, CI logs, shell-history wrappers,
    `bb trigger ... | tee` outputs. Now prints `KEY=***` for each
    variable. Contradicted the "NEVER echoes BB_TOKEN" claim in
    cmd_whoami.

  #8 repo_path whitespace check now catches MIXED whitespace, not
    just all-whitespace. My round-1 `^[[:space:]]+$` regex let
    `' acme '` through; `tr -d '[:space:]'` + equality check now
    rejects any value that differs from its stripped form. True
    parity with Python's `.strip()`.

Symmetry (rc-capture pattern applied to commands the first round
missed):

  #5 cmd_pr_create now captures the bb_post exit code. A 400
    (dest branch typo, duplicate PR open, source not pushed)
    surfaces as a labelled error with likely causes instead of
    `set -e` silently aborting after the "Creating PR:" banner.

  #6 cmd_pipeline_trigger now captures the bb_post exit code with
    the same labelled-error pattern. Common 4xx causes (protected
    branch, custom-pipeline-not-found, invalid variable shape)
    spelled out.

Deferred (pre-existing, narrow):

  #7 cmd_branch_show / cmd_commits print banner before fetch.
    Same pattern as #5/#6 but on read commands where the silent
    failure has less consequence; the curl exit code does at least
    surface non-zero. Defer to a follow-up if needed.

  #9 cmd_branch_show jq null-safety on .target.hash / .target.date.
    Real but narrow (annotated-tag refs). Defer.

  #10 cmd_pr_diff curl-silent failure. Pre-existing. Defer.

  #11 cmd_commits count not validated. Minor. Defer.

Smoke tests: bash -n clean, bb help renders, Python 360/360 still green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daniel-pittman added a commit that referenced this pull request May 27, 2026
* Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) (#2)

* Add bb_api.py + pyproject + Python test infrastructure (MCP foundation)

First slice of the MCP server work. Establishes the Python module layout
the MCP server will sit on top of, plus the test seam and CI extension
required to develop the rest of the modules with confidence.

bb_api.py — Python parallel of bb's config / auth / transport helpers.
Stdlib-only (urllib.request), so the MCP server's bootstrap stays small.
Public surface:
  - BBConfig: resolved credentials + workspace + API base URL.
  - load_config(): env > ~/.config/bb/config > .env > error; tolerant of
    bash `export KEY=value` and quoted values, matching the bash script's
    `source` semantics close enough for parity.
  - parse_remote_url() / detect_repo(): both HTTPS and SSH remote shapes;
    repos with dots in their slugs are preserved (bash regex truncates).
  - repo_path(): /repositories/{workspace}/{repo}, rejecting embedded
    slashes that would silently change the URL structure.
  - BBClient: thin transport with .get / .post / .put / .delete / .paginate.
    Pagination is defended against runaway cursors (max 200 pages) and
    stuck cursors (`next` URL unchanged between iterations), and refuses
    to follow `next` URLs that point at a different host than api_base.
  - BBApiError carries status + URL + body so callers can branch on
    expected non-2xx responses.

bb and bb_api are PARALLEL implementations of the same Bitbucket REST
contract — the MCP server does not shell out to bb. The bug-parity rule
is now documented in CONTRIBUTING.md: tests verify correct API behaviour,
and a defect surfaced by a Python test fixes both code paths.

pyproject.toml — flat `py-modules = ["bb_api"]` layout (additional modules
land in their own PRs). `mcp` is in an optional [mcp] extra so test
matrices stay lean.

tests/ — pytest with two responsibilities in conftest.py: set
BB_MCP_SKIP_BOOTSTRAP=1 before any mcp_server import to defend against
the future self-bootstrap venv pattern, and inject the repo root onto
sys.path. 36 tests covering config precedence (env > file > error),
remote-URL parsing (HTTPS + SSH + repos-with-dots), client transport
(URL + method + auth header + body shape per call, not just response
status), and pagination (walking, stuck-cursor defence, iteration cap,
cross-host refusal, empty page).

CI extension — Python 3.10/3.11/3.12 matrix added to the existing
`syntax` job. fail-fast: false so a failure on one version doesn't
mask another. Installs only pytest; the MCP runtime deps stay out of
unit-test CI because the suite mocks urllib.

CODEOWNERS — extended to gate review on bb_api.py, future bb_ops.py /
git_ops.py / mcp_server.py, pyproject.toml, and the agents/ directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply review fixes for Phase 4.1 foundation

Addresses 13 of the 15 findings from the Phase 4.1 code review. The
strongest finding (#1) was a parity bug where a test pinned the wrong
contract — exactly the anti-pattern the testing methodology warns about.

Parity / correctness fixes:

  1. dotenv > home-config precedence. Bash sources ~/.config/bb/config
     first, then `.env`, so `.env` wins. Python had the order inverted
     and the test pinned the inverted behaviour. Now matches bash; the
     test now verifies the correct contract.

  2. Empty env var no longer falls through to file. Switched from
     `env.get(key) or file_config.get(key)` to a membership test so
     `BB_TOKEN=""` is treated as an explicit (failing) value rather
     than absence — matches bash's `[[ -z ]]` check behaviour.

  3. URLError (DNS / TLS / connection / timeout) now wraps as
     BBApiError(status=0) instead of escaping the documented contract.

  4. Pagination cross-host refusal hardened against the prefix trick.
     `startswith(api_base)` is replaced with a separator-aware check
     so `https://api.bitbucket.org/2.0evil.example.com/...` no longer
     slips past the host-mismatch guard.

  5. BB_MCP_SKIP_BOOTSTRAP is now set unconditionally in conftest;
     `setdefault` was a no-op if a developer had `BB_MCP_SKIP_BOOTSTRAP=0`
     in their shell, defeating the guard silently.

  6. Paginated response missing the `values` key now raises rather
     than being silently treated as an empty page (which could leave a
     hole in the result set if `next` was still set).

  7. Non-string `next` raises BBApiError with a clear message rather
     than crashing on AttributeError.

  8. repo_path() rejects empty / whitespace-only / `.` / `..` values
     in addition to embedded `/`. Defends against path-traversal where
     `/repositories/../widget` normalises to the wrong workspace.

 10. Query values validated as scalar or list-of-scalar; nested dicts
     and arbitrary objects raise TypeError instead of being silently
     stringified by urlencode(doseq=True) into a misleading URL.

 11. BB_API_BASE trailing slash normalised on load. Prevents
     `api_base + "/path"` becoming `//path`.

 13. Replaced `assert ... is not None` with an explicit raise so
     `python -O` deployments retain the safety net if a future edit
     narrows the missing-keys check.

 14. detect_repo test now asserts the exact subprocess call shape
     (command, capture_output, text, cwd, check). Previously the test
     was canned-output-only and a refactor to a different git command
     would have passed silently.

 15. BBClient methods accept a per-call `timeout=` override for
     endpoints that legitimately take longer (log streaming, large
     diffs). Default still comes from the client.

Tightened docstrings:

   - BBApiError now documents `status=0` as the sentinel for
     non-HTTP transport / malformed-response errors so callers know
     to gate HTTP-status dispatch on `status > 0`.

Deferred with rationale:

   #9 (remote URL parser accepts non-Bitbucket hosts): intentional
   for enterprise / self-hosted Bitbucket support. The bash side has
   the same loose parsing. No restriction added.

  #12 (BBApiError used as transport-error wrapper): the dedicated
   exception class would expand the surface; the docstring update
   above is sufficient until a real consumer needs to discriminate.

Test count: 36 -> 55. All passing locally on Python 3.11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply follow-up review fixes (security: no-redirect default, test rigor)

Follow-up review on commit e2567e8 verified all 13 prior fixes landed
correctly and surfaced 6 new findings, 3 of them worth taking before
more code lands on top of this foundation.

Security:

  new #1. Default BBClient opener no longer follows 3xx redirects.

    urllib's default HTTPRedirectHandler resubmits the original request
    against the Location URL — INCLUDING the Authorization header. urllib
    does not strip auth on cross-origin redirects, so a misconfigured
    proxy or DNS hijack pointing api.bitbucket.org at another host would
    have leaked the Basic auth credential to that host on the first 3xx.

    `bb` doesn't have this exposure: `curl -sf` doesn't follow redirects
    without -L, and the one command that needs to (`bb logs`) opts in
    explicitly. Mirror that posture here.

    Implementation: _NoRedirectHandler subclasses HTTPRedirectHandler and
    returns None from redirect_request, which causes urllib to surface
    the 3xx as an HTTPError that our existing handler wraps into
    BBApiError. Two tests cover it — one introspects the default opener's
    handler chain, the other unit-tests the handler's redirect_request
    contract directly.

    When bb_ops adds pipeline_logs (which legitimately needs to follow a
    307 to S3), that work will require a separate code path that strips
    Authorization on cross-host hops rather than weakening this default.

Test rigor:

  new #2. test_paginate_walks_pages now asserts the URL for each of the
    three calls (base, base?page=2, base?page=3). Without this, a
    regression that refetched the original URL three times would still
    consume the three queued responses and pass — exactly the
    mock-returns-success-regardless-of-request anti-pattern.

Cleanup:

  new #4. Removed the dead `page_query = None` assignment in paginate().
    The variable was only read on the first iteration (when url is None),
    and the reset ran after `url` had been assigned, so it was never
    observed. Removed and the closure variable `query` is used directly.

Documentation:

  new #5. parse_remote_url() now has an IMPORTANT caller-contract note:
    the function does not anchor to a Bitbucket host (intentional for
    enterprise / self-hosted Bitbucket support). Callers feeding it
    untrusted external URLs (webhook payloads, etc.) must verify the
    host upstream before calling. If a future consumer needs that
    guarantee, the right shape is a `strict=True` mode here, not
    duplicated checks at every call site.

Deferred:

  new #3 (BB_API_BASE shape validation) — low-impact today; will revisit
    if a self-hosted Bitbucket Server use case lands a typo bug.

  new #6 (older _fake_runner tests still allow kwargs drift) — minor;
    the new recording-style test_detect_repo_invokes_git_remote_get_url
    is the load-bearer, the looser fixtures around it are acceptable.

Test count: 55 -> 57. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.2: bb_ops pipeline operations + cross-host-safe log fetch (#3)

* Add bb_ops pipeline operations + cross-host-safe log fetch

Phase 4.2 of the MCP server work. Adds the pipeline subset of bb_ops
(list, show, steps, trigger, stop, logs), plus a new bb_api transport
method for following redirects safely. 33 new tests, 90 total.

bb_ops.py — public surface:

  pipelines_list(client, ws, repo, *, count=10, sort, branch=None)
    Paginates pipelines/?sort=-created_on. Honours `count` exactly by
    walking pages as needed; clamps per-page to Bitbucket's 100 cap.
    Optional branch filter via target.ref_name.

  pipeline_show(client, ws, repo, build_number)
    Resolves build_number -> uuid then GETs /pipelines/{uuid}.

  pipeline_steps(client, ws, repo, build_number)
    Same uuid lookup, then GETs /pipelines/{uuid}/steps/.

  pipeline_trigger(client, ws, repo, *, branch, pattern=None, variables=None)
    Builds the POST payload (target + optional custom-pattern selector
    + optional variables-as-list-of-{key,value}). Variables accept dict
    or iterable-of-pairs at the boundary. Rejects non-string values
    (Bitbucket's contract).

  pipeline_stop(client, ws, repo, build_number)
    POSTs to /pipelines/{uuid}/stopPipeline. Returns the API response
    (typically None on 204). The bash equivalent discards the response
    with `> /dev/null`, masking failures — flagged as a 4.7 parity fix.

  pipeline_logs(client, ws, repo, build_number, step_index, *, timeout=120)
    Uses the new BBClient.fetch_redirected_text() path to follow
    Bitbucket's 307-to-S3 redirect on the log endpoint with
    Authorization stripped on cross-host hops.

bb_api.py — BBClient.fetch_redirected_text():

  Necessary because the log endpoint sometimes returns 200 with an
  inline log body and sometimes returns 307 to a signed S3 URL. The
  default opener refuses 3xx (credential-safety), so this method opens
  with the default opener, catches HTTPError 3xx, extracts Location,
  and follows up to max_redirects hops. CRITICAL: when the next hop's
  host differs from api.bitbucket.org, Authorization is removed from
  the request headers — Basic credentials never reach S3 or any other
  redirect target. Once stripped, the credential stays stripped for
  the rest of the chain.

  Bash's `bb logs` uses `curl -sfL` which (depending on curl version)
  may send Authorization on cross-host redirects. Whether that's
  presently broken in bash is a parity question for 4.7; Python is
  correct by construction.

Test coverage (33 new):

  _wrap_uuid: brace normalisation, URL encoding, whitespace tolerance.
  _strip_uuid_braces: bare/braced/empty inputs.
  pipelines_list: single page, count > 100 (pagination), mid-page
    cutoff, branch filter URL shape, non-positive-count rejection
    without emitting a request.
  _resolve_pipeline_uuid: first-page hit, multi-page walk, not-found
    raises BBOpNotFound (distinct from BBApiError), invalid build_number
    rejection, scan_limit cap respected.
  pipeline_show: uuid lookup THEN show by uuid; URL uses %7B...%7D
    bracket encoding (bash contract).
  pipeline_steps: uuid lookup THEN list steps.
  pipeline_trigger: default-pipeline payload shape, custom-pipeline
    payload shape (selector embedded), variables-as-dict shape,
    variables-as-iterable-of-pairs shape, empty-variables omits the
    key (not an empty list), non-string variable value rejected,
    empty pattern rejected, empty branch rejected.
  pipeline_stop: POST to /stopPipeline with no body; 204 returns None.
  pipeline_logs: inline body (no redirect), 307 follow with
    Authorization stripped on cross-host (the security-critical test),
    too-many-redirects raises, missing Location raises.
  _resolve_step_uuid: valid index, out-of-range raises BBOpNotFound,
    negative index rejected.

Every HTTP-touching test asserts request URL, method, and body shape
- not just the response value. Guards against the
mock-returns-success-regardless-of-body anti-pattern.

CI / packaging:

  pyproject.toml py-modules now includes bb_ops.
  ci.yml syntax-checks bb_ops.py alongside bb_api.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.2 review fixes (robustness + cleanup)

Addresses 6 of 8 follow-up suggestions from the Phase 4.2 code review.
Approve-with-suggestions; nothing blocking, all minor robustness or
clean-up items.

  #1 (doc). Soften the cross-host-auth narrative in
    fetch_redirected_text. `curl -L -u` also strips credentials on
    cross-host hops; only `-H "Authorization: ..."` would leak. Bash
    is safe today via `-u`. Python's explicit strip is the symmetric
    future-proof version (in case the MCP server ever needs
    header-based auth).

  #4 (robustness). Case-insensitive cross-host comparison in
    fetch_redirected_text. RFC 3986 §3.2.2 makes hostnames
    case-insensitive; a redirect to API.bitbucket.org would have
    triggered a needless auth-strip on a capitalisation-only diff.
    Now lowercased both at construction and on each new host.

  #5 (resource hygiene). Explicit `e.close()` on the redirect-path
    HTTPError so the underlying socket is released immediately
    rather than waiting on GC under load. Wrapped in try/finally so
    a raise from the redirect-extract logic still closes the
    response.

  #6 (dead return). _resolve_step_uuid now returns just the uuid
    rather than a (uuid, name) tuple. The name was never consumed
    by pipeline_logs. Callers that want the name should fetch
    pipeline_steps() themselves. Simpler API, no behavioural change.

  #7 (nit). Dropped redundant `safe='-'` from _wrap_uuid's quote()
    call. Dash is in urllib's always-safe unreserved set.

  #8 (naming). Renamed pipeline_steps_raw -> _pipeline_steps_by_uuid
    (leading underscore to match the rest of the file's
    internal-helper convention; the function was only called
    module-internally).

Bash bug newly surfaced by the review, queued for 4.7:

  cmd_pipeline_trigger drops `variables` entirely when `pattern` is
  empty (bb:382-391: the no-pattern branch builds a payload without
  the variables array, even when the caller passed VAR=value pairs).
  Python pipeline_trigger handles this consistently; tests pin it.

Deferred:

  #2: bash bug above — fixed in 4.7 not here (Python-only PR).
  #3: empty-variables payload shape (omit vs []). Python omits;
  bash emits []. Both are valid against Bitbucket. Picking
  one and aligning is a 4.7 follow-up.

Test count unchanged (90); test_resolve_step_uuid updated to assert
the new single-uuid return.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Tighten finally-close comment per re-review nit

The finally runs on every HTTPError path (both 3xx and non-3xx),
not just redirects. Reword the comment so a future reader doesn't
assume the close is redirect-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.3: bb_ops pull-request operations + tests (#4)

* Add bb_ops pull-request operations + tests

Phase 4.3 of the MCP server work. Adds the PR subset of bb_ops with 33
new tests, 123 total.

Public surface:

  prs_list(client, ws, repo, *, state="OPEN", count=25)
    State filter + paginated count, matching bash defaults. Walks pages
    when count > 100 (Bitbucket's cap).

  pr_show(client, ws, repo, pr_id)
    GET /pullrequests/{id} with positive-int validation on pr_id.

  pr_activity(client, ws, repo, pr_id, *, count=50)
    Walks the activity stream — used by bash to surface approver names,
    surfaced as its own op so the MCP agent can render its own view.

  pr_create(client, ws, repo, *, title, source_branch,
            destination_branch="main", description="",
            close_source_branch=True, reviewers=None)
    Builds the POST payload matching Bitbucket's contract. Optional
    reviewers as a list of UUID strings (Bitbucket expects
    `[{"uuid": "..."}, ...]`). Empty description / empty reviewers
    omitted from the payload (Python convention; bash's "always
    include empty description" is a 4.7 alignment item).

  pr_approve(client, ws, repo, pr_id)
    POST /approve. Returns the approval record — bash discards it with
    `> /dev/null`.

  pr_unapprove(client, ws, repo, pr_id)
    DELETE /approve. Not exposed by bash today (4.7 parity gap).

  pr_merge(client, ws, repo, pr_id, *, strategy="merge_commit",
           close_source_branch=True, message=None)
    PUT /merge with strategy + close_source_branch + optional message.
    Strategy validated at boundary: merge_commit / squash / fast_forward.
    Mirrors bash's PUT verb (Bitbucket has historically accepted both
    PUT and POST; flagged for 4.7 investigation against current REST
    docs).

  pr_decline(client, ws, repo, pr_id)
    POST /decline.

  pr_diff(client, ws, repo, pr_id, *, timeout=120)
    Routes through fetch_redirected_text so the cross-host-auth-strip
    protection applies if Bitbucket ever redirects this endpoint. Bash
    uses `curl -sf` without `-L`, so it would fail visibly in that
    case — Python continues to work safely (parity divergence noted
    for 4.7).

  pr_comments_list(client, ws, repo, pr_id, *, count=100)
    Paginated comments listing matching bash's default pagelen.

  pr_comment_add(client, ws, repo, pr_id, body)
    POST /comments with payload {"content": {"raw": "<text>"}}. Not
    exposed by bash today (4.7 parity gap).

Validation: every function rejects invalid pr_id (must be positive int)
and non-string body / message values at the Python boundary. Tests
assert `opener.calls == []` after a rejection to prove no request was
emitted on bad input.

Bash parity items newly surfaced (queued for 4.7):

  - cmd_pr_approve / cmd_pr_decline discard API responses with > /dev/null.
  - cmd_pr_create hardcodes close_source_branch=true; expose as a flag.
  - cmd_pr_create always includes an empty description; align with
    Python's omit-when-empty.
  - cmd_pr_merge uses PUT; verify against current docs and align both
    sides on one verb.
  - cmd_pr_diff uses curl -sf without -L; Python's fetch handles
    redirects safely.

Test discipline: every HTTP-touching test asserts URL, method, AND body
shape, not just response value. Rejection tests assert opener.calls == []
to prove no network IO on bad input.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.3 review fixes (bool-is-int, type checks, test coverage)

Addresses all 12 findings from the code review. The strongest finding
was a Python footgun: bool is a subclass of int, so a bare
isinstance(x, int) check accepts True/False and stringifies them in
URLs as 'True'/'False' rather than '1'/'0'. The cascading fix closes
six findings at once.

High-severity:

  #1 _validate_pr_id accepted bool. Now uses a shared _is_positive_int
    helper that explicitly excludes bool. Same fix applied to every
    count validator (prs_list, pr_activity, pr_comments_list,
    pipelines_list, _resolve_pipeline_uuid) and the step_index check.
    test_rejects_invalid_pr_id extended with True/False; the new
    test_every_pr_op_rejects_bad_pr_id matrix runs every PR-id-taking
    op (8 functions × 7 bad values) so a future refactor that drops
    _validate_pr_id from any one op gets caught.

  #2 pr_create(reviewers="alice-uuid") would have iterated the string
    char-by-char and built [{"uuid":"a"}, {"uuid":"l"}, ...]. Added an
    explicit isinstance(reviewers, str) early-reject with a hint
    ("did you mean [...]?"). Test added.

  #3 Same bool-is-int slip on every count parameter (fixed by the
    shared _is_positive_int helper above).

Medium:

  #4 pr_create now type-checks `description`; a dict or other
    non-string raises rather than slipping into the JSON payload.
    Test added.

  #5 _KNOWN_PR_STATES docstring claim that Bitbucket accepts comma-
    separated `?state=OPEN,MERGED` was wrong; that shape returns 400
    or empty. Docstring corrected to point at the BBQL `q` parameter
    for compound filtering.

  #6 close_source_branch now type-checked in both pr_create and
    pr_merge; a string like "yes" no longer slips into the JSON
    payload. Tests added.

Low:

  #7 pr_merge(message="") was accepted; now rejected for consistency
    with pr_comment_add(body=""). Empty merge-message would leave the
    merge commit's subject line blank.

  #8 pr_diff docstring softened — the "Python continues to work
    safely" framing didn't acknowledge that the returned body would
    be the redirect target's content if a redirect ever materialised.
    Now spells out that today the diff endpoint doesn't redirect and
    flags the behavioural divergence from bash if it ever does.

  #9 Whitespace-only description (`description="   \n  "`) now omitted
    from the payload via description.strip() check. Test added.

Test gaps closed:

  #10 True/False added to test_rejects_invalid_pr_id.
  #11 New test_every_pr_op_rejects_bad_pr_id matrix covers every
    PR-id-taking op (9 functions × 7 bad values = 63 cases).
  #12 TestPrActivity now has count-walks-pages, invalid-count, and
    invalid-pr_id tests, symmetric with TestPrsList.

Test count: 123 -> 141. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.3 round-2 review fixes

Addresses 6 of 7 findings from the round-2 review. The strongest finding
was a real boundary-validation gap: _KNOWN_PR_STATES was added in the
previous fix round but never actually referenced — defined-and-documented
as the prs_list state guard, yet the function only type-checked. Closes
the gap.

High:

  #1 prs_list now validates state against _KNOWN_PR_STATES. Typos
    ("OPENED"), case bugs ("open"), and unsupported compound forms
    ("OPEN,MERGED" — which Bitbucket does NOT accept on the simple
    ?state= filter; that needs ?q=) all fail at the boundary instead
    of burning an API call for a 400 / empty result. Symmetric with
    pr_merge's _VALID_MERGE_STRATEGIES check.

  #2 Whitespace-only string payloads no longer slip through. Replaced
    `not value` with `not value.strip()` in five places:
      - pr_create.title / source_branch / destination_branch
      - pr_merge.message
      - pr_comment_add.body
    Without this, a title like " " or a comment body like "\n\t"
    would have shipped to Bitbucket as accepted-but-visually-blank
    fields. Error messages now say "non-empty, non-whitespace".

Test coverage:

  #3 TestPrsList.test_rejects_non_positive_count now includes True /
    False, matching TestPrActivity. Without symmetry, a refactor that
    reverted prs_list's _is_positive_int(count) check to a naive
    isinstance(count, int) would have regressed silently.

  #4 TestPrCommentsList now has test_rejects_non_positive_count with
    the same coverage shape (was happy-path only).

  #5 TestPrDiff.test_follows_cross_host_redirect_and_strips_auth pins
    the redirect-with-auth-strip behaviour at the pr_diff layer. Mirrors
    test_pipeline_logs.test_follows_s3_redirect_and_strips_auth — without
    it, a regression that wired pr_diff to plain client.get (which
    would 5xx on any future redirect because the default opener
    refuses 3xx) would not break any test in this file.

  #7 TestPrDiff.test_returns_raw_diff_text now also asserts the first
    hop carries Authorization. A regression that dropped the Basic
    header from the diff-fetch path would otherwise pass silently.

Parametrized whitespace cases added across pr_create / pr_merge /
pr_comment_add tests so each field's reject-on-whitespace is verified
independently.

Deferred (filed under 4.7):

  #6 pr_merge(strategy=[1,2]) raises TypeError (unhashable) rather
    than ValueError. Defensive-coding nit; either error surface stops
    the bad call. Defer to the parity sweep.

Test count: 141 -> 155. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.3 round-3 review fixes

Three small test/consistency fixes from the round-3 review. The PR is
in good shape — reviewer explicitly called these out as non-blockers,
but they're cheap and close the consistency loop.

  - pr_merge strategy validator now gates on isinstance(str) before
    the frozenset `in` check. Unhashable strategy values (list, set,
    dict) previously raised TypeError instead of the documented
    ValueError, breaking the "every boundary failure is ValueError"
    convention. Test extended to cover list/set/None/int alongside
    the original unknown-string case.

  - test_rejects_invalid_reviewer (formerly _empty_reviewer_uuid) now
    parameterizes over empty-string, None, int, and the most plausible
    caller mistake — pre-shaping as Bitbucket's payload contract
    `{"uuid": "..."}`. The validator already rejects all four; this
    pins coverage on the isinstance branch.

  - test_follows_cross_host_redirect_and_strips_auth now also asserts
    the first hop carries Authorization (a regression that dropped
    auth on the initial Bitbucket request would otherwise pass) and
    that the Location was actually followed (not refetched original).
    Symmetric with round-2 finding #7 on the no-redirect path.

Test count: 155 -> 163. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.4: bb_ops repos / branches / vars / downloads / commits + tests (#5)

* Add bb_ops repos / branches / vars / downloads / commits + tests

Phase 4.4 of the MCP server work. Closes out the read-oriented surface
on bb_ops (Phase 4.5 will add git context tools, then 4.6 wires
everything into the MCP server). 48 new tests, 211 total.

Public surface:

  repos_list(client, workspace=None, *, count=100, sort, query=None)
    Defaults workspace to client.config.workspace; pass explicit to
    query a different one (bash only ever uses BB_WORKSPACE — parity
    gap). Optional BBQL `q` filter for name-substring etc. (also
    bash gap).

  repo_show(client, ws, repo)
    GET /repositories/{ws}/{repo} — full metadata.

  branches_list(client, ws, repo, *, count=50, sort, query=None)
    Most-recently-updated first, matches bash default. Optional `q`
    filter.

  branch_show(client, ws, repo, name)
    Fetch a single branch by name. URL-encodes the name so
    `feat/widget` isn't interpreted as a sub-resource path. Not in
    bash today — useful for the agent's "does this branch exist?"
    pre-flight check before creating a PR.

  vars_list(client, ws, repo, *, count=100)
    Pipeline configuration variables. Bash masks secured values to
    `********` in its display layer; Python returns the raw dicts
    (Bitbucket sends `"value": null` for secured entries — MCP tools
    surface the `secured` flag so callers don't misread null as unset).

  downloads_list(client, ws, repo, *, count=25)
    Repository download artifacts.

  commits_list(client, ws, repo, *, branch=None, count=10)
    With branch=None, lists across all branches via /commits.
    With branch=NAME, lists commits reachable from that branch via
    /commits/{name}. Branch names URL-encoded. Not in bash today.

Bash parity items newly surfaced (queued for 4.7):

  - bash repos_list / branches_list / vars_list / downloads_list don't
    expose BBQL `q` query filter or `count` parameters — agent has
    richer access by default.
  - bash has no branch_show, commits_list — add as new subcommands
    alongside the previously-queued pr_unapprove / pr_comment_add /
    whoami.

Test discipline (same as prior bb_ops PRs):

  - Every HTTP test asserts URL, method, AND body shape per call.
  - Rejection tests assert opener.calls == [] (no network IO on bad input).
  - Bool-is-int (True/False) rejection covered on every count parameter.
  - Whitespace-only rejection on every required-string parameter.
  - Branch-name slash URL-encoding verified specifically in branch_show
    and commits_list.

Test count: 163 -> 211. All passing locally on Python 3.11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.4 review fixes

Folds in all 5 findings from the code review (none blocking; reviewer
explicitly called the PR "clean, well-disciplined"):

  1. repos_list now applies the same workspace validation as
     bb_api.repo_path — rejects embedded '/', '.', '..' in addition to
     empty/whitespace. Without this, `workspace="acme/widget"` would
     silently build a single-repo endpoint URL and paginate against a
     response that lacks `values` — confusing failure mode the boundary
     validator exists to prevent everywhere else. repos_list is the
     only op that doesn't route through repo_path, so the check is
     duplicated here.

  2. Removed redundant local `from urllib.parse import quote as _quote`
     in branch_show and commits_list; both now use the module-level
     `quote` import that's been there since 4.2.

  3. Updated stale module docstring — was still saying "This PR
     (Phase 4.2) adds pipeline operations." Now reflects current
     scope (pipelines + PRs + repos/branches/vars/downloads/commits)
     and references the companion git_ops module that 4.5 will add.

  4. TestBranchesList.test_query_filter now asserts the exact
     URL-encoded form of the BBQL query (matching the style of
     TestReposList.test_query_filter). The previous `"q=" in url`
     would have passed even if the query string were silently
     mangled.

  5. Three new tests close coverage gaps:
     - TestBranchesList.test_rejects_empty_query — pin the
       empty/whitespace rejection that was implemented but untested.
     - TestCommitsList.test_count_walks_pages — pin commits_list's
       multi-page pagination (it has the most complex path shape
       of the bunch with the branch-vs-no-branch split).
     - TestReposList.test_rejects_workspace_with_slash +
       test_rejects_workspace_dot_segments — pin the new workspace
       validation from finding #1.

Skipped optional suggestions:
  - vars_list `query=` filter (no concrete agent use case yet).
  - _collect_paginated() helper (5x repetition is still inline-readable;
    refactor when a 6th list op materialises).

Test count: 211 -> 218. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.5: git_ops — git-context wrappers for the MCP agent (#6)

* Add git_ops — git-context wrappers for the MCP agent

Phase 4.5 of the MCP server work. Five subprocess-mocked wrappers the
MCP agent uses to resolve "current branch", "is the tree clean?",
"workspace/repo from origin", "recent commits", and "what's
uncommitted?" — the context tools that have to run before any bb_ops
call. 38 new tests, 256 total.

git_ops.py public surface (stdlib-only, subprocess + bb_api.parse_remote_url):

  git_current_branch(path?) -> str
    `git rev-parse --abbrev-ref HEAD`. Detached HEAD returns the literal
    string "HEAD" (matches git's own output; callers detect detached
    state explicitly).

  git_remote_repo(path?) -> (workspace, repo_slug)
    Routes through bb_api.parse_remote_url for parity with
    bb_api.detect_repo's loose-host parsing (intentional — enterprise
    Bitbucket Server uses non-bitbucket.org URLs). Returns BOTH the
    workspace AND repo, unlike bb_api.detect_repo which returns only
    the slug.

  git_status(path?) -> dict
    Parses `git status --porcelain=v2 --branch --untracked-files=normal`.
    Returns a structured snapshot:
      branch, upstream, ahead, behind, clean, staged, modified,
      untracked, unmerged.
    Handles ordinary (type "1"), rename/copy (type "2", with
    new-path\told-path tab-separator), and unmerged (type "u") lines.
    Header lines parsed for branch.head / branch.upstream / branch.ab.

  git_recent_commits(path?, *, count=10, ref="HEAD") -> list[dict]
    `git log --pretty=format:%H<US>%h<US>%s<US>%an<US>%aI`. Uses
    U+001F (Unit Separator) as the field delimiter — a control
    character that cannot appear in commit subjects, so we never have
    to escape or parse-around content. Each entry:
      {sha, short, subject, author, date}.

  git_uncommitted_changes(path?) -> dict
    Three subprocess calls:
      git diff --cached       -> staged_diff
      git diff                -> working_diff
      git ls-files --others   -> untracked_files
        --exclude-standard
    Diffs returned as raw unified-diff text; agent can show verbatim
    or parse further.

Error handling:

  GitOpError carries the failing command + returncode + stderr (truncated
  in the message). Distinct from BBApiError so the MCP layer can render
  "git failure" vs "Bitbucket failure" differently.

  FileNotFoundError on `git` binary missing -> GitOpError(127, ..., "git
  executable not found on PATH"). Matches bb_api.detect_repo's behaviour.

Test discipline (same as bb_api / bb_ops):

  - Every test asserts the EXACT subprocess shape (command + args +
    cwd + capture_output / text / check). A refactor swapping
    `git status --porcelain=v2` for v1 would change every parse
    result but the canned-output tests would still pass without this.
  - Parametrised tests on remote-URL shapes (https, ssh, embedded
    user-info, self-hosted Bitbucket Server) — verify the loose-host
    parser handles every shape the bash side accepts.
  - git_status parser tested against five realistic v2 captures:
    clean, ahead/behind, no-upstream, mixed (staged + modified +
    untracked), renamed, unmerged.
  - git_recent_commits format string pinned: %H, %h, %s, %an, %aI all
    present + U+001F delimiter. Malformed log lines skipped
    defensively rather than crashing.
  - Invalid-count / empty-ref rejection on git_recent_commits with
    True/False included (bool-as-int trap).
  - git_uncommitted_changes verifies the EXACT three-call sequence
    with explicit `cwd=` propagation on all three.

CI / packaging:

  pyproject.toml py-modules += "git_ops".
  ci.yml syntax-check += "git_ops.py".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.5 review fixes

Addresses 5 of 6 findings from the round-1 review (the 6th — str vs
os.fspath — is pre-existing in bb_api and would be unrelated churn in
this PR).

Security:

  #1 git_recent_commits(ref=...) now rejects refs starting with '-'.
    Without this, an agent-supplied ref like `ref='--all'` would be
    parsed by git as an option flag, not a ref — silently replacing
    the call with a glob over all branches. `ref='-h'` would print
    help and return exit 0 with help text in stdout, which the
    parser would skip. The MCP agent is the consumer, and agent
    inputs are user-influenced, so option-injection at this boundary
    is a real concern. Belt + suspenders: also added `--` terminator
    to the `git log` invocation so anything past it is parsed as a
    positional ref regardless.

Correctness:

  #3 GitOpError(returncode=0) for parse failures was misleading. A
    real git exit is always >= 0, so callers branching on
    `err.returncode == 0 -> success` would have been confused. Now
    parse-failure paths use the GIT_PARSE_ERROR_RETURNCODE sentinel
    (-1) and the class docstring documents the convention. Both
    parse-error tests now assert on the sentinel.

Defensive:

  #4 git_recent_commits added an upper bound of _MAX_LOG_COUNT (1000)
    on `count` so an agent that confabulates `count=1_000_000`
    doesn't silently pull a million records back across the MCP
    boundary.

Documentation:

  #2 git_status docstring now documents the C-quoted-pathname
    limitation as a known caveat (porcelain v2 line-oriented output
    quotes pathnames with newlines/tabs/quotes/control-chars). Fixing
    properly requires switching to `-z` + NUL-split parser — deferred
    until a real bug report shows up.

  #6 _LOG_FIELD_SEP comment now says "almost never" rather than
    "cannot" — git stores arbitrary bytes so U+001F technically could
    appear in a commit subject; the parser already handles it
    defensively.

New tests:

  - 4 parametrized cases for ref-starts-with-dash rejection.
  - count-above-cap rejection + count-at-cap accepted (boundary
    verification).
  - Spaces-in-filename preserved (porcelain v2 type-1 line uses
    split(" ", 8) — the suggested regression guard).
  - Existing parse-failure tests now assert returncode is the
    sentinel.

Skipped: #5 (str vs os.fspath nit). Pre-existing in bb_api.detect_repo,
would be unrelated churn here; defer to a separate consistency PR.

Test count: 256 -> 263. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.5 round-2 review fixes

Substantive round-2 review surfaced 10 findings (medium to very-low),
several with concrete failure modes the MCP server would hit in
production. All 10 addressed plus the 3 test-coverage gaps.

Medium-severity correctness:

  #1 git diff no longer leaks ANSI color codes. _run_git now prepends
    `git -c color.ui=never` to every invocation, overriding any
    user's `color.ui = always` gitconfig that would otherwise emit
    `\x1b[31m...` escapes into pipe-captured output. Diffs and log
    output are now safe to ship to the MCP agent verbatim.

  #2 Detached-HEAD reporting normalised. `git_current_branch` already
    returned the literal "HEAD" for detached state; the porcelain v2
    parser returned "(detached)" for the same state. Now both return
    "HEAD" so cross-checks between the two functions agree on the
    underlying state. Test added.

  #3 subprocess.run now has timeout=30s with TimeoutExpired wrapped
    as GitOpError(returncode=GIT_PARSE_ERROR_RETURNCODE). A wedged git
    (held index.lock, credential-helper prompting on stdin, NFS server
    gone) can no longer hang the MCP server thread indefinitely. Test
    added.

  #4 subprocess.run now passes encoding="utf-8", errors="replace"
    explicitly. text=True alone defers to locale.getpreferredencoding(),
    which on minimal Docker / cron / systemd contexts is ASCII or C —
    a UTF-8 filename or author name would crash inside subprocess.run
    before _run_git could wrap it. Explicit UTF-8 + replace-on-error
    means we never lose to a locale-restricted host.

Low-severity correctness / data loss:

  #6 splitlines() replaced with split("\n") in three parsers. The
    broader splitlines() also breaks on \r / \v / \f / U+0085 /
    U+2028 / U+2029, any of which in a commit subject (or path) would
    fragment one logical record into multiple "lines" that each fail
    the per-line guards and silently vanish. Test added covering
    carriage-return-in-subject preservation.

  #7 git_uncommitted_changes diffs now capped at _MAX_DIFF_BYTES (1 MiB)
    per diff, with an explicit truncation marker so the caller (the
    MCP agent) knows what happened and can fall back to `git diff
    --stat`. Without this cap, a stray multi-GB staged blob would
    OOM-kill the MCP server. Test added.

  #8 git_recent_commits parser now skips records with empty SHA. A
    line of pure separators ("\x1f\x1f\x1f\x1f") passes the
    parts-count guard and would otherwise append a degenerate
    {"sha":"", "short":"", ...} entry. Test added.

Very-low defensive:

  #9 branch.ab parser now validates sign prefixes ("+N -M") before
    stripping. A malformed "-3 +1" line would otherwise propagate
    bogus values through as negative ahead / positive behind. Test
    added.

  #10 Type-1/2 XY field bounds-checked. A single-char XY from a
    corrupted git output stream no longer raises IndexError outside
    GitOpError; the malformed line is skipped defensively. Test
    added.

Docstring:

  #5 _LOG_FIELD_SEP comment further softened — explicitly notes the
    trade-off ("rare data loss on pathological commit" vs "robust
    separator that won't collide with common subject content") and
    points to `git log -z` as the genuinely-safe variant if this
    ever bites a real user.

Test-coverage gaps closed:

  - test_unborn_branch_raises_giterror — pin real-world behaviour
    where `git log` on a freshly-init'd repo exits 128 (the previous
    test_empty_output_returns_empty_list only exercised the mocked
    exit-0-empty-stdout path).
  - test_passes_cwd_when_path_given added to git_remote_repo,
    git_status, and git_recent_commits — was previously only on
    git_current_branch and git_uncommitted_changes.
  - test_detached_head_normalised_to_HEAD pins the new cross-function
    sentinel agreement.

All subprocess-shape assertions updated for the new `git -c
color.ui=never` prefix (argv indices shifted by 2).

Skipped:
  - The U+001F dropping case (#5 partial) — already returns [] for
    a pathological subject; the docstring softening is the right
    response without a real-user bug report.

Test count: 263 -> 274. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.5 round-3 review fixes (hardening)

Reviewer flagged the PR as "in good shape to merge" but surfaced 5
hardening items, two of which they explicitly called out as
worth-doing-before-MCP-wires-this-up. All 5 addressed plus the
corresponding test coverage.

Security:

  #2 Origin URL credential redaction. CI pipelines and credential-
    helper-less dev setups commonly embed auth in remote URLs
    (`https://x-token-auth:abcd@bitbucket.org/...`). On the rare
    unparseable URL, the secret would have landed in the GitOpError
    message, flowed up through MCP into agent context, and likely
    ended up in downstream logs. New `_redact_url_creds()` strips
    `://user:token@` -> `://[redacted]@` before any URL is echoed
    in an error string. Test added that asserts the secret token
    is absent from the raised error and that "[redacted]" appears.

Reliability:

  #3 stdin / env hardening on every git call. `_run_git` now passes
    `stdin=subprocess.DEVNULL` so a credential prompt fails
    immediately with EOF rather than blocking on the MCP server's
    inherited stdin. Environment also adds GIT_TERMINAL_PROMPT=0
    (git refuses to prompt at all) and GIT_ASKPASS="" (no GUI
    askpass dialog). Belt + suspenders alongside the 30s timeout —
    what was previously a 30-second wedge into "timed out" is now
    an instant clear error. Subprocess-shape assertions updated.

UX:

  #1 FileNotFoundError disambiguation. subprocess.run raises
    FileNotFoundError for TWO distinct cases — git binary missing
    OR cwd directory missing — which the previous catch conflated
    into "git executable not found on PATH". Once the MCP agent is
    passing paths (e.g. a worktree that's since been removed), the
    user would chase their PATH config when the real cause is the
    bad directory. New code inspects `e.filename`: if it matches
    the cwd, raises with "path does not exist: <cwd>" instead.
    Test added covering the cwd-missing path.

  #4 Unborn-branch sentinel normalisation. Round 2 normalised
    `# branch.head (detached)` -> "HEAD" so git_status and
    git_current_branch agree on detached state. The same shape of
    asymmetry existed for unborn branches: porcelain v2 reports
    `# branch.oid (initial)` + `# branch.head main` (the would-be
    branch) while git_current_branch raises GitOpError on the same
    repo. Parser now tracks an `is_unborn` flag (line order
    independent — branch.oid / branch.head ordering in v2 output
    isn't documented as stable) and overrides branch to "HEAD" at
    end-of-loop so the two functions agree on "weird state, not a
    regular branch." Test added.

Defensive:

  #5 `_cap_diff` fast-path on `len(text)`. UTF-8 byte count is
    always >= char count, so if char count fits the cap, the
    encoded form definitely does. Avoids materialising a transient
    bytes copy of a multi-GB string just to check its size — the
    cap exists specifically to defend against OOM on huge diffs,
    and the check itself shouldn't be the trigger.

Deferred:

  None. All 5 findings addressed in this commit.

Test count: 274 -> 277. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.5 round-4 review fixes

Round 4 found a real bug I introduced in round 3 plus two cheap
hardening items. All three addressed.

Real bug fixed:

  #1 _cap_diff fast path was inverted. The round-3 fast path used
    `if len(text) <= cap: return text` with the reasoning "UTF-8
    byte count is always >= char count, so chars <= cap implies
    bytes <= cap." That's backwards — bytes >= chars means
    `chars <= cap` does NOT imply `bytes <= cap`. A 600K-emoji
    string was 600K chars (under the 1 MiB char cap) but encoded
    to 2.4 MiB, defeating the OOM defense for non-ASCII content.

    The existing oversize-diff test used ASCII `"x"` chars (where
    chars == bytes), so the divergence was never exercised — the
    bug rode through round-3 review undetected.

    Fixed by gating the fast path on `text.isascii()`: ASCII-only
    strings genuinely have bytes == chars, so the len check is
    sound. Non-ASCII falls through to the encode-and-measure path.
    Regression test added with `"\U0001F600" * 600_000` (4-byte
    emoji) — asserts the returned diff is byte-capped, not
    char-capped.

Hardening:

  #2 GitOpError.command in parse-failure paths now includes the
    `-c color.ui=never` prefix that _run_git actually passes to
    subprocess. Previously err.command in git_current_branch's
    empty-output path and git_remote_repo's unparseable-URL path
    hard-coded the unprefixed form, so a caller introspecting
    err.command (or reproducing the failure from the message)
    would see a different invocation than what was actually run.
    Tests strengthened to assert the prefix is present.

  #3 File-list cap added: _MAX_PATH_LIST = 10_000. Mirrors the
    _MAX_DIFF_BYTES and _MAX_LOG_COUNT bounds that already protect
    the other two agent-facing surfaces. Without this, a repo that
    forgot to gitignore node_modules / .venv / target/ would
    return hundreds of thousands of untracked paths through MCP —
    plausible OOM trigger.

    Applied to:
      - git_uncommitted_changes' untracked_files
      - _parse_status_porcelain_v2's staged / modified / untracked /
        unmerged lists

    New `_truncated_path_list` helper appends a sentinel marker
    (`<... N more paths omitted (cap N exceeded)>`) so the agent
    sees the truncation and can fall back to a narrower query.
    New TestPathListCap class pins the helper directly:
    under-cap / at-cap / one-over / many-over / git_status
    integration.

Test count: 277 -> 284. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.5 round-5 review fixes

Round 5 (extra-high recall mode) surfaced 12 findings. Reviewer's
explicit verdict: "Findings 1, 2, and 4 are concrete defects worth
fixing before Phase 4.6 wires this up to an agent." Addressed those
three plus six more cheap real items; skipped three cosmetic / very-
narrow ones.

Correctness:

  #1 branch.ab strict validation. The startswith-only check accepted
    smuggled double-sign strings like "+-3 -1" where `int(parts[0][1:])`
    parsed as -3, contradicting the parser's own promise to reject
    negative values. Now requires `parts[N][1:].isdigit()` after the
    sign so "+5", "-3" parse but "+-3", "++5", "+junk" don't.
    Parametrized regression test added.

  #2 branch.ab atomic update. Previous try/except wrapped both int()
    calls together — if the first succeeded and the second raised,
    out["ahead"] was mutated to the new value while out["behind"]
    kept the default 0, producing an internally-inconsistent dict
    on a half-malformed line ("+5 -junk"). Now parses both into
    locals first, commits to out[] only on full success.

  #4 NotADirectoryError + PermissionError wrapped as GitOpError.
    Previously only FileNotFoundError + TimeoutExpired were caught,
    so passing path=/etc/passwd (regular file) or path=/restricted
    (unreadable directory) leaked raw OSErrors out of _run_git,
    breaking the module docstring's "Errors raise GitOpError"
    promise. Tests added with dedicated runner stubs for each
    error class.

Security:

  #3 URL credential redaction handles `@` in passwords. The old
    `[^/@]+@` regex stopped at the first `@`, leaving the tail of
    a password containing a literal `@` exposed (e.g.
    `https://user:p@ss@host/...` → `https://[redacted]@ss@host/...`).
    Switched to `[^/]+@` which is greedy up to the LAST `@` before
    the path. Regression test added.

Architecture:

  #11 _truncated_path_list returns (list, omitted_count) tuple
    instead of mixing a sentinel string into the path list. The
    in-list sentinel broke the "list of valid paths" contract — an
    agent iterating with `os.stat(p)` / `Path(p).exists()` /
    `os.path.join(root, p)` would hit the non-path "<...>" entry
    and crash or silently mis-handle it. Now the cap surfaces via
    sibling `<key>_omitted` fields (always present, 0 when not
    truncated) so the agent can detect truncation explicitly
    without iterating over non-path data.

    Applied to:
      - _parse_status_porcelain_v2: adds staged_omitted /
        modified_omitted / untracked_omitted / unmerged_omitted
      - git_uncommitted_changes: adds untracked_files_omitted

  #12 GIT_PARSE_ERROR_RETURNCODE moved from -1 to -1000. Python's
    subprocess.run uses negative returncodes for signal-killed
    processes (-1 = SIGHUP, -9 = SIGKILL, -15 = SIGTERM, etc.). A
    SIGHUP-killed git would otherwise land at returncode = -1,
    colliding with the parse-error sentinel and being misclassified.
    -1000 is safely outside the signal range. Test added asserting
    the sentinel is < -100.

Documentation / parity:

  #5 git_status docstring updated to reflect the actual return
    shape. Previously said `"branch": ... | "(detached)"` but the
    code returns `"HEAD"` for both detached and unborn (per round 3's
    normalization). A caller reading the docstring and writing
    `if branch == "(detached)": handle_detached()` would never
    enter that branch.

  #6 Type-? (untracked) parser skips empty-path lines. Previously
    `"? "` (prefix only, no path body) would append "" to untracked
    and flip clean=False with a phantom entry. Asymmetric with type-1
    / type-2 which have length guards. Test added.

  #8 Type-u (unmerged) parser checks XY field width. Type-1 and
    type-2 paths added this in round 3; the type-u path missed.
    Trivial parity fix. Test added.

Skipped:

  #7 (cosmetic — diff cap is _MAX_DIFF_BYTES + ~140 marker bytes,
    not a strict cap; mattering only to a downstream consumer that
    enforces an exact ceiling, which no current consumer does)
  #9 (doc asymmetry — `git_uncommitted_changes` docstring doesn't
    mirror `git_status`'s C-quoting caveat; added to git_status'
    expanded docstring covering both surfaces)
  #10 (SCP-style URL credentials — narrow trigger, parse_remote_url
    accepts most SCP shapes; revisit if a real leak surfaces)

Test count: 284 -> 296. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.6: mcp_server.py — FastMCP wrapper around bb_ops + git_ops (#7)

* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops

Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops
function into a FastMCP tool registry, with a self-bootstrap venv
mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total.

mcp_server.py — 30 tools:

  PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps,
    pipeline_trigger, pipeline_stop, pipeline_logs
  PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create,
    pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff,
    pr_comments_list, pr_comment_add
  REPOS / METADATA (7): repos_list, repo_show, branches_list,
    branch_show, vars_list, downloads_list, commits_list
  GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo,
    git_recent_commits, git_uncommitted_changes
  META (1): whoami

Self-bootstrap pattern (mirrors zenhub-cli):

  - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10
    (preferring the launching interpreter, falling back to pyenv /
    Homebrew / system locations), runs `python -m venv`, installs the
    `mcp` package, then `os.execv`s into the venv.

  - Bootstrap is stdlib-only (uses subprocess + sys, no third-party
    imports until after the venv is active). Means any python3 on PATH
    can launch the server.

  - `mcp` is the only venv dep — no heavy ML libs (no torch /
    sentence-transformers / numpy / huggingface_hub). bb doesn't have
    semantic-similarity workflows like zenhub-cli does.

  - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets
    pytest exercise tool wiring and result-dict shapes without
    pulling in mcp. Production NEVER sets this — without real
    FastMCP, .run() raises.

Repo resolution — every Bitbucket tool takes an optional `repo`:

  - ""              → auto-detect via `git remote get-url origin`
                      from BB_DEFAULT_REPO_PATH (or cwd)
  - "myrepo"        → use BB_WORKSPACE from config + "myrepo"
  - "acme/myrepo"   → use "acme" workspace + "myrepo" slug (overrides
                      BB_WORKSPACE for this call)

Validated in `_resolve_repo` with ValueError on malformed input
("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network
call burns API budget).

Error envelope — every tool returns either:

  {"ok": True, "workspace": ..., "repo": ..., <result fields>}
  {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>}

The error dict carries `status`/`url`/`body` for BBApiError,
`returncode`/`stderr` for GitOpError, so the agent can branch on
`kind == "BBApiError" and status == 404` without parsing the message.

pr_create auto-detects source_branch via git_current_branch when
empty (matches bash `bb pr-create`).

whoami reports user / workspace / api_base + best-effort git context
(branch, workspace, repo from remote). Token is NEVER echoed; the
test explicitly asserts the token string is absent from the response.

Test discipline (same as bb_api / bb_ops / git_ops):

  - Tool count + names pinned exactly (test_all_expected_tools_registered)
    — a future rename or accidental drop fails the suite.
  - Per-tool wiring: patch the bb_ops / git_ops function at the module
    level, call the tool, assert (a) the underlying function received
    the right args, (b) the response dict has the expected shape.
    Doesn't re-test bb_ops logic (already covered comprehensively).
  - Error envelope tested for each exception class
    (BBApiError, BBOpNotFound, GitOpError, ValueError).
  - whoami's no-token guarantee asserted as a security property.
  - FastMCP stub's .run() raises a clear error so a test that
    accidentally calls it fails loud instead of hanging on stdio.

CI / packaging:

  pyproject.toml py-modules += "mcp_server".
  ci.yml syntax-check += "mcp_server.py".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 review fixes

Round 1 review on PR #7 flagged 15 findings, with two explicitly called
out as production hazards ("will bite real users on first deploy on
common Linux topologies"). Addressed 11 of 15; deferring 4 with
rationale.

Production hazards (must-fix per reviewer):

  #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)`
    instead of `realpath(sys.executable) == realpath(_VENV_PY)`.
    `python -m venv` on Linux/macOS defaults to --symlinks, so
    realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical
    path as the builder interpreter (e.g. /usr/bin/python3.12) — the
    realpath comparison was claiming "already under venv" while we
    were actually still under the system interpreter, skipping the
    execv and dying on `from mcp.server.fastmcp import FastMCP`.
    sys.prefix is set per-interpreter from the venv layout and is the
    authoritative signal.

  #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C,
    OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing.
    The old guard skipped reinstall on every subsequent launch and
    died on import with no self-repair path. Now writes a
    `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds;
    the venv create step is also idempotent (only runs if _VENV_PY
    is absent) so retries pick up where they left off.

  #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned
    "mcp" would silently install a future breaking mcp 2.x on the
    next /tmp wipe. Matches the pyproject [mcp] extra.

Correctness:

  #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch
    returns the literal "HEAD" for detached/unborn state; bb_ops only
    checks non-empty/non-whitespace, so the call would have reached
    Bitbucket with source.branch.name="HEAD" and surfaced as an
    opaque API error. Now raises ValueError locally with a clear
    "pass source_branch=" hint.

  #5 _resolve_repo strips whitespace before parsing. A sloppy paste
    like "  acme/widget  " no longer slips through as
    workspace="  acme" and fails deep in the API layer.

 #11 pr_create source_branch auto-detect now triggers on whitespace
    (not just empty string), symmetric with #5.

 #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths
    git_ops._run_git doesn't wrap explicitly (IsADirectoryError,
    BlockingIOError, ConnectionResetError), and catches
    os.getcwd() on a deleted cwd inside _default_repo_path() which
    fires BEFORE any wrapped git call.

Surface improvements:

  #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on"
    (newest first); agent can pass "created_on" for oldest-first or
    other sort keys.

  #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default
    120s matches bb_ops; agent can bump for pipelines with very
    large log payloads or oversized PR diffs that would otherwise
    cap out.

  #8 git_status payload renamed from `status` to `working_tree`. The
    `status` key collided with the HTTP-status field _error_dict
    uses for BBApiError. Today the collision can't fire (git_status
    doesn't raise BBApiError), but the rename pre-empts a future
    broadening hazard.

 #10 branch_show echoes the stripped name in the response. bb_ops
    strips before encoding the URL, so the response now matches what
    Bitbucket actually resolved rather than the unstripped input.

Deferred (with rationale):

  #9  Response-key naming consistency (approval/result/pr). Reviewer
    suggested standardizing all mutation tools to "result", but the
    descriptive keys carry useful semantic info — pr_approve returns
    an approval record, pr_merge returns the merged PR, etc. The
    docstrings explain the shape. Skipping.

 #12 _client_cache concurrency + invalidation. Real concern for
    long-lived servers with rotated tokens, but stdio MCP is
    single-threaded and token rotation is rare; defer to Phase 4.6+
    if a real bug surfaces.

 #14 CI smoke step that actually imports the real mcp package.
    Larger CI workflow change; defer.

 #15 README MCP documentation. Explicitly Phase 4.8 scope.

Test count: 335 -> 341. All passing locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply Phase 4.6 round-2 review fixes

Round 2 surfaced 15 new findings including a real security bug. Fixed
all 15.

Security:

  #1 BBApiError.url now redacted in _error_dict. pipeline_logs and
    pr_diff follow Bitbucket's 307 to a presigned S3 URL via
    fetch_redirected_text; on a downstream non-3xx (S3 clock skew,
    expired signature, redirect cap), BBApiError(url=<signed S3 URL>)
    carried AWS-credential-bearing query parameters
    (X-Amz-Signature, X-Amz-Credential) verbatim into the agent
    error dict and downstream logs. New _redact_url helper strips
    `user:token@` AND replaces signed-URL query strings with
    `[redacted-signed-url-params]` while preserving the path so
    the agent knows what host was called.

Correctness:

  #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null
    from an MCP client deserialises to Python None; `.strip()` on
    None crashes uncaught. Also pre-normalised every stringy arg
    path through `(value or "").strip()` so None and "" funnel to
    the same auto-detect path uniformly.

  #3 _default_repo_path no longer evaluates os.getcwd() eagerly.
    `os.environ.get("KEY", os.getcwd())` runs the default even when
    the env var is set — the override never protected against a
    deleted cwd. Now uses `... or os.getcwd()` so the env var is
    the lazy short-circuit. whoami also wraps cwd resolution in a
    try block so a deleted cwd surfaces as a structured cwd_error
    field instead of escaping the function.

  #4 mcp import wrapped with helpful recovery message. If `mcp/`
    gets deleted post-bootstrap (manual `pip uninstall`, partial
    /tmp cleanup that wiped the package dir but spared the
    sentinel), the ImportError now tells the user to
    `rm .bbenv-ready` or `rm -rf /tmp/bbenv`.

  #5 _resolve_repo now strips inner slug parts. "acme/ widget"
    no longer slips through as workspace="acme", slug=" widget"
    and 404s on `/repositories/acme/%20widget`.

  #6 repos_list workspace stripped. ` acme` / `acme ` no longer
    slip through.

  #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was
    contradicting the file's stated philosophy). A refactor that
    renames a bb_ops kwarg now crashes visibly during dev/test
    instead of being silently reported back to the agent as a
    fake Bitbucket failure.

  #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS
    (not narrow GitOpError) AND store the full structured
    _error_dict (not str(e)). An agent branching on
    err.returncode for git-127 vs not-a-git-repo-128 vs
    timeout-(-1000) has the same shape available as every other
    tool.

  #10 _opt_str() helper introduced. (branch or None) became
    `_opt_str(branch)` everywhere — "" / "   " / None all funnel
    to None at the bb_ops boundary. Without this, "  " on a
    branch filter posted `?target.ref_name=%20%20`, on a pattern
    raised a confusing two-layers-down ValueError, on commits
    raised a different "non-empty, non-whitespace" error.
    Inconsistent surfaces for the same agent typo.

  #12 sys.prefix venv check now resolves both sides through
    Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink
    doesn't trigger an infinite execv loop.

  #13 Every tool that takes an identifier (pr_id / number /
    step_index) now threads it into the error dict via
    _error_dict_with. An agent fanning out parallel
    pipeline_logs(number=42, step_index=5) and
    pipeline_logs(number=42, step_index=6) can correlate
    failures to originating requests without substring-matching
    the message.

  #14 _resolve_repo validates the repo arg shape BEFORE calling
    _get_client(). A fresh-machine user without config + a
    malformed slug now sees the actual ValueError, not the
    BBConfigError that previously masked the real cause.

  #15 git_recent_commits echoes the stripped ref in the response
    (matches what `git log` actually ran). Consistent with
    branch_show.

UX:

  #11 pip ins…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant