Release v1.1.0: Python MCP server + bash CLI parity + agent definition by daniel-pittman · Pull Request #14 · daniel-pittman/bitbucket-cli

daniel-pittman · 2026-05-27T23:42:05Z

v1.1.0 — Python MCP server + bash CLI parity + bundled agent

Second-release milestone. v1.0.0/v1.0.1 were bash-CLI-only with OSS hardening; v1.1.0 adds a full Python MCP server (parallel implementation of the Bitbucket REST contract — does not shell out to bb), a bundled bitbucket agent definition for Claude Code delegation, and brings the bash CLI to parity with new commands and bugfixes surfaced during MCP development.

19 commits since v1.0.1.

Headline additions

Python MCP server (mcp_server.py + bb_api.py + bb_ops.py + git_ops.py) — 30 tools covering pipelines, pull requests, repos, branches, commits, repo variables, and git-context helpers. Stdlib-only HTTP runtime; the mcp package is the only third-party dep, self-bootstrapped into a durable venv at $XDG_DATA_HOME/bitbucket-cli/venv on first invocation. ~367 pytest tests.
Bundled bitbucket agent (agents/bitbucket.md) — generic, user-installable Claude Code agent that drives the MCP server with propose-first protocol for destructive ops, pipeline-failure triage discipline, and a project-conventions section for per-workspace defaults.
bb bash CLI parity — 5 new commands (pr-comment, pr-unapprove, branch, commits, whoami) and 8 parity bugfixes from MCP-side test discoveries (variable JSON injection, masked variable echo, redacted origin URL, _require_build_number validator, etc.).
Bash 3.2 compatibility — macOS system bash is now a supported runtime (was implicitly 4.0+).

Security posture

BB_TOKEN never echoed in whoami, error envelopes, or log lines.
URL credential leaks (https://user:token@host/...) and signed-URL query params (AWS X-Amz-Signature, Azure SAS, GCP signing keys, bearer / access_token / api_key) stripped from every error message via a single _safe_text chokepoint.
Cross-host Authorization header stripping on redirect — Bitbucket Basic auth never reaches S3 when fetching pipeline logs.
Pipeline variable values masked as KEY=*** when echoed back.
subprocess calls in git_ops use GIT_TERMINAL_PROMPT=0, GIT_ASKPASS="", stdin=DEVNULL, timeout — no interactive prompts can hang the MCP server.
whoami workspace-reachability probe (single GET /repositories/{workspace}?pagelen=1, 10 s timeout) confirms credentials reach the configured workspace; never echoes the token.

Notable behind-the-scenes fixes (discovered in PR review rounds)

_safe_text structural fix for error-dict credential leaks (replaces three rounds of per-field redaction).
_is_positive_int rejects bool (the bool-is-int trap).
Venv detection survives Linux symlinked venvs (Path.resolve() on both sides).
pr_create source-branch auto-detect with detached-HEAD rejection.
bash ${state,,} → portable printf '%s' | tr so bb prs works under macOS bash 3.2 on empty result sets.

PRs included

#2 (Phase 4.1 bb_api foundation), #3 (4.2 pipeline ops), #4 (4.3 PR ops), #5 (4.4 repos/branches/vars/downloads/commits), #6 (4.5 git_ops), #7 (4.6 mcp_server), #8 (4.7 bash parity), #9 (XDG venv path), #10 (4.8 agent + docs), #11 (README MCP install restructure), #12 (bash-3.2 fix + macOS install docs), #13 (VERSION bump).

Test plan

pytest — 367 tests pass on develop
Hands-on verification in two real workspaces (pittmollc/pittmo_auction, dreamfacesbir/johnny-server) — bash CLI + MCP tools + bundled agent all healthy
Install verification: claude mcp add --scope user bitbucket -- python3 .../mcp_server.py registers and connects; agent file copied to ~/.claude/agents/bitbucket.md loads as a user-scope subagent
Symlink verification: ln -s "$(pwd)/bb" "$(brew --prefix)/bin/bb" works no-sudo on macOS

Deferred to v1.2.0 (rolled into one epic — no separate v1.1.1 planned)

prs_list MCP response payload trim for rich-PR repos (#27)
CI guard for the bash-3.2 floor + state arg allowlist (#29)
3 deferred PR README: restructure MCP-server section to mirror zenhub-cli install pattern #11 round-3 polish items (#28)
bb prs OPEN UX gotcha — positional [repo] consumes the state (#30)
Multi-workspace epic (#24): bb workspaces command, bash workspace auto-detect parity, BB_WORKSPACE semantic shift to "default"
Post-v1.2.0: bb-driven agent customization workflow (#25)

* Add bb_api.py + pyproject + Python test infrastructure (MCP foundation) First slice of the MCP server work. Establishes the Python module layout the MCP server will sit on top of, plus the test seam and CI extension required to develop the rest of the modules with confidence. bb_api.py — Python parallel of bb's config / auth / transport helpers. Stdlib-only (urllib.request), so the MCP server's bootstrap stays small. Public surface: - BBConfig: resolved credentials + workspace + API base URL. - load_config(): env > ~/.config/bb/config > .env > error; tolerant of bash `export KEY=value` and quoted values, matching the bash script's `source` semantics close enough for parity. - parse_remote_url() / detect_repo(): both HTTPS and SSH remote shapes; repos with dots in their slugs are preserved (bash regex truncates). - repo_path(): /repositories/{workspace}/{repo}, rejecting embedded slashes that would silently change the URL structure. - BBClient: thin transport with .get / .post / .put / .delete / .paginate. Pagination is defended against runaway cursors (max 200 pages) and stuck cursors (`next` URL unchanged between iterations), and refuses to follow `next` URLs that point at a different host than api_base. - BBApiError carries status + URL + body so callers can branch on expected non-2xx responses. bb and bb_api are PARALLEL implementations of the same Bitbucket REST contract — the MCP server does not shell out to bb. The bug-parity rule is now documented in CONTRIBUTING.md: tests verify correct API behaviour, and a defect surfaced by a Python test fixes both code paths. pyproject.toml — flat `py-modules = ["bb_api"]` layout (additional modules land in their own PRs). `mcp` is in an optional [mcp] extra so test matrices stay lean. tests/ — pytest with two responsibilities in conftest.py: set BB_MCP_SKIP_BOOTSTRAP=1 before any mcp_server import to defend against the future self-bootstrap venv pattern, and inject the repo root onto sys.path. 36 tests covering config precedence (env > file > error), remote-URL parsing (HTTPS + SSH + repos-with-dots), client transport (URL + method + auth header + body shape per call, not just response status), and pagination (walking, stuck-cursor defence, iteration cap, cross-host refusal, empty page). CI extension — Python 3.10/3.11/3.12 matrix added to the existing `syntax` job. fail-fast: false so a failure on one version doesn't mask another. Installs only pytest; the MCP runtime deps stay out of unit-test CI because the suite mocks urllib. CODEOWNERS — extended to gate review on bb_api.py, future bb_ops.py / git_ops.py / mcp_server.py, pyproject.toml, and the agents/ directory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply review fixes for Phase 4.1 foundation Addresses 13 of the 15 findings from the Phase 4.1 code review. The strongest finding (#1) was a parity bug where a test pinned the wrong contract — exactly the anti-pattern the testing methodology warns about. Parity / correctness fixes: 1. dotenv > home-config precedence. Bash sources ~/.config/bb/config first, then `.env`, so `.env` wins. Python had the order inverted and the test pinned the inverted behaviour. Now matches bash; the test now verifies the correct contract. 2. Empty env var no longer falls through to file. Switched from `env.get(key) or file_config.get(key)` to a membership test so `BB_TOKEN=""` is treated as an explicit (failing) value rather than absence — matches bash's `[[ -z ]]` check behaviour. 3. URLError (DNS / TLS / connection / timeout) now wraps as BBApiError(status=0) instead of escaping the documented contract. 4. Pagination cross-host refusal hardened against the prefix trick. `startswith(api_base)` is replaced with a separator-aware check so `https://api.bitbucket.org/2.0evil.example.com/...` no longer slips past the host-mismatch guard. 5. BB_MCP_SKIP_BOOTSTRAP is now set unconditionally in conftest; `setdefault` was a no-op if a developer had `BB_MCP_SKIP_BOOTSTRAP=0` in their shell, defeating the guard silently. 6. Paginated response missing the `values` key now raises rather than being silently treated as an empty page (which could leave a hole in the result set if `next` was still set). 7. Non-string `next` raises BBApiError with a clear message rather than crashing on AttributeError. 8. repo_path() rejects empty / whitespace-only / `.` / `..` values in addition to embedded `/`. Defends against path-traversal where `/repositories/../widget` normalises to the wrong workspace. 10. Query values validated as scalar or list-of-scalar; nested dicts and arbitrary objects raise TypeError instead of being silently stringified by urlencode(doseq=True) into a misleading URL. 11. BB_API_BASE trailing slash normalised on load. Prevents `api_base + "/path"` becoming `//path`. 13. Replaced `assert ... is not None` with an explicit raise so `python -O` deployments retain the safety net if a future edit narrows the missing-keys check. 14. detect_repo test now asserts the exact subprocess call shape (command, capture_output, text, cwd, check). Previously the test was canned-output-only and a refactor to a different git command would have passed silently. 15. BBClient methods accept a per-call `timeout=` override for endpoints that legitimately take longer (log streaming, large diffs). Default still comes from the client. Tightened docstrings: - BBApiError now documents `status=0` as the sentinel for non-HTTP transport / malformed-response errors so callers know to gate HTTP-status dispatch on `status > 0`. Deferred with rationale: #9 (remote URL parser accepts non-Bitbucket hosts): intentional for enterprise / self-hosted Bitbucket support. The bash side has the same loose parsing. No restriction added. #12 (BBApiError used as transport-error wrapper): the dedicated exception class would expand the surface; the docstring update above is sufficient until a real consumer needs to discriminate. Test count: 36 -> 55. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply follow-up review fixes (security: no-redirect default, test rigor) Follow-up review on commit e2567e8 verified all 13 prior fixes landed correctly and surfaced 6 new findings, 3 of them worth taking before more code lands on top of this foundation. Security: new #1. Default BBClient opener no longer follows 3xx redirects. urllib's default HTTPRedirectHandler resubmits the original request against the Location URL — INCLUDING the Authorization header. urllib does not strip auth on cross-origin redirects, so a misconfigured proxy or DNS hijack pointing api.bitbucket.org at another host would have leaked the Basic auth credential to that host on the first 3xx. `bb` doesn't have this exposure: `curl -sf` doesn't follow redirects without -L, and the one command that needs to (`bb logs`) opts in explicitly. Mirror that posture here. Implementation: _NoRedirectHandler subclasses HTTPRedirectHandler and returns None from redirect_request, which causes urllib to surface the 3xx as an HTTPError that our existing handler wraps into BBApiError. Two tests cover it — one introspects the default opener's handler chain, the other unit-tests the handler's redirect_request contract directly. When bb_ops adds pipeline_logs (which legitimately needs to follow a 307 to S3), that work will require a separate code path that strips Authorization on cross-host hops rather than weakening this default. Test rigor: new #2. test_paginate_walks_pages now asserts the URL for each of the three calls (base, base?page=2, base?page=3). Without this, a regression that refetched the original URL three times would still consume the three queued responses and pass — exactly the mock-returns-success-regardless-of-request anti-pattern. Cleanup: new #4. Removed the dead `page_query = None` assignment in paginate(). The variable was only read on the first iteration (when url is None), and the reset ran after `url` had been assigned, so it was never observed. Removed and the closure variable `query` is used directly. Documentation: new #5. parse_remote_url() now has an IMPORTANT caller-contract note: the function does not anchor to a Bitbucket host (intentional for enterprise / self-hosted Bitbucket support). Callers feeding it untrusted external URLs (webhook payloads, etc.) must verify the host upstream before calling. If a future consumer needs that guarantee, the right shape is a `strict=True` mode here, not duplicated checks at every call site. Deferred: new #3 (BB_API_BASE shape validation) — low-impact today; will revisit if a self-hosted Bitbucket Server use case lands a typo bug. new #6 (older _fake_runner tests still allow kwargs drift) — minor; the new recording-style test_detect_repo_invokes_git_remote_get_url is the load-bearer, the looser fixtures around it are acceptable. Test count: 55 -> 57. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add bb_ops pipeline operations + cross-host-safe log fetch Phase 4.2 of the MCP server work. Adds the pipeline subset of bb_ops (list, show, steps, trigger, stop, logs), plus a new bb_api transport method for following redirects safely. 33 new tests, 90 total. bb_ops.py — public surface: pipelines_list(client, ws, repo, *, count=10, sort, branch=None) Paginates pipelines/?sort=-created_on. Honours `count` exactly by walking pages as needed; clamps per-page to Bitbucket's 100 cap. Optional branch filter via target.ref_name. pipeline_show(client, ws, repo, build_number) Resolves build_number -> uuid then GETs /pipelines/{uuid}. pipeline_steps(client, ws, repo, build_number) Same uuid lookup, then GETs /pipelines/{uuid}/steps/. pipeline_trigger(client, ws, repo, *, branch, pattern=None, variables=None) Builds the POST payload (target + optional custom-pattern selector + optional variables-as-list-of-{key,value}). Variables accept dict or iterable-of-pairs at the boundary. Rejects non-string values (Bitbucket's contract). pipeline_stop(client, ws, repo, build_number) POSTs to /pipelines/{uuid}/stopPipeline. Returns the API response (typically None on 204). The bash equivalent discards the response with `> /dev/null`, masking failures — flagged as a 4.7 parity fix. pipeline_logs(client, ws, repo, build_number, step_index, *, timeout=120) Uses the new BBClient.fetch_redirected_text() path to follow Bitbucket's 307-to-S3 redirect on the log endpoint with Authorization stripped on cross-host hops. bb_api.py — BBClient.fetch_redirected_text(): Necessary because the log endpoint sometimes returns 200 with an inline log body and sometimes returns 307 to a signed S3 URL. The default opener refuses 3xx (credential-safety), so this method opens with the default opener, catches HTTPError 3xx, extracts Location, and follows up to max_redirects hops. CRITICAL: when the next hop's host differs from api.bitbucket.org, Authorization is removed from the request headers — Basic credentials never reach S3 or any other redirect target. Once stripped, the credential stays stripped for the rest of the chain. Bash's `bb logs` uses `curl -sfL` which (depending on curl version) may send Authorization on cross-host redirects. Whether that's presently broken in bash is a parity question for 4.7; Python is correct by construction. Test coverage (33 new): _wrap_uuid: brace normalisation, URL encoding, whitespace tolerance. _strip_uuid_braces: bare/braced/empty inputs. pipelines_list: single page, count > 100 (pagination), mid-page cutoff, branch filter URL shape, non-positive-count rejection without emitting a request. _resolve_pipeline_uuid: first-page hit, multi-page walk, not-found raises BBOpNotFound (distinct from BBApiError), invalid build_number rejection, scan_limit cap respected. pipeline_show: uuid lookup THEN show by uuid; URL uses %7B...%7D bracket encoding (bash contract). pipeline_steps: uuid lookup THEN list steps. pipeline_trigger: default-pipeline payload shape, custom-pipeline payload shape (selector embedded), variables-as-dict shape, variables-as-iterable-of-pairs shape, empty-variables omits the key (not an empty list), non-string variable value rejected, empty pattern rejected, empty branch rejected. pipeline_stop: POST to /stopPipeline with no body; 204 returns None. pipeline_logs: inline body (no redirect), 307 follow with Authorization stripped on cross-host (the security-critical test), too-many-redirects raises, missing Location raises. _resolve_step_uuid: valid index, out-of-range raises BBOpNotFound, negative index rejected. Every HTTP-touching test asserts request URL, method, and body shape - not just the response value. Guards against the mock-returns-success-regardless-of-body anti-pattern. CI / packaging: pyproject.toml py-modules now includes bb_ops. ci.yml syntax-checks bb_ops.py alongside bb_api.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.2 review fixes (robustness + cleanup) Addresses 6 of 8 follow-up suggestions from the Phase 4.2 code review. Approve-with-suggestions; nothing blocking, all minor robustness or clean-up items. #1 (doc). Soften the cross-host-auth narrative in fetch_redirected_text. `curl -L -u` also strips credentials on cross-host hops; only `-H "Authorization: ..."` would leak. Bash is safe today via `-u`. Python's explicit strip is the symmetric future-proof version (in case the MCP server ever needs header-based auth). #4 (robustness). Case-insensitive cross-host comparison in fetch_redirected_text. RFC 3986 §3.2.2 makes hostnames case-insensitive; a redirect to API.bitbucket.org would have triggered a needless auth-strip on a capitalisation-only diff. Now lowercased both at construction and on each new host. #5 (resource hygiene). Explicit `e.close()` on the redirect-path HTTPError so the underlying socket is released immediately rather than waiting on GC under load. Wrapped in try/finally so a raise from the redirect-extract logic still closes the response. #6 (dead return). _resolve_step_uuid now returns just the uuid rather than a (uuid, name) tuple. The name was never consumed by pipeline_logs. Callers that want the name should fetch pipeline_steps() themselves. Simpler API, no behavioural change. #7 (nit). Dropped redundant `safe='-'` from _wrap_uuid's quote() call. Dash is in urllib's always-safe unreserved set. #8 (naming). Renamed pipeline_steps_raw -> _pipeline_steps_by_uuid (leading underscore to match the rest of the file's internal-helper convention; the function was only called module-internally). Bash bug newly surfaced by the review, queued for 4.7: cmd_pipeline_trigger drops `variables` entirely when `pattern` is empty (bb:382-391: the no-pattern branch builds a payload without the variables array, even when the caller passed VAR=value pairs). Python pipeline_trigger handles this consistently; tests pin it. Deferred: #2: bash bug above — fixed in 4.7 not here (Python-only PR). #3: empty-variables payload shape (omit vs []). Python omits; bash emits []. Both are valid against Bitbucket. Picking one and aligning is a 4.7 follow-up. Test count unchanged (90); test_resolve_step_uuid updated to assert the new single-uuid return. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Tighten finally-close comment per re-review nit The finally runs on every HTTPError path (both 3xx and non-3xx), not just redirects. Reword the comment so a future reader doesn't assume the close is redirect-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add bb_ops pull-request operations + tests Phase 4.3 of the MCP server work. Adds the PR subset of bb_ops with 33 new tests, 123 total. Public surface: prs_list(client, ws, repo, *, state="OPEN", count=25) State filter + paginated count, matching bash defaults. Walks pages when count > 100 (Bitbucket's cap). pr_show(client, ws, repo, pr_id) GET /pullrequests/{id} with positive-int validation on pr_id. pr_activity(client, ws, repo, pr_id, *, count=50) Walks the activity stream — used by bash to surface approver names, surfaced as its own op so the MCP agent can render its own view. pr_create(client, ws, repo, *, title, source_branch, destination_branch="main", description="", close_source_branch=True, reviewers=None) Builds the POST payload matching Bitbucket's contract. Optional reviewers as a list of UUID strings (Bitbucket expects `[{"uuid": "..."}, ...]`). Empty description / empty reviewers omitted from the payload (Python convention; bash's "always include empty description" is a 4.7 alignment item). pr_approve(client, ws, repo, pr_id) POST /approve. Returns the approval record — bash discards it with `> /dev/null`. pr_unapprove(client, ws, repo, pr_id) DELETE /approve. Not exposed by bash today (4.7 parity gap). pr_merge(client, ws, repo, pr_id, *, strategy="merge_commit", close_source_branch=True, message=None) PUT /merge with strategy + close_source_branch + optional message. Strategy validated at boundary: merge_commit / squash / fast_forward. Mirrors bash's PUT verb (Bitbucket has historically accepted both PUT and POST; flagged for 4.7 investigation against current REST docs). pr_decline(client, ws, repo, pr_id) POST /decline. pr_diff(client, ws, repo, pr_id, *, timeout=120) Routes through fetch_redirected_text so the cross-host-auth-strip protection applies if Bitbucket ever redirects this endpoint. Bash uses `curl -sf` without `-L`, so it would fail visibly in that case — Python continues to work safely (parity divergence noted for 4.7). pr_comments_list(client, ws, repo, pr_id, *, count=100) Paginated comments listing matching bash's default pagelen. pr_comment_add(client, ws, repo, pr_id, body) POST /comments with payload {"content": {"raw": "<text>"}}. Not exposed by bash today (4.7 parity gap). Validation: every function rejects invalid pr_id (must be positive int) and non-string body / message values at the Python boundary. Tests assert `opener.calls == []` after a rejection to prove no request was emitted on bad input. Bash parity items newly surfaced (queued for 4.7): - cmd_pr_approve / cmd_pr_decline discard API responses with > /dev/null. - cmd_pr_create hardcodes close_source_branch=true; expose as a flag. - cmd_pr_create always includes an empty description; align with Python's omit-when-empty. - cmd_pr_merge uses PUT; verify against current docs and align both sides on one verb. - cmd_pr_diff uses curl -sf without -L; Python's fetch handles redirects safely. Test discipline: every HTTP-touching test asserts URL, method, AND body shape, not just response value. Rejection tests assert opener.calls == [] to prove no network IO on bad input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 review fixes (bool-is-int, type checks, test coverage) Addresses all 12 findings from the code review. The strongest finding was a Python footgun: bool is a subclass of int, so a bare isinstance(x, int) check accepts True/False and stringifies them in URLs as 'True'/'False' rather than '1'/'0'. The cascading fix closes six findings at once. High-severity: #1 _validate_pr_id accepted bool. Now uses a shared _is_positive_int helper that explicitly excludes bool. Same fix applied to every count validator (prs_list, pr_activity, pr_comments_list, pipelines_list, _resolve_pipeline_uuid) and the step_index check. test_rejects_invalid_pr_id extended with True/False; the new test_every_pr_op_rejects_bad_pr_id matrix runs every PR-id-taking op (8 functions × 7 bad values) so a future refactor that drops _validate_pr_id from any one op gets caught. #2 pr_create(reviewers="alice-uuid") would have iterated the string char-by-char and built [{"uuid":"a"}, {"uuid":"l"}, ...]. Added an explicit isinstance(reviewers, str) early-reject with a hint ("did you mean [...]?"). Test added. #3 Same bool-is-int slip on every count parameter (fixed by the shared _is_positive_int helper above). Medium: #4 pr_create now type-checks `description`; a dict or other non-string raises rather than slipping into the JSON payload. Test added. #5 _KNOWN_PR_STATES docstring claim that Bitbucket accepts comma- separated `?state=OPEN,MERGED` was wrong; that shape returns 400 or empty. Docstring corrected to point at the BBQL `q` parameter for compound filtering. #6 close_source_branch now type-checked in both pr_create and pr_merge; a string like "yes" no longer slips into the JSON payload. Tests added. Low: #7 pr_merge(message="") was accepted; now rejected for consistency with pr_comment_add(body=""). Empty merge-message would leave the merge commit's subject line blank. #8 pr_diff docstring softened — the "Python continues to work safely" framing didn't acknowledge that the returned body would be the redirect target's content if a redirect ever materialised. Now spells out that today the diff endpoint doesn't redirect and flags the behavioural divergence from bash if it ever does. #9 Whitespace-only description (`description=" \n "`) now omitted from the payload via description.strip() check. Test added. Test gaps closed: #10 True/False added to test_rejects_invalid_pr_id. #11 New test_every_pr_op_rejects_bad_pr_id matrix covers every PR-id-taking op (9 functions × 7 bad values = 63 cases). #12 TestPrActivity now has count-walks-pages, invalid-count, and invalid-pr_id tests, symmetric with TestPrsList. Test count: 123 -> 141. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-2 review fixes Addresses 6 of 7 findings from the round-2 review. The strongest finding was a real boundary-validation gap: _KNOWN_PR_STATES was added in the previous fix round but never actually referenced — defined-and-documented as the prs_list state guard, yet the function only type-checked. Closes the gap. High: #1 prs_list now validates state against _KNOWN_PR_STATES. Typos ("OPENED"), case bugs ("open"), and unsupported compound forms ("OPEN,MERGED" — which Bitbucket does NOT accept on the simple ?state= filter; that needs ?q=) all fail at the boundary instead of burning an API call for a 400 / empty result. Symmetric with pr_merge's _VALID_MERGE_STRATEGIES check. #2 Whitespace-only string payloads no longer slip through. Replaced `not value` with `not value.strip()` in five places: - pr_create.title / source_branch / destination_branch - pr_merge.message - pr_comment_add.body Without this, a title like " " or a comment body like "\n\t" would have shipped to Bitbucket as accepted-but-visually-blank fields. Error messages now say "non-empty, non-whitespace". Test coverage: #3 TestPrsList.test_rejects_non_positive_count now includes True / False, matching TestPrActivity. Without symmetry, a refactor that reverted prs_list's _is_positive_int(count) check to a naive isinstance(count, int) would have regressed silently. #4 TestPrCommentsList now has test_rejects_non_positive_count with the same coverage shape (was happy-path only). #5 TestPrDiff.test_follows_cross_host_redirect_and_strips_auth pins the redirect-with-auth-strip behaviour at the pr_diff layer. Mirrors test_pipeline_logs.test_follows_s3_redirect_and_strips_auth — without it, a regression that wired pr_diff to plain client.get (which would 5xx on any future redirect because the default opener refuses 3xx) would not break any test in this file. #7 TestPrDiff.test_returns_raw_diff_text now also asserts the first hop carries Authorization. A regression that dropped the Basic header from the diff-fetch path would otherwise pass silently. Parametrized whitespace cases added across pr_create / pr_merge / pr_comment_add tests so each field's reject-on-whitespace is verified independently. Deferred (filed under 4.7): #6 pr_merge(strategy=[1,2]) raises TypeError (unhashable) rather than ValueError. Defensive-coding nit; either error surface stops the bad call. Defer to the parity sweep. Test count: 141 -> 155. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-3 review fixes Three small test/consistency fixes from the round-3 review. The PR is in good shape — reviewer explicitly called these out as non-blockers, but they're cheap and close the consistency loop. - pr_merge strategy validator now gates on isinstance(str) before the frozenset `in` check. Unhashable strategy values (list, set, dict) previously raised TypeError instead of the documented ValueError, breaking the "every boundary failure is ValueError" convention. Test extended to cover list/set/None/int alongside the original unknown-string case. - test_rejects_invalid_reviewer (formerly _empty_reviewer_uuid) now parameterizes over empty-string, None, int, and the most plausible caller mistake — pre-shaping as Bitbucket's payload contract `{"uuid": "..."}`. The validator already rejects all four; this pins coverage on the isinstance branch. - test_follows_cross_host_redirect_and_strips_auth now also asserts the first hop carries Authorization (a regression that dropped auth on the initial Bitbucket request would otherwise pass) and that the Location was actually followed (not refetched original). Symmetric with round-2 finding #7 on the no-redirect path. Test count: 155 -> 163. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ts (#5) * Add bb_ops repos / branches / vars / downloads / commits + tests Phase 4.4 of the MCP server work. Closes out the read-oriented surface on bb_ops (Phase 4.5 will add git context tools, then 4.6 wires everything into the MCP server). 48 new tests, 211 total. Public surface: repos_list(client, workspace=None, *, count=100, sort, query=None) Defaults workspace to client.config.workspace; pass explicit to query a different one (bash only ever uses BB_WORKSPACE — parity gap). Optional BBQL `q` filter for name-substring etc. (also bash gap). repo_show(client, ws, repo) GET /repositories/{ws}/{repo} — full metadata. branches_list(client, ws, repo, *, count=50, sort, query=None) Most-recently-updated first, matches bash default. Optional `q` filter. branch_show(client, ws, repo, name) Fetch a single branch by name. URL-encodes the name so `feat/widget` isn't interpreted as a sub-resource path. Not in bash today — useful for the agent's "does this branch exist?" pre-flight check before creating a PR. vars_list(client, ws, repo, *, count=100) Pipeline configuration variables. Bash masks secured values to `********` in its display layer; Python returns the raw dicts (Bitbucket sends `"value": null` for secured entries — MCP tools surface the `secured` flag so callers don't misread null as unset). downloads_list(client, ws, repo, *, count=25) Repository download artifacts. commits_list(client, ws, repo, *, branch=None, count=10) With branch=None, lists across all branches via /commits. With branch=NAME, lists commits reachable from that branch via /commits/{name}. Branch names URL-encoded. Not in bash today. Bash parity items newly surfaced (queued for 4.7): - bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters — agent has richer access by default. - bash has no branch_show, commits_list — add as new subcommands alongside the previously-queued pr_unapprove / pr_comment_add / whoami. Test discipline (same as prior bb_ops PRs): - Every HTTP test asserts URL, method, AND body shape per call. - Rejection tests assert opener.calls == [] (no network IO on bad input). - Bool-is-int (True/False) rejection covered on every count parameter. - Whitespace-only rejection on every required-string parameter. - Branch-name slash URL-encoding verified specifically in branch_show and commits_list. Test count: 163 -> 211. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.4 review fixes Folds in all 5 findings from the code review (none blocking; reviewer explicitly called the PR "clean, well-disciplined"): 1. repos_list now applies the same workspace validation as bb_api.repo_path — rejects embedded '/', '.', '..' in addition to empty/whitespace. Without this, `workspace="acme/widget"` would silently build a single-repo endpoint URL and paginate against a response that lacks `values` — confusing failure mode the boundary validator exists to prevent everywhere else. repos_list is the only op that doesn't route through repo_path, so the check is duplicated here. 2. Removed redundant local `from urllib.parse import quote as _quote` in branch_show and commits_list; both now use the module-level `quote` import that's been there since 4.2. 3. Updated stale module docstring — was still saying "This PR (Phase 4.2) adds pipeline operations." Now reflects current scope (pipelines + PRs + repos/branches/vars/downloads/commits) and references the companion git_ops module that 4.5 will add. 4. TestBranchesList.test_query_filter now asserts the exact URL-encoded form of the BBQL query (matching the style of TestReposList.test_query_filter). The previous `"q=" in url` would have passed even if the query string were silently mangled. 5. Three new tests close coverage gaps: - TestBranchesList.test_rejects_empty_query — pin the empty/whitespace rejection that was implemented but untested. - TestCommitsList.test_count_walks_pages — pin commits_list's multi-page pagination (it has the most complex path shape of the bunch with the branch-vs-no-branch split). - TestReposList.test_rejects_workspace_with_slash + test_rejects_workspace_dot_segments — pin the new workspace validation from finding #1. Skipped optional suggestions: - vars_list `query=` filter (no concrete agent use case yet). - _collect_paginated() helper (5x repetition is still inline-readable; refactor when a 6th list op materialises). Test count: 211 -> 218. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@ss

* Add git_ops — git-context wrappers for the MCP agent Phase 4.5 of the MCP server work. Five subprocess-mocked wrappers the MCP agent uses to resolve "current branch", "is the tree clean?", "workspace/repo from origin", "recent commits", and "what's uncommitted?" — the context tools that have to run before any bb_ops call. 38 new tests, 256 total. git_ops.py public surface (stdlib-only, subprocess + bb_api.parse_remote_url): git_current_branch(path?) -> str `git rev-parse --abbrev-ref HEAD`. Detached HEAD returns the literal string "HEAD" (matches git's own output; callers detect detached state explicitly). git_remote_repo(path?) -> (workspace, repo_slug) Routes through bb_api.parse_remote_url for parity with bb_api.detect_repo's loose-host parsing (intentional — enterprise Bitbucket Server uses non-bitbucket.org URLs). Returns BOTH the workspace AND repo, unlike bb_api.detect_repo which returns only the slug. git_status(path?) -> dict Parses `git status --porcelain=v2 --branch --untracked-files=normal`. Returns a structured snapshot: branch, upstream, ahead, behind, clean, staged, modified, untracked, unmerged. Handles ordinary (type "1"), rename/copy (type "2", with new-path\told-path tab-separator), and unmerged (type "u") lines. Header lines parsed for branch.head / branch.upstream / branch.ab. git_recent_commits(path?, *, count=10, ref="HEAD") -> list[dict] `git log --pretty=format:%H<US>%h<US>%s<US>%an<US>%aI`. Uses U+001F (Unit Separator) as the field delimiter — a control character that cannot appear in commit subjects, so we never have to escape or parse-around content. Each entry: {sha, short, subject, author, date}. git_uncommitted_changes(path?) -> dict Three subprocess calls: git diff --cached -> staged_diff git diff -> working_diff git ls-files --others -> untracked_files --exclude-standard Diffs returned as raw unified-diff text; agent can show verbatim or parse further. Error handling: GitOpError carries the failing command + returncode + stderr (truncated in the message). Distinct from BBApiError so the MCP layer can render "git failure" vs "Bitbucket failure" differently. FileNotFoundError on `git` binary missing -> GitOpError(127, ..., "git executable not found on PATH"). Matches bb_api.detect_repo's behaviour. Test discipline (same as bb_api / bb_ops): - Every test asserts the EXACT subprocess shape (command + args + cwd + capture_output / text / check). A refactor swapping `git status --porcelain=v2` for v1 would change every parse result but the canned-output tests would still pass without this. - Parametrised tests on remote-URL shapes (https, ssh, embedded user-info, self-hosted Bitbucket Server) — verify the loose-host parser handles every shape the bash side accepts. - git_status parser tested against five realistic v2 captures: clean, ahead/behind, no-upstream, mixed (staged + modified + untracked), renamed, unmerged. - git_recent_commits format string pinned: %H, %h, %s, %an, %aI all present + U+001F delimiter. Malformed log lines skipped defensively rather than crashing. - Invalid-count / empty-ref rejection on git_recent_commits with True/False included (bool-as-int trap). - git_uncommitted_changes verifies the EXACT three-call sequence with explicit `cwd=` propagation on all three. CI / packaging: pyproject.toml py-modules += "git_ops". ci.yml syntax-check += "git_ops.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 review fixes Addresses 5 of 6 findings from the round-1 review (the 6th — str vs os.fspath — is pre-existing in bb_api and would be unrelated churn in this PR). Security: #1 git_recent_commits(ref=...) now rejects refs starting with '-'. Without this, an agent-supplied ref like `ref='--all'` would be parsed by git as an option flag, not a ref — silently replacing the call with a glob over all branches. `ref='-h'` would print help and return exit 0 with help text in stdout, which the parser would skip. The MCP agent is the consumer, and agent inputs are user-influenced, so option-injection at this boundary is a real concern. Belt + suspenders: also added `--` terminator to the `git log` invocation so anything past it is parsed as a positional ref regardless. Correctness: #3 GitOpError(returncode=0) for parse failures was misleading. A real git exit is always >= 0, so callers branching on `err.returncode == 0 -> success` would have been confused. Now parse-failure paths use the GIT_PARSE_ERROR_RETURNCODE sentinel (-1) and the class docstring documents the convention. Both parse-error tests now assert on the sentinel. Defensive: #4 git_recent_commits added an upper bound of _MAX_LOG_COUNT (1000) on `count` so an agent that confabulates `count=1_000_000` doesn't silently pull a million records back across the MCP boundary. Documentation: #2 git_status docstring now documents the C-quoted-pathname limitation as a known caveat (porcelain v2 line-oriented output quotes pathnames with newlines/tabs/quotes/control-chars). Fixing properly requires switching to `-z` + NUL-split parser — deferred until a real bug report shows up. #6 _LOG_FIELD_SEP comment now says "almost never" rather than "cannot" — git stores arbitrary bytes so U+001F technically could appear in a commit subject; the parser already handles it defensively. New tests: - 4 parametrized cases for ref-starts-with-dash rejection. - count-above-cap rejection + count-at-cap accepted (boundary verification). - Spaces-in-filename preserved (porcelain v2 type-1 line uses split(" ", 8) — the suggested regression guard). - Existing parse-failure tests now assert returncode is the sentinel. Skipped: #5 (str vs os.fspath nit). Pre-existing in bb_api.detect_repo, would be unrelated churn here; defer to a separate consistency PR. Test count: 256 -> 263. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-2 review fixes Substantive round-2 review surfaced 10 findings (medium to very-low), several with concrete failure modes the MCP server would hit in production. All 10 addressed plus the 3 test-coverage gaps. Medium-severity correctness: #1 git diff no longer leaks ANSI color codes. _run_git now prepends `git -c color.ui=never` to every invocation, overriding any user's `color.ui = always` gitconfig that would otherwise emit `\x1b[31m...` escapes into pipe-captured output. Diffs and log output are now safe to ship to the MCP agent verbatim. #2 Detached-HEAD reporting normalised. `git_current_branch` already returned the literal "HEAD" for detached state; the porcelain v2 parser returned "(detached)" for the same state. Now both return "HEAD" so cross-checks between the two functions agree on the underlying state. Test added. #3 subprocess.run now has timeout=30s with TimeoutExpired wrapped as GitOpError(returncode=GIT_PARSE_ERROR_RETURNCODE). A wedged git (held index.lock, credential-helper prompting on stdin, NFS server gone) can no longer hang the MCP server thread indefinitely. Test added. #4 subprocess.run now passes encoding="utf-8", errors="replace" explicitly. text=True alone defers to locale.getpreferredencoding(), which on minimal Docker / cron / systemd contexts is ASCII or C — a UTF-8 filename or author name would crash inside subprocess.run before _run_git could wrap it. Explicit UTF-8 + replace-on-error means we never lose to a locale-restricted host. Low-severity correctness / data loss: #6 splitlines() replaced with split("\n") in three parsers. The broader splitlines() also breaks on \r / \v / \f / U+0085 / U+2028 / U+2029, any of which in a commit subject (or path) would fragment one logical record into multiple "lines" that each fail the per-line guards and silently vanish. Test added covering carriage-return-in-subject preservation. #7 git_uncommitted_changes diffs now capped at _MAX_DIFF_BYTES (1 MiB) per diff, with an explicit truncation marker so the caller (the MCP agent) knows what happened and can fall back to `git diff --stat`. Without this cap, a stray multi-GB staged blob would OOM-kill the MCP server. Test added. #8 git_recent_commits parser now skips records with empty SHA. A line of pure separators ("\x1f\x1f\x1f\x1f") passes the parts-count guard and would otherwise append a degenerate {"sha":"", "short":"", ...} entry. Test added. Very-low defensive: #9 branch.ab parser now validates sign prefixes ("+N -M") before stripping. A malformed "-3 +1" line would otherwise propagate bogus values through as negative ahead / positive behind. Test added. #10 Type-1/2 XY field bounds-checked. A single-char XY from a corrupted git output stream no longer raises IndexError outside GitOpError; the malformed line is skipped defensively. Test added. Docstring: #5 _LOG_FIELD_SEP comment further softened — explicitly notes the trade-off ("rare data loss on pathological commit" vs "robust separator that won't collide with common subject content") and points to `git log -z` as the genuinely-safe variant if this ever bites a real user. Test-coverage gaps closed: - test_unborn_branch_raises_giterror — pin real-world behaviour where `git log` on a freshly-init'd repo exits 128 (the previous test_empty_output_returns_empty_list only exercised the mocked exit-0-empty-stdout path). - test_passes_cwd_when_path_given added to git_remote_repo, git_status, and git_recent_commits — was previously only on git_current_branch and git_uncommitted_changes. - test_detached_head_normalised_to_HEAD pins the new cross-function sentinel agreement. All subprocess-shape assertions updated for the new `git -c color.ui=never` prefix (argv indices shifted by 2). Skipped: - The U+001F dropping case (#5 partial) — already returns [] for a pathological subject; the docstring softening is the right response without a real-user bug report. Test count: 263 -> 274. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-3 review fixes (hardening) Reviewer flagged the PR as "in good shape to merge" but surfaced 5 hardening items, two of which they explicitly called out as worth-doing-before-MCP-wires-this-up. All 5 addressed plus the corresponding test coverage. Security: #2 Origin URL credential redaction. CI pipelines and credential- helper-less dev setups commonly embed auth in remote URLs (`https://x-token-auth:abcd@bitbucket.org/...`). On the rare unparseable URL, the secret would have landed in the GitOpError message, flowed up through MCP into agent context, and likely ended up in downstream logs. New `_redact_url_creds()` strips `://user:token@` -> `://[redacted]@` before any URL is echoed in an error string. Test added that asserts the secret token is absent from the raised error and that "[redacted]" appears. Reliability: #3 stdin / env hardening on every git call. `_run_git` now passes `stdin=subprocess.DEVNULL` so a credential prompt fails immediately with EOF rather than blocking on the MCP server's inherited stdin. Environment also adds GIT_TERMINAL_PROMPT=0 (git refuses to prompt at all) and GIT_ASKPASS="" (no GUI askpass dialog). Belt + suspenders alongside the 30s timeout — what was previously a 30-second wedge into "timed out" is now an instant clear error. Subprocess-shape assertions updated. UX: #1 FileNotFoundError disambiguation. subprocess.run raises FileNotFoundError for TWO distinct cases — git binary missing OR cwd directory missing — which the previous catch conflated into "git executable not found on PATH". Once the MCP agent is passing paths (e.g. a worktree that's since been removed), the user would chase their PATH config when the real cause is the bad directory. New code inspects `e.filename`: if it matches the cwd, raises with "path does not exist: <cwd>" instead. Test added covering the cwd-missing path. #4 Unborn-branch sentinel normalisation. Round 2 normalised `# branch.head (detached)` -> "HEAD" so git_status and git_current_branch agree on detached state. The same shape of asymmetry existed for unborn branches: porcelain v2 reports `# branch.oid (initial)` + `# branch.head main` (the would-be branch) while git_current_branch raises GitOpError on the same repo. Parser now tracks an `is_unborn` flag (line order independent — branch.oid / branch.head ordering in v2 output isn't documented as stable) and overrides branch to "HEAD" at end-of-loop so the two functions agree on "weird state, not a regular branch." Test added. Defensive: #5 `_cap_diff` fast-path on `len(text)`. UTF-8 byte count is always >= char count, so if char count fits the cap, the encoded form definitely does. Avoids materialising a transient bytes copy of a multi-GB string just to check its size — the cap exists specifically to defend against OOM on huge diffs, and the check itself shouldn't be the trigger. Deferred: None. All 5 findings addressed in this commit. Test count: 274 -> 277. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-4 review fixes Round 4 found a real bug I introduced in round 3 plus two cheap hardening items. All three addressed. Real bug fixed: #1 _cap_diff fast path was inverted. The round-3 fast path used `if len(text) <= cap: return text` with the reasoning "UTF-8 byte count is always >= char count, so chars <= cap implies bytes <= cap." That's backwards — bytes >= chars means `chars <= cap` does NOT imply `bytes <= cap`. A 600K-emoji string was 600K chars (under the 1 MiB char cap) but encoded to 2.4 MiB, defeating the OOM defense for non-ASCII content. The existing oversize-diff test used ASCII `"x"` chars (where chars == bytes), so the divergence was never exercised — the bug rode through round-3 review undetected. Fixed by gating the fast path on `text.isascii()`: ASCII-only strings genuinely have bytes == chars, so the len check is sound. Non-ASCII falls through to the encode-and-measure path. Regression test added with `"\U0001F600" * 600_000` (4-byte emoji) — asserts the returned diff is byte-capped, not char-capped. Hardening: #2 GitOpError.command in parse-failure paths now includes the `-c color.ui=never` prefix that _run_git actually passes to subprocess. Previously err.command in git_current_branch's empty-output path and git_remote_repo's unparseable-URL path hard-coded the unprefixed form, so a caller introspecting err.command (or reproducing the failure from the message) would see a different invocation than what was actually run. Tests strengthened to assert the prefix is present. #3 File-list cap added: _MAX_PATH_LIST = 10_000. Mirrors the _MAX_DIFF_BYTES and _MAX_LOG_COUNT bounds that already protect the other two agent-facing surfaces. Without this, a repo that forgot to gitignore node_modules / .venv / target/ would return hundreds of thousands of untracked paths through MCP — plausible OOM trigger. Applied to: - git_uncommitted_changes' untracked_files - _parse_status_porcelain_v2's staged / modified / untracked / unmerged lists New `_truncated_path_list` helper appends a sentinel marker (`<... N more paths omitted (cap N exceeded)>`) so the agent sees the truncation and can fall back to a narrower query. New TestPathListCap class pins the helper directly: under-cap / at-cap / one-over / many-over / git_status integration. Test count: 277 -> 284. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-5 review fixes Round 5 (extra-high recall mode) surfaced 12 findings. Reviewer's explicit verdict: "Findings 1, 2, and 4 are concrete defects worth fixing before Phase 4.6 wires this up to an agent." Addressed those three plus six more cheap real items; skipped three cosmetic / very- narrow ones. Correctness: #1 branch.ab strict validation. The startswith-only check accepted smuggled double-sign strings like "+-3 -1" where `int(parts[0][1:])` parsed as -3, contradicting the parser's own promise to reject negative values. Now requires `parts[N][1:].isdigit()` after the sign so "+5", "-3" parse but "+-3", "++5", "+junk" don't. Parametrized regression test added. #2 branch.ab atomic update. Previous try/except wrapped both int() calls together — if the first succeeded and the second raised, out["ahead"] was mutated to the new value while out["behind"] kept the default 0, producing an internally-inconsistent dict on a half-malformed line ("+5 -junk"). Now parses both into locals first, commits to out[] only on full success. #4 NotADirectoryError + PermissionError wrapped as GitOpError. Previously only FileNotFoundError + TimeoutExpired were caught, so passing path=/etc/passwd (regular file) or path=/restricted (unreadable directory) leaked raw OSErrors out of _run_git, breaking the module docstring's "Errors raise GitOpError" promise. Tests added with dedicated runner stubs for each error class. Security: #3 URL credential redaction handles `@` in passwords. The old `[^/@]+@` regex stopped at the first `@`, leaving the tail of a password containing a literal `@` exposed (e.g. `https://user:p@ss@host/...` → `https://[redacted]@ss@host/...`). Switched to `[^/]+@` which is greedy up to the LAST `@` before the path. Regression test added. Architecture: #11 _truncated_path_list returns (list, omitted_count) tuple instead of mixing a sentinel string into the path list. The in-list sentinel broke the "list of valid paths" contract — an agent iterating with `os.stat(p)` / `Path(p).exists()` / `os.path.join(root, p)` would hit the non-path "<...>" entry and crash or silently mis-handle it. Now the cap surfaces via sibling `<key>_omitted` fields (always present, 0 when not truncated) so the agent can detect truncation explicitly without iterating over non-path data. Applied to: - _parse_status_porcelain_v2: adds staged_omitted / modified_omitted / untracked_omitted / unmerged_omitted - git_uncommitted_changes: adds untracked_files_omitted #12 GIT_PARSE_ERROR_RETURNCODE moved from -1 to -1000. Python's subprocess.run uses negative returncodes for signal-killed processes (-1 = SIGHUP, -9 = SIGKILL, -15 = SIGTERM, etc.). A SIGHUP-killed git would otherwise land at returncode = -1, colliding with the parse-error sentinel and being misclassified. -1000 is safely outside the signal range. Test added asserting the sentinel is < -100. Documentation / parity: #5 git_status docstring updated to reflect the actual return shape. Previously said `"branch": ... | "(detached)"` but the code returns `"HEAD"` for both detached and unborn (per round 3's normalization). A caller reading the docstring and writing `if branch == "(detached)": handle_detached()` would never enter that branch. #6 Type-? (untracked) parser skips empty-path lines. Previously `"? "` (prefix only, no path body) would append "" to untracked and flip clean=False with a phantom entry. Asymmetric with type-1 / type-2 which have length guards. Test added. #8 Type-u (unmerged) parser checks XY field width. Type-1 and type-2 paths added this in round 3; the type-u path missed. Trivial parity fix. Test added. Skipped: #7 (cosmetic — diff cap is _MAX_DIFF_BYTES + ~140 marker bytes, not a strict cap; mattering only to a downstream consumer that enforces an exact ceiling, which no current consumer does) #9 (doc asymmetry — `git_uncommitted_changes` docstring doesn't mirror `git_status`'s C-quoting caveat; added to git_status' expanded docstring covering both surfaces) #10 (SCP-style URL credentials — narrow trigger, parse_remote_url accepts most SCP shapes; revisit if a real leak surfaces) Test count: 284 -> 296. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops function into a FastMCP tool registry, with a self-bootstrap venv mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total. mcp_server.py — 30 tools: PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps, pipeline_trigger, pipeline_stop, pipeline_logs PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create, pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff, pr_comments_list, pr_comment_add REPOS / METADATA (7): repos_list, repo_show, branches_list, branch_show, vars_list, downloads_list, commits_list GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo, git_recent_commits, git_uncommitted_changes META (1): whoami Self-bootstrap pattern (mirrors zenhub-cli): - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10 (preferring the launching interpreter, falling back to pyenv / Homebrew / system locations), runs `python -m venv`, installs the `mcp` package, then `os.execv`s into the venv. - Bootstrap is stdlib-only (uses subprocess + sys, no third-party imports until after the venv is active). Means any python3 on PATH can launch the server. - `mcp` is the only venv dep — no heavy ML libs (no torch / sentence-transformers / numpy / huggingface_hub). bb doesn't have semantic-similarity workflows like zenhub-cli does. - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets pytest exercise tool wiring and result-dict shapes without pulling in mcp. Production NEVER sets this — without real FastMCP, .run() raises. Repo resolution — every Bitbucket tool takes an optional `repo`: - "" → auto-detect via `git remote get-url origin` from BB_DEFAULT_REPO_PATH (or cwd) - "myrepo" → use BB_WORKSPACE from config + "myrepo" - "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides BB_WORKSPACE for this call) Validated in `_resolve_repo` with ValueError on malformed input ("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network call burns API budget). Error envelope — every tool returns either: {"ok": True, "workspace": ..., "repo": ..., <result fields>} {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>} The error dict carries `status`/`url`/`body` for BBApiError, `returncode`/`stderr` for GitOpError, so the agent can branch on `kind == "BBApiError" and status == 404` without parsing the message. pr_create auto-detects source_branch via git_current_branch when empty (matches bash `bb pr-create`). whoami reports user / workspace / api_base + best-effort git context (branch, workspace, repo from remote). Token is NEVER echoed; the test explicitly asserts the token string is absent from the response. Test discipline (same as bb_api / bb_ops / git_ops): - Tool count + names pinned exactly (test_all_expected_tools_registered) — a future rename or accidental drop fails the suite. - Per-tool wiring: patch the bb_ops / git_ops function at the module level, call the tool, assert (a) the underlying function received the right args, (b) the response dict has the expected shape. Doesn't re-test bb_ops logic (already covered comprehensively). - Error envelope tested for each exception class (BBApiError, BBOpNotFound, GitOpError, ValueError). - whoami's no-token guarantee asserted as a security property. - FastMCP stub's .run() raises a clear error so a test that accidentally calls it fails loud instead of hanging on stdio. CI / packaging: pyproject.toml py-modules += "mcp_server". ci.yml syntax-check += "mcp_server.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 review fixes Round 1 review on PR #7 flagged 15 findings, with two explicitly called out as production hazards ("will bite real users on first deploy on common Linux topologies"). Addressed 11 of 15; deferring 4 with rationale. Production hazards (must-fix per reviewer): #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)` instead of `realpath(sys.executable) == realpath(_VENV_PY)`. `python -m venv` on Linux/macOS defaults to --symlinks, so realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical path as the builder interpreter (e.g. /usr/bin/python3.12) — the realpath comparison was claiming "already under venv" while we were actually still under the system interpreter, skipping the execv and dying on `from mcp.server.fastmcp import FastMCP`. sys.prefix is set per-interpreter from the venv layout and is the authoritative signal. #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C, OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing. The old guard skipped reinstall on every subsequent launch and died on import with no self-repair path. Now writes a `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds; the venv create step is also idempotent (only runs if _VENV_PY is absent) so retries pick up where they left off. #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned "mcp" would silently install a future breaking mcp 2.x on the next /tmp wipe. Matches the pyproject [mcp] extra. Correctness: #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch returns the literal "HEAD" for detached/unborn state; bb_ops only checks non-empty/non-whitespace, so the call would have reached Bitbucket with source.branch.name="HEAD" and surfaced as an opaque API error. Now raises ValueError locally with a clear "pass source_branch=" hint. #5 _resolve_repo strips whitespace before parsing. A sloppy paste like " acme/widget " no longer slips through as workspace=" acme" and fails deep in the API layer. #11 pr_create source_branch auto-detect now triggers on whitespace (not just empty string), symmetric with #5. #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths git_ops._run_git doesn't wrap explicitly (IsADirectoryError, BlockingIOError, ConnectionResetError), and catches os.getcwd() on a deleted cwd inside _default_repo_path() which fires BEFORE any wrapped git call. Surface improvements: #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on" (newest first); agent can pass "created_on" for oldest-first or other sort keys. #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default 120s matches bb_ops; agent can bump for pipelines with very large log payloads or oversized PR diffs that would otherwise cap out. #8 git_status payload renamed from `status` to `working_tree`. The `status` key collided with the HTTP-status field _error_dict uses for BBApiError. Today the collision can't fire (git_status doesn't raise BBApiError), but the rename pre-empts a future broadening hazard. #10 branch_show echoes the stripped name in the response. bb_ops strips before encoding the URL, so the response now matches what Bitbucket actually resolved rather than the unstripped input. Deferred (with rationale): #9 Response-key naming consistency (approval/result/pr). Reviewer suggested standardizing all mutation tools to "result", but the descriptive keys carry useful semantic info — pr_approve returns an approval record, pr_merge returns the merged PR, etc. The docstrings explain the shape. Skipping. #12 _client_cache concurrency + invalidation. Real concern for long-lived servers with rotated tokens, but stdio MCP is single-threaded and token rotation is rare; defer to Phase 4.6+ if a real bug surfaces. #14 CI smoke step that actually imports the real mcp package. Larger CI workflow change; defer. #15 README MCP documentation. Explicitly Phase 4.8 scope. Test count: 335 -> 341. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-2 review fixes Round 2 surfaced 15 new findings including a real security bug. Fixed all 15. Security: #1 BBApiError.url now redacted in _error_dict. pipeline_logs and pr_diff follow Bitbucket's 307 to a presigned S3 URL via fetch_redirected_text; on a downstream non-3xx (S3 clock skew, expired signature, redirect cap), BBApiError(url=<signed S3 URL>) carried AWS-credential-bearing query parameters (X-Amz-Signature, X-Amz-Credential) verbatim into the agent error dict and downstream logs. New _redact_url helper strips `user:token@` AND replaces signed-URL query strings with `[redacted-signed-url-params]` while preserving the path so the agent knows what host was called. Correctness: #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null from an MCP client deserialises to Python None; `.strip()` on None crashes uncaught. Also pre-normalised every stringy arg path through `(value or "").strip()` so None and "" funnel to the same auto-detect path uniformly. #3 _default_repo_path no longer evaluates os.getcwd() eagerly. `os.environ.get("KEY", os.getcwd())` runs the default even when the env var is set — the override never protected against a deleted cwd. Now uses `... or os.getcwd()` so the env var is the lazy short-circuit. whoami also wraps cwd resolution in a try block so a deleted cwd surfaces as a structured cwd_error field instead of escaping the function. #4 mcp import wrapped with helpful recovery message. If `mcp/` gets deleted post-bootstrap (manual `pip uninstall`, partial /tmp cleanup that wiped the package dir but spared the sentinel), the ImportError now tells the user to `rm .bbenv-ready` or `rm -rf /tmp/bbenv`. #5 _resolve_repo now strips inner slug parts. "acme/ widget" no longer slips through as workspace="acme", slug=" widget" and 404s on `/repositories/acme/%20widget`. #6 repos_list workspace stripped. ` acme` / `acme ` no longer slip through. #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was contradicting the file's stated philosophy). A refactor that renames a bb_ops kwarg now crashes visibly during dev/test instead of being silently reported back to the agent as a fake Bitbucket failure. #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS (not narrow GitOpError) AND store the full structured _error_dict (not str(e)). An agent branching on err.returncode for git-127 vs not-a-git-repo-128 vs timeout-(-1000) has the same shape available as every other tool. #10 _opt_str() helper introduced. (branch or None) became `_opt_str(branch)` everywhere — "" / " " / None all funnel to None at the bb_ops boundary. Without this, " " on a branch filter posted `?target.ref_name=%20%20`, on a pattern raised a confusing two-layers-down ValueError, on commits raised a different "non-empty, non-whitespace" error. Inconsistent surfaces for the same agent typo. #12 sys.prefix venv check now resolves both sides through Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink doesn't trigger an infinite execv loop. #13 Every tool that takes an identifier (pr_id / number / step_index) now threads it into the error dict via _error_dict_with. An agent fanning out parallel pipeline_logs(number=42, step_index=5) and pipeline_logs(number=42, step_index=6) can correlate failures to originating requests without substring-matching the message. #14 _resolve_repo validates the repo arg shape BEFORE calling _get_client(). A fresh-machine user without config + a malformed slug now sees the actual ValueError, not the BBConfigError that previously masked the real cause. #15 git_recent_commits echoes the stripped ref in the response (matches what `git log` actually ran). Consistent with branch_show. UX: #11 pip install no longer --quiet. Now uses subprocess.run capture_output=True and re-raises with the captured stderr inlined so a network blip / mirror 503 / version yank / SSL cert error / proxy block surfaces with pip's actual diagnostic instead of bare `CalledProcessError: returned non-zero exit status 1`. Deferred: #9 Test recorder accepts any kwarg name (false sense of wiring pinned). Real concern but the fix (functools.wraps or create_autospec) is invasive across the test suite; defer to a focused refactor. Test count: 341 -> 348. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-3 review fixes (security regression + edges) Round 3 caught a real security regression of round 2's fix, plus a batch of consistency / hardening items. Security regression fixed: #1 _error_dict's `message` field now also routed through URL redaction. Round 2 redacted `out["url"]` but left `out["message"]` carrying the raw URL — BBApiError.__str__ constructs the message via `f"HTTP {status} from {url}: {body[:500]}"`, so str(e) ALWAYS contains the original URL. For the S3 presigned case (pipeline_logs / pr_diff → 307 → S3 non-3xx), the error dict was shipping: {"url": "...?[redacted-signed-url-params]", # fine "message": "HTTP 403 from https://.../?X-Amz-Signature=secret..."} ^^^^^^ secret leaked The existing test_bbapierror_redacts_signed_s3_url only asserted on d["url"], so the regression was invisible to CI. The fix adds a `_redact_message` helper that scans for `http(s)://...` substrings and runs each through `_redact_url`. Tests now assert redaction on BOTH `url` AND `message` for every cred-bearing fixture. #9 Case-insensitive signed-URL indicator matching. MinIO / R2 / Backblaze / mixed-case AWS variants would have slipped past the case-sensitive substring check. Now lowercases the query part before membership. Tests added for lowercase variant + plain `Signature=` form (used by some non-AWS S3-compatible services). Correctness: #2 pipeline_trigger now validates + strips `branch`. Was the only branch-accepting tool that forwarded the arg verbatim — every other tool funnels through _opt_str, but pipeline_trigger couldn't (branch is required, not optional). Now explicitly validates non-empty/non-whitespace and strips, so `" main "` becomes `"main"` and `" "` raises a clean local ValueError instead of POSTing whitespace and getting an opaque 4xx. #3 _resolve_repo bare-slug path now validates `repo not in (".", "..")` before _get_client(). Round 2 fixed the slash-containing branch but the bare-slug fallback still hit _get_client() first, so a fresh-machine user with `repo='.'` saw BBConfigError masking the real ValueError. Parametrized regression test added. #5 pr_create now rejects explicit `source_branch="HEAD"` (not just the auto-detect path). Same severity class as round-1 #4 — just on the other entry point. Bitbucket would have silently created a degenerate PR named after the literal HEAD ref. #6 pr_create error path now uses _error_dict_with(e, title=...). Threads the title into the error dict so parallel pr_create fan-outs (e.g. agent creating one PR per stacked branch in a train) can correlate failures with originating requests. Defensive: #10 os.execv argv now uses Path(__file__).resolve(). Today's documented launch pattern uses absolute paths so the issue can't fire; this hardens against a future bootstrap enhancement or sitecustomize.py that introduces a chdir between launch and execv. Deferred (carried forward with rationale): #4 _recorder accepts any kwarg name (false sense of wiring pinned). Fixing requires either create_autospec or inspect.signature.bind across the test suite — bigger refactor than this commit. Notable enough to track separately. #7 _client_cache invalidation. Long-lived MCP holds the original BBClient for its lifetime; rotating BB_TOKEN requires restart. Architectural, defer. #8 git_recent_commits silently defaults None ref to "HEAD". Cosmetic UX, defer. Test count: 348 -> 356. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-4 review fixes (structural redaction) Round 4 caught a third iteration of the same credential-leak pattern. This commit closes it structurally so the class can't regress. Critical security fix (structural): #1 _error_dict now routes EVERY string field through _safe_text: message, body (BBApiError), stderr (GitOpError). The previous rounds whack-a-mole'd one field at a time: - Round 2: redacted `url` → message still leaked - Round 3: redacted `message` → body still leaked - Round 4: redacted `body` → stderr would leak the moment Phase 4.7 wraps `git fetch` / `git push` (whose stderr commonly contains `fatal: unable to access 'https://x-token- auth:TOKEN@bb.org/...'`). Unified `_safe_text()` helper applies URL + SCP-cred redaction to free-form text. Tests now assert redaction across ("url", "message", "body") for the API case AND ("message", "stderr") for the git case, plus a new test_giterror_stderr_redacted that covers the Phase 4.7 hazard pre-emptively. #2 SECURITY: signed-URL indicator list broadened beyond AWS. Added: Azure SAS (`sig=`, lowercase only — the trailing `=` blocks `signature=` substring matching), GCP X-Goog-Signature / -Credential, bearer-token-in-URL forms (`access_token=`, `api_key=`). Plus the existing AWS / generic `Signature=`. Tests added for Azure SAS and bearer-token shapes. #8 SECURITY: free-form-text URL redaction broadened to any scheme. The previous `https?://` regex missed `ssh://`, `git+ssh://`, etc. The new `_ANY_URL_PATTERN` matches any URL scheme; combined with the existing SCP-style pattern, free-form git stderr / log output gets redacted regardless of remote URL shape. Test added: ssh:// URL with embedded passphrase redacted by _safe_text. Correctness: #3 pr_create normalises every string arg at the wrapper boundary (source_branch, destination_branch, title, description). The wrapper previously only used `.strip()` for the empty-check guard, then forwarded the unstripped value to bb_ops which wrote it verbatim into the JSON payload. `"feat/widget "` became a 404, `"Hi "` became a PR title with trailing whitespace. #4 pr_create handles JSON null for source_branch. Previously `None.strip()` crashed uncaught with AttributeError (round 2 fixed this class for _resolve_repo but missed pr_create). Title required validation also added — empty/whitespace title raises ValueError with the rejected value echoed. #6 _resolve_repo slash-branch now validates `.` / `..` in either workspace OR repo segment BEFORE _get_client(). Round 3 only landed the check on the bare-slug fallback; `repo="./foo"` still surfaced as BBConfigError on a config-less machine. Parametrized regression test added. #9 git_recent_commits ref normalisation handles None and whitespace-only uniformly. `(ref or "").strip() or "HEAD"` — None and " " both fall back to HEAD, consistent with the empty-string success path. Test hermeticity: #12 _reset_state fixture now scrubs BB_USER / BB_TOKEN / BB_WORKSPACE / BB_API_BASE before each test. A developer running pytest locally with these exported in their shell no longer leaks ambient config into the tests — the suite is hermetic against dev env. Deferred (carried forward with rationale): #5 Identifier-threading inconsistency on remaining tools (pipeline_trigger / branches_list / branch_show / commits_list / repos_list / repo_show / vars_list / downloads_list). Real but stylistic; pick this up in a focused symmetry pass. #7 Stderr redaction proactive defense for future git wrappers — addressed structurally by routing stderr through _safe_text in #1 above (so technically not deferred). #10 _client_cache invalidation. Architectural; documented in earlier commit messages, defer. #11 pr_approve/pr_unapprove key naming inconsistency. Pure naming; intentional descriptive names per round-1 #9 deferral. #13 _resolve_repo error path lacks workspace/repo/path. Style; bundle with #5 in a symmetry pass. Test count: 356 -> 360. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.7: bash parity — 5 new commands + 8 parity bugfixes Extends bb with the commands the Python ops exposed but bash didn't, and fixes the parity bugs Python tests surfaced. Also removes a personal-data leak in the help text. NEW COMMANDS bb pr-unapprove [repo] <id> DELETE /pullrequests/{id}/approve. bb pr-comment [repo] <id> <body> POST /pullrequests/{id}/comments with {"content": {"raw": "..."}}. bb branch [repo] <name> GET single branch detail; URL-encodes slashes in name (feat/widget → feat%2Fwidget). bb commits [repo] [branch] [count] GET /commits or /commits/{branch} with branch URL-encoding + count. bb whoami Resolved config + git context + /user reachability probe. NEVER echoes BB_TOKEN. PARITY BUGFIXES (a) cmd_pipeline_trigger no longer drops `variables` when `pattern` is empty. Previously the no-pattern branch built the payload WITHOUT the variables array, silently discarding any VAR=value pairs the user passed. (b) cmd_pipeline_stop captures the bb_post exit code instead of `> /dev/null`-ing it. A stop failure (already stopped, permission denied, 4xx) now surfaces with a clear error and non-zero exit instead of being masked by the unconditional success print. (c) build_number → uuid lookup bumped 50 → 100 (Bitbucket's max pagelen) in cmd_pipeline / cmd_pipeline_stop / cmd_logs. Older pipelines beyond 100 still unfindable in bash; full pagination is the Python-side improvement (2000-scan cap). (d) repo_path now validates workspace + repo at the boundary: rejects empty/whitespace, embedded `/`, `.`, and `..`. Mirrors bb_api.repo_path's Python validation so both surfaces enforce the same contract — no more silent `/repositories//foo` or `/repositories/../foo`. (e) cmd_pipeline_trigger empty-variables payload now omits the key entirely (was inconsistent: pattern-set path emitted `variables: []`, no-pattern path dropped variables). Matches Python's omit-when-empty contract. (f) cmd_pr_approve / cmd_pr_decline now check the bb_post exit code instead of discarding it — symmetric with (b). Approve / decline failures (already approved, declined PR can't be re-declined, etc.) now surface to the user. (h) cmd_pr_create omits `description` from the payload when empty/whitespace. Matches the Python omit-when-empty contract. (j) cmd_pr_diff now uses `curl -sfL` (was `-sf`) so a future redirect on this endpoint doesn't fail silently. `curl -L -u` does NOT resend credentials to a different host by default, so the cross-host credential-leak concern is already mitigated by curl's own semantics — no manual auth-stripping required here. Removed personal-data leak: help text previously named a specific personal workspace as the default. Generic example only now. DEFERRED (g) cmd_pr_create hardcodes close_source_branch=true. Both surfaces share this default; exposing as a flag would be a coordinated bash + Python change. Defer to a future PR. (i) cmd_pr_merge uses PUT; modern Bitbucket REST docs use POST. Bitbucket accepts both for this endpoint and bash's PUT is the verified-working contract. Python also uses PUT for parity. Coordinated investigation deferred. (k) cmd_pr_merge unhashable-strategy raises TypeError instead of ValueError. n/a in bash — bash doesn't enforce types this way. (l) Bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters. Agent gets richer access via the MCP surface; bash retains its simpler human-driven shape. Defer. Smoke tests run locally: - bash -n bb (clean) - bb help (all 5 new commands appear) - Python suite: 360/360 still green DISPATCHER Added case-statement entries for: pr-unapprove (alias `prua`), pr-comment, branch, commits, whoami. Help text reorganised to show the new commands grouped with their family. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 review fixes (security + real bugs in new code) Round 1 review surfaced 15 findings. Folding 8 — the critical-and-mine items plus a few cheap symmetries; deferring 7 pre-existing items that weren't directly touched. Security: #1 cmd_whoami strips `user:token@` from origin URL before echoing. `git remote get-url origin` returns the raw URL including any embedded credentials (a common pattern for token-based git auth). whoami output is exactly the kind of thing users paste into bug reports and screenshots — can't have secrets in it. Matches the `_redact_url` helper in mcp_server.py / git_ops.py. #6 cmd_whoami probes `/repositories/{workspace}?pagelen=1` instead of `/user`. Atlassian's workspace-scoped tokens (the modern recommended type) reject /user with 401/403 while still serving /repositories/{workspace} correctly — the old probe would have told users with valid tokens to rotate them. Real bugs in code this PR added or touched: #2 cmd_watch pagelen bumped 50 → 100. I missed this when bumping cmd_pipeline / cmd_pipeline_stop / cmd_logs in the same PR. A pipeline at positions 51-100 in the recent list would never match here and the watch loop would spin forever printing blanks. #3 cmd_pipeline_trigger variables built via jq instead of string concatenation. The previous shape produced invalid JSON when any value contained `"`, `\`, newline, or tab — the `jq -n --argjson vars` would then refuse to parse and abort after the user-visible `Triggering pipeline...` had already printed. JSON injection vector for any agent-supplied variable value. #4 cmd_pipeline_trigger `shift 3 || true` no longer masks the under-3-args case. Previously `bb trigger myrepo` (1 arg) left "myrepo" in $@ and the var-loop parsed it as a VAR=VALUE pair, sending `{"key":"myrepo","value":"myrepo"}` as a pipeline variable. Now guards on `$# -ge 3` and explicitly shifts the available count. #15 cmd_pipeline_trigger now rejects detached HEAD. `git rev-parse --abbrev-ref HEAD` exits 0 with the literal "HEAD" in detached state, so the `|| echo "main"` fallback didn't fire. Bitbucket 400s on `ref_name=HEAD`. Clean local error instead. Validation parity (symmetric with Python): #5 repo_path whitespace check now uses `[[:space:]]+` regex instead of `${var// }` (which stripped only ASCII space). Tabs / newlines / CRs now correctly rejected — true parity with Python's `.strip()`. #9 repo_path now uses `kill -TERM $$` on validation failure instead of `exit 1`. `exit 1` inside a `$(repo_path ...)` only terminates the subshell; the caller proceeded with an empty path and ate a confusing layered failure. `kill -TERM $$` terminates the parent script so validation actually halts. #7 + #14 cmd_pr_merge captures exit code AND validates strategy. Mirror of the (b)/(f) capture pattern this PR established for cmd_pr_approve / cmd_pr_unapprove / cmd_pr_decline / cmd_pipeline_stop. A merge conflict / failing-required-build / typo-strategy now surfaces with a labelled error including likely causes instead of `set -e` silently aborting. Deferred (pre-existing, not touched by this PR's scope): #8 cmd_pr_comment_add stdin fallback — nice-to-have parity with cmd_pr_create's pattern. Defer. #10 build_number numeric validation in cmd_pipeline / cmd_logs / cmd_pipeline_stop — pre-existing. Defer. #11 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #12 cmd_pr_diff curl failures print nothing — pre-existing asymmetry. Defer. #13 cmd_commits count validation — minor. Defer. Smoke tests: bash -n bb clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 round-2 review fixes Round 2 caught a regression I introduced in round 1 + several real items the first pass missed. Critical: #1 cmd_whoami origin URL fallback now actually fires when git fails. My round-1 fix piped through sed for credential redaction but the `||` attached to sed (which always exits 0 on empty input) instead of git. So `bb whoami` in a non-git directory printed `Origin:` blank instead of the promised `(no origin remote)` placeholder. Refactored to an `if remote=$(...); then ... else ... fi` block so the fallback branches on git's exit status. #2 cmd_pipeline_trigger variables parser switched to NUL delimiter (was newline). My round-1 fix replaced unsafe string-concat with `printf '%s\n' "$@" | jq -Rs 'split("\n") ...'` — but that still fragments any VAR=value where value contains a newline. `VAR=$'line1\nline2'` became `{VAR:line1}` plus a ghost `{line2:""}`. NUL is the safe delimiter; printf's `%s\0` + jq's `split(" ")` round-trip cleanly. #3 SECURITY: build_number now validated numeric in every command that splices it into a jq filter (cmd_pipeline, cmd_watch, cmd_logs, cmd_pipeline_stop). The previous raw interpolation let jq treat a non-numeric value as an undefined-function reference (typo path) OR — confirmed by the reviewer — as a jq-injection vector via `$ENV` exfil (e.g. `1) | $ENV.BB_TOKEN, .uuid` would land the token in pipeline_uuid → next URL path). New `_require_build_number` helper enforces `^[0-9]+$` at the boundary; rejected input never reaches jq. #4 cmd_pipeline_trigger `Variables: $*` echo now masks values. Previously emitted the user's raw VAR=value args to stdout — secrets in terminal scrollback, CI logs, shell-history wrappers, `bb trigger ... | tee` outputs. Now prints `KEY=***` for each variable. Contradicted the "NEVER echoes BB_TOKEN" claim in cmd_whoami. #8 repo_path whitespace check now catches MIXED whitespace, not just all-whitespace. My round-1 `^[[:space:]]+$` regex let `' acme '` through; `tr -d '[:space:]'` + equality check now rejects any value that differs from its stripped form. True parity with Python's `.strip()`. Symmetry (rc-capture pattern applied to commands the first round missed): #5 cmd_pr_create now captures the bb_post exit code. A 400 (dest branch typo, duplicate PR open, source not pushed) surfaces as a labelled error with likely causes instead of `set -e` silently aborting after the "Creating PR:" banner. #6 cmd_pipeline_trigger now captures the bb_post exit code with the same labelled-error pattern. Common 4xx causes (protected branch, custom-pipeline-not-found, invalid variable shape) spelled out. Deferred (pre-existing, narrow): #7 cmd_branch_show / cmd_commits print banner before fetch. Same pattern as #5/#6 but on read commands where the silent failure has less consequence; the curl exit code does at least surface non-zero. Defer to a follow-up if needed. #9 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #10 cmd_pr_diff curl-silent failure. Pre-existing. Defer. #11 cmd_commits count not validated. Minor. Defer. Smoke tests: bash -n clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…, XDG) (#9) The previous _VENV_DIR was /tmp/bbenv. /tmp gets wiped at every reboot, which means the ~30s bootstrap (pip install mcp) re-ran on every boot the user launched the MCP server. The new location follows the XDG Base Directory spec and matches the pattern used by zenhub-cli (`~/.local/share/zenhub-cli/venv`). Changes: _xdg_data_home() — new helper returning the resolved XDG data dir, honouring XDG_DATA_HOME when set and falling back to ~/.local/share. _VENV_DIR — now `_xdg_data_home() / "bitbucket-cli" / "venv"`. Resolved at module-import time so the path is pinned for the rest of the bootstrap. _bootstrap_venv — now `mkdir -p` the parent directory before calling `python -m venv` (on a fresh machine ~/.local/share/bitbucket-cli may not exist yet; python -m venv doesn't create intermediates). Docstrings + inline comments — updated all references that called out /tmp wipe behaviour. The "resolve sys.prefix to handle symlinks" comment now mentions the path-quirk reasoning generically rather than the specific /tmp -> /private/tmp case. Module-header env-var list — added XDG_DATA_HOME as an override knob. Tests added (TestVenvLocation): - test_venv_dir_is_under_xdg_data_home: path ends in bitbucket-cli/venv AND is NOT under /tmp (regression guard against reverting). - test_xdg_data_home_env_var_honoured: explicit XDG_DATA_HOME= routes the venv to the override location. - test_xdg_data_home_default_under_home: unset XDG_DATA_HOME falls back to ~/.local/share. - test_venv_ready_sentinel_inside_venv_dir: the .bbenv-ready sentinel lives inside the venv dir so `rm -rf $venv_dir` also removes the sentinel (otherwise a stale sentinel could survive and claim a missing venv is ready). Test count: 360 -> 364. All passing locally. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…DE.md refresh - agents/bitbucket.md: 245-line generic agent definition documenting the 30-tool surface, operating principles (resolve-git-context-first, show-diffs-before- destructive-ops, parity discipline), generic worked examples, and the "Extending bb itself" workflow. No personal data, no project-specific references — installable as a user-scope agent in any environment. - README.md: social-preview header, Python badge, Python 3.10+ requirement, and a new "MCP server" section with install instructions for Claude Code (both .mcp.json and `claude mcp add` forms), generic stdio-MCP-client guidance, the 30-tool surface table, and the security posture summary. - CLAUDE.md: replace bash-only project description with the dual-implementation reality (bash CLI + Python MCP server), document the parity rule, MCP-tool workflow, security posture (token never echoed, URL credential redaction, cross-host Authorization stripping, KEY=*** variable masking). - docs/img/social-preview.png: 1280×640 OSS social preview asset. - .gitignore: add .claude/ (local Claude Code workspace artifact).

- agents/bitbucket.md:19 — clarify pipeline-scan depth (single 100-pipeline page for bash, 2000-pipeline scan up to 20 pages via MCP); the old "100-page" phrasing was ambiguous. - README.md (MCP server section) — link "clone the repo" to the Installation section so readers who jump straight to the MCP section know where the clone steps live. - bb_ops.py:50–56 — refresh _PIPELINE_SCAN_LIMIT comment: bash uses pagelen=100 (not the old "50-pipeline page" claim), and drop the stale 4.7 follow-up note.

- agents/bitbucket.md: drop incorrect "(capped at 1 MiB)" note on pr_diff and the parallel "logs > 1 MiB the redirect-fetch helper truncates" claim on pipeline_logs. Neither path is byte-capped — the 1 MiB cap is exclusive to git_uncommitted_changes. Replace with the actual large-payload failure mode (BBApiError on timeout mid-read; bump timeout=). - agents/bitbucket.md (commits_list bullet): use "branch omitted (or '')" instead of "branch=None"; the MCP tool's JSON schema is string-typed, not string-or-null, so a literal null over the wire would be rejected. - README.md (MCP install): clarify the XDG_DATA_HOME override — it relocates the parent directory, not the entire venv path. Spell out the resulting path ($XDG_DATA_HOME/bitbucket-cli/venv) to match CLAUDE.md's phrasing.

The agent doc + README both promised whoami performs a "workspace- reachability check" / "connectivity smoke test", and bash cmd_whoami already did — but Python whoami only loaded config and read git context locally, never hitting the API. An invalid BB_TOKEN (expired / revoked / wrong account) still returned {"ok": True}, and an agent using whoami as a pre-flight check before a destructive op would proceed unaware. Add Phase 3 to whoami: a single low-cost GET against the configured workspace endpoint (/repositories/{workspace}?pagelen=1). Surface the result as out["auth"] = {"ok": True} or {"ok": False, ...full error dict}. Best-effort — failure does not flip outer ok=False (config + git context are still useful when debugging a stale token). Targets /repositories/{workspace} not /user because Atlassian's workspace-scoped tokens (now the recommended shape) reject /user with 401/403 while serving the workspace endpoint correctly. Matches bash's choice for the same reason. Tests: - Existing whoami tests were not hermetic — they were hitting api.bitbucket.org for real (the BBClient stub had a real opener; before this change whoami never called .get() so it didn't matter). Replaced with explicit client.get patches that record the call. - New: auth probe success path asserts {ok: True} + correct endpoint + correct pagelen=1 query. - New: auth probe failure path (BBApiError 401) asserts out["ok"] stays True, out["auth"]["ok"] is False, and the BB_TOKEN is not leaked into any error field. - New: auth probe URL-encodes workspaces containing `/` (parity with bb_ops.py's quote-with-safe='' pattern). - Hardened test_config_error_flips_ok_false: when no client could be constructed, out["auth"] must be absent (never let a None-deref regress in).

Finding #1 (real regression I introduced in round-3): the modified test_handles_git_failure_gracefully weakened a strict two-assertion test ("git_branch_error" AND "git_remote_error" both present) to a trivially-satisfied OR with a misleading comment about _default_repo_path raising — a scenario the test never triggers because the autouse fixture sets BB_DEFAULT_REPO_PATH. Restored the strict assertions and added a separate test_cwd_error_skips_git_probes that actually monkeypatches _default_repo_path to raise and proves both git probes are skipped (with assertion-trap stubs that would fail loudly on a regression). Finding #2 (UX polish): whoami's Phase 3 probe was using the BBClient 30s default timeout. Tightened to 10s — a workspace-listing endpoint taking >10s is itself a degraded-connectivity signal, and whoami should fail fast rather than block. Finding #3 (doc accuracy): probe-path docs framed /repositories/{workspace} as authoritative, but the path requires repository:read scope. A workspace-scoped token granting only pipelines:read or pullrequest:read will report auth.ok=False even though those tools still work. Surface the trade-off in the Python docstring, agent file, README, AND bash cmd_whoami (parity) — so neither agents nor humans mis-read the probe as a global credential verdict. Tests: 367 passing (was 366; +1 for the new cwd_error coverage).

… pattern The previous MCP section bundled tools, install, and agent install into overlapping subsections — readers had to scroll back and forth to figure out the two-part install (MCP server registration vs agent file copy). Restructure to mirror the zenhub-cli pattern that this project is the sibling of: - Lead with "What it exposes" (the 30-tool table) so readers know what they're buying before being asked to install. - Collapse the per-client install variants into the canonical `claude mcp add --scope user bitbucket ...` form. Drop the .mcp.json / per-server env-var block — bb_api reads ~/.config/bb/config automatically, so the registration only needs the python + script path, matching the zh pattern. - Add an "Environment overrides" table covering BB_DEFAULT_REPO_PATH and XDG_DATA_HOME so the venv-relocation knob is discoverable. - Add a new "Optional: install the bundled bitbucket agent" subsection with: - Explicit framing that agents/bitbucket.md is a generic template, not a personalized config. - Hard rule against contributing personalizations back upstream — workspace slugs, reviewer handles, custom-pipeline patterns belong only in the user's ~/.claude/agents/ copy. - The two-step install (mkdir + cp), the customization checklist pointer, and the no-restart-needed footnote. - A comparison table of "with agent vs without agent" enforcement (propose-first for destructive ops, auto-detect source branch, pipeline triage order, etc.) so the value of the behavioral layer is concrete rather than abstract. Brings bb to docs-parity with zenhub-cli's install flow.

- README: replace hardcoded /usr/bin/python3 with bare python3 in the Installation block and "Other MCP clients" paragraph. Apple's bundled /usr/bin/python3 on macOS 14+/15 is 3.9, below the MCP server's 3.10 minimum; bare python3 lets `claude mcp add`'s inherited PATH find Homebrew/pyenv. Add an inline comment in the install block explaining why the bare form is correct, and a parenthetical in the "Other MCP clients" paragraph documenting both the version floor and the macOS Python-path gotcha. - README: rewrite the "What the agent enforces" comparison table. The prior table over-attributed two behaviors (source-branch auto-detect on pr_create, repo auto-detect via _resolve_repo) to the agent — both are tool-level features that work without the agent. Replace those rows with what the agent actually adds on top ("avoid redundant probes — let tool-level auto-detect carry the call") and lead the table with one explicit sentence acknowledging that the tools do per-call auto-detection on their own. - README: clarify BB_DEFAULT_REPO_PATH covers BOTH Bitbucket-tool repo auto-detection AND the git_* tools' default path= resolution. Note the default (the MCP server's launch cwd) explicitly. - README: clients that strip HOME from the subprocess environment need to pass BB_* env vars explicitly; ~/.config/bb/config requires HOME-expansion. One-liner in the "Other MCP clients" paragraph. - README: split the dense whoami cell in the tool table into a one-line table entry plus a footnote below the table — the parenthetical was crammed with timeout + redaction + scope-quirk notes that didn't render well in markdown table cells. - README + agents/bitbucket.md: trivial "10 s" / "10s" spacing consistency — use "10 s" (matches the Python docstring in mcp_server.py). 367 pytest tests still pass (docs-only changes; no behavior touched).

Real regressions I introduced in round-1: - Restore the dropped [Model Context Protocol] link in the opening sentence (was: "A Python MCP server"; now: "A Python [Model Context Protocol] server"). Lost during the restructure; first-time readers who don't know what MCP stands for lose the entry point. - Fix the clean-rebuild rm command to respect XDG_DATA_HOME: was `rm -rf ~/.local/share/bitbucket-cli/venv`, now `rm -rf "${XDG_DATA_HOME:-$HOME/.local/share}/bitbucket-cli/venv"`. The old form silently no-ops on systems with XDG_DATA_HOME set non-standardly. - Re-document the multi-workspace `--env` pattern that the restructure dropped. New callout under the install block shows how to register bitbucket-work / bitbucket-personal etc. with per-server credentials. The single-workspace ~/.config/bb/config path remains the default. UX correctness: - Add explicit credential-sanity step 5 to install (run whoami in a session) — `claude mcp list → ✓ Connected` only confirms stdio handshake, not credentials. A user who skipped step 1 sees ✓ and doesn't realize until first tool call returns 401. - Add a "← replace with your clone path" comment to the placeholder so copy-paste users don't run `claude mcp add … /absolute/path/to/…` verbatim and only discover the FileNotFoundError on first session. - Reorder section under ## MCP server: hoist Optional-agent BEFORE Security (Security stays the conventional closing subsection; agent install no longer buried after readers stop at Security). - Sync mcp_server.py module docstring: was /usr/bin/python3 + server name `bb`, now `python3` (bare) + `bitbucket` — matches new README guidance. Also adds the macOS Python-path gotcha to the docstring. - Note in install step 4 that first `claude mcp list` may show "✗ Failed to connect" transiently during venv bootstrap (5-30 s pip install). Prevents panic-uninstalls. Polish: - Replace fake `*(see footnote)*` in the whoami table cell with a real "(see note below)" — was italic prose with no anchor, would silently orphan if anything inserted between table and prose. - Rename comparison-table columns from "Without agent / With agent" (ambiguous — MCP tools are always invoked by *some* agent) to "Raw MCP tools / With bundled `bitbucket` subagent". - Restore `-- python3 /path` separator in install command (was bare `python3 /path`); keeps parsing robust if users add `--env` flags per the new multi-workspace section. - Remove duplicate macOS-Python-3.9 caveat from the "Other MCP clients" paragraph (was repeated from the install code-block). - Rename `### Installation` (MCP-section subheading) to `### MCP server install` so the GitHub slug `#mcp-server-install` is unique and doesn't collide with the top-level `## Installation` (which slugs to `#installation`). - Rewrite the "No restart needed" note to "Newly-started Claude Code sessions pick up the agent automatically. Existing sessions need a restart." — disambiguates. - Add ", or BB_DEFAULT_REPO_PATH if set" to the line-219 auto-detect description so readers see the override at first mention rather than discovering it two sections down. 367 pytest tests still pass.

The empty-results branch at bb:611 used ${state,,} (bash 4.0+ lowercase substitution), which fails under macOS system bash 3.2 (still default at /bin/bash on Sequoia / Sonoma) with: /opt/homebrew/bin/bb: line 611: No ${state,,} pull requests.: bad substitution Triggers only when count==0 — repos with at least one matching PR take the printf branch and skip line 611. Discovered during v1.1.0 hands-on testing against pittmollc/pittmo_auction (0 OPEN PRs). bb:611 — replace ${state,,} with the portable `$(echo "$state" | tr '[:upper:]' '[:lower:]')`. Inline comment explains why (macOS env-bash → /bin/bash 3.2 resolution + the count==0 single branch). I briefly considered adding a hard bash-version gate at startup that aborts on bash < 4 with a "install Homebrew bash" message, but that would be a regression: macOS users would now hit the gate instead of the working command. With ${state,,} replaced and a clean grep for other bash 4+ idioms (`mapfile`, `declare -A`, `^^`, `,,?`, indirect expansion, etc.) returning nothing, bash 3.2 is a viable runtime — no gate needed. README: - Lower the bash floor from 4.0+ to 3.2+ — matches reality. The previous floor was overstated; macOS system bash works fine now. - Update the README Bash badge from `bash-4.0+` to `bash-3.2+`. - Install step 3 (symlink): show Homebrew path /opt/homebrew/bin/bb as the no-sudo macOS option, alongside the existing /usr/local/bin/bb form (which needs sudo on most macOS setups). Discovered during hands-on install testing. 367 pytest tests still pass (Python side untouched).

- README install step 3: use $(brew --prefix)/bin/bb instead of hardcoded /opt/homebrew/bin/bb. Apple Silicon Macs resolve to /opt/homebrew; Intel Macs resolve to /usr/local. The previous form silently failed for Intel Mac Homebrew users with "No such file or directory" and pushed them into a sudo path they didn't need. - README install step 3: add the zsh alternative for the rc-file PATH addition. macOS defaulted to zsh in Catalina (2019); the previous form silently no-op'd for any macOS user not on bash. - CLAUDE.md: fix bash-floor parity drift. README and badge are now 3.2+; CLAUDE.md still said 4.0+, which would have set up future contributors (and future Claude sessions) to reach for bash-4-only idioms and re-introduce the exact regression PR #12 fixes. - bb:611: switch from echo to printf '%s' for the lowercase pipe input. echo would interpret -n / -e / -E as flags if state ever starts with a dash — not a real risk for Bitbucket states (OPEN/MERGED/DECLINED/SUPERSEDED) but bulletproof and only marginally longer. Inline comment explains the swap. Deferred to task #29 (v1.1.x or v1.2.0): - CI guard for the new bash-3.2 floor (Docker bash:3.2 step) so the exact bug PR #12 fixes can't re-ship past CI. Currently CI is bash 5.x on ubuntu-latest, and `bash -n` doesn't catch portability bugs. - Allowlist validation for `state` arg at bb:605 (mirror Python's _PR_STATES set) — query-param injection surface widened by this diff, low real risk but parity with how build_number is validated. Verified: `/bin/bash bb -w pittmollc prs` in an empty-PR repo prints "No open pull requests." cleanly under macOS system bash 3.2. 367 pytest tests still pass.

claude · 2026-05-27T23:42:18Z

Claude finished @daniel-pittman's task in 13m 14s —— View job

Code Review — PR #14 (v1.1.0)

Effort: extra-high (recall mode). 5 finder angles × up to 8 candidates each → verified against source → top 15 below, ranked by severity. The base ref (pre-v1.1.0) isn't fetched in the local clone, so I reviewed the HEAD state of the new/modified files directly. PR is already MERGED, so these are notes for a follow-up patch (most fit the v1.2.0 epic).

1. `mcp_server.py:801-813` — `pr_create` auto-detects branch from cwd, ignoring the explicit `repo` arg

When source_branch is empty, the wrapper calls git_ops.git_current_branch(path=_default_repo_path()) — _default_repo_path() reads BB_DEFAULT_REPO_PATH or os.getcwd(). It does not consult the repo argument. If an agent calls pr_create(title="...", repo="acme/foo") while the MCP server's cwd sits in a different repo acme/bar, the branch comes from bar, and the PR opens against foo with whatever branch happens to be checked out in bar. The PR description even calls this out as fixed (with detached-HEAD rejection), but the wrong-repo case is unaddressed. Severity: high (silent correctness bug, agent-visible).

2. `bb:362, 400, 408` — `cmd_logs` interpolates user-supplied `step_index` into a jq filter with no `_require_*` validator

_require_build_number was added explicitly to close the jq filter-injection vector (the in-source comment at bb:153-159 even cites $ENV.BB_TOKEN exfil as the motivating threat). step_index shares the same shape and lands in the same filter — .values[${step_index}].uuid — with only a [[ -z "$step_index" ]] check. A crafted value like 0] | $ENV.BB_TOKEN, .values[0 injects into the jq program. Severity: high (same threat class the related guard exists to close). A parallel _require_step_index (or a single _require_non_negative_int for both) closes it.

3. `bb:1211` — `cmd_whoami` origin-URL redaction regex stops at first `@`

sed -E 's#://[^/@]+@#://[redacted]@#'

The character class excludes @, so the match ends at the first @. For https://x-token-auth:my@secret@bitbucket.org/acme/foo the substitution yields https://[redacted]@secret@bitbucket.org/acme/foo — the @secret tail leaks. The Python redactor (mcp_server.py:392, git_ops.py:47) uses [^/]+@ (greedy, no @ exclusion) which redacts correctly. Direct violation of CLAUDE.md security claim #2 and a parity drift between bash and Python. Severity: high (security claim breach, simple regex fix). Replace [^/@]+ with [^/]+.

4. `bb:162` — `${1!r:-empty}` is Python f-string syntax, not bash

echo "Error: build_number must be a positive integer (got ${1!r:-empty})." >&2

Bash parses ${1!r...} as parameter indirection on a name 1!r, not a Python-style repr conversion. Under set -u (set at bb:21) the expansion fails, replacing the intended diagnostic with a bad substitution-style bash error. Replace with ${1:-empty} or '${1//\'/\'\\\'\'}'. Severity: medium (the curated error path is broken; users see bash internals instead of the message).

5. `bb:438-456` — `cmd_pipeline_trigger` claims NUL-delimited args but `jq` splits on space

variables=$(printf '%s\0' "$@" | jq -Rs '
    split(" ")
    | map(select(length > 0))
    | map(split("=") | { key: .[0], value: (.[1:] | join("=")) })
')

The comment at bb:438-442 explicitly states "jq's split(" ") on a NUL-delimited stream sidesteps that" — but split(" ") splits on space, not NUL. For bb trigger repo main "" FOO=bar BAZ=qux, jq receives "FOO=barBAZ=qux", split(" ") returns a single element, split("=") then splits at the first = and the value becomes "barBAZ=qux" — the second variable is silently dropped and the first's value is poisoned with NUL plus the second's literal key=value. Severity: high (functional bug: multi-variable triggers don't work as documented). Fix: split("") instead of split(" ").

6. `bb_ops.py:333` and `bb_ops.py:631` — `pipeline_logs` / `pr_diff` return unbounded body strings

Both pass through client.fetch_redirected_text(...) and return the full body as a Python str straight into the MCP JSON response. There is no cap analogous to git_ops.git_uncommitted_changes's explicit 1 MiB limit. A multi-GB CI log or a PR touching a vendored lockfile will materialise entirely in the MCP server's heap and again in the JSON encoder before the agent ever sees it. Severity: medium-high (OOM / FastMCP transport stalls on a hostile-but-legal Bitbucket payload). Add a max_bytes truncation with a marker, mirroring _MAX_DIFF_BYTES.

7. `mcp_server.py:166-192` — `_bootstrap_venv` has no concurrency guard

Two MCP launches racing on first install (two desktop windows, parent + child Claude process) both observe _VENV_READY.exists() → False, both proceed past the venv-create idempotency check (line 180), both run pip install --upgrade pip and pip install mcp into the same site-packages, interleaving file writes. Result: half-written .dist-info/RECORD, partial wheel extraction, or duplicate metadata → every subsequent launch fails with an ImportError. Severity: medium (rare in practice; recovery requires rm -rf of the venv dir, not just the sentinel). A simple fcntl.flock over _VENV_DIR.parent / ".bootstrap.lock" makes this race-free.

8. `bb:289, 384` — bash uses `pagelen=50` on `/steps/` while Python paginates fully

steps=$(bb_get "$(repo_path "$repo")/pipelines/%7B${pipeline_uuid}%7D/steps/?pagelen=50")

Python's _pipeline_steps_by_uuid paginates with pagelen=_BITBUCKET_MAX_PAGELEN=100. A pipeline with 60 sequential steps cannot be addressed by step index ≥ 50 through bash — bb logs myrepo 42 55 silently reports "Step index 55 not found." even though MCP can fetch it. Severity: medium (parity gap, lost data without notice). Bump to 100 to match the rest of the bash file, or paginate.

9. `bb:545-554` — `cmd_pipeline_stop` scans only 100 recent pipelines vs Python's 2000

The in-source comment at bb:545-549 acknowledges this as a "partial parity step." For a long-running build at position 101+ in the recent list, bb stop reports "not found" while mcp pipeline_stop succeeds. Severity: medium (active parity divergence agents will surface — the user reports MCP can do something bb cannot).

10. `bb:213, 605, 916, 981, 1041, 1126, 1163` — `count` interpolated into `pagelen` URL without shape validation

The Python side enforces _is_positive_int at every count boundary. Bash interpolates raw — bb pipelines myrepo "10&q=state%3D%22DECLINED%22" injects an extra Bitbucket query parameter and alters which pipelines are returned. Severity: low-medium (UX foot-gun; not a credential leak but breaks the bash/Python contract). A _require_positive_int helper analogous to _require_build_number closes it once and reuses across cmd_*.

11. `bb:605` — `cmd_pr_list` doesn't validate `state` against the known set

Python's bb_ops.prs_list validates against {OPEN, MERGED, DECLINED, SUPERSEDED} (_KNOWN_PR_STATES). Bash interpolates whatever the user typed: bb prs myrepo OPENED sends ?state=OPENED&pagelen=25 and gets an empty result with no hint about the typo. Severity: low-medium (parity gap; the same value would be rejected locally on the MCP side).

12. `bb:459-467` and `bb_ops.py:261-262` — `pipeline_trigger` HEAD-rejection is asymmetric and incomplete

Bash rejects branch=='HEAD' only inside the if [[ -z "$branch" ]] auto-detect block — an explicit bb trigger myrepo HEAD slips past. Python's bb_ops.pipeline_trigger has no HEAD guard at all (only pr_create checks source_branch.strip() == "HEAD" at the MCP wrapper level). pr_create defends symmetrically across both surfaces; pipeline_trigger does not. Severity: low-medium (Bitbucket usually 400s but may accept the ref literally, producing a degenerate pipeline labelled "HEAD").

13. `bb_ops.py:473-490` — `pr_create` validates `not value.strip()` but uses the raw, untrimmed value in the payload

The check at line 473 only rejects whitespace-only strings. A leading/trailing space slips through into {"source":{"branch":{"name":source_branch}}}. The MCP wrapper (mcp_server.py:801) strips defensively, but any direct bb_ops caller (tests, future scripts) posts a request with the untrimmed branch name and gets an opaque Bitbucket 404. Severity: low (validator-vs-payload asymmetry; either reassign value = value.strip() before the payload build, or document that callers must pre-strip). Same pattern in pr_merge.message and pr_comment_add.body.

14. `mcp_server.py:666` — `pipeline_trigger` exception handler uses `_error_dict(e)` not `_error_dict_with(e, branch=...)`

Every other write-side tool threads a correlation ID into the error envelope (pipeline_stop → number, pr_create → title, pr_approve → pr_id). An agent fanning out parallel triggers across many branches cannot route failures back to the originating branch. Severity: low (observability gap; the rest of the surface is consistent). Change to return _error_dict_with(e, branch=normalised_branch).

15. `bb_api.py:223-231` — `detect_repo` shells out to git without subprocess hardening

result = runner.run(
    ["git", "remote", "get-url", "origin"],
    capture_output=True, text=True, cwd=cwd, check=False,
)

No env={"GIT_TERMINAL_PROMPT": "0", "GIT_ASKPASS": ""}, no stdin=DEVNULL, no timeout=. CLAUDE.md security claim #6 asserts every git subprocess call sets these — this one doesn't. MCP itself routes through git_ops.git_remote_repo (hardened), so the MCP server isn't directly exposed today, but bb_api.detect_repo is a documented public parity entry point. Severity: low (the hardening lives one layer deeper than the docs imply; any third-party importer of bb_api hits the hang risk). Mirror git_ops._run_git's env/stdin/timeout.

Sweep notes (gaps not in the top 15)

bb_api.py:651 — pagination cursor-stuck check (if next_url == last_next) only catches 1-step repeats, not 2-cycles. MAX_PAGINATION_ITERATIONS=200 is the real backstop.
bb_api.py:391 — urlencode(cleaned, doseq=True) serialises Python bool as True/False (capitalised). Bitbucket expects lowercase. No current caller passes a bool, but _validate_query_value allows bool through the scalar gate.
bb_ops.py:272-279 — pipeline_trigger accepts variables=Iterable. A bare string is Iterable[str] (yields chars), producing a confusing TypeError mid-loop rather than the explicit "must be a dict / list of pairs" error pr_create raises for reviewers.
mcp_server.py:535-549 (_TOOL_EXPECTED_EXCEPTIONS) — excludes TypeError. _validate_query_value raises TypeError; if any future query arg slips through bb_ops' boundary checks, the exception escapes the _error_dict redaction chokepoint.
bb:510 — variable-masking sed substitution s/=.*$/=***/ splits on \n. A secret containing an embedded newline puts subsequent lines through without an = and they pass unmasked. Theoretical (Bitbucket variables rarely contain newlines) but the masking is meant to be defence-in-depth.
git_ops.py:173 — error f-string {getattr(e, 'filename', cwd) or cwd!r} applies !r only to the fallback, so the same path renders quoted or unquoted depending on which branch fires.

What I didn't post inline

The gh CLI isn't pre-authorised in this session, and no mcp__github_inline_comment tool is registered, so the --comment flag falls back to a single consolidated review here. If you want any of these promoted to an inline thread anchored to the file/line, ping me and I'll do it manually.

Load tools and parse trigger
Run 5 finder angles (line-scan, parity, cross-file, language pitfalls, security/wrapper)
Verify top candidates against actual source
Rank top 15 by severity and post review
· branch claude/pr-14-20260527-2342

* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops function into a FastMCP tool registry, with a self-bootstrap venv mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total. mcp_server.py — 30 tools: PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps, pipeline_trigger, pipeline_stop, pipeline_logs PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create, pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff, pr_comments_list, pr_comment_add REPOS / METADATA (7): repos_list, repo_show, branches_list, branch_show, vars_list, downloads_list, commits_list GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo, git_recent_commits, git_uncommitted_changes META (1): whoami Self-bootstrap pattern (mirrors zenhub-cli): - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10 (preferring the launching interpreter, falling back to pyenv / Homebrew / system locations), runs `python -m venv`, installs the `mcp` package, then `os.execv`s into the venv. - Bootstrap is stdlib-only (uses subprocess + sys, no third-party imports until after the venv is active). Means any python3 on PATH can launch the server. - `mcp` is the only venv dep — no heavy ML libs (no torch / sentence-transformers / numpy / huggingface_hub). bb doesn't have semantic-similarity workflows like zenhub-cli does. - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets pytest exercise tool wiring and result-dict shapes without pulling in mcp. Production NEVER sets this — without real FastMCP, .run() raises. Repo resolution — every Bitbucket tool takes an optional `repo`: - "" → auto-detect via `git remote get-url origin` from BB_DEFAULT_REPO_PATH (or cwd) - "myrepo" → use BB_WORKSPACE from config + "myrepo" - "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides BB_WORKSPACE for this call) Validated in `_resolve_repo` with ValueError on malformed input ("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network call burns API budget). Error envelope — every tool returns either: {"ok": True, "workspace": ..., "repo": ..., <result fields>} {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>} The error dict carries `status`/`url`/`body` for BBApiError, `returncode`/`stderr` for GitOpError, so the agent can branch on `kind == "BBApiError" and status == 404` without parsing the message. pr_create auto-detects source_branch via git_current_branch when empty (matches bash `bb pr-create`). whoami reports user / workspace / api_base + best-effort git context (branch, workspace, repo from remote). Token is NEVER echoed; the test explicitly asserts the token string is absent from the response. Test discipline (same as bb_api / bb_ops / git_ops): - Tool count + names pinned exactly (test_all_expected_tools_registered) — a future rename or accidental drop fails the suite. - Per-tool wiring: patch the bb_ops / git_ops function at the module level, call the tool, assert (a) the underlying function received the right args, (b) the response dict has the expected shape. Doesn't re-test bb_ops logic (already covered comprehensively). - Error envelope tested for each exception class (BBApiError, BBOpNotFound, GitOpError, ValueError). - whoami's no-token guarantee asserted as a security property. - FastMCP stub's .run() raises a clear error so a test that accidentally calls it fails loud instead of hanging on stdio. CI / packaging: pyproject.toml py-modules += "mcp_server". ci.yml syntax-check += "mcp_server.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 review fixes Round 1 review on PR #7 flagged 15 findings, with two explicitly called out as production hazards ("will bite real users on first deploy on common Linux topologies"). Addressed 11 of 15; deferring 4 with rationale. Production hazards (must-fix per reviewer): #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)` instead of `realpath(sys.executable) == realpath(_VENV_PY)`. `python -m venv` on Linux/macOS defaults to --symlinks, so realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical path as the builder interpreter (e.g. /usr/bin/python3.12) — the realpath comparison was claiming "already under venv" while we were actually still under the system interpreter, skipping the execv and dying on `from mcp.server.fastmcp import FastMCP`. sys.prefix is set per-interpreter from the venv layout and is the authoritative signal. #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C, OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing. The old guard skipped reinstall on every subsequent launch and died on import with no self-repair path. Now writes a `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds; the venv create step is also idempotent (only runs if _VENV_PY is absent) so retries pick up where they left off. #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned "mcp" would silently install a future breaking mcp 2.x on the next /tmp wipe. Matches the pyproject [mcp] extra. Correctness: #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch returns the literal "HEAD" for detached/unborn state; bb_ops only checks non-empty/non-whitespace, so the call would have reached Bitbucket with source.branch.name="HEAD" and surfaced as an opaque API error. Now raises ValueError locally with a clear "pass source_branch=" hint. #5 _resolve_repo strips whitespace before parsing. A sloppy paste like " acme/widget " no longer slips through as workspace=" acme" and fails deep in the API layer. #11 pr_create source_branch auto-detect now triggers on whitespace (not just empty string), symmetric with #5. #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths git_ops._run_git doesn't wrap explicitly (IsADirectoryError, BlockingIOError, ConnectionResetError), and catches os.getcwd() on a deleted cwd inside _default_repo_path() which fires BEFORE any wrapped git call. Surface improvements: #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on" (newest first); agent can pass "created_on" for oldest-first or other sort keys. #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default 120s matches bb_ops; agent can bump for pipelines with very large log payloads or oversized PR diffs that would otherwise cap out. #8 git_status payload renamed from `status` to `working_tree`. The `status` key collided with the HTTP-status field _error_dict uses for BBApiError. Today the collision can't fire (git_status doesn't raise BBApiError), but the rename pre-empts a future broadening hazard. #10 branch_show echoes the stripped name in the response. bb_ops strips before encoding the URL, so the response now matches what Bitbucket actually resolved rather than the unstripped input. Deferred (with rationale): #9 Response-key naming consistency (approval/result/pr). Reviewer suggested standardizing all mutation tools to "result", but the descriptive keys carry useful semantic info — pr_approve returns an approval record, pr_merge returns the merged PR, etc. The docstrings explain the shape. Skipping. #12 _client_cache concurrency + invalidation. Real concern for long-lived servers with rotated tokens, but stdio MCP is single-threaded and token rotation is rare; defer to Phase 4.6+ if a real bug surfaces. #14 CI smoke step that actually imports the real mcp package. Larger CI workflow change; defer. #15 README MCP documentation. Explicitly Phase 4.8 scope. Test count: 335 -> 341. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-2 review fixes Round 2 surfaced 15 new findings including a real security bug. Fixed all 15. Security: #1 BBApiError.url now redacted in _error_dict. pipeline_logs and pr_diff follow Bitbucket's 307 to a presigned S3 URL via fetch_redirected_text; on a downstream non-3xx (S3 clock skew, expired signature, redirect cap), BBApiError(url=<signed S3 URL>) carried AWS-credential-bearing query parameters (X-Amz-Signature, X-Amz-Credential) verbatim into the agent error dict and downstream logs. New _redact_url helper strips `user:token@` AND replaces signed-URL query strings with `[redacted-signed-url-params]` while preserving the path so the agent knows what host was called. Correctness: #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null from an MCP client deserialises to Python None; `.strip()` on None crashes uncaught. Also pre-normalised every stringy arg path through `(value or "").strip()` so None and "" funnel to the same auto-detect path uniformly. #3 _default_repo_path no longer evaluates os.getcwd() eagerly. `os.environ.get("KEY", os.getcwd())` runs the default even when the env var is set — the override never protected against a deleted cwd. Now uses `... or os.getcwd()` so the env var is the lazy short-circuit. whoami also wraps cwd resolution in a try block so a deleted cwd surfaces as a structured cwd_error field instead of escaping the function. #4 mcp import wrapped with helpful recovery message. If `mcp/` gets deleted post-bootstrap (manual `pip uninstall`, partial /tmp cleanup that wiped the package dir but spared the sentinel), the ImportError now tells the user to `rm .bbenv-ready` or `rm -rf /tmp/bbenv`. #5 _resolve_repo now strips inner slug parts. "acme/ widget" no longer slips through as workspace="acme", slug=" widget" and 404s on `/repositories/acme/%20widget`. #6 repos_list workspace stripped. ` acme` / `acme ` no longer slip through. #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was contradicting the file's stated philosophy). A refactor that renames a bb_ops kwarg now crashes visibly during dev/test instead of being silently reported back to the agent as a fake Bitbucket failure. #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS (not narrow GitOpError) AND store the full structured _error_dict (not str(e)). An agent branching on err.returncode for git-127 vs not-a-git-repo-128 vs timeout-(-1000) has the same shape available as every other tool. #10 _opt_str() helper introduced. (branch or None) became `_opt_str(branch)` everywhere — "" / " " / None all funnel to None at the bb_ops boundary. Without this, " " on a branch filter posted `?target.ref_name=%20%20`, on a pattern raised a confusing two-layers-down ValueError, on commits raised a different "non-empty, non-whitespace" error. Inconsistent surfaces for the same agent typo. #12 sys.prefix venv check now resolves both sides through Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink doesn't trigger an infinite execv loop. #13 Every tool that takes an identifier (pr_id / number / step_index) now threads it into the error dict via _error_dict_with. An agent fanning out parallel pipeline_logs(number=42, step_index=5) and pipeline_logs(number=42, step_index=6) can correlate failures to originating requests without substring-matching the message. #14 _resolve_repo validates the repo arg shape BEFORE calling _get_client(). A fresh-machine user without config + a malformed slug now sees the actual ValueError, not the BBConfigError that previously masked the real cause. #15 git_recent_commits echoes the stripped ref in the response (matches what `git log` actually ran). Consistent with branch_show. UX: #11 pip install no longer --quiet. Now uses subprocess.run capture_output=True and re-raises with the captured stderr inlined so a network blip / mirror 503 / version yank / SSL cert error / proxy block surfaces with pip's actual diagnostic instead of bare `CalledProcessError: returned non-zero exit status 1`. Deferred: #9 Test recorder accepts any kwarg name (false sense of wiring pinned). Real concern but the fix (functools.wraps or create_autospec) is invasive across the test suite; defer to a focused refactor. Test count: 341 -> 348. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-3 review fixes (security regression + edges) Round 3 caught a real security regression of round 2's fix, plus a batch of consistency / hardening items. Security regression fixed: #1 _error_dict's `message` field now also routed through URL redaction. Round 2 redacted `out["url"]` but left `out["message"]` carrying the raw URL — BBApiError.__str__ constructs the message via `f"HTTP {status} from {url}: {body[:500]}"`, so str(e) ALWAYS contains the original URL. For the S3 presigned case (pipeline_logs / pr_diff → 307 → S3 non-3xx), the error dict was shipping: {"url": "...?[redacted-signed-url-params]", # fine "message": "HTTP 403 from https://.../?X-Amz-Signature=secret..."} ^^^^^^ secret leaked The existing test_bbapierror_redacts_signed_s3_url only asserted on d["url"], so the regression was invisible to CI. The fix adds a `_redact_message` helper that scans for `http(s)://...` substrings and runs each through `_redact_url`. Tests now assert redaction on BOTH `url` AND `message` for every cred-bearing fixture. #9 Case-insensitive signed-URL indicator matching. MinIO / R2 / Backblaze / mixed-case AWS variants would have slipped past the case-sensitive substring check. Now lowercases the query part before membership. Tests added for lowercase variant + plain `Signature=` form (used by some non-AWS S3-compatible services). Correctness: #2 pipeline_trigger now validates + strips `branch`. Was the only branch-accepting tool that forwarded the arg verbatim — every other tool funnels through _opt_str, but pipeline_trigger couldn't (branch is required, not optional). Now explicitly validates non-empty/non-whitespace and strips, so `" main "` becomes `"main"` and `" "` raises a clean local ValueError instead of POSTing whitespace and getting an opaque 4xx. #3 _resolve_repo bare-slug path now validates `repo not in (".", "..")` before _get_client(). Round 2 fixed the slash-containing branch but the bare-slug fallback still hit _get_client() first, so a fresh-machine user with `repo='.'` saw BBConfigError masking the real ValueError. Parametrized regression test added. #5 pr_create now rejects explicit `source_branch="HEAD"` (not just the auto-detect path). Same severity class as round-1 #4 — just on the other entry point. Bitbucket would have silently created a degenerate PR named after the literal HEAD ref. #6 pr_create error path now uses _error_dict_with(e, title=...). Threads the title into the error dict so parallel pr_create fan-outs (e.g. agent creating one PR per stacked branch in a train) can correlate failures with originating requests. Defensive: #10 os.execv argv now uses Path(__file__).resolve(). Today's documented launch pattern uses absolute paths so the issue can't fire; this hardens against a future bootstrap enhancement or sitecustomize.py that introduces a chdir between launch and execv. Deferred (carried forward with rationale): #4 _recorder accepts any kwarg name (false sense of wiring pinned). Fixing requires either create_autospec or inspect.signature.bind across the test suite — bigger refactor than this commit. Notable enough to track separately. #7 _client_cache invalidation. Long-lived MCP holds the original BBClient for its lifetime; rotating BB_TOKEN requires restart. Architectural, defer. #8 git_recent_commits silently defaults None ref to "HEAD". Cosmetic UX, defer. Test count: 348 -> 356. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-4 review fixes (structural redaction) Round 4 caught a third iteration of the same credential-leak pattern. This commit closes it structurally so the class can't regress. Critical security fix (structural): #1 _error_dict now routes EVERY string field through _safe_text: message, body (BBApiError), stderr (GitOpError). The previous rounds whack-a-mole'd one field at a time: - Round 2: redacted `url` → message still leaked - Round 3: redacted `message` → body still leaked - Round 4: redacted `body` → stderr would leak the moment Phase 4.7 wraps `git fetch` / `git push` (whose stderr commonly contains `fatal: unable to access 'https://x-token- auth:TOKEN@bb.org/...'`). Unified `_safe_text()` helper applies URL + SCP-cred redaction to free-form text. Tests now assert redaction across ("url", "message", "body") for the API case AND ("message", "stderr") for the git case, plus a new test_giterror_stderr_redacted that covers the Phase 4.7 hazard pre-emptively. #2 SECURITY: signed-URL indicator list broadened beyond AWS. Added: Azure SAS (`sig=`, lowercase only — the trailing `=` blocks `signature=` substring matching), GCP X-Goog-Signature / -Credential, bearer-token-in-URL forms (`access_token=`, `api_key=`). Plus the existing AWS / generic `Signature=`. Tests added for Azure SAS and bearer-token shapes. #8 SECURITY: free-form-text URL redaction broadened to any scheme. The previous `https?://` regex missed `ssh://`, `git+ssh://`, etc. The new `_ANY_URL_PATTERN` matches any URL scheme; combined with the existing SCP-style pattern, free-form git stderr / log output gets redacted regardless of remote URL shape. Test added: ssh:// URL with embedded passphrase redacted by _safe_text. Correctness: #3 pr_create normalises every string arg at the wrapper boundary (source_branch, destination_branch, title, description). The wrapper previously only used `.strip()` for the empty-check guard, then forwarded the unstripped value to bb_ops which wrote it verbatim into the JSON payload. `"feat/widget "` became a 404, `"Hi "` became a PR title with trailing whitespace. #4 pr_create handles JSON null for source_branch. Previously `None.strip()` crashed uncaught with AttributeError (round 2 fixed this class for _resolve_repo but missed pr_create). Title required validation also added — empty/whitespace title raises ValueError with the rejected value echoed. #6 _resolve_repo slash-branch now validates `.` / `..` in either workspace OR repo segment BEFORE _get_client(). Round 3 only landed the check on the bare-slug fallback; `repo="./foo"` still surfaced as BBConfigError on a config-less machine. Parametrized regression test added. #9 git_recent_commits ref normalisation handles None and whitespace-only uniformly. `(ref or "").strip() or "HEAD"` — None and " " both fall back to HEAD, consistent with the empty-string success path. Test hermeticity: #12 _reset_state fixture now scrubs BB_USER / BB_TOKEN / BB_WORKSPACE / BB_API_BASE before each test. A developer running pytest locally with these exported in their shell no longer leaks ambient config into the tests — the suite is hermetic against dev env. Deferred (carried forward with rationale): #5 Identifier-threading inconsistency on remaining tools (pipeline_trigger / branches_list / branch_show / commits_list / repos_list / repo_show / vars_list / downloads_list). Real but stylistic; pick this up in a focused symmetry pass. #7 Stderr redaction proactive defense for future git wrappers — addressed structurally by routing stderr through _safe_text in #1 above (so technically not deferred). #10 _client_cache invalidation. Architectural; documented in earlier commit messages, defer. #11 pr_approve/pr_unapprove key naming inconsistency. Pure naming; intentional descriptive names per round-1 #9 deferral. #13 _resolve_repo error path lacks workspace/repo/path. Style; bundle with #5 in a symmetry pass. Test count: 356 -> 360. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.7: bash parity — 5 new commands + 8 parity bugfixes Extends bb with the commands the Python ops exposed but bash didn't, and fixes the parity bugs Python tests surfaced. Also removes a personal-data leak in the help text. NEW COMMANDS bb pr-unapprove [repo] <id> DELETE /pullrequests/{id}/approve. bb pr-comment [repo] <id> <body> POST /pullrequests/{id}/comments with {"content": {"raw": "..."}}. bb branch [repo] <name> GET single branch detail; URL-encodes slashes in name (feat/widget → feat%2Fwidget). bb commits [repo] [branch] [count] GET /commits or /commits/{branch} with branch URL-encoding + count. bb whoami Resolved config + git context + /user reachability probe. NEVER echoes BB_TOKEN. PARITY BUGFIXES (a) cmd_pipeline_trigger no longer drops `variables` when `pattern` is empty. Previously the no-pattern branch built the payload WITHOUT the variables array, silently discarding any VAR=value pairs the user passed. (b) cmd_pipeline_stop captures the bb_post exit code instead of `> /dev/null`-ing it. A stop failure (already stopped, permission denied, 4xx) now surfaces with a clear error and non-zero exit instead of being masked by the unconditional success print. (c) build_number → uuid lookup bumped 50 → 100 (Bitbucket's max pagelen) in cmd_pipeline / cmd_pipeline_stop / cmd_logs. Older pipelines beyond 100 still unfindable in bash; full pagination is the Python-side improvement (2000-scan cap). (d) repo_path now validates workspace + repo at the boundary: rejects empty/whitespace, embedded `/`, `.`, and `..`. Mirrors bb_api.repo_path's Python validation so both surfaces enforce the same contract — no more silent `/repositories//foo` or `/repositories/../foo`. (e) cmd_pipeline_trigger empty-variables payload now omits the key entirely (was inconsistent: pattern-set path emitted `variables: []`, no-pattern path dropped variables). Matches Python's omit-when-empty contract. (f) cmd_pr_approve / cmd_pr_decline now check the bb_post exit code instead of discarding it — symmetric with (b). Approve / decline failures (already approved, declined PR can't be re-declined, etc.) now surface to the user. (h) cmd_pr_create omits `description` from the payload when empty/whitespace. Matches the Python omit-when-empty contract. (j) cmd_pr_diff now uses `curl -sfL` (was `-sf`) so a future redirect on this endpoint doesn't fail silently. `curl -L -u` does NOT resend credentials to a different host by default, so the cross-host credential-leak concern is already mitigated by curl's own semantics — no manual auth-stripping required here. Removed personal-data leak: help text previously named a specific personal workspace as the default. Generic example only now. DEFERRED (g) cmd_pr_create hardcodes close_source_branch=true. Both surfaces share this default; exposing as a flag would be a coordinated bash + Python change. Defer to a future PR. (i) cmd_pr_merge uses PUT; modern Bitbucket REST docs use POST. Bitbucket accepts both for this endpoint and bash's PUT is the verified-working contract. Python also uses PUT for parity. Coordinated investigation deferred. (k) cmd_pr_merge unhashable-strategy raises TypeError instead of ValueError. n/a in bash — bash doesn't enforce types this way. (l) Bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters. Agent gets richer access via the MCP surface; bash retains its simpler human-driven shape. Defer. Smoke tests run locally: - bash -n bb (clean) - bb help (all 5 new commands appear) - Python suite: 360/360 still green DISPATCHER Added case-statement entries for: pr-unapprove (alias `prua`), pr-comment, branch, commits, whoami. Help text reorganised to show the new commands grouped with their family. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 review fixes (security + real bugs in new code) Round 1 review surfaced 15 findings. Folding 8 — the critical-and-mine items plus a few cheap symmetries; deferring 7 pre-existing items that weren't directly touched. Security: #1 cmd_whoami strips `user:token@` from origin URL before echoing. `git remote get-url origin` returns the raw URL including any embedded credentials (a common pattern for token-based git auth). whoami output is exactly the kind of thing users paste into bug reports and screenshots — can't have secrets in it. Matches the `_redact_url` helper in mcp_server.py / git_ops.py. #6 cmd_whoami probes `/repositories/{workspace}?pagelen=1` instead of `/user`. Atlassian's workspace-scoped tokens (the modern recommended type) reject /user with 401/403 while still serving /repositories/{workspace} correctly — the old probe would have told users with valid tokens to rotate them. Real bugs in code this PR added or touched: #2 cmd_watch pagelen bumped 50 → 100. I missed this when bumping cmd_pipeline / cmd_pipeline_stop / cmd_logs in the same PR. A pipeline at positions 51-100 in the recent list would never match here and the watch loop would spin forever printing blanks. #3 cmd_pipeline_trigger variables built via jq instead of string concatenation. The previous shape produced invalid JSON when any value contained `"`, `\`, newline, or tab — the `jq -n --argjson vars` would then refuse to parse and abort after the user-visible `Triggering pipeline...` had already printed. JSON injection vector for any agent-supplied variable value. #4 cmd_pipeline_trigger `shift 3 || true` no longer masks the under-3-args case. Previously `bb trigger myrepo` (1 arg) left "myrepo" in $@ and the var-loop parsed it as a VAR=VALUE pair, sending `{"key":"myrepo","value":"myrepo"}` as a pipeline variable. Now guards on `$# -ge 3` and explicitly shifts the available count. #15 cmd_pipeline_trigger now rejects detached HEAD. `git rev-parse --abbrev-ref HEAD` exits 0 with the literal "HEAD" in detached state, so the `|| echo "main"` fallback didn't fire. Bitbucket 400s on `ref_name=HEAD`. Clean local error instead. Validation parity (symmetric with Python): #5 repo_path whitespace check now uses `[[:space:]]+` regex instead of `${var// }` (which stripped only ASCII space). Tabs / newlines / CRs now correctly rejected — true parity with Python's `.strip()`. #9 repo_path now uses `kill -TERM $$` on validation failure instead of `exit 1`. `exit 1` inside a `$(repo_path ...)` only terminates the subshell; the caller proceeded with an empty path and ate a confusing layered failure. `kill -TERM $$` terminates the parent script so validation actually halts. #7 + #14 cmd_pr_merge captures exit code AND validates strategy. Mirror of the (b)/(f) capture pattern this PR established for cmd_pr_approve / cmd_pr_unapprove / cmd_pr_decline / cmd_pipeline_stop. A merge conflict / failing-required-build / typo-strategy now surfaces with a labelled error including likely causes instead of `set -e` silently aborting. Deferred (pre-existing, not touched by this PR's scope): #8 cmd_pr_comment_add stdin fallback — nice-to-have parity with cmd_pr_create's pattern. Defer. #10 build_number numeric validation in cmd_pipeline / cmd_logs / cmd_pipeline_stop — pre-existing. Defer. #11 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #12 cmd_pr_diff curl failures print nothing — pre-existing asymmetry. Defer. #13 cmd_commits count validation — minor. Defer. Smoke tests: bash -n bb clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 round-2 review fixes Round 2 caught a regression I introduced in round 1 + several real items the first pass missed. Critical: #1 cmd_whoami origin URL fallback now actually fires when git fails. My round-1 fix piped through sed for credential redaction but the `||` attached to sed (which always exits 0 on empty input) instead of git. So `bb whoami` in a non-git directory printed `Origin:` blank instead of the promised `(no origin remote)` placeholder. Refactored to an `if remote=$(...); then ... else ... fi` block so the fallback branches on git's exit status. #2 cmd_pipeline_trigger variables parser switched to NUL delimiter (was newline). My round-1 fix replaced unsafe string-concat with `printf '%s\n' "$@" | jq -Rs 'split("\n") ...'` — but that still fragments any VAR=value where value contains a newline. `VAR=$'line1\nline2'` became `{VAR:line1}` plus a ghost `{line2:""}`. NUL is the safe delimiter; printf's `%s\0` + jq's `split(" ")` round-trip cleanly. #3 SECURITY: build_number now validated numeric in every command that splices it into a jq filter (cmd_pipeline, cmd_watch, cmd_logs, cmd_pipeline_stop). The previous raw interpolation let jq treat a non-numeric value as an undefined-function reference (typo path) OR — confirmed by the reviewer — as a jq-injection vector via `$ENV` exfil (e.g. `1) | $ENV.BB_TOKEN, .uuid` would land the token in pipeline_uuid → next URL path). New `_require_build_number` helper enforces `^[0-9]+$` at the boundary; rejected input never reaches jq. #4 cmd_pipeline_trigger `Variables: $*` echo now masks values. Previously emitted the user's raw VAR=value args to stdout — secrets in terminal scrollback, CI logs, shell-history wrappers, `bb trigger ... | tee` outputs. Now prints `KEY=***` for each variable. Contradicted the "NEVER echoes BB_TOKEN" claim in cmd_whoami. #8 repo_path whitespace check now catches MIXED whitespace, not just all-whitespace. My round-1 `^[[:space:]]+$` regex let `' acme '` through; `tr -d '[:space:]'` + equality check now rejects any value that differs from its stripped form. True parity with Python's `.strip()`. Symmetry (rc-capture pattern applied to commands the first round missed): #5 cmd_pr_create now captures the bb_post exit code. A 400 (dest branch typo, duplicate PR open, source not pushed) surfaces as a labelled error with likely causes instead of `set -e` silently aborting after the "Creating PR:" banner. #6 cmd_pipeline_trigger now captures the bb_post exit code with the same labelled-error pattern. Common 4xx causes (protected branch, custom-pipeline-not-found, invalid variable shape) spelled out. Deferred (pre-existing, narrow): #7 cmd_branch_show / cmd_commits print banner before fetch. Same pattern as #5/#6 but on read commands where the silent failure has less consequence; the curl exit code does at least surface non-zero. Defer to a follow-up if needed. #9 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #10 cmd_pr_diff curl-silent failure. Pre-existing. Defer. #11 cmd_commits count not validated. Minor. Defer. Smoke tests: bash -n clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) (#2) * Add bb_api.py + pyproject + Python test infrastructure (MCP foundation) First slice of the MCP server work. Establishes the Python module layout the MCP server will sit on top of, plus the test seam and CI extension required to develop the rest of the modules with confidence. bb_api.py — Python parallel of bb's config / auth / transport helpers. Stdlib-only (urllib.request), so the MCP server's bootstrap stays small. Public surface: - BBConfig: resolved credentials + workspace + API base URL. - load_config(): env > ~/.config/bb/config > .env > error; tolerant of bash `export KEY=value` and quoted values, matching the bash script's `source` semantics close enough for parity. - parse_remote_url() / detect_repo(): both HTTPS and SSH remote shapes; repos with dots in their slugs are preserved (bash regex truncates). - repo_path(): /repositories/{workspace}/{repo}, rejecting embedded slashes that would silently change the URL structure. - BBClient: thin transport with .get / .post / .put / .delete / .paginate. Pagination is defended against runaway cursors (max 200 pages) and stuck cursors (`next` URL unchanged between iterations), and refuses to follow `next` URLs that point at a different host than api_base. - BBApiError carries status + URL + body so callers can branch on expected non-2xx responses. bb and bb_api are PARALLEL implementations of the same Bitbucket REST contract — the MCP server does not shell out to bb. The bug-parity rule is now documented in CONTRIBUTING.md: tests verify correct API behaviour, and a defect surfaced by a Python test fixes both code paths. pyproject.toml — flat `py-modules = ["bb_api"]` layout (additional modules land in their own PRs). `mcp` is in an optional [mcp] extra so test matrices stay lean. tests/ — pytest with two responsibilities in conftest.py: set BB_MCP_SKIP_BOOTSTRAP=1 before any mcp_server import to defend against the future self-bootstrap venv pattern, and inject the repo root onto sys.path. 36 tests covering config precedence (env > file > error), remote-URL parsing (HTTPS + SSH + repos-with-dots), client transport (URL + method + auth header + body shape per call, not just response status), and pagination (walking, stuck-cursor defence, iteration cap, cross-host refusal, empty page). CI extension — Python 3.10/3.11/3.12 matrix added to the existing `syntax` job. fail-fast: false so a failure on one version doesn't mask another. Installs only pytest; the MCP runtime deps stay out of unit-test CI because the suite mocks urllib. CODEOWNERS — extended to gate review on bb_api.py, future bb_ops.py / git_ops.py / mcp_server.py, pyproject.toml, and the agents/ directory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply review fixes for Phase 4.1 foundation Addresses 13 of the 15 findings from the Phase 4.1 code review. The strongest finding (#1) was a parity bug where a test pinned the wrong contract — exactly the anti-pattern the testing methodology warns about. Parity / correctness fixes: 1. dotenv > home-config precedence. Bash sources ~/.config/bb/config first, then `.env`, so `.env` wins. Python had the order inverted and the test pinned the inverted behaviour. Now matches bash; the test now verifies the correct contract. 2. Empty env var no longer falls through to file. Switched from `env.get(key) or file_config.get(key)` to a membership test so `BB_TOKEN=""` is treated as an explicit (failing) value rather than absence — matches bash's `[[ -z ]]` check behaviour. 3. URLError (DNS / TLS / connection / timeout) now wraps as BBApiError(status=0) instead of escaping the documented contract. 4. Pagination cross-host refusal hardened against the prefix trick. `startswith(api_base)` is replaced with a separator-aware check so `https://api.bitbucket.org/2.0evil.example.com/...` no longer slips past the host-mismatch guard. 5. BB_MCP_SKIP_BOOTSTRAP is now set unconditionally in conftest; `setdefault` was a no-op if a developer had `BB_MCP_SKIP_BOOTSTRAP=0` in their shell, defeating the guard silently. 6. Paginated response missing the `values` key now raises rather than being silently treated as an empty page (which could leave a hole in the result set if `next` was still set). 7. Non-string `next` raises BBApiError with a clear message rather than crashing on AttributeError. 8. repo_path() rejects empty / whitespace-only / `.` / `..` values in addition to embedded `/`. Defends against path-traversal where `/repositories/../widget` normalises to the wrong workspace. 10. Query values validated as scalar or list-of-scalar; nested dicts and arbitrary objects raise TypeError instead of being silently stringified by urlencode(doseq=True) into a misleading URL. 11. BB_API_BASE trailing slash normalised on load. Prevents `api_base + "/path"` becoming `//path`. 13. Replaced `assert ... is not None` with an explicit raise so `python -O` deployments retain the safety net if a future edit narrows the missing-keys check. 14. detect_repo test now asserts the exact subprocess call shape (command, capture_output, text, cwd, check). Previously the test was canned-output-only and a refactor to a different git command would have passed silently. 15. BBClient methods accept a per-call `timeout=` override for endpoints that legitimately take longer (log streaming, large diffs). Default still comes from the client. Tightened docstrings: - BBApiError now documents `status=0` as the sentinel for non-HTTP transport / malformed-response errors so callers know to gate HTTP-status dispatch on `status > 0`. Deferred with rationale: #9 (remote URL parser accepts non-Bitbucket hosts): intentional for enterprise / self-hosted Bitbucket support. The bash side has the same loose parsing. No restriction added. #12 (BBApiError used as transport-error wrapper): the dedicated exception class would expand the surface; the docstring update above is sufficient until a real consumer needs to discriminate. Test count: 36 -> 55. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply follow-up review fixes (security: no-redirect default, test rigor) Follow-up review on commit e2567e8 verified all 13 prior fixes landed correctly and surfaced 6 new findings, 3 of them worth taking before more code lands on top of this foundation. Security: new #1. Default BBClient opener no longer follows 3xx redirects. urllib's default HTTPRedirectHandler resubmits the original request against the Location URL — INCLUDING the Authorization header. urllib does not strip auth on cross-origin redirects, so a misconfigured proxy or DNS hijack pointing api.bitbucket.org at another host would have leaked the Basic auth credential to that host on the first 3xx. `bb` doesn't have this exposure: `curl -sf` doesn't follow redirects without -L, and the one command that needs to (`bb logs`) opts in explicitly. Mirror that posture here. Implementation: _NoRedirectHandler subclasses HTTPRedirectHandler and returns None from redirect_request, which causes urllib to surface the 3xx as an HTTPError that our existing handler wraps into BBApiError. Two tests cover it — one introspects the default opener's handler chain, the other unit-tests the handler's redirect_request contract directly. When bb_ops adds pipeline_logs (which legitimately needs to follow a 307 to S3), that work will require a separate code path that strips Authorization on cross-host hops rather than weakening this default. Test rigor: new #2. test_paginate_walks_pages now asserts the URL for each of the three calls (base, base?page=2, base?page=3). Without this, a regression that refetched the original URL three times would still consume the three queued responses and pass — exactly the mock-returns-success-regardless-of-request anti-pattern. Cleanup: new #4. Removed the dead `page_query = None` assignment in paginate(). The variable was only read on the first iteration (when url is None), and the reset ran after `url` had been assigned, so it was never observed. Removed and the closure variable `query` is used directly. Documentation: new #5. parse_remote_url() now has an IMPORTANT caller-contract note: the function does not anchor to a Bitbucket host (intentional for enterprise / self-hosted Bitbucket support). Callers feeding it untrusted external URLs (webhook payloads, etc.) must verify the host upstream before calling. If a future consumer needs that guarantee, the right shape is a `strict=True` mode here, not duplicated checks at every call site. Deferred: new #3 (BB_API_BASE shape validation) — low-impact today; will revisit if a self-hosted Bitbucket Server use case lands a typo bug. new #6 (older _fake_runner tests still allow kwargs drift) — minor; the new recording-style test_detect_repo_invokes_git_remote_get_url is the load-bearer, the looser fixtures around it are acceptable. Test count: 55 -> 57. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.2: bb_ops pipeline operations + cross-host-safe log fetch (#3) * Add bb_ops pipeline operations + cross-host-safe log fetch Phase 4.2 of the MCP server work. Adds the pipeline subset of bb_ops (list, show, steps, trigger, stop, logs), plus a new bb_api transport method for following redirects safely. 33 new tests, 90 total. bb_ops.py — public surface: pipelines_list(client, ws, repo, *, count=10, sort, branch=None) Paginates pipelines/?sort=-created_on. Honours `count` exactly by walking pages as needed; clamps per-page to Bitbucket's 100 cap. Optional branch filter via target.ref_name. pipeline_show(client, ws, repo, build_number) Resolves build_number -> uuid then GETs /pipelines/{uuid}. pipeline_steps(client, ws, repo, build_number) Same uuid lookup, then GETs /pipelines/{uuid}/steps/. pipeline_trigger(client, ws, repo, *, branch, pattern=None, variables=None) Builds the POST payload (target + optional custom-pattern selector + optional variables-as-list-of-{key,value}). Variables accept dict or iterable-of-pairs at the boundary. Rejects non-string values (Bitbucket's contract). pipeline_stop(client, ws, repo, build_number) POSTs to /pipelines/{uuid}/stopPipeline. Returns the API response (typically None on 204). The bash equivalent discards the response with `> /dev/null`, masking failures — flagged as a 4.7 parity fix. pipeline_logs(client, ws, repo, build_number, step_index, *, timeout=120) Uses the new BBClient.fetch_redirected_text() path to follow Bitbucket's 307-to-S3 redirect on the log endpoint with Authorization stripped on cross-host hops. bb_api.py — BBClient.fetch_redirected_text(): Necessary because the log endpoint sometimes returns 200 with an inline log body and sometimes returns 307 to a signed S3 URL. The default opener refuses 3xx (credential-safety), so this method opens with the default opener, catches HTTPError 3xx, extracts Location, and follows up to max_redirects hops. CRITICAL: when the next hop's host differs from api.bitbucket.org, Authorization is removed from the request headers — Basic credentials never reach S3 or any other redirect target. Once stripped, the credential stays stripped for the rest of the chain. Bash's `bb logs` uses `curl -sfL` which (depending on curl version) may send Authorization on cross-host redirects. Whether that's presently broken in bash is a parity question for 4.7; Python is correct by construction. Test coverage (33 new): _wrap_uuid: brace normalisation, URL encoding, whitespace tolerance. _strip_uuid_braces: bare/braced/empty inputs. pipelines_list: single page, count > 100 (pagination), mid-page cutoff, branch filter URL shape, non-positive-count rejection without emitting a request. _resolve_pipeline_uuid: first-page hit, multi-page walk, not-found raises BBOpNotFound (distinct from BBApiError), invalid build_number rejection, scan_limit cap respected. pipeline_show: uuid lookup THEN show by uuid; URL uses %7B...%7D bracket encoding (bash contract). pipeline_steps: uuid lookup THEN list steps. pipeline_trigger: default-pipeline payload shape, custom-pipeline payload shape (selector embedded), variables-as-dict shape, variables-as-iterable-of-pairs shape, empty-variables omits the key (not an empty list), non-string variable value rejected, empty pattern rejected, empty branch rejected. pipeline_stop: POST to /stopPipeline with no body; 204 returns None. pipeline_logs: inline body (no redirect), 307 follow with Authorization stripped on cross-host (the security-critical test), too-many-redirects raises, missing Location raises. _resolve_step_uuid: valid index, out-of-range raises BBOpNotFound, negative index rejected. Every HTTP-touching test asserts request URL, method, and body shape - not just the response value. Guards against the mock-returns-success-regardless-of-body anti-pattern. CI / packaging: pyproject.toml py-modules now includes bb_ops. ci.yml syntax-checks bb_ops.py alongside bb_api.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.2 review fixes (robustness + cleanup) Addresses 6 of 8 follow-up suggestions from the Phase 4.2 code review. Approve-with-suggestions; nothing blocking, all minor robustness or clean-up items. #1 (doc). Soften the cross-host-auth narrative in fetch_redirected_text. `curl -L -u` also strips credentials on cross-host hops; only `-H "Authorization: ..."` would leak. Bash is safe today via `-u`. Python's explicit strip is the symmetric future-proof version (in case the MCP server ever needs header-based auth). #4 (robustness). Case-insensitive cross-host comparison in fetch_redirected_text. RFC 3986 §3.2.2 makes hostnames case-insensitive; a redirect to API.bitbucket.org would have triggered a needless auth-strip on a capitalisation-only diff. Now lowercased both at construction and on each new host. #5 (resource hygiene). Explicit `e.close()` on the redirect-path HTTPError so the underlying socket is released immediately rather than waiting on GC under load. Wrapped in try/finally so a raise from the redirect-extract logic still closes the response. #6 (dead return). _resolve_step_uuid now returns just the uuid rather than a (uuid, name) tuple. The name was never consumed by pipeline_logs. Callers that want the name should fetch pipeline_steps() themselves. Simpler API, no behavioural change. #7 (nit). Dropped redundant `safe='-'` from _wrap_uuid's quote() call. Dash is in urllib's always-safe unreserved set. #8 (naming). Renamed pipeline_steps_raw -> _pipeline_steps_by_uuid (leading underscore to match the rest of the file's internal-helper convention; the function was only called module-internally). Bash bug newly surfaced by the review, queued for 4.7: cmd_pipeline_trigger drops `variables` entirely when `pattern` is empty (bb:382-391: the no-pattern branch builds a payload without the variables array, even when the caller passed VAR=value pairs). Python pipeline_trigger handles this consistently; tests pin it. Deferred: #2: bash bug above — fixed in 4.7 not here (Python-only PR). #3: empty-variables payload shape (omit vs []). Python omits; bash emits []. Both are valid against Bitbucket. Picking one and aligning is a 4.7 follow-up. Test count unchanged (90); test_resolve_step_uuid updated to assert the new single-uuid return. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Tighten finally-close comment per re-review nit The finally runs on every HTTPError path (both 3xx and non-3xx), not just redirects. Reword the comment so a future reader doesn't assume the close is redirect-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.3: bb_ops pull-request operations + tests (#4) * Add bb_ops pull-request operations + tests Phase 4.3 of the MCP server work. Adds the PR subset of bb_ops with 33 new tests, 123 total. Public surface: prs_list(client, ws, repo, *, state="OPEN", count=25) State filter + paginated count, matching bash defaults. Walks pages when count > 100 (Bitbucket's cap). pr_show(client, ws, repo, pr_id) GET /pullrequests/{id} with positive-int validation on pr_id. pr_activity(client, ws, repo, pr_id, *, count=50) Walks the activity stream — used by bash to surface approver names, surfaced as its own op so the MCP agent can render its own view. pr_create(client, ws, repo, *, title, source_branch, destination_branch="main", description="", close_source_branch=True, reviewers=None) Builds the POST payload matching Bitbucket's contract. Optional reviewers as a list of UUID strings (Bitbucket expects `[{"uuid": "..."}, ...]`). Empty description / empty reviewers omitted from the payload (Python convention; bash's "always include empty description" is a 4.7 alignment item). pr_approve(client, ws, repo, pr_id) POST /approve. Returns the approval record — bash discards it with `> /dev/null`. pr_unapprove(client, ws, repo, pr_id) DELETE /approve. Not exposed by bash today (4.7 parity gap). pr_merge(client, ws, repo, pr_id, *, strategy="merge_commit", close_source_branch=True, message=None) PUT /merge with strategy + close_source_branch + optional message. Strategy validated at boundary: merge_commit / squash / fast_forward. Mirrors bash's PUT verb (Bitbucket has historically accepted both PUT and POST; flagged for 4.7 investigation against current REST docs). pr_decline(client, ws, repo, pr_id) POST /decline. pr_diff(client, ws, repo, pr_id, *, timeout=120) Routes through fetch_redirected_text so the cross-host-auth-strip protection applies if Bitbucket ever redirects this endpoint. Bash uses `curl -sf` without `-L`, so it would fail visibly in that case — Python continues to work safely (parity divergence noted for 4.7). pr_comments_list(client, ws, repo, pr_id, *, count=100) Paginated comments listing matching bash's default pagelen. pr_comment_add(client, ws, repo, pr_id, body) POST /comments with payload {"content": {"raw": "<text>"}}. Not exposed by bash today (4.7 parity gap). Validation: every function rejects invalid pr_id (must be positive int) and non-string body / message values at the Python boundary. Tests assert `opener.calls == []` after a rejection to prove no request was emitted on bad input. Bash parity items newly surfaced (queued for 4.7): - cmd_pr_approve / cmd_pr_decline discard API responses with > /dev/null. - cmd_pr_create hardcodes close_source_branch=true; expose as a flag. - cmd_pr_create always includes an empty description; align with Python's omit-when-empty. - cmd_pr_merge uses PUT; verify against current docs and align both sides on one verb. - cmd_pr_diff uses curl -sf without -L; Python's fetch handles redirects safely. Test discipline: every HTTP-touching test asserts URL, method, AND body shape, not just response value. Rejection tests assert opener.calls == [] to prove no network IO on bad input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 review fixes (bool-is-int, type checks, test coverage) Addresses all 12 findings from the code review. The strongest finding was a Python footgun: bool is a subclass of int, so a bare isinstance(x, int) check accepts True/False and stringifies them in URLs as 'True'/'False' rather than '1'/'0'. The cascading fix closes six findings at once. High-severity: #1 _validate_pr_id accepted bool. Now uses a shared _is_positive_int helper that explicitly excludes bool. Same fix applied to every count validator (prs_list, pr_activity, pr_comments_list, pipelines_list, _resolve_pipeline_uuid) and the step_index check. test_rejects_invalid_pr_id extended with True/False; the new test_every_pr_op_rejects_bad_pr_id matrix runs every PR-id-taking op (8 functions × 7 bad values) so a future refactor that drops _validate_pr_id from any one op gets caught. #2 pr_create(reviewers="alice-uuid") would have iterated the string char-by-char and built [{"uuid":"a"}, {"uuid":"l"}, ...]. Added an explicit isinstance(reviewers, str) early-reject with a hint ("did you mean [...]?"). Test added. #3 Same bool-is-int slip on every count parameter (fixed by the shared _is_positive_int helper above). Medium: #4 pr_create now type-checks `description`; a dict or other non-string raises rather than slipping into the JSON payload. Test added. #5 _KNOWN_PR_STATES docstring claim that Bitbucket accepts comma- separated `?state=OPEN,MERGED` was wrong; that shape returns 400 or empty. Docstring corrected to point at the BBQL `q` parameter for compound filtering. #6 close_source_branch now type-checked in both pr_create and pr_merge; a string like "yes" no longer slips into the JSON payload. Tests added. Low: #7 pr_merge(message="") was accepted; now rejected for consistency with pr_comment_add(body=""). Empty merge-message would leave the merge commit's subject line blank. #8 pr_diff docstring softened — the "Python continues to work safely" framing didn't acknowledge that the returned body would be the redirect target's content if a redirect ever materialised. Now spells out that today the diff endpoint doesn't redirect and flags the behavioural divergence from bash if it ever does. #9 Whitespace-only description (`description=" \n "`) now omitted from the payload via description.strip() check. Test added. Test gaps closed: #10 True/False added to test_rejects_invalid_pr_id. #11 New test_every_pr_op_rejects_bad_pr_id matrix covers every PR-id-taking op (9 functions × 7 bad values = 63 cases). #12 TestPrActivity now has count-walks-pages, invalid-count, and invalid-pr_id tests, symmetric with TestPrsList. Test count: 123 -> 141. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-2 review fixes Addresses 6 of 7 findings from the round-2 review. The strongest finding was a real boundary-validation gap: _KNOWN_PR_STATES was added in the previous fix round but never actually referenced — defined-and-documented as the prs_list state guard, yet the function only type-checked. Closes the gap. High: #1 prs_list now validates state against _KNOWN_PR_STATES. Typos ("OPENED"), case bugs ("open"), and unsupported compound forms ("OPEN,MERGED" — which Bitbucket does NOT accept on the simple ?state= filter; that needs ?q=) all fail at the boundary instead of burning an API call for a 400 / empty result. Symmetric with pr_merge's _VALID_MERGE_STRATEGIES check. #2 Whitespace-only string payloads no longer slip through. Replaced `not value` with `not value.strip()` in five places: - pr_create.title / source_branch / destination_branch - pr_merge.message - pr_comment_add.body Without this, a title like " " or a comment body like "\n\t" would have shipped to Bitbucket as accepted-but-visually-blank fields. Error messages now say "non-empty, non-whitespace". Test coverage: #3 TestPrsList.test_rejects_non_positive_count now includes True / False, matching TestPrActivity. Without symmetry, a refactor that reverted prs_list's _is_positive_int(count) check to a naive isinstance(count, int) would have regressed silently. #4 TestPrCommentsList now has test_rejects_non_positive_count with the same coverage shape (was happy-path only). #5 TestPrDiff.test_follows_cross_host_redirect_and_strips_auth pins the redirect-with-auth-strip behaviour at the pr_diff layer. Mirrors test_pipeline_logs.test_follows_s3_redirect_and_strips_auth — without it, a regression that wired pr_diff to plain client.get (which would 5xx on any future redirect because the default opener refuses 3xx) would not break any test in this file. #7 TestPrDiff.test_returns_raw_diff_text now also asserts the first hop carries Authorization. A regression that dropped the Basic header from the diff-fetch path would otherwise pass silently. Parametrized whitespace cases added across pr_create / pr_merge / pr_comment_add tests so each field's reject-on-whitespace is verified independently. Deferred (filed under 4.7): #6 pr_merge(strategy=[1,2]) raises TypeError (unhashable) rather than ValueError. Defensive-coding nit; either error surface stops the bad call. Defer to the parity sweep. Test count: 141 -> 155. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-3 review fixes Three small test/consistency fixes from the round-3 review. The PR is in good shape — reviewer explicitly called these out as non-blockers, but they're cheap and close the consistency loop. - pr_merge strategy validator now gates on isinstance(str) before the frozenset `in` check. Unhashable strategy values (list, set, dict) previously raised TypeError instead of the documented ValueError, breaking the "every boundary failure is ValueError" convention. Test extended to cover list/set/None/int alongside the original unknown-string case. - test_rejects_invalid_reviewer (formerly _empty_reviewer_uuid) now parameterizes over empty-string, None, int, and the most plausible caller mistake — pre-shaping as Bitbucket's payload contract `{"uuid": "..."}`. The validator already rejects all four; this pins coverage on the isinstance branch. - test_follows_cross_host_redirect_and_strips_auth now also asserts the first hop carries Authorization (a regression that dropped auth on the initial Bitbucket request would otherwise pass) and that the Location was actually followed (not refetched original). Symmetric with round-2 finding #7 on the no-redirect path. Test count: 155 -> 163. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.4: bb_ops repos / branches / vars / downloads / commits + tests (#5) * Add bb_ops repos / branches / vars / downloads / commits + tests Phase 4.4 of the MCP server work. Closes out the read-oriented surface on bb_ops (Phase 4.5 will add git context tools, then 4.6 wires everything into the MCP server). 48 new tests, 211 total. Public surface: repos_list(client, workspace=None, *, count=100, sort, query=None) Defaults workspace to client.config.workspace; pass explicit to query a different one (bash only ever uses BB_WORKSPACE — parity gap). Optional BBQL `q` filter for name-substring etc. (also bash gap). repo_show(client, ws, repo) GET /repositories/{ws}/{repo} — full metadata. branches_list(client, ws, repo, *, count=50, sort, query=None) Most-recently-updated first, matches bash default. Optional `q` filter. branch_show(client, ws, repo, name) Fetch a single branch by name. URL-encodes the name so `feat/widget` isn't interpreted as a sub-resource path. Not in bash today — useful for the agent's "does this branch exist?" pre-flight check before creating a PR. vars_list(client, ws, repo, *, count=100) Pipeline configuration variables. Bash masks secured values to `********` in its display layer; Python returns the raw dicts (Bitbucket sends `"value": null` for secured entries — MCP tools surface the `secured` flag so callers don't misread null as unset). downloads_list(client, ws, repo, *, count=25) Repository download artifacts. commits_list(client, ws, repo, *, branch=None, count=10) With branch=None, lists across all branches via /commits. With branch=NAME, lists commits reachable from that branch via /commits/{name}. Branch names URL-encoded. Not in bash today. Bash parity items newly surfaced (queued for 4.7): - bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters — agent has richer access by default. - bash has no branch_show, commits_list — add as new subcommands alongside the previously-queued pr_unapprove / pr_comment_add / whoami. Test discipline (same as prior bb_ops PRs): - Every HTTP test asserts URL, method, AND body shape per call. - Rejection tests assert opener.calls == [] (no network IO on bad input). - Bool-is-int (True/False) rejection covered on every count parameter. - Whitespace-only rejection on every required-string parameter. - Branch-name slash URL-encoding verified specifically in branch_show and commits_list. Test count: 163 -> 211. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.4 review fixes Folds in all 5 findings from the code review (none blocking; reviewer explicitly called the PR "clean, well-disciplined"): 1. repos_list now applies the same workspace validation as bb_api.repo_path — rejects embedded '/', '.', '..' in addition to empty/whitespace. Without this, `workspace="acme/widget"` would silently build a single-repo endpoint URL and paginate against a response that lacks `values` — confusing failure mode the boundary validator exists to prevent everywhere else. repos_list is the only op that doesn't route through repo_path, so the check is duplicated here. 2. Removed redundant local `from urllib.parse import quote as _quote` in branch_show and commits_list; both now use the module-level `quote` import that's been there since 4.2. 3. Updated stale module docstring — was still saying "This PR (Phase 4.2) adds pipeline operations." Now reflects current scope (pipelines + PRs + repos/branches/vars/downloads/commits) and references the companion git_ops module that 4.5 will add. 4. TestBranchesList.test_query_filter now asserts the exact URL-encoded form of the BBQL query (matching the style of TestReposList.test_query_filter). The previous `"q=" in url` would have passed even if the query string were silently mangled. 5. Three new tests close coverage gaps: - TestBranchesList.test_rejects_empty_query — pin the empty/whitespace rejection that was implemented but untested. - TestCommitsList.test_count_walks_pages — pin commits_list's multi-page pagination (it has the most complex path shape of the bunch with the branch-vs-no-branch split). - TestReposList.test_rejects_workspace_with_slash + test_rejects_workspace_dot_segments — pin the new workspace validation from finding #1. Skipped optional suggestions: - vars_list `query=` filter (no concrete agent use case yet). - _collect_paginated() helper (5x repetition is still inline-readable; refactor when a 6th list op materialises). Test count: 211 -> 218. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.5: git_ops — git-context wrappers for the MCP agent (#6) * Add git_ops — git-context wrappers for the MCP agent Phase 4.5 of the MCP server work. Five subprocess-mocked wrappers the MCP agent uses to resolve "current branch", "is the tree clean?", "workspace/repo from origin", "recent commits", and "what's uncommitted?" — the context tools that have to run before any bb_ops call. 38 new tests, 256 total. git_ops.py public surface (stdlib-only, subprocess + bb_api.parse_remote_url): git_current_branch(path?) -> str `git rev-parse --abbrev-ref HEAD`. Detached HEAD returns the literal string "HEAD" (matches git's own output; callers detect detached state explicitly). git_remote_repo(path?) -> (workspace, repo_slug) Routes through bb_api.parse_remote_url for parity with bb_api.detect_repo's loose-host parsing (intentional — enterprise Bitbucket Server uses non-bitbucket.org URLs). Returns BOTH the workspace AND repo, unlike bb_api.detect_repo which returns only the slug. git_status(path?) -> dict Parses `git status --porcelain=v2 --branch --untracked-files=normal`. Returns a structured snapshot: branch, upstream, ahead, behind, clean, staged, modified, untracked, unmerged. Handles ordinary (type "1"), rename/copy (type "2", with new-path\told-path tab-separator), and unmerged (type "u") lines. Header lines parsed for branch.head / branch.upstream / branch.ab. git_recent_commits(path?, *, count=10, ref="HEAD") -> list[dict] `git log --pretty=format:%H<US>%h<US>%s<US>%an<US>%aI`. Uses U+001F (Unit Separator) as the field delimiter — a control character that cannot appear in commit subjects, so we never have to escape or parse-around content. Each entry: {sha, short, subject, author, date}. git_uncommitted_changes(path?) -> dict Three subprocess calls: git diff --cached -> staged_diff git diff -> working_diff git ls-files --others -> untracked_files --exclude-standard Diffs returned as raw unified-diff text; agent can show verbatim or parse further. Error handling: GitOpError carries the failing command + returncode + stderr (truncated in the message). Distinct from BBApiError so the MCP layer can render "git failure" vs "Bitbucket failure" differently. FileNotFoundError on `git` binary missing -> GitOpError(127, ..., "git executable not found on PATH"). Matches bb_api.detect_repo's behaviour. Test discipline (same as bb_api / bb_ops): - Every test asserts the EXACT subprocess shape (command + args + cwd + capture_output / text / check). A refactor swapping `git status --porcelain=v2` for v1 would change every parse result but the canned-output tests would still pass without this. - Parametrised tests on remote-URL shapes (https, ssh, embedded user-info, self-hosted Bitbucket Server) — verify the loose-host parser handles every shape the bash side accepts. - git_status parser tested against five realistic v2 captures: clean, ahead/behind, no-upstream, mixed (staged + modified + untracked), renamed, unmerged. - git_recent_commits format string pinned: %H, %h, %s, %an, %aI all present + U+001F delimiter. Malformed log lines skipped defensively rather than crashing. - Invalid-count / empty-ref rejection on git_recent_commits with True/False included (bool-as-int trap). - git_uncommitted_changes verifies the EXACT three-call sequence with explicit `cwd=` propagation on all three. CI / packaging: pyproject.toml py-modules += "git_ops". ci.yml syntax-check += "git_ops.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 review fixes Addresses 5 of 6 findings from the round-1 review (the 6th — str vs os.fspath — is pre-existing in bb_api and would be unrelated churn in this PR). Security: #1 git_recent_commits(ref=...) now rejects refs starting with '-'. Without this, an agent-supplied ref like `ref='--all'` would be parsed by git as an option flag, not a ref — silently replacing the call with a glob over all branches. `ref='-h'` would print help and return exit 0 with help text in stdout, which the parser would skip. The MCP agent is the consumer, and agent inputs are user-influenced, so option-injection at this boundary is a real concern. Belt + suspenders: also added `--` terminator to the `git log` invocation so anything past it is parsed as a positional ref regardless. Correctness: #3 GitOpError(returncode=0) for parse failures was misleading. A real git exit is always >= 0, so callers branching on `err.returncode == 0 -> success` would have been confused. Now parse-failure paths use the GIT_PARSE_ERROR_RETURNCODE sentinel (-1) and the class docstring documents the convention. Both parse-error tests now assert on the sentinel. Defensive: #4 git_recent_commits added an upper bound of _MAX_LOG_COUNT (1000) on `count` so an agent that confabulates `count=1_000_000` doesn't silently pull a million records back across the MCP boundary. Documentation: #2 git_status docstring now documents the C-quoted-pathname limitation as a known caveat (porcelain v2 line-oriented output quotes pathnames with newlines/tabs/quotes/control-chars). Fixing properly requires switching to `-z` + NUL-split parser — deferred until a real bug report shows up. #6 _LOG_FIELD_SEP comment now says "almost never" rather than "cannot" — git stores arbitrary bytes so U+001F technically could appear in a commit subject; the parser already handles it defensively. New tests: - 4 parametrized cases for ref-starts-with-dash rejection. - count-above-cap rejection + count-at-cap accepted (boundary verification). - Spaces-in-filename preserved (porcelain v2 type-1 line uses split(" ", 8) — the suggested regression guard). - Existing parse-failure tests now assert returncode is the sentinel. Skipped: #5 (str vs os.fspath nit). Pre-existing in bb_api.detect_repo, would be unrelated churn here; defer to a separate consistency PR. Test count: 256 -> 263. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-2 review fixes Substantive round-2 review surfaced 10 findings (medium to very-low), several with concrete failure modes the MCP server would hit in production. All 10 addressed plus the 3 test-coverage gaps. Medium-severity correctness: #1 git diff no longer leaks ANSI color codes. _run_git now prepends `git -c color.ui=never` to every invocation, overriding any user's `color.ui = always` gitconfig that would otherwise emit `\x1b[31m...` escapes into pipe-captured output. Diffs and log output are now safe to ship to the MCP agent verbatim. #2 Detached-HEAD reporting normalised. `git_current_branch` already returned the literal "HEAD" for detached state; the porcelain v2 parser returned "(detached)" for the same state. Now both return "HEAD" so cross-checks between the two functions agree on the underlying state. Test added. #3 subprocess.run now has timeout=30s with TimeoutExpired wrapped as GitOpError(returncode=GIT_PARSE_ERROR_RETURNCODE). A wedged git (held index.lock, credential-helper prompting on stdin, NFS server gone) can no longer hang the MCP server thread indefinitely. Test added. #4 subprocess.run now passes encoding="utf-8", errors="replace" explicitly. text=True alone defers to locale.getpreferredencoding(), which on minimal Docker / cron / systemd contexts is ASCII or C — a UTF-8 filename or author name would crash inside subprocess.run before _run_git could wrap it. Explicit UTF-8 + replace-on-error means we never lose to a locale-restricted host. Low-severity correctness / data loss: #6 splitlines() replaced with split("\n") in three parsers. The broader splitlines() also breaks on \r / \v / \f / U+0085 / U+2028 / U+2029, any of which in a commit subject (or path) would fragment one logical record into multiple "lines" that each fail the per-line guards and silently vanish. Test added covering carriage-return-in-subject preservation. #7 git_uncommitted_changes diffs now capped at _MAX_DIFF_BYTES (1 MiB) per diff, with an explicit truncation marker so the caller (the MCP agent) knows what happened and can fall back to `git diff --stat`. Without this cap, a stray multi-GB staged blob would OOM-kill the MCP server. Test added. #8 git_recent_commits parser now skips records with empty SHA. A line of pure separators ("\x1f\x1f\x1f\x1f") passes the parts-count guard and would otherwise append a degenerate {"sha":"", "short":"", ...} entry. Test added. Very-low defensive: #9 branch.ab parser now validates sign prefixes ("+N -M") before stripping. A malformed "-3 +1" line would otherwise propagate bogus values through as negative ahead / positive behind. Test added. #10 Type-1/2 XY field bounds-checked. A single-char XY from a corrupted git output stream no longer raises IndexError outside GitOpError; the malformed line is skipped defensively. Test added. Docstring: #5 _LOG_FIELD_SEP comment further softened — explicitly notes the trade-off ("rare data loss on pathological commit" vs "robust separator that won't collide with common subject content") and points to `git log -z` as the genuinely-safe variant if this ever bites a real user. Test-coverage gaps closed: - test_unborn_branch_raises_giterror — pin real-world behaviour where `git log` on a freshly-init'd repo exits 128 (the previous test_empty_output_returns_empty_list only exercised the mocked exit-0-empty-stdout path). - test_passes_cwd_when_path_given added to git_remote_repo, git_status, and git_recent_commits — was previously only on git_current_branch and git_uncommitted_changes. - test_detached_head_normalised_to_HEAD pins the new cross-function sentinel agreement. All subprocess-shape assertions updated for the new `git -c color.ui=never` prefix (argv indices shifted by 2). Skipped: - The U+001F dropping case (#5 partial) — already returns [] for a pathological subject; the docstring softening is the right response without a real-user bug report. Test count: 263 -> 274. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-3 review fixes (hardening) Reviewer flagged the PR as "in good shape to merge" but surfaced 5 hardening items, two of which they explicitly called out as worth-doing-before-MCP-wires-this-up. All 5 addressed plus the corresponding test coverage. Security: #2 Origin URL credential redaction. CI pipelines and credential- helper-less dev setups commonly embed auth in remote URLs (`https://x-token-auth:abcd@bitbucket.org/...`). On the rare unparseable URL, the secret would have landed in the GitOpError message, flowed up through MCP into agent context, and likely ended up in downstream logs. New `_redact_url_creds()` strips `://user:token@` -> `://[redacted]@` before any URL is echoed in an error string. Test added that asserts the secret token is absent from the raised error and that "[redacted]" appears. Reliability: #3 stdin / env hardening on every git call. `_run_git` now passes `stdin=subprocess.DEVNULL` so a credential prompt fails immediately with EOF rather than blocking on the MCP server's inherited stdin. Environment also adds GIT_TERMINAL_PROMPT=0 (git refuses to prompt at all) and GIT_ASKPASS="" (no GUI askpass dialog). Belt + suspenders alongside the 30s timeout — what was previously a 30-second wedge into "timed out" is now an instant clear error. Subprocess-shape assertions updated. UX: #1 FileNotFoundError disambiguation. subprocess.run raises FileNotFoundError for TWO distinct cases — git binary missing OR cwd directory missing — which the previous catch conflated into "git executable not found on PATH". Once the MCP agent is passing paths (e.g. a worktree that's since been removed), the user would chase their PATH config when the real cause is the bad directory. New code inspects `e.filename`: if it matches the cwd, raises with "path does not exist: <cwd>" instead. Test added covering the cwd-missing path. #4 Unborn-branch sentinel normalisation. Round 2 normalised `# branch.head (detached)` -> "HEAD" so git_status and git_current_branch agree on detached state. The same shape of asymmetry existed for unborn branches: porcelain v2 reports `# branch.oid (initial)` + `# branch.head main` (the would-be branch) while git_current_branch raises GitOpError on the same repo. Parser now tracks an `is_unborn` flag (line order independent — branch.oid / branch.head ordering in v2 output isn't documented as stable) and overrides branch to "HEAD" at end-of-loop so the two functions agree on "weird state, not a regular branch." Test added. Defensive: #5 `_cap_diff` fast-path on `len(text)`. UTF-8 byte count is always >= char count, so if char count fits the cap, the encoded form definitely does. Avoids materialising a transient bytes copy of a multi-GB string just to check its size — the cap exists specifically to defend against OOM on huge diffs, and the check itself shouldn't be the trigger. Deferred: None. All 5 findings addressed in this commit. Test count: 274 -> 277. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-4 review fixes Round 4 found a real bug I introduced in round 3 plus two cheap hardening items. All three addressed. Real bug fixed: #1 _cap_diff fast path was inverted. The round-3 fast path used `if len(text) <= cap: return text` with the reasoning "UTF-8 byte count is always >= char count, so chars <= cap implies bytes <= cap." That's backwards — bytes >= chars means `chars <= cap` does NOT imply `bytes <= cap`. A 600K-emoji string was 600K chars (under the 1 MiB char cap) but encoded to 2.4 MiB, defeating the OOM defense for non-ASCII content. The existing oversize-diff test used ASCII `"x"` chars (where chars == bytes), so the divergence was never exercised — the bug rode through round-3 review undetected. Fixed by gating the fast path on `text.isascii()`: ASCII-only strings genuinely have bytes == chars, so the len check is sound. Non-ASCII falls through to the encode-and-measure path. Regression test added with `"\U0001F600" * 600_000` (4-byte emoji) — asserts the returned diff is byte-capped, not char-capped. Hardening: #2 GitOpError.command in parse-failure paths now includes the `-c color.ui=never` prefix that _run_git actually passes to subprocess. Previously err.command in git_current_branch's empty-output path and git_remote_repo's unparseable-URL path hard-coded the unprefixed form, so a caller introspecting err.command (or reproducing the failure from the message) would see a different invocation than what was actually run. Tests strengthened to assert the prefix is present. #3 File-list cap added: _MAX_PATH_LIST = 10_000. Mirrors the _MAX_DIFF_BYTES and _MAX_LOG_COUNT bounds that already protect the other two agent-facing surfaces. Without this, a repo that forgot to gitignore node_modules / .venv / target/ would return hundreds of thousands of untracked paths through MCP — plausible OOM trigger. Applied to: - git_uncommitted_changes' untracked_files - _parse_status_porcelain_v2's staged / modified / untracked / unmerged lists New `_truncated_path_list` helper appends a sentinel marker (`<... N more paths omitted (cap N exceeded)>`) so the agent sees the truncation and can fall back to a narrower query. New TestPathListCap class pins the helper directly: under-cap / at-cap / one-over / many-over / git_status integration. Test count: 277 -> 284. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-5 review fixes Round 5 (extra-high recall mode) surfaced 12 findings. Reviewer's explicit verdict: "Findings 1, 2, and 4 are concrete defects worth fixing before Phase 4.6 wires this up to an agent." Addressed those three plus six more cheap real items; skipped three cosmetic / very- narrow ones. Correctness: #1 branch.ab strict validation. The startswith-only check accepted smuggled double-sign strings like "+-3 -1" where `int(parts[0][1:])` parsed as -3, contradicting the parser's own promise to reject negative values. Now requires `parts[N][1:].isdigit()` after the sign so "+5", "-3" parse but "+-3", "++5", "+junk" don't. Parametrized regression test added. #2 branch.ab atomic update. Previous try/except wrapped both int() calls together — if the first succeeded and the second raised, out["ahead"] was mutated to the new value while out["behind"] kept the default 0, producing an internally-inconsistent dict on a half-malformed line ("+5 -junk"). Now parses both into locals first, commits to out[] only on full success. #4 NotADirectoryError + PermissionError wrapped as GitOpError. Previously only FileNotFoundError + TimeoutExpired were caught, so passing path=/etc/passwd (regular file) or path=/restricted (unreadable directory) leaked raw OSErrors out of _run_git, breaking the module docstring's "Errors raise GitOpError" promise. Tests added with dedicated runner stubs for each error class. Security: #3 URL credential redaction handles `@` in passwords. The old `[^/@]+@` regex stopped at the first `@`, leaving the tail of a password containing a literal `@` exposed (e.g. `https://user:p@ss@host/...` → `https://[redacted]@ss@host/...`). Switched to `[^/]+@` which is greedy up to the LAST `@` before the path. Regression test added. Architecture: #11 _truncated_path_list returns (list, omitted_count) tuple instead of mixing a sentinel string into the path list. The in-list sentinel broke the "list of valid paths" contract — an agent iterating with `os.stat(p)` / `Path(p).exists()` / `os.path.join(root, p)` would hit the non-path "<...>" entry and crash or silently mis-handle it. Now the cap surfaces via sibling `<key>_omitted` fields (always present, 0 when not truncated) so the agent can detect truncation explicitly without iterating over non-path data. Applied to: - _parse_status_porcelain_v2: adds staged_omitted / modified_omitted / untracked_omitted / unmerged_omitted - git_uncommitted_changes: adds untracked_files_omitted #12 GIT_PARSE_ERROR_RETURNCODE moved from -1 to -1000. Python's subprocess.run uses negative returncodes for signal-killed processes (-1 = SIGHUP, -9 = SIGKILL, -15 = SIGTERM, etc.). A SIGHUP-killed git would otherwise land at returncode = -1, colliding with the parse-error sentinel and being misclassified. -1000 is safely outside the signal range. Test added asserting the sentinel is < -100. Documentation / parity: #5 git_status docstring updated to reflect the actual return shape. Previously said `"branch": ... | "(detached)"` but the code returns `"HEAD"` for both detached and unborn (per round 3's normalization). A caller reading the docstring and writing `if branch == "(detached)": handle_detached()` would never enter that branch. #6 Type-? (untracked) parser skips empty-path lines. Previously `"? "` (prefix only, no path body) would append "" to untracked and flip clean=False with a phantom entry. Asymmetric with type-1 / type-2 which have length guards. Test added. #8 Type-u (unmerged) parser checks XY field width. Type-1 and type-2 paths added this in round 3; the type-u path missed. Trivial parity fix. Test added. Skipped: #7 (cosmetic — diff cap is _MAX_DIFF_BYTES + ~140 marker bytes, not a strict cap; mattering only to a downstream consumer that enforces an exact ceiling, which no current consumer does) #9 (doc asymmetry — `git_uncommitted_changes` docstring doesn't mirror `git_status`'s C-quoting caveat; added to git_status' expanded docstring covering both surfaces) #10 (SCP-style URL credentials — narrow trigger, parse_remote_url accepts most SCP shapes; revisit if a real leak surfaces) Test count: 284 -> 296. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.6: mcp_server.py — FastMCP wrapper around bb_ops + git_ops (#7) * Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops function into a FastMCP tool registry, with a self-bootstrap venv mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total. mcp_server.py — 30 tools: PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps, pipeline_trigger, pipeline_stop, pipeline_logs PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create, pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff, pr_comments_list, pr_comment_add REPOS / METADATA (7): repos_list, repo_show, branches_list, branch_show, vars_list, downloads_list, commits_list GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo, git_recent_commits, git_uncommitted_changes META (1): whoami Self-bootstrap pattern (mirrors zenhub-cli): - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10 (preferring the launching interpreter, falling back to pyenv / Homebrew / system locations), runs `python -m venv`, installs the `mcp` package, then `os.execv`s into the venv. - Bootstrap is stdlib-only (uses subprocess + sys, no third-party imports until after the venv is active). Means any python3 on PATH can launch the server. - `mcp` is the only venv dep — no heavy ML libs (no torch / sentence-transformers / numpy / huggingface_hub). bb doesn't have semantic-similarity workflows like zenhub-cli does. - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets pytest exercise tool wiring and result-dict shapes without pulling in mcp. Production NEVER sets this — without real FastMCP, .run() raises. Repo resolution — every Bitbucket tool takes an optional `repo`: - "" → auto-detect via `git remote get-url origin` from BB_DEFAULT_REPO_PATH (or cwd) - "myrepo" → use BB_WORKSPACE from config + "myrepo" - "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides BB_WORKSPACE for this call) Validated in `_resolve_repo` with ValueError on malformed input ("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network call burns API budget). Error envelope — every tool returns either: {"ok": True, "workspace": ..., "repo": ..., <result fields>} {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>} The error dict carries `status`/`url`/`body` for BBApiError, `returncode`/`stderr` for GitOpError, so the agent can branch on `kind == "BBApiError" and status == 404` without parsing the message. pr_create auto-detects source_branch via git_current_branch when empty (matches bash `bb pr-create`). whoami reports user / workspace / api_base + best-effort git context (branch, workspace, repo from remote). Token is NEVER echoed; the test explicitly asserts the token string is absent from the response. Test discipline (same as bb_api / bb_ops / git_ops): - Tool count + names pinned exactly (test_all_expected_tools_registered) — a future rename or accidental drop fails the suite. - Per-tool wiring: patch the bb_ops / git_ops function at the module level, call the tool, assert (a) the underlying function received the right args, (b) the response dict has the expected shape. Doesn't re-test bb_ops logic (already covered comprehensively). - Error envelope tested for each exception class (BBApiError, BBOpNotFound, GitOpError, ValueError). - whoami's no-token guarantee asserted as a security property. - FastMCP stub's .run() raises a clear error so a test that accidentally calls it fails loud instead of hanging on stdio. CI / packaging: pyproject.toml py-modules += "mcp_server". ci.yml syntax-check += "mcp_server.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 review fixes Round 1 review on PR #7 flagged 15 findings, with two explicitly called out as production hazards ("will bite real users on first deploy on common Linux topologies"). Addressed 11 of 15; deferring 4 with rationale. Production hazards (must-fix per reviewer): #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)` instead of `realpath(sys.executable) == realpath(_VENV_PY)`. `python -m venv` on Linux/macOS defaults to --symlinks, so realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical path as the builder interpreter (e.g. /usr/bin/python3.12) — the realpath comparison was claiming "already under venv" while we were actually still under the system interpreter, skipping the execv and dying on `from mcp.server.fastmcp import FastMCP`. sys.prefix is set per-interpreter from the venv layout and is the authoritative signal. #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C, OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing. The old guard skipped reinstall on every subsequent launch and died on import with no self-repair path. Now writes a `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds; the venv create step is also idempotent (only runs if _VENV_PY is absent) so retries pick up where they left off. #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned "mcp" would silently install a future breaking mcp 2.x on the next /tmp wipe. Matches the pyproject [mcp] extra. Correctness: #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch returns the literal "HEAD" for detached/unborn state; bb_ops only checks non-empty/non-whitespace, so the call would have reached Bitbucket with source.branch.name="HEAD" and surfaced as an opaque API error. Now raises ValueError locally with a clear "pass source_branch=" hint. #5 _resolve_repo strips whitespace before parsing. A sloppy paste like " acme/widget " no longer slips through as workspace=" acme" and fails deep in the API layer. #11 pr_create source_branch auto-detect now triggers on whitespace (not just empty string), symmetric with #5. #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths git_ops._run_git doesn't wrap explicitly (IsADirectoryError, BlockingIOError, ConnectionResetError), and catches os.getcwd() on a deleted cwd inside _default_repo_path() which fires BEFORE any wrapped git call. Surface improvements: #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on" (newest first); agent can pass "created_on" for oldest-first or other sort keys. #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default 120s matches bb_ops; agent can bump for pipelines with very large log payloads or oversized PR diffs that would otherwise cap out. #8 git_status payload renamed from `status` to `working_tree`. The `status` key collided with the HTTP-status field _error_dict uses for BBApiError. Today the collision can't fire (git_status doesn't raise BBApiError), but the rename pre-empts a future broadening hazard. #10 branch_show echoes the stripped name in the response. bb_ops strips before encoding the URL, so the response now matches what Bitbucket actually resolved rather than the unstripped input. Deferred (with rationale): #9 Response-key naming consistency (approval/result/pr). Reviewer suggested standardizing all mutation tools to "result", but the descriptive keys carry useful semantic info — pr_approve returns an approval record, pr_merge returns the merged PR, etc. The docstrings explain the shape. Skipping. #12 _client_cache concurrency + invalidation. Real concern for long-lived servers with rotated tokens, but stdio MCP is single-threaded and token rotation is rare; defer to Phase 4.6+ if a real bug surfaces. #14 CI smoke step that actually imports the real mcp package. Larger CI workflow change; defer. #15 README MCP documentation. Explicitly Phase 4.8 scope. Test count: 335 -> 341. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-2 review fixes Round 2 surfaced 15 new findings including a real security bug. Fixed all 15. Security: #1 BBApiError.url now redacted in _error_dict. pipeline_logs and pr_diff follow Bitbucket's 307 to a presigned S3 URL via fetch_redirected_text; on a downstream non-3xx (S3 clock skew, expired signature, redirect cap), BBApiError(url=<signed S3 URL>) carried AWS-credential-bearing query parameters (X-Amz-Signature, X-Amz-Credential) verbatim into the agent error dict and downstream logs. New _redact_url helper strips `user:token@` AND replaces signed-URL query strings with `[redacted-signed-url-params]` while preserving the path so the agent knows what host was called. Correctness: #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null from an MCP client deserialises to Python None; `.strip()` on None crashes uncaught. Also pre-normalised every stringy arg path through `(value or "").strip()` so None and "" funnel to the same auto-detect path uniformly. #3 _default_repo_path no longer evaluates os.getcwd() eagerly. `os.environ.get("KEY", os.getcwd())` runs the default even when the env var is set — the override never protected against a deleted cwd. Now uses `... or os.getcwd()` so the env var is the lazy short-circuit. whoami also wraps cwd resolution in a try block so a deleted cwd surfaces as a structured cwd_error field instead of escaping the function. #4 mcp import wrapped with helpful recovery message. If `mcp/` gets deleted post-bootstrap (manual `pip uninstall`, partial /tmp cleanup that wiped the package dir but spared the sentinel), the ImportError now tells the user to `rm .bbenv-ready` or `rm -rf /tmp/bbenv`. #5 _resolve_repo now strips inner slug parts. "acme/ widget" no longer slips through as workspace="acme", slug=" widget" and 404s on `/repositories/acme/%20widget`. #6 repos_list workspace stripped. ` acme` / `acme ` no longer slip through. #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was contradicting the file's stated philosophy). A refactor that renames a bb_ops kwarg now crashes visibly during dev/test instead of being silently reported back to the agent as a fake Bitbucket failure. #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS (not narrow GitOpError) AND store the full structured _error_dict (not str(e)). An agent branching on err.returncode for git-127 vs not-a-git-repo-128 vs timeout-(-1000) has the same shape available as every other tool. #10 _opt_str() helper introduced. (branch or None) became `_opt_str(branch)` everywhere — "" / " " / None all funnel to None at the bb_ops boundary. Without this, " " on a branch filter posted `?target.ref_name=%20%20`, on a pattern raised a confusing two-layers-down ValueError, on commits raised a different "non-empty, non-whitespace" error. Inconsistent surfaces for the same agent typo. #12 sys.prefix venv check now resolves both sides through Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink doesn't trigger an infinite execv loop. #13 Every tool that takes an identifier (pr_id / number / step_index) now threads it into the error dict via _error_dict_with. An agent fanning out parallel pipeline_logs(number=42, step_index=5) and pipeline_logs(number=42, step_index=6) can correlate failures to originating requests without substring-matching the message. #14 _resolve_repo validates the repo arg shape BEFORE calling _get_client(). A fresh-machine user without config + a malformed slug now sees the actual ValueError, not the BBConfigError that previously masked the real cause. #15 git_recent_commits echoes the stripped ref in the response (matches what `git log` actually ran). Consistent with branch_show. UX: #11 pip ins…

daniel-pittman and others added 19 commits May 26, 2026 14:40

Bump VERSION to 1.1.0 for release

a642baa

daniel-pittman merged commit 74f5194 into main May 27, 2026
9 checks passed

daniel-pittman mentioned this pull request May 27, 2026

Back-merge v1.1.0 release into develop #15

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release v1.1.0: Python MCP server + bash CLI parity + agent definition#14

Release v1.1.0: Python MCP server + bash CLI parity + agent definition#14
daniel-pittman merged 19 commits into
mainfrom
develop

daniel-pittman commented May 27, 2026

Uh oh!

Uh oh!

claude Bot commented May 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

daniel-pittman commented May 27, 2026

v1.1.0 — Python MCP server + bash CLI parity + bundled agent

Headline additions

Security posture

Notable behind-the-scenes fixes (discovered in PR review rounds)

PRs included

Test plan

Deferred to v1.2.0 (rolled into one epic — no separate v1.1.1 planned)

Uh oh!

Uh oh!

claude Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — PR #14 (v1.1.0)

1. mcp_server.py:801-813 — pr_create auto-detects branch from cwd, ignoring the explicit repo arg

2. bb:362, 400, 408 — cmd_logs interpolates user-supplied step_index into a jq filter with no _require_* validator

3. bb:1211 — cmd_whoami origin-URL redaction regex stops at first @

4. bb:162 — ${1!r:-empty} is Python f-string syntax, not bash

5. bb:438-456 — cmd_pipeline_trigger claims NUL-delimited args but jq splits on space

6. bb_ops.py:333 and bb_ops.py:631 — pipeline_logs / pr_diff return unbounded body strings

7. mcp_server.py:166-192 — _bootstrap_venv has no concurrency guard

8. bb:289, 384 — bash uses pagelen=50 on /steps/ while Python paginates fully

9. bb:545-554 — cmd_pipeline_stop scans only 100 recent pipelines vs Python's 2000

10. bb:213, 605, 916, 981, 1041, 1126, 1163 — count interpolated into pagelen URL without shape validation

11. bb:605 — cmd_pr_list doesn't validate state against the known set

12. bb:459-467 and bb_ops.py:261-262 — pipeline_trigger HEAD-rejection is asymmetric and incomplete

13. bb_ops.py:473-490 — pr_create validates not value.strip() but uses the raw, untrimmed value in the payload

14. mcp_server.py:666 — pipeline_trigger exception handler uses _error_dict(e) not _error_dict_with(e, branch=...)

15. bb_api.py:223-231 — detect_repo shells out to git without subprocess hardening

Sweep notes (gaps not in the top 15)

What I didn't post inline

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented May 27, 2026 •

edited

Loading

1. `mcp_server.py:801-813` — `pr_create` auto-detects branch from cwd, ignoring the explicit `repo` arg

2. `bb:362, 400, 408` — `cmd_logs` interpolates user-supplied `step_index` into a jq filter with no `_require_*` validator

3. `bb:1211` — `cmd_whoami` origin-URL redaction regex stops at first `@`

4. `bb:162` — `${1!r:-empty}` is Python f-string syntax, not bash

5. `bb:438-456` — `cmd_pipeline_trigger` claims NUL-delimited args but `jq` splits on space

6. `bb_ops.py:333` and `bb_ops.py:631` — `pipeline_logs` / `pr_diff` return unbounded body strings

7. `mcp_server.py:166-192` — `_bootstrap_venv` has no concurrency guard

8. `bb:289, 384` — bash uses `pagelen=50` on `/steps/` while Python paginates fully

9. `bb:545-554` — `cmd_pipeline_stop` scans only 100 recent pipelines vs Python's 2000

10. `bb:213, 605, 916, 981, 1041, 1126, 1163` — `count` interpolated into `pagelen` URL without shape validation

11. `bb:605` — `cmd_pr_list` doesn't validate `state` against the known set

12. `bb:459-467` and `bb_ops.py:261-262` — `pipeline_trigger` HEAD-rejection is asymmetric and incomplete

13. `bb_ops.py:473-490` — `pr_create` validates `not value.strip()` but uses the raw, untrimmed value in the payload

14. `mcp_server.py:666` — `pipeline_trigger` exception handler uses `_error_dict(e)` not `_error_dict_with(e, branch=...)`

15. `bb_api.py:223-231` — `detect_repo` shells out to git without subprocess hardening