Bump VERSION to 1.1.0 for release by daniel-pittman · Pull Request #13 · daniel-pittman/bitbucket-cli

daniel-pittman · 2026-05-27T23:41:15Z

Summary

VERSION bump 1.0.1 → 1.1.0 in preparation for the v1.1.0 release.

Scope of v1.1.0

(Captured in the release notes; PRs #2–#12 merged since v1.0.1.)

claude · 2026-05-27T23:41:31Z

Claude finished @daniel-pittman's task in 25s —— View job

Code Review — PR #13

Read PR context
Inspect VERSION change
Verify scope matches title/body
Post review

Summary

This PR is a single-line version bump in VERSION: 1.0.1 → 1.1.0, in preparation for the v1.1.0 release. The scope matches the title and body exactly.

- 1.0.1
+ 1.1.0

Assessment

Correctness — The new value follows SemVer and is a valid MINOR bump from 1.0.1, appropriate for an accumulation of feature PRs (Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) #2–Fix bb prs bash-3.2 incompat + macOS install docs #12) without breaking changes. The file remains a single line with a trailing newline (consistent with prior convention).
Scope — Strictly the version bump. No collateral edits, no stray formatting changes. Good hygiene for a release-prep PR.
Parity rule — Per CLAUDE.md, bash and Python sides must stay in lockstep. VERSION is a single source of truth read by both, so no parallel update is required here. Verified bb and mcp_server.py do not embed a hard-coded version string that would need to be bumped alongside this.
Tagging — As a follow-up after merge, ensure a matching v1.1.0 git tag and GitHub release are created so consumers pinning by tag stay aligned with VERSION. (Out of scope for this PR.)
Changelog — The PR body defers release notes to the GitHub release. That's fine, but if a CHANGELOG.md exists or is added later, this would be the natural place to enumerate Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) #2–Fix bb prs bash-3.2 incompat + macOS install docs #12.

Findings

No blocking issues. No security, correctness, or style concerns in the diff.

Verdict

LGTM — ship it. Status notes this is already MERGED, so this review is informational. Confirm the v1.1.0 tag/release is pushed to finalize the release.

• Branch: claude/pr-13-20260527-2341

* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops function into a FastMCP tool registry, with a self-bootstrap venv mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total. mcp_server.py — 30 tools: PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps, pipeline_trigger, pipeline_stop, pipeline_logs PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create, pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff, pr_comments_list, pr_comment_add REPOS / METADATA (7): repos_list, repo_show, branches_list, branch_show, vars_list, downloads_list, commits_list GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo, git_recent_commits, git_uncommitted_changes META (1): whoami Self-bootstrap pattern (mirrors zenhub-cli): - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10 (preferring the launching interpreter, falling back to pyenv / Homebrew / system locations), runs `python -m venv`, installs the `mcp` package, then `os.execv`s into the venv. - Bootstrap is stdlib-only (uses subprocess + sys, no third-party imports until after the venv is active). Means any python3 on PATH can launch the server. - `mcp` is the only venv dep — no heavy ML libs (no torch / sentence-transformers / numpy / huggingface_hub). bb doesn't have semantic-similarity workflows like zenhub-cli does. - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets pytest exercise tool wiring and result-dict shapes without pulling in mcp. Production NEVER sets this — without real FastMCP, .run() raises. Repo resolution — every Bitbucket tool takes an optional `repo`: - "" → auto-detect via `git remote get-url origin` from BB_DEFAULT_REPO_PATH (or cwd) - "myrepo" → use BB_WORKSPACE from config + "myrepo" - "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides BB_WORKSPACE for this call) Validated in `_resolve_repo` with ValueError on malformed input ("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network call burns API budget). Error envelope — every tool returns either: {"ok": True, "workspace": ..., "repo": ..., <result fields>} {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>} The error dict carries `status`/`url`/`body` for BBApiError, `returncode`/`stderr` for GitOpError, so the agent can branch on `kind == "BBApiError" and status == 404` without parsing the message. pr_create auto-detects source_branch via git_current_branch when empty (matches bash `bb pr-create`). whoami reports user / workspace / api_base + best-effort git context (branch, workspace, repo from remote). Token is NEVER echoed; the test explicitly asserts the token string is absent from the response. Test discipline (same as bb_api / bb_ops / git_ops): - Tool count + names pinned exactly (test_all_expected_tools_registered) — a future rename or accidental drop fails the suite. - Per-tool wiring: patch the bb_ops / git_ops function at the module level, call the tool, assert (a) the underlying function received the right args, (b) the response dict has the expected shape. Doesn't re-test bb_ops logic (already covered comprehensively). - Error envelope tested for each exception class (BBApiError, BBOpNotFound, GitOpError, ValueError). - whoami's no-token guarantee asserted as a security property. - FastMCP stub's .run() raises a clear error so a test that accidentally calls it fails loud instead of hanging on stdio. CI / packaging: pyproject.toml py-modules += "mcp_server". ci.yml syntax-check += "mcp_server.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 review fixes Round 1 review on PR #7 flagged 15 findings, with two explicitly called out as production hazards ("will bite real users on first deploy on common Linux topologies"). Addressed 11 of 15; deferring 4 with rationale. Production hazards (must-fix per reviewer): #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)` instead of `realpath(sys.executable) == realpath(_VENV_PY)`. `python -m venv` on Linux/macOS defaults to --symlinks, so realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical path as the builder interpreter (e.g. /usr/bin/python3.12) — the realpath comparison was claiming "already under venv" while we were actually still under the system interpreter, skipping the execv and dying on `from mcp.server.fastmcp import FastMCP`. sys.prefix is set per-interpreter from the venv layout and is the authoritative signal. #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C, OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing. The old guard skipped reinstall on every subsequent launch and died on import with no self-repair path. Now writes a `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds; the venv create step is also idempotent (only runs if _VENV_PY is absent) so retries pick up where they left off. #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned "mcp" would silently install a future breaking mcp 2.x on the next /tmp wipe. Matches the pyproject [mcp] extra. Correctness: #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch returns the literal "HEAD" for detached/unborn state; bb_ops only checks non-empty/non-whitespace, so the call would have reached Bitbucket with source.branch.name="HEAD" and surfaced as an opaque API error. Now raises ValueError locally with a clear "pass source_branch=" hint. #5 _resolve_repo strips whitespace before parsing. A sloppy paste like " acme/widget " no longer slips through as workspace=" acme" and fails deep in the API layer. #11 pr_create source_branch auto-detect now triggers on whitespace (not just empty string), symmetric with #5. #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths git_ops._run_git doesn't wrap explicitly (IsADirectoryError, BlockingIOError, ConnectionResetError), and catches os.getcwd() on a deleted cwd inside _default_repo_path() which fires BEFORE any wrapped git call. Surface improvements: #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on" (newest first); agent can pass "created_on" for oldest-first or other sort keys. #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default 120s matches bb_ops; agent can bump for pipelines with very large log payloads or oversized PR diffs that would otherwise cap out. #8 git_status payload renamed from `status` to `working_tree`. The `status` key collided with the HTTP-status field _error_dict uses for BBApiError. Today the collision can't fire (git_status doesn't raise BBApiError), but the rename pre-empts a future broadening hazard. #10 branch_show echoes the stripped name in the response. bb_ops strips before encoding the URL, so the response now matches what Bitbucket actually resolved rather than the unstripped input. Deferred (with rationale): #9 Response-key naming consistency (approval/result/pr). Reviewer suggested standardizing all mutation tools to "result", but the descriptive keys carry useful semantic info — pr_approve returns an approval record, pr_merge returns the merged PR, etc. The docstrings explain the shape. Skipping. #12 _client_cache concurrency + invalidation. Real concern for long-lived servers with rotated tokens, but stdio MCP is single-threaded and token rotation is rare; defer to Phase 4.6+ if a real bug surfaces. #14 CI smoke step that actually imports the real mcp package. Larger CI workflow change; defer. #15 README MCP documentation. Explicitly Phase 4.8 scope. Test count: 335 -> 341. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-2 review fixes Round 2 surfaced 15 new findings including a real security bug. Fixed all 15. Security: #1 BBApiError.url now redacted in _error_dict. pipeline_logs and pr_diff follow Bitbucket's 307 to a presigned S3 URL via fetch_redirected_text; on a downstream non-3xx (S3 clock skew, expired signature, redirect cap), BBApiError(url=<signed S3 URL>) carried AWS-credential-bearing query parameters (X-Amz-Signature, X-Amz-Credential) verbatim into the agent error dict and downstream logs. New _redact_url helper strips `user:token@` AND replaces signed-URL query strings with `[redacted-signed-url-params]` while preserving the path so the agent knows what host was called. Correctness: #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null from an MCP client deserialises to Python None; `.strip()` on None crashes uncaught. Also pre-normalised every stringy arg path through `(value or "").strip()` so None and "" funnel to the same auto-detect path uniformly. #3 _default_repo_path no longer evaluates os.getcwd() eagerly. `os.environ.get("KEY", os.getcwd())` runs the default even when the env var is set — the override never protected against a deleted cwd. Now uses `... or os.getcwd()` so the env var is the lazy short-circuit. whoami also wraps cwd resolution in a try block so a deleted cwd surfaces as a structured cwd_error field instead of escaping the function. #4 mcp import wrapped with helpful recovery message. If `mcp/` gets deleted post-bootstrap (manual `pip uninstall`, partial /tmp cleanup that wiped the package dir but spared the sentinel), the ImportError now tells the user to `rm .bbenv-ready` or `rm -rf /tmp/bbenv`. #5 _resolve_repo now strips inner slug parts. "acme/ widget" no longer slips through as workspace="acme", slug=" widget" and 404s on `/repositories/acme/%20widget`. #6 repos_list workspace stripped. ` acme` / `acme ` no longer slip through. #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was contradicting the file's stated philosophy). A refactor that renames a bb_ops kwarg now crashes visibly during dev/test instead of being silently reported back to the agent as a fake Bitbucket failure. #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS (not narrow GitOpError) AND store the full structured _error_dict (not str(e)). An agent branching on err.returncode for git-127 vs not-a-git-repo-128 vs timeout-(-1000) has the same shape available as every other tool. #10 _opt_str() helper introduced. (branch or None) became `_opt_str(branch)` everywhere — "" / " " / None all funnel to None at the bb_ops boundary. Without this, " " on a branch filter posted `?target.ref_name=%20%20`, on a pattern raised a confusing two-layers-down ValueError, on commits raised a different "non-empty, non-whitespace" error. Inconsistent surfaces for the same agent typo. #12 sys.prefix venv check now resolves both sides through Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink doesn't trigger an infinite execv loop. #13 Every tool that takes an identifier (pr_id / number / step_index) now threads it into the error dict via _error_dict_with. An agent fanning out parallel pipeline_logs(number=42, step_index=5) and pipeline_logs(number=42, step_index=6) can correlate failures to originating requests without substring-matching the message. #14 _resolve_repo validates the repo arg shape BEFORE calling _get_client(). A fresh-machine user without config + a malformed slug now sees the actual ValueError, not the BBConfigError that previously masked the real cause. #15 git_recent_commits echoes the stripped ref in the response (matches what `git log` actually ran). Consistent with branch_show. UX: #11 pip install no longer --quiet. Now uses subprocess.run capture_output=True and re-raises with the captured stderr inlined so a network blip / mirror 503 / version yank / SSL cert error / proxy block surfaces with pip's actual diagnostic instead of bare `CalledProcessError: returned non-zero exit status 1`. Deferred: #9 Test recorder accepts any kwarg name (false sense of wiring pinned). Real concern but the fix (functools.wraps or create_autospec) is invasive across the test suite; defer to a focused refactor. Test count: 341 -> 348. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-3 review fixes (security regression + edges) Round 3 caught a real security regression of round 2's fix, plus a batch of consistency / hardening items. Security regression fixed: #1 _error_dict's `message` field now also routed through URL redaction. Round 2 redacted `out["url"]` but left `out["message"]` carrying the raw URL — BBApiError.__str__ constructs the message via `f"HTTP {status} from {url}: {body[:500]}"`, so str(e) ALWAYS contains the original URL. For the S3 presigned case (pipeline_logs / pr_diff → 307 → S3 non-3xx), the error dict was shipping: {"url": "...?[redacted-signed-url-params]", # fine "message": "HTTP 403 from https://.../?X-Amz-Signature=secret..."} ^^^^^^ secret leaked The existing test_bbapierror_redacts_signed_s3_url only asserted on d["url"], so the regression was invisible to CI. The fix adds a `_redact_message` helper that scans for `http(s)://...` substrings and runs each through `_redact_url`. Tests now assert redaction on BOTH `url` AND `message` for every cred-bearing fixture. #9 Case-insensitive signed-URL indicator matching. MinIO / R2 / Backblaze / mixed-case AWS variants would have slipped past the case-sensitive substring check. Now lowercases the query part before membership. Tests added for lowercase variant + plain `Signature=` form (used by some non-AWS S3-compatible services). Correctness: #2 pipeline_trigger now validates + strips `branch`. Was the only branch-accepting tool that forwarded the arg verbatim — every other tool funnels through _opt_str, but pipeline_trigger couldn't (branch is required, not optional). Now explicitly validates non-empty/non-whitespace and strips, so `" main "` becomes `"main"` and `" "` raises a clean local ValueError instead of POSTing whitespace and getting an opaque 4xx. #3 _resolve_repo bare-slug path now validates `repo not in (".", "..")` before _get_client(). Round 2 fixed the slash-containing branch but the bare-slug fallback still hit _get_client() first, so a fresh-machine user with `repo='.'` saw BBConfigError masking the real ValueError. Parametrized regression test added. #5 pr_create now rejects explicit `source_branch="HEAD"` (not just the auto-detect path). Same severity class as round-1 #4 — just on the other entry point. Bitbucket would have silently created a degenerate PR named after the literal HEAD ref. #6 pr_create error path now uses _error_dict_with(e, title=...). Threads the title into the error dict so parallel pr_create fan-outs (e.g. agent creating one PR per stacked branch in a train) can correlate failures with originating requests. Defensive: #10 os.execv argv now uses Path(__file__).resolve(). Today's documented launch pattern uses absolute paths so the issue can't fire; this hardens against a future bootstrap enhancement or sitecustomize.py that introduces a chdir between launch and execv. Deferred (carried forward with rationale): #4 _recorder accepts any kwarg name (false sense of wiring pinned). Fixing requires either create_autospec or inspect.signature.bind across the test suite — bigger refactor than this commit. Notable enough to track separately. #7 _client_cache invalidation. Long-lived MCP holds the original BBClient for its lifetime; rotating BB_TOKEN requires restart. Architectural, defer. #8 git_recent_commits silently defaults None ref to "HEAD". Cosmetic UX, defer. Test count: 348 -> 356. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-4 review fixes (structural redaction) Round 4 caught a third iteration of the same credential-leak pattern. This commit closes it structurally so the class can't regress. Critical security fix (structural): #1 _error_dict now routes EVERY string field through _safe_text: message, body (BBApiError), stderr (GitOpError). The previous rounds whack-a-mole'd one field at a time: - Round 2: redacted `url` → message still leaked - Round 3: redacted `message` → body still leaked - Round 4: redacted `body` → stderr would leak the moment Phase 4.7 wraps `git fetch` / `git push` (whose stderr commonly contains `fatal: unable to access 'https://x-token- auth:TOKEN@bb.org/...'`). Unified `_safe_text()` helper applies URL + SCP-cred redaction to free-form text. Tests now assert redaction across ("url", "message", "body") for the API case AND ("message", "stderr") for the git case, plus a new test_giterror_stderr_redacted that covers the Phase 4.7 hazard pre-emptively. #2 SECURITY: signed-URL indicator list broadened beyond AWS. Added: Azure SAS (`sig=`, lowercase only — the trailing `=` blocks `signature=` substring matching), GCP X-Goog-Signature / -Credential, bearer-token-in-URL forms (`access_token=`, `api_key=`). Plus the existing AWS / generic `Signature=`. Tests added for Azure SAS and bearer-token shapes. #8 SECURITY: free-form-text URL redaction broadened to any scheme. The previous `https?://` regex missed `ssh://`, `git+ssh://`, etc. The new `_ANY_URL_PATTERN` matches any URL scheme; combined with the existing SCP-style pattern, free-form git stderr / log output gets redacted regardless of remote URL shape. Test added: ssh:// URL with embedded passphrase redacted by _safe_text. Correctness: #3 pr_create normalises every string arg at the wrapper boundary (source_branch, destination_branch, title, description). The wrapper previously only used `.strip()` for the empty-check guard, then forwarded the unstripped value to bb_ops which wrote it verbatim into the JSON payload. `"feat/widget "` became a 404, `"Hi "` became a PR title with trailing whitespace. #4 pr_create handles JSON null for source_branch. Previously `None.strip()` crashed uncaught with AttributeError (round 2 fixed this class for _resolve_repo but missed pr_create). Title required validation also added — empty/whitespace title raises ValueError with the rejected value echoed. #6 _resolve_repo slash-branch now validates `.` / `..` in either workspace OR repo segment BEFORE _get_client(). Round 3 only landed the check on the bare-slug fallback; `repo="./foo"` still surfaced as BBConfigError on a config-less machine. Parametrized regression test added. #9 git_recent_commits ref normalisation handles None and whitespace-only uniformly. `(ref or "").strip() or "HEAD"` — None and " " both fall back to HEAD, consistent with the empty-string success path. Test hermeticity: #12 _reset_state fixture now scrubs BB_USER / BB_TOKEN / BB_WORKSPACE / BB_API_BASE before each test. A developer running pytest locally with these exported in their shell no longer leaks ambient config into the tests — the suite is hermetic against dev env. Deferred (carried forward with rationale): #5 Identifier-threading inconsistency on remaining tools (pipeline_trigger / branches_list / branch_show / commits_list / repos_list / repo_show / vars_list / downloads_list). Real but stylistic; pick this up in a focused symmetry pass. #7 Stderr redaction proactive defense for future git wrappers — addressed structurally by routing stderr through _safe_text in #1 above (so technically not deferred). #10 _client_cache invalidation. Architectural; documented in earlier commit messages, defer. #11 pr_approve/pr_unapprove key naming inconsistency. Pure naming; intentional descriptive names per round-1 #9 deferral. #13 _resolve_repo error path lacks workspace/repo/path. Style; bundle with #5 in a symmetry pass. Test count: 356 -> 360. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.7: bash parity — 5 new commands + 8 parity bugfixes Extends bb with the commands the Python ops exposed but bash didn't, and fixes the parity bugs Python tests surfaced. Also removes a personal-data leak in the help text. NEW COMMANDS bb pr-unapprove [repo] <id> DELETE /pullrequests/{id}/approve. bb pr-comment [repo] <id> <body> POST /pullrequests/{id}/comments with {"content": {"raw": "..."}}. bb branch [repo] <name> GET single branch detail; URL-encodes slashes in name (feat/widget → feat%2Fwidget). bb commits [repo] [branch] [count] GET /commits or /commits/{branch} with branch URL-encoding + count. bb whoami Resolved config + git context + /user reachability probe. NEVER echoes BB_TOKEN. PARITY BUGFIXES (a) cmd_pipeline_trigger no longer drops `variables` when `pattern` is empty. Previously the no-pattern branch built the payload WITHOUT the variables array, silently discarding any VAR=value pairs the user passed. (b) cmd_pipeline_stop captures the bb_post exit code instead of `> /dev/null`-ing it. A stop failure (already stopped, permission denied, 4xx) now surfaces with a clear error and non-zero exit instead of being masked by the unconditional success print. (c) build_number → uuid lookup bumped 50 → 100 (Bitbucket's max pagelen) in cmd_pipeline / cmd_pipeline_stop / cmd_logs. Older pipelines beyond 100 still unfindable in bash; full pagination is the Python-side improvement (2000-scan cap). (d) repo_path now validates workspace + repo at the boundary: rejects empty/whitespace, embedded `/`, `.`, and `..`. Mirrors bb_api.repo_path's Python validation so both surfaces enforce the same contract — no more silent `/repositories//foo` or `/repositories/../foo`. (e) cmd_pipeline_trigger empty-variables payload now omits the key entirely (was inconsistent: pattern-set path emitted `variables: []`, no-pattern path dropped variables). Matches Python's omit-when-empty contract. (f) cmd_pr_approve / cmd_pr_decline now check the bb_post exit code instead of discarding it — symmetric with (b). Approve / decline failures (already approved, declined PR can't be re-declined, etc.) now surface to the user. (h) cmd_pr_create omits `description` from the payload when empty/whitespace. Matches the Python omit-when-empty contract. (j) cmd_pr_diff now uses `curl -sfL` (was `-sf`) so a future redirect on this endpoint doesn't fail silently. `curl -L -u` does NOT resend credentials to a different host by default, so the cross-host credential-leak concern is already mitigated by curl's own semantics — no manual auth-stripping required here. Removed personal-data leak: help text previously named a specific personal workspace as the default. Generic example only now. DEFERRED (g) cmd_pr_create hardcodes close_source_branch=true. Both surfaces share this default; exposing as a flag would be a coordinated bash + Python change. Defer to a future PR. (i) cmd_pr_merge uses PUT; modern Bitbucket REST docs use POST. Bitbucket accepts both for this endpoint and bash's PUT is the verified-working contract. Python also uses PUT for parity. Coordinated investigation deferred. (k) cmd_pr_merge unhashable-strategy raises TypeError instead of ValueError. n/a in bash — bash doesn't enforce types this way. (l) Bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters. Agent gets richer access via the MCP surface; bash retains its simpler human-driven shape. Defer. Smoke tests run locally: - bash -n bb (clean) - bb help (all 5 new commands appear) - Python suite: 360/360 still green DISPATCHER Added case-statement entries for: pr-unapprove (alias `prua`), pr-comment, branch, commits, whoami. Help text reorganised to show the new commands grouped with their family. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 review fixes (security + real bugs in new code) Round 1 review surfaced 15 findings. Folding 8 — the critical-and-mine items plus a few cheap symmetries; deferring 7 pre-existing items that weren't directly touched. Security: #1 cmd_whoami strips `user:token@` from origin URL before echoing. `git remote get-url origin` returns the raw URL including any embedded credentials (a common pattern for token-based git auth). whoami output is exactly the kind of thing users paste into bug reports and screenshots — can't have secrets in it. Matches the `_redact_url` helper in mcp_server.py / git_ops.py. #6 cmd_whoami probes `/repositories/{workspace}?pagelen=1` instead of `/user`. Atlassian's workspace-scoped tokens (the modern recommended type) reject /user with 401/403 while still serving /repositories/{workspace} correctly — the old probe would have told users with valid tokens to rotate them. Real bugs in code this PR added or touched: #2 cmd_watch pagelen bumped 50 → 100. I missed this when bumping cmd_pipeline / cmd_pipeline_stop / cmd_logs in the same PR. A pipeline at positions 51-100 in the recent list would never match here and the watch loop would spin forever printing blanks. #3 cmd_pipeline_trigger variables built via jq instead of string concatenation. The previous shape produced invalid JSON when any value contained `"`, `\`, newline, or tab — the `jq -n --argjson vars` would then refuse to parse and abort after the user-visible `Triggering pipeline...` had already printed. JSON injection vector for any agent-supplied variable value. #4 cmd_pipeline_trigger `shift 3 || true` no longer masks the under-3-args case. Previously `bb trigger myrepo` (1 arg) left "myrepo" in $@ and the var-loop parsed it as a VAR=VALUE pair, sending `{"key":"myrepo","value":"myrepo"}` as a pipeline variable. Now guards on `$# -ge 3` and explicitly shifts the available count. #15 cmd_pipeline_trigger now rejects detached HEAD. `git rev-parse --abbrev-ref HEAD` exits 0 with the literal "HEAD" in detached state, so the `|| echo "main"` fallback didn't fire. Bitbucket 400s on `ref_name=HEAD`. Clean local error instead. Validation parity (symmetric with Python): #5 repo_path whitespace check now uses `[[:space:]]+` regex instead of `${var// }` (which stripped only ASCII space). Tabs / newlines / CRs now correctly rejected — true parity with Python's `.strip()`. #9 repo_path now uses `kill -TERM $$` on validation failure instead of `exit 1`. `exit 1` inside a `$(repo_path ...)` only terminates the subshell; the caller proceeded with an empty path and ate a confusing layered failure. `kill -TERM $$` terminates the parent script so validation actually halts. #7 + #14 cmd_pr_merge captures exit code AND validates strategy. Mirror of the (b)/(f) capture pattern this PR established for cmd_pr_approve / cmd_pr_unapprove / cmd_pr_decline / cmd_pipeline_stop. A merge conflict / failing-required-build / typo-strategy now surfaces with a labelled error including likely causes instead of `set -e` silently aborting. Deferred (pre-existing, not touched by this PR's scope): #8 cmd_pr_comment_add stdin fallback — nice-to-have parity with cmd_pr_create's pattern. Defer. #10 build_number numeric validation in cmd_pipeline / cmd_logs / cmd_pipeline_stop — pre-existing. Defer. #11 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #12 cmd_pr_diff curl failures print nothing — pre-existing asymmetry. Defer. #13 cmd_commits count validation — minor. Defer. Smoke tests: bash -n bb clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.7 round-2 review fixes Round 2 caught a regression I introduced in round 1 + several real items the first pass missed. Critical: #1 cmd_whoami origin URL fallback now actually fires when git fails. My round-1 fix piped through sed for credential redaction but the `||` attached to sed (which always exits 0 on empty input) instead of git. So `bb whoami` in a non-git directory printed `Origin:` blank instead of the promised `(no origin remote)` placeholder. Refactored to an `if remote=$(...); then ... else ... fi` block so the fallback branches on git's exit status. #2 cmd_pipeline_trigger variables parser switched to NUL delimiter (was newline). My round-1 fix replaced unsafe string-concat with `printf '%s\n' "$@" | jq -Rs 'split("\n") ...'` — but that still fragments any VAR=value where value contains a newline. `VAR=$'line1\nline2'` became `{VAR:line1}` plus a ghost `{line2:""}`. NUL is the safe delimiter; printf's `%s\0` + jq's `split(" ")` round-trip cleanly. #3 SECURITY: build_number now validated numeric in every command that splices it into a jq filter (cmd_pipeline, cmd_watch, cmd_logs, cmd_pipeline_stop). The previous raw interpolation let jq treat a non-numeric value as an undefined-function reference (typo path) OR — confirmed by the reviewer — as a jq-injection vector via `$ENV` exfil (e.g. `1) | $ENV.BB_TOKEN, .uuid` would land the token in pipeline_uuid → next URL path). New `_require_build_number` helper enforces `^[0-9]+$` at the boundary; rejected input never reaches jq. #4 cmd_pipeline_trigger `Variables: $*` echo now masks values. Previously emitted the user's raw VAR=value args to stdout — secrets in terminal scrollback, CI logs, shell-history wrappers, `bb trigger ... | tee` outputs. Now prints `KEY=***` for each variable. Contradicted the "NEVER echoes BB_TOKEN" claim in cmd_whoami. #8 repo_path whitespace check now catches MIXED whitespace, not just all-whitespace. My round-1 `^[[:space:]]+$` regex let `' acme '` through; `tr -d '[:space:]'` + equality check now rejects any value that differs from its stripped form. True parity with Python's `.strip()`. Symmetry (rc-capture pattern applied to commands the first round missed): #5 cmd_pr_create now captures the bb_post exit code. A 400 (dest branch typo, duplicate PR open, source not pushed) surfaces as a labelled error with likely causes instead of `set -e` silently aborting after the "Creating PR:" banner. #6 cmd_pipeline_trigger now captures the bb_post exit code with the same labelled-error pattern. Common 4xx causes (protected branch, custom-pipeline-not-found, invalid variable shape) spelled out. Deferred (pre-existing, narrow): #7 cmd_branch_show / cmd_commits print banner before fetch. Same pattern as #5/#6 but on read commands where the silent failure has less consequence; the curl exit code does at least surface non-zero. Defer to a follow-up if needed. #9 cmd_branch_show jq null-safety on .target.hash / .target.date. Real but narrow (annotated-tag refs). Defer. #10 cmd_pr_diff curl-silent failure. Pre-existing. Defer. #11 cmd_commits count not validated. Minor. Defer. Smoke tests: bash -n clean, bb help renders, Python 360/360 still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) (#2) * Add bb_api.py + pyproject + Python test infrastructure (MCP foundation) First slice of the MCP server work. Establishes the Python module layout the MCP server will sit on top of, plus the test seam and CI extension required to develop the rest of the modules with confidence. bb_api.py — Python parallel of bb's config / auth / transport helpers. Stdlib-only (urllib.request), so the MCP server's bootstrap stays small. Public surface: - BBConfig: resolved credentials + workspace + API base URL. - load_config(): env > ~/.config/bb/config > .env > error; tolerant of bash `export KEY=value` and quoted values, matching the bash script's `source` semantics close enough for parity. - parse_remote_url() / detect_repo(): both HTTPS and SSH remote shapes; repos with dots in their slugs are preserved (bash regex truncates). - repo_path(): /repositories/{workspace}/{repo}, rejecting embedded slashes that would silently change the URL structure. - BBClient: thin transport with .get / .post / .put / .delete / .paginate. Pagination is defended against runaway cursors (max 200 pages) and stuck cursors (`next` URL unchanged between iterations), and refuses to follow `next` URLs that point at a different host than api_base. - BBApiError carries status + URL + body so callers can branch on expected non-2xx responses. bb and bb_api are PARALLEL implementations of the same Bitbucket REST contract — the MCP server does not shell out to bb. The bug-parity rule is now documented in CONTRIBUTING.md: tests verify correct API behaviour, and a defect surfaced by a Python test fixes both code paths. pyproject.toml — flat `py-modules = ["bb_api"]` layout (additional modules land in their own PRs). `mcp` is in an optional [mcp] extra so test matrices stay lean. tests/ — pytest with two responsibilities in conftest.py: set BB_MCP_SKIP_BOOTSTRAP=1 before any mcp_server import to defend against the future self-bootstrap venv pattern, and inject the repo root onto sys.path. 36 tests covering config precedence (env > file > error), remote-URL parsing (HTTPS + SSH + repos-with-dots), client transport (URL + method + auth header + body shape per call, not just response status), and pagination (walking, stuck-cursor defence, iteration cap, cross-host refusal, empty page). CI extension — Python 3.10/3.11/3.12 matrix added to the existing `syntax` job. fail-fast: false so a failure on one version doesn't mask another. Installs only pytest; the MCP runtime deps stay out of unit-test CI because the suite mocks urllib. CODEOWNERS — extended to gate review on bb_api.py, future bb_ops.py / git_ops.py / mcp_server.py, pyproject.toml, and the agents/ directory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply review fixes for Phase 4.1 foundation Addresses 13 of the 15 findings from the Phase 4.1 code review. The strongest finding (#1) was a parity bug where a test pinned the wrong contract — exactly the anti-pattern the testing methodology warns about. Parity / correctness fixes: 1. dotenv > home-config precedence. Bash sources ~/.config/bb/config first, then `.env`, so `.env` wins. Python had the order inverted and the test pinned the inverted behaviour. Now matches bash; the test now verifies the correct contract. 2. Empty env var no longer falls through to file. Switched from `env.get(key) or file_config.get(key)` to a membership test so `BB_TOKEN=""` is treated as an explicit (failing) value rather than absence — matches bash's `[[ -z ]]` check behaviour. 3. URLError (DNS / TLS / connection / timeout) now wraps as BBApiError(status=0) instead of escaping the documented contract. 4. Pagination cross-host refusal hardened against the prefix trick. `startswith(api_base)` is replaced with a separator-aware check so `https://api.bitbucket.org/2.0evil.example.com/...` no longer slips past the host-mismatch guard. 5. BB_MCP_SKIP_BOOTSTRAP is now set unconditionally in conftest; `setdefault` was a no-op if a developer had `BB_MCP_SKIP_BOOTSTRAP=0` in their shell, defeating the guard silently. 6. Paginated response missing the `values` key now raises rather than being silently treated as an empty page (which could leave a hole in the result set if `next` was still set). 7. Non-string `next` raises BBApiError with a clear message rather than crashing on AttributeError. 8. repo_path() rejects empty / whitespace-only / `.` / `..` values in addition to embedded `/`. Defends against path-traversal where `/repositories/../widget` normalises to the wrong workspace. 10. Query values validated as scalar or list-of-scalar; nested dicts and arbitrary objects raise TypeError instead of being silently stringified by urlencode(doseq=True) into a misleading URL. 11. BB_API_BASE trailing slash normalised on load. Prevents `api_base + "/path"` becoming `//path`. 13. Replaced `assert ... is not None` with an explicit raise so `python -O` deployments retain the safety net if a future edit narrows the missing-keys check. 14. detect_repo test now asserts the exact subprocess call shape (command, capture_output, text, cwd, check). Previously the test was canned-output-only and a refactor to a different git command would have passed silently. 15. BBClient methods accept a per-call `timeout=` override for endpoints that legitimately take longer (log streaming, large diffs). Default still comes from the client. Tightened docstrings: - BBApiError now documents `status=0` as the sentinel for non-HTTP transport / malformed-response errors so callers know to gate HTTP-status dispatch on `status > 0`. Deferred with rationale: #9 (remote URL parser accepts non-Bitbucket hosts): intentional for enterprise / self-hosted Bitbucket support. The bash side has the same loose parsing. No restriction added. #12 (BBApiError used as transport-error wrapper): the dedicated exception class would expand the surface; the docstring update above is sufficient until a real consumer needs to discriminate. Test count: 36 -> 55. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply follow-up review fixes (security: no-redirect default, test rigor) Follow-up review on commit e2567e8 verified all 13 prior fixes landed correctly and surfaced 6 new findings, 3 of them worth taking before more code lands on top of this foundation. Security: new #1. Default BBClient opener no longer follows 3xx redirects. urllib's default HTTPRedirectHandler resubmits the original request against the Location URL — INCLUDING the Authorization header. urllib does not strip auth on cross-origin redirects, so a misconfigured proxy or DNS hijack pointing api.bitbucket.org at another host would have leaked the Basic auth credential to that host on the first 3xx. `bb` doesn't have this exposure: `curl -sf` doesn't follow redirects without -L, and the one command that needs to (`bb logs`) opts in explicitly. Mirror that posture here. Implementation: _NoRedirectHandler subclasses HTTPRedirectHandler and returns None from redirect_request, which causes urllib to surface the 3xx as an HTTPError that our existing handler wraps into BBApiError. Two tests cover it — one introspects the default opener's handler chain, the other unit-tests the handler's redirect_request contract directly. When bb_ops adds pipeline_logs (which legitimately needs to follow a 307 to S3), that work will require a separate code path that strips Authorization on cross-host hops rather than weakening this default. Test rigor: new #2. test_paginate_walks_pages now asserts the URL for each of the three calls (base, base?page=2, base?page=3). Without this, a regression that refetched the original URL three times would still consume the three queued responses and pass — exactly the mock-returns-success-regardless-of-request anti-pattern. Cleanup: new #4. Removed the dead `page_query = None` assignment in paginate(). The variable was only read on the first iteration (when url is None), and the reset ran after `url` had been assigned, so it was never observed. Removed and the closure variable `query` is used directly. Documentation: new #5. parse_remote_url() now has an IMPORTANT caller-contract note: the function does not anchor to a Bitbucket host (intentional for enterprise / self-hosted Bitbucket support). Callers feeding it untrusted external URLs (webhook payloads, etc.) must verify the host upstream before calling. If a future consumer needs that guarantee, the right shape is a `strict=True` mode here, not duplicated checks at every call site. Deferred: new #3 (BB_API_BASE shape validation) — low-impact today; will revisit if a self-hosted Bitbucket Server use case lands a typo bug. new #6 (older _fake_runner tests still allow kwargs drift) — minor; the new recording-style test_detect_repo_invokes_git_remote_get_url is the load-bearer, the looser fixtures around it are acceptable. Test count: 55 -> 57. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.2: bb_ops pipeline operations + cross-host-safe log fetch (#3) * Add bb_ops pipeline operations + cross-host-safe log fetch Phase 4.2 of the MCP server work. Adds the pipeline subset of bb_ops (list, show, steps, trigger, stop, logs), plus a new bb_api transport method for following redirects safely. 33 new tests, 90 total. bb_ops.py — public surface: pipelines_list(client, ws, repo, *, count=10, sort, branch=None) Paginates pipelines/?sort=-created_on. Honours `count` exactly by walking pages as needed; clamps per-page to Bitbucket's 100 cap. Optional branch filter via target.ref_name. pipeline_show(client, ws, repo, build_number) Resolves build_number -> uuid then GETs /pipelines/{uuid}. pipeline_steps(client, ws, repo, build_number) Same uuid lookup, then GETs /pipelines/{uuid}/steps/. pipeline_trigger(client, ws, repo, *, branch, pattern=None, variables=None) Builds the POST payload (target + optional custom-pattern selector + optional variables-as-list-of-{key,value}). Variables accept dict or iterable-of-pairs at the boundary. Rejects non-string values (Bitbucket's contract). pipeline_stop(client, ws, repo, build_number) POSTs to /pipelines/{uuid}/stopPipeline. Returns the API response (typically None on 204). The bash equivalent discards the response with `> /dev/null`, masking failures — flagged as a 4.7 parity fix. pipeline_logs(client, ws, repo, build_number, step_index, *, timeout=120) Uses the new BBClient.fetch_redirected_text() path to follow Bitbucket's 307-to-S3 redirect on the log endpoint with Authorization stripped on cross-host hops. bb_api.py — BBClient.fetch_redirected_text(): Necessary because the log endpoint sometimes returns 200 with an inline log body and sometimes returns 307 to a signed S3 URL. The default opener refuses 3xx (credential-safety), so this method opens with the default opener, catches HTTPError 3xx, extracts Location, and follows up to max_redirects hops. CRITICAL: when the next hop's host differs from api.bitbucket.org, Authorization is removed from the request headers — Basic credentials never reach S3 or any other redirect target. Once stripped, the credential stays stripped for the rest of the chain. Bash's `bb logs` uses `curl -sfL` which (depending on curl version) may send Authorization on cross-host redirects. Whether that's presently broken in bash is a parity question for 4.7; Python is correct by construction. Test coverage (33 new): _wrap_uuid: brace normalisation, URL encoding, whitespace tolerance. _strip_uuid_braces: bare/braced/empty inputs. pipelines_list: single page, count > 100 (pagination), mid-page cutoff, branch filter URL shape, non-positive-count rejection without emitting a request. _resolve_pipeline_uuid: first-page hit, multi-page walk, not-found raises BBOpNotFound (distinct from BBApiError), invalid build_number rejection, scan_limit cap respected. pipeline_show: uuid lookup THEN show by uuid; URL uses %7B...%7D bracket encoding (bash contract). pipeline_steps: uuid lookup THEN list steps. pipeline_trigger: default-pipeline payload shape, custom-pipeline payload shape (selector embedded), variables-as-dict shape, variables-as-iterable-of-pairs shape, empty-variables omits the key (not an empty list), non-string variable value rejected, empty pattern rejected, empty branch rejected. pipeline_stop: POST to /stopPipeline with no body; 204 returns None. pipeline_logs: inline body (no redirect), 307 follow with Authorization stripped on cross-host (the security-critical test), too-many-redirects raises, missing Location raises. _resolve_step_uuid: valid index, out-of-range raises BBOpNotFound, negative index rejected. Every HTTP-touching test asserts request URL, method, and body shape - not just the response value. Guards against the mock-returns-success-regardless-of-body anti-pattern. CI / packaging: pyproject.toml py-modules now includes bb_ops. ci.yml syntax-checks bb_ops.py alongside bb_api.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.2 review fixes (robustness + cleanup) Addresses 6 of 8 follow-up suggestions from the Phase 4.2 code review. Approve-with-suggestions; nothing blocking, all minor robustness or clean-up items. #1 (doc). Soften the cross-host-auth narrative in fetch_redirected_text. `curl -L -u` also strips credentials on cross-host hops; only `-H "Authorization: ..."` would leak. Bash is safe today via `-u`. Python's explicit strip is the symmetric future-proof version (in case the MCP server ever needs header-based auth). #4 (robustness). Case-insensitive cross-host comparison in fetch_redirected_text. RFC 3986 §3.2.2 makes hostnames case-insensitive; a redirect to API.bitbucket.org would have triggered a needless auth-strip on a capitalisation-only diff. Now lowercased both at construction and on each new host. #5 (resource hygiene). Explicit `e.close()` on the redirect-path HTTPError so the underlying socket is released immediately rather than waiting on GC under load. Wrapped in try/finally so a raise from the redirect-extract logic still closes the response. #6 (dead return). _resolve_step_uuid now returns just the uuid rather than a (uuid, name) tuple. The name was never consumed by pipeline_logs. Callers that want the name should fetch pipeline_steps() themselves. Simpler API, no behavioural change. #7 (nit). Dropped redundant `safe='-'` from _wrap_uuid's quote() call. Dash is in urllib's always-safe unreserved set. #8 (naming). Renamed pipeline_steps_raw -> _pipeline_steps_by_uuid (leading underscore to match the rest of the file's internal-helper convention; the function was only called module-internally). Bash bug newly surfaced by the review, queued for 4.7: cmd_pipeline_trigger drops `variables` entirely when `pattern` is empty (bb:382-391: the no-pattern branch builds a payload without the variables array, even when the caller passed VAR=value pairs). Python pipeline_trigger handles this consistently; tests pin it. Deferred: #2: bash bug above — fixed in 4.7 not here (Python-only PR). #3: empty-variables payload shape (omit vs []). Python omits; bash emits []. Both are valid against Bitbucket. Picking one and aligning is a 4.7 follow-up. Test count unchanged (90); test_resolve_step_uuid updated to assert the new single-uuid return. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Tighten finally-close comment per re-review nit The finally runs on every HTTPError path (both 3xx and non-3xx), not just redirects. Reword the comment so a future reader doesn't assume the close is redirect-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.3: bb_ops pull-request operations + tests (#4) * Add bb_ops pull-request operations + tests Phase 4.3 of the MCP server work. Adds the PR subset of bb_ops with 33 new tests, 123 total. Public surface: prs_list(client, ws, repo, *, state="OPEN", count=25) State filter + paginated count, matching bash defaults. Walks pages when count > 100 (Bitbucket's cap). pr_show(client, ws, repo, pr_id) GET /pullrequests/{id} with positive-int validation on pr_id. pr_activity(client, ws, repo, pr_id, *, count=50) Walks the activity stream — used by bash to surface approver names, surfaced as its own op so the MCP agent can render its own view. pr_create(client, ws, repo, *, title, source_branch, destination_branch="main", description="", close_source_branch=True, reviewers=None) Builds the POST payload matching Bitbucket's contract. Optional reviewers as a list of UUID strings (Bitbucket expects `[{"uuid": "..."}, ...]`). Empty description / empty reviewers omitted from the payload (Python convention; bash's "always include empty description" is a 4.7 alignment item). pr_approve(client, ws, repo, pr_id) POST /approve. Returns the approval record — bash discards it with `> /dev/null`. pr_unapprove(client, ws, repo, pr_id) DELETE /approve. Not exposed by bash today (4.7 parity gap). pr_merge(client, ws, repo, pr_id, *, strategy="merge_commit", close_source_branch=True, message=None) PUT /merge with strategy + close_source_branch + optional message. Strategy validated at boundary: merge_commit / squash / fast_forward. Mirrors bash's PUT verb (Bitbucket has historically accepted both PUT and POST; flagged for 4.7 investigation against current REST docs). pr_decline(client, ws, repo, pr_id) POST /decline. pr_diff(client, ws, repo, pr_id, *, timeout=120) Routes through fetch_redirected_text so the cross-host-auth-strip protection applies if Bitbucket ever redirects this endpoint. Bash uses `curl -sf` without `-L`, so it would fail visibly in that case — Python continues to work safely (parity divergence noted for 4.7). pr_comments_list(client, ws, repo, pr_id, *, count=100) Paginated comments listing matching bash's default pagelen. pr_comment_add(client, ws, repo, pr_id, body) POST /comments with payload {"content": {"raw": "<text>"}}. Not exposed by bash today (4.7 parity gap). Validation: every function rejects invalid pr_id (must be positive int) and non-string body / message values at the Python boundary. Tests assert `opener.calls == []` after a rejection to prove no request was emitted on bad input. Bash parity items newly surfaced (queued for 4.7): - cmd_pr_approve / cmd_pr_decline discard API responses with > /dev/null. - cmd_pr_create hardcodes close_source_branch=true; expose as a flag. - cmd_pr_create always includes an empty description; align with Python's omit-when-empty. - cmd_pr_merge uses PUT; verify against current docs and align both sides on one verb. - cmd_pr_diff uses curl -sf without -L; Python's fetch handles redirects safely. Test discipline: every HTTP-touching test asserts URL, method, AND body shape, not just response value. Rejection tests assert opener.calls == [] to prove no network IO on bad input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 review fixes (bool-is-int, type checks, test coverage) Addresses all 12 findings from the code review. The strongest finding was a Python footgun: bool is a subclass of int, so a bare isinstance(x, int) check accepts True/False and stringifies them in URLs as 'True'/'False' rather than '1'/'0'. The cascading fix closes six findings at once. High-severity: #1 _validate_pr_id accepted bool. Now uses a shared _is_positive_int helper that explicitly excludes bool. Same fix applied to every count validator (prs_list, pr_activity, pr_comments_list, pipelines_list, _resolve_pipeline_uuid) and the step_index check. test_rejects_invalid_pr_id extended with True/False; the new test_every_pr_op_rejects_bad_pr_id matrix runs every PR-id-taking op (8 functions × 7 bad values) so a future refactor that drops _validate_pr_id from any one op gets caught. #2 pr_create(reviewers="alice-uuid") would have iterated the string char-by-char and built [{"uuid":"a"}, {"uuid":"l"}, ...]. Added an explicit isinstance(reviewers, str) early-reject with a hint ("did you mean [...]?"). Test added. #3 Same bool-is-int slip on every count parameter (fixed by the shared _is_positive_int helper above). Medium: #4 pr_create now type-checks `description`; a dict or other non-string raises rather than slipping into the JSON payload. Test added. #5 _KNOWN_PR_STATES docstring claim that Bitbucket accepts comma- separated `?state=OPEN,MERGED` was wrong; that shape returns 400 or empty. Docstring corrected to point at the BBQL `q` parameter for compound filtering. #6 close_source_branch now type-checked in both pr_create and pr_merge; a string like "yes" no longer slips into the JSON payload. Tests added. Low: #7 pr_merge(message="") was accepted; now rejected for consistency with pr_comment_add(body=""). Empty merge-message would leave the merge commit's subject line blank. #8 pr_diff docstring softened — the "Python continues to work safely" framing didn't acknowledge that the returned body would be the redirect target's content if a redirect ever materialised. Now spells out that today the diff endpoint doesn't redirect and flags the behavioural divergence from bash if it ever does. #9 Whitespace-only description (`description=" \n "`) now omitted from the payload via description.strip() check. Test added. Test gaps closed: #10 True/False added to test_rejects_invalid_pr_id. #11 New test_every_pr_op_rejects_bad_pr_id matrix covers every PR-id-taking op (9 functions × 7 bad values = 63 cases). #12 TestPrActivity now has count-walks-pages, invalid-count, and invalid-pr_id tests, symmetric with TestPrsList. Test count: 123 -> 141. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-2 review fixes Addresses 6 of 7 findings from the round-2 review. The strongest finding was a real boundary-validation gap: _KNOWN_PR_STATES was added in the previous fix round but never actually referenced — defined-and-documented as the prs_list state guard, yet the function only type-checked. Closes the gap. High: #1 prs_list now validates state against _KNOWN_PR_STATES. Typos ("OPENED"), case bugs ("open"), and unsupported compound forms ("OPEN,MERGED" — which Bitbucket does NOT accept on the simple ?state= filter; that needs ?q=) all fail at the boundary instead of burning an API call for a 400 / empty result. Symmetric with pr_merge's _VALID_MERGE_STRATEGIES check. #2 Whitespace-only string payloads no longer slip through. Replaced `not value` with `not value.strip()` in five places: - pr_create.title / source_branch / destination_branch - pr_merge.message - pr_comment_add.body Without this, a title like " " or a comment body like "\n\t" would have shipped to Bitbucket as accepted-but-visually-blank fields. Error messages now say "non-empty, non-whitespace". Test coverage: #3 TestPrsList.test_rejects_non_positive_count now includes True / False, matching TestPrActivity. Without symmetry, a refactor that reverted prs_list's _is_positive_int(count) check to a naive isinstance(count, int) would have regressed silently. #4 TestPrCommentsList now has test_rejects_non_positive_count with the same coverage shape (was happy-path only). #5 TestPrDiff.test_follows_cross_host_redirect_and_strips_auth pins the redirect-with-auth-strip behaviour at the pr_diff layer. Mirrors test_pipeline_logs.test_follows_s3_redirect_and_strips_auth — without it, a regression that wired pr_diff to plain client.get (which would 5xx on any future redirect because the default opener refuses 3xx) would not break any test in this file. #7 TestPrDiff.test_returns_raw_diff_text now also asserts the first hop carries Authorization. A regression that dropped the Basic header from the diff-fetch path would otherwise pass silently. Parametrized whitespace cases added across pr_create / pr_merge / pr_comment_add tests so each field's reject-on-whitespace is verified independently. Deferred (filed under 4.7): #6 pr_merge(strategy=[1,2]) raises TypeError (unhashable) rather than ValueError. Defensive-coding nit; either error surface stops the bad call. Defer to the parity sweep. Test count: 141 -> 155. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.3 round-3 review fixes Three small test/consistency fixes from the round-3 review. The PR is in good shape — reviewer explicitly called these out as non-blockers, but they're cheap and close the consistency loop. - pr_merge strategy validator now gates on isinstance(str) before the frozenset `in` check. Unhashable strategy values (list, set, dict) previously raised TypeError instead of the documented ValueError, breaking the "every boundary failure is ValueError" convention. Test extended to cover list/set/None/int alongside the original unknown-string case. - test_rejects_invalid_reviewer (formerly _empty_reviewer_uuid) now parameterizes over empty-string, None, int, and the most plausible caller mistake — pre-shaping as Bitbucket's payload contract `{"uuid": "..."}`. The validator already rejects all four; this pins coverage on the isinstance branch. - test_follows_cross_host_redirect_and_strips_auth now also asserts the first hop carries Authorization (a regression that dropped auth on the initial Bitbucket request would otherwise pass) and that the Location was actually followed (not refetched original). Symmetric with round-2 finding #7 on the no-redirect path. Test count: 155 -> 163. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.4: bb_ops repos / branches / vars / downloads / commits + tests (#5) * Add bb_ops repos / branches / vars / downloads / commits + tests Phase 4.4 of the MCP server work. Closes out the read-oriented surface on bb_ops (Phase 4.5 will add git context tools, then 4.6 wires everything into the MCP server). 48 new tests, 211 total. Public surface: repos_list(client, workspace=None, *, count=100, sort, query=None) Defaults workspace to client.config.workspace; pass explicit to query a different one (bash only ever uses BB_WORKSPACE — parity gap). Optional BBQL `q` filter for name-substring etc. (also bash gap). repo_show(client, ws, repo) GET /repositories/{ws}/{repo} — full metadata. branches_list(client, ws, repo, *, count=50, sort, query=None) Most-recently-updated first, matches bash default. Optional `q` filter. branch_show(client, ws, repo, name) Fetch a single branch by name. URL-encodes the name so `feat/widget` isn't interpreted as a sub-resource path. Not in bash today — useful for the agent's "does this branch exist?" pre-flight check before creating a PR. vars_list(client, ws, repo, *, count=100) Pipeline configuration variables. Bash masks secured values to `********` in its display layer; Python returns the raw dicts (Bitbucket sends `"value": null` for secured entries — MCP tools surface the `secured` flag so callers don't misread null as unset). downloads_list(client, ws, repo, *, count=25) Repository download artifacts. commits_list(client, ws, repo, *, branch=None, count=10) With branch=None, lists across all branches via /commits. With branch=NAME, lists commits reachable from that branch via /commits/{name}. Branch names URL-encoded. Not in bash today. Bash parity items newly surfaced (queued for 4.7): - bash repos_list / branches_list / vars_list / downloads_list don't expose BBQL `q` query filter or `count` parameters — agent has richer access by default. - bash has no branch_show, commits_list — add as new subcommands alongside the previously-queued pr_unapprove / pr_comment_add / whoami. Test discipline (same as prior bb_ops PRs): - Every HTTP test asserts URL, method, AND body shape per call. - Rejection tests assert opener.calls == [] (no network IO on bad input). - Bool-is-int (True/False) rejection covered on every count parameter. - Whitespace-only rejection on every required-string parameter. - Branch-name slash URL-encoding verified specifically in branch_show and commits_list. Test count: 163 -> 211. All passing locally on Python 3.11. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.4 review fixes Folds in all 5 findings from the code review (none blocking; reviewer explicitly called the PR "clean, well-disciplined"): 1. repos_list now applies the same workspace validation as bb_api.repo_path — rejects embedded '/', '.', '..' in addition to empty/whitespace. Without this, `workspace="acme/widget"` would silently build a single-repo endpoint URL and paginate against a response that lacks `values` — confusing failure mode the boundary validator exists to prevent everywhere else. repos_list is the only op that doesn't route through repo_path, so the check is duplicated here. 2. Removed redundant local `from urllib.parse import quote as _quote` in branch_show and commits_list; both now use the module-level `quote` import that's been there since 4.2. 3. Updated stale module docstring — was still saying "This PR (Phase 4.2) adds pipeline operations." Now reflects current scope (pipelines + PRs + repos/branches/vars/downloads/commits) and references the companion git_ops module that 4.5 will add. 4. TestBranchesList.test_query_filter now asserts the exact URL-encoded form of the BBQL query (matching the style of TestReposList.test_query_filter). The previous `"q=" in url` would have passed even if the query string were silently mangled. 5. Three new tests close coverage gaps: - TestBranchesList.test_rejects_empty_query — pin the empty/whitespace rejection that was implemented but untested. - TestCommitsList.test_count_walks_pages — pin commits_list's multi-page pagination (it has the most complex path shape of the bunch with the branch-vs-no-branch split). - TestReposList.test_rejects_workspace_with_slash + test_rejects_workspace_dot_segments — pin the new workspace validation from finding #1. Skipped optional suggestions: - vars_list `query=` filter (no concrete agent use case yet). - _collect_paginated() helper (5x repetition is still inline-readable; refactor when a 6th list op materialises). Test count: 211 -> 218. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.5: git_ops — git-context wrappers for the MCP agent (#6) * Add git_ops — git-context wrappers for the MCP agent Phase 4.5 of the MCP server work. Five subprocess-mocked wrappers the MCP agent uses to resolve "current branch", "is the tree clean?", "workspace/repo from origin", "recent commits", and "what's uncommitted?" — the context tools that have to run before any bb_ops call. 38 new tests, 256 total. git_ops.py public surface (stdlib-only, subprocess + bb_api.parse_remote_url): git_current_branch(path?) -> str `git rev-parse --abbrev-ref HEAD`. Detached HEAD returns the literal string "HEAD" (matches git's own output; callers detect detached state explicitly). git_remote_repo(path?) -> (workspace, repo_slug) Routes through bb_api.parse_remote_url for parity with bb_api.detect_repo's loose-host parsing (intentional — enterprise Bitbucket Server uses non-bitbucket.org URLs). Returns BOTH the workspace AND repo, unlike bb_api.detect_repo which returns only the slug. git_status(path?) -> dict Parses `git status --porcelain=v2 --branch --untracked-files=normal`. Returns a structured snapshot: branch, upstream, ahead, behind, clean, staged, modified, untracked, unmerged. Handles ordinary (type "1"), rename/copy (type "2", with new-path\told-path tab-separator), and unmerged (type "u") lines. Header lines parsed for branch.head / branch.upstream / branch.ab. git_recent_commits(path?, *, count=10, ref="HEAD") -> list[dict] `git log --pretty=format:%H<US>%h<US>%s<US>%an<US>%aI`. Uses U+001F (Unit Separator) as the field delimiter — a control character that cannot appear in commit subjects, so we never have to escape or parse-around content. Each entry: {sha, short, subject, author, date}. git_uncommitted_changes(path?) -> dict Three subprocess calls: git diff --cached -> staged_diff git diff -> working_diff git ls-files --others -> untracked_files --exclude-standard Diffs returned as raw unified-diff text; agent can show verbatim or parse further. Error handling: GitOpError carries the failing command + returncode + stderr (truncated in the message). Distinct from BBApiError so the MCP layer can render "git failure" vs "Bitbucket failure" differently. FileNotFoundError on `git` binary missing -> GitOpError(127, ..., "git executable not found on PATH"). Matches bb_api.detect_repo's behaviour. Test discipline (same as bb_api / bb_ops): - Every test asserts the EXACT subprocess shape (command + args + cwd + capture_output / text / check). A refactor swapping `git status --porcelain=v2` for v1 would change every parse result but the canned-output tests would still pass without this. - Parametrised tests on remote-URL shapes (https, ssh, embedded user-info, self-hosted Bitbucket Server) — verify the loose-host parser handles every shape the bash side accepts. - git_status parser tested against five realistic v2 captures: clean, ahead/behind, no-upstream, mixed (staged + modified + untracked), renamed, unmerged. - git_recent_commits format string pinned: %H, %h, %s, %an, %aI all present + U+001F delimiter. Malformed log lines skipped defensively rather than crashing. - Invalid-count / empty-ref rejection on git_recent_commits with True/False included (bool-as-int trap). - git_uncommitted_changes verifies the EXACT three-call sequence with explicit `cwd=` propagation on all three. CI / packaging: pyproject.toml py-modules += "git_ops". ci.yml syntax-check += "git_ops.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 review fixes Addresses 5 of 6 findings from the round-1 review (the 6th — str vs os.fspath — is pre-existing in bb_api and would be unrelated churn in this PR). Security: #1 git_recent_commits(ref=...) now rejects refs starting with '-'. Without this, an agent-supplied ref like `ref='--all'` would be parsed by git as an option flag, not a ref — silently replacing the call with a glob over all branches. `ref='-h'` would print help and return exit 0 with help text in stdout, which the parser would skip. The MCP agent is the consumer, and agent inputs are user-influenced, so option-injection at this boundary is a real concern. Belt + suspenders: also added `--` terminator to the `git log` invocation so anything past it is parsed as a positional ref regardless. Correctness: #3 GitOpError(returncode=0) for parse failures was misleading. A real git exit is always >= 0, so callers branching on `err.returncode == 0 -> success` would have been confused. Now parse-failure paths use the GIT_PARSE_ERROR_RETURNCODE sentinel (-1) and the class docstring documents the convention. Both parse-error tests now assert on the sentinel. Defensive: #4 git_recent_commits added an upper bound of _MAX_LOG_COUNT (1000) on `count` so an agent that confabulates `count=1_000_000` doesn't silently pull a million records back across the MCP boundary. Documentation: #2 git_status docstring now documents the C-quoted-pathname limitation as a known caveat (porcelain v2 line-oriented output quotes pathnames with newlines/tabs/quotes/control-chars). Fixing properly requires switching to `-z` + NUL-split parser — deferred until a real bug report shows up. #6 _LOG_FIELD_SEP comment now says "almost never" rather than "cannot" — git stores arbitrary bytes so U+001F technically could appear in a commit subject; the parser already handles it defensively. New tests: - 4 parametrized cases for ref-starts-with-dash rejection. - count-above-cap rejection + count-at-cap accepted (boundary verification). - Spaces-in-filename preserved (porcelain v2 type-1 line uses split(" ", 8) — the suggested regression guard). - Existing parse-failure tests now assert returncode is the sentinel. Skipped: #5 (str vs os.fspath nit). Pre-existing in bb_api.detect_repo, would be unrelated churn here; defer to a separate consistency PR. Test count: 256 -> 263. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-2 review fixes Substantive round-2 review surfaced 10 findings (medium to very-low), several with concrete failure modes the MCP server would hit in production. All 10 addressed plus the 3 test-coverage gaps. Medium-severity correctness: #1 git diff no longer leaks ANSI color codes. _run_git now prepends `git -c color.ui=never` to every invocation, overriding any user's `color.ui = always` gitconfig that would otherwise emit `\x1b[31m...` escapes into pipe-captured output. Diffs and log output are now safe to ship to the MCP agent verbatim. #2 Detached-HEAD reporting normalised. `git_current_branch` already returned the literal "HEAD" for detached state; the porcelain v2 parser returned "(detached)" for the same state. Now both return "HEAD" so cross-checks between the two functions agree on the underlying state. Test added. #3 subprocess.run now has timeout=30s with TimeoutExpired wrapped as GitOpError(returncode=GIT_PARSE_ERROR_RETURNCODE). A wedged git (held index.lock, credential-helper prompting on stdin, NFS server gone) can no longer hang the MCP server thread indefinitely. Test added. #4 subprocess.run now passes encoding="utf-8", errors="replace" explicitly. text=True alone defers to locale.getpreferredencoding(), which on minimal Docker / cron / systemd contexts is ASCII or C — a UTF-8 filename or author name would crash inside subprocess.run before _run_git could wrap it. Explicit UTF-8 + replace-on-error means we never lose to a locale-restricted host. Low-severity correctness / data loss: #6 splitlines() replaced with split("\n") in three parsers. The broader splitlines() also breaks on \r / \v / \f / U+0085 / U+2028 / U+2029, any of which in a commit subject (or path) would fragment one logical record into multiple "lines" that each fail the per-line guards and silently vanish. Test added covering carriage-return-in-subject preservation. #7 git_uncommitted_changes diffs now capped at _MAX_DIFF_BYTES (1 MiB) per diff, with an explicit truncation marker so the caller (the MCP agent) knows what happened and can fall back to `git diff --stat`. Without this cap, a stray multi-GB staged blob would OOM-kill the MCP server. Test added. #8 git_recent_commits parser now skips records with empty SHA. A line of pure separators ("\x1f\x1f\x1f\x1f") passes the parts-count guard and would otherwise append a degenerate {"sha":"", "short":"", ...} entry. Test added. Very-low defensive: #9 branch.ab parser now validates sign prefixes ("+N -M") before stripping. A malformed "-3 +1" line would otherwise propagate bogus values through as negative ahead / positive behind. Test added. #10 Type-1/2 XY field bounds-checked. A single-char XY from a corrupted git output stream no longer raises IndexError outside GitOpError; the malformed line is skipped defensively. Test added. Docstring: #5 _LOG_FIELD_SEP comment further softened — explicitly notes the trade-off ("rare data loss on pathological commit" vs "robust separator that won't collide with common subject content") and points to `git log -z` as the genuinely-safe variant if this ever bites a real user. Test-coverage gaps closed: - test_unborn_branch_raises_giterror — pin real-world behaviour where `git log` on a freshly-init'd repo exits 128 (the previous test_empty_output_returns_empty_list only exercised the mocked exit-0-empty-stdout path). - test_passes_cwd_when_path_given added to git_remote_repo, git_status, and git_recent_commits — was previously only on git_current_branch and git_uncommitted_changes. - test_detached_head_normalised_to_HEAD pins the new cross-function sentinel agreement. All subprocess-shape assertions updated for the new `git -c color.ui=never` prefix (argv indices shifted by 2). Skipped: - The U+001F dropping case (#5 partial) — already returns [] for a pathological subject; the docstring softening is the right response without a real-user bug report. Test count: 263 -> 274. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-3 review fixes (hardening) Reviewer flagged the PR as "in good shape to merge" but surfaced 5 hardening items, two of which they explicitly called out as worth-doing-before-MCP-wires-this-up. All 5 addressed plus the corresponding test coverage. Security: #2 Origin URL credential redaction. CI pipelines and credential- helper-less dev setups commonly embed auth in remote URLs (`https://x-token-auth:abcd@bitbucket.org/...`). On the rare unparseable URL, the secret would have landed in the GitOpError message, flowed up through MCP into agent context, and likely ended up in downstream logs. New `_redact_url_creds()` strips `://user:token@` -> `://[redacted]@` before any URL is echoed in an error string. Test added that asserts the secret token is absent from the raised error and that "[redacted]" appears. Reliability: #3 stdin / env hardening on every git call. `_run_git` now passes `stdin=subprocess.DEVNULL` so a credential prompt fails immediately with EOF rather than blocking on the MCP server's inherited stdin. Environment also adds GIT_TERMINAL_PROMPT=0 (git refuses to prompt at all) and GIT_ASKPASS="" (no GUI askpass dialog). Belt + suspenders alongside the 30s timeout — what was previously a 30-second wedge into "timed out" is now an instant clear error. Subprocess-shape assertions updated. UX: #1 FileNotFoundError disambiguation. subprocess.run raises FileNotFoundError for TWO distinct cases — git binary missing OR cwd directory missing — which the previous catch conflated into "git executable not found on PATH". Once the MCP agent is passing paths (e.g. a worktree that's since been removed), the user would chase their PATH config when the real cause is the bad directory. New code inspects `e.filename`: if it matches the cwd, raises with "path does not exist: <cwd>" instead. Test added covering the cwd-missing path. #4 Unborn-branch sentinel normalisation. Round 2 normalised `# branch.head (detached)` -> "HEAD" so git_status and git_current_branch agree on detached state. The same shape of asymmetry existed for unborn branches: porcelain v2 reports `# branch.oid (initial)` + `# branch.head main` (the would-be branch) while git_current_branch raises GitOpError on the same repo. Parser now tracks an `is_unborn` flag (line order independent — branch.oid / branch.head ordering in v2 output isn't documented as stable) and overrides branch to "HEAD" at end-of-loop so the two functions agree on "weird state, not a regular branch." Test added. Defensive: #5 `_cap_diff` fast-path on `len(text)`. UTF-8 byte count is always >= char count, so if char count fits the cap, the encoded form definitely does. Avoids materialising a transient bytes copy of a multi-GB string just to check its size — the cap exists specifically to defend against OOM on huge diffs, and the check itself shouldn't be the trigger. Deferred: None. All 5 findings addressed in this commit. Test count: 274 -> 277. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-4 review fixes Round 4 found a real bug I introduced in round 3 plus two cheap hardening items. All three addressed. Real bug fixed: #1 _cap_diff fast path was inverted. The round-3 fast path used `if len(text) <= cap: return text` with the reasoning "UTF-8 byte count is always >= char count, so chars <= cap implies bytes <= cap." That's backwards — bytes >= chars means `chars <= cap` does NOT imply `bytes <= cap`. A 600K-emoji string was 600K chars (under the 1 MiB char cap) but encoded to 2.4 MiB, defeating the OOM defense for non-ASCII content. The existing oversize-diff test used ASCII `"x"` chars (where chars == bytes), so the divergence was never exercised — the bug rode through round-3 review undetected. Fixed by gating the fast path on `text.isascii()`: ASCII-only strings genuinely have bytes == chars, so the len check is sound. Non-ASCII falls through to the encode-and-measure path. Regression test added with `"\U0001F600" * 600_000` (4-byte emoji) — asserts the returned diff is byte-capped, not char-capped. Hardening: #2 GitOpError.command in parse-failure paths now includes the `-c color.ui=never` prefix that _run_git actually passes to subprocess. Previously err.command in git_current_branch's empty-output path and git_remote_repo's unparseable-URL path hard-coded the unprefixed form, so a caller introspecting err.command (or reproducing the failure from the message) would see a different invocation than what was actually run. Tests strengthened to assert the prefix is present. #3 File-list cap added: _MAX_PATH_LIST = 10_000. Mirrors the _MAX_DIFF_BYTES and _MAX_LOG_COUNT bounds that already protect the other two agent-facing surfaces. Without this, a repo that forgot to gitignore node_modules / .venv / target/ would return hundreds of thousands of untracked paths through MCP — plausible OOM trigger. Applied to: - git_uncommitted_changes' untracked_files - _parse_status_porcelain_v2's staged / modified / untracked / unmerged lists New `_truncated_path_list` helper appends a sentinel marker (`<... N more paths omitted (cap N exceeded)>`) so the agent sees the truncation and can fall back to a narrower query. New TestPathListCap class pins the helper directly: under-cap / at-cap / one-over / many-over / git_status integration. Test count: 277 -> 284. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.5 round-5 review fixes Round 5 (extra-high recall mode) surfaced 12 findings. Reviewer's explicit verdict: "Findings 1, 2, and 4 are concrete defects worth fixing before Phase 4.6 wires this up to an agent." Addressed those three plus six more cheap real items; skipped three cosmetic / very- narrow ones. Correctness: #1 branch.ab strict validation. The startswith-only check accepted smuggled double-sign strings like "+-3 -1" where `int(parts[0][1:])` parsed as -3, contradicting the parser's own promise to reject negative values. Now requires `parts[N][1:].isdigit()` after the sign so "+5", "-3" parse but "+-3", "++5", "+junk" don't. Parametrized regression test added. #2 branch.ab atomic update. Previous try/except wrapped both int() calls together — if the first succeeded and the second raised, out["ahead"] was mutated to the new value while out["behind"] kept the default 0, producing an internally-inconsistent dict on a half-malformed line ("+5 -junk"). Now parses both into locals first, commits to out[] only on full success. #4 NotADirectoryError + PermissionError wrapped as GitOpError. Previously only FileNotFoundError + TimeoutExpired were caught, so passing path=/etc/passwd (regular file) or path=/restricted (unreadable directory) leaked raw OSErrors out of _run_git, breaking the module docstring's "Errors raise GitOpError" promise. Tests added with dedicated runner stubs for each error class. Security: #3 URL credential redaction handles `@` in passwords. The old `[^/@]+@` regex stopped at the first `@`, leaving the tail of a password containing a literal `@` exposed (e.g. `https://user:p@ss@host/...` → `https://[redacted]@ss@host/...`). Switched to `[^/]+@` which is greedy up to the LAST `@` before the path. Regression test added. Architecture: #11 _truncated_path_list returns (list, omitted_count) tuple instead of mixing a sentinel string into the path list. The in-list sentinel broke the "list of valid paths" contract — an agent iterating with `os.stat(p)` / `Path(p).exists()` / `os.path.join(root, p)` would hit the non-path "<...>" entry and crash or silently mis-handle it. Now the cap surfaces via sibling `<key>_omitted` fields (always present, 0 when not truncated) so the agent can detect truncation explicitly without iterating over non-path data. Applied to: - _parse_status_porcelain_v2: adds staged_omitted / modified_omitted / untracked_omitted / unmerged_omitted - git_uncommitted_changes: adds untracked_files_omitted #12 GIT_PARSE_ERROR_RETURNCODE moved from -1 to -1000. Python's subprocess.run uses negative returncodes for signal-killed processes (-1 = SIGHUP, -9 = SIGKILL, -15 = SIGTERM, etc.). A SIGHUP-killed git would otherwise land at returncode = -1, colliding with the parse-error sentinel and being misclassified. -1000 is safely outside the signal range. Test added asserting the sentinel is < -100. Documentation / parity: #5 git_status docstring updated to reflect the actual return shape. Previously said `"branch": ... | "(detached)"` but the code returns `"HEAD"` for both detached and unborn (per round 3's normalization). A caller reading the docstring and writing `if branch == "(detached)": handle_detached()` would never enter that branch. #6 Type-? (untracked) parser skips empty-path lines. Previously `"? "` (prefix only, no path body) would append "" to untracked and flip clean=False with a phantom entry. Asymmetric with type-1 / type-2 which have length guards. Test added. #8 Type-u (unmerged) parser checks XY field width. Type-1 and type-2 paths added this in round 3; the type-u path missed. Trivial parity fix. Test added. Skipped: #7 (cosmetic — diff cap is _MAX_DIFF_BYTES + ~140 marker bytes, not a strict cap; mattering only to a downstream consumer that enforces an exact ceiling, which no current consumer does) #9 (doc asymmetry — `git_uncommitted_changes` docstring doesn't mirror `git_status`'s C-quoting caveat; added to git_status' expanded docstring covering both surfaces) #10 (SCP-style URL credentials — narrow trigger, parse_remote_url accepts most SCP shapes; revisit if a real leak surfaces) Test count: 284 -> 296. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 4.6: mcp_server.py — FastMCP wrapper around bb_ops + git_ops (#7) * Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops function into a FastMCP tool registry, with a self-bootstrap venv mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total. mcp_server.py — 30 tools: PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps, pipeline_trigger, pipeline_stop, pipeline_logs PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create, pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff, pr_comments_list, pr_comment_add REPOS / METADATA (7): repos_list, repo_show, branches_list, branch_show, vars_list, downloads_list, commits_list GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo, git_recent_commits, git_uncommitted_changes META (1): whoami Self-bootstrap pattern (mirrors zenhub-cli): - On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10 (preferring the launching interpreter, falling back to pyenv / Homebrew / system locations), runs `python -m venv`, installs the `mcp` package, then `os.execv`s into the venv. - Bootstrap is stdlib-only (uses subprocess + sys, no third-party imports until after the venv is active). Means any python3 on PATH can launch the server. - `mcp` is the only venv dep — no heavy ML libs (no torch / sentence-transformers / numpy / huggingface_hub). bb doesn't have semantic-similarity workflows like zenhub-cli does. - BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets pytest exercise tool wiring and result-dict shapes without pulling in mcp. Production NEVER sets this — without real FastMCP, .run() raises. Repo resolution — every Bitbucket tool takes an optional `repo`: - "" → auto-detect via `git remote get-url origin` from BB_DEFAULT_REPO_PATH (or cwd) - "myrepo" → use BB_WORKSPACE from config + "myrepo" - "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides BB_WORKSPACE for this call) Validated in `_resolve_repo` with ValueError on malformed input ("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network call burns API budget). Error envelope — every tool returns either: {"ok": True, "workspace": ..., "repo": ..., <result fields>} {"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>} The error dict carries `status`/`url`/`body` for BBApiError, `returncode`/`stderr` for GitOpError, so the agent can branch on `kind == "BBApiError" and status == 404` without parsing the message. pr_create auto-detects source_branch via git_current_branch when empty (matches bash `bb pr-create`). whoami reports user / workspace / api_base + best-effort git context (branch, workspace, repo from remote). Token is NEVER echoed; the test explicitly asserts the token string is absent from the response. Test discipline (same as bb_api / bb_ops / git_ops): - Tool count + names pinned exactly (test_all_expected_tools_registered) — a future rename or accidental drop fails the suite. - Per-tool wiring: patch the bb_ops / git_ops function at the module level, call the tool, assert (a) the underlying function received the right args, (b) the response dict has the expected shape. Doesn't re-test bb_ops logic (already covered comprehensively). - Error envelope tested for each exception class (BBApiError, BBOpNotFound, GitOpError, ValueError). - whoami's no-token guarantee asserted as a security property. - FastMCP stub's .run() raises a clear error so a test that accidentally calls it fails loud instead of hanging on stdio. CI / packaging: pyproject.toml py-modules += "mcp_server". ci.yml syntax-check += "mcp_server.py". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 review fixes Round 1 review on PR #7 flagged 15 findings, with two explicitly called out as production hazards ("will bite real users on first deploy on common Linux topologies"). Addressed 11 of 15; deferring 4 with rationale. Production hazards (must-fix per reviewer): #1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)` instead of `realpath(sys.executable) == realpath(_VENV_PY)`. `python -m venv` on Linux/macOS defaults to --symlinks, so realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical path as the builder interpreter (e.g. /usr/bin/python3.12) — the realpath comparison was claiming "already under venv" while we were actually still under the system interpreter, skipping the execv and dying on `from mcp.server.fastmcp import FastMCP`. sys.prefix is set per-interpreter from the venv layout and is the authoritative signal. #2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C, OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing. The old guard skipped reinstall on every subsequent launch and died on import with no self-repair path. Now writes a `.bbenv-ready` sentinel ONLY after the full bootstrap succeeds; the venv create step is also idempotent (only runs if _VENV_PY is absent) so retries pick up where they left off. #3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned "mcp" would silently install a future breaking mcp 2.x on the next /tmp wipe. Matches the pyproject [mcp] extra. Correctness: #4 pr_create now rejects detached-HEAD auto-detect. git_current_branch returns the literal "HEAD" for detached/unborn state; bb_ops only checks non-empty/non-whitespace, so the call would have reached Bitbucket with source.branch.name="HEAD" and surfaced as an opaque API error. Now raises ValueError locally with a clear "pass source_branch=" hint. #5 _resolve_repo strips whitespace before parsing. A sloppy paste like " acme/widget " no longer slips through as workspace=" acme" and fails deep in the API layer. #11 pr_create source_branch auto-detect now triggers on whitespace (not just empty string), symmetric with #5. #13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths git_ops._run_git doesn't wrap explicitly (IsADirectoryError, BlockingIOError, ConnectionResetError), and catches os.getcwd() on a deleted cwd inside _default_repo_path() which fires BEFORE any wrapped git call. Surface improvements: #6 pipelines_list now exposes `sort=` kwarg. Default "-created_on" (newest first); agent can pass "created_on" for oldest-first or other sort keys. #7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default 120s matches bb_ops; agent can bump for pipelines with very large log payloads or oversized PR diffs that would otherwise cap out. #8 git_status payload renamed from `status` to `working_tree`. The `status` key collided with the HTTP-status field _error_dict uses for BBApiError. Today the collision can't fire (git_status doesn't raise BBApiError), but the rename pre-empts a future broadening hazard. #10 branch_show echoes the stripped name in the response. bb_ops strips before encoding the URL, so the response now matches what Bitbucket actually resolved rather than the unstripped input. Deferred (with rationale): #9 Response-key naming consistency (approval/result/pr). Reviewer suggested standardizing all mutation tools to "result", but the descriptive keys carry useful semantic info — pr_approve returns an approval record, pr_merge returns the merged PR, etc. The docstrings explain the shape. Skipping. #12 _client_cache concurrency + invalidation. Real concern for long-lived servers with rotated tokens, but stdio MCP is single-threaded and token rotation is rare; defer to Phase 4.6+ if a real bug surfaces. #14 CI smoke step that actually imports the real mcp package. Larger CI workflow change; defer. #15 README MCP documentation. Explicitly Phase 4.8 scope. Test count: 335 -> 341. All passing locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Apply Phase 4.6 round-2 review fixes Round 2 surfaced 15 new findings including a real security bug. Fixed all 15. Security: #1 BBApiError.url now redacted in _error_dict. pipeline_logs and pr_diff follow Bitbucket's 307 to a presigned S3 URL via fetch_redirected_text; on a downstream non-3xx (S3 clock skew, expired signature, redirect cap), BBApiError(url=<signed S3 URL>) carried AWS-credential-bearing query parameters (X-Amz-Signature, X-Amz-Credential) verbatim into the agent error dict and downstream logs. New _redact_url helper strips `user:token@` AND replaces signed-URL query strings with `[redacted-signed-url-params]` while preserving the path so the agent knows what host was called. Correctness: #2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null from an MCP client deserialises to Python None; `.strip()` on None crashes uncaught. Also pre-normalised every stringy arg path through `(value or "").strip()` so None and "" funnel to the same auto-detect path uniformly. #3 _default_repo_path no longer evaluates os.getcwd() eagerly. `os.environ.get("KEY", os.getcwd())` runs the default even when the env var is set — the override never protected against a deleted cwd. Now uses `... or os.getcwd()` so the env var is the lazy short-circuit. whoami also wraps cwd resolution in a try block so a deleted cwd surfaces as a structured cwd_error field instead of escaping the function. #4 mcp import wrapped with helpful recovery message. If `mcp/` gets deleted post-bootstrap (manual `pip uninstall`, partial /tmp cleanup that wiped the package dir but spared the sentinel), the ImportError now tells the user to `rm .bbenv-ready` or `rm -rf /tmp/bbenv`. #5 _resolve_repo now strips inner slug parts. "acme/ widget" no longer slips through as workspace="acme", slug=" widget" and 404s on `/repositories/acme/%20widget`. #6 repos_list workspace stripped. ` acme` / `acme ` no longer slip through. #7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was contradicting the file's stated philosophy). A refactor that renames a bb_ops kwarg now crashes visibly during dev/test instead of being silently reported back to the agent as a fake Bitbucket failure. #8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS (not narrow GitOpError) AND store the full structured _error_dict (not str(e)). An agent branching on err.returncode for git-127 vs not-a-git-repo-128 vs timeout-(-1000) has the same shape available as every other tool. #10 _opt_str() helper introduced. (branch or None) became `_opt_str(branch)` everywhere — "" / " " / None all funnel to None at the bb_ops boundary. Without this, " " on a branch filter posted `?target.ref_name=%20%20`, on a pattern raised a confusing two-layers-down ValueError, on commits raised a different "non-empty, non-whitespace" error. Inconsistent surfaces for the same agent typo. #12 sys.prefix venv check now resolves both sides through Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink doesn't trigger an infinite execv loop. #13 Every tool that takes an identifier (pr_id / number / step_index) now threads it into the error dict via _error_dict_with. An agent fanning out parallel pipeline_logs(number=42, step_index=5) and pipeline_logs(number=42, step_index=6) can correlate failures to originating requests without substring-matching the message. #14 _resolve_repo validates the repo arg shape BEFORE calling _get_client(). A fresh-machine user without config + a malformed slug now sees the actual ValueError, not the BBConfigError that previously masked the real cause. #15 git_recent_commits echoes the stripped ref in the response (matches what `git log` actually ran). Consistent with branch_show. UX: #11 pip ins…

Bump VERSION to 1.1.0 for release

40aa940

daniel-pittman merged commit a642baa into develop May 27, 2026
3 checks passed

daniel-pittman deleted the chore/bump-1.1.0 branch May 27, 2026 23:41

daniel-pittman mentioned this pull request May 27, 2026

Release v1.1.0: Python MCP server + bash CLI parity + agent definition #14

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bump VERSION to 1.1.0 for release#13

Bump VERSION to 1.1.0 for release#13
daniel-pittman merged 1 commit into
developfrom
chore/bump-1.1.0

daniel-pittman commented May 27, 2026

Uh oh!

Uh oh!

claude Bot commented May 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

daniel-pittman commented May 27, 2026

Summary

Scope of v1.1.0

Uh oh!

Uh oh!

claude Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review — PR #13

Summary

Assessment

Findings

Verdict

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented May 27, 2026 •

edited

Loading