Bump VERSION to 1.1.0 for release#13
Merged
Merged
Conversation
|
Claude finished @daniel-pittman's task in 25s —— View job Code Review — PR #13
SummaryThis PR is a single-line version bump in - 1.0.1
+ 1.1.0Assessment
FindingsNo blocking issues. No security, correctness, or style concerns in the diff. VerdictLGTM — ship it. Status notes this is already MERGED, so this review is informational. Confirm the • Branch: |
4 tasks
daniel-pittman
added a commit
that referenced
this pull request
May 27, 2026
* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops
Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops
function into a FastMCP tool registry, with a self-bootstrap venv
mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total.
mcp_server.py — 30 tools:
PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps,
pipeline_trigger, pipeline_stop, pipeline_logs
PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create,
pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff,
pr_comments_list, pr_comment_add
REPOS / METADATA (7): repos_list, repo_show, branches_list,
branch_show, vars_list, downloads_list, commits_list
GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo,
git_recent_commits, git_uncommitted_changes
META (1): whoami
Self-bootstrap pattern (mirrors zenhub-cli):
- On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10
(preferring the launching interpreter, falling back to pyenv /
Homebrew / system locations), runs `python -m venv`, installs the
`mcp` package, then `os.execv`s into the venv.
- Bootstrap is stdlib-only (uses subprocess + sys, no third-party
imports until after the venv is active). Means any python3 on PATH
can launch the server.
- `mcp` is the only venv dep — no heavy ML libs (no torch /
sentence-transformers / numpy / huggingface_hub). bb doesn't have
semantic-similarity workflows like zenhub-cli does.
- BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets
pytest exercise tool wiring and result-dict shapes without
pulling in mcp. Production NEVER sets this — without real
FastMCP, .run() raises.
Repo resolution — every Bitbucket tool takes an optional `repo`:
- "" → auto-detect via `git remote get-url origin`
from BB_DEFAULT_REPO_PATH (or cwd)
- "myrepo" → use BB_WORKSPACE from config + "myrepo"
- "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides
BB_WORKSPACE for this call)
Validated in `_resolve_repo` with ValueError on malformed input
("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network
call burns API budget).
Error envelope — every tool returns either:
{"ok": True, "workspace": ..., "repo": ..., <result fields>}
{"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>}
The error dict carries `status`/`url`/`body` for BBApiError,
`returncode`/`stderr` for GitOpError, so the agent can branch on
`kind == "BBApiError" and status == 404` without parsing the message.
pr_create auto-detects source_branch via git_current_branch when
empty (matches bash `bb pr-create`).
whoami reports user / workspace / api_base + best-effort git context
(branch, workspace, repo from remote). Token is NEVER echoed; the
test explicitly asserts the token string is absent from the response.
Test discipline (same as bb_api / bb_ops / git_ops):
- Tool count + names pinned exactly (test_all_expected_tools_registered)
— a future rename or accidental drop fails the suite.
- Per-tool wiring: patch the bb_ops / git_ops function at the module
level, call the tool, assert (a) the underlying function received
the right args, (b) the response dict has the expected shape.
Doesn't re-test bb_ops logic (already covered comprehensively).
- Error envelope tested for each exception class
(BBApiError, BBOpNotFound, GitOpError, ValueError).
- whoami's no-token guarantee asserted as a security property.
- FastMCP stub's .run() raises a clear error so a test that
accidentally calls it fails loud instead of hanging on stdio.
CI / packaging:
pyproject.toml py-modules += "mcp_server".
ci.yml syntax-check += "mcp_server.py".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 review fixes
Round 1 review on PR #7 flagged 15 findings, with two explicitly called
out as production hazards ("will bite real users on first deploy on
common Linux topologies"). Addressed 11 of 15; deferring 4 with
rationale.
Production hazards (must-fix per reviewer):
#1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)`
instead of `realpath(sys.executable) == realpath(_VENV_PY)`.
`python -m venv` on Linux/macOS defaults to --symlinks, so
realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical
path as the builder interpreter (e.g. /usr/bin/python3.12) — the
realpath comparison was claiming "already under venv" while we
were actually still under the system interpreter, skipping the
execv and dying on `from mcp.server.fastmcp import FastMCP`.
sys.prefix is set per-interpreter from the venv layout and is the
authoritative signal.
#2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C,
OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing.
The old guard skipped reinstall on every subsequent launch and
died on import with no self-repair path. Now writes a
`.bbenv-ready` sentinel ONLY after the full bootstrap succeeds;
the venv create step is also idempotent (only runs if _VENV_PY
is absent) so retries pick up where they left off.
#3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned
"mcp" would silently install a future breaking mcp 2.x on the
next /tmp wipe. Matches the pyproject [mcp] extra.
Correctness:
#4 pr_create now rejects detached-HEAD auto-detect. git_current_branch
returns the literal "HEAD" for detached/unborn state; bb_ops only
checks non-empty/non-whitespace, so the call would have reached
Bitbucket with source.branch.name="HEAD" and surfaced as an
opaque API error. Now raises ValueError locally with a clear
"pass source_branch=" hint.
#5 _resolve_repo strips whitespace before parsing. A sloppy paste
like " acme/widget " no longer slips through as
workspace=" acme" and fails deep in the API layer.
#11 pr_create source_branch auto-detect now triggers on whitespace
(not just empty string), symmetric with #5.
#13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths
git_ops._run_git doesn't wrap explicitly (IsADirectoryError,
BlockingIOError, ConnectionResetError), and catches
os.getcwd() on a deleted cwd inside _default_repo_path() which
fires BEFORE any wrapped git call.
Surface improvements:
#6 pipelines_list now exposes `sort=` kwarg. Default "-created_on"
(newest first); agent can pass "created_on" for oldest-first or
other sort keys.
#7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default
120s matches bb_ops; agent can bump for pipelines with very
large log payloads or oversized PR diffs that would otherwise
cap out.
#8 git_status payload renamed from `status` to `working_tree`. The
`status` key collided with the HTTP-status field _error_dict
uses for BBApiError. Today the collision can't fire (git_status
doesn't raise BBApiError), but the rename pre-empts a future
broadening hazard.
#10 branch_show echoes the stripped name in the response. bb_ops
strips before encoding the URL, so the response now matches what
Bitbucket actually resolved rather than the unstripped input.
Deferred (with rationale):
#9 Response-key naming consistency (approval/result/pr). Reviewer
suggested standardizing all mutation tools to "result", but the
descriptive keys carry useful semantic info — pr_approve returns
an approval record, pr_merge returns the merged PR, etc. The
docstrings explain the shape. Skipping.
#12 _client_cache concurrency + invalidation. Real concern for
long-lived servers with rotated tokens, but stdio MCP is
single-threaded and token rotation is rare; defer to Phase 4.6+
if a real bug surfaces.
#14 CI smoke step that actually imports the real mcp package.
Larger CI workflow change; defer.
#15 README MCP documentation. Explicitly Phase 4.8 scope.
Test count: 335 -> 341. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 round-2 review fixes
Round 2 surfaced 15 new findings including a real security bug. Fixed
all 15.
Security:
#1 BBApiError.url now redacted in _error_dict. pipeline_logs and
pr_diff follow Bitbucket's 307 to a presigned S3 URL via
fetch_redirected_text; on a downstream non-3xx (S3 clock skew,
expired signature, redirect cap), BBApiError(url=<signed S3 URL>)
carried AWS-credential-bearing query parameters
(X-Amz-Signature, X-Amz-Credential) verbatim into the agent
error dict and downstream logs. New _redact_url helper strips
`user:token@` AND replaces signed-URL query strings with
`[redacted-signed-url-params]` while preserving the path so
the agent knows what host was called.
Correctness:
#2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null
from an MCP client deserialises to Python None; `.strip()` on
None crashes uncaught. Also pre-normalised every stringy arg
path through `(value or "").strip()` so None and "" funnel to
the same auto-detect path uniformly.
#3 _default_repo_path no longer evaluates os.getcwd() eagerly.
`os.environ.get("KEY", os.getcwd())` runs the default even when
the env var is set — the override never protected against a
deleted cwd. Now uses `... or os.getcwd()` so the env var is
the lazy short-circuit. whoami also wraps cwd resolution in a
try block so a deleted cwd surfaces as a structured cwd_error
field instead of escaping the function.
#4 mcp import wrapped with helpful recovery message. If `mcp/`
gets deleted post-bootstrap (manual `pip uninstall`, partial
/tmp cleanup that wiped the package dir but spared the
sentinel), the ImportError now tells the user to
`rm .bbenv-ready` or `rm -rf /tmp/bbenv`.
#5 _resolve_repo now strips inner slug parts. "acme/ widget"
no longer slips through as workspace="acme", slug=" widget"
and 404s on `/repositories/acme/%20widget`.
#6 repos_list workspace stripped. ` acme` / `acme ` no longer
slip through.
#7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was
contradicting the file's stated philosophy). A refactor that
renames a bb_ops kwarg now crashes visibly during dev/test
instead of being silently reported back to the agent as a
fake Bitbucket failure.
#8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS
(not narrow GitOpError) AND store the full structured
_error_dict (not str(e)). An agent branching on
err.returncode for git-127 vs not-a-git-repo-128 vs
timeout-(-1000) has the same shape available as every other
tool.
#10 _opt_str() helper introduced. (branch or None) became
`_opt_str(branch)` everywhere — "" / " " / None all funnel
to None at the bb_ops boundary. Without this, " " on a
branch filter posted `?target.ref_name=%20%20`, on a pattern
raised a confusing two-layers-down ValueError, on commits
raised a different "non-empty, non-whitespace" error.
Inconsistent surfaces for the same agent typo.
#12 sys.prefix venv check now resolves both sides through
Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink
doesn't trigger an infinite execv loop.
#13 Every tool that takes an identifier (pr_id / number /
step_index) now threads it into the error dict via
_error_dict_with. An agent fanning out parallel
pipeline_logs(number=42, step_index=5) and
pipeline_logs(number=42, step_index=6) can correlate
failures to originating requests without substring-matching
the message.
#14 _resolve_repo validates the repo arg shape BEFORE calling
_get_client(). A fresh-machine user without config + a
malformed slug now sees the actual ValueError, not the
BBConfigError that previously masked the real cause.
#15 git_recent_commits echoes the stripped ref in the response
(matches what `git log` actually ran). Consistent with
branch_show.
UX:
#11 pip install no longer --quiet. Now uses subprocess.run
capture_output=True and re-raises with the captured stderr
inlined so a network blip / mirror 503 / version yank /
SSL cert error / proxy block surfaces with pip's actual
diagnostic instead of bare `CalledProcessError: returned
non-zero exit status 1`.
Deferred:
#9 Test recorder accepts any kwarg name (false sense of
wiring pinned). Real concern but the fix (functools.wraps
or create_autospec) is invasive across the test suite;
defer to a focused refactor.
Test count: 341 -> 348. All passing locally on Python 3.11.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 round-3 review fixes (security regression + edges)
Round 3 caught a real security regression of round 2's fix, plus a
batch of consistency / hardening items.
Security regression fixed:
#1 _error_dict's `message` field now also routed through URL
redaction. Round 2 redacted `out["url"]` but left
`out["message"]` carrying the raw URL — BBApiError.__str__
constructs the message via
`f"HTTP {status} from {url}: {body[:500]}"`, so str(e) ALWAYS
contains the original URL.
For the S3 presigned case (pipeline_logs / pr_diff → 307 → S3
non-3xx), the error dict was shipping:
{"url": "...?[redacted-signed-url-params]", # fine
"message": "HTTP 403 from https://.../?X-Amz-Signature=secret..."}
^^^^^^ secret leaked
The existing test_bbapierror_redacts_signed_s3_url only asserted
on d["url"], so the regression was invisible to CI. The fix adds
a `_redact_message` helper that scans for `http(s)://...`
substrings and runs each through `_redact_url`. Tests now assert
redaction on BOTH `url` AND `message` for every cred-bearing
fixture.
#9 Case-insensitive signed-URL indicator matching. MinIO / R2 /
Backblaze / mixed-case AWS variants would have slipped past the
case-sensitive substring check. Now lowercases the query part
before membership. Tests added for lowercase variant + plain
`Signature=` form (used by some non-AWS S3-compatible services).
Correctness:
#2 pipeline_trigger now validates + strips `branch`. Was the only
branch-accepting tool that forwarded the arg verbatim — every
other tool funnels through _opt_str, but pipeline_trigger
couldn't (branch is required, not optional). Now explicitly
validates non-empty/non-whitespace and strips, so `" main "`
becomes `"main"` and `" "` raises a clean local ValueError
instead of POSTing whitespace and getting an opaque 4xx.
#3 _resolve_repo bare-slug path now validates `repo not in (".", "..")`
before _get_client(). Round 2 fixed the slash-containing branch
but the bare-slug fallback still hit _get_client() first, so a
fresh-machine user with `repo='.'` saw BBConfigError masking
the real ValueError. Parametrized regression test added.
#5 pr_create now rejects explicit `source_branch="HEAD"` (not just
the auto-detect path). Same severity class as round-1 #4 — just
on the other entry point. Bitbucket would have silently created
a degenerate PR named after the literal HEAD ref.
#6 pr_create error path now uses _error_dict_with(e, title=...).
Threads the title into the error dict so parallel pr_create
fan-outs (e.g. agent creating one PR per stacked branch in a
train) can correlate failures with originating requests.
Defensive:
#10 os.execv argv now uses Path(__file__).resolve(). Today's
documented launch pattern uses absolute paths so the issue
can't fire; this hardens against a future bootstrap enhancement
or sitecustomize.py that introduces a chdir between launch and
execv.
Deferred (carried forward with rationale):
#4 _recorder accepts any kwarg name (false sense of wiring
pinned). Fixing requires either create_autospec or
inspect.signature.bind across the test suite — bigger refactor
than this commit. Notable enough to track separately.
#7 _client_cache invalidation. Long-lived MCP holds the original
BBClient for its lifetime; rotating BB_TOKEN requires restart.
Architectural, defer.
#8 git_recent_commits silently defaults None ref to "HEAD".
Cosmetic UX, defer.
Test count: 348 -> 356. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 round-4 review fixes (structural redaction)
Round 4 caught a third iteration of the same credential-leak pattern.
This commit closes it structurally so the class can't regress.
Critical security fix (structural):
#1 _error_dict now routes EVERY string field through _safe_text:
message, body (BBApiError), stderr (GitOpError). The previous
rounds whack-a-mole'd one field at a time:
- Round 2: redacted `url` → message still leaked
- Round 3: redacted `message` → body still leaked
- Round 4: redacted `body` → stderr would leak the moment
Phase 4.7 wraps `git fetch` / `git push` (whose stderr
commonly contains `fatal: unable to access 'https://x-token-
auth:TOKEN@bb.org/...'`).
Unified `_safe_text()` helper applies URL + SCP-cred redaction
to free-form text. Tests now assert redaction across ("url",
"message", "body") for the API case AND ("message", "stderr")
for the git case, plus a new test_giterror_stderr_redacted that
covers the Phase 4.7 hazard pre-emptively.
#2 SECURITY: signed-URL indicator list broadened beyond AWS.
Added: Azure SAS (`sig=`, lowercase only — the trailing `=`
blocks `signature=` substring matching), GCP X-Goog-Signature /
-Credential, bearer-token-in-URL forms (`access_token=`,
`api_key=`). Plus the existing AWS / generic `Signature=`.
Tests added for Azure SAS and bearer-token shapes.
#8 SECURITY: free-form-text URL redaction broadened to any
scheme. The previous `https?://` regex missed `ssh://`,
`git+ssh://`, etc. The new `_ANY_URL_PATTERN` matches any URL
scheme; combined with the existing SCP-style pattern, free-form
git stderr / log output gets redacted regardless of remote URL
shape. Test added: ssh:// URL with embedded passphrase redacted
by _safe_text.
Correctness:
#3 pr_create normalises every string arg at the wrapper boundary
(source_branch, destination_branch, title, description). The
wrapper previously only used `.strip()` for the empty-check
guard, then forwarded the unstripped value to bb_ops which
wrote it verbatim into the JSON payload. `"feat/widget "`
became a 404, `"Hi "` became a PR title with trailing
whitespace.
#4 pr_create handles JSON null for source_branch. Previously
`None.strip()` crashed uncaught with AttributeError (round 2
fixed this class for _resolve_repo but missed pr_create).
Title required validation also added — empty/whitespace title
raises ValueError with the rejected value echoed.
#6 _resolve_repo slash-branch now validates `.` / `..` in either
workspace OR repo segment BEFORE _get_client(). Round 3 only
landed the check on the bare-slug fallback; `repo="./foo"`
still surfaced as BBConfigError on a config-less machine.
Parametrized regression test added.
#9 git_recent_commits ref normalisation handles None and
whitespace-only uniformly. `(ref or "").strip() or "HEAD"` —
None and " " both fall back to HEAD, consistent with the
empty-string success path.
Test hermeticity:
#12 _reset_state fixture now scrubs BB_USER / BB_TOKEN /
BB_WORKSPACE / BB_API_BASE before each test. A developer
running pytest locally with these exported in their shell
no longer leaks ambient config into the tests — the suite
is hermetic against dev env.
Deferred (carried forward with rationale):
#5 Identifier-threading inconsistency on remaining tools
(pipeline_trigger / branches_list / branch_show / commits_list
/ repos_list / repo_show / vars_list / downloads_list). Real
but stylistic; pick this up in a focused symmetry pass.
#7 Stderr redaction proactive defense for future git wrappers —
addressed structurally by routing stderr through _safe_text in
#1 above (so technically not deferred).
#10 _client_cache invalidation. Architectural; documented in
earlier commit messages, defer.
#11 pr_approve/pr_unapprove key naming inconsistency. Pure
naming; intentional descriptive names per round-1 #9 deferral.
#13 _resolve_repo error path lacks workspace/repo/path. Style;
bundle with #5 in a symmetry pass.
Test count: 356 -> 360. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daniel-pittman
added a commit
that referenced
this pull request
May 27, 2026
* Phase 4.7: bash parity — 5 new commands + 8 parity bugfixes
Extends bb with the commands the Python ops exposed but bash didn't,
and fixes the parity bugs Python tests surfaced. Also removes a
personal-data leak in the help text.
NEW COMMANDS
bb pr-unapprove [repo] <id> DELETE /pullrequests/{id}/approve.
bb pr-comment [repo] <id> <body> POST /pullrequests/{id}/comments
with {"content": {"raw": "..."}}.
bb branch [repo] <name> GET single branch detail; URL-encodes
slashes in name (feat/widget →
feat%2Fwidget).
bb commits [repo] [branch] [count] GET /commits or /commits/{branch}
with branch URL-encoding + count.
bb whoami Resolved config + git context +
/user reachability probe. NEVER
echoes BB_TOKEN.
PARITY BUGFIXES
(a) cmd_pipeline_trigger no longer drops `variables` when `pattern`
is empty. Previously the no-pattern branch built the payload
WITHOUT the variables array, silently discarding any VAR=value
pairs the user passed.
(b) cmd_pipeline_stop captures the bb_post exit code instead of
`> /dev/null`-ing it. A stop failure (already stopped,
permission denied, 4xx) now surfaces with a clear error and
non-zero exit instead of being masked by the unconditional
success print.
(c) build_number → uuid lookup bumped 50 → 100 (Bitbucket's max
pagelen) in cmd_pipeline / cmd_pipeline_stop / cmd_logs.
Older pipelines beyond 100 still unfindable in bash; full
pagination is the Python-side improvement (2000-scan cap).
(d) repo_path now validates workspace + repo at the boundary:
rejects empty/whitespace, embedded `/`, `.`, and `..`.
Mirrors bb_api.repo_path's Python validation so both
surfaces enforce the same contract — no more silent
`/repositories//foo` or `/repositories/../foo`.
(e) cmd_pipeline_trigger empty-variables payload now omits the
key entirely (was inconsistent: pattern-set path emitted
`variables: []`, no-pattern path dropped variables). Matches
Python's omit-when-empty contract.
(f) cmd_pr_approve / cmd_pr_decline now check the bb_post exit
code instead of discarding it — symmetric with (b). Approve /
decline failures (already approved, declined PR can't be
re-declined, etc.) now surface to the user.
(h) cmd_pr_create omits `description` from the payload when
empty/whitespace. Matches the Python omit-when-empty
contract.
(j) cmd_pr_diff now uses `curl -sfL` (was `-sf`) so a future
redirect on this endpoint doesn't fail silently. `curl -L -u`
does NOT resend credentials to a different host by default,
so the cross-host credential-leak concern is already
mitigated by curl's own semantics — no manual auth-stripping
required here.
Removed personal-data leak: help text previously named a specific
personal workspace as the default. Generic example only now.
DEFERRED
(g) cmd_pr_create hardcodes close_source_branch=true. Both
surfaces share this default; exposing as a flag would be a
coordinated bash + Python change. Defer to a future PR.
(i) cmd_pr_merge uses PUT; modern Bitbucket REST docs use POST.
Bitbucket accepts both for this endpoint and bash's PUT is
the verified-working contract. Python also uses PUT for
parity. Coordinated investigation deferred.
(k) cmd_pr_merge unhashable-strategy raises TypeError instead
of ValueError. n/a in bash — bash doesn't enforce types this
way.
(l) Bash repos_list / branches_list / vars_list / downloads_list
don't expose BBQL `q` query filter or `count` parameters.
Agent gets richer access via the MCP surface; bash retains
its simpler human-driven shape. Defer.
Smoke tests run locally:
- bash -n bb (clean)
- bb help (all 5 new commands appear)
- Python suite: 360/360 still green
DISPATCHER
Added case-statement entries for: pr-unapprove (alias `prua`),
pr-comment, branch, commits, whoami. Help text reorganised to
show the new commands grouped with their family.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.7 review fixes (security + real bugs in new code)
Round 1 review surfaced 15 findings. Folding 8 — the critical-and-mine
items plus a few cheap symmetries; deferring 7 pre-existing items that
weren't directly touched.
Security:
#1 cmd_whoami strips `user:token@` from origin URL before echoing.
`git remote get-url origin` returns the raw URL including any
embedded credentials (a common pattern for token-based git auth).
whoami output is exactly the kind of thing users paste into bug
reports and screenshots — can't have secrets in it. Matches the
`_redact_url` helper in mcp_server.py / git_ops.py.
#6 cmd_whoami probes `/repositories/{workspace}?pagelen=1` instead
of `/user`. Atlassian's workspace-scoped tokens (the modern
recommended type) reject /user with 401/403 while still serving
/repositories/{workspace} correctly — the old probe would have
told users with valid tokens to rotate them.
Real bugs in code this PR added or touched:
#2 cmd_watch pagelen bumped 50 → 100. I missed this when bumping
cmd_pipeline / cmd_pipeline_stop / cmd_logs in the same PR. A
pipeline at positions 51-100 in the recent list would never
match here and the watch loop would spin forever printing blanks.
#3 cmd_pipeline_trigger variables built via jq instead of string
concatenation. The previous shape produced invalid JSON when
any value contained `"`, `\`, newline, or tab — the `jq -n
--argjson vars` would then refuse to parse and abort after the
user-visible `Triggering pipeline...` had already printed. JSON
injection vector for any agent-supplied variable value.
#4 cmd_pipeline_trigger `shift 3 || true` no longer masks the
under-3-args case. Previously `bb trigger myrepo` (1 arg) left
"myrepo" in $@ and the var-loop parsed it as a VAR=VALUE pair,
sending `{"key":"myrepo","value":"myrepo"}` as a pipeline
variable. Now guards on `$# -ge 3` and explicitly shifts the
available count.
#15 cmd_pipeline_trigger now rejects detached HEAD. `git rev-parse
--abbrev-ref HEAD` exits 0 with the literal "HEAD" in detached
state, so the `|| echo "main"` fallback didn't fire. Bitbucket
400s on `ref_name=HEAD`. Clean local error instead.
Validation parity (symmetric with Python):
#5 repo_path whitespace check now uses `[[:space:]]+` regex
instead of `${var// }` (which stripped only ASCII space).
Tabs / newlines / CRs now correctly rejected — true parity
with Python's `.strip()`.
#9 repo_path now uses `kill -TERM $$` on validation failure
instead of `exit 1`. `exit 1` inside a `$(repo_path ...)` only
terminates the subshell; the caller proceeded with an empty
path and ate a confusing layered failure. `kill -TERM $$`
terminates the parent script so validation actually halts.
#7 + #14 cmd_pr_merge captures exit code AND validates strategy.
Mirror of the (b)/(f) capture pattern this PR established for
cmd_pr_approve / cmd_pr_unapprove / cmd_pr_decline /
cmd_pipeline_stop. A merge conflict / failing-required-build /
typo-strategy now surfaces with a labelled error including
likely causes instead of `set -e` silently aborting.
Deferred (pre-existing, not touched by this PR's scope):
#8 cmd_pr_comment_add stdin fallback — nice-to-have parity with
cmd_pr_create's pattern. Defer.
#10 build_number numeric validation in cmd_pipeline / cmd_logs /
cmd_pipeline_stop — pre-existing. Defer.
#11 cmd_branch_show jq null-safety on .target.hash / .target.date.
Real but narrow (annotated-tag refs). Defer.
#12 cmd_pr_diff curl failures print nothing — pre-existing
asymmetry. Defer.
#13 cmd_commits count validation — minor. Defer.
Smoke tests: bash -n bb clean, bb help renders, Python 360/360 still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.7 round-2 review fixes
Round 2 caught a regression I introduced in round 1 + several real
items the first pass missed.
Critical:
#1 cmd_whoami origin URL fallback now actually fires when git
fails. My round-1 fix piped through sed for credential redaction
but the `||` attached to sed (which always exits 0 on empty
input) instead of git. So `bb whoami` in a non-git directory
printed `Origin:` blank instead of the promised `(no origin
remote)` placeholder. Refactored to an `if remote=$(...); then
... else ... fi` block so the fallback branches on git's
exit status.
#2 cmd_pipeline_trigger variables parser switched to NUL delimiter
(was newline). My round-1 fix replaced unsafe string-concat
with `printf '%s\n' "$@" | jq -Rs 'split("\n") ...'` — but
that still fragments any VAR=value where value contains a
newline. `VAR=$'line1\nline2'` became `{VAR:line1}` plus a
ghost `{line2:""}`. NUL is the safe delimiter; printf's
`%s\0` + jq's `split(" ")` round-trip cleanly.
#3 SECURITY: build_number now validated numeric in every command
that splices it into a jq filter (cmd_pipeline, cmd_watch,
cmd_logs, cmd_pipeline_stop). The previous raw interpolation
let jq treat a non-numeric value as an undefined-function
reference (typo path) OR — confirmed by the reviewer — as a
jq-injection vector via `$ENV` exfil (e.g. `1) | $ENV.BB_TOKEN,
.uuid` would land the token in pipeline_uuid → next URL path).
New `_require_build_number` helper enforces `^[0-9]+$` at the
boundary; rejected input never reaches jq.
#4 cmd_pipeline_trigger `Variables: $*` echo now masks values.
Previously emitted the user's raw VAR=value args to stdout —
secrets in terminal scrollback, CI logs, shell-history wrappers,
`bb trigger ... | tee` outputs. Now prints `KEY=***` for each
variable. Contradicted the "NEVER echoes BB_TOKEN" claim in
cmd_whoami.
#8 repo_path whitespace check now catches MIXED whitespace, not
just all-whitespace. My round-1 `^[[:space:]]+$` regex let
`' acme '` through; `tr -d '[:space:]'` + equality check now
rejects any value that differs from its stripped form. True
parity with Python's `.strip()`.
Symmetry (rc-capture pattern applied to commands the first round
missed):
#5 cmd_pr_create now captures the bb_post exit code. A 400
(dest branch typo, duplicate PR open, source not pushed)
surfaces as a labelled error with likely causes instead of
`set -e` silently aborting after the "Creating PR:" banner.
#6 cmd_pipeline_trigger now captures the bb_post exit code with
the same labelled-error pattern. Common 4xx causes (protected
branch, custom-pipeline-not-found, invalid variable shape)
spelled out.
Deferred (pre-existing, narrow):
#7 cmd_branch_show / cmd_commits print banner before fetch.
Same pattern as #5/#6 but on read commands where the silent
failure has less consequence; the curl exit code does at least
surface non-zero. Defer to a follow-up if needed.
#9 cmd_branch_show jq null-safety on .target.hash / .target.date.
Real but narrow (annotated-tag refs). Defer.
#10 cmd_pr_diff curl-silent failure. Pre-existing. Defer.
#11 cmd_commits count not validated. Minor. Defer.
Smoke tests: bash -n clean, bb help renders, Python 360/360 still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
daniel-pittman
added a commit
that referenced
this pull request
May 27, 2026
* Phase 4.1: bb_api.py + Python test infrastructure (MCP foundation) (#2)
* Add bb_api.py + pyproject + Python test infrastructure (MCP foundation)
First slice of the MCP server work. Establishes the Python module layout
the MCP server will sit on top of, plus the test seam and CI extension
required to develop the rest of the modules with confidence.
bb_api.py — Python parallel of bb's config / auth / transport helpers.
Stdlib-only (urllib.request), so the MCP server's bootstrap stays small.
Public surface:
- BBConfig: resolved credentials + workspace + API base URL.
- load_config(): env > ~/.config/bb/config > .env > error; tolerant of
bash `export KEY=value` and quoted values, matching the bash script's
`source` semantics close enough for parity.
- parse_remote_url() / detect_repo(): both HTTPS and SSH remote shapes;
repos with dots in their slugs are preserved (bash regex truncates).
- repo_path(): /repositories/{workspace}/{repo}, rejecting embedded
slashes that would silently change the URL structure.
- BBClient: thin transport with .get / .post / .put / .delete / .paginate.
Pagination is defended against runaway cursors (max 200 pages) and
stuck cursors (`next` URL unchanged between iterations), and refuses
to follow `next` URLs that point at a different host than api_base.
- BBApiError carries status + URL + body so callers can branch on
expected non-2xx responses.
bb and bb_api are PARALLEL implementations of the same Bitbucket REST
contract — the MCP server does not shell out to bb. The bug-parity rule
is now documented in CONTRIBUTING.md: tests verify correct API behaviour,
and a defect surfaced by a Python test fixes both code paths.
pyproject.toml — flat `py-modules = ["bb_api"]` layout (additional modules
land in their own PRs). `mcp` is in an optional [mcp] extra so test
matrices stay lean.
tests/ — pytest with two responsibilities in conftest.py: set
BB_MCP_SKIP_BOOTSTRAP=1 before any mcp_server import to defend against
the future self-bootstrap venv pattern, and inject the repo root onto
sys.path. 36 tests covering config precedence (env > file > error),
remote-URL parsing (HTTPS + SSH + repos-with-dots), client transport
(URL + method + auth header + body shape per call, not just response
status), and pagination (walking, stuck-cursor defence, iteration cap,
cross-host refusal, empty page).
CI extension — Python 3.10/3.11/3.12 matrix added to the existing
`syntax` job. fail-fast: false so a failure on one version doesn't
mask another. Installs only pytest; the MCP runtime deps stay out of
unit-test CI because the suite mocks urllib.
CODEOWNERS — extended to gate review on bb_api.py, future bb_ops.py /
git_ops.py / mcp_server.py, pyproject.toml, and the agents/ directory.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply review fixes for Phase 4.1 foundation
Addresses 13 of the 15 findings from the Phase 4.1 code review. The
strongest finding (#1) was a parity bug where a test pinned the wrong
contract — exactly the anti-pattern the testing methodology warns about.
Parity / correctness fixes:
1. dotenv > home-config precedence. Bash sources ~/.config/bb/config
first, then `.env`, so `.env` wins. Python had the order inverted
and the test pinned the inverted behaviour. Now matches bash; the
test now verifies the correct contract.
2. Empty env var no longer falls through to file. Switched from
`env.get(key) or file_config.get(key)` to a membership test so
`BB_TOKEN=""` is treated as an explicit (failing) value rather
than absence — matches bash's `[[ -z ]]` check behaviour.
3. URLError (DNS / TLS / connection / timeout) now wraps as
BBApiError(status=0) instead of escaping the documented contract.
4. Pagination cross-host refusal hardened against the prefix trick.
`startswith(api_base)` is replaced with a separator-aware check
so `https://api.bitbucket.org/2.0evil.example.com/...` no longer
slips past the host-mismatch guard.
5. BB_MCP_SKIP_BOOTSTRAP is now set unconditionally in conftest;
`setdefault` was a no-op if a developer had `BB_MCP_SKIP_BOOTSTRAP=0`
in their shell, defeating the guard silently.
6. Paginated response missing the `values` key now raises rather
than being silently treated as an empty page (which could leave a
hole in the result set if `next` was still set).
7. Non-string `next` raises BBApiError with a clear message rather
than crashing on AttributeError.
8. repo_path() rejects empty / whitespace-only / `.` / `..` values
in addition to embedded `/`. Defends against path-traversal where
`/repositories/../widget` normalises to the wrong workspace.
10. Query values validated as scalar or list-of-scalar; nested dicts
and arbitrary objects raise TypeError instead of being silently
stringified by urlencode(doseq=True) into a misleading URL.
11. BB_API_BASE trailing slash normalised on load. Prevents
`api_base + "/path"` becoming `//path`.
13. Replaced `assert ... is not None` with an explicit raise so
`python -O` deployments retain the safety net if a future edit
narrows the missing-keys check.
14. detect_repo test now asserts the exact subprocess call shape
(command, capture_output, text, cwd, check). Previously the test
was canned-output-only and a refactor to a different git command
would have passed silently.
15. BBClient methods accept a per-call `timeout=` override for
endpoints that legitimately take longer (log streaming, large
diffs). Default still comes from the client.
Tightened docstrings:
- BBApiError now documents `status=0` as the sentinel for
non-HTTP transport / malformed-response errors so callers know
to gate HTTP-status dispatch on `status > 0`.
Deferred with rationale:
#9 (remote URL parser accepts non-Bitbucket hosts): intentional
for enterprise / self-hosted Bitbucket support. The bash side has
the same loose parsing. No restriction added.
#12 (BBApiError used as transport-error wrapper): the dedicated
exception class would expand the surface; the docstring update
above is sufficient until a real consumer needs to discriminate.
Test count: 36 -> 55. All passing locally on Python 3.11.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply follow-up review fixes (security: no-redirect default, test rigor)
Follow-up review on commit e2567e8 verified all 13 prior fixes landed
correctly and surfaced 6 new findings, 3 of them worth taking before
more code lands on top of this foundation.
Security:
new #1. Default BBClient opener no longer follows 3xx redirects.
urllib's default HTTPRedirectHandler resubmits the original request
against the Location URL — INCLUDING the Authorization header. urllib
does not strip auth on cross-origin redirects, so a misconfigured
proxy or DNS hijack pointing api.bitbucket.org at another host would
have leaked the Basic auth credential to that host on the first 3xx.
`bb` doesn't have this exposure: `curl -sf` doesn't follow redirects
without -L, and the one command that needs to (`bb logs`) opts in
explicitly. Mirror that posture here.
Implementation: _NoRedirectHandler subclasses HTTPRedirectHandler and
returns None from redirect_request, which causes urllib to surface
the 3xx as an HTTPError that our existing handler wraps into
BBApiError. Two tests cover it — one introspects the default opener's
handler chain, the other unit-tests the handler's redirect_request
contract directly.
When bb_ops adds pipeline_logs (which legitimately needs to follow a
307 to S3), that work will require a separate code path that strips
Authorization on cross-host hops rather than weakening this default.
Test rigor:
new #2. test_paginate_walks_pages now asserts the URL for each of the
three calls (base, base?page=2, base?page=3). Without this, a
regression that refetched the original URL three times would still
consume the three queued responses and pass — exactly the
mock-returns-success-regardless-of-request anti-pattern.
Cleanup:
new #4. Removed the dead `page_query = None` assignment in paginate().
The variable was only read on the first iteration (when url is None),
and the reset ran after `url` had been assigned, so it was never
observed. Removed and the closure variable `query` is used directly.
Documentation:
new #5. parse_remote_url() now has an IMPORTANT caller-contract note:
the function does not anchor to a Bitbucket host (intentional for
enterprise / self-hosted Bitbucket support). Callers feeding it
untrusted external URLs (webhook payloads, etc.) must verify the
host upstream before calling. If a future consumer needs that
guarantee, the right shape is a `strict=True` mode here, not
duplicated checks at every call site.
Deferred:
new #3 (BB_API_BASE shape validation) — low-impact today; will revisit
if a self-hosted Bitbucket Server use case lands a typo bug.
new #6 (older _fake_runner tests still allow kwargs drift) — minor;
the new recording-style test_detect_repo_invokes_git_remote_get_url
is the load-bearer, the looser fixtures around it are acceptable.
Test count: 55 -> 57. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 4.2: bb_ops pipeline operations + cross-host-safe log fetch (#3)
* Add bb_ops pipeline operations + cross-host-safe log fetch
Phase 4.2 of the MCP server work. Adds the pipeline subset of bb_ops
(list, show, steps, trigger, stop, logs), plus a new bb_api transport
method for following redirects safely. 33 new tests, 90 total.
bb_ops.py — public surface:
pipelines_list(client, ws, repo, *, count=10, sort, branch=None)
Paginates pipelines/?sort=-created_on. Honours `count` exactly by
walking pages as needed; clamps per-page to Bitbucket's 100 cap.
Optional branch filter via target.ref_name.
pipeline_show(client, ws, repo, build_number)
Resolves build_number -> uuid then GETs /pipelines/{uuid}.
pipeline_steps(client, ws, repo, build_number)
Same uuid lookup, then GETs /pipelines/{uuid}/steps/.
pipeline_trigger(client, ws, repo, *, branch, pattern=None, variables=None)
Builds the POST payload (target + optional custom-pattern selector
+ optional variables-as-list-of-{key,value}). Variables accept dict
or iterable-of-pairs at the boundary. Rejects non-string values
(Bitbucket's contract).
pipeline_stop(client, ws, repo, build_number)
POSTs to /pipelines/{uuid}/stopPipeline. Returns the API response
(typically None on 204). The bash equivalent discards the response
with `> /dev/null`, masking failures — flagged as a 4.7 parity fix.
pipeline_logs(client, ws, repo, build_number, step_index, *, timeout=120)
Uses the new BBClient.fetch_redirected_text() path to follow
Bitbucket's 307-to-S3 redirect on the log endpoint with
Authorization stripped on cross-host hops.
bb_api.py — BBClient.fetch_redirected_text():
Necessary because the log endpoint sometimes returns 200 with an
inline log body and sometimes returns 307 to a signed S3 URL. The
default opener refuses 3xx (credential-safety), so this method opens
with the default opener, catches HTTPError 3xx, extracts Location,
and follows up to max_redirects hops. CRITICAL: when the next hop's
host differs from api.bitbucket.org, Authorization is removed from
the request headers — Basic credentials never reach S3 or any other
redirect target. Once stripped, the credential stays stripped for
the rest of the chain.
Bash's `bb logs` uses `curl -sfL` which (depending on curl version)
may send Authorization on cross-host redirects. Whether that's
presently broken in bash is a parity question for 4.7; Python is
correct by construction.
Test coverage (33 new):
_wrap_uuid: brace normalisation, URL encoding, whitespace tolerance.
_strip_uuid_braces: bare/braced/empty inputs.
pipelines_list: single page, count > 100 (pagination), mid-page
cutoff, branch filter URL shape, non-positive-count rejection
without emitting a request.
_resolve_pipeline_uuid: first-page hit, multi-page walk, not-found
raises BBOpNotFound (distinct from BBApiError), invalid build_number
rejection, scan_limit cap respected.
pipeline_show: uuid lookup THEN show by uuid; URL uses %7B...%7D
bracket encoding (bash contract).
pipeline_steps: uuid lookup THEN list steps.
pipeline_trigger: default-pipeline payload shape, custom-pipeline
payload shape (selector embedded), variables-as-dict shape,
variables-as-iterable-of-pairs shape, empty-variables omits the
key (not an empty list), non-string variable value rejected,
empty pattern rejected, empty branch rejected.
pipeline_stop: POST to /stopPipeline with no body; 204 returns None.
pipeline_logs: inline body (no redirect), 307 follow with
Authorization stripped on cross-host (the security-critical test),
too-many-redirects raises, missing Location raises.
_resolve_step_uuid: valid index, out-of-range raises BBOpNotFound,
negative index rejected.
Every HTTP-touching test asserts request URL, method, and body shape
- not just the response value. Guards against the
mock-returns-success-regardless-of-body anti-pattern.
CI / packaging:
pyproject.toml py-modules now includes bb_ops.
ci.yml syntax-checks bb_ops.py alongside bb_api.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.2 review fixes (robustness + cleanup)
Addresses 6 of 8 follow-up suggestions from the Phase 4.2 code review.
Approve-with-suggestions; nothing blocking, all minor robustness or
clean-up items.
#1 (doc). Soften the cross-host-auth narrative in
fetch_redirected_text. `curl -L -u` also strips credentials on
cross-host hops; only `-H "Authorization: ..."` would leak. Bash
is safe today via `-u`. Python's explicit strip is the symmetric
future-proof version (in case the MCP server ever needs
header-based auth).
#4 (robustness). Case-insensitive cross-host comparison in
fetch_redirected_text. RFC 3986 §3.2.2 makes hostnames
case-insensitive; a redirect to API.bitbucket.org would have
triggered a needless auth-strip on a capitalisation-only diff.
Now lowercased both at construction and on each new host.
#5 (resource hygiene). Explicit `e.close()` on the redirect-path
HTTPError so the underlying socket is released immediately
rather than waiting on GC under load. Wrapped in try/finally so
a raise from the redirect-extract logic still closes the
response.
#6 (dead return). _resolve_step_uuid now returns just the uuid
rather than a (uuid, name) tuple. The name was never consumed
by pipeline_logs. Callers that want the name should fetch
pipeline_steps() themselves. Simpler API, no behavioural change.
#7 (nit). Dropped redundant `safe='-'` from _wrap_uuid's quote()
call. Dash is in urllib's always-safe unreserved set.
#8 (naming). Renamed pipeline_steps_raw -> _pipeline_steps_by_uuid
(leading underscore to match the rest of the file's
internal-helper convention; the function was only called
module-internally).
Bash bug newly surfaced by the review, queued for 4.7:
cmd_pipeline_trigger drops `variables` entirely when `pattern` is
empty (bb:382-391: the no-pattern branch builds a payload without
the variables array, even when the caller passed VAR=value pairs).
Python pipeline_trigger handles this consistently; tests pin it.
Deferred:
#2: bash bug above — fixed in 4.7 not here (Python-only PR).
#3: empty-variables payload shape (omit vs []). Python omits;
bash emits []. Both are valid against Bitbucket. Picking
one and aligning is a 4.7 follow-up.
Test count unchanged (90); test_resolve_step_uuid updated to assert
the new single-uuid return.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Tighten finally-close comment per re-review nit
The finally runs on every HTTPError path (both 3xx and non-3xx),
not just redirects. Reword the comment so a future reader doesn't
assume the close is redirect-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 4.3: bb_ops pull-request operations + tests (#4)
* Add bb_ops pull-request operations + tests
Phase 4.3 of the MCP server work. Adds the PR subset of bb_ops with 33
new tests, 123 total.
Public surface:
prs_list(client, ws, repo, *, state="OPEN", count=25)
State filter + paginated count, matching bash defaults. Walks pages
when count > 100 (Bitbucket's cap).
pr_show(client, ws, repo, pr_id)
GET /pullrequests/{id} with positive-int validation on pr_id.
pr_activity(client, ws, repo, pr_id, *, count=50)
Walks the activity stream — used by bash to surface approver names,
surfaced as its own op so the MCP agent can render its own view.
pr_create(client, ws, repo, *, title, source_branch,
destination_branch="main", description="",
close_source_branch=True, reviewers=None)
Builds the POST payload matching Bitbucket's contract. Optional
reviewers as a list of UUID strings (Bitbucket expects
`[{"uuid": "..."}, ...]`). Empty description / empty reviewers
omitted from the payload (Python convention; bash's "always
include empty description" is a 4.7 alignment item).
pr_approve(client, ws, repo, pr_id)
POST /approve. Returns the approval record — bash discards it with
`> /dev/null`.
pr_unapprove(client, ws, repo, pr_id)
DELETE /approve. Not exposed by bash today (4.7 parity gap).
pr_merge(client, ws, repo, pr_id, *, strategy="merge_commit",
close_source_branch=True, message=None)
PUT /merge with strategy + close_source_branch + optional message.
Strategy validated at boundary: merge_commit / squash / fast_forward.
Mirrors bash's PUT verb (Bitbucket has historically accepted both
PUT and POST; flagged for 4.7 investigation against current REST
docs).
pr_decline(client, ws, repo, pr_id)
POST /decline.
pr_diff(client, ws, repo, pr_id, *, timeout=120)
Routes through fetch_redirected_text so the cross-host-auth-strip
protection applies if Bitbucket ever redirects this endpoint. Bash
uses `curl -sf` without `-L`, so it would fail visibly in that
case — Python continues to work safely (parity divergence noted
for 4.7).
pr_comments_list(client, ws, repo, pr_id, *, count=100)
Paginated comments listing matching bash's default pagelen.
pr_comment_add(client, ws, repo, pr_id, body)
POST /comments with payload {"content": {"raw": "<text>"}}. Not
exposed by bash today (4.7 parity gap).
Validation: every function rejects invalid pr_id (must be positive int)
and non-string body / message values at the Python boundary. Tests
assert `opener.calls == []` after a rejection to prove no request was
emitted on bad input.
Bash parity items newly surfaced (queued for 4.7):
- cmd_pr_approve / cmd_pr_decline discard API responses with > /dev/null.
- cmd_pr_create hardcodes close_source_branch=true; expose as a flag.
- cmd_pr_create always includes an empty description; align with
Python's omit-when-empty.
- cmd_pr_merge uses PUT; verify against current docs and align both
sides on one verb.
- cmd_pr_diff uses curl -sf without -L; Python's fetch handles
redirects safely.
Test discipline: every HTTP-touching test asserts URL, method, AND body
shape, not just response value. Rejection tests assert opener.calls == []
to prove no network IO on bad input.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.3 review fixes (bool-is-int, type checks, test coverage)
Addresses all 12 findings from the code review. The strongest finding
was a Python footgun: bool is a subclass of int, so a bare
isinstance(x, int) check accepts True/False and stringifies them in
URLs as 'True'/'False' rather than '1'/'0'. The cascading fix closes
six findings at once.
High-severity:
#1 _validate_pr_id accepted bool. Now uses a shared _is_positive_int
helper that explicitly excludes bool. Same fix applied to every
count validator (prs_list, pr_activity, pr_comments_list,
pipelines_list, _resolve_pipeline_uuid) and the step_index check.
test_rejects_invalid_pr_id extended with True/False; the new
test_every_pr_op_rejects_bad_pr_id matrix runs every PR-id-taking
op (8 functions × 7 bad values) so a future refactor that drops
_validate_pr_id from any one op gets caught.
#2 pr_create(reviewers="alice-uuid") would have iterated the string
char-by-char and built [{"uuid":"a"}, {"uuid":"l"}, ...]. Added an
explicit isinstance(reviewers, str) early-reject with a hint
("did you mean [...]?"). Test added.
#3 Same bool-is-int slip on every count parameter (fixed by the
shared _is_positive_int helper above).
Medium:
#4 pr_create now type-checks `description`; a dict or other
non-string raises rather than slipping into the JSON payload.
Test added.
#5 _KNOWN_PR_STATES docstring claim that Bitbucket accepts comma-
separated `?state=OPEN,MERGED` was wrong; that shape returns 400
or empty. Docstring corrected to point at the BBQL `q` parameter
for compound filtering.
#6 close_source_branch now type-checked in both pr_create and
pr_merge; a string like "yes" no longer slips into the JSON
payload. Tests added.
Low:
#7 pr_merge(message="") was accepted; now rejected for consistency
with pr_comment_add(body=""). Empty merge-message would leave the
merge commit's subject line blank.
#8 pr_diff docstring softened — the "Python continues to work
safely" framing didn't acknowledge that the returned body would
be the redirect target's content if a redirect ever materialised.
Now spells out that today the diff endpoint doesn't redirect and
flags the behavioural divergence from bash if it ever does.
#9 Whitespace-only description (`description=" \n "`) now omitted
from the payload via description.strip() check. Test added.
Test gaps closed:
#10 True/False added to test_rejects_invalid_pr_id.
#11 New test_every_pr_op_rejects_bad_pr_id matrix covers every
PR-id-taking op (9 functions × 7 bad values = 63 cases).
#12 TestPrActivity now has count-walks-pages, invalid-count, and
invalid-pr_id tests, symmetric with TestPrsList.
Test count: 123 -> 141. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.3 round-2 review fixes
Addresses 6 of 7 findings from the round-2 review. The strongest finding
was a real boundary-validation gap: _KNOWN_PR_STATES was added in the
previous fix round but never actually referenced — defined-and-documented
as the prs_list state guard, yet the function only type-checked. Closes
the gap.
High:
#1 prs_list now validates state against _KNOWN_PR_STATES. Typos
("OPENED"), case bugs ("open"), and unsupported compound forms
("OPEN,MERGED" — which Bitbucket does NOT accept on the simple
?state= filter; that needs ?q=) all fail at the boundary instead
of burning an API call for a 400 / empty result. Symmetric with
pr_merge's _VALID_MERGE_STRATEGIES check.
#2 Whitespace-only string payloads no longer slip through. Replaced
`not value` with `not value.strip()` in five places:
- pr_create.title / source_branch / destination_branch
- pr_merge.message
- pr_comment_add.body
Without this, a title like " " or a comment body like "\n\t"
would have shipped to Bitbucket as accepted-but-visually-blank
fields. Error messages now say "non-empty, non-whitespace".
Test coverage:
#3 TestPrsList.test_rejects_non_positive_count now includes True /
False, matching TestPrActivity. Without symmetry, a refactor that
reverted prs_list's _is_positive_int(count) check to a naive
isinstance(count, int) would have regressed silently.
#4 TestPrCommentsList now has test_rejects_non_positive_count with
the same coverage shape (was happy-path only).
#5 TestPrDiff.test_follows_cross_host_redirect_and_strips_auth pins
the redirect-with-auth-strip behaviour at the pr_diff layer. Mirrors
test_pipeline_logs.test_follows_s3_redirect_and_strips_auth — without
it, a regression that wired pr_diff to plain client.get (which
would 5xx on any future redirect because the default opener
refuses 3xx) would not break any test in this file.
#7 TestPrDiff.test_returns_raw_diff_text now also asserts the first
hop carries Authorization. A regression that dropped the Basic
header from the diff-fetch path would otherwise pass silently.
Parametrized whitespace cases added across pr_create / pr_merge /
pr_comment_add tests so each field's reject-on-whitespace is verified
independently.
Deferred (filed under 4.7):
#6 pr_merge(strategy=[1,2]) raises TypeError (unhashable) rather
than ValueError. Defensive-coding nit; either error surface stops
the bad call. Defer to the parity sweep.
Test count: 141 -> 155. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.3 round-3 review fixes
Three small test/consistency fixes from the round-3 review. The PR is
in good shape — reviewer explicitly called these out as non-blockers,
but they're cheap and close the consistency loop.
- pr_merge strategy validator now gates on isinstance(str) before
the frozenset `in` check. Unhashable strategy values (list, set,
dict) previously raised TypeError instead of the documented
ValueError, breaking the "every boundary failure is ValueError"
convention. Test extended to cover list/set/None/int alongside
the original unknown-string case.
- test_rejects_invalid_reviewer (formerly _empty_reviewer_uuid) now
parameterizes over empty-string, None, int, and the most plausible
caller mistake — pre-shaping as Bitbucket's payload contract
`{"uuid": "..."}`. The validator already rejects all four; this
pins coverage on the isinstance branch.
- test_follows_cross_host_redirect_and_strips_auth now also asserts
the first hop carries Authorization (a regression that dropped
auth on the initial Bitbucket request would otherwise pass) and
that the Location was actually followed (not refetched original).
Symmetric with round-2 finding #7 on the no-redirect path.
Test count: 155 -> 163. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 4.4: bb_ops repos / branches / vars / downloads / commits + tests (#5)
* Add bb_ops repos / branches / vars / downloads / commits + tests
Phase 4.4 of the MCP server work. Closes out the read-oriented surface
on bb_ops (Phase 4.5 will add git context tools, then 4.6 wires
everything into the MCP server). 48 new tests, 211 total.
Public surface:
repos_list(client, workspace=None, *, count=100, sort, query=None)
Defaults workspace to client.config.workspace; pass explicit to
query a different one (bash only ever uses BB_WORKSPACE — parity
gap). Optional BBQL `q` filter for name-substring etc. (also
bash gap).
repo_show(client, ws, repo)
GET /repositories/{ws}/{repo} — full metadata.
branches_list(client, ws, repo, *, count=50, sort, query=None)
Most-recently-updated first, matches bash default. Optional `q`
filter.
branch_show(client, ws, repo, name)
Fetch a single branch by name. URL-encodes the name so
`feat/widget` isn't interpreted as a sub-resource path. Not in
bash today — useful for the agent's "does this branch exist?"
pre-flight check before creating a PR.
vars_list(client, ws, repo, *, count=100)
Pipeline configuration variables. Bash masks secured values to
`********` in its display layer; Python returns the raw dicts
(Bitbucket sends `"value": null` for secured entries — MCP tools
surface the `secured` flag so callers don't misread null as unset).
downloads_list(client, ws, repo, *, count=25)
Repository download artifacts.
commits_list(client, ws, repo, *, branch=None, count=10)
With branch=None, lists across all branches via /commits.
With branch=NAME, lists commits reachable from that branch via
/commits/{name}. Branch names URL-encoded. Not in bash today.
Bash parity items newly surfaced (queued for 4.7):
- bash repos_list / branches_list / vars_list / downloads_list don't
expose BBQL `q` query filter or `count` parameters — agent has
richer access by default.
- bash has no branch_show, commits_list — add as new subcommands
alongside the previously-queued pr_unapprove / pr_comment_add /
whoami.
Test discipline (same as prior bb_ops PRs):
- Every HTTP test asserts URL, method, AND body shape per call.
- Rejection tests assert opener.calls == [] (no network IO on bad input).
- Bool-is-int (True/False) rejection covered on every count parameter.
- Whitespace-only rejection on every required-string parameter.
- Branch-name slash URL-encoding verified specifically in branch_show
and commits_list.
Test count: 163 -> 211. All passing locally on Python 3.11.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.4 review fixes
Folds in all 5 findings from the code review (none blocking; reviewer
explicitly called the PR "clean, well-disciplined"):
1. repos_list now applies the same workspace validation as
bb_api.repo_path — rejects embedded '/', '.', '..' in addition to
empty/whitespace. Without this, `workspace="acme/widget"` would
silently build a single-repo endpoint URL and paginate against a
response that lacks `values` — confusing failure mode the boundary
validator exists to prevent everywhere else. repos_list is the
only op that doesn't route through repo_path, so the check is
duplicated here.
2. Removed redundant local `from urllib.parse import quote as _quote`
in branch_show and commits_list; both now use the module-level
`quote` import that's been there since 4.2.
3. Updated stale module docstring — was still saying "This PR
(Phase 4.2) adds pipeline operations." Now reflects current
scope (pipelines + PRs + repos/branches/vars/downloads/commits)
and references the companion git_ops module that 4.5 will add.
4. TestBranchesList.test_query_filter now asserts the exact
URL-encoded form of the BBQL query (matching the style of
TestReposList.test_query_filter). The previous `"q=" in url`
would have passed even if the query string were silently
mangled.
5. Three new tests close coverage gaps:
- TestBranchesList.test_rejects_empty_query — pin the
empty/whitespace rejection that was implemented but untested.
- TestCommitsList.test_count_walks_pages — pin commits_list's
multi-page pagination (it has the most complex path shape
of the bunch with the branch-vs-no-branch split).
- TestReposList.test_rejects_workspace_with_slash +
test_rejects_workspace_dot_segments — pin the new workspace
validation from finding #1.
Skipped optional suggestions:
- vars_list `query=` filter (no concrete agent use case yet).
- _collect_paginated() helper (5x repetition is still inline-readable;
refactor when a 6th list op materialises).
Test count: 211 -> 218. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 4.5: git_ops — git-context wrappers for the MCP agent (#6)
* Add git_ops — git-context wrappers for the MCP agent
Phase 4.5 of the MCP server work. Five subprocess-mocked wrappers the
MCP agent uses to resolve "current branch", "is the tree clean?",
"workspace/repo from origin", "recent commits", and "what's
uncommitted?" — the context tools that have to run before any bb_ops
call. 38 new tests, 256 total.
git_ops.py public surface (stdlib-only, subprocess + bb_api.parse_remote_url):
git_current_branch(path?) -> str
`git rev-parse --abbrev-ref HEAD`. Detached HEAD returns the literal
string "HEAD" (matches git's own output; callers detect detached
state explicitly).
git_remote_repo(path?) -> (workspace, repo_slug)
Routes through bb_api.parse_remote_url for parity with
bb_api.detect_repo's loose-host parsing (intentional — enterprise
Bitbucket Server uses non-bitbucket.org URLs). Returns BOTH the
workspace AND repo, unlike bb_api.detect_repo which returns only
the slug.
git_status(path?) -> dict
Parses `git status --porcelain=v2 --branch --untracked-files=normal`.
Returns a structured snapshot:
branch, upstream, ahead, behind, clean, staged, modified,
untracked, unmerged.
Handles ordinary (type "1"), rename/copy (type "2", with
new-path\told-path tab-separator), and unmerged (type "u") lines.
Header lines parsed for branch.head / branch.upstream / branch.ab.
git_recent_commits(path?, *, count=10, ref="HEAD") -> list[dict]
`git log --pretty=format:%H<US>%h<US>%s<US>%an<US>%aI`. Uses
U+001F (Unit Separator) as the field delimiter — a control
character that cannot appear in commit subjects, so we never have
to escape or parse-around content. Each entry:
{sha, short, subject, author, date}.
git_uncommitted_changes(path?) -> dict
Three subprocess calls:
git diff --cached -> staged_diff
git diff -> working_diff
git ls-files --others -> untracked_files
--exclude-standard
Diffs returned as raw unified-diff text; agent can show verbatim
or parse further.
Error handling:
GitOpError carries the failing command + returncode + stderr (truncated
in the message). Distinct from BBApiError so the MCP layer can render
"git failure" vs "Bitbucket failure" differently.
FileNotFoundError on `git` binary missing -> GitOpError(127, ..., "git
executable not found on PATH"). Matches bb_api.detect_repo's behaviour.
Test discipline (same as bb_api / bb_ops):
- Every test asserts the EXACT subprocess shape (command + args +
cwd + capture_output / text / check). A refactor swapping
`git status --porcelain=v2` for v1 would change every parse
result but the canned-output tests would still pass without this.
- Parametrised tests on remote-URL shapes (https, ssh, embedded
user-info, self-hosted Bitbucket Server) — verify the loose-host
parser handles every shape the bash side accepts.
- git_status parser tested against five realistic v2 captures:
clean, ahead/behind, no-upstream, mixed (staged + modified +
untracked), renamed, unmerged.
- git_recent_commits format string pinned: %H, %h, %s, %an, %aI all
present + U+001F delimiter. Malformed log lines skipped
defensively rather than crashing.
- Invalid-count / empty-ref rejection on git_recent_commits with
True/False included (bool-as-int trap).
- git_uncommitted_changes verifies the EXACT three-call sequence
with explicit `cwd=` propagation on all three.
CI / packaging:
pyproject.toml py-modules += "git_ops".
ci.yml syntax-check += "git_ops.py".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.5 review fixes
Addresses 5 of 6 findings from the round-1 review (the 6th — str vs
os.fspath — is pre-existing in bb_api and would be unrelated churn in
this PR).
Security:
#1 git_recent_commits(ref=...) now rejects refs starting with '-'.
Without this, an agent-supplied ref like `ref='--all'` would be
parsed by git as an option flag, not a ref — silently replacing
the call with a glob over all branches. `ref='-h'` would print
help and return exit 0 with help text in stdout, which the
parser would skip. The MCP agent is the consumer, and agent
inputs are user-influenced, so option-injection at this boundary
is a real concern. Belt + suspenders: also added `--` terminator
to the `git log` invocation so anything past it is parsed as a
positional ref regardless.
Correctness:
#3 GitOpError(returncode=0) for parse failures was misleading. A
real git exit is always >= 0, so callers branching on
`err.returncode == 0 -> success` would have been confused. Now
parse-failure paths use the GIT_PARSE_ERROR_RETURNCODE sentinel
(-1) and the class docstring documents the convention. Both
parse-error tests now assert on the sentinel.
Defensive:
#4 git_recent_commits added an upper bound of _MAX_LOG_COUNT (1000)
on `count` so an agent that confabulates `count=1_000_000`
doesn't silently pull a million records back across the MCP
boundary.
Documentation:
#2 git_status docstring now documents the C-quoted-pathname
limitation as a known caveat (porcelain v2 line-oriented output
quotes pathnames with newlines/tabs/quotes/control-chars). Fixing
properly requires switching to `-z` + NUL-split parser — deferred
until a real bug report shows up.
#6 _LOG_FIELD_SEP comment now says "almost never" rather than
"cannot" — git stores arbitrary bytes so U+001F technically could
appear in a commit subject; the parser already handles it
defensively.
New tests:
- 4 parametrized cases for ref-starts-with-dash rejection.
- count-above-cap rejection + count-at-cap accepted (boundary
verification).
- Spaces-in-filename preserved (porcelain v2 type-1 line uses
split(" ", 8) — the suggested regression guard).
- Existing parse-failure tests now assert returncode is the
sentinel.
Skipped: #5 (str vs os.fspath nit). Pre-existing in bb_api.detect_repo,
would be unrelated churn here; defer to a separate consistency PR.
Test count: 256 -> 263. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.5 round-2 review fixes
Substantive round-2 review surfaced 10 findings (medium to very-low),
several with concrete failure modes the MCP server would hit in
production. All 10 addressed plus the 3 test-coverage gaps.
Medium-severity correctness:
#1 git diff no longer leaks ANSI color codes. _run_git now prepends
`git -c color.ui=never` to every invocation, overriding any
user's `color.ui = always` gitconfig that would otherwise emit
`\x1b[31m...` escapes into pipe-captured output. Diffs and log
output are now safe to ship to the MCP agent verbatim.
#2 Detached-HEAD reporting normalised. `git_current_branch` already
returned the literal "HEAD" for detached state; the porcelain v2
parser returned "(detached)" for the same state. Now both return
"HEAD" so cross-checks between the two functions agree on the
underlying state. Test added.
#3 subprocess.run now has timeout=30s with TimeoutExpired wrapped
as GitOpError(returncode=GIT_PARSE_ERROR_RETURNCODE). A wedged git
(held index.lock, credential-helper prompting on stdin, NFS server
gone) can no longer hang the MCP server thread indefinitely. Test
added.
#4 subprocess.run now passes encoding="utf-8", errors="replace"
explicitly. text=True alone defers to locale.getpreferredencoding(),
which on minimal Docker / cron / systemd contexts is ASCII or C —
a UTF-8 filename or author name would crash inside subprocess.run
before _run_git could wrap it. Explicit UTF-8 + replace-on-error
means we never lose to a locale-restricted host.
Low-severity correctness / data loss:
#6 splitlines() replaced with split("\n") in three parsers. The
broader splitlines() also breaks on \r / \v / \f / U+0085 /
U+2028 / U+2029, any of which in a commit subject (or path) would
fragment one logical record into multiple "lines" that each fail
the per-line guards and silently vanish. Test added covering
carriage-return-in-subject preservation.
#7 git_uncommitted_changes diffs now capped at _MAX_DIFF_BYTES (1 MiB)
per diff, with an explicit truncation marker so the caller (the
MCP agent) knows what happened and can fall back to `git diff
--stat`. Without this cap, a stray multi-GB staged blob would
OOM-kill the MCP server. Test added.
#8 git_recent_commits parser now skips records with empty SHA. A
line of pure separators ("\x1f\x1f\x1f\x1f") passes the
parts-count guard and would otherwise append a degenerate
{"sha":"", "short":"", ...} entry. Test added.
Very-low defensive:
#9 branch.ab parser now validates sign prefixes ("+N -M") before
stripping. A malformed "-3 +1" line would otherwise propagate
bogus values through as negative ahead / positive behind. Test
added.
#10 Type-1/2 XY field bounds-checked. A single-char XY from a
corrupted git output stream no longer raises IndexError outside
GitOpError; the malformed line is skipped defensively. Test
added.
Docstring:
#5 _LOG_FIELD_SEP comment further softened — explicitly notes the
trade-off ("rare data loss on pathological commit" vs "robust
separator that won't collide with common subject content") and
points to `git log -z` as the genuinely-safe variant if this
ever bites a real user.
Test-coverage gaps closed:
- test_unborn_branch_raises_giterror — pin real-world behaviour
where `git log` on a freshly-init'd repo exits 128 (the previous
test_empty_output_returns_empty_list only exercised the mocked
exit-0-empty-stdout path).
- test_passes_cwd_when_path_given added to git_remote_repo,
git_status, and git_recent_commits — was previously only on
git_current_branch and git_uncommitted_changes.
- test_detached_head_normalised_to_HEAD pins the new cross-function
sentinel agreement.
All subprocess-shape assertions updated for the new `git -c
color.ui=never` prefix (argv indices shifted by 2).
Skipped:
- The U+001F dropping case (#5 partial) — already returns [] for
a pathological subject; the docstring softening is the right
response without a real-user bug report.
Test count: 263 -> 274. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.5 round-3 review fixes (hardening)
Reviewer flagged the PR as "in good shape to merge" but surfaced 5
hardening items, two of which they explicitly called out as
worth-doing-before-MCP-wires-this-up. All 5 addressed plus the
corresponding test coverage.
Security:
#2 Origin URL credential redaction. CI pipelines and credential-
helper-less dev setups commonly embed auth in remote URLs
(`https://x-token-auth:abcd@bitbucket.org/...`). On the rare
unparseable URL, the secret would have landed in the GitOpError
message, flowed up through MCP into agent context, and likely
ended up in downstream logs. New `_redact_url_creds()` strips
`://user:token@` -> `://[redacted]@` before any URL is echoed
in an error string. Test added that asserts the secret token
is absent from the raised error and that "[redacted]" appears.
Reliability:
#3 stdin / env hardening on every git call. `_run_git` now passes
`stdin=subprocess.DEVNULL` so a credential prompt fails
immediately with EOF rather than blocking on the MCP server's
inherited stdin. Environment also adds GIT_TERMINAL_PROMPT=0
(git refuses to prompt at all) and GIT_ASKPASS="" (no GUI
askpass dialog). Belt + suspenders alongside the 30s timeout —
what was previously a 30-second wedge into "timed out" is now
an instant clear error. Subprocess-shape assertions updated.
UX:
#1 FileNotFoundError disambiguation. subprocess.run raises
FileNotFoundError for TWO distinct cases — git binary missing
OR cwd directory missing — which the previous catch conflated
into "git executable not found on PATH". Once the MCP agent is
passing paths (e.g. a worktree that's since been removed), the
user would chase their PATH config when the real cause is the
bad directory. New code inspects `e.filename`: if it matches
the cwd, raises with "path does not exist: <cwd>" instead.
Test added covering the cwd-missing path.
#4 Unborn-branch sentinel normalisation. Round 2 normalised
`# branch.head (detached)` -> "HEAD" so git_status and
git_current_branch agree on detached state. The same shape of
asymmetry existed for unborn branches: porcelain v2 reports
`# branch.oid (initial)` + `# branch.head main` (the would-be
branch) while git_current_branch raises GitOpError on the same
repo. Parser now tracks an `is_unborn` flag (line order
independent — branch.oid / branch.head ordering in v2 output
isn't documented as stable) and overrides branch to "HEAD" at
end-of-loop so the two functions agree on "weird state, not a
regular branch." Test added.
Defensive:
#5 `_cap_diff` fast-path on `len(text)`. UTF-8 byte count is
always >= char count, so if char count fits the cap, the
encoded form definitely does. Avoids materialising a transient
bytes copy of a multi-GB string just to check its size — the
cap exists specifically to defend against OOM on huge diffs,
and the check itself shouldn't be the trigger.
Deferred:
None. All 5 findings addressed in this commit.
Test count: 274 -> 277. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.5 round-4 review fixes
Round 4 found a real bug I introduced in round 3 plus two cheap
hardening items. All three addressed.
Real bug fixed:
#1 _cap_diff fast path was inverted. The round-3 fast path used
`if len(text) <= cap: return text` with the reasoning "UTF-8
byte count is always >= char count, so chars <= cap implies
bytes <= cap." That's backwards — bytes >= chars means
`chars <= cap` does NOT imply `bytes <= cap`. A 600K-emoji
string was 600K chars (under the 1 MiB char cap) but encoded
to 2.4 MiB, defeating the OOM defense for non-ASCII content.
The existing oversize-diff test used ASCII `"x"` chars (where
chars == bytes), so the divergence was never exercised — the
bug rode through round-3 review undetected.
Fixed by gating the fast path on `text.isascii()`: ASCII-only
strings genuinely have bytes == chars, so the len check is
sound. Non-ASCII falls through to the encode-and-measure path.
Regression test added with `"\U0001F600" * 600_000` (4-byte
emoji) — asserts the returned diff is byte-capped, not
char-capped.
Hardening:
#2 GitOpError.command in parse-failure paths now includes the
`-c color.ui=never` prefix that _run_git actually passes to
subprocess. Previously err.command in git_current_branch's
empty-output path and git_remote_repo's unparseable-URL path
hard-coded the unprefixed form, so a caller introspecting
err.command (or reproducing the failure from the message)
would see a different invocation than what was actually run.
Tests strengthened to assert the prefix is present.
#3 File-list cap added: _MAX_PATH_LIST = 10_000. Mirrors the
_MAX_DIFF_BYTES and _MAX_LOG_COUNT bounds that already protect
the other two agent-facing surfaces. Without this, a repo that
forgot to gitignore node_modules / .venv / target/ would
return hundreds of thousands of untracked paths through MCP —
plausible OOM trigger.
Applied to:
- git_uncommitted_changes' untracked_files
- _parse_status_porcelain_v2's staged / modified / untracked /
unmerged lists
New `_truncated_path_list` helper appends a sentinel marker
(`<... N more paths omitted (cap N exceeded)>`) so the agent
sees the truncation and can fall back to a narrower query.
New TestPathListCap class pins the helper directly:
under-cap / at-cap / one-over / many-over / git_status
integration.
Test count: 277 -> 284. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.5 round-5 review fixes
Round 5 (extra-high recall mode) surfaced 12 findings. Reviewer's
explicit verdict: "Findings 1, 2, and 4 are concrete defects worth
fixing before Phase 4.6 wires this up to an agent." Addressed those
three plus six more cheap real items; skipped three cosmetic / very-
narrow ones.
Correctness:
#1 branch.ab strict validation. The startswith-only check accepted
smuggled double-sign strings like "+-3 -1" where `int(parts[0][1:])`
parsed as -3, contradicting the parser's own promise to reject
negative values. Now requires `parts[N][1:].isdigit()` after the
sign so "+5", "-3" parse but "+-3", "++5", "+junk" don't.
Parametrized regression test added.
#2 branch.ab atomic update. Previous try/except wrapped both int()
calls together — if the first succeeded and the second raised,
out["ahead"] was mutated to the new value while out["behind"]
kept the default 0, producing an internally-inconsistent dict
on a half-malformed line ("+5 -junk"). Now parses both into
locals first, commits to out[] only on full success.
#4 NotADirectoryError + PermissionError wrapped as GitOpError.
Previously only FileNotFoundError + TimeoutExpired were caught,
so passing path=/etc/passwd (regular file) or path=/restricted
(unreadable directory) leaked raw OSErrors out of _run_git,
breaking the module docstring's "Errors raise GitOpError"
promise. Tests added with dedicated runner stubs for each
error class.
Security:
#3 URL credential redaction handles `@` in passwords. The old
`[^/@]+@` regex stopped at the first `@`, leaving the tail of
a password containing a literal `@` exposed (e.g.
`https://user:p@ss@host/...` → `https://[redacted]@ss@host/...`).
Switched to `[^/]+@` which is greedy up to the LAST `@` before
the path. Regression test added.
Architecture:
#11 _truncated_path_list returns (list, omitted_count) tuple
instead of mixing a sentinel string into the path list. The
in-list sentinel broke the "list of valid paths" contract — an
agent iterating with `os.stat(p)` / `Path(p).exists()` /
`os.path.join(root, p)` would hit the non-path "<...>" entry
and crash or silently mis-handle it. Now the cap surfaces via
sibling `<key>_omitted` fields (always present, 0 when not
truncated) so the agent can detect truncation explicitly
without iterating over non-path data.
Applied to:
- _parse_status_porcelain_v2: adds staged_omitted /
modified_omitted / untracked_omitted / unmerged_omitted
- git_uncommitted_changes: adds untracked_files_omitted
#12 GIT_PARSE_ERROR_RETURNCODE moved from -1 to -1000. Python's
subprocess.run uses negative returncodes for signal-killed
processes (-1 = SIGHUP, -9 = SIGKILL, -15 = SIGTERM, etc.). A
SIGHUP-killed git would otherwise land at returncode = -1,
colliding with the parse-error sentinel and being misclassified.
-1000 is safely outside the signal range. Test added asserting
the sentinel is < -100.
Documentation / parity:
#5 git_status docstring updated to reflect the actual return
shape. Previously said `"branch": ... | "(detached)"` but the
code returns `"HEAD"` for both detached and unborn (per round 3's
normalization). A caller reading the docstring and writing
`if branch == "(detached)": handle_detached()` would never
enter that branch.
#6 Type-? (untracked) parser skips empty-path lines. Previously
`"? "` (prefix only, no path body) would append "" to untracked
and flip clean=False with a phantom entry. Asymmetric with type-1
/ type-2 which have length guards. Test added.
#8 Type-u (unmerged) parser checks XY field width. Type-1 and
type-2 paths added this in round 3; the type-u path missed.
Trivial parity fix. Test added.
Skipped:
#7 (cosmetic — diff cap is _MAX_DIFF_BYTES + ~140 marker bytes,
not a strict cap; mattering only to a downstream consumer that
enforces an exact ceiling, which no current consumer does)
#9 (doc asymmetry — `git_uncommitted_changes` docstring doesn't
mirror `git_status`'s C-quoting caveat; added to git_status'
expanded docstring covering both surfaces)
#10 (SCP-style URL credentials — narrow trigger, parse_remote_url
accepts most SCP shapes; revisit if a real leak surfaces)
Test count: 284 -> 296. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 4.6: mcp_server.py — FastMCP wrapper around bb_ops + git_ops (#7)
* Add mcp_server.py — FastMCP wrapper around bb_ops + git_ops
Phase 4.6 of the MCP server work. Wires every bb_ops and git_ops
function into a FastMCP tool registry, with a self-bootstrap venv
mirroring zenhub-cli's mcp_server.py pattern. 39 new tests, 335 total.
mcp_server.py — 30 tools:
PIPELINES (6): pipelines_list, pipeline_show, pipeline_steps,
pipeline_trigger, pipeline_stop, pipeline_logs
PULL REQUESTS (11): prs_list, pr_show, pr_activity, pr_create,
pr_approve, pr_unapprove, pr_merge, pr_decline, pr_diff,
pr_comments_list, pr_comment_add
REPOS / METADATA (7): repos_list, repo_show, branches_list,
branch_show, vars_list, downloads_list, commits_list
GIT CONTEXT (5): git_current_branch, git_status, git_remote_repo,
git_recent_commits, git_uncommitted_changes
META (1): whoami
Self-bootstrap pattern (mirrors zenhub-cli):
- On launch, checks /tmp/bbenv. If missing, finds a python3 >= 3.10
(preferring the launching interpreter, falling back to pyenv /
Homebrew / system locations), runs `python -m venv`, installs the
`mcp` package, then `os.execv`s into the venv.
- Bootstrap is stdlib-only (uses subprocess + sys, no third-party
imports until after the venv is active). Means any python3 on PATH
can launch the server.
- `mcp` is the only venv dep — no heavy ML libs (no torch /
sentence-transformers / numpy / huggingface_hub). bb doesn't have
semantic-similarity workflows like zenhub-cli does.
- BB_MCP_SKIP_BOOTSTRAP=1 escape hatch (set in conftest.py) lets
pytest exercise tool wiring and result-dict shapes without
pulling in mcp. Production NEVER sets this — without real
FastMCP, .run() raises.
Repo resolution — every Bitbucket tool takes an optional `repo`:
- "" → auto-detect via `git remote get-url origin`
from BB_DEFAULT_REPO_PATH (or cwd)
- "myrepo" → use BB_WORKSPACE from config + "myrepo"
- "acme/myrepo" → use "acme" workspace + "myrepo" slug (overrides
BB_WORKSPACE for this call)
Validated in `_resolve_repo` with ValueError on malformed input
("a/b/c", "/repo", "ws/", "/", "//" all rejected before any network
call burns API budget).
Error envelope — every tool returns either:
{"ok": True, "workspace": ..., "repo": ..., <result fields>}
{"ok": False, "kind": "<ExceptionClassName>", "message": ..., <extras>}
The error dict carries `status`/`url`/`body` for BBApiError,
`returncode`/`stderr` for GitOpError, so the agent can branch on
`kind == "BBApiError" and status == 404` without parsing the message.
pr_create auto-detects source_branch via git_current_branch when
empty (matches bash `bb pr-create`).
whoami reports user / workspace / api_base + best-effort git context
(branch, workspace, repo from remote). Token is NEVER echoed; the
test explicitly asserts the token string is absent from the response.
Test discipline (same as bb_api / bb_ops / git_ops):
- Tool count + names pinned exactly (test_all_expected_tools_registered)
— a future rename or accidental drop fails the suite.
- Per-tool wiring: patch the bb_ops / git_ops function at the module
level, call the tool, assert (a) the underlying function received
the right args, (b) the response dict has the expected shape.
Doesn't re-test bb_ops logic (already covered comprehensively).
- Error envelope tested for each exception class
(BBApiError, BBOpNotFound, GitOpError, ValueError).
- whoami's no-token guarantee asserted as a security property.
- FastMCP stub's .run() raises a clear error so a test that
accidentally calls it fails loud instead of hanging on stdio.
CI / packaging:
pyproject.toml py-modules += "mcp_server".
ci.yml syntax-check += "mcp_server.py".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 review fixes
Round 1 review on PR #7 flagged 15 findings, with two explicitly called
out as production hazards ("will bite real users on first deploy on
common Linux topologies"). Addressed 11 of 15; deferring 4 with
rationale.
Production hazards (must-fix per reviewer):
#1 Bootstrap venv detection now uses `sys.prefix == str(_VENV_DIR)`
instead of `realpath(sys.executable) == realpath(_VENV_PY)`.
`python -m venv` on Linux/macOS defaults to --symlinks, so
realpath(/tmp/bbenv/bin/python3) resolves to the SAME canonical
path as the builder interpreter (e.g. /usr/bin/python3.12) — the
realpath comparison was claiming "already under venv" while we
were actually still under the system interpreter, skipping the
execv and dying on `from mcp.server.fastmcp import FastMCP`.
sys.prefix is set per-interpreter from the venv layout and is the
authoritative signal.
#2 Bootstrap idempotency. Pip install can be interrupted (Ctrl-C,
OOM-kill, disk full) leaving _VENV_PY in place but `mcp` missing.
The old guard skipped reinstall on every subsequent launch and
died on import with no self-repair path. Now writes a
`.bbenv-ready` sentinel ONLY after the full bootstrap succeeds;
the venv create step is also idempotent (only runs if _VENV_PY
is absent) so retries pick up where they left off.
#3 Pinned mcp dependency to `mcp>=1.0,<2`. The previous unpinned
"mcp" would silently install a future breaking mcp 2.x on the
next /tmp wipe. Matches the pyproject [mcp] extra.
Correctness:
#4 pr_create now rejects detached-HEAD auto-detect. git_current_branch
returns the literal "HEAD" for detached/unborn state; bb_ops only
checks non-empty/non-whitespace, so the call would have reached
Bitbucket with source.branch.name="HEAD" and surfaced as an
opaque API error. Now raises ValueError locally with a clear
"pass source_branch=" hint.
#5 _resolve_repo strips whitespace before parsing. A sloppy paste
like " acme/widget " no longer slips through as
workspace=" acme" and fails deep in the API layer.
#11 pr_create source_branch auto-detect now triggers on whitespace
(not just empty string), symmetric with #5.
#13 OSError added to _TOOL_EXPECTED_EXCEPTIONS. Covers paths
git_ops._run_git doesn't wrap explicitly (IsADirectoryError,
BlockingIOError, ConnectionResetError), and catches
os.getcwd() on a deleted cwd inside _default_repo_path() which
fires BEFORE any wrapped git call.
Surface improvements:
#6 pipelines_list now exposes `sort=` kwarg. Default "-created_on"
(newest first); agent can pass "created_on" for oldest-first or
other sort keys.
#7 pipeline_logs + pr_diff now expose `timeout=` kwarg. Default
120s matches bb_ops; agent can bump for pipelines with very
large log payloads or oversized PR diffs that would otherwise
cap out.
#8 git_status payload renamed from `status` to `working_tree`. The
`status` key collided with the HTTP-status field _error_dict
uses for BBApiError. Today the collision can't fire (git_status
doesn't raise BBApiError), but the rename pre-empts a future
broadening hazard.
#10 branch_show echoes the stripped name in the response. bb_ops
strips before encoding the URL, so the response now matches what
Bitbucket actually resolved rather than the unstripped input.
Deferred (with rationale):
#9 Response-key naming consistency (approval/result/pr). Reviewer
suggested standardizing all mutation tools to "result", but the
descriptive keys carry useful semantic info — pr_approve returns
an approval record, pr_merge returns the merged PR, etc. The
docstrings explain the shape. Skipping.
#12 _client_cache concurrency + invalidation. Real concern for
long-lived servers with rotated tokens, but stdio MCP is
single-threaded and token rotation is rare; defer to Phase 4.6+
if a real bug surfaces.
#14 CI smoke step that actually imports the real mcp package.
Larger CI workflow change; defer.
#15 README MCP documentation. Explicitly Phase 4.8 scope.
Test count: 335 -> 341. All passing locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply Phase 4.6 round-2 review fixes
Round 2 surfaced 15 new findings including a real security bug. Fixed
all 15.
Security:
#1 BBApiError.url now redacted in _error_dict. pipeline_logs and
pr_diff follow Bitbucket's 307 to a presigned S3 URL via
fetch_redirected_text; on a downstream non-3xx (S3 clock skew,
expired signature, redirect cap), BBApiError(url=<signed S3 URL>)
carried AWS-credential-bearing query parameters
(X-Amz-Signature, X-Amz-Credential) verbatim into the agent
error dict and downstream logs. New _redact_url helper strips
`user:token@` AND replaces signed-URL query strings with
`[redacted-signed-url-params]` while preserving the path so
the agent knows what host was called.
Correctness:
#2 AttributeError added to _TOOL_EXPECTED_EXCEPTIONS. JSON null
from an MCP client deserialises to Python None; `.strip()` on
None crashes uncaught. Also pre-normalised every stringy arg
path through `(value or "").strip()` so None and "" funnel to
the same auto-detect path uniformly.
#3 _default_repo_path no longer evaluates os.getcwd() eagerly.
`os.environ.get("KEY", os.getcwd())` runs the default even when
the env var is set — the override never protected against a
deleted cwd. Now uses `... or os.getcwd()` so the env var is
the lazy short-circuit. whoami also wraps cwd resolution in a
try block so a deleted cwd surfaces as a structured cwd_error
field instead of escaping the function.
#4 mcp import wrapped with helpful recovery message. If `mcp/`
gets deleted post-bootstrap (manual `pip uninstall`, partial
/tmp cleanup that wiped the package dir but spared the
sentinel), the ImportError now tells the user to
`rm .bbenv-ready` or `rm -rf /tmp/bbenv`.
#5 _resolve_repo now strips inner slug parts. "acme/ widget"
no longer slips through as workspace="acme", slug=" widget"
and 404s on `/repositories/acme/%20widget`.
#6 repos_list workspace stripped. ` acme` / `acme ` no longer
slip through.
#7 TypeError removed from _TOOL_EXPECTED_EXCEPTIONS (was
contradicting the file's stated philosophy). A refactor that
renames a bb_ops kwarg now crashes visibly during dev/test
instead of being silently reported back to the agent as a
fake Bitbucket failure.
#8 whoami git-context handlers now use _TOOL_EXPECTED_EXCEPTIONS
(not narrow GitOpError) AND store the full structured
_error_dict (not str(e)). An agent branching on
err.returncode for git-127 vs not-a-git-repo-128 vs
timeout-(-1000) has the same shape available as every other
tool.
#10 _opt_str() helper introduced. (branch or None) became
`_opt_str(branch)` everywhere — "" / " " / None all funnel
to None at the bb_ops boundary. Without this, " " on a
branch filter posted `?target.ref_name=%20%20`, on a pattern
raised a confusing two-layers-down ValueError, on commits
raised a different "non-empty, non-whitespace" error.
Inconsistent surfaces for the same agent typo.
#12 sys.prefix venv check now resolves both sides through
Path.resolve() so macOS's `/tmp` -> `/private/tmp` symlink
doesn't trigger an infinite execv loop.
#13 Every tool that takes an identifier (pr_id / number /
step_index) now threads it into the error dict via
_error_dict_with. An agent fanning out parallel
pipeline_logs(number=42, step_index=5) and
pipeline_logs(number=42, step_index=6) can correlate
failures to originating requests without substring-matching
the message.
#14 _resolve_repo validates the repo arg shape BEFORE calling
_get_client(). A fresh-machine user without config + a
malformed slug now sees the actual ValueError, not the
BBConfigError that previously masked the real cause.
#15 git_recent_commits echoes the stripped ref in the response
(matches what `git log` actually ran). Consistent with
branch_show.
UX:
#11 pip ins…
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
VERSION bump 1.0.1 → 1.1.0 in preparation for the v1.1.0 release.
Scope of v1.1.0
(Captured in the release notes; PRs #2–#12 merged since v1.0.1.)