Problem
code_review_graph/skills.py:417 uses a regex to strip trailing commas before } or ] when reading existing config files for merge:
stripped = re.sub(r',(\s*[}\]])', r'\1', stripped)
This regex does not track JSON string boundaries. If a string value contains a comma followed by } or ] on the next line, the regex will incorrectly remove the comma inside the string, producing invalid JSON that the subsequent json.loads silently misparses (or rejects outright).
Concrete breakage
A user config like:
becomes (after the regex pass):
{
"mcp": {
"my-server": {
"command": ["x"],
"description": "foo bar",
}
}
}
The comma inside "foo, bar" is deleted, silently corrupting the user's string value. Worse: a real ] or } inside a string (e.g. "see [docs]") will also be touched.
Same bug class as the _strip_jsonc_comments issue fixed in #530 (commit 47a43dc) — that one was string-aware and resolved cleanly; this one is a leftover raw regex that needs the same treatment.
Reproduction
import re
raw = '{"a": "foo, bar", "b": 1}'
print(re.sub(r',(\s*[}\]])', r'\1', raw))
# -> {"a": "foo bar", "b": 1} ← comma inside string deleted
Suggested fix
Fold the trailing-comma strip into the existing stateful _strip_jsonc_comments scanner (introduced in #530) so both passes happen in one string-aware walk. The scanner already tracks string boundaries; extending it to also drop a , when the next non-whitespace token is } or ] is a small change and removes the need for the post-pass regex entirely.
Sketch:
def _strip_jsonc_and_trailing_commas(text: str) -> str:
# same state machine as _strip_jsonc_comments, plus:
# in_state == "code" and c == "," and the next non-ws
# char is "}" or "]" -> drop the comma
Rename _strip_jsonc_comments -> _strip_jsonc_and_trailing_commas and delete the re.sub line at 417.
Out of scope for #530
Owner (@tirth8205) confirmed in #530 (comment) that this is out of scope for the spec-compliance PR and asked for a separate follow-up issue. Filing per that request.
Acceptance
Problem
code_review_graph/skills.py:417uses a regex to strip trailing commas before}or]when reading existing config files for merge:This regex does not track JSON string boundaries. If a string value contains a comma followed by
}or]on the next line, the regex will incorrectly remove the comma inside the string, producing invalid JSON that the subsequentjson.loadssilently misparses (or rejects outright).Concrete breakage
A user config like:
{ "mcp": { "my-server": { "command": ["x"], "description": "foo, bar", } } }becomes (after the regex pass):
{ "mcp": { "my-server": { "command": ["x"], "description": "foo bar", } } }The comma inside
"foo, bar"is deleted, silently corrupting the user's string value. Worse: a real]or}inside a string (e.g."see [docs]") will also be touched.Same bug class as the
_strip_jsonc_commentsissue fixed in #530 (commit47a43dc) — that one was string-aware and resolved cleanly; this one is a leftover raw regex that needs the same treatment.Reproduction
Suggested fix
Fold the trailing-comma strip into the existing stateful
_strip_jsonc_commentsscanner (introduced in #530) so both passes happen in one string-aware walk. The scanner already tracks string boundaries; extending it to also drop a,when the next non-whitespace token is}or]is a small change and removes the need for the post-pass regex entirely.Sketch:
Rename
_strip_jsonc_comments->_strip_jsonc_and_trailing_commasand delete there.subline at 417.Out of scope for #530
Owner (@tirth8205) confirmed in #530 (comment) that this is out of scope for the spec-compliance PR and asked for a separate follow-up issue. Filing per that request.
Acceptance
}/]to confirm the strip is string-awarere.subat line 417