Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughAdds an optional CSV name-mapping for geo_sub, exposes precomputed TARGET_GEO_IDS, target_id_to_path, and brcode_to_ptcode mappings, renames rule wildcards/outputs from Changes
Sequence Diagram(s)sequenceDiagram
participant Snakemake as Snakemake Workflow
participant CSV as name_mapping_csv (optional)
participant Parser as Mapping Parser
participant Matcher as ID Matcher
participant PathLookup as target_id_to_path
participant Rules as GeoSub Rules
Snakemake->>CSV: load optional CSV
CSV-->>Parser: provide rows
Parser->>Matcher: build brcode_to_ptcode (longest-substring)
Matcher->>PathLookup: produce target_id_to_path & TARGET_GEO_IDS
Rules->>PathLookup: request original_path for geo_sub_id
PathLookup-->>Rules: return original_path
Rules->>Rules: expand gatherFilesPerSampleForGeoSub / gatherFilesForGeoSub using TARGET_GEO_IDS
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
config/README.md (1)
229-232: Documentation could be more complete.The documentation is functional but could benefit from a few clarifications:
- Indicate whether this configuration is optional (the implementation treats it as optional).
- Clarify whether the CSV should have headers or not.
- Mention what happens if the file doesn't exist or is invalid.
- Minor grammar: "with specified str" → "with new names" or "with specified strings".
📝 Suggested documentation improvement
## name_mapping_csv -Used to replace donor and sample names with specified str (when creating geo_sub folder). -Path to csv file with first column containing donor/sample names and second column new names. +(Optional) Used to replace donor and sample names with new names when creating the geo_sub folder. +Path to a CSV file (without headers) where the first column contains original donor/sample names and the second column contains the replacement names. +If the file does not exist or cannot be loaded, the workflow proceeds without name mapping.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/README.md` around lines 229 - 232, Update the name_mapping_csv documentation to state that the name_mapping_csv configuration is optional, clarify that the CSV should contain two columns (original name, new name) and indicate whether headers are allowed or required (e.g., "no header expected" or "header row ignored"), replace "with specified str" with "with new names" (or "with specified strings"), and add a short sentence describing runtime behavior when the file is missing or invalid (e.g., the system skips renaming and logs a warning/error) and mention the context where it is used (creating geo_sub folder).workflow/rules/geo_sub.smk (1)
80-86: Only the lastcpcommand logs errors.The current shell command only redirects stderr from the final
cpto the log file. If an earliercpfails, the error won't be captured in the log, making debugging harder.📝 Suggested improvement to capture all errors
shell: - "cp {input.r_img} {output.r_img} && " - "cp {input.r_ts} {output.r_ts} && " - "cp {input.p_cnt} {output.p_cnt} && " - "cp {input.p_cells} {output.p_cells} && " - "cp {input.p_cell_boundaries} {output.p_cell_boundaries} && " - "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries} &> {log}" + "(" + "cp {input.r_img} {output.r_img} && " + "cp {input.r_ts} {output.r_ts} && " + "cp {input.p_cnt} {output.p_cnt} && " + "cp {input.p_cells} {output.p_cells} && " + "cp {input.p_cell_boundaries} {output.p_cell_boundaries} && " + "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries}" + ") &> {log}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workflow/rules/geo_sub.smk` around lines 80 - 86, The shell block in workflow/rules/geo_sub.smk currently only redirects stderr from the final cp, so failures in earlier cp commands won't be logged; modify the shell invocation in the rule's shell block (the sequence of cp {input.r_img} ... cp {input.p_nuc_boundaries}) to ensure stderr from all cp commands is captured — either redirect each cp's stderr to {log} or group the entire sequence (e.g., with (...) or { ...; } ) and redirect the group's stderr to {log} so every cp error is written to the log.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@workflow/rules/geo_sub.smk`:
- Around line 32-36: The loop over sorted_brcodes can cause cascading
replacements because after replacing final_id with brcode_to_ptcode[brcode] it
continues searching and may replace the new text; update the loop in the block
that iterates over sorted_brcodes to stop after the first successful replacement
by adding a break immediately after final_id = final_id.replace(brcode,
brcode_to_ptcode[brcode]) so only one replacement occurs before assigning
target_id_to_path[final_id] = rel_path; use the existing variables
sorted_brcodes, final_id, brcode_to_ptcode, target_id_to_path, and rel_path to
locate and change the code.
- Around line 7-15: The code directly indexes config['name_mapping_csv'] which
can raise KeyError; change to use mapping_file = config.get('name_mapping_csv')
(or default None/''), then guard the load with "if mapping_file and
os.path.exists(mapping_file):" before attempting to read into df_mapping and
build brcode_to_ptcode; keep error handling around pd.read_csv as-is and ensure
brcode_to_ptcode stays an empty dict when mapping_file is unset or missing.
---
Nitpick comments:
In `@config/README.md`:
- Around line 229-232: Update the name_mapping_csv documentation to state that
the name_mapping_csv configuration is optional, clarify that the CSV should
contain two columns (original name, new name) and indicate whether headers are
allowed or required (e.g., "no header expected" or "header row ignored"),
replace "with specified str" with "with new names" (or "with specified
strings"), and add a short sentence describing runtime behavior when the file is
missing or invalid (e.g., the system skips renaming and logs a warning/error)
and mention the context where it is used (creating geo_sub folder).
In `@workflow/rules/geo_sub.smk`:
- Around line 80-86: The shell block in workflow/rules/geo_sub.smk currently
only redirects stderr from the final cp, so failures in earlier cp commands
won't be logged; modify the shell invocation in the rule's shell block (the
sequence of cp {input.r_img} ... cp {input.p_nuc_boundaries}) to ensure stderr
from all cp commands is captured — either redirect each cp's stderr to {log} or
group the entire sequence (e.g., with (...) or { ...; } ) and redirect the
group's stderr to {log} so every cp error is written to the log.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a9b3dff0-205d-4f5b-bbee-18665fe428a5
📒 Files selected for processing (2)
config/README.mdworkflow/rules/geo_sub.smk
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
workflow/rules/geo_sub.smk (1)
32-35:⚠️ Potential issue | 🟡 MinorStop after first match to prevent chained remapping.
At Line 32-Line 35, the loop keeps scanning after a replacement, so a newly substituted token can be replaced again in the same pass.
Proposed fix
for brcode in sorted_brcodes: if brcode in final_id: final_id = final_id.replace(brcode, brcode_to_ptcode[brcode]) + break🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workflow/rules/geo_sub.smk` around lines 32 - 35, The loop over sorted_brcodes can cause chained remapping because after replacing a match in final_id you continue scanning and may replace the newly substituted token; in the block that iterates sorted_brcodes and does final_id = final_id.replace(brcode, brcode_to_ptcode[brcode]) add logic to stop scanning after the first successful replacement (e.g., break out of the loop or return immediately) so only the first matching brcode is substituted for final_id; refer to sorted_brcodes, final_id, and brcode_to_ptcode in the change.
🧹 Nitpick comments (1)
workflow/rules/geo_sub.smk (1)
39-39: Make target expansion order deterministic.At Line 39,
TARGET_GEO_IDSdepends onglob.globdiscovery order, which can vary and change_all_files.txtordering between runs.Proposed fix
-TARGET_GEO_IDS = list(target_id_to_path.keys()) +TARGET_GEO_IDS = sorted(target_id_to_path.keys())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workflow/rules/geo_sub.smk` at line 39, TARGET_GEO_IDS is built from target_id_to_path keys which inherit nondeterministic glob.glob discovery order; make expansion deterministic by sorting the keys before assigning TARGET_GEO_IDS (e.g., replace the current assignment with one that uses sorted(...) on target_id_to_path.keys()) so subsequent users of TARGET_GEO_IDS get a stable, reproducible order.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@workflow/rules/geo_sub.smk`:
- Around line 36-39: Before assigning into target_id_to_path, check whether
final_id already exists and handle the collision: in the block that sets
target_id_to_path[final_id] = rel_path, replace the blind assignment with a
guard that tests "if final_id in target_id_to_path" and then either raise an
error (e.g., ValueError including the existing path and the new rel_path) or
otherwise resolve the conflict (e.g., append to a list or include a
disambiguating suffix). Ensure TARGET_GEO_IDS is still derived from the keys
after the guard so collisions are not silently overwritten.
---
Duplicate comments:
In `@workflow/rules/geo_sub.smk`:
- Around line 32-35: The loop over sorted_brcodes can cause chained remapping
because after replacing a match in final_id you continue scanning and may
replace the newly substituted token; in the block that iterates sorted_brcodes
and does final_id = final_id.replace(brcode, brcode_to_ptcode[brcode]) add logic
to stop scanning after the first successful replacement (e.g., break out of the
loop or return immediately) so only the first matching brcode is substituted for
final_id; refer to sorted_brcodes, final_id, and brcode_to_ptcode in the change.
---
Nitpick comments:
In `@workflow/rules/geo_sub.smk`:
- Line 39: TARGET_GEO_IDS is built from target_id_to_path keys which inherit
nondeterministic glob.glob discovery order; make expansion deterministic by
sorting the keys before assigning TARGET_GEO_IDS (e.g., replace the current
assignment with one that uses sorted(...) on target_id_to_path.keys()) so
subsequent users of TARGET_GEO_IDS get a stable, reproducible order.
🪄 Autofix (Beta)
❌ Autofix failed (check again to retry)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 26f2e3d5-fbd8-405e-975e-b856f7284ceb
📒 Files selected for processing (1)
workflow/rules/geo_sub.smk
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. An unexpected error occurred while generating fixes: Update is not a fast forward - https://docs.github.com/rest/git/refs#update-a-reference |
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 2 file(s) based on 1 unresolved review comment. Files modified:
Commit: The changes have been pushed to the Time taken: |
Fixed 2 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 1 file(s) based on 1 unresolved review comment. Files modified:
Commit: The changes have been pushed to the Time taken: |
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. An unexpected error occurred while generating fixes: Update is not a fast forward - https://docs.github.com/rest/git/refs#update-a-reference |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@run.sh`:
- Around line 58-59: The while-loop's case statement in run.sh falls through to
the *) handler when $1 is empty; add an explicit empty-argument guard at the top
of the case (e.g., a '') pattern that does a break so the loop exits cleanly
when arguments are exhausted. Locate the while ... case ... esac block (the loop
that checks $1) and insert the empty-string case before the other patterns to
prevent the "Unknown option" *) branch from running when there are no more args.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 85ebecc8-4f37-4c49-9652-a39ec52f03c7
📒 Files selected for processing (2)
run.shworkflow/rules/geo_sub.smk
Fixed 1 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
workflow/rules/geo_sub.smk (1)
86-91: Redirect logs for the full copy chain, not only the last command.Current
&> {log}applies only to the finalcp, so diagnostics from earlier copy steps are missed. Wrap the chain in a grouped command and redirect once.Proposed fix
- shell: - "cp {input.r_img} {output.r_img} && " - "cp {input.r_ts} {output.r_ts} && " - "cp {input.p_cnt} {output.p_cnt} && " - "cp {input.p_cells} {output.p_cells} && " - "cp {input.p_cell_boundaries} {output.p_cell_boundaries} && " - "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries} &> {log}" + shell: + "( " + "cp {input.r_img} {output.r_img} && " + "cp {input.r_ts} {output.r_ts} && " + "cp {input.p_cnt} {output.p_cnt} && " + "cp {input.p_cells} {output.p_cells} && " + "cp {input.p_cell_boundaries} {output.p_cell_boundaries} && " + "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries} " + ") &> {log}"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workflow/rules/geo_sub.smk` around lines 86 - 91, The sequence of multiple "cp" commands in the rule currently only redirects output of the final cp to {log}; group the entire chain (the lines building the shell command string that include "cp {input.r_img} {output.r_img}" through "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries}") into a single grouped shell command (e.g., wrap with parentheses or braces) and apply a single redirection to {log} so all commands' stdout/stderr are captured; locate the command string in geo_sub.smk and update it to perform the grouping and single &> {log} redirect.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@workflow/rules/geo_sub.smk`:
- Around line 24-45: The discovery is nondeterministic because found_dirs comes
from glob.glob and dict key order may vary; make the process deterministic by
iterating over a sorted list of found_dirs (use sorted(found_dirs)) when
building target_id_to_path and then export TARGET_GEO_IDS as a sorted list
(e.g., TARGET_GEO_IDS = sorted(target_id_to_path.keys())) so final_id generation
(final_id, target_id_to_path) and downstream outputs are stable across runs.
---
Nitpick comments:
In `@workflow/rules/geo_sub.smk`:
- Around line 86-91: The sequence of multiple "cp" commands in the rule
currently only redirects output of the final cp to {log}; group the entire chain
(the lines building the shell command string that include "cp {input.r_img}
{output.r_img}" through "cp {input.p_nuc_boundaries} {output.p_nuc_boundaries}")
into a single grouped shell command (e.g., wrap with parentheses or braces) and
apply a single redirection to {log} so all commands' stdout/stderr are captured;
locate the command string in geo_sub.smk and update it to perform the grouping
and single &> {log} redirect.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ebb447a2-298d-47dc-8491-4f0af945e582
📒 Files selected for processing (1)
workflow/rules/geo_sub.smk
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 2 file(s) based on 1 unresolved review comment. Files modified:
Commit: The changes have been pushed to the Time taken: |
Fixed 2 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
No description provided.