Background
`scripts/build_backgrounds_chrombpnet.py:200` does:
```python
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
```
unconditionally — clobbering any pre-set `CUDA_VISIBLE_DEVICES` from the calling shell. So this parallel-launch pattern (suggested in the handoff at `audits/2026-04-29_chrombpnet_cdf_rebuild/HANDOFF.md` for sharded multi-GPU runs):
```bash
# Terminal 1 (intended GPU 0):
CUDA_VISIBLE_DEVICES=0 python scripts/build_backgrounds_chrombpnet.py --gpu 0 ...
# Terminal 2 (intended GPU 1):
CUDA_VISIBLE_DEVICES=1 python scripts/build_backgrounds_chrombpnet.py --gpu 0 ...
```
…doesn't work as expected: both jobs land on physical GPU 0 and fight for memory, because each invocation overwrites the caller's `CUDA_VISIBLE_DEVICES` with the value from its own `--gpu` arg (here `0`).
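The override can be demonstrated without any GPU. A minimal sketch that simulates the script's behaviour with a child process (the child's argv stands in for `--gpu`):

```python
import os
import subprocess
import sys

# The parent exports CUDA_VISIBLE_DEVICES=1, but the child overwrites it
# from its argv, the same way the script overwrites it from --gpu.
child = (
    "import os, sys; "
    "os.environ['CUDA_VISIBLE_DEVICES'] = sys.argv[1]; "
    "print(os.environ['CUDA_VISIBLE_DEVICES'])"
)
env = dict(os.environ, CUDA_VISIBLE_DEVICES="1")  # what the calling shell set
out = subprocess.run(
    [sys.executable, "-c", child, "0"],  # stand-in for --gpu 0
    env=env, capture_output=True, text=True,
).stdout.strip()
print(out)  # "0": the caller's "1" is silently lost
```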
Reproduction
During PR #70's Phase 1 redo, two parallel invocations:
- `CUDA_VISIBLE_DEVICES=0 ... --part variants --assay ATAC_DNASE --gpu 0`
- `CUDA_VISIBLE_DEVICES=1 ... --part baselines --assay ATAC_DNASE --gpu 0`
…both ended up on physical GPU 0. The second job OOM'd at `MaxAllocSize: 327706624` (~312 MB, all that remained between the first job's allocation and a third user's job on the same GPU).
The workaround for the rebuild was to pass `--gpu 1` explicitly to the second invocation and skip the outer `CUDA_VISIBLE_DEVICES` setting entirely, which is the opposite of what an experienced cluster user would expect.
Suggested fix
```python
# Honour a pre-set CUDA_VISIBLE_DEVICES; only fall back to --gpu when the
# caller didn't export one.
if "CUDA_VISIBLE_DEVICES" not in os.environ:
    os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
```
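For illustration, the same guard collapses to a one-liner with `os.environ.setdefault`, which assigns only when the key is absent (the literal `"3"` below stands in for `str(args.gpu)`):

```python
import os

# setdefault assigns only when the key is missing, so a caller's export wins.
os.environ.pop("CUDA_VISIBLE_DEVICES", None)        # simulate: nothing exported
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "3")  # falls back to the --gpu value
print(os.environ["CUDA_VISIBLE_DEVICES"])           # "3"

os.environ["CUDA_VISIBLE_DEVICES"] = "1"            # simulate: caller exported 1
os.environ.setdefault("CUDA_VISIBLE_DEVICES", "3")  # caller's value is preserved
print(os.environ["CUDA_VISIBLE_DEVICES"])           # "1"
```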
Or warn loudly when the script is overriding an existing env var:
```python
if "CUDA_VISIBLE_DEVICES" in os.environ and os.environ["CUDA_VISIBLE_DEVICES"] != str(args.gpu):
    logger.warning(
        "Overriding caller CUDA_VISIBLE_DEVICES=%s with --gpu=%s",
        os.environ["CUDA_VISIBLE_DEVICES"], args.gpu,
    )
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
```
The first option is cleaner and matches the precedence cluster users expect for `CUDA_VISIBLE_DEVICES`: an explicitly exported env var wins over the CLI default, unless the flag is passed deliberately to override it.
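If explicit-flag override behaviour is also wanted, one sketch is to default `--gpu` to `None` and only touch the env var when the flag was actually passed. The names here mirror the script's but the interface is illustrative, not its real one:

```python
import argparse
import os

# Only override CUDA_VISIBLE_DEVICES when --gpu was passed explicitly;
# otherwise whatever the caller exported (or nothing) stays in effect.
parser = argparse.ArgumentParser()
parser.add_argument("--gpu", default=None,
                    help="GPU index; overrides CUDA_VISIBLE_DEVICES when given")

args = parser.parse_args(["--gpu", "1"])  # simulate an explicit --gpu 1
if args.gpu is not None:
    os.environ["CUDA_VISIBLE_DEVICES"] = args.gpu
print(os.environ["CUDA_VISIBLE_DEVICES"])  # "1"
```

This keeps the handoff's `CUDA_VISIBLE_DEVICES=N ...` pattern working while still letting `--gpu` serve single-terminal runs.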
Why it matters
The handoff's sharded multi-GPU invocation pattern requires a non-obvious workaround. Anyone following the handoff and exporting `CUDA_VISIBLE_DEVICES=N` will silently land on the wrong GPU.
Related