Html rep reduce by katosh · Pull Request #6 · settylab/anndata

katosh · 2026-03-23T20:12:50Z

Summary

This branch merges ig/fold_can_write (scverse#2372) into html_rep and rewrites the HTML repr's section traversal to use the new AnnData.reduce() API. The goal is to evaluate concretely how well reduce serves as a rendering backbone, and to surface where it helps vs. where it introduces friction.

This is an experiment branch, not intended for merge into main. It exists to inform the API discussion on scverse#2372.

What changed

Shared IO utilities (_io/specs/registry.py): Extracted IORegistry.get_writeable_types() and IORegistry.has_spec() from the logic inside can_write. Both can_write and the repr's serializability checks now use these instead of duplicating registry lookups.
Simplified serializability checks (_repr/utils.py, _repr/formatters.py): _check_serializable_single and _check_array_has_writer now delegate to has_spec() instead of catching exceptions from get_spec().
_render_all_sections rewritten with reduce (_repr/html.py): The core experiment. The SECTION_ORDER loop is replaced by a single adata.reduce() call in DFS-pre order.

What to look for in the code

All pain points are marked with PAIN POINT comments in html.py. Key locations:

The accumulator (`_render_via_reduce`)

Without enter/exit events, the callback must thread a 4-tuple through every call:

(finished_sections, current_rows, current_section, current_n_items)

current_rows and current_section exist only to track which section is "open" — state that reduce already knows internally but doesn't expose.

Duplicated finalization

Search for _finalize_section. It appears in two places:

Inside the callback, triggered when the next section starts (implicitly closing the previous one)
After reduce returns, for the last section that never got a successor

Both calls are identical. An on_exit hook would eliminate this duplication entirely.

Section vs. leaf detection

Search for _is_section_visit. The callback must isinstance-check the accessor hierarchy to distinguish parent visits from leaf visits. This is fragile: LayerAcc is used for both X (k=None) and individual layer entries (k='counts'), so a naive isinstance(ref_acc, LayerAcc) treats each layer as a new section instead of an entry within the layers section. The initial implementation had exactly this bug. The fix required adding a special case to exclude LayerAcc entirely from section detection and relying on _is_x to handle the X case separately. reduce knows this distinction internally but doesn't communicate it.

Section metadata (`len` mismatch)

Search for isinstance(ref_acc, MetaAcc). For obs/var, len(elem) returns the number of rows, not columns. The callback needs a per-type branch to get the correct entry count. The original _render_dataframe_section called len(df.columns) directly.

Opaque uns/raw

Search for isinstance(elem, dict) and isinstance(elem, Raw). reduce passes these as opaque blobs with ref_acc=None and doesn't descend into them. The callback falls back to the dedicated _render_uns_section and _render_raw_section renderers — the same functions the old code used.

Custom and unknown sections

Search for custom_sections_after and _detect_unknown_sections. These are handled entirely outside reduce because reduce only visits the hardcoded attribute list. Extension packages (TreeData adding .obst/.vart) are invisible to it.

Robustness regression

The current per-section rendering is deliberately defensive: each section renders independently, and if one fails, the rest still appear. With reduce, a single problematic element crashes the entire traversal. For example, a non-string DataFrame column name (tuple key in obs) causes the accessor system to throw, which aborts reduce entirely. Instead of showing all other sections with one error indicator, the repr shows nothing but "reduce failed." The try/except around adata.reduce() catches this, but the damage is total rather than localized. See test 23 ("Serialization warnings") in the visual test for a live example.

What reduce simplified

The shared has_spec() / get_writeable_types() utilities are a genuine improvement: both can_write and the repr now use the same registry check instead of each rolling their own. This is independent of whether reduce is used for rendering.

What reduce complicated

The rendering itself became harder to follow. The original code had a clean per-section structure: open section, iterate entries, close section. With reduce, this three-phase logic is flattened into a single callback that must reconstruct section boundaries from accessor types, carry open-section state through the accumulator, and handle the last-section edge case separately.

Section ordering

reduce visits sections in a different order than the repr's SECTION_ORDER:

reduce: X, obs, var, obsm, varm, obsp, varp, layers, uns, raw
SECTION_ORDER: X, obs, var, uns, obsm, varm, layers, obsp, varp, raw

This branch uses reduce's order. If the original order matters, the output would need post-hoc reordering.

scverse#2367)

Co-authored-by: Philipp A. <flying-sheep@web.de>

Replace _render_all_sections with a reduce-based traversal to evaluate how well the new reduce API serves as a rendering backbone. Changes: - Extract IORegistry.get_writeable_types() and has_spec() utilities - Simplify can_write, _check_serializable_single, _check_array_has_writer to use the shared utilities - Rewrite _render_all_sections using adata.reduce(DFS-pre) Pain points documented inline: - 4-tuple accumulator to track open section state (no enter/exit events) - Duplicated _finalize_section calls (callback + post-reduce) - isinstance checks to distinguish section vs leaf visits - len(DataFrame) returns rows not columns, needs per-type handling - reduce crashes on non-string column names (tuple keys in obs/var) - uns/raw passed as opaque blobs, fall back to dedicated renderers - Custom/unknown sections entirely outside reduce's scope

coderabbitai · 2026-03-23T20:12:57Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 02cd3745-cd55-450c-ac6b-fa97e3dfd75b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch html_rep_reduce

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

LayerAcc is used for both X (k=None) and individual layer entries (k='counts'). The _is_section_visit check matched both, causing each layer to render as its own section instead of as an entry within the layers section. This is another instance of the isinstance fragility documented in the reduce experiment: the accessor type hierarchy doesn't cleanly separate parent visits from leaf visits.

get_section_tooltip() already returns the correct tooltips for obs/var. The hardcoded override was an artifact of porting from _render_dataframe_section which didn't use get_section_tooltip.

ilan-gold and others added 22 commits March 19, 2026 14:16

feat: AnnData.can_write based on AnnData.fold

b907554

chore: docs

19daed5

refactor: use accessors

4125375

fix: DFS order + fixes

8be5ba2

chore: add test for uns

0f4d1b0

fix!: remove __delitem__ and __setitem__ from the AnnData object (

be98d32

scverse#2367)

Merge branch 'main' into ig/fold_can_write

9338baa

chore!: remove AnnData.concatenate (scverse#2370)

fbc696f

Co-authored-by: Philipp A. <flying-sheep@web.de>

feat: raw + uns traversal

69daf90

fix: fold -> reduce

932d766

chore: docs

e0f3ee2

Merge branch 'main' into ig/fold_can_write

436fc68

fix: meth not func

ee04741

fix: fold not reduce in relnote

6d6f454

fix: nested

1f77a4c

chore: more func clarification

91adffe

fix: link

928b72a

fix: link

19a915d

refactor: simpler

c0886fe

fix: relnote number

6cffc05

Merge remote-tracking branch 'origin/ig/fold_can_write' into html_rep

4de5eac

katosh mentioned this pull request Mar 23, 2026

feat: AnnData.unwriteable based on AnnData._reduce + iter_outer + refactorings of other relevant functions scverse/anndata#2372

Merged

5 tasks

katosh added 3 commits March 23, 2026 15:28

fix: remove redundant tooltip override in _finalize_section

eb84064

get_section_tooltip() already returns the correct tooltips for obs/var. The hardcoded override was an artifact of porting from _render_dataframe_section which didn't use get_section_tooltip.

Merge branch 'html_rep' into html_rep_reduce

aa67cf2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Html rep reduce#6

Html rep reduce#6
katosh wants to merge 25 commits intohtml_repfrom
html_rep_reduce

katosh commented Mar 23, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Mar 23, 2026 •

edited

Loading

Review skipped

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

katosh commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed

What to look for in the code

The accumulator (_render_via_reduce)

Duplicated finalization

Section vs. leaf detection

Section metadata (len mismatch)

Opaque uns/raw

Custom and unknown sections

Robustness regression

What reduce simplified

What reduce complicated

Section ordering

Uh oh!

coderabbitai Bot commented Mar 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

katosh commented Mar 23, 2026 •

edited

Loading

The accumulator (`_render_via_reduce`)

Section metadata (`len` mismatch)

coderabbitai Bot commented Mar 23, 2026 •

edited

Loading