Conversation
Co-authored-by: Philipp A. <flying-sheep@web.de>
Replace _render_all_sections with a reduce-based traversal to evaluate how well the new reduce API serves as a rendering backbone. Changes: - Extract IORegistry.get_writeable_types() and has_spec() utilities - Simplify can_write, _check_serializable_single, _check_array_has_writer to use the shared utilities - Rewrite _render_all_sections using adata.reduce(DFS-pre) Pain points documented inline: - 4-tuple accumulator to track open section state (no enter/exit events) - Duplicated _finalize_section calls (callback + post-reduce) - isinstance checks to distinguish section vs leaf visits - len(DataFrame) returns rows not columns, needs per-type handling - reduce crashes on non-string column names (tuple keys in obs/var) - uns/raw passed as opaque blobs, fall back to dedicated renderers - Custom/unknown sections entirely outside reduce's scope
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
5 tasks
LayerAcc is used for both X (k=None) and individual layer entries (k='counts'). The _is_section_visit check matched both, causing each layer to render as its own section instead of as an entry within the layers section. This is another instance of the isinstance fragility documented in the reduce experiment: the accessor type hierarchy doesn't cleanly separate parent visits from leaf visits.
get_section_tooltip() already returns the correct tooltips for obs/var. The hardcoded override was an artifact of porting from _render_dataframe_section which didn't use get_section_tooltip.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This branch merges
ig/fold_can_write(scverse#2372) intohtml_repand rewrites the HTML repr's section traversal to use the newAnnData.reduce()API. The goal is to evaluate concretely how wellreduceserves as a rendering backbone, and to surface where it helps vs. where it introduces friction.This is an experiment branch, not intended for merge into
main. It exists to inform the API discussion on scverse#2372.What changed
Shared IO utilities (
_io/specs/registry.py): ExtractedIORegistry.get_writeable_types()andIORegistry.has_spec()from the logic insidecan_write. Bothcan_writeand the repr's serializability checks now use these instead of duplicating registry lookups.Simplified serializability checks (
_repr/utils.py,_repr/formatters.py):_check_serializable_singleand_check_array_has_writernow delegate tohas_spec()instead of catching exceptions fromget_spec()._render_all_sectionsrewritten withreduce(_repr/html.py): The core experiment. TheSECTION_ORDERloop is replaced by a singleadata.reduce()call in DFS-pre order.What to look for in the code
All pain points are marked with
PAIN POINTcomments inhtml.py. Key locations:The accumulator (
_render_via_reduce)Without enter/exit events, the callback must thread a 4-tuple through every call:
current_rowsandcurrent_sectionexist only to track which section is "open" — state thatreducealready knows internally but doesn't expose.Duplicated finalization
Search for
_finalize_section. It appears in two places:reducereturns, for the last section that never got a successorBoth calls are identical. An
on_exithook would eliminate this duplication entirely.Section vs. leaf detection
Search for
_is_section_visit. The callback must isinstance-check the accessor hierarchy to distinguish parent visits from leaf visits. This is fragile:LayerAccis used for both X (k=None) and individual layer entries (k='counts'), so a naiveisinstance(ref_acc, LayerAcc)treats each layer as a new section instead of an entry within the layers section. The initial implementation had exactly this bug. The fix required adding a special case to excludeLayerAccentirely from section detection and relying on_is_xto handle the X case separately.reduceknows this distinction internally but doesn't communicate it.Section metadata (
lenmismatch)Search for
isinstance(ref_acc, MetaAcc). For obs/var,len(elem)returns the number of rows, not columns. The callback needs a per-type branch to get the correct entry count. The original_render_dataframe_sectioncalledlen(df.columns)directly.Opaque uns/raw
Search for
isinstance(elem, dict)andisinstance(elem, Raw).reducepasses these as opaque blobs withref_acc=Noneand doesn't descend into them. The callback falls back to the dedicated_render_uns_sectionand_render_raw_sectionrenderers — the same functions the old code used.Custom and unknown sections
Search for
custom_sections_afterand_detect_unknown_sections. These are handled entirely outsidereducebecausereduceonly visits the hardcoded attribute list. Extension packages (TreeData adding.obst/.vart) are invisible to it.Robustness regression
The current per-section rendering is deliberately defensive: each section renders independently, and if one fails, the rest still appear. With
reduce, a single problematic element crashes the entire traversal. For example, a non-string DataFrame column name (tuple key in obs) causes the accessor system to throw, which abortsreduceentirely. Instead of showing all other sections with one error indicator, the repr shows nothing but "reduce failed." Thetry/exceptaroundadata.reduce()catches this, but the damage is total rather than localized. See test 23 ("Serialization warnings") in the visual test for a live example.What reduce simplified
The shared
has_spec()/get_writeable_types()utilities are a genuine improvement: bothcan_writeand the repr now use the same registry check instead of each rolling their own. This is independent of whetherreduceis used for rendering.What reduce complicated
The rendering itself became harder to follow. The original code had a clean per-section structure: open section, iterate entries, close section. With
reduce, this three-phase logic is flattened into a single callback that must reconstruct section boundaries from accessor types, carry open-section state through the accumulator, and handle the last-section edge case separately.Section ordering
reducevisits sections in a different order than the repr'sSECTION_ORDER:This branch uses
reduce's order. If the original order matters, the output would need post-hoc reordering.