NetCDF scale/offset and dimension name display #1918

@cesclc

Description

Is the feature request related to a problem?

Often, NetCDF/CF datasets store calibrated values using scale_factor and add_offset (or similar metadata), while the raw array contains unscaled integers. In the VS Code H5Web extension, the viewer currently displays the raw values, which makes quick inspection misleading and requires external tooling to interpret correctly. Additionally, when browsing complex datasets, it’s hard to confirm the logical shape and ordering because the names of dimensions are not visible alongside the data.

Requested solution or feature
1. Apply scale/offset on the fly (with a toggle):
• Detect standard CF attributes (scale_factor, add_offset, _FillValue, missing_value, valid_range) and apply them to the displayed data.
• Provide a per-variable toggle:
  • “Show raw data”
  • “Show scaled data (CF)”
• If a units attribute is present, display the units that apply to the scaled values; otherwise leave the raw display unchanged.
2. Respect fill/missing values and valid ranges:
• Treat _FillValue/missing_value as NaN (or visually dim) in the rendered table/plot.
• Optionally allow a “clip to valid_range” view.
3. Show dimension names next to shapes:
• In the variable sidebar and the data header, display the array shape annotated with dimension names, e.g. reflectance: float32 [time=12, latitude=2048, longitude=3712].
• Add a small “Dimensions & coords” panel listing:
  • Dimension names and sizes
  • Linked coordinate variables, if present.
4. Lightweight UI details:
• A small badge or pill next to the variable name indicating Scaled vs Raw.
• Tooltip explaining which attributes were applied (e.g., value * scale_factor + add_offset).
• If multiple conventions exist (e.g., non-CF custom scale/offset), allow manual entry of a linear transform (a * x + b) with reset to auto/CF.
5. Performance considerations:
• Apply the transform lazily in the viewer (chunked/viewport-based) so large arrays don’t stall the UI.
• Don’t rewrite the file; this is display-only.
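
To make the request concrete, here is a minimal sketch in plain Python of the display-only decoding described above. It is not H5Web code: `decode_window`, `format_shape`, and their parameters are hypothetical names used only for illustration. It assumes scalar `_FillValue`/`missing_value` attributes, assumes `valid_range` is expressed in raw (packed) values, and interprets "clip to valid_range" as masking out-of-range values as NaN. Only the requested window is decoded, which stands in for the viewport-based lazy evaluation.

```python
import math


def decode_window(raw_rows, attrs, row_slice, col_slice, scaled=True, clip=False):
    """Decode only the visible window of a 2D list of raw values.

    attrs is a dict of CF-like attributes (hypothetical input format).
    With scaled=False the raw values are returned untouched (the "raw" toggle).
    """
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    # Both _FillValue and missing_value mark missing data (scalars assumed).
    missing = {attrs[k] for k in ("_FillValue", "missing_value") if k in attrs}
    # Assumption: valid_range is given in raw (packed) units.
    valid = attrs.get("valid_range")

    window = []
    for row in raw_rows[row_slice]:
        decoded = []
        for v in row[col_slice]:
            if not scaled:
                decoded.append(v)
            elif v in missing:
                decoded.append(math.nan)
            elif clip and valid is not None and not (valid[0] <= v <= valid[1]):
                decoded.append(math.nan)
            else:
                decoded.append(v * scale + offset)
        window.append(decoded)
    return window


def format_shape(name, dtype, dims):
    """Build a label such as 'reflectance: float32 [time=12, latitude=2048]'.

    dims is a list of (dimension_name, size) pairs.
    """
    inner = ", ".join(f"{dim}={size}" for dim, size in dims)
    return f"{name}: {dtype} [{inner}]"
```

The source data is never modified; switching between raw and scaled views only changes the `scaled` argument, so the file itself stays untouched.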

Alternatives considered
• External preprocessing: Open the file in Python/xarray and compute data = data * scale_factor + add_offset, then save a new file or preview – this breaks the quick-look workflow and is heavy for large files.
• Relying on other viewers: Tools like Panoply or custom Jupyter notebooks can show scaled data, but switching tools defeats the purpose of an integrated VS Code preview.

Additional context
• Target datasets: NetCDF4/CF-compliant files (but the feature should also handle HDF5 groups with CF-like attributes).
• Typical use case: Earth Observation L1/L2 products where raw integer arrays are scaled to physical units (radiance, reflectance, brightness temperature).
• Nice-to-have: expose a small “Data summary” line showing min/max/mean after scaling (computed on the viewed slice), along with the unit.
• Edge cases:
  • Variables with scale_factor but no add_offset (assume 0).
  • Presence of both _FillValue and missing_value (treat both as missing).
  • Nonlinear encodings (rare) should not be auto-applied; keep to linear CF only.
• Accessibility: clearly differentiate missing values in tables/heatmaps (e.g., hatch/transparent).
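
The edge cases and the “Data summary” line could be handled roughly as follows. This is a sketch under stated assumptions, not an existing API: `resolve_transform` and `summarize_slice` are hypothetical helper names, the slice is passed as a flat list of raw integer values, and the fill attributes are assumed to be scalars.

```python
def resolve_transform(attrs):
    """Return (scale, offset, missing) from CF-like attributes.

    A missing add_offset defaults to 0 and a missing scale_factor to 1;
    _FillValue and missing_value are both treated as missing markers.
    """
    scale = attrs.get("scale_factor", 1.0)
    offset = attrs.get("add_offset", 0.0)
    missing = {attrs[k] for k in ("_FillValue", "missing_value") if k in attrs}
    return scale, offset, missing


def summarize_slice(raw_values, attrs, units=None):
    """Summarize the viewed slice after scaling, skipping missing values."""
    scale, offset, missing = resolve_transform(attrs)
    decoded = [v * scale + offset for v in raw_values if v not in missing]
    if not decoded:
        return "no valid data"
    mean = sum(decoded) / len(decoded)
    suffix = f" {units}" if units else ""
    return f"min={min(decoded):.3g} max={max(decoded):.3g} mean={mean:.3g}{suffix}"
```

Because the summary is computed only on the values passed in, its cost scales with the viewed slice rather than with the full dataset.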

Metadata

    Labels

    epic: Issue that will need to be split up later on
