Description
Is the feature request related to a problem?
Often, NetCDF/CF datasets store calibrated values using scale_factor and add_offset (or similar metadata), while the raw array contains unscaled integers. In the VS Code H5Web extension, the viewer currently displays the raw values, which makes quick inspection misleading and requires external tooling to interpret correctly. Additionally, when browsing complex datasets, it’s hard to confirm the logical shape and ordering because the names of dimensions are not visible alongside the data.
Requested solution or feature
1. Apply scale/offset on the fly (with a toggle):
• Detect standard CF attributes (scale_factor, add_offset, _FillValue, missing_value, valid_range) and apply them to the displayed data.
• Provide a per-variable toggle:
• “Show raw data”
• “Show scaled data (CF)”
• If a units attribute is present, display it alongside the scaled values; otherwise, or in raw mode, keep the raw display.
2. Respect fill/missing values and valid ranges:
• Treat _FillValue/missing_value as NaN (or visually dim) in the rendered table/plot.
• Optionally allow a “clip to valid_range” view.
3. Show dimension names next to shapes:
• In the variable sidebar and the data header, display the array shape annotated with dimension names, e.g. reflectance: float32 [time=12, latitude=2048, longitude=3712].
• Add a small “Dimensions & coords” panel listing:
• Dimension names and sizes
• Linked coordinate variables if present.
4. Lightweight UI details:
• A small badge or pill next to the variable name indicating Scaled vs Raw.
• Tooltip explaining which attributes were applied (e.g., value * scale_factor + add_offset).
• If multiple conventions exist (e.g., non-CF custom scale/offset), allow manual entry of a linear transform (ax + b) with reset to auto/CF.
5. Performance considerations:
• Apply the transform lazily in the viewer (chunked/viewport-based) so large arrays don’t stall the UI.
• Don’t rewrite the file; this is display-only.
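To make the request concrete, here is a minimal sketch of the display-time logic in plain Python. It is only an illustration of the intended behaviour, not viewer code: `CFAttrs`, `decode_slice`, `shape_label`, and the `mask_out_of_range` flag are hypothetical names. It assumes that _FillValue, missing_value, and valid_range are compared against the raw (packed) values before scaling, and it interprets "clip to valid_range" as masking out-of-range values rather than clamping them.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class CFAttrs:
    """Hypothetical container for the CF attributes read from one variable."""
    scale_factor: Optional[float] = None
    add_offset: Optional[float] = None
    fill_value: Optional[float] = None      # CF attribute _FillValue
    missing_value: Optional[float] = None
    valid_range: Optional[Tuple[float, float]] = None


def decode_slice(raw: Sequence[float], attrs: CFAttrs,
                 scaled: bool = True,
                 mask_out_of_range: bool = False) -> List[float]:
    """Decode only the values currently in view; the file is never touched."""
    if not scaled:
        return list(raw)  # "Show raw data" toggle: pass values through unchanged

    # A missing scale_factor or add_offset falls back to the identity transform.
    scale = 1.0 if attrs.scale_factor is None else attrs.scale_factor
    offset = 0.0 if attrs.add_offset is None else attrs.add_offset

    # Both _FillValue and missing_value mark a raw value as missing.
    missing = {m for m in (attrs.fill_value, attrs.missing_value) if m is not None}

    # Assumption: valid_range is expressed in raw (packed) units.
    lo, hi = attrs.valid_range if attrs.valid_range is not None else (-math.inf, math.inf)

    out = []
    for v in raw:
        if v in missing:
            out.append(math.nan)
        elif mask_out_of_range and not (lo <= v <= hi):
            out.append(math.nan)
        else:
            out.append(v * scale + offset)
    return out


def shape_label(name: str, dtype: str,
                dims: Sequence[str], shape: Sequence[int]) -> str:
    """Annotate the array shape with dimension names for the sidebar/header."""
    pairs = ", ".join(f"{d}={n}" for d, n in zip(dims, shape))
    return f"{name}: {dtype} [{pairs}]"
```

For example, with `CFAttrs(scale_factor=0.5, add_offset=10.0, fill_value=-999, valid_range=(0, 255))`, `decode_slice([0, 2, -999, 300], attrs)` gives `[10.0, 11.0, nan, 160.0]`, and with `mask_out_of_range=True` the last value also becomes `nan`. Because the function works on whatever slice it is handed, it can be called per chunk or per viewport.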
Alternatives considered
• External preprocessing: Open the file in Python/xarray, compute data = data * scale_factor + add_offset, then save a new file or preview the result. This breaks the quick-look workflow and is heavy for large files.
• Relying on other viewers: Tools like Panoply or custom Jupyter notebooks can show scaled data, but switching tools defeats the purpose of an integrated VS Code preview.
Additional context
• Target datasets: NetCDF4/CF-compliant files (but the feature should be robust to HDF5 groups with CF-like attributes).
• Typical use case: Earth Observation L1/L2 products where raw integer arrays are scaled to physical units (radiance, reflectance, brightness temperature).
• Nice-to-have: expose a small “Data summary” line showing min/max/mean after scaling (computed on the viewed slice) and indicate unit.
• Edge cases:
• Variables with scale_factor but no add_offset (assume 0).
• Presence of both _FillValue and missing_value (treat both as missing).
• Nonlinear encodings (rare) should not be auto-applied; keep to linear CF only.
• Accessibility: clearly differentiate missing values in tables/heatmaps (e.g., hatch/transparent).
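The "Data summary" nice-to-have could look roughly like the sketch below. It assumes the viewed slice has already been decoded and that missing values are represented as NaN; `summarize` and its output format are hypothetical, chosen only to illustrate the idea.

```python
import math
from typing import Optional, Sequence


def summarize(values: Sequence[float], units: Optional[str] = None) -> str:
    """Build a one-line min/max/mean summary for the slice currently in view."""
    # Missing values (NaN) are excluded so they do not distort the statistics.
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        return "no valid data in view"
    mean = sum(valid) / len(valid)
    suffix = f" {units}" if units else ""
    return f"min={min(valid):g} max={max(valid):g} mean={mean:g}{suffix}"
```

For instance, `summarize([10.0, 11.0, math.nan, 15.0], "K")` returns `"min=10 max=15 mean=12 K"`. Computing this only on the viewed slice keeps the cost proportional to what is on screen rather than to the full array.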