Changes from all commits — 53 commits
2fc0e1c
perf: optimize DataDict copy, validate, and pipeline data flow
Apr 15, 2026
e81f752
docs: add further optimization opportunities to performance plan
Apr 16, 2026
3f0d4ef
perf: round 2 optimizations — node overhead, numtype, invalid entries
Apr 16, 2026
ab485ee
fix: resolve mypy type errors in validate() monotonicity check
Apr 16, 2026
dbff1c0
docs: add real dataset benchmark results (23 QCodes datasets)
Apr 16, 2026
1edb53b
docs: add large dataset benchmark (array paramtype, 15-61 MB each)
Apr 16, 2026
b503f09
docs: replace benchmarks with improved methodology (v2)
Apr 17, 2026
c4e181b
docs: add interactive action benchmark with per-node profiling
Apr 17, 2026
8ad85a2
perf: optimize _find_switches() — 2.5x faster gridding
Apr 17, 2026
ac573f5
docs: add real experimental data benchmark
Apr 17, 2026
d95db0e
perf: optimize inspectr — lazy snapshot, incremental refresh
Apr 20, 2026
b67ffaa
perf: fast DB loading via load_by_id, bypass experiments enumeration
Apr 20, 2026
a3dbc54
feat: add loading progress and status text to inspectr
Apr 20, 2026
2d5c271
fix: overlay text stuck after reloading same DB file
Apr 20, 2026
78a642b
fix: collapse info pane by default, enable smooth scrolling
Apr 20, 2026
307b7a4
feat: add plot backend selector to inspectr toolbar
Apr 20, 2026
ee4421c
perf: add fast SQL-based DB overview, wider default window
Apr 20, 2026
2de2ef0
feat: grid layout for pyqtgraph subplots (matching matplotlib)
Apr 20, 2026
7c2dd4a
feat: scrollable plot area toggle for both backends
Apr 20, 2026
c663ec7
fix: reduce matplotlib scrollable min height to 100px per row
Apr 20, 2026
497f707
feat: scrollable off by default, add min height spinbox
Apr 20, 2026
f948fb6
docs: reorganize PERFORMANCE_PLAN.md into implemented vs future
Apr 20, 2026
913c672
fix: add plotWidgetClass parameter to autoplotQcodesDataset
Apr 20, 2026
d98a72a
refactor: extract hint constant, clean up backend selector mapping
Apr 20, 2026
4428cf1
fix: address code review — timestamps, connections, dead code
Apr 20, 2026
0ffe850
feat: LaTeX-to-HTML conversion for pyqtgraph plot labels
Apr 20, 2026
7cb6a20
fix: use HTML sub/sup instead of Unicode subscript/superscript
Apr 20, 2026
2c92484
fix: only apply LaTeX conversion when actual LaTeX syntax is present
Apr 20, 2026
9e63038
fix: resolve all mypy errors and add hypothesis to test deps
Apr 21, 2026
6d141bf
fix: mypy cross-stubs compat (PyQt5-stubs in CI vs PyQt6 locally)
Apr 21, 2026
6446f92
fix: update qcodes import to current public API, remove stale ignore
Apr 21, 2026
2513c11
fix: address review — rettype assert, scaleunits import, mypy config
Apr 21, 2026
f7ea409
fix: address PR review — 6 Copilot review comments
Apr 21, 2026
9e99ef0
docs: add real-data profiling results (large complex 2D dataset)
Apr 30, 2026
0ddb6e7
perf: fix is_invalid() for numeric dtypes, default to metadataShape
Apr 30, 2026
758b05d
perf: is_invalid 44x faster, fix mpl double-replot, label() skip vali…
Apr 30, 2026
c59fa61
docs: update plan with implemented fixes and backend comparison
Apr 30, 2026
d19ebf0
fix: records counter, is_invalid 44x, mpl double-replot, label skip v…
May 1, 2026
9c65fe1
fix: reset imagData flag, equal grid stretch for aspect ratio
May 1, 2026
4997b5d
fix: regressions + feat: selection buttons
May 1, 2026
97a4051
fix: recursion in setSelectedData, records from shapes, button layout
May 1, 2026
d382ec1
chore: remove unused imports from test_regressions
May 1, 2026
d6fb45d
chore: remove data file references, use tmp_path in all tests
May 1, 2026
6e3ca69
fix: mpl blank plot, pyqtgraph grid resize; reorganize tests
May 1, 2026
0b9a388
fix: deselect-all clears plots, pyqtgraph min size for font warning
May 1, 2026
5969ec6
feat: mpl colormap selector, pyqtgraph complex mode tests
May 1, 2026
d68b296
fix: pyqtgraph image axis orientation, deselect-all UX
May 1, 2026
68681cd
fix: rename deselect-all to 'Select first only' button
May 1, 2026
2efa69e
ci: trigger CI after history rewrite
May 1, 2026
e6a31e1
Merge remote-tracking branch 'origin/master' into perf/datadict-copy-…
May 1, 2026
8ec0349
chore: remove inspectr from mypy warn_unused_ignores override
May 1, 2026
15f7b3a
fix: use plottr Qt imports instead of PyQt6 in tests
May 1, 2026
99122f9
fix: ParamSpecBase import compat for older qcodes in CI
May 1, 2026
295 changes: 295 additions & 0 deletions PERFORMANCE_PLAN.md
@@ -0,0 +1,295 @@
# Plottr Performance & UX Improvements

This document summarizes the changes in this PR, the profiling that motivated them,
and suggestions for future work.

---

## Part 1: Implemented — Pipeline Performance (datadict, nodes, gridding)

### Problem

Plottr's data pipeline copied data excessively as it flowed through nodes. Each node
defensively deep-copied all data, and internal methods (`structure()`, `validate()`,
`copy()`) added further redundant copies. For a 100x100x100 MeshgridDataDict (~38 MB),
a single `copy()` took 92 ms and `validate()` took 43 ms.

### What Changed

**`plottr/data/datadict.py`** (core data container):
- New `_copy_field()` helper with per-key copy semantics: numpy `.copy()` for arrays,
`list()` for axes, `deepcopy` only for mutable metadata
- Rewrote `copy(deep=True/False)` — no longer chains through `structure()` → `validate()`
→ `deepcopy`. New `deep=False` shares arrays (xarray-style API, backward compatible)
- `_build_structure()` private helper that skips redundant validation
- `MeshgridDataDict.validate()` monotonicity check: replaced `np.unique(np.sign(np.diff(...)))`
with direct min/max checks — same coverage, no sort/allocate
- `mask_invalid()` fast-path: skips masking entirely when data has no invalid entries
- `shapes()` uses `np.shape()` instead of `np.array(...).shape`
- `datasets_are_equal()` shape short-circuit + set-based comparison
- `remove_invalid_entries()` fixed O(n²) `np.append` pattern + fixed crash on inhomogeneous arrays
- `meshgrid_to_datadict()` / `datadict_to_dataframe()`: `ravel()` instead of `flatten()`

**`plottr/utils/num.py`** (numerical utilities):
- `largest_numtype()`: dtype check instead of iterating every element as Python object (~15,000× faster)
- `is_invalid()`: skip zero-array allocation for non-float types
- `guess_grid_from_sweep_direction()`: convert with `np.asarray()` once instead of 4×
- `_find_switches()`: compute `is_invalid()` once (was 3×), single `np.percentile([lo,hi])` call
(was 2 separate sorts), vectorized boolean filter, `np.nanmean` for NaN-safe sweep direction
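
The `largest_numtype()` change amounts to inspecting the dtype once instead of touching every element; a sketch (the actual return convention in `plottr/utils/num.py` may differ):

```python
import numpy as np

def largest_numtype(arr):
    # One dtype check replaces a Python-level loop over every element.
    a = np.asarray(arr)
    if np.issubdtype(a.dtype, np.complexfloating):
        return complex
    if np.issubdtype(a.dtype, np.floating):
        return float
    if np.issubdtype(a.dtype, np.integer):
        return int
    return None  # non-numeric dtypes (object, str, ...)
```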

**`plottr/node/node.py`**: Call `structure()` only when the structure actually changes (50× faster in steady state)

**`plottr/node/dim_reducer.py`**: Removed redundant `copy()` in `XYSelector.process()`

**`plottr/node/grid.py`**: Pass `copy=False` to `datadict_to_meshgrid()` since gridder already copies input

**`plottr/plot/base.py`**: `dataclasses.replace` instead of `deepcopy` for complex plot splitting

### Bugs Fixed
- `copy()` now properly deep-copies global mutable metadata (was sharing references)
- `remove_invalid_entries()` no longer crashes when dependents have different numbers of invalid entries

### Benchmark Results

**Micro-benchmarks (key functions):**

| Function | Before | After | Speedup |
|---|---|---|---|
| `largest_numtype` (500K float) | 29.8 ms | 0.002 ms | ~15,000× |
| `mesh_500k_copy()` | 42.2 ms | 2.9 ms | 14.8× |
| `node_process` (500K mesh, steady state) | 7.4 ms | 0.15 ms | 50× |
| `_find_switches` (640K pts) | 80 ms | 31 ms | 2.6× |
| `datadict_to_meshgrid` (640K pts) | 175 ms | 71 ms | 2.5× |
| `mesh_500k_validate()` | 20.5 ms | 14.1 ms | 1.5× |

**Real experimental data (large qcodes database, steady-state refresh):**

| Dataset | Data Size | Before | After | Speedup |
|---|---|---|---|---|
| QDstability (14400×251, 16 deps) | 223 MB | 555 ms | 189 ms | 2.93× |
| TopogapStage2 (41×33×5×81, 21 deps) | 152 MB | 439 ms | 161 ms | 2.73× |
| QDtuning (7440×121, 16 deps) | 14 MB | 31 ms | 11 ms | 2.73× |

**Interactive actions (simulated user operations on large datasets):**

| Action | Before | After | Speedup |
|---|---|---|---|
| Toggle subtract average (15 MB 2D) | 293 ms | 29 ms | 10.2× |
| Swap XY axes (18 MB 2D) | 790 ms | 241 ms | 3.3× |
| Switch dependent (61 MB 1D) | 2,287 ms | 977 ms | 2.3× |
| Data refresh (15 MB 2D) | 697 ms | 199 ms | 3.5× |

### Tests Added

221 new tests across 4 test files:
- `test_datadict_copy_semantics.py` — copy isolation, edge cases, pipeline integrity
- `test_pipeline_coverage.py` — per-node tests, hypothesis property-based, various dtypes
- `test_round2_optimizations.py` — is_invalid, largest_numtype, remove_invalid_entries
- `test_gridder_comprehensive.py` — all GridOption paths, shapes, edge cases

---

## Part 2: Implemented — Inspectr Loading & UX

### Problem

Opening a large QCoDeS database (1496 runs) in inspectr took 15+ minutes because the
`experiments()` + `data_sets()` enumeration in QCoDeS is O(N²). Clicking any dataset
froze the UI for ~1 second while the snapshot (up to 6 MB of JSON) was parsed into
thousands of tree widget items.

### What Changed

**Fast database overview** (`plottr/data/qcodes_db_overview.py`, new module):
- Single SQL JOIN query fetching run metadata directly from runs + experiments tables
- Skips snapshot and run_description blobs entirely
- Reads `inspectr_tag` directly as a column from the runs table
- Intended for eventual contribution to QCoDeS
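
The shape of that query is roughly the following (a sketch: the exact column names in the QCoDeS SQLite schema may differ from what is shown here):

```python
import sqlite3

# Single JOIN over runs + experiments; never touches the snapshot or
# run_description blobs, which is what makes it fast.
OVERVIEW_SQL = """
SELECT runs.run_id, runs.name, runs.result_counter,
       runs.run_timestamp, runs.completed_timestamp, runs.guid,
       experiments.name AS exp_name, experiments.sample_name
FROM runs JOIN experiments ON runs.exp_id = experiments.exp_id
ORDER BY runs.run_id
"""

def db_overview(path):
    with sqlite3.connect(path) as conn:
        return conn.execute(OVERVIEW_SQL).fetchall()
```

Because SQLite only reads the pages containing the selected columns, the cost is essentially independent of how large the snapshots are.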

**Lazy snapshot loading** (`plottr/apps/inspectr.py`):
- Snapshot tree built only when user expands the "QCoDeS Snapshot" section
- Info pane sections collapsed by default
- Smooth pixel-based scrolling for tall rows (e.g., exception tracebacks)

**Incremental refresh**:
- `refreshDB()` only loads runs newer than the last known run_id
- Merges incremental results into existing dataframe

**Loading UX**:
- Live progress indicator: "Loading database... (142/1496 datasets)"
- Contextual messages: "Select a date...", "No datasets found...", "No datasets match filter..."
- Wider default window (960×640)

**Fallback chain**: SQL direct → `load_by_id` loop → original `experiments()` API

### Benchmark

| Approach | 23 runs | 1496 runs (projected) |
|---|---|---|
| Old (experiments + data_sets) | 103 ms | 15+ minutes |
| load_by_id loop | 90 ms | ~5 seconds |
| **SQL direct** (new) | **14 ms** | **~10 ms** |
| Incremental (3 new runs) | - | **~4 ms** |

Snapshot click: 951 ms → 0.3 ms (3,554× faster)

---

## Part 3: Implemented — Plot UI Improvements

### What Changed

**Grid layout for pyqtgraph subplots** (`plottr/plot/pyqtgraph/autoplot.py`):
- Replaced the single-column `QSplitter` with a `QGridLayout` using a near-square grid
  (same formula as matplotlib: `nrows = int(n**0.5 + 0.5)`)
- Many subplots now arrange as 2×2, 2×3, 4×4, etc. instead of stacking vertically
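
The layout computation is small enough to show in full (a sketch mirroring the matplotlib-style formula above):

```python
def grid_shape(n: int):
    """Near-square grid for n subplots: nrows = int(n**0.5 + 0.5),
    ncols = ceil(n / nrows)."""
    nrows = int(n ** 0.5 + 0.5)
    ncols = -(-n // nrows)  # ceiling division without math.ceil
    return nrows, ncols
```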

**Scrollable plot area** (both backends):
- "Scrollable" checkbox + min-height spinbox in the plot toolbar
- Off by default; when enabled, plot area expands and becomes scrollable
- Min height per row configurable (40–2000 px, default 75 px pyqtgraph / 100 px mpl)

**Plot backend selector** (`plottr/apps/inspectr.py`):
- Combo box in inspectr toolbar to switch between matplotlib and pyqtgraph
- Default: matplotlib. Applies to newly opened plot windows.

---

## Part 4: Not Implemented — Future Suggestions

These were identified during analysis but not implemented in this PR.

### HDF5 Data Loading (datadict_storage.py)
- Lines 274 and 305 read the **entire HDF5 dataset into memory** just to get its shape
- Fix: `ds.shape` instead of `ds[:].shape` — would reduce load time by 50–80%
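
The fix in miniature: an h5py `Dataset` exposes `.shape` from file metadata, while `ds[:]` materializes the whole payload first (paths and names below are illustrative):

```python
import os
import tempfile
import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("data", data=np.zeros((500, 400)))

with h5py.File(path, "r") as f:
    ds = f["data"]
    assert ds.shape == (500, 400)   # metadata only, no array read
    assert ds[:].shape == ds.shape  # same answer, but loads ~1.5 MB first
```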

### Signal Emission Overhead (node.py)
- Up to 7 Qt signals emitted per node per data update
- `dataFieldsChanged` is redundant (axes + deps)
- Could consolidate to 1–2 batched signals

### Fitter / Histogrammer / ScaleUnits Memoization
- These nodes recompute results on every update even when inputs haven't changed
- Could cache results keyed on data hash + parameters
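
A cache key along these lines would work (`data_fingerprint` is a hypothetical helper, not an existing plottr function; hashing a few MB costs milliseconds, far less than re-fitting):

```python
import hashlib
import numpy as np

def data_fingerprint(arrays, params):
    """Hash array bytes plus node parameters into a memoization key."""
    h = hashlib.sha1()
    for a in arrays:
        h.update(np.ascontiguousarray(a).tobytes())
    h.update(repr(sorted(params.items())).encode())
    return h.hexdigest()
```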

### Pipeline Change Detection
- No concept of "what changed" — every update re-processes all data through all nodes
- For append-only monitoring, nodes could process only new data

### QCoDeS API Suggestion
The ideal API for inspectr would be a single function returning lightweight run metadata
for all or a range of runs without creating full DataSet objects:
```python
get_run_overview(conn, start_id=None, end_id=None)
# Returns: [{run_id, exp_name, sample_name, name, timestamps, guid, result_counter, metadata_keys}]
```
This would be a single SQL query completing in <1 ms for any database size.

---

## Part 5: Profiling with Real Data (963×1001 complex RF measurement)

Profiled using a real 963×1001 complex128 2D gate-gate sweep measurement
(~12.5 MB on disk, ~15 MB in memory as complex128).

### Timing Summary

| Operation | Time (ms) | Notes |
|---|---|---|
| `ds_to_datadict` (first call) | 2,588 | 1,500 ms is xarray/cf_xarray import (one-time) |
| `ds_to_datadict` (steady state) | 999 | qcodes SQLite → numpy deserialization |
| `datadict_to_meshgrid` | 122 | `guess_grid_from_sweep_direction` dominates |
| Pipeline steady state (sel+grid) | 51 | Per re-trigger with same data |
| Switch dependent variable | 172 | selector + gridding + pyqtgraph `eq()` |
| Complex: real only | 8.5 | `copy()` + `.real.copy()` |
| Complex: real+imag | 11.6 | `copy()` + `.real` + `.imag` |
| Complex: mag+phase | 30.8 | `copy()` + `np.abs()` + `np.angle()` |
| `copy()` deep | 5.1 | Already fast after our optimization |
| `copy()` shallow | 0.1 | Zero-copy array sharing |
| `validate()` | 0.2 | Already fast |
| `structure()` | 0.4 | Already fast |
| `is_invalid()` on 963k complex | 44.6 | **`a == None` comparison is 44× slower than `np.isnan`** |
| `np.isnan()` on 963k complex | 1.0 | What `is_invalid` should use for numeric dtypes |

### Bottleneck Analysis

#### 1. `is_invalid()` — 44× slower than needed (LOW-HANGING FRUIT)

The current implementation does `a == None` for all arrays, which triggers Python object
comparison on every element. For numeric arrays (float/complex), this is always `False`
and is pure waste. Replacing with `np.isnan()` directly for numeric dtypes would cut
`is_invalid` from 44.6 ms → ~1 ms.

This cascades through `_find_switches()` (which calls `is_invalid` on each 963k-element
axis), making `datadict_to_meshgrid` ~90 ms faster.

**Fix**: In `is_invalid()`, check dtype first — if it's a numeric type, skip the `== None`
check entirely and return just `np.isnan(a)`.
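
The dtype-first check could look like this (a sketch of the idea, not the exact plottr implementation):

```python
import numpy as np

def is_invalid(a: np.ndarray) -> np.ndarray:
    # float/complex: `a == None` is always elementwise False, so skip it
    if np.issubdtype(a.dtype, np.floating) or \
            np.issubdtype(a.dtype, np.complexfloating):
        return np.isnan(a)
    # integers/bools can hold neither NaN nor None
    if np.issubdtype(a.dtype, np.integer) or np.issubdtype(a.dtype, np.bool_):
        return np.zeros(a.shape, dtype=bool)
    # object/string arrays: keep the elementwise None comparison
    return np.asarray(a == None, dtype=bool)  # noqa: E711
```

Note that `np.isnan` on complex input already returns True when either the real or imaginary part is NaN, so no separate complex branch is needed.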

#### 2. `ds_to_datadict()` — 999 ms steady state (MEDIUM EFFORT)

The qcodes `DataSetCacheDeferred` loads data via xarray round-trip. The actual SQLite
read + numpy deserialization (`_convert_array` → `numpy.read_array` → `ast.literal_eval`
for headers) takes ~1 second for 963k × 3 parameters.

This is largely inside qcodes, so fixes would be upstream. However, plottr could:
- Cache the loaded DataDict and skip reload when the dataset hasn't changed
- Use `load_by_id(...).cache.data()` directly instead of going through `ds_to_datadict`
which re-wraps the data
- For completed datasets (known from metadata), cache the DataDict permanently

#### 3. `datadict_to_meshgrid` with `guessShape` — 122 ms (AVOIDABLE)

When shape metadata exists in the QCodes `RunDescriber` (this dataset has
`shapes={'rf_wrapper_ch6_Vrf_6': (1001, 1001)}`), the gridder should use
`GridOption.metadataShape` and skip the expensive `guess_grid_from_sweep_direction`.

The autoplot code already does this (`autoplot.py:298`), but the grid widget default
is `noGrid`, so if the user starts from the widget rather than autoplot, they get
`guessShape` which runs the full sweep-direction analysis on every re-trigger.

**Fix**: Default the grid widget to `metadataShape` when shape metadata is available.

#### 4. `np.abs()` + `np.angle()` for complex mag+phase — 30.8 ms (INHERENT)

This is inherent computational cost for computing magnitude and phase of 963k complex128
values. Not much to optimize here, but could be deferred (only compute when the plot
backend actually needs to render).

#### 5. pyqtgraph `Terminal.setValue` → `eq()` — 12 ms per node (MEDIUM)

pyqtgraph's flowchart compares old and new terminal values using a recursive `eq()`
function. For large DataDicts this recurses into all arrays and does element-wise
comparison. This adds ~24 ms per pipeline trigger (12 ms per node, 2 nodes).

**Fix**: Override `eq()` on DataDictBase to do a cheap identity or shape check
instead of element-wise comparison, or set terminal values without comparison.
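
A cheap structural `__eq__` could look like this (illustrative stand-in for `DataDictBase`; whether pyqtgraph's `eq()` dispatch honors a custom `__eq__` for dict subclasses would need checking, otherwise the comparison must be bypassed in `Terminal.setValue` itself):

```python
import numpy as np

class CheapEqData(dict):
    """Identity-or-shape equality: lets a recursive comparison
    short-circuit instead of comparing every array element."""
    def __eq__(self, other):
        if self is other:
            return True
        if not isinstance(other, CheapEqData):
            return NotImplemented
        # compare structure only: same keys, same array shapes
        return (self.keys() == other.keys() and
                all(np.shape(self[k]) == np.shape(other[k]) for k in self))

    __hash__ = None
```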

### Suggested Priority (remaining)

Items 1, 2, and 6 have been implemented. Remaining potential improvements:

1. ~~**Fix `is_invalid()`**~~ ✅ Done — 44× faster (44.6 ms → 1.0 ms)
2. ~~**Default to `metadataShape`**~~ ✅ Done — avoids 122 ms of gridding when shape metadata exists
3. **Cache loaded DataDict** for completed datasets — avoids the 999 ms reload on each refresh
4. **Override pyqtgraph `eq()`** for DataDictBase — saves ~24 ms per pipeline trigger
5. **Lazy complex splitting** — compute mag/phase only when the plot backend needs them
6. ~~**Fix mpl double-replot**~~ ✅ Done — ~20% faster mpl steady state (919 ms → 754 ms)
7. **Matplotlib artist-level updates** — instead of `fig.clear()` + full recreation on every
   `setData()`, reuse existing Line2D/QuadMesh/colorbar artists and update their data.
   The pyqtgraph backend already does this via `clearWidget=False`; the same pattern
   could cut mpl steady-state replot from ~750 ms to ~200 ms.

### Backend Comparison After Optimizations (963×1001 complex128)

| Operation | matplotlib | pyqtgraph |
|---|---|---|
| First plot | 1,428 ms | 175 ms |
| Steady replot | 754 ms | 80 ms |
| Complex real | 394 ms | 118 ms |
| Complex realAndImag | 687 ms | 114 ms |
| Complex magAndPhase | 730 ms | 108 ms |

The pyqtgraph backend is ~10× faster for steady-state replots because it reuses
plot widget objects when only the data changes. The matplotlib backend's remaining
cost is dominated by `fig.clear()`, subplot/artist recreation, and agg rendering.
20 changes: 9 additions & 11 deletions plottr/apps/autoplot.py
@@ -7,7 +7,6 @@
import time
import argparse
from typing import Union, Tuple, Optional, Type, List, Any, Type
from packaging import version

from .. import QtCore, Flowchart, Signal, Slot, QtWidgets, QtGui
from .. import log as plottrlog
@@ -249,7 +248,10 @@ def setDefaults(self, data: DataDictBase) -> None:

try:
self.fc.nodes()['Data selection'].selectedData = selected
self.fc.nodes()['Grid'].grid = GridOption.guessShape, {}
if data.meta_val('qcodes_shape') is not None:
self.fc.nodes()['Grid'].grid = GridOption.metadataShape, {}
else:
self.fc.nodes()['Grid'].grid = GridOption.guessShape, {}
self.fc.nodes()['Dimension assignment'].dimensionRoles = drs
# FIXME: this is maybe a bit excessive, but trying to set all the defaults
# like this can result in many types of errors.
@@ -291,17 +293,12 @@ def __init__(self, fc: Flowchart,

def setDefaults(self, data: DataDictBase) -> None:
super().setDefaults(data)
import qcodes as qc
qcodes_support = (version.parse(qc.__version__) >=
version.parse("0.20.0"))
if data.meta_val('qcodes_shape') is not None and qcodes_support:
self.fc.nodes()['Grid'].grid = GridOption.metadataShape, {}
else:
self.fc.nodes()['Grid'].grid = GridOption.guessShape, {}



def autoplotQcodesDataset(log: bool = False,
pathAndId: Union[Tuple[str, int], None] = None) \
pathAndId: Union[Tuple[str, int], None] = None,
plotWidgetClass: Optional[Type[PlotWidget]] = None) \
-> Tuple[Flowchart, QCAutoPlotMainWindow]:
"""
Sets up a simple flowchart consisting of a data selector,
@@ -331,7 +328,8 @@ def autoplotQcodesDataset(log: bool = False,
win = QCAutoPlotMainWindow(fc, pathAndId=pathAndId,
widgetOptions=widgetOptions,
monitor=True,
loaderName='Data loader')
loaderName='Data loader',
plotWidgetClass=plotWidgetClass)
win.show()

return fc, win