feat: cluster sequences into groups + propagation on validation #132
Merged
Recurring real-world entity at one camera angle (a persistent fire, a recurring antenna FP, ...). A group carries at most one label (smoke OR false positive, never both, enforced by a CHECK constraint), and the representative_bbox lives on the group itself so the group remains self-defining if all its members are eventually pruned. Sequences gain a nullable sequence_group_id FK with ON DELETE SET NULL. Membership is set by the assign-groups job, not on import.
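The two constraints described above can be tried out in isolation. This is a minimal sketch using SQLite from the Python standard library, not the project's actual migration: the column types, the `try_insert_group` helper and the label values `wildfire` / `antenna` are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and ON DELETE SET NULL) when this is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE sequence_groups (
        id INTEGER PRIMARY KEY,
        camera_id INTEGER NOT NULL,
        azimuth INTEGER NOT NULL,
        representative_bbox TEXT NOT NULL,   -- stored as JSON text in this sketch
        smoke_type TEXT,
        false_positive_type TEXT,
        -- at most one of the two labels may be set
        CHECK (smoke_type IS NULL OR false_positive_type IS NULL)
    );
    CREATE TABLE sequences (
        id INTEGER PRIMARY KEY,
        sequence_group_id INTEGER
            REFERENCES sequence_groups(id) ON DELETE SET NULL
    );
    """
)


def try_insert_group(smoke_type, false_positive_type):
    """Hypothetical helper: True if the row passes the CHECK constraint."""
    try:
        conn.execute(
            "INSERT INTO sequence_groups "
            "(camera_id, azimuth, representative_bbox, smoke_type, false_positive_type) "
            "VALUES (1, 90, '{}', ?, ?)",
            (smoke_type, false_positive_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A group with one label or no label is accepted, a group with both is rejected, and deleting a group leaves its sequences in place with a NULL `sequence_group_id`.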
GET returns the group + its members (lightweight projection with has_annotation flag for the UI's "already-annotated" hint). POST /sequence_groups/assign is the single-threaded, idempotent batch that turns unassigned sequences into group memberships:
- compute representative_bbox from the sequence's first 10 detections (median of algo_predictions, ignoring others_bboxes; matches the current auto-annotation flow)
- best-IoU match against existing groups for the same (camera_id, azimuth), threshold > 0.5 (stricter than within-sequence clustering since a wrong match auto-applies inherited labels)
- if no match: create a new group; otherwise join the existing one and, if the group already has a label, create a SequenceAnnotation with inherited labels in stage SEQ_ANNOTATION_DONE
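The matching steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the endpoint's code: boxes are taken to be normalized `(x_min, y_min, x_max, y_max)` tuples, "median of algo_predictions" is read as a coordinate-wise median, and `representative_bbox` / `best_group` are hypothetical function names. The 0.5 default is the threshold from this commit; a later commit in this PR lowers it to 0.3.

```python
from statistics import median
from typing import Optional

Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed normalized


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def representative_bbox(boxes: list[Box], limit: int = 10) -> Box:
    """Coordinate-wise median over the first `limit` boxes (assumed reading)."""
    head = boxes[:limit]
    return tuple(median(b[i] for b in head) for i in range(4))


def best_group(rep: Box, groups: dict[int, Box], threshold: float = 0.5) -> Optional[int]:
    """Id of the group with the highest IoU strictly above `threshold`, else None.

    `groups` is expected to hold only groups from the same (camera_id, azimuth) bucket.
    """
    best_id, best_iou = None, threshold
    for group_id, group_box in groups.items():
        iou = box_iou(rep, group_box)
        if iou > best_iou:
            best_id, best_iou = group_id, iou
    return best_id
```

A `None` result corresponds to the "create a new group" branch; any other result is the group to join.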
Apply one label (smoke OR false positive, never both) to many sequences in a single request. Skips sequences whose annotation is past SEQ_ANNOTATION_DONE so reviewed work isn't clobbered, and rejects with 409 when the target group already carries a different label unless the caller passes force=true. When group_id is provided, the endpoint also writes the label onto the group itself so future joiners inherit it via assign-groups. Returns per-sequence applied/skipped status with reasons.
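The conflict and skip rules of the bulk endpoint can be summarized in a few lines. This is a simplified sketch, not the handler itself: `LabelConflict`, `bulk_apply` and the plain-string stages are hypothetical, labels are reduced to a single string, and the set of locked stages is passed in by the caller rather than guessed here.

```python
class LabelConflict(Exception):
    """Stands in for the HTTP 409 response described above."""

    status_code = 409


def bulk_apply(group_label, new_label, stages, locked, force=False):
    """Return a per-sequence applied/skipped report for one bulk request.

    group_label: label already carried by the target group, or None
    stages:      mapping of sequence id to its current annotation stage
    locked:      stages that must not be overwritten
    """
    if group_label is not None and group_label != new_label and not force:
        raise LabelConflict(
            f"group already labeled {group_label!r}; pass force=true to overwrite"
        )
    results = {}
    for seq_id, stage in stages.items():
        if stage in locked:
            results[seq_id] = {"status": "skipped", "reason": f"stage {stage} is locked"}
        else:
            results[seq_id] = {"status": "applied", "reason": None}
    return results
```

Checking the conflict before touching any sequence means a rejected request leaves every annotation untouched.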
…ences
Thin wrapper that authenticates against the local annotation API and calls POST /sequence_groups/assign once. Single-threaded by contract: the endpoint is the only writer of sequence_groups, so we never have two ingesters racing to create the same group. make pull-sequences now runs assign-groups automatically after each import so newly-pulled sequences are immediately clustered (and inherit labels from existing groups when applicable).
New route /sequence-groups/:id/annotate. Thumbnail grid + per-member checkbox + label form. Defaults to selecting every member that doesn't already have an annotation, so the annotator just unchecks outliers and applies. The label form is a radio (smoke vs false positive) plus a single dropdown to pick the enum value, mirroring the API constraint that exactly one of the two is set. Submit posts to /annotations/sequences/bulk with the explicit sequence_ids list and group_id, then surfaces the per-sequence applied/skipped status returned by the backend. Conflict overwrite is gated on an "Overwrite group's existing label" checkbox surfaced only when the group is already labeled.
- assign-groups creates a new group from an unmatched sequence
- bulk-annotate writes the label both onto the sequences and onto the group (so future joiners inherit it)
- bulk-annotate rejects a conflicting label on an already-labeled group with 409, accepts it with force=true
- request validation rejects payloads with neither or both labels
- migration header: align Revises comment with the actual down_revision
(a1b2c3d4e5f6, the others_bboxes migration)
- shorten the XOR check constraint to its equivalent positive form
(smoke_type IS NULL OR false_positive_type IS NULL)
- delete the unused SequenceGroupLabelUpdate schema and the unused
module-level logger
- merge the two duplicated label-rewriting helpers into
apply_label_to_sequences_bbox in services/annotation_generation.py
- add UNDER_ANNOTATION to the bulk-annotate locked stages so active
human work isn't clobbered
- only write the label onto the group when at least one sequence was
actually applied; otherwise the group would carry a label that
never reached any current member
- frontend bulk-error display reads the API's detail field with the
Error.message fallback (axios rejects with { detail })
- test docstring: remove the unsupported claim about cross-sequence
inheritance coverage (tested end-to-end via the make pipeline)
- new GET /sequence_groups/ paginated endpoint with member_count and
?labeled=true|false filter, ordered by created_at desc
- GET /sequence_groups/{id} now returns each member's first detection id
and its algo_predictions, so the UI can render a thumbnail with bbox
overlays without an extra round-trip per member
- new "Sequence groups" entry in the left sidebar pointing at
/sequence-groups (the index page) so groups are discoverable from the
navigation
- new SequenceGroupsListPage: paginated table of all groups with
member count, label state, and a filter (all / labeled / unlabeled)
- SequenceGroupAnnotatePage now overlays two bboxes on each thumbnail:
- the sequence's own tracked predictions (red, solid)
- the group's reference region (yellow, dashed) — same on every
thumbnail so the annotator can eyeball whether each member really
overlaps the group
- thumbnails consume the new first_detection_id + algo_predictions
inlined in the group response, removing the previous N+1 query
The list query returns rows containing the JSONB representative_bbox (a dict), which is not hashable. fastapi-pagination then refuses to deduplicate the rows and raises NonHashableRowsException — surfaced to the UI as 'Failed to load groups'. Disabling row deduplication is safe here since we already group by SequenceGroup.id.
Pivot the group review UX: the per-group page no longer applies labels
itself. The annotator clicks through to the regular per-sequence
annotation page; if the group has been marked "validated", the labels
they save there fan out to every other unannotated member of the group.
- new sequence_groups.is_validated column (separate migration)
- new PATCH /sequence_groups/{id} to flip is_validated
- new DELETE /sequence_groups/{group_id}/members/{seq_id} to remove an
outlier from a group without deleting the sequence
- propagation hook in POST/PATCH /annotations/sequences: when an
annotation reaches SEQ_ANNOTATION_DONE and the seq belongs to a
validated group, derive a single label from the annotation
(most-common smoke type or fp type), copy it onto the group, and
re-generate matching annotations for the other members
- skip locked stages (UNDER_ANNOTATION, SEQ_ANNOTATION_DONE,
IN_REVIEW, NEEDS_MANUAL, ANNOTATED) so manual work isn't clobbered
- frontend: drop the label form on /sequence-groups/:id/annotate;
thumbnails are now clickable links to /sequences/:id/annotate, each
has an X to remove that member, and the header gets a
Validate / Unvalidate toggle. List page surfaces is_validated.
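The propagation hook described above can be pictured as two small steps: derive one label, then pick the members that may receive it. This is a sketch under assumptions, not the hook's implementation: `derive_label` and `propagate` are hypothetical names, the group is a plain dict, smoke taking precedence over false positive is an assumption, and the label values are placeholders. The conflict branch mirrors the behaviour a later commit in this PR describes (leave the group alone and report a warning).

```python
from collections import Counter

# Locked stages as listed in the commit message above.
LOCKED_STAGES = {
    "UNDER_ANNOTATION",
    "SEQ_ANNOTATION_DONE",
    "IN_REVIEW",
    "NEEDS_MANUAL",
    "ANNOTATED",
}


def derive_label(smoke_types, fp_types):
    """Most common smoke type if any, otherwise most common false-positive type."""
    if smoke_types:
        return ("smoke", Counter(smoke_types).most_common(1)[0][0])
    if fp_types:
        return ("false_positive", Counter(fp_types).most_common(1)[0][0])
    return None


def propagate(group, saved_seq_id, label, member_stages):
    """Return (ids of members to re-generate, warning message or None)."""
    if not group["is_validated"]:
        return [], None
    if group["label"] is not None and group["label"] != label:
        return [], f"group already carries {group['label']}; propagation skipped"
    group["label"] = label
    targets = [
        seq_id
        for seq_id, stage in member_stages.items()
        if seq_id != saved_seq_id and stage not in LOCKED_STAGES
    ]
    return targets, None
```

The saved sequence itself is excluded from the targets because its annotation is the source of the label.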
…placeholder
import.py's Step 3 creates an empty SequenceAnnotation in stage READY_TO_ANNOTATE for every imported sequence. assign-groups previously treated 'annotation already exists' as 'skip', so a fresh sequence that joined a labeled group never picked up the label automatically. Now the inheritance path checks the existing annotation's stage:
- READY_TO_ANNOTATE (the placeholder) -> update it in place with the inherited labels and bump to SEQ_ANNOTATION_DONE
- anything past that (UNDER_ANNOTATION, SEQ_ANNOTATION_DONE+) -> skip; the human / review pipeline owns it
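The stage check above amounts to a small decision table. The sketch below is illustrative only: `plan_inheritance` and its string return values are hypothetical, and stages are modelled as plain strings rather than the project's enum.

```python
def plan_inheritance(existing_stage, group_has_label):
    """Decide what assign-groups does with a joiner's annotation.

    existing_stage: stage of the sequence's current annotation, or None if it has none
    Returns one of "none", "create", "update_placeholder", "skip".
    """
    if not group_has_label:
        return "none"                # nothing to inherit
    if existing_stage is None:
        return "create"              # no annotation yet: create one with the inherited labels
    if existing_stage == "READY_TO_ANNOTATE":
        return "update_placeholder"  # import.py placeholder: fill in place, bump to SEQ_ANNOTATION_DONE
    return "skip"                    # human or review pipeline owns it
```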
R&D predicted 0.5 was too strict and would miss most natural smoke drift. Live data confirms it: at 0.5 only 14% of new sequences joined an existing group; at 0.3 the rate is 30% (matching the R&D estimate of ~46% click savings).
The /sequence_groups/ list page is for finding groups worth bulk-annotating; size-1 groups can't benefit from validation + propagation. Filter them out at the SQL level (HAVING count >= 2 + inner-join on the member-count subquery) so they don't pollute the page. The single-group GET endpoint and the assign-groups job are unaffected; singleton groups still exist in the DB and a future joiner can promote them to a multi-member group.
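The shape of that query can be reproduced against a throwaway SQLite database. This is a sketch of the technique (member-count subquery with HAVING, inner-joined to the groups table), not the project's SQLAlchemy query; the table contents and timestamps are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sequence_groups (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE sequences (id INTEGER PRIMARY KEY, sequence_group_id INTEGER);
    -- group 2 is a singleton and should not appear in the list
    INSERT INTO sequence_groups VALUES (1, '2024-01-01'), (2, '2024-01-02'), (3, '2024-01-03');
    INSERT INTO sequences VALUES (10, 1), (11, 1), (12, 2), (13, 3), (14, 3), (15, 3);
    """
)

rows = conn.execute(
    """
    SELECT g.id, mc.member_count
    FROM sequence_groups AS g
    JOIN (
        SELECT sequence_group_id, COUNT(*) AS member_count
        FROM sequences
        WHERE sequence_group_id IS NOT NULL
        GROUP BY sequence_group_id
        HAVING COUNT(*) >= 2
    ) AS mc ON mc.sequence_group_id = g.id
    ORDER BY g.created_at DESC
    """
).fetchall()

print(rows)  # [(3, 3), (1, 2)]
```

Because the join is an inner join, groups with no row in the subquery (singletons and empty groups) drop out without a separate WHERE clause.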
- assign-groups docstring: clarify that every unassigned sequence is assigned to a group (not just unannotated ones); only the inheritance step is gated on annotation stage
- frontend thumbnail: switch from object-cover to object-contain so the bbox overlays don't get pushed off the visible image when an image's aspect ratio differs from 16:9
- drop the unused bulkAnnotateSequences client method, the Bulk* TS types, and the SEQUENCE_ANNOTATIONS_BULK constant; the per-sequence propagation path replaced this surface for the UI, and the backend endpoint stays as a programmatic primitive
- list page caption: replace the misleading 'bulk-annotate' wording with the actual flow (annotate one member → propagation if validated)
- update model.py docstring to reference the current IoU > 0.3 threshold instead of the previous 0.5
…bnails
- Move 'Sequence groups' to the top of the left sidebar (above Sequences/Detections) since it's now the primary workflow entry
- Default the groups list filter to 'unlabeled': labeled groups are done work, and the point of opening this page is to find the next thing to validate + annotate
- Increase thumbnail size: 1/2/3 columns instead of 2/3/5; thumbnails now occupy enough area to actually see what's in the image and judge whether the bbox overlay matches
- bulk-annotate: rewrite the 409 conflict message to spell out that
force=true only overwrites the group's label and re-propagates to
unlocked members; it does not touch annotations past SEQ_ANNOTATION_DONE
- group propagation: refuse to silently flip an existing group label
when a member's annotation implies a different one; log a warning and
leave the group alone instead (the per-seq annotation still saves)
- GET /sequence_groups/{id}: replace the has_annotation boolean with the
raw annotation_processing_stage so the UI can distinguish import.py's
READY_TO_ANNOTATE placeholder from real human work
- first-detection subquery: switch to row_number() with a deterministic
tie-breaker so members can't duplicate on equal recorded_at
- dedupe _bbox_iou by reusing services.annotation_generation.box_iou
- move the previously inline SequenceAnnotationUpdate import to the top
of sequence_groups.py
- frontend: revert thumbnails to object-cover (pyro images are 16:9,
cover matches container exactly) and route the annotated indicator
through the new processing-stage signal so READY_TO_ANNOTATE no
longer shows as "annotated"
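The first-detection fix in the list above relies on a window function with a tie-breaker. Here is a minimal sketch of that pattern in SQLite (window functions need SQLite 3.25 or newer); the `detections` table layout and the choice of `id` as the tie-breaker are assumptions, since the commit only says the tie-breaker is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE detections (id INTEGER PRIMARY KEY, sequence_id INTEGER, recorded_at TEXT);
    -- sequence 10 has two detections with the same recorded_at
    INSERT INTO detections VALUES
        (1, 10, '2024-01-01T00:00:00'),
        (2, 10, '2024-01-01T00:00:00'),
        (3, 11, '2024-01-01T00:00:30'),
        (4, 11, '2024-01-01T00:00:00');
    """
)

first_detections = conn.execute(
    """
    SELECT sequence_id, id
    FROM (
        SELECT id, sequence_id,
               ROW_NUMBER() OVER (
                   PARTITION BY sequence_id
                   ORDER BY recorded_at ASC, id ASC   -- id breaks ties on equal recorded_at
               ) AS rn
        FROM detections
    ) AS ranked
    WHERE rn = 1
    ORDER BY sequence_id
    """
).fetchall()

print(first_detections)  # [(10, 1), (11, 4)]
```

Joining on `recorded_at = MIN(recorded_at)` instead would return both rows for sequence 10, which is the member duplication the commit removes.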
- 'remove from group' now sticks: new Sequence.is_group_excluded
boolean (migration d4e5f6a7b8c9). DELETE /sequence_groups/{id}/members
sets it; assign_groups filters it out so the next import doesn't
silently re-attach a sequence the annotator pruned.
- Group-propagation conflicts surface to the caller: when fan-out is
skipped because the group already carries a different label, the
per-sequence annotation response now includes group_propagation_warning
with a human-readable reason. UI can show a toast; the annotation
itself still saves.
- Drop the misleading 'in a single transaction' phrasing on the bulk
endpoint summary. Wording is now 'per-sequence commits' so callers
know retries are idempotent but the loop is not atomic.
Backend already returned the warning on /annotations/sequences POST + PATCH since the previous fix; the React save handler was ignoring the mutation result and always showed 'Annotation saved successfully'. Annotators in a validated group with a conflicting label would never see that propagation was skipped.
- Add group_propagation_warning to the SequenceAnnotation TypeScript type
- In AnnotationInterface's update mutation onSuccess, when the saved annotation carries a non-null group_propagation_warning, show an info toast with the backend message in addition to the success toast
The previous attempt called showToastNotification twice and let the
1-second auto-advance run anyway, so:
- useToastNotifications holds only one message — the second call
overwrote the first.
- The setTimeout either replaced the warning with a 'Moving to next'
info toast or navigated away from the page entirely.
Now: when the backend returns group_propagation_warning we render a
sticky amber banner at the top of the page and skip the auto-advance.
The banner carries the backend message + an 'Open group' link
(/sequence-groups/<id>/annotate) when the sequence belongs to a group,
plus a Dismiss button. The annotator can reconcile the conflict before
moving on.
- Reset the sticky banner whenever sequenceId changes so it doesn't bleed from one sequence onto the next, and clear it on a subsequent non-warning save so a resolved conflict disappears immediately.
- Move the banner from a fixed top-0 z-50 element (which overlaid the fixed AnnotationHeader) to a sticky top-20 z-30 element inside the body content. It now sits just below the header and follows the user as they scroll, without obscuring header controls.
- Add Sequence.sequence_group_id to the frontend TypeScript type so the 'Open group' link no longer needs an unsafe in-keyword cast.
- Swap the raw <a href> for react-router-dom Link so navigating away from a conflict doesn't trigger a full app reload.
Higher priority:
- Test the propagation hook explicitly. Four new cases cover:
unvalidated group (no-op), validated + no conflict (group label set
and fan-out happens), validated + conflict (warning returned, group
untouched, no fan-out), and validated + locked member (reviewed
member is left alone).
- Add a recovery path for manual exclusion: new endpoint
POST /sequence_groups/members/{sequence_id}/re-include clears
is_group_excluded so an accidentally-removed sequence can be put
back into the pool. The remove confirm dialog now also warns that
the exclusion is sticky.
Smaller:
- Make the inconsistent bare return in _propagate_to_group_if_validated
an explicit return None, matching the function's annotated return
type.
- Clamp confidence in the computed representative_bbox to [0, 1] so a
malformed detection can't make a group fail RepresentativeBbox
validation on the next read.
- Cross-reference the locked-stage set between
ANNOTATED_STAGES (frontend) and _BULK_LOCKED_STAGES (backend) so
future changes touch both lists together.
- Make the empty-state copy on the groups list page actually useful
for the default 'unlabeled' filter — mentions that singletons are
hidden by design and points at make assign-groups.
Why
Most platform sequences are recurring views of the same camera/azimuth/region. R&D on 857 sequences shows ~46% of annotation clicks can be avoided by clustering and reusing one annotator's labels across the group.
What
- sequence_groups table keyed on (camera_id, azimuth) with a frozen representative_bbox. At most one label per group (smoke OR false positive, enforced by CHECK). Groups carry an is_validated flag.
- POST /sequence_groups/assign + make assign-groups: best-IoU match (>0.3) on the camera/azimuth bucket. New sequences joining an already-labeled group inherit its label automatically (upgrade the import.py READY_TO_ANNOTATE placeholder to SEQ_ANNOTATION_DONE). Chained into make pull-sequences.
- Propagation on save: when an annotation reaches SEQ_ANNOTATION_DONE and the seq is in a validated group, the labels fan out to other unlocked members. Refuses to silently flip a conflicting group label.

Out of scope / known follow-ups