Prevent broken transactions when autonumbering #7671

Merged
melton-jason merged 27 commits into v7.11.4-prerelease from issue-6490-patch
Feb 4, 2026
@melton-jason commented Jan 26, 2026

Replaces #5404
Fixes #4148, #7560, #4894
Addresses part of #5337

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add PR to documentation list
  • Add automated tests

This branch should be functionally equivalent to #7455, but based on v7.11.3 instead of main.

This Pull Request addresses an issue in the prior autonumbering code, where database transactions could become broken due to a LOCK TABLES statement issued within a transaction:

Because LOCK TABLES implicitly commits any current transactions (see aforementioned MariaDB docs), wouldn't autonumbering implicitly commit the Django transaction?
If this is true, then the transaction state that Django works with in outer transaction.atomic() blocks would be broken/inconsistent: the transaction would already be committed as soon as the tables are locked for autonumbering.

See #6490 (comment)

Currently in main, this autonumbering behavior with the WorkBench can result in a functionally complete lockout of the database, preventing other connections from reading tables like Discipline and Collection until the WorkBench operation finishes.

Below is a video of the issue taken in v7.11.3:

v7_11_3_wb_issue.mov

Below is the same operation with the changes in this branch:

v7__11_3_wb_fix.mov

(Developer-Focused) New Internal Autonumbering Design

Specify now uses transaction-independent User Advisory Locks to determine which session is currently autonumbering a particular table. If no other session holds the autonumbering lock for that table, the connection acquires the lock.
If a record attempts to autonumber a field that is being autonumbered by another session (i.e., another session holds the autonumbering lock), Specify will wait up to 10 seconds for the other connection to release its lock. If the lock is not released within that time, Specify will error out of the operation.
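
As a rough illustration of the acquire-or-time-out behavior described above, here is a minimal sketch that assumes the advisory locks are MariaDB user-level locks obtained with GET_LOCK and released with RELEASE_LOCK. The helper name `advisory_lock`, the `cursor` argument (any DB-API cursor that accepts `%s` placeholders), and the lock name are assumptions for illustration, not the code used in this PR.

```python
from contextlib import contextmanager


@contextmanager
def advisory_lock(cursor, name: str, timeout: int = 10):
    """Hold a MariaDB user-level lock for the duration of the block.

    Hypothetical helper: `cursor` is assumed to be a DB-API cursor,
    for example one obtained from django.db.connection.cursor().
    """
    # GET_LOCK returns 1 on success, 0 if the timeout expired, NULL on error.
    cursor.execute("SELECT GET_LOCK(%s, %s)", [name, timeout])
    (acquired,) = cursor.fetchone()
    if acquired != 1:
        raise TimeoutError(f"could not acquire lock {name!r} within {timeout}s")
    try:
        yield
    finally:
        # User-level locks survive COMMIT and ROLLBACK; they are released
        # explicitly here (or when the connection closes).
        cursor.execute("SELECT RELEASE_LOCK(%s)", [name])
        cursor.fetchone()
```

Because the lock belongs to the connection rather than to the transaction, acquiring it does not implicitly commit an enclosing transaction.atomic() block the way LOCK TABLES does.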

To maintain autonumbering behavior for concurrent actions while a long-running transaction is also autonumbering (such as an ongoing WorkBench operation), Specify uses Redis (for IPC) to store the current highest autonumbering value for a particular formatter.
When a session holds an AutoNumbering lock for a particular table, it checks this store for the highest autonumber and compares it to the highest value the session can see in the database, using the higher of the two to determine the new highest value.
Thus, even within transactions that have a high Isolation Level, Specify still internally commits the AutoNumbering values to a store that other sessions can access.

Roughly, this means that AutoNumbering acts at a similar isolation level to READ UNCOMMITTED within the application.
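
The "higher of the two" comparison can be sketched as a small pure function. This is only an illustration under simplifying assumptions: the function name `next_autonumber` is hypothetical, and the values are treated as purely numeric, zero-padded strings (real field formats may carry prefixes).

```python
from __future__ import annotations


def next_autonumber(db_highest: str | None, stored_highest: str | None, width: int = 9) -> str:
    """Return the next zero-padded value after the higher of the two inputs.

    db_highest:     highest value this session can see in the database
    stored_highest: highest value published to the shared store, if any
    """
    candidates = [int(v) for v in (db_highest, stored_highest) if v is not None]
    current = max(candidates, default=0)
    return str(current + 1).zfill(width)
```

For example, if a session sees 000000005 in the database while a long-running upload has already published 000000008 to the store, `next_autonumber("000000005", "000000008")` yields 000000009 rather than a value the upload has already used.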

The core part of the implementation of this design is through the new LockDispatcher class:

class LockDispatcher:
    def __init__(self, lock_prefix: str | None = None, case_sensitive_names=False):
        db_name = getattr(settings, "DATABASE_NAME")
        self.lock_prefix_parts: list[str] = [db_name]
        if lock_prefix is not None:
            self.lock_prefix_parts.append(lock_prefix)
        self.case_sensitive_names = case_sensitive_names
        self.locks: dict[str, Lock] = dict()
        self.in_context = False

    def close(self):
        self.release_all()

    def __enter__(self):
        self.in_context = True
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()
        self.in_context = False

    def lock_name(self, *name_parts: str):
        final_name = LOCK_NAME_SEPARATOR.join(
            (*self.lock_prefix_parts, *name_parts))
        return final_name.lower() if self.case_sensitive_names else final_name

    @contextmanager
    def lock_and_release(self, name: str, timeout: int = 5):
        try:
            yield self.acquire(name, timeout)
        finally:
            self.release(name)

    def create_lock(self, name: str, timeout: int = 5):
        lock_name = self.lock_name(name)
        return Lock(lock_name, timeout)

    def acquire(self, name: str, timeout: int = 5):
        if self.locks.get(name) is not None:
            return
        lock = self.create_lock(name, timeout)
        self.locks[name] = lock
        return lock.acquire()

    def release_all(self):
        for lock_name in list(self.locks.keys()):
            self.release(lock_name)
        self.locks = dict()

    def release(self, name: str):
        lock = self.locks.pop(name, None)
        if lock is None:
            return
        lock.release()

This class can be used as a context manager and handles acquiring and releasing User Advisory Locks:

with LockDispatcher() as lock_dispatcher:
    # acquire a new lock
    lock_dispatcher.acquire("some_lock_name", timeout=10)
    # if another connection already holds a particular lock, Specify will wait timeout
    # seconds for the lock to be released before raising a TimeoutError
    lock_dispatcher.acquire("other_lock", timeout=20)
    # locks can be released explicitly...
    lock_dispatcher.release("some_lock_name")
    # or are automatically released when the context manager code block is exited
    do_more_while_locked()
# all acquired locks are released at this point, regardless of whether an error occurred within 
# the context block
do_something()

The AutonumberingLockDispatcher builds the IPC (Redis) integration required for AutoNumbering on top of the base LockDispatcher class, which handles the database locks.

class AutonumberingLockDispatcher(LockDispatcher):
    def __init__(self):
        lock_prefix = "autonumbering"
        super().__init__(lock_prefix=lock_prefix, case_sensitive_names=False)
        # We use Redis for IPC, to maintain the current "highest" autonumbering
        # value for each table + field
        self.redis = RedisConnection(decode_responses=True)
        # Before the records are created within a transaction, they're stored
        # locally within this dictionary
        # The whole dictionary can be committed to Redis via commit_highest
        # The key hierarchy is generally:
        # table -> field -> collection = "highest value"
        self.highest_in_flight: MutableMapping[str, MutableMapping[str, MutableMapping[int, str]]] = defaultdict(
            lambda: defaultdict(lambda: defaultdict(str)))

    def __exit__(self, exc_type, exc_val, exc_tb):
        super().__exit__(exc_type, exc_val, exc_tb)

    def highest_stored_value(self, table_name: str, field_name: str, collection_id: int) -> str | None:
        key_name = self.lock_name(
            table_name, field_name, "highest", str(collection_id))
        highest = RedisString(self.redis).get(key_name)
        if isinstance(highest, bytes):
            return highest.decode()
        elif highest is None:
            return None
        return str(highest)

    def cache_highest(self, table_name: str, field_name: str, collection_id: int, value: str):
        self.highest_in_flight[table_name.lower(
        )][field_name.lower()][collection_id] = value

    def commit_highest(self):
        for table_name, tables in self.highest_in_flight.items():
            for field_name, fields in tables.items():
                for collection, value in fields.items():
                    self.set_highest_value(
                        table_name, field_name, collection, value)
        self.highest_in_flight.clear()

    def set_highest_value(self, table_name: str, field_name: str, collection_id: int, value: str, time_to_live: int = 10):
        key_name = self.lock_name(
            table_name, field_name, "highest", str(collection_id))
        RedisString(self.redis).set(key_name, value,
                                    time_to_live, override_existing=True)
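
To make the cache-then-commit flow easier to follow without a Redis server, here is a simplified, self-contained sketch of the same idea. Everything in it is an assumption for illustration: the class name `InFlightHighest`, the plain dict standing in for Redis, and the "_" separator (the real LOCK_NAME_SEPARATOR value is not shown in this description). It also omits the time-to-live that set_highest_value applies.

```python
from collections import defaultdict

SEPARATOR = "_"  # assumption: stands in for the real LOCK_NAME_SEPARATOR


class InFlightHighest:
    """Collect the highest values locally, then publish them in one step."""

    def __init__(self, prefix_parts, store):
        self.prefix_parts = list(prefix_parts)
        self.store = store  # stand-in for Redis: any dict-like mapping
        # table -> field -> collection id -> highest value
        self.in_flight = defaultdict(lambda: defaultdict(dict))

    def key(self, table, field, collection_id):
        parts = (*self.prefix_parts, table, field, "highest", str(collection_id))
        return SEPARATOR.join(parts)

    def cache(self, table, field, collection_id, value):
        # Later values overwrite earlier ones, so only the latest is published.
        self.in_flight[table.lower()][field.lower()][collection_id] = value

    def commit(self):
        for table, fields in self.in_flight.items():
            for field, collections in fields.items():
                for collection_id, value in collections.items():
                    self.store[self.key(table, field, collection_id)] = value
        self.in_flight.clear()
```

Nothing reaches the shared store until `commit()` is called, which mirrors how cache_highest and commit_highest are split in the class above.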

(User-Focused) New Autonumbering Behavior

Below is a graph that describes the new behavior:

flowchart TD
    A[WorkBench 001] --> B[WorkBench 002]
    B l1@-- waiting for DataEntry--> C(.)
    B -- concurrent record saved --> D(DataEntry 003)
    C l2@-- DataEntry finished--> E(WorkBench 004)
    D --> E

    classDef wb stroke:#f00
    classDef de stroke:#00f
    classDef dashed stroke-dasharray: 5 5
    class A,B,C,E wb;
    class D de;
    class l1,l2 dashed

All AutoNumbering operations are processed serially. This means that if an ongoing WorkBench operation creates a record numbered 01, the next autonumbering value for any other session will be 02.
If the WorkBench then creates another record, it will be numbered 03.
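
The serial behavior can be modeled in-process with a single lock and a shared counter. This is a toy model only: the class `SharedAutonumber` is hypothetical, and a threading.Lock stands in for the advisory lock and the shared store.

```python
from __future__ import annotations

import threading


class SharedAutonumber:
    """In-process stand-in for the advisory lock plus the shared 'highest' store."""

    def __init__(self, width: int = 3):
        self._lock = threading.Lock()
        self._highest = 0
        self._width = width

    def next_value(self, session: str) -> tuple[str, str]:
        # Only one session at a time may read and bump the highest value.
        with self._lock:
            self._highest += 1
            return session, str(self._highest).zfill(self._width)


numbers = SharedAutonumber()
issued = [
    numbers.next_value(session)
    for session in ("WorkBench", "WorkBench", "DataEntry", "WorkBench")
]
```

Here `issued` pairs the sessions with 001, 002, 003, and 004 in order, matching the flowchart: the DataEntry record takes 003, so the WorkBench continues at 004.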

Testing instructions

Sample Database

To speed up setup for testing this Pull Request, I've set up a testing database that contains all the resources required for testing.
Feel free to use the database!

User - spadmin
Password - testuser

Collections:

  • Fish
    • In the SCC Division
    • In the Ichthyology Discipline
    • Catalog Number of the format FSH-#########
    • One CollectionObject WorkBench Data Set
    • One Accession WorkBench Data Set
  • Plants
    • In the SCC Division
    • In the Botany Discipline
    • Catalog Number of the format #########
    • CO -> text2 of the format AAA-NNN-#########
      • Where A can be any letter and N can be any number
    • Locality -> text1 of the format #########
    • Two Collection Object WorkBench Data Sets
    • One Locality Data Set
  • Vascular Plants
    • In the SCC Division
    • In the Botany Discipline
    • CO -> text2 of the format AAA-NNN-#########
      • Where A can be any letter and N can be any number
    • Locality -> text1 of the format #########
    • One Collection Object WorkBench Data Set
  • Mammals
    • In the Test Division
    • In the Mammology Discipline
    • One Accession WorkBench Data Set

Where Accessions are scoped to Division:

issue-6490_scoped_accession_testing.sql.zip

Where Accessions are globally scoped:

issue-6490_global_accession_testing.sql.zip


General Concurrency with WorkBench

  • Start a WorkBench Upload or Validation operation on a sufficiently large Data Set (the operation needs to be in progress while the below steps of General Concurrency are completed)
  • Open Specify in a new tab, window, or browser
  • Open a DataEntry form for the base table of the WorkBench Data Set, where one or more fields have an autonumbering field format
  • Save the Data Entry record and ensure the record saves successfully
  • Reload the page and ensure Specify loads and still remains accessible

Concurrent Autonumbering with the WorkBench

  • Start a WorkBench Upload operation on a sufficiently large Data Set that contains two or more fields that will be autonumbered (the operation needs to be in progress while some of the below steps of Concurrent Autonumbering with the WorkBench are completed)
  • Open Specify in a new tab, window, or browser
  • Open a DataEntry form for the base table of the WorkBench Data Set, where one or more fields have an autonumbering field format
  • Save the Data Entry record and ensure the record saves successfully
  • Wait for the WorkBench Upload operation to complete
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

Concurrent Autonumbering with Non Collection Scoping
Review Table Scoping Hierarchy for an overview of table scopes

With the Sample Database, this would mean using the Locality Data Set in the Plants Collection while creating a new Locality in the Vascular Plants Collection

  • Start a WorkBench Upload operation on a sufficiently large Data Set whose base table is not scoped to Collection and that contains one or more fields that will be autonumbered (the operation needs to be in progress while some of the below steps of Concurrent Autonumbering with Non Collection Scoping are completed)
  • Open Specify in a new tab, window, or browser
  • Switch to a different Collection which is in the same scope as the records being uploaded
  • Open a DataEntry form for the base table of the WorkBench Data Set, where one or more fields have an autonumbering field format
  • Save the Data Entry record and ensure the record saves successfully
  • Wait for the WorkBench Upload operation to complete
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

Autonumbering with Variable Table Scoping

In a database with more than one Division where Accessions are scoped to Division

  • Start a WorkBench Upload operation on a sufficiently large Accession Data Set where a field on Accession is being autonumbered (the operation needs to be in progress while some of the below steps of Autonumbering with Variable Table Scoping are completed)
  • Open Specify in a new tab, window, or browser
  • Switch to a different Collection which is in a different Division than the records being uploaded
  • Open an Accession DataEntry form
  • Save the Accession and ensure the record saves successfully
  • Wait for the WorkBench Upload operation to complete
  • Ensure that the Accessions in different Divisions are independently auto-numbered (e.g., Accessions in different Divisions should not share the same autonumbering)

In a database with more than one Division where Accessions are globally scoped

  • Start a WorkBench Upload operation on a sufficiently large Accession Data Set where a field on Accession is being autonumbered (the operation needs to be in progress while some of the below steps of Autonumbering with Variable Table Scoping are completed)
  • Open Specify in a new tab, window, or browser
  • Switch to a different Collection which is in a different Division than the records being uploaded
  • Open an Accession DataEntry form
  • Save the Accession and ensure the record saves successfully
  • Wait for the WorkBench Upload operation to complete
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results in the other Division
  • Generally ensure that autonumbering respects the scoping of Accessions (e.g., using DataEntry, create Accessions in different scopes and ensure they're autonumbered as expected)
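
The expected outcomes of the two Accession scenarios can be summarized with a small model in which the counter key includes the scope. This is a sketch of the expected results only, not of Specify's implementation: the class `ScopedCounters`, the key layout, and the Division ids are all hypothetical.

```python
from collections import defaultdict

GLOBAL_SCOPE = "global"  # hypothetical marker for globally scoped Accessions


class ScopedCounters:
    """Keep one autonumber counter per (table, field, scope)."""

    def __init__(self, accessions_global: bool):
        self.accessions_global = accessions_global
        self._highest = defaultdict(int)

    def _scope(self, division_id: int):
        # Globally scoped: every Division shares one counter.
        # Division scoped: each Division gets its own counter.
        return GLOBAL_SCOPE if self.accessions_global else ("division", division_id)

    def next_accession_number(self, division_id: int, width: int = 9) -> str:
        key = ("accession", "accessionnumber", self._scope(division_id))
        self._highest[key] += 1
        return str(self._highest[key]).zfill(width)
```

With `ScopedCounters(accessions_global=False)`, Divisions 1 and 2 each start at 000000001; with `accessions_global=True`, the second Division continues from the value the first one issued.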

General

  • Please also do general testing focused around autonumbering, especially with concurrent WorkBench uploads!

See #6490, #5337
Replaces #5404
Fixes #4148, #7560

This branch is largely the application of #7455 on `v7.11.3`
@grantfitzsimmons added this to the 7.11.4 milestone Jan 28, 2026
@melton-jason marked this pull request as ready for review January 29, 2026 18:20
@melton-jason requested review from a team January 29, 2026 18:20
@melton-jason

There was previously an issue with the Concurrent Autonumbering with Non Collection Scoping testing step where records scoped higher than Collection would not "share" their expected autonumbering scheme.
For example, when using the Sample Database, autonumbering a Locality -> text1 while a concurrent WorkBench Upload was also autonumbering Locality -> text1 records would insert a duplicate.

This issue has been fixed!
Concurrent autonumbering should now always respect the table's scope.

@kwhuber left a comment

General Concurrency with WorkBench

  • Save the Data Entry record and ensure the record saves successfully
  • Reload the page and ensure Specify loads and still remains accessible

Concurrent Autonumbering with the WorkBench

  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

Concurrent Autonumbering with Non Collection Scoping

  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

@emenslin left a comment

  • Save the Data Entry record and ensure the record saves successfully
  • Reload the page and ensure Specify loads and still remains accessible
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

Everything looks good except I'm running into some issues with accession autonumbering and I'm not sure if the scoping is working properly. I'm not entirely sure if I am testing it right though so if there could be some clarification on the expected behavior and some specific testing instructions I would appreciate it.

@github-project-automation github-project-automation bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Feb 2, 2026
@bhumikaguptaa requested a review from a team February 2, 2026 22:15
@bhumikaguptaa left a comment

  • Save the Data Entry record and ensure the record saves successfully
  • Reload the page and ensure Specify loads and still remains accessible
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.

Everything works as expected.

@acwhite211 self-requested a review February 2, 2026 22:38
@melton-jason commented Feb 3, 2026

Hi everyone!
I fixed some edge cases that did not have the correct expected behavior (unit tests came in clutch 🥳) [1].
Additionally, there were some issues relating to autonumbering Accessions, especially when the Accession was scoped to Division.

I've updated the sample database to include more than one Division (the new Mammals collection is in a different Division than the other Collections) and some Accession Data Sets. Specifically, I've provided two databases: one with Accessions scoped to Division and one with Accessions that are globally scoped.

I've also updated the testing instructions to include testing for Accession autonumbering.

During my own testing, I came across two Issues that are also present in main:

  • The scoping of AccessionNumber Auto Numbering is not currently configurable in Specify 7: when the Institution -> IsAccessionGlobal checkbox is changed, it does not update the underlying autonumbering tables
  • AutoIncrementing does not currently work when there is more than one record through a to-many relationship in the WorkBench. For example, uploading more than one CollectionObject with the ######### catalog number on the same Accession when using Accession as a base table is currently broken.

These will likely not be addressed in this PR: I'll write up an Issue for each of these

Footnotes

  1. A lot of these edge cases were centered around autoincrementing Field Formats that accept letters preceding the autoincrementing portion. For example, if I saved a record with AB-002 and then, within 5 seconds, saved a record as AA-###, it would previously be saved as AB-003 instead of the expected AA-001.
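
The expected result in this footnote follows from comparing only values that share the same letter prefix. A minimal sketch of that rule, assuming a `PREFIX-NNN` layout; the helper name `next_for_prefix` and the regular expression are illustrative and not taken from Specify's formatter code:

```python
import re


def next_for_prefix(existing, prefix, width=3):
    """Return the next value for `prefix`, ignoring values with other prefixes."""
    # Match exactly "<prefix>-<width digits>", e.g. "AA-001" for prefix "AA".
    pattern = re.compile(rf"^{re.escape(prefix)}-(\d{{{width}}})$")
    matches = (pattern.match(value) for value in existing)
    highest = max((int(m.group(1)) for m in matches if m), default=0)
    return f"{prefix}-{highest + 1:0{width}d}"
```

Given only AB-002 among the existing values, a new AA-### record becomes AA-001, while a new AB-### record becomes AB-003.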

@melton-jason requested review from a team and emenslin February 3, 2026 16:37
@emenslin left a comment

  • Save the Data Entry record and ensure the record saves successfully
  • Reload the page and ensure Specify loads and still remains accessible
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.
  • Save the Data Entry record and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results.
  • Save the Accession and ensure the record saves successfully
  • Ensure that the Accessions in different Divisions are independently auto-numbered (e.g., Accessions in different Divisions should not share the same autonumbering)
  • Save the Accession and ensure the record saves successfully
  • Ensure that the value for the autonumbered field from Data Entry is skipped in the WorkBench upload results in the other Division
  • Generally ensure that autonumbering respects the scoping of Accessions (e.g, using DataEntry, create Accessions in different scopes and ensure they're autonumbered as expected)

Autonumbering behavior looks good! The only issue I ran into was when editing the preparation count of an existing CO record and pressing save it gets stuck loading. I checked this on multiple different CO and it seems to be happening consistently in the mammals collection but not in other collections. I checked in 7.11.3 and I can recreate it so it is probably not related to this PR but I wanted to mention it since I discovered it when testing.

@melton-jason commented Feb 4, 2026

The only issue I ran into was when editing the preparation count of an existing CO record and pressing save it gets stuck loading

This Issue is actually what happens when the CollectionObject doesn't have a CollectionObjectType (COT). The business rules (provided below) expect the COT to always be present, and don't handle cases when there is not a COT.

const coType = await resource.rgetPromise('collectionObjectType');
const coTypeTreeDef = coType.get('taxonTreeDef');

This results in errors behind-the-scenes, errors that are reported in the console in a production environment. Regardless, these errors make it so that the request to Create or Update the resource is never actually sent to the backend when the resource is saved, meaning the frontend will be waiting indefinitely for a response to a request that was never sent!

Screen.Recording.2026-02-03.at.1.04.28.PM.mov

This functionality was not modified in any way in this PR.
Ultimately, this was caused because I created a new Division, Discipline, and Collection in Specify 6. Specify 7 expects its migrations to be run on the collections, and creating resources in Specify 6 does not add the resources that Specify 7 expects.

I've updated the sample databases in the testing instructions to add the Specify 7 resources to the Division, Discipline, and Collection for Mammology, so you shouldn't encounter the same issue when using the sample databases!

@melton-jason merged commit f36908d into v7.11.4-prerelease Feb 4, 2026
14 checks passed
@melton-jason deleted the issue-6490-patch branch February 4, 2026 17:40
@github-project-automation github-project-automation bot moved this from Dev Attention Needed to ✅Done in General Tester Board Feb 4, 2026
@specifysoftware
This pull request has been mentioned on Specify Community Forum. There might be relevant details there:

https://discourse.specifysoftware.org/t/specify-7-11-4-release-announcement/3340/1
