Conversation

@acwhite211
Member

@acwhite211 acwhite211 commented Jan 7, 2026

Fixes #7617
Fixes #7626

Create Django models and a migration for the autonumbering tables that exist in the Specify 6 database schema but are currently missing from the Specify 7 Django model schema. This adds the tables, fields, constraints, and indexes associated with the autonumsch_coll, autonumsch_div, and autonumsch_dsp tables.

Also adds the specifyuser_spprincipal table, which was missing for the setup tool.

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add pr to documentation list

Testing instructions

  • Create a new database by setting the database name in the env file to a non-existent database. See that the migrations run without error.
  • See that autonumbering is working correctly.
  • Test out creating a new collection object with an autonumbered ID.
  • When testing on a newly created database with the setup tool, make sure to create the database from scratch so that the autonum tables get created; otherwise the autonumbering error will persist.

@acwhite211 acwhite211 added the 2 - Database/Schema Issues that are related to the underlying database and schema label Jan 9, 2026
@acwhite211 acwhite211 marked this pull request as ready for review January 9, 2026 17:15
@acwhite211 acwhite211 requested review from a team January 9, 2026 17:16
@acwhite211 acwhite211 changed the title Add missing autonum tables in django Add missing autonum and spprincipal tables in django Jan 9, 2026
Contributor

@alesan99 alesan99 left a comment

  • Create a new database by setting the database name in the env file to a non-existent database. See that the migrations run without error.
  • See that autonumbering is working correctly.
  • Test out creating a new collection object with an autonumbered ID.

Autonum tables are being created but it looks like specifyuser_spprincipal is still missing.

Uncommenting the specifyuser_spprincipal queries in the setup PR results in a missing table error:

Details
worker         | Traceback (most recent call last):
worker         |   File "/opt/specify7/specifyweb/backend/setup_tool/api.py", line 327, in create_specifyuser
worker         |     new_user = Specifyuser.objects.create(**data)
worker         |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/models/manager.py", line 87, in manager_method
worker         |     return getattr(self.get_queryset(), name)(*args, **kwargs)
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/models/query.py", line 660, in create
worker         |     obj.save(force_insert=True, using=self.db)
worker         |   File "/opt/specify7/specifyweb/specify/models_utils/model_extras.py", line 123, in save
worker         |     self.clear_admin()
worker         |   File "/opt/specify7/specifyweb/specify/models_utils/model_extras.py", line 101, in clear_admin
worker         |     cursor.execute("""
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 102, in execute
worker         |     return super().execute(sql, params)
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 67, in execute
worker         |     return self._execute_with_wrappers(
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
worker         |     return executor(sql, params, many, context)
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 84, in _execute
worker         |     with self.db.wrap_database_errors:
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__
worker         |     raise dj_exc_value.with_traceback(traceback) from exc_value
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/utils.py", line 89, in _execute
worker         |     return self.cursor.execute(sql, params)
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 75, in execute
worker         |     return self.cursor.execute(query, args)
worker         |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/cursors.py", line 206, in execute
worker         |     res = self._query(query)
worker         |           ^^^^^^^^^^^^^^^^^^
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/cursors.py", line 319, in _query
worker         |     db.query(q)
worker         |   File "/opt/specify7/ve/lib/python3.12/site-packages/MySQLdb/connections.py", line 254, in query
worker         |     _mysql.connection.query(self, query)
worker         | django.db.utils.ProgrammingError: (1146, "Table 'newdb15.specifyuser_spprincipal' doesn't exist")

This is on a brand new (previously non-existent) database.
https://drive.google.com/file/d/1BwhA3QhpYsrV4tWa_yGkv1cjWNxwxeE2/view?usp=drive_link

@github-project-automation github-project-automation bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Jan 13, 2026
Contributor

@melton-jason melton-jason left a comment

Glad we're adding these missing tables! 🥳

While I was reviewing this PR, I went through some other ManyToMany tables that we might be missing at the moment, and I also found:

  • spprincipal_sppermission
  • project_colobj
  • sp_schema_mapping

I pushed a branch that adds these missing tables and addresses some of the issues I mentioned in this PR: issue-7617-1

Feel free to look at those changes and incorporate any here, or let me know and I can open a PR!

We would still need to discuss the proper on_delete behavior for the relationships, but everything else for model creation should be handled (we might still need to handle adding the relationships to the datamodel, but I think that could technically be addressed later down the line).

save = partialmethod(custom_save)

class AutonumSchColl(models.Model):
    collection = models.ForeignKey("Collection", db_column="CollectionID", on_delete=models.DO_NOTHING)
Contributor

Unless we have custom delete behavior for these relationships, we shouldn't (and can't) use the DO_NOTHING behavior here.

DO_NOTHING means that Django will attempt to leave the database structure as-is when the referencing record is deleted.
Our databases enforce referential integrity, so an error will be thrown if we try to delete a Collection that an AutonumSchColl still references: https://docs.djangoproject.com/en/6.0/ref/models/fields/#django.db.models.DO_NOTHING

(The same applies to all other ForeignKeys with the DO_NOTHING delete behavior)

Contributor

I think this is primarily a discussion on whether we want traditional delete-blocker behavior (e.g., you can't delete a Collection without first deleting the AutonumSchColl records that reference it) or cascading behavior (e.g., all AutonumSchColl records that reference the Collection are deleted along with it).
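
As a rough illustration of the two options at the database level, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MariaDB. The table layout is simplified and hypothetical, and the rules are written directly into the DDL here only to make the two behaviors visible; it is not how the PR's migration defines them.

```python
import sqlite3


def delete_collection(on_delete: str) -> tuple[bool, int]:
    """Delete a collection whose join rows use the given ON DELETE rule.

    Returns (delete_succeeded, join_rows_left).
    """
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE collection (id INTEGER PRIMARY KEY)")
    # Simplified, hypothetical version of the join table.
    conn.execute(
        "CREATE TABLE autonumsch_coll ("
        " id INTEGER PRIMARY KEY,"
        " collection_id INTEGER NOT NULL"
        f" REFERENCES collection(id) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO collection (id) VALUES (1)")
    conn.execute("INSERT INTO autonumsch_coll (collection_id) VALUES (1)")
    try:
        conn.execute("DELETE FROM collection WHERE id = 1")
        deleted = True
    except sqlite3.IntegrityError:
        # The database refused the delete because a row still references it.
        deleted = False
    remaining = conn.execute("SELECT COUNT(*) FROM autonumsch_coll").fetchone()[0]
    conn.close()
    return deleted, remaining


blocked = delete_collection("RESTRICT")  # blocker-style behavior
cascaded = delete_collection("CASCADE")  # cascading behavior
```

With the blocker-style rule the delete is rejected and the join row survives; with the cascading rule the delete succeeds and the join row is removed with it.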

Member Author

I'm leaning towards using cascade, but can't say for sure what users would want.

@acwhite211
Member Author

acwhite211 commented Jan 14, 2026

Fixed the on_delete behavior and verified that a new database runs without error:

specify7       | MariaDB is up and running.
specify7       | Client host as seen by MariaDB: '172.18.0.10'
specify7       | Creating database 'blankdb_setup5'...
specify7       | Executing: mariadb -h "mariadb" -P "3306" -u "root" --password="<hidden>" -e "CREATE DATABASE `blankdb_setup5`;"
specify7       | Migrator user 'root' uses the same credentials as master.
specify7       | Skipping creation/grant steps for a separate migrator account and using master connection for migrations.
specify7       | App user 'root' uses the same credentials as master.
specify7       | Skipping creation/grant steps for a separate app account.
specify7       | Relying on master privileges for runtime connections.
specify7       | --------------------------------------------------
specify7       | Database and user setup complete.
specify7       | New database created: True
specify7       | New migrator user created: False
specify7       | New app user created: False
specify7       | --------------------------------------------------
...
specify7       |   Applying specify.0001_initial... OK
specify7       |   Applying accounts.0001_initial... OK
specify7       |   Applying accounts.0002_auto_20211223_1206... OK
specify7       |   Applying accounts.0003_auto_20220621_1541... OK
specify7       |   Applying attachment_gw.0001_initial... OK
specify7       |   Applying contenttypes.0001_initial... OK
specify7       |   Applying contenttypes.0002_remove_content_type_name... OK
specify7       |   Applying auth.0001_initial... OK
specify7       |   Applying auth.0002_alter_permission_name_max_length... OK
specify7       |   Applying auth.0003_alter_user_email_max_length... OK
specify7       |   Applying auth.0004_alter_user_username_opts... OK
specify7       |   Applying auth.0005_alter_user_last_login_null... OK
specify7       |   Applying auth.0006_require_contenttypes_0002... OK
specify7       |   Applying auth.0007_alter_validators_add_error_messages... OK
specify7       |   Applying auth.0008_alter_user_username_max_length... OK
specify7       |   Applying auth.0009_alter_user_last_name_max_length... OK
specify7       |   Applying auth.0010_alter_group_name_max_length... OK
specify7       |   Applying auth.0011_update_proxy_permissions... OK
specify7       |   Applying auth.0012_alter_user_first_name_max_length... OK
... 
specify7       |   Applying specify.0040_components... OK
specify7       |   Applying specify.0041_add_missing_schema_after_reorganization... OK
specify7       |   Applying workbench.0001_initial... OK
specify7       |   Applying workbench.0002_spdataset_visualorder... OK
specify7       |   Applying workbench.0003_auto_20210218_1256... OK
specify7       |   Applying workbench.0004_auto_20210219_1131... OK
specify7       |   Applying workbench.0005_auto_20210428_1634... OK
specify7       |   Applying workbench.0006_batch_edit... OK
specify7       |   Applying workbench.0007_spdatasetattachment... OK
specify7       |   Applying workbench.0008_spdataset_rolledback... OK
specify7       | /opt/specify7/ve/lib/python3.12/site-packages/debugpy/_vendored/pydevd/pydevd_plugins/__init__.py:5: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
specify7       |   __import__('pkg_resources').declare_namespace(__name__)
specify7       | 0.00s - Debugger warning: It seems that frozen modules are being used, which may
specify7       | 0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
specify7       | 0.00s - to python to disable frozen modules.
specify7       | 0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
specify7       | Watching for file changes with StatReloader

@CarolineDenis
Contributor

CarolineDenis commented Jan 15, 2026

@acwhite211 Can we add:

spprincipal_sppermission
project_colobj
sp_schema_mapping

?

Contributor

@melton-jason melton-jason left a comment

Since these changes are related to Support many-to-many relationships #2928, I've thought a little about how these changes impact implementing #2928 in the future, and how easily we'll be able to optimize and use these models in the future.

Let me know and I can absolutely take over handling many-to-many tables! For the setup tool at least, we just need to worry about the Django models and migration (the changes I'd love to see, minor modifications of this branch to support Django ManyToMany fields, are in issue-7617-1), but there are some implications for how we tackle #2928.


While not strictly required, for general usability and potential optimizations of these tables on the backend with Django, I would love to see us using Django's built-in many-to-many relationship! The changes needed to support this are very minimal.

Below are the examples of adding AutonumSchColl on issue-7617-1:

class Autonumschcoll(models.Model):
    # specify_model = datamodel.get_table_strict("Autonumschcoll")
    id = models.AutoField(primary_key=True, db_column='AutonumSchCollID')
    autonumberingscheme = models.ForeignKey('Autonumberingscheme', db_column='AutoNumberingSchemeID', related_name="+", on_delete=protect_with_blockers)
    collection = models.ForeignKey('Collection', db_column='CollectionID', related_name="+", on_delete=protect_with_blockers)

    class Meta:
        db_table = 'autonumsch_coll'
        constraints = [
            models.UniqueConstraint(fields=["autonumberingscheme", "collection"], name="autonumberingscheme_collection")
        ]

on the Autonumberingscheme model:

# Relationships: Many-to-Many
collections = models.ManyToManyField(
    "Collection",
    through="Autonumschcoll",
    through_fields=("autonumberingscheme", "collection"),
    related_name="autonumberingschemes"
)

This would make it so that we can do stuff like the following:

my_collection = models.Collection.objects.get(id=some_id)
# Fetch all autonumbering schemes
autonumbering_schemes = my_collection.autonumberingschemes.all()
# Add a new autonumbering scheme
new_scheme = Autonumberingscheme(...)
my_collection.autonumberingschemes.add(new_scheme)
# Remove an existing autonumbering scheme 
my_collection.autonumberingschemes.remove(some_scheme)

# or we can do any of the above also on the autonumbering side

# this would also enable us to use optimizations like prefetch_related
# https://docs.djangoproject.com/en/6.0/ref/models/querysets/#prefetch-related

my_collection = models.Collection.objects.prefetch_related("autonumberingschemes").get(id=some_id)

# This will not result in another database query as it was fetched when
# evaluating the above QuerySet
autonumbering_schemes = my_collection.autonumberingschemes.all()

Currently, we would have to directly use the join table to handle relationships:

my_collection = models.Collection.objects.get(id=some_id)
autonumberingschemes = models.AutonumSchColl.objects.filter(collection=my_collection)

When it comes to adding support for many-to-many relationships in other areas of the application (QueryBuilder, Forms, WorkBench/BatchEdit, use with the API, Record/Field formatters, Record Merging, etc.), these hinge on us integrating many-to-many relationships with our internal datamodel.

Due to complications of setting the idField [1], I think we can get away with not defining these tables as separate tables and instead define everything within the respective fields.

A user should not have to go through an intermediate join table to access or modify information in many-to-many relationships (e.g., CollectionObject -> projects and its reverse should be maintained as direct relationships).

I have something like the following in mind for defining many-to-many relationships:

# CollectionObject -> projects
Relationship(
  name='projects',
  type='many-to-many',
  required=False,
  relatedModelName='Project',
  otherSideName='collectionObjects',
  joinTable="proj_colobj",
  joinDatabaseColumn="CollectionObjectID"
)
# we can define the join table explicitly if that's needed. 
# Either in the datamodel or through some external structure 
# that the datamodel references (this would hopefully satisfy 
# DRY principles and be less prone to mistakes when defining 
# relationships)
# Project -> collectionObjects
Relationship(
  name='collectionObjects',
  type='many-to-many',
  required=False,
  relatedModelName='CollectionObject',
  otherSideName='projects',
  joinTable="proj_colobj",
  joinDatabaseColumn="ProjectID"
)

This would generally follow our current "to-many" scheme and allow people to directly interact with the "otherside" of the many-to-many relationship.
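
One way the "external structure that the datamodel references" idea could look is sketched below. Everything here is hypothetical: `JoinTable` and `relationship_kwargs` are illustrative names that do not exist in the codebase, and the output is just the keyword arguments from the proposal above. The point is only that both sides can be derived from a single description of the join table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinTable:
    """Hypothetical single description of a many-to-many join table."""
    name: str
    left_model: str
    left_field: str    # relationship name on the left model
    left_column: str   # join-table column pointing at the left model
    right_model: str
    right_field: str   # relationship name on the right model
    right_column: str  # join-table column pointing at the right model


def relationship_kwargs(join: JoinTable) -> dict[str, dict[str, object]]:
    """Derive the keyword arguments for both sides from one JoinTable."""
    def side(field: str, related: str, other_side: str, column: str) -> dict[str, object]:
        return {
            "name": field,
            "type": "many-to-many",
            "required": False,
            "relatedModelName": related,
            "otherSideName": other_side,
            "joinTable": join.name,
            "joinDatabaseColumn": column,
        }

    return {
        join.left_model: side(join.left_field, join.right_model, join.right_field, join.left_column),
        join.right_model: side(join.right_field, join.left_model, join.left_field, join.right_column),
    }


proj_colobj = JoinTable(
    name="proj_colobj",
    left_model="CollectionObject", left_field="projects", left_column="CollectionObjectID",
    right_model="Project", right_field="collectionObjects", right_column="ProjectID",
)
sides = relationship_kwargs(proj_colobj)
```

Because both dictionaries come from the same `JoinTable`, the join table name and the two `otherSideName` values cannot drift apart.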

The first challenge in mind comes with ensuring SQLAlchemy supports many-to-many relationships so support can be added for QueryComboBoxes and the QueryBuilder.

SQLAlchemy states that we can use the secondary keyword to specify a join table through a relationship:

association_table = Table(
    "association_table",
    Base.metadata,
    Column("left_id", ForeignKey("left_table.id"), primary_key=True),
    Column("right_id", ForeignKey("right_table.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "left_table"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table, backref="parents")


class Child(Base):
    __tablename__ = "right_table"
    id = Column(Integer, primary_key=True)

https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#many-to-many

This would be the start of enabling people to build queries through many-to-many relationships like CollectionObject -> projects -> (whatever) and Project -> collectionObjects -> (whatever).
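
For reference, a query through such a relationship ultimately comes down to two joins. The sketch below uses Python's built-in sqlite3 rather than SQLAlchemy or MariaDB, and the table and column names are simplified assumptions, not the real Specify schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical tables: two entity tables plus a join table.
conn.executescript(
    """
    CREATE TABLE collectionobject (id INTEGER PRIMARY KEY, catalognumber TEXT);
    CREATE TABLE project (id INTEGER PRIMARY KEY, projectname TEXT);
    CREATE TABLE project_colobj (
        projectid INTEGER NOT NULL REFERENCES project(id),
        collectionobjectid INTEGER NOT NULL REFERENCES collectionobject(id),
        PRIMARY KEY (projectid, collectionobjectid)
    );
    INSERT INTO collectionobject VALUES (1, '000001'), (2, '000002');
    INSERT INTO project VALUES (10, 'Survey A'), (11, 'Survey B');
    INSERT INTO project_colobj VALUES (10, 1), (11, 1), (10, 2);
    """
)

# CollectionObject -> projects: hop through the join table, then to project.
rows = conn.execute(
    """
    SELECT p.projectname
    FROM collectionobject AS co
    JOIN project_colobj AS pc ON pc.collectionobjectid = co.id
    JOIN project AS p ON p.id = pc.projectid
    WHERE co.catalognumber = ?
    ORDER BY p.projectname
    """,
    ("000001",),
).fetchall()
project_names = [name for (name,) in rows]
conn.close()
```

The reverse direction (Project -> collectionObjects) is the same pair of joins started from the project table.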

Footnotes

  1. See https://github.com/specify/specify7/pull/7455#discussion_r2543051371 and https://github.com/specify/specify7/pull/7455#discussion_r2543046732

Comment on lines +8030 to +8031
specifyuser = models.ForeignKey('SpecifyUser', db_column='SpecifyUserID', on_delete=models.deletion.DO_NOTHING)
spprincipal = models.ForeignKey('SpPrincipal', db_column='SpPrincipalID', on_delete=models.deletion.DO_NOTHING)
Contributor

There is still the DO_NOTHING on_delete behavior here 👀

Comment on lines +8001 to +8002
models.Index(fields=["autonumberingscheme"], name="FK46F04F2AFE55DD76"),
models.Index(fields=["collection"], name="FK46F04F2A8C2288BA"),
Contributor

Django should actually index ForeignKeys it creates by default, so these explicit indexes shouldn't be needed:

A database index is automatically created on the ForeignKey. You can disable this by setting db_index to False. You may want to avoid the overhead of an index if you are creating a foreign key for consistency rather than joins, or if you will be creating an alternative index like a partial or multiple column index.

https://docs.djangoproject.com/en/4.2/ref/models/fields/#foreignkey

For example, here are the indexes from the sprole table:

MariaDB [specify]> show indexes from sprole;
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| Table  | Non_unique | Key_name                                                     | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Ignored |
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
| sprole |          0 | PRIMARY                                                      |            1 | id            | A         |          10 |     NULL | NULL   |      | BTREE      |         |               | NO      |
| sprole |          1 | sprole_collection_id_4dccb6f9_fk_collection_usergroupscopeid |            1 | collection_id | A         |           5 |     NULL | NULL   |      | BTREE      |         |               | NO      |
+--------+------------+--------------------------------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+
2 rows in set (0.055 sec)

collection_id is already indexed, without specifying anything on the model or migration:

collection = models.ForeignKey(Collection, on_delete=models.CASCADE)

('collection', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='specify.Collection')),


Could we remove the indexes declared explicitly on Foreign Keys in this PR?

save = partialmethod(custom_save)

class AutonumSchColl(models.Model):
Contributor

We did have a convention in the past where we expected to have the Django model class names be in a certain case: where the first letter in the class name is capitalized and all other letters are lowercase.

Here are some related commits I could find:

This specific convention was needed when we were dynamically fetching the model from the models class.
Did we do some work to alleviate the need of the convention?

I did a regular expression search of getattr\((.*models) within our Python files and it looks like (for non-tree classes) having them named this way might at least break WorkBench/BatchEdit DataSets the tables are in, as well as queries.
There may be other places which would break when interacting with these classes.
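
To make the concern concrete, here is a minimal sketch of how a name-based lookup can miss a mixed-case class. It assumes a `getattr(models, name.capitalize())` style of lookup, which may not match the actual call sites; `models` below is a hypothetical stand-in namespace, not the real models module.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the generated Django models module.
models = SimpleNamespace(
    Collectionobject=type("Collectionobject", (), {}),
    AutonumSchColl=type("AutonumSchColl", (), {}),
)


def lookup_model(table_name: str):
    """Resolve a model by capitalizing the table name, as the convention assumes."""
    # str.capitalize() upper-cases the first letter and lower-cases the rest.
    return getattr(models, table_name.capitalize(), None)


found = lookup_model("collectionobject")  # matches the conventional class name
missing = lookup_model("autonumschcoll")  # looks for "Autonumschcoll", finds nothing
```

Under that assumption, a class named `Autonumschcoll` would be found while `AutonumSchColl` would not.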

@acwhite211
Member Author

acwhite211 commented Jan 16, 2026

I think it might be best to move this issue into the #7643 PR. Everything that is missing from Sp6 is being created in Sp7 there, including spprincipal_sppermission, project_colobj, and sp_schema_mapping. There's already a lot of overlap.

Labels

2 - Database/Schema Issues that are related to the underlying database and schema

Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

[Guided Setup]: Add missing specifyuser_spprincipal and spprincipal tables
[Guided Setup]: Add missing autonumbering columns

5 participants