docs: OEP-68 Learning Content Identifiers by kdmccormick · Pull Request #773 · openedx/openedx-proposals

kdmccormick · 2026-03-18T19:47:46Z

Abstract

Open edX code has four kinds of identifiers for learning content: Integer Primary Keys,
Codes, OpaqueKeys, and UUIDs. Choosing the wrong kind—or naming it ambiguously—can
cause subtle bugs, broken exports, and painful code reviews. This OEP explains what each kind
is, when to reach for it, and how to name it consistently.

The primary audience is developers working on openedx-core, openedx-platform, and other
repositories that manage learning content. Plugin developers, site administrators, and
nontechnical Open edX Core Contributors who work with learning content will also find these
concepts useful.

🌐 This is part of a broader "Open edX Core v1.0" effort.

Review 👀

Open: Monday, March 23rd, 2026
Closed: ~~Monday, April 13th, 2026~~ Wednesday, April 15th, 2026 (beginning of day)

You can preview the rendered OEP here: https://open-edx-proposals--773.org.readthedocs.build/en/773/best-practices/oep-0068-bp-content-identifiers.html

AI Note 🤖

I taught Claude about the concepts in this OEP and then had it write a first draft. I then rewrote most of the content myself and corrected Claude's misunderstandings. I then told Claude "make it easier to read" and then reviewed and tweaked that.

kdmccormick · 2026-03-20T20:45:10Z

@ormsbee , could I ask you to give a very high-level review of what this OEP's trying to achieve? Don't worry about the exact prose and examples, I'd like to work on those a bit to make them more reader-friendly. Just want to make sure the scope and intent feels appropriate before I dig in to making a final draft.

ormsbee · 2026-03-20T23:00:53Z

@kdmccormick: I think the high level scope and framing are good. I was a little surprised to see UUIDs in there, but it makes sense. As part of this PR, you might want to amend OEP-38 so that people see a mention of this new OEP if they're looking at Data Modeling more generally.

bmtcril

Looks good, just one comment

bmtcril · 2026-03-23T19:11:49Z

+Summary
+=======
+
+.. list-table::


It would be nice if this table had guidance on database / model types for each identifier type as well. Some places where this comes up:

OpequeKeyField vs CharField

UUIDField vs CharField vs BinaryField

How long to make the CharField for a code? Not that there's a single answer, but guidance might be useful

Great idea, I will do that. My proposal is:

OpaqueKeyField for keys, always.

UUIDField for UUIDs, always

For codes, default to 255 chars,[A-Za-z0-9_\-\.], case sensitive. We could define this as a utility class somewhere.

Let me know if you disagree, or if you know of exceptions to (1) and (2). cc @ormsbee

That proposal looks good to me. I have a suspicion that 1 & 2 are somewhat inconsistent currently, but nothing to back that up aside from a memory of being irritated by it at some point. It could have been in an archived IDA or something.

bradenmacdonald · 2026-03-24T01:21:37Z

Looks great! Thanks so much for taking this on.

Do we have a clear example of something that "needs" to be a UUID, to illustrate it?
I think it's worth mentioning that UUIDs are much less common[ly needed] that the other identifier types.

kdmccormick · 2026-03-24T19:17:17Z

I think it's worth mentioning that UUIDs are much less common[ly needed] that the other identifier types.

agreed

Do we have a clear example of something that "needs" to be a UUID, to illustrate it?

That's a great question. The immediate example that comes to mind is an event, but that's not really a piece of learning content. Do we have any examples which are closer to "content"? Maybe a certificate UUID?

Agrendalath · 2026-03-25T14:24:10Z

@kdmccormick

That's a great question. The immediate example that comes to mind is an event, but that's not really a piece of learning content. Do we have any examples which are closer to "content"? Maybe a certificate UUID?

A certificate is a good example. Another could be the LTI configuration, where we use UUID to avoid leaking the primary ID.

ormsbee

This is great work—thank you so much for doing it. Minor questions and suggestions, but nothing that I would consider blocking.

ormsbee · 2026-03-28T18:58:27Z

That's a great question. The immediate example that comes to mind is an event, but that's not really a piece of learning content. Do we have any examples which are closer to "content"? Maybe a certificate UUID?

The draft and publish logs are another example.

kdmccormick · 2026-04-06T19:17:18Z

@bmtcril @ormsbee @bradenmacdonald @Agrendalath Thanks for your reviews!

I've incorporated all feedback. Here's the diff w.r.t. the initial OEP: 31aaed6...110dc06 . I wouldn't mind another set of eyes on the "OpaueKeys" section which I've amended to explain that OpaqueKeys can sometimes have meaning outside of the instance.

rodmgwgu

Thanks for this proposal, this is great, I've been confused in the past with course_id's that are actually course keys, so this is really needed.

kdmccormick · 2026-04-08T15:24:48Z

+**How to name:** On Django models, the primary key is ``id`` and foreign keys automatically
+get the ``_id`` suffix (e.g. ``collection_id``). Outside of model definitions—in variable
+names, REST APIs, event schemas, and so on—use the suffix ``_pk`` instead (e.g.
+``collection_pk``). When accessing the primary key on a django model, prefer ``.pk``, e.g.
+``collection.pk``.  "id" is an overloaded term that means many things; ``_pk`` leaves no
+doubt that the value is a Django model integer primary key.


Unfortunately Braden's recent research shows that the name pk is in conflict with our desire to use typed model keys: openedx/openedx-core#534 (comment) . He suggested changing this guidance to recommend .id.

I'm happy switching back to .id for accessing the PK on the model object and for queries. Anyway, that's what 95% of our code does, so we'd have way less code to update.

What I'll miss about _pk is that it disambiguates non-model fields very well:

some_data = { "course_id": ..., # ambiguous :( "course_key": ..., # clearly an opaque key "course_pk": ..., # clearly a PK }

Maybe we could offer a new convention for ambiguous situations, while still respecting the _id suffix? Like, _db_id:

some_data = { "course_id": ..., # ambiguous :( "course_key": ..., # clearly an opaque key "course_db_id": ..., # clearly a PK }

Lmk your thoughts @ormsbee , @bradenmacdonald , et al

I'm okay with having vars named course_id an dealing with a bit of ambiguity, but I do strongly wish that we could keep model_instance.pk working somehow. It's a standard Django practice, and it's useful for 1:1 models like Component. I suppose we could modify PublishableEntityMixin so that the OneToOneField is named id, which would probably help. Right now, my_component.id would just error because the actual pk is at my_component.publishable_entity_id.

In my PR, I added a typed accessor so that you can use .id and it returns self.publishable_entity_id with the correctly-typed PK type.

I do strongly wish that we could keep model_instance.pk working somehow

Yeah. It works, but it has Any type, so it kind of bypasses all the advantages of typed primary keys if you use it.

If we want the best of both worlds (fully typed + using pk terminology), I think the only feasible option (working around django-stubs limitations) is to use modelname_pk on each model, e.g. container.container_pk, entity.entity_pk, other_entity.entity_pk. This lets us be fairly consistent across the board (except if we need to access the ID field in django ORM queries/joins, e.g. obj__entity_id derived from obj.entity.id). But it's a lot more verbose.

Or we go with id which is simple, concise, can be made fully-typed, but is ambiguous in old code.

Or something novel like db_id, but I'd rather avoid that.

I think I'm personally leaning toward just id/..._id everywhere as it's by far the most consistent in new code, and we can try to work on bringing the old code in line over time.

I think I'm personally leaning toward just id/..._id everywhere as it's by far the most consistent in new code, and we can try to work on bringing the old code in line over time.

I'm supportive of this as well. At the very least, we'll be able to make openedx-core fully consistent, and most platform code will naturally be consistent. course_id (a CourseKey) will be the major offender, but we can work that out over time.

@ormsbee @bradenmacdonald , I can make this OEP update, any objections?

Btw @bradenmacdonald , I'd love if we could add guidance to the OEP about typed primary keys, but I'm fine merging this OEP first and then updating it after your openedx-core PR merges.

Sounds good to me.

Btw @bradenmacdonald , I'd love if we could add guidance to the OEP about typed primary keys, but I'm fine merging this OEP first and then updating it after your openedx-core PR merges.

I'd like to see if we can find any ways to improve this, as adding two id_fields.py[i] files to each app is slightly more overhead than I'd like. But if we only do it for "major" models, I think it's still a good approach.

Btw @bradenmacdonald , I'd love if we could add guidance to the OEP about typed primary keys, but I'm fine merging this OEP first and then updating it after your openedx-core PR merges.

I do have a recommendation on this now. Let me know what you think: openedx/openedx-core#534 (comment)

Co-authored-by: Claude <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kdmccormick · 2026-04-13T14:59:50Z

@bmtcril @Agrendalath @rodmgwgu @ormsbee @bradenmacdonald , based on the latest discussion, I've pushed a commit to change it so that we'll use _id instead of _pk for database primary keys.

Please let me know before Wednesday morning (15 April) if you have concerns or if you'd like more time to re-review! cc @sarina .

kdmccormick changed the title ~~docs: Naming Identifiers~~ docs: OEP-68 Naming Identifiers Mar 18, 2026

robrap reviewed Mar 18, 2026

View reviewed changes

Comment thread oeps/best-practices/oep-0068-bp-naming-identifiers.rst Outdated

kdmccormick changed the title ~~docs: OEP-68 Naming Identifiers~~ docs: OEP-68 Identifiers Mar 20, 2026

kdmccormick force-pushed the kdmccormick/keys branch from 7e19a86 to aaa05fc Compare March 20, 2026 20:42

kdmccormick changed the title ~~docs: OEP-68 Identifiers~~ docs: OEP-68 Learning Content Identifiers Mar 20, 2026

sarina reviewed Mar 23, 2026

View reviewed changes