Update 'utils' from PHP to TypeScript by javagl · Pull Request #280 · KhronosGroup/glTF-Sample-Assets

javagl · 2026-05-21T20:29:30Z

Previous state

The 'utils' directory contained two PHP scripts, model.php and modelmetadata.php. The purpose of these scripts was to create and update several files in the repository. For example:

The model-index.json that lists all models and their tags and variants
The "listing" files like Model-showcase.md that shows all models with the 'showcase' tag
The 'dep5' file that contains the repo-wide license information
For each model:
- The README.md file
- The LICENSE.md file
- The metadata.json file

The metadata.json file was also the input for all of this, and there have been a few functionalities that appeared to be related to various forms of "updating" the license information (that have been a bit hard to follow, admittedly).

New state

The (relevant?) functionality of these scripts is rewritten in TypeScript here. This includes the introduction of structures/types for ... "various things", extensive documentation, and (what I hope to be) clearer processes.

There is a `main.ts` that only parses the command line arguments.

Some of the command line arguments had been marked as obsolete, or they have been "smeared" through the code in a form that made it hard to parse (~40 appearances of dry-run or dryRun...). Once the command lines are parsed, they are passed to the only "public function", namely ModelMetadata.process

The `ModelMetadata` class summarizes the main, actual tasks.

It collects the models that should be processed (usually, all models from the Models subdirectory). These are collected as Model objects (details below). Once the models are collected, several functions are called:

createListings, which creates the list files that summarize models by their tags, e.g. Models-showcase.md
createModelIndex, which created the model-index.json file
createReuseLicense which creates the dep5-file with license information about all models
Optional: updateAllModelsFiles: This updates the README.md, LICENSE.md, and metadata.json of all models

The `Model` class summarizes information about "one model"

This class contains, typed, documented information. The main goal was to avoid lines like
$this->metadata['createReadme'] = (isset($this->metadata['AutoGenerateREADME'])) ? $this->metadata['AutoGenerateREADME'] && $this->metadata['createReadme'] : $this->metadata['createReadme'];
and to have a more clearly encapsulated, consistent functionality. This does require a certain level of validation. Therefore...:

The `Models` class is used for creating `Model` objects.

This class takes the data that was read from the metadata.json of the models, and performs several levels of validation:

It checks that all required fields in the metadata JSON are present
It checks that all required fields in the legal array of the metadata JSON are present
It checks that the metadata is valid
It checks that all 'legal' objects are valid

Only when all this succeeds, it creates an actual, valid Model. (Otherwise, it collects any errors/warnings, so that the caller can deal with that).

Changes to licenses

No, this is not what it sounds like. The point is that the previous state defined an array of licenses, and there have been quite a few inconsistencies in that.

For example, the so-called spdx identifiers have not been valid SPDX identifiers. Some of the license links have been plainly wrong. I don't know where this CC0-1.0 link is coming from, but it redirects to some random blog post. Also, there is no estabished license that is called "Public Domain". And there is no difference between "CC-BY International 4.0" and "CC-BY-4.0". Long story short: There are now exactly (EDIT:) four licenses left. When a metadata.json contains a license where the license is not a valid, known SPDX identifier, then it is assumed to be a "custom" license, meaning that it must have a licenseUrl that points to one of the licenses in this repository.

(The current state here contains a few quirks to "canonicalize" and "clean up" the license information in the existing metadata.json files...)

Review

I know, this is a big change, coming out of the blue (or not so, because we talked about some of that earlier). So I assume that not everybody will be willing or able to review this just so. Nevertheless, I'll tag a few people as potential reviewers here:

@DRx3D @emackey @lexaknyazev

The process for running this state locally should be:

cd utils
npm install
npm run check
npm run update

This will check/update all files locally. And the result may look faaar worse than it actually is: It will mark (nearly) all files as "modified", but nearly all these modifications are only the result of canonicalizing the license names to their proper SPDX names. For the other files, I tried to keep the structure and layout the same as before.

So that's what 'dep' stands for.

javagl · 2026-05-22T19:22:10Z

An inlined comment pointed out that the ./reuse/dep5 file is deprecated, and this should be a REUSE.toml in the repository root. After a bit of back and forth, this is updated here accordingly: The process now writes out a new REUSE.toml file during the update. I also cleaned up the LICENSES directory accordingly, and now, running
reuse lint
locally after the update prints:

SUMMARY

Bad licenses: 0

Deprecated licenses: 0

Licenses without file extension: 0

Missing licenses: 0

Unused licenses: 0

Used licenses: CC-BY-4.0, CC-BY-NC-4.0, CC-BY-NC-SA-4.0, CC0-1.0, LicenseRef-3DRT-Testing, LicenseRef-Adobe-Stock, LicenseRef-CRYENGINE-Agreement, LicenseRef-LegalMark-Cesium, LicenseRef-LegalMark-DGG, LicenseRef-LegalMark-Khronos, LicenseRef-LegalMark-UX3D, LicenseRef-Poser-EULA, LicenseRef-Stanford-Graphics, SCEA

Read errors: 0

Invalid SPDX License Expressions: 0

Files with copyright information: 2317 / 2317

Files with license information: 2317 / 2317

Congratulations! Your project is compliant with version 3.3 of the REUSE Specification :-)

So that seems to be fine.

The careful observer might have noticed that this PR now removes 19.160 lines. I have no clue what reuse.spdx was, but am 100% sure that it should either not have been there at all, or at least, not in the given form. You know, with these random parts of .gltf files and such.

The validation via reuse lint could (and probably should) be part of the CI one day, but I think that this could be addressed in a dedicated PR.

Another inlined comment pointed out the updateLegacyLicenseStructure function that is part of the current state. This step essentially just updates all metadata.json files to use proper, canonical SPDX identifiers.

This should not be part of the state that is eventually merged. I think that one way of dealing with this would be:

Run the update locally
Commit the updated state (basically updating all files in the repo, including the metadata.json files)
Remove that updateLegacyLicenseStructure function call
Commit the state without that function

Any thoughts on that...?

javagl · 2026-05-22T19:45:46Z

Small addendum to the last point from the previous comment: Such a "one-time-cleanup pass" could also fix #281 . The current state contains some "debug" function to print these cases. Of course, that will be removed before marking this PR as ready, but only when the intended workflow for that one-time-cleanup is sorted out.

lexaknyazev · 2026-05-22T21:51:34Z

My personal preference would be to split changes into more targeted PRs, e.g., scripts vs data as appropriate, stacked on top of each other or opened sequentially. At the very least, obvious and trivial one-time cleanups (removal of reuse.spdx in particular) could be merged separately in advance to not inflate PRs that contain changes for reviews.

That said, other maintainers may have different preferences and, ultimately, the goal is to eventually land all changes, so an all-in-one PR that brings the repo to a new state is great too. Up to you how to organize PR's commit history (it can be rewritten if desired).

javagl · 2026-05-22T22:56:24Z

It's not clear know how that file ended up here, but the step of removing it is just part of the update for the dep5-recation. It could probably have been done at any time, as a 1-commit-PR. There could also be a dedicated PR for #281. And a dedicated PR for that unused test tag. And one for the newly added pbrtest. We could set up a project board, and have planning meetings, and set up a task force, and define a roadmap, but I'll leave that to the managers. I'm just trying to (spend my spare time to) get some overdue cleanup work done here.

By the way, I just saw #277. Did anybody care about that? No. Should it have been merged by now? Yes. However, I'll try to align it with the changes here.

lexaknyazev · 2026-05-22T23:19:10Z

By the way, I just saw #277. Did anybody care about that? No. Should it have been merged by now? Yes. However, I'll try to align it with the changes here.

I suppose there are only two ways of handling this: either merging #277 and resyncing this PR afterwards or closing #277 and incorporating its changes here. As this PR's name suggests that it's about rewriting scripts, merging #277 first would be a bit more semantically accurate.

Anyway, please feel free to structure this work as you see fit. Getting it merged is more important than planning details.

javagl · 2026-05-23T12:33:44Z

merging #277 first would be a bit more semantically accurate.

Merging that other PR does not make sense: As soon as it is merged and the PHP script runs, it will build a new dep5 file, leading to a completely inconsistent state.

But that other PR is pretty much a strict subset of what is done here: Removing dep5, adding .toml, adding/fixing some licenses. So in the spirit of not trying to cross a chasm with two jumps, I'd add the other (minor) changes from that PR here (renaming the MD to CONTRIBUTING.md and such), and then, this PR should supersede the other one.

javagl · 2026-05-23T16:45:08Z

Addressing the points from #277

Migrate REUSE to current version
This was done here as well. In contrast to the other PR, this PR does not only remove the dep5 file and add the REUSE.toml file, but also updates the util-functionality to actually generate updated REUSE.toml files in CI
Renamed SubmittingModels.md -> CONTRIBUTING.md to fall in line with other repositories in KhronosGroup
This was done here as well. And while I'm at it, and because I have nothing else to do, I essentially re-wrote the CONTRIBUTING.md, trying to give it a structure that makes it easier to follow the contribution process:
- The document now lists the required files immediately: metadata.json, README.body.md, screenshot.
- It explains all properties in the metadata.json and its legal array, in a way that is hopefully clearer and easier to fill out.
- I also added some information about the 'variants' (even though I noticed that the variant names have turned a bit arbitrary...).
- I added an appendix, listing the valid 'tags' and licenses (this wasn't really made clear before that)
- Direct preview for reviewers: CONTRIBUTING.md
Added AI Contribution text to CONTRIBUTING.md
Yeah, added an appendix for that as well 🤷

The remaining points from that PR...

Fixed typos breaking REUSE lint (CC BY-SA-NC 4.0 -> CC-BT-SA-NC-4.0)
Added CC-BT-SA-NC-4.0 license
Add missing DGG license, using standard used from other 3rd party model licenses

had been done here as well, as part of the overall cleanup of the license namings.

So I think that this PR will make the PR #277 obsolete now.

Previews:

A preview for the result of updating all metadata.json files to their canonical form is in Preview: Update all metadata JSON files to canonical form javagl/glTF-Sample-Assets#1
A preview for the intended final state after merging this PR is in Preview: Update of auto-generated files javagl/glTF-Sample-Assets#2

javagl · 2026-05-23T16:49:51Z

I'll mark this as "Ready For Review" now (even though I expect some minor cleanups and comments to come in during the review).

EDIT: This is obsolete now:

The only "larger" task that I'd consider to be a remaining TODO would be to add the
reuse lint
step to the CI, to ensure that newly submitted models have proper (custom or SPDX) licensing. Some validation could be done by the utils here (e.g. checking for the presence of a .txt file in the LICENSES folder), but maybe that can be left to the 'reuse' linting.

The CI integration of the REUSE check is drafted in #282

Note Before this is actually merged, the updateLegacyLicenseStructure function should be removed. (It wouldn't do any harm, but should no longer be necessary in the future)

javagl · 2026-05-23T16:59:14Z

A very specific technical aside:

The CONTRIBUTING.md lists the properties that have to go into the metadata.legal entries. For SPDX licenses, this does not require the properties

"licenseUrl": "https://creativecommons.org/licenses/by/4.0/legalcode",
"spdx": "CC-BY-4.0",
"icon": "https://licensebuttons.net/l/by/4.0/88x31.png"

These are inserted automatically. I don't know what the intention was behind that legal structure, and why it includes that unused icon, for example. I think that there could/should be a structure of that legal that does not include that icon, but that would be a 'breaking change'. If people think that this should/could be omitted, I'm all in favor of making that metadata.json as lean and clean as possible.

EDIT: When I said that this would be a 'breaking change', then this referred to things like https://github.khronos.org/glTF-Assets/model/AnimatedTriangle , which already does display this 'icon' at the bottom, and one has to assume that the URL for that is pulled from the metadata. So it can not simply be removed...

javagl · 2026-05-23T19:47:09Z

Recreational saturday evening coding: The new structure would allow some additional functionalities more easily. For example, it took only a few minutes to extend the functionality of creating 'listings' to other categories - specifically, the extensions that the models are using. A preview: A listing of all models using EXT_materials_clearcoat (with links to all other extension listings).

But of course, at some point, it does not make sense to maintain these as MD files here. The https://github.khronos.org/glTF-Assets/ website (or https://asset-explorer.needle.tools/ ) may be more suitable for all this.

lexaknyazev · 2026-05-25T15:14:16Z

+          if (fileLowercase.endsWith(".glb")) {
+            if (!file.endsWith(".glb")) {
+              warnings.push(
+                `File does not have an all-lowercase extension: ${file}`
+              );
+            }
+            variants[name] = file;
+            foundFile = true;
+            break;
+          }
+          if (fileLowercase.endsWith(".gltf")) {
+            if (!file.endsWith(".gltf")) {
+              warnings.push(
+                `File does not have an all-lowercase extension: ${file}`
+              );
+            }
+            variants[name] = file;
+            foundFile = true;
+            break;
+          }
+        }


Please consider deduplicating this section as .endsWith(".glb") and .endsWith(".gltf") can't be true simultaneously.

Done. That else was actually there in the original code.

Also a bit more generally:

I'd like that "variants search" to be "better" in some waysl. But we currently don't have a strict enough definition of what 'variants' are. I tried to summarize a bit of that in the updated CONTRIBUTING.md. But there are oddly specific ones like glTF-Binary-KTX-ETC1S-Draco that are not mentioned there explicitly.

Nearly a detail: The current check does not complain about a glTF-Binary subdirectory containing a .gltf file and such.

We should probably consider to define an accepted set of 'variants' somewhere, maybe even including the expected extension per variant (e.g. glTF-Binary: .glb). That would likely mean to rename some of the existing ones, and be more strict to disallow things like glTF-ReallyObscureName.

Maybe also worth escalating into an issue...?

That else was actually there in the original code.

I meant deduplicating code blocks themselves as they are identical except for the conditions. That surely could be done later.

We should probably consider to define an accepted set of 'variants' somewhere

For starters, we should require standard names for standard variants:

glTF and glTF-Embedded: .gltf

glTF-Binary: .glb

All others could be ignored for now.

This was added via eb8672e (only a slight improvement, but maybe OK for now)

lexaknyazev · 2026-05-25T15:35:23Z

+- Each model must pass the [glTF-Validator](https://github.khronos.org/glTF-Validator/).
+- Each model must have an associated `README.md` markdown file that describes the model and the features of the model that make it appropriate of this repo
+- Each model must have an associated `metadata.json` file that includes legal information (ownership, copyright, and license) 
+- Each model must have a properly formatted screenshot


Maybe explain "properly formatted" here or elsewhere; the dedicated section below does not contain any specifics either.

(I'll close the fixed typo ones and push the update soon. Some of the text (e.g. 'appropritate of') was taken from the original state, but a cleanup pass should clean up this as well)

The "properly formatted" probably referred to the size of the screenshot. The current state mentions that the screenshot should be at most 150 pixels wide, but I think that's a bit too small. I considered to add some details about the screenshots having ~"an aspect ratio of roughly 1.0". But there are some broader questions around all of that in #49 . If there is any preference (e.g. allowing widths of 150...500, and using a fixed width of 250px in the table or so), then I could add this here as well.

lexaknyazev · 2026-05-25T15:37:11Z

+- `summary`: A short summary of the model, to be displayed in tables and overviews. It should usually be a single sentence of short paragraph, and _not_ a full description of the model. For example, a summary might be `"A simple triangle with a rotation animation"`.
+- `screenshot`: The path to a screenshot that should be displayed for the model, suitable for being displayed in an overview table. 
+- `tags`: An array of tags that are used for classifying the model. See the [Tags](#tags) section for the set of tags that are currently supported.
+- `createReadme`: Whether a `README.md` file should be generated automatically for the model. This should usually be `true`. 


When would this be false?

This is taken from the PHP state. The only reason for this being false is probably when someone wants to create the whole README.md manually, and not let it be auto-generated from the screenshot and the README.body.md.

I'd actually lean towards removing this createReadme flag completely, and require the README.body.md, to ensure some consistency between the READMEs, and avoid that "None provided" that is/was inserted by default (even though this model does have a body file - there are some inconsistencies here...)

Anyone opposed to ...

removing the createReadme flag

requiring the README.body.md

?

removing the createReadme flag

+1; the main README.md should always be auto-generated.

requiring the README.body.md

+1; if there's absolutely nothing to say about the asset, then it should not be added.

This was addressed in 8743f37

lexaknyazev · 2026-05-25T15:40:07Z

+```
+
+- `license`: An identifier for the license, e.g. `"LicenseRef-LegalMark-Khronos"`
+- `licenseUrl`: A link to the license text in the `LICENSES/` directory of the repository. The license text must be a file with the naming pattern _`<license>`_`.txt`. For example, such a license URL could be `"../../LICENSES/LicenseRef-LegalMark-Khronos.txt"`.


Isn't this field redundant? If the directory is fixed and the file name must match the license identifier, it would be better to omit it and enforce correct names implicitly (by checking that the file exists).

Again, there are some legacy aspects inherited from the PHP here.

The current state contains
"spdx": "LicenseRef-LegalMark-Khronos"
in metadata files, which I'd consider to be undesirable - it's simply not an SPDX identifier. My intention to clean this up somehow became obsolete: Apparently, the REUSE.toml contains a SPDX-License-Identifier that has to be a string that is not an SPDX identifier - ouch 🤕

The current intention is that:

the license can either be a known SPDX identifier, or that LicenseRef... string

if it is a known SPDX identifier, then the associated fields (licenseUrl, spdx, icon) are filled out automatically

if it is a custom license name, then there must be the LicenseUrl. Apparently, the new REUSE policy is that there must be that 1-to-1 correspondence between that license name, and the licenseUrl pointing to the LICENSES/....txt file. So yes, this could probably be omitted.

Checking that this file exists is a TODO in the current state. This existence check is also done by the reuse lint step. But it could be better to catch this earlier in the process (i.e. at the place marked with that TODO).

I'll try to remove the licenseUrl and check the validity earlier. (There currently are some places with a wrong ../-nesting, so I have to review that anyhow)

This was addressed in b7a00dc

javagl added 5 commits May 21, 2026 21:29

Add new utils

2f0a0fc

Remove old utils

6445dab

Minor update to model submission info

eb453f8

Add back dry-run option

64f7904

Add 'pbrtest' listing. Minor cleanups and tests.

f810edd

lexaknyazev reviewed May 22, 2026

View reviewed changes

javagl added 15 commits May 22, 2026 15:32

Prepare CI, cleanups

12e97f4

Minor cleanups

898b525

Prepare CI with package JSON scripts

bbcf897

Update CI for npm

320da35

Set working directory for CI

d5b7c44

Remove lowercase arguments

6d28fd0

Variants checking with issue report

9e76b79

Lowercase domain name, omit "V" in version number

636c246

Consistently encode URIs

90f11af

Consistent layout for used/required extensions

058e841

Use current year for copyright

22a1733

Omit "V" from version number

b073849

Add and update licenses, remove unused ones

c0d7f7b

Remove deprecated dep5 file.

675fd9f

So that's what 'dep' stands for.

Update to create REUSE.toml file

21dfe89

javagl added 2 commits May 22, 2026 21:34

Write in UTF8

78dfede

Use URL for GLB download

8a83839

Remove obsolete information from REUSE.toml

82a9a92

javagl added 2 commits May 23, 2026 18:13

Update contribution guidelines

8f1d5a8

Temporary: Print variants information

e189294

This was referenced May 23, 2026

Preview: Update all metadata JSON files to canonical form javagl/glTF-Sample-Assets#1

Draft

Preview: Update of auto-generated files javagl/glTF-Sample-Assets#2

Draft

javagl mentioned this pull request May 23, 2026

Update REUSE and Licenses #277

Open

javagl marked this pull request as ready for review May 23, 2026 16:50

Fixed what appears to be a typo

3c2b590

javagl mentioned this pull request May 24, 2026

Add REUSE compliance check to CI #282

Draft

Minor fix to handle missing glTF file

a61a488

lexaknyazev reviewed May 25, 2026

View reviewed changes

Comment thread README.md Outdated

javagl added 2 commits May 25, 2026 17:32

Deduplicate file name check. Minor formatting.

b8cb1ce

Update copyright year

18b01be

lexaknyazev reviewed May 25, 2026

View reviewed changes

javagl added 7 commits May 25, 2026 18:07

Fix typos in CONTRIBUTING.md

6043756

More consistent handling of licenseUrl

b7a00dc

Omit 'createReadme' flag. Update CONTRIBUTING.md.

8743f37

Slightly stricter variants search

eb8672e

Fixed typo in comment

10c0d43

Minor cleanup to use 'Issues' type

35af3de

Expect lowercase file extensions

49de53f

Conversation

javagl commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Previous state

New state

There is a main.ts that only parses the command line arguments.

The ModelMetadata class summarizes the main, actual tasks.

The Model class summarizes information about "one model"

The Models class is used for creating Model objects.

Changes to licenses

Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

javagl commented May 22, 2026

SUMMARY

Uh oh!

javagl commented May 22, 2026

Uh oh!

lexaknyazev commented May 22, 2026

Uh oh!

javagl commented May 22, 2026

Uh oh!

lexaknyazev commented May 22, 2026

Uh oh!

javagl commented May 23, 2026

Uh oh!

javagl commented May 23, 2026

Previews:

Uh oh!

javagl commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

javagl commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

javagl commented May 23, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

javagl May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

javagl May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

javagl commented May 21, 2026 •

edited

Loading

There is a `main.ts` that only parses the command line arguments.

The `ModelMetadata` class summarizes the main, actual tasks.

The `Model` class summarizes information about "one model"

The `Models` class is used for creating `Model` objects.

javagl commented May 23, 2026 •

edited

Loading

javagl commented May 23, 2026 •

edited

Loading

javagl May 25, 2026 •

edited

Loading

javagl May 25, 2026 •

edited

Loading