Update 'utils' from PHP to TypeScript#280
Conversation
So that's what 'dep' stands for.
|
An inlined comment pointed out that the
So that seems to be fine. The careful observer might have noticed that this PR now removes 19.160 lines. I have no clue what The validation via Another inlined comment pointed out the This should not be part of the state that is eventually merged. I think that one way of dealing with this would be:
Any thoughts on that...? |
|
Small addendum to the last point from the previous comment: Such a "one-time-cleanup pass" could also fix #281 . The current state contains some "debug" function to print these cases. Of course, that will be removed before marking this PR as ready, but only when the intended workflow for that one-time-cleanup is sorted out. |
|
My personal preference would be to split changes into more targeted PRs, e.g., scripts vs data as appropriate, stacked on top of each other or opened sequentially. At the very least, obvious and trivial one-time cleanups (removal of That said, other maintainers may have different preferences and, ultimately, the goal is to eventually land all changes, so an all-in-one PR that brings the repo to a new state is great too. Up to you how to organize PR's commit history (it can be rewritten if desired). |
|
It's not clear know how that file ended up here, but the step of removing it is just part of the update for the By the way, I just saw #277. Did anybody care about that? No. Should it have been merged by now? Yes. However, I'll try to align it with the changes here. |
I suppose there are only two ways of handling this: either merging #277 and resyncing this PR afterwards or closing #277 and incorporating its changes here. As this PR's name suggests that it's about rewriting scripts, merging #277 first would be a bit more semantically accurate. Anyway, please feel free to structure this work as you see fit. Getting it merged is more important than planning details. |
Merging that other PR does not make sense: As soon as it is merged and the PHP script runs, it will build a new But that other PR is pretty much a strict subset of what is done here: Removing |
|
Addressing the points from #277
The remaining points from that PR...
had been done here as well, as part of the overall cleanup of the license namings. So I think that this PR will make the PR #277 obsolete now. Previews:
|
|
I'll mark this as "Ready For Review" now (even though I expect some minor cleanups and comments to come in during the review). EDIT: This is obsolete now: The CI integration of the REUSE check is drafted in #282 Note Before this is actually merged, the |
|
A very specific technical aside: The These are inserted automatically. I don't know what the intention was behind that EDIT: When I said that this would be a 'breaking change', then this referred to things like https://github.khronos.org/glTF-Assets/model/AnimatedTriangle , which already does display this 'icon' at the bottom, and one has to assume that the URL for that is pulled from the metadata. So it can not simply be removed... |
|
Recreational saturday evening coding: The new structure would allow some additional functionalities more easily. For example, it took only a few minutes to extend the functionality of creating 'listings' to other categories - specifically, the extensions that the models are using. A preview: A listing of all models using But of course, at some point, it does not make sense to maintain these as MD files here. The https://github.khronos.org/glTF-Assets/ website (or https://asset-explorer.needle.tools/ ) may be more suitable for all this. |
| if (fileLowercase.endsWith(".glb")) { | ||
| if (!file.endsWith(".glb")) { | ||
| warnings.push( | ||
| `File does not have an all-lowercase extension: ${file}` | ||
| ); | ||
| } | ||
| variants[name] = file; | ||
| foundFile = true; | ||
| break; | ||
| } | ||
| if (fileLowercase.endsWith(".gltf")) { | ||
| if (!file.endsWith(".gltf")) { | ||
| warnings.push( | ||
| `File does not have an all-lowercase extension: ${file}` | ||
| ); | ||
| } | ||
| variants[name] = file; | ||
| foundFile = true; | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
Please consider deduplicating this section as .endsWith(".glb") and .endsWith(".gltf") can't be true simultaneously.
There was a problem hiding this comment.
Done. That else was actually there in the original code.
Also a bit more generally:
I'd like that "variants search" to be "better" in some waysl. But we currently don't have a strict enough definition of what 'variants' are. I tried to summarize a bit of that in the updated CONTRIBUTING.md. But there are oddly specific ones like glTF-Binary-KTX-ETC1S-Draco that are not mentioned there explicitly.
Nearly a detail: The current check does not complain about a glTF-Binary subdirectory containing a .gltf file and such.
We should probably consider to define an accepted set of 'variants' somewhere, maybe even including the expected extension per variant (e.g. glTF-Binary: .glb). That would likely mean to rename some of the existing ones, and be more strict to disallow things like glTF-ReallyObscureName.
Maybe also worth escalating into an issue...?
There was a problem hiding this comment.
That else was actually there in the original code.
I meant deduplicating code blocks themselves as they are identical except for the conditions. That surely could be done later.
We should probably consider to define an accepted set of 'variants' somewhere
For starters, we should require standard names for standard variants:
glTFandglTF-Embedded:.gltfglTF-Binary:.glb
All others could be ignored for now.
There was a problem hiding this comment.
This was added via eb8672e (only a slight improvement, but maybe OK for now)
| - Each model must pass the [glTF-Validator](https://github.khronos.org/glTF-Validator/). | ||
| - Each model must have an associated `README.md` markdown file that describes the model and the features of the model that make it appropriate of this repo | ||
| - Each model must have an associated `metadata.json` file that includes legal information (ownership, copyright, and license) | ||
| - Each model must have a properly formatted screenshot |
There was a problem hiding this comment.
Maybe explain "properly formatted" here or elsewhere; the dedicated section below does not contain any specifics either.
There was a problem hiding this comment.
(I'll close the fixed typo ones and push the update soon. Some of the text (e.g. 'appropritate of') was taken from the original state, but a cleanup pass should clean up this as well)
The "properly formatted" probably referred to the size of the screenshot. The current state mentions that the screenshot should be at most 150 pixels wide, but I think that's a bit too small. I considered to add some details about the screenshots having ~"an aspect ratio of roughly 1.0". But there are some broader questions around all of that in #49 . If there is any preference (e.g. allowing widths of 150...500, and using a fixed width of 250px in the table or so), then I could add this here as well.
| - `summary`: A short summary of the model, to be displayed in tables and overviews. It should usually be a single sentence of short paragraph, and _not_ a full description of the model. For example, a summary might be `"A simple triangle with a rotation animation"`. | ||
| - `screenshot`: The path to a screenshot that should be displayed for the model, suitable for being displayed in an overview table. | ||
| - `tags`: An array of tags that are used for classifying the model. See the [Tags](#tags) section for the set of tags that are currently supported. | ||
| - `createReadme`: Whether a `README.md` file should be generated automatically for the model. This should usually be `true`. |
There was a problem hiding this comment.
This is taken from the PHP state. The only reason for this being false is probably when someone wants to create the whole README.md manually, and not let it be auto-generated from the screenshot and the README.body.md.
I'd actually lean towards removing this createReadme flag completely, and require the README.body.md, to ensure some consistency between the READMEs, and avoid that "None provided" that is/was inserted by default (even though this model does have a body file - there are some inconsistencies here...)
Anyone opposed to ...
- removing the
createReadmeflag - requiring the
README.body.md
?
There was a problem hiding this comment.
removing the
createReadmeflag
+1; the main README.md should always be auto-generated.
requiring the
README.body.md
+1; if there's absolutely nothing to say about the asset, then it should not be added.
| ``` | ||
|
|
||
| - `license`: An identifier for the license, e.g. `"LicenseRef-LegalMark-Khronos"` | ||
| - `licenseUrl`: A link to the license text in the `LICENSES/` directory of the repository. The license text must be a file with the naming pattern _`<license>`_`.txt`. For example, such a license URL could be `"../../LICENSES/LicenseRef-LegalMark-Khronos.txt"`. |
There was a problem hiding this comment.
Isn't this field redundant? If the directory is fixed and the file name must match the license identifier, it would be better to omit it and enforce correct names implicitly (by checking that the file exists).
There was a problem hiding this comment.
Again, there are some legacy aspects inherited from the PHP here.
The current state contains
"spdx": "LicenseRef-LegalMark-Khronos"
in metadata files, which I'd consider to be undesirable - it's simply not an SPDX identifier. My intention to clean this up somehow became obsolete: Apparently, the REUSE.toml contains a SPDX-License-Identifier that has to be a string that is not an SPDX identifier - ouch 🤕
The current intention is that:
- the
licensecan either be a known SPDX identifier, or thatLicenseRef...string - if it is a known SPDX identifier, then the associated fields (
licenseUrl,spdx,icon) are filled out automatically - if it is a custom license name, then there must be the
LicenseUrl. Apparently, the new REUSE policy is that there must be that 1-to-1 correspondence between thatlicensename, and thelicenseUrlpointing to theLICENSES/....txtfile. So yes, this could probably be omitted.
Checking that this file exists is a TODO in the current state. This existence check is also done by the reuse lint step. But it could be better to catch this earlier in the process (i.e. at the place marked with that TODO).
I'll try to remove the licenseUrl and check the validity earlier. (There currently are some places with a wrong ../-nesting, so I have to review that anyhow)
Previous state
The 'utils' directory contained two PHP scripts,
model.phpandmodelmetadata.php. The purpose of these scripts was to create and update several files in the repository. For example:model-index.jsonthat lists all models and their tags and variantsModel-showcase.mdthat shows all models with the 'showcase' tagREADME.mdfileLICENSE.mdfilemetadata.jsonfileThe
metadata.jsonfile was also the input for all of this, and there have been a few functionalities that appeared to be related to various forms of "updating" the license information (that have been a bit hard to follow, admittedly).New state
The (relevant?) functionality of these scripts is rewritten in TypeScript here. This includes the introduction of structures/types for ... "various things", extensive documentation, and (what I hope to be) clearer processes.
There is a
main.tsthat only parses the command line arguments.Some of the command line arguments had been marked as obsolete, or they have been "smeared" through the code in a form that made it hard to parse (~40 appearances of
dry-runordryRun...). Once the command lines are parsed, they are passed to the only "public function", namelyModelMetadata.processThe
ModelMetadataclass summarizes the main, actual tasks.It collects the models that should be processed (usually, all models from the
Modelssubdirectory). These are collected asModelobjects (details below). Once the models are collected, several functions are called:createListings, which creates the list files that summarize models by their tags, e.g.Models-showcase.mdcreateModelIndex, which created themodel-index.jsonfilecreateReuseLicensewhich creates thedep5-file with license information about all modelsupdateAllModelsFiles: This updates theREADME.md,LICENSE.md, andmetadata.jsonof all modelsThe
Modelclass summarizes information about "one model"This class contains, typed, documented information. The main goal was to avoid lines like
$this->metadata['createReadme'] = (isset($this->metadata['AutoGenerateREADME'])) ? $this->metadata['AutoGenerateREADME'] && $this->metadata['createReadme'] : $this->metadata['createReadme'];and to have a more clearly encapsulated, consistent functionality. This does require a certain level of validation. Therefore...:
The
Modelsclass is used for creatingModelobjects.This class takes the data that was read from the
metadata.jsonof the models, and performs several levels of validation:legalarray of the metadata JSON are presentOnly when all this succeeds, it creates an actual, valid
Model. (Otherwise, it collects any errors/warnings, so that the caller can deal with that).Changes to licenses
No, this is not what it sounds like. The point is that the previous state defined an array of licenses, and there have been quite a few inconsistencies in that.
For example, the so-called
spdxidentifiers have not been valid SPDX identifiers. Some of the license links have been plainly wrong. I don't know where this CC0-1.0 link is coming from, but it redirects to some random blog post. Also, there is no estabished license that is called "Public Domain". And there is no difference between "CC-BY International 4.0" and "CC-BY-4.0". Long story short: There are now exactly (EDIT:) four licenses left. When ametadata.jsoncontains a license where thelicenseis not a valid, known SPDX identifier, then it is assumed to be a "custom" license, meaning that it must have alicenseUrlthat points to one of the licenses in this repository.(The current state here contains a few quirks to "canonicalize" and "clean up" the license information in the existing
metadata.jsonfiles...)Review
I know, this is a big change, coming out of the blue (or not so, because we talked about some of that earlier). So I assume that not everybody will be willing or able to review this just so. Nevertheless, I'll tag a few people as potential reviewers here:
@DRx3D @emackey @lexaknyazev
The process for running this state locally should be:
cd utilsnpm installnpm run checknpm run updateThis will check/update all files locally. And the result may look faaar worse than it actually is: It will mark (nearly) all files as "modified", but nearly all these modifications are only the result of canonicalizing the license names to their proper SPDX names. For the other files, I tried to keep the structure and layout the same as before.