Skip to content

Conversation

@monsieurswag
Copy link
Contributor

@monsieurswag monsieurswag commented Nov 19, 2025

Note for myself: Failing tests has to be fixed:

  • The problem can likely be reproduced by creating a new DB from scratch
  • The bug is surely caused by importing the core/signals.py file which imports the Framework model (which may not be created in the database since it may this file may be imported and therefore executed before the Framework table creation migration was applied).

What this PR does:

  • Load custom libraries on upload.
  • Update custom libraries on updated version upload.
  • Automatically update libraries with autoload=True at startup.
  • Refresh the MappingEngine in-memory requirement mapping set list when a requirement mapping set is added/updated/deleted.
  • Refresh the MappingEngine in-memory (frame.urn => framework score) dict when a framework is added/updated/deleted.

Using signals to invalidate the MappingEngine cache gives the ability to the cache to react to on_delete=models.CASCADE originated deletions.
It's also more convenient for transaction support as the post_save and post_delete signals are only fired when a transaction executes successfully.

Summary by CodeRabbit

  • New Features

    • Added automatic update mechanism for auto-loaded libraries to load and refresh library content.
  • Improvements

    • Enhanced cache invalidation when framework and mapping data changes, improving performance consistency.
    • Improved error handling and logging for library update operations.
  • Changes

    • Removed framework comparison restriction in library mappings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 19, 2025

Walkthrough

The pull request refactors library storage and autoloading mechanisms. StoredLibrary.store_library_content() now returns tuples indicating newly stored and outdated libraries. Signal handlers are added for cache invalidation on model changes. Library autoload logic is updated with version comparison, and associated views and management commands are adjusted accordingly.

Changes

Cohort / File(s) Summary
Signal Registration & Cache Invalidation
backend/core/apps.py, backend/core/signals.py
CoreConfig.ready() now imports core.signals to activate signal handlers. New signals.py file introduces post_save/post_delete receivers for RequirementMappingSet and Framework that invalidate engine caches by calling load_rms_data() and load_frameworks().
Core Model API Changes
backend/core/models.py
StoredLibrary.store_library_content() return type changed from single value to Tuple of (library, outdated_library) or (None, None). Added autoload determination logic based on library_data structure. Removed StoredLibrary.save() override that triggered engine loading on commit. Removed RequirementMappingSet validation restricting identical source/target frameworks. Type hints updated to include Tuple.
Engine Cleanup
backend/core/mappings/engine.py
Removed unused import of sizeof from ctypes module.
Library Autoload Management
backend/library/management/commands/autoloadlibraries.py
Added imports for Subquery, OuterRef, Max aggregation, and LoadedLibrary model. Command help text updated to indicate "Load/Update" behavior. New update pass compares LoadedLibrary versions against StoredLibrary max versions where autoload is True, iterating over libs_to_update and calling lib.update() with error handling.
Library Views
backend/library/views.py
Introduced private _import_library() method extracting core import logic. Public import_library() now wraps _import_library(). upload_library() extended to handle two-value tuple return from store_library_content() with update/no-update path logic. Added Optional type import.
Test Updates
backend/library/tests/test_store_library_content.py
Updated test assertions to unpack two-value tuple return from store_library_content() into library and dummy variable.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller
    participant signals as Signal Handlers
    participant engine as Engine Cache
    participant db as Database

    autonumber
    
    Caller->>db: post_save(RequirementMappingSet) or post_delete
    db->>signals: Trigger invalidate_requirement_mapping_set_cache
    signals->>engine: engine.load_rms_data()
    engine->>db: Reload RMS data
    engine-->>signals: Cache updated
    
    Caller->>db: post_save(Framework) or post_delete
    db->>signals: Trigger invalidate_framework_cache
    signals->>engine: engine.load_frameworks()
    engine->>db: Reload frameworks
    engine-->>signals: Cache updated
Loading
sequenceDiagram
    participant Caller as Caller/View
    participant Models as Models
    participant AutoloadCmd as Autoload Command
    participant Loader as LoadedLibrary

    autonumber
    
    Caller->>Models: store_library_content(data)
    Models->>Models: Determine autoload flag from library_data
    Models-->>Caller: (new_library, outdated_library)
    
    AutoloadCmd->>Models: Query LoadedLibrary where autoload=True
    Models->>Models: Compare version vs max(StoredLibrary.version)
    Models-->>AutoloadCmd: libs_to_update
    
    AutoloadCmd->>Loader: lib.update() for each in libs_to_update
    Loader->>Models: Update LoadedLibrary with new version
    Models-->>Loader: Updated
    Loader-->>AutoloadCmd: Confirm/Error
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Return type change propagation: The tuple return from store_library_content() requires careful verification across multiple callers (views, tests, management commands) to ensure proper unpacking and handling.
  • Signal handler correctness: Verify that signal receivers in core/signals.py trigger at appropriate lifecycle points and don't cause circular invalidations or missing cache updates.
  • Autoload update logic: The new version-comparison query in autoloadlibraries.py using Subquery and OuterRef should be reviewed for correct aggregation and filtering semantics.
  • Removed constraints: Verification that removing the identical framework validation and the save() hook doesn't create unintended side effects in dependent systems.

Possibly related PRs

Suggested labels

High Value

Suggested reviewers

  • eric-intuitem
  • nas-tabchiche
  • ab-smith

Poem

🐰 A signal springs when frameworks change,
Cache refresh—no need to rearrange!
Libraries load, versions compare,
Outdated swept, new ones with care.
Tuples returned, workflows align—
Autoload magic, oh so fine! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main changes: automatically loading/updating libraries and improving mapping engine cache invalidation, which are the core objectives of the PR.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/auto_load_and_update_libs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/library/views.py (1)

287-323: Handle (None, None) result from store_library_content and re-examine upload RBAC

The new upload flow mostly aligns with the (library, outdated_library) API, but there are a couple of important edge cases:

  1. Bug when no new library is stored

StoredLibrary.store_library_content can legitimately return (None, None) (e.g., same checksum already stored, same or older version, etc.). In that case:

library, outdated_library = StoredLibrary.store_library_content(content)
# ...
return Response(
    StoredLibrarySerializer(library).data, status=HTTP_201_CREATED
)

Here library is None, so StoredLibrarySerializer(library).data will assert/raise and be caught only by the outer broad except, resulting in an "invalidLibraryFileError" instead of a clear “library already stored / no update” response.

You likely want an explicit guard before serializing:

-                library, outdated_library = StoredLibrary.store_library_content(content)
-
-                if library is not None:
+                library, outdated_library = StoredLibrary.store_library_content(content)
+
+                if library is None and outdated_library is None:
+                    # No new stored library; checksum/version already present.
+                    # Adjust status/body to whatever UX you prefer.
+                    return HttpResponse(
+                        json.dumps({"error": "libraryAlreadyStored"}),
+                        status=HTTP_400_BAD_REQUEST,
+                    )
+
+                if library is not None:
                     if not (_has_update := outdated_library is not None):
                         response = self._import_library(request, library.urn)
                         if response is not None:
                             return response
                     else:
                         if error_msg := outdated_library.update():
                             return Response(
                                 error_msg, status=status.HTTP_400_BAD_REQUEST
                             )

This avoids the crash and lets you send a specific, predictable outcome when nothing changes.

  1. RBAC semantics on upload vs import/update
  • The upload endpoint correctly checks add_storedlibrary up front.
  • When a new library is created and outdated_library is None, you unconditionally call _import_library, which requires add_loadedlibrary. If the caller lacks that permission, they get a 403 from _import_library, but the StoredLibrary row has already been created. That’s a bit surprising: a 403 response after a successful write.
  • When an existing loaded library is updated (outdated_library is not None), you call outdated_library.update() without checking add_loadedlibrary, effectively allowing any caller with add_storedlibrary to trigger a load/update.

It would be good to confirm whether this mixing of add_storedlibrary and add_loadedlibrary semantics is intentional. If not, options include:

  • Only auto‑import/auto‑update when the caller also has add_loadedlibrary, otherwise just store and return the new StoredLibrary.
  • Or treat upload as an “admin‑only” pathway that’s allowed to manage both stored and loaded libraries, and adjust permissions accordingly.
backend/core/models.py (1)

264-363: Fix outdated_loaded_library locale handling and align return contract docs

There are a few things worth tightening up in store_library_content:

  1. Potentially wrong LoadedLibrary selected (locale missing + arbitrary version)
    outdated_loaded_library is currently resolved only by urn and version__lt=version, so if you ever have multiple locales for the same URN you may:

    • Pick a LoadedLibrary with a different locale than the newly stored StoredLibrary.
    • Pick an arbitrary older version instead of the latest one below version.

    This can cause the wrong loaded library to be treated as “updatable” for this stored content. You almost certainly want to match the same locale and, if several versions exist, prefer the highest one below the new version.

    Consider updating the query like this:

  •    outdated_loaded_library = LoadedLibrary.objects.filter(
    
  •        urn=urn, version__lt=version
    
  •    ).first()
    
  •    outdated_loaded_library = (
    
  •        LoadedLibrary.objects.filter(
    
  •            urn=urn,
    
  •            locale=locale,
    
  •            version__lt=version,
    
  •        )
    
  •        .order_by("-version")
    
  •        .first()
    
  •    )
    
    
    
    
  1. Docstring still mentions False instead of None
    The docstring describes library as “otherwise False”, but all early‑exit paths now return (None, None). To avoid surprising callers:

  •    - `library`: The newly created `StoredLibrary` object on success, otherwise False
    
  •    - `library`: The newly created `StoredLibrary` object on success, otherwise None
    
    
    
    
  1. Optional readability improvement for return type
    Not blocking, but the return annotation could be made easier to read as a single tuple of optionals instead of a union of two Tuple shapes, e.g.:

    def store_library_content(
        cls, library_content: bytes, builtin: bool = False
    ) -> tuple["StoredLibrary | None", "LoadedLibrary | None"]:
        ...

    This would still express the (None, None) / (StoredLibrary, LoadedLibrary | None) contract cleanly.

🧹 Nitpick comments (4)
backend/library/tests/test_store_library_content.py (1)

140-155: Tests correctly adapted to tuple return from store_library_content

Unpacking (stored_library, _) matches the new (library, outdated_library) contract and the autoload assertions remain valid for the provided YAML samples. You might later add a dedicated test that asserts the outdated_library behavior when uploading a higher-version library, but that’s optional.

backend/core/signals.py (1)

1-15: Signal-driven cache invalidation works; review transaction semantics

Hooking post_save/post_delete on RequirementMappingSet and Framework to call engine.load_rms_data() / engine.load_frameworks() is a clean way to keep the in‑memory mappings in sync with model changes and satisfies the stated invalidation needs.

Be aware though:

  • In Django, post_save / post_delete are triggered when the ORM executes the operation, not strictly “after commit”. If you perform changes inside transaction.atomic blocks that later roll back, the engine may have loaded data that no longer exists in the DB. For strict transaction awareness, consider wrapping the engine calls in transaction.on_commit(...).
  • If you expect bulk updates or high write throughput on these models, repeatedly recomputing the full mappings on every row change could become expensive; batching or debouncing invalidations might be useful in that scenario.
backend/library/management/commands/autoloadlibraries.py (1)

5-8: Autoupdate query is correct; consider scoping by locale and tightening error handling

The autoload + autoupdate flow is implemented correctly: you first load missing autoloaded libraries, then select LoadedLibrary rows whose version is less than the max StoredLibrary.version for the same urn with autoload=True, and call lib.update() with appropriate logging.

Two refinements to consider:

  • Scope by locale: libs_to_update only constrains on urn. If you have multiple locales for the same URN, you’ll attempt updates for locales that may have no newer StoredLibrary row, resulting in update() returning "libraryHasNoUpdate" and an error log. Adding locale=OuterRef("locale") to the subquery filter would avoid needless work and clearer logs.
  • Differentiate expected “no update” from real failures: Right now every non‑None error_msg is logged as “Failed to automatically update library”. If "libraryHasNoUpdate" is a benign outcome, you might treat it separately from real errors, or avoid logging it at error level to keep logs actionable.

Also applies to: 15-44

backend/library/views.py (1)

228-264: Helper _import_library can return a plain Response and be called directly

The new _import_library helper cleanly centralizes permission checks, lookup, and load error handling, which is a nice refactor. Given the current implementation:

  • _import_library always returns a Response; there is no branch that yields None.
  • import_library only unwraps the helper and returns the value if not None.

You can simplify this to make the flow clearer:

  • Change the return annotation to -> Response.
  • Have import_library just return self._import_library(request, pk) without the if (response := ...) is not None guard.

This keeps behavior identical but tightens types and reduces indirection.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7957198 and a13c416.

📒 Files selected for processing (7)
  • backend/core/apps.py (1 hunks)
  • backend/core/mappings/engine.py (0 hunks)
  • backend/core/models.py (6 hunks)
  • backend/core/signals.py (1 hunks)
  • backend/library/management/commands/autoloadlibraries.py (2 hunks)
  • backend/library/tests/test_store_library_content.py (1 hunks)
  • backend/library/views.py (5 hunks)
💤 Files with no reviewable changes (1)
  • backend/core/mappings/engine.py
🧰 Additional context used
🧬 Code graph analysis (5)
backend/core/signals.py (3)
backend/core/models.py (2)
  • RequirementMappingSet (1754-1774)
  • Framework (1579-1660)
backend/core/views.py (1)
  • mappings (5558-5567)
backend/core/mappings/engine.py (2)
  • load_rms_data (57-92)
  • load_frameworks (94-100)
backend/library/tests/test_store_library_content.py (1)
backend/core/models.py (2)
  • StoredLibrary (243-385)
  • store_library_content (264-362)
backend/library/management/commands/autoloadlibraries.py (1)
backend/core/models.py (3)
  • LoadedLibrary (876-992)
  • StoredLibrary (243-385)
  • update (882-894)
backend/core/models.py (1)
backend/core/utils.py (1)
  • sha256 (37-41)
backend/library/views.py (2)
backend/iam/models.py (4)
  • RoleAssignment (807-1094)
  • is_access_allowed (837-857)
  • Folder (68-316)
  • get_root_folder (75-77)
backend/core/models.py (3)
  • StoredLibrary (243-385)
  • store_library_content (264-362)
  • update (882-894)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: startup-docker-compose-test
  • GitHub Check: enterprise-startup-functional-test (3.12)
  • GitHub Check: enterprise-startup-docker-compose-test
  • GitHub Check: build_community_frontend
  • GitHub Check: build_enterprise_frontend
🔇 Additional comments (3)
backend/core/apps.py (1)

13-18: Signal module import in ready() is appropriate

Importing core.signals inside CoreConfig.ready() is a standard way to register signal handlers and placing it before the RUN_MAIN guard ensures they are wired in all processes while keeping existing post_migrate behavior intact.

backend/core/models.py (2)

7-7: Typing imports match usage

The added Tuple and Optional imports are used in the new store_library_content annotation and look consistent with the rest of the file’s typing style.


371-372: Usage of new tuple return shape looks correct

store_library_file correctly unpacks the (new_library, outdated_loaded_library) tuple and returns only new_library, preserving the previous public API (StoredLibrary | None) for callers of this helper. Ignoring the second element here makes sense for file‑based loading where any update action is handled elsewhere.

@monsieurswag monsieurswag marked this pull request as draft November 19, 2025 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants