@mathbou mathbou commented Aug 23, 2023

Hi there,

While running some tests with tk-core, I looked at my temp folder and saw that each time I reload an engine, a large number of temporary clone folders is created, along with a heavy network load. Digging into the source code of the git descriptors, I found that they clone the repo every time they need to execute a command in it.

So here is a PR with a big refactoring of the git, git_tag and git_branch descriptors:

  • Replace git clone with git ls-remote when fetching tag/commit values and when checking remote access
  • Add version pattern and get_latest support for git branch, and improve the logic for git tag
  • Add a caching system to avoid processing the same repo multiple times
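
To illustrate the first point, here is a minimal sketch of reading tags from a remote with git ls-remote instead of cloning. It is not the code from this PR: the function names parse_ls_remote and list_remote_tags are hypothetical, and the only assumption about git is its documented output format of one "<sha><TAB><ref>" pair per line, with a peeled "^{}" entry for annotated tags.

```python
import subprocess


def parse_ls_remote(output):
    """Map tag name -> commit sha from `git ls-remote --tags` output.

    Hypothetical helper for illustration only.
    """
    tags = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split("\t", 1)
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        if name.endswith("^{}"):
            # Peeled entry: the commit an annotated tag points to. It wins
            # over the sha of the tag object itself.
            tags[name[:-3]] = sha
        else:
            tags.setdefault(name, sha)
    return tags


def list_remote_tags(repo_url):
    """Query the remote for its tags without creating a local clone."""
    output = subprocess.check_output(
        ["git", "ls-remote", "--tags", repo_url], text=True
    )
    return parse_ls_remote(output)
```

Only the small ref listing travels over the network, and no temporary folder is created.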

@mathbou mathbou marked this pull request as draft August 24, 2023 01:08

codecov bot commented Aug 27, 2023

Codecov Report

Attention: Patch coverage is 91.01796% with 15 lines in your changes missing coverage. Please review.

Project coverage is 73.76%. Comparing base (0158a91) to head (8ddde7e).

Files with missing lines | Patch % | Lines
--- | --- | ---
python/tank/descriptor/io_descriptor/git.py | 87.77% | 11 Missing ⚠️
python/tank/descriptor/io_descriptor/git_tag.py | 93.47% | 3 Missing ⚠️
python/tank/descriptor/io_descriptor/git_branch.py | 96.55% | 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (0158a91) and HEAD (8ddde7e).

HEAD has 9 uploads less than BASE

Flag | BASE (0158a91) | HEAD (8ddde7e)
--- | --- | ---
 | 18 | 9

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #913      +/-   ##
==========================================
- Coverage   79.78%   73.76%   -6.02%     
==========================================
  Files         198      198              
  Lines       20773    20818      +45     
==========================================
- Hits        16574    15357    -1217     
- Misses       4199     5461    +1262     

@mathbou mathbou marked this pull request as ready for review August 27, 2023 14:36
@julien-lang julien-lang changed the title from "Refactor git descriptors" to "SG-39245 Refactor git descriptors" May 29, 2025
@julien-lang julien-lang requested a review from Copilot June 25, 2025 16:22

Copilot AI left a comment

Pull Request Overview

This PR refactors the git descriptors to reduce network load and improve performance by replacing full repository clones with remote queries and caching. Key changes include:

  • Switching from “git clone” to “git ls-remote” when fetching remote information.
  • Adding version pattern matching and get_latest support for the git branch and git tag descriptors.
  • Implementing a caching mechanism to prevent redundant processing.
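
As a rough illustration of version pattern matching combined with a "latest" lookup, the sketch below picks the highest vX.Y.Z tag from a list, optionally constrained by a pattern such as "v1.x.x". The function latest_matching and the exact pattern syntax are assumptions made for this example; the descriptors in this PR may resolve patterns differently.

```python
import re

# Assumed tag shape for this sketch: v<major>.<minor>.<patch>
VERSION_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_matching(tags, pattern=None):
    """Return the highest version tag, or None if nothing matches.

    Hypothetical helper: `pattern` looks like "v1.x.x", where each "x"
    stands for any number.
    """
    if pattern is None:
        constraint = VERSION_RE
    else:
        parts = [
            r"\d+" if part == "x" else re.escape(part)
            for part in pattern.lstrip("v").split(".")
        ]
        constraint = re.compile(r"^v" + r"\.".join(parts) + "$")

    candidates = []
    for tag in tags:
        match = VERSION_RE.match(tag)
        if match and constraint.match(tag):
            # Compare numerically so that v1.10.1 sorts above v1.9.9.
            key = tuple(int(group) for group in match.groups())
            candidates.append((key, tag))
    return max(candidates)[1] if candidates else None
```

Comparing integer tuples rather than strings is the important design choice here, since a plain string sort would rank "v1.9.9" above "v1.10.1".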

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

File | Description
--- | ---
tests/descriptor_tests/test_io_descriptors.py | Updated the version attribute type for consistent string handling.
tests/descriptor_tests/test_git.py | Updated expected commit/tag values to align with the refactored descriptor logic.
tests/descriptor_tests/test_downloadables.py | Replaced shutil.move with a copytree/rmtree approach with onerror callback.
tests/descriptor_tests/test_descriptors.py | Updated command validation to use the new _get_git_clone_commands method.
python/tank/util/filesystem.py | Added a writable check and chmod call when the destination file is read-only.
python/tank/descriptor/io_descriptor/git_tag.py | Improved tag fetching logic with refined exception handling and clearer logging.
python/tank/descriptor/io_descriptor/git_branch.py | Made the “version” field optional and streamlined branch-related git commands.
python/tank/descriptor/io_descriptor/git.py | Reworked internal git command executions, introduced _normalize_path, and implemented a caching metaclass for git descriptors.

Comments suppressed due to low confidence (1)

python/tank/descriptor/io_descriptor/git.py:70

  • The cache timeout calculation uses time()/100 and a modulus of 2, resulting in a cache window of approximately 200 seconds, which does not match the documented 2-minute validity. Consider reviewing and adjusting the computation to accurately reflect the intended duration.
        now = int(time() / 100)
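
The arithmetic behind this remark can be checked with a small sketch. It does not reproduce the PR's cache code: window_seconds is a hypothetical helper, and pairing two 100-second buckets is only one assumed way a 200-second window could arise. It measures how long a time-derived cache key stays constant.

```python
def window_seconds(key_fn, start=0, limit=1000):
    """Count how many consecutive whole seconds share the key of `start`."""
    first = key_fn(start)
    t = start
    while t < limit and key_fn(t) == first:
        t += 1
    return t - start


# int(t / 100) changes every 100 seconds.
hundred_second_bucket = lambda t: int(t / 100)

# Assumption: grouping those buckets in pairs gives a 200-second window.
paired_buckets = lambda t: int(t / 100) // 2

# A key that matches a 2-minute (120-second) validity.
two_minute_bucket = lambda t: int(t / 120)

print(window_seconds(hundred_second_bucket))  # 100
print(window_seconds(paired_buckets))         # 200
print(window_seconds(two_minute_bucket))      # 120
```

Note that a bucketed key expires at fixed wall-clock boundaries, so an entry created late in a bucket lives for less than the full window.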

@mathbou mathbou force-pushed the feature/git_descriptor branch from a84ff90 to 608c0c0 Compare June 25, 2025 17:55
@mathbou mathbou force-pushed the feature/git_descriptor branch from 608c0c0 to 756823f Compare July 4, 2025 22:29
@mathbou mathbou force-pushed the feature/git_descriptor branch from 756823f to 8ddde7e Compare July 4, 2025 22:31