Skip to content

Conversation

@arielshaqed
Copy link
Contributor

No description provided.

Previously the test mocks presented commit IDs in chronological order.  But
GC code actually receives commit IDs in lexicographic order -- which is
indistinguishable from random.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to clean up GC (Garbage Collection) tests by improving resource management and test data randomization. However, there are two critical bugs that prevent the code from working correctly.

Key Changes:

  • Moved iterator cleanup to use defer pattern in active_commits_test.go
  • Added ID scrambling functionality to avoid alphabetical ordering bias in tests
  • Attempted to reposition Close() call in starting_point_iterator_test.go

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/graveler/retention/starting_point_iterator_test.go Attempted to move Close() call but positioned it incorrectly, breaking the test
pkg/graveler/retention/active_commits_test.go Added test data scrambling with fingerprint function and improved resource cleanup with defer, but fingerprint implementation has a critical hash computation bug

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

branchIterator := testutil.NewFakeBranchIterator(branchRecords)
commitIterator := testutil.NewFakeCommitIterator(commitRecords)
it := NewGCStartingPointIterator(commitIterator, branchIterator)
it.Close()
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator is being closed before it's used. Moving it.Close() to line 74 (before the for it.Next() loop on line 76) will cause the test to fail because the iterator will be closed before any data is read from it. The Close() call should remain after the loop completes (after line 92).

Suggested change
it.Close()
defer it.Close()

Copilot uses AI. Check for mistakes.

func fingerprint[T ~string](s T) T {
h := fnv.New64a()
h.Sum([]byte(s))
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fingerprint function incorrectly uses h.Sum([]byte(s)) on line 177. The Sum method appends the current hash to the provided byte slice; it doesn't hash the data. To hash the string, use h.Write([]byte(s)) instead. The current implementation will produce incorrect fingerprints as it's not actually hashing the input string.

Suggested change
h.Sum([]byte(s))
h.Write([]byte(s))

Copilot uses AI. Check for mistakes.
@arielshaqed arielshaqed added the area/testing Improvements or additions to tests label Dec 2, 2025
@arielshaqed arielshaqed requested a review from a team December 2, 2025 15:17
@arielshaqed arielshaqed added pr/merge-if-approved Reviewer: please feel free to merge if no major comments GC+ and removed pr/merge-if-approved Reviewer: please feel free to merge if no major comments labels Dec 2, 2025
Neither breaks anything... but both definitely improve tests and/or
readability.
@arielshaqed arielshaqed requested a review from Copilot December 2, 2025 15:19
Copilot finished reviewing on behalf of arielshaqed December 2, 2025 15:22
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

headsRetentionDays map[string]int32
expectedActiveIDs []string
}

Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The fingerprint function lacks documentation explaining its purpose and behavior. Consider adding a comment describing that it creates a deterministic hash-prefixed string to scramble commit IDs for testing purposes.

Suggested addition:

// fingerprint creates a hash-prefixed version of the input string to scramble
// commit IDs in tests, ensuring they are not alphabetically ordered in a way
// that could accidentally match their time ordering.
func fingerprint[T ~string](s T) T {
Suggested change
// fingerprint creates a hash-prefixed version of the input string to scramble
// commit IDs in tests, ensuring they are not alphabetically ordered in a way
// that could accidentally match their time ordering.

Copilot uses AI. Check for mistakes.
fingerprint := h.Sum64()
return T(fmt.Sprintf("%016x-%s", fingerprint, s))
}

Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The scrambleIDs function lacks documentation explaining its purpose. Consider adding a comment describing that it applies fingerprinting to all IDs in the test data structure.

Suggested addition:

// scrambleIDs applies fingerprinting to all commit IDs, head IDs, and expected
// active IDs in the test data to ensure they are not alphabetically ordered.
func scrambleIDs(tst testRepoCommits) testRepoCommits {
Suggested change
// scrambleIDs applies fingerprinting to all commit IDs, head IDs, and expected
// active IDs in the test data to ensure they are not alphabetically ordered.

Copilot uses AI. Check for mistakes.
Comment on lines +185 to +187
for idx, parent := range commit.parents {
commit.parents[idx] = fingerprint(parent)
}
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit.parents slice is being modified, but since commit is a value copy from the range loop, these modifications won't be reflected in the commit value stored in the commits map on line 188. The parents should be modified before storing the commit.

Suggested fix:

for id, commit := range tst.commits {
    parents := make([]graveler.CommitID, len(commit.parents))
    for idx, parent := range commit.parents {
        parents[idx] = fingerprint(parent)
    }
    commit.parents = parents
    commits[fingerprint(id)] = commit
}
Suggested change
for idx, parent := range commit.parents {
commit.parents[idx] = fingerprint(parent)
}
parents := make([]graveler.CommitID, len(commit.parents))
for idx, parent := range commit.parents {
parents[idx] = fingerprint(parent)
}
commit.parents = parents

Copilot uses AI. Check for mistakes.
Comment on lines +178 to +179
fingerprint := h.Sum64()
return T(fmt.Sprintf("%016x-%s", fingerprint, s))
Copy link

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local variable fingerprint shadows the function name, which can be confusing. Consider renaming the variable to something like hash or fpHash.

Suggested fix:

func fingerprint[T ~string](s T) T {
    h := fnv.New64a()
    h.Write([]byte(s)) // Write cannot fail for a hash func.
    hash := h.Sum64()
    return T(fmt.Sprintf("%016x-%s", hash, s))
}
Suggested change
fingerprint := h.Sum64()
return T(fmt.Sprintf("%016x-%s", fingerprint, s))
hash := h.Sum64()
return T(fmt.Sprintf("%016x-%s", hash, s))

Copilot uses AI. Check for mistakes.
@arielshaqed arielshaqed added exclude-changelog PR description should not be included in next release changelog minor-change Used for PRs that don't require issue attached labels Dec 3, 2025
Comment on lines +175 to +180
func fingerprint[T ~string](s T) T {
h := fnv.New64a()
h.Write([]byte(s)) // Write cannot fail for a hash func.
fingerprint := h.Sum64()
return T(fmt.Sprintf("%016x-%s", fingerprint, s))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; can keep it simple while accepting strings

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/testing Improvements or additions to tests exclude-changelog PR description should not be included in next release changelog GC+ minor-change Used for PRs that don't require issue attached

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants