-
Notifications
You must be signed in to change notification settings - Fork 10
Prevent FallbackRepository from download storms #1278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: push-volovwxpnxry
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
7ff7fc2 to
b937392
Compare
| self.code().unwrap_or_else(|| Box::new("spfs::generic")) | ||
| ) | ||
| )] | ||
| pub enum SyncError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did all this Error rework and then realized I didn't need it. But I think this change is for the better anyway.
| Err(SyncError::PayloadReadError(digest, Error::String("no repositories could successfully read the payload".into()).into())) | ||
| }) | ||
| .await | ||
| .map_err(|err| Error::String(format!("failed to open payload: {err}")))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather return a more specific error here, but this function has to return spfs::Error and the error type in the cache is now SyncError. Most of the SyncError variants have a nested Arc<spfs::Error> that in theory could be returned, but spfs::Error is not Clone.
I worked on making a new error type that is Clone but oops, this function would have to change to return that error type. Or it could change to return yet another new Error type that is Clone and more specific to the storage/repository domain instead of the catch-all spfs::Error type.
c69b0d5 to
bca6bae
Compare
b937392 to
88019a9
Compare
a288a39 to
bc02db1
Compare
88019a9 to
18b497f
Compare
bc02db1 to
b165c1a
Compare
41fb372 to
cc58d46
Compare
b165c1a to
b507fd1
Compare
This should continue to work after removing blob objects. Add a "test-fixtures" feature to spfs in order to use the fixtures in external crates. Signed-off-by: J Robert Ray <[email protected]>
This is also expected to still succeed after removing blob objects. Signed-off-by: J Robert Ray <[email protected]>
Blob files contain the digest and size of the payload. The digest is redundant because it is the same as the file name of the blob. The size is also available in the Entry type in Manifests. Reading a blob to learn the digest of the payload requires already knowing the payload digest in order to read the blob. The invariant of a blob not existing unless the payload also exists has proven to be difficult to maintain. By eliminating blobs, we eliminate this invariant. Signed-off-by: J Robert Ray <[email protected]>
Since we don't write a blob object anymore is may be confusing to continue calling this "commit_blob". Signed-off-by: J Robert Ray <[email protected]>
With blob files gone there is no longer an invariant to maintain when writing payloads. Signed-off-by: J Robert Ray <[email protected]>
Without a blob object to read, if something wants to know the size of a payload it is possible with this to do it without having to read the payload data. Signed-off-by: J Robert Ray <[email protected]>
The blob existence invariant is gone and these operations are now safe. Signed-off-by: J Robert Ray <[email protected]>
Following up on #1052 there are still spurious test failures in other tests, e.g., sync::sync_test::test_sync_ref_including_annotation_blob, that fail because the spfs config is set to legacy encoding format. These tests that change the config were not putting the config back to defaults, so that could explain the spurious failures depending on the order that tests run. Although, it is suspicious that the tests weren't failing more often if this is the cause. Signed-off-by: J Robert Ray <[email protected]>
b507fd1 to
7356e01
Compare
cc58d46 to
90526eb
Compare
7356e01 to
505d504
Compare
90526eb to
c9b4e7c
Compare
505d504 to
d1bf725
Compare
c9b4e7c to
8d55227
Compare
The previous commit that removed `write_blob` didn't actually accomplish much because blobs were mostly being written with `write_object`. There is now a debug assertion in `write_object` to prevent writing blobs. Once this was in place, the tests starting revealing all the places that were still writing blobs, and then all the places that didn't work anymore when blobs don't exist. This commit redefines some terminology used in the API. Previously an "object" could include blob objects and their digests could be used interchangeably with payload digests. Now payloads are not considered to be an "object" and methods that operate on object digests do not work for payloads. Some method names have been changed to make this more apparent. The term "item" is used for things (digests) that can be either an object of a payload. A "ref" can be a digest and it is expected that it is still possible to interact with payloads via a ref. A tag can refer to a payload as well. The 'find_digests' operation used to return digests for blob objects so absent blobs it has been changed to return both object digests and payload digests. Signed-off-by: J Robert Ray <[email protected]>
Signed-off-by: J Robert Ray <[email protected]>
This new error type can give better context to the kinds of errors that can happen during a sync and also limits the enum variants compared to spfs::Error so errors that can't happen aren't represented. This work was originally motivated to make an Error type that is Clone, which is why the wrapped errors are all Arc, however it ended up that FallbackProxy::open_payload still has to return spfs::Error so having a Clone SyncError didn't help. The actual need is for an Error type that open_payload can return that is Clone. A Clone Error type is needed because the moka Cache puts errors into an Arc, since it needs to be able to clone the error itself. To get an error out of the cache and then return it, it needs to be Clone. Signed-off-by: J Robert Ray <[email protected]>
8d55227 to
3c7788b
Compare
d1bf725 to
8df9867
Compare
This error type is Clone and can be used in the upcoming change to open_payload that can return a cached result. Signed-off-by: J Robert Ray <[email protected]>
As well as not checking the test code. Signed-off-by: J Robert Ray <[email protected]>
Use a moka Cache to funnel multiple requests for the same missing payload into a single task to sync the payload locally. Any failure to sync the payload will be cached for 5 minutes, after which another request for that digest will attempt to sync it again. This was inspired by observing that syncing a single spfs tag would report having "repaired" several different digests multiple times, often over a dozen times. It suggests that Syncer may be unexpectedly trying to process the same digest multiple times. Each of these digests were for large payloads (500MB+). The sync was happening over a high-latency link. Without this change, it can end up performing excessive duplicate copies over a limited bandwidth link. Signed-off-by: J Robert Ray <[email protected]>
3c7788b to
ef4563c
Compare
9ef4438 to
1c49e90
Compare
Use a moka Cache to funnel multiple requests for the same missing payload into a single task to sync the payload locally. Any failure to sync the payload will be cached for 5 minutes, after which another request for that digest will attempt to sync it again.
This was inspired by observing that syncing a single spfs tag would report having "repaired" several different digests multiple times, often over a dozen times. It suggests that Syncer may unexpectedly trying to process the same digest multiple times. Each of these digests were for large payloads (500MB+). The sync was happening over a high-latency link. Without this change, it can end up performing excessive duplicate copies over a limited bandwidth link.