File.Open given to the handler for Tar.Extract can't be used after the handler has returned

## What would you like to have changed?

Currently when you call Extract on a .zip file the handler method is given `File` objects with `Open` methods which can be called some time in the future, after the handler has returned.

This is because the zip reader is upgraded into an io.ReaderAt so the files can be seeked and opened later.
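To illustrate the zip side, here is a minimal sketch using only the standard library's `archive/zip` rather than this package's `Extract` API. The helper names (`buildZip`, `listZip`, `readLater`) are made up for the example. It lists the entries first and opens them afterwards, which works because the reader is backed by an `io.ReaderAt`.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildZip writes name/content pairs into an in-memory zip archive.
// (Hypothetical helper, only here to make the example self-contained.)
func buildZip(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, f := range files {
		w, err := zw.Create(f[0])
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listZip returns the archive's entries without opening any of them.
func listZip(data []byte) ([]*zip.File, error) {
	// bytes.Reader implements io.ReaderAt, which zip.NewReader requires.
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	return zr.File, nil
}

// readLater opens an entry after listing has already finished.
func readLater(f *zip.File) (string, error) {
	rc, err := f.Open()
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	data, err := buildZip([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	files, err := listZip(data)
	if err != nil {
		panic(err)
	}
	// Open in reverse order to show that neither order nor timing matters.
	for i := len(files) - 1; i >= 0; i-- {
		s, err := readLater(files[i])
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s: %q\n", files[i].Name, s)
	}
}
```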

However if you do the same with Tar, the `File.Open` methods are only valid as long as the handler hasn't returned. If you call them after the handler has returned they return 0 bytes.
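The same effect can be reproduced with the standard library's `archive/tar` alone. The sketch below does not use this package's `Extract`; `buildTar`, `readDuring` and `readAfter` are hypothetical helpers. It assumes the behaviour described above comes from the tar stream being read sequentially: entry data can be copied while the entry is current, but once iteration has finished, a read yields no bytes.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar writes name/content pairs into an in-memory, uncompressed tar.
func buildTar(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{
			Name:     f[0],
			Typeflag: tar.TypeReg,
			Mode:     0o644,
			Size:     int64(len(f[1])),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			return nil, err
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readDuring copies each entry's data while that entry is the current one.
func readDuring(data []byte) (map[string]string, error) {
	tr := tar.NewReader(bytes.NewReader(data))
	out := map[string]string{}
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		b, err := io.ReadAll(tr)
		if err != nil {
			return nil, err
		}
		out[hdr.Name] = string(b)
	}
}

// readAfter walks the whole archive first and only then tries to read,
// returning how many bytes were still available.
func readAfter(data []byte) (int, error) {
	tr := tar.NewReader(bytes.NewReader(data))
	for {
		_, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return 0, err
		}
	}
	b, err := io.ReadAll(tr)
	return len(b), err
}

func main() {
	data, err := buildTar([][2]string{{"a.txt", "hello"}, {"b.txt", "world"}})
	if err != nil {
		panic(err)
	}
	during, err := readDuring(data)
	if err != nil {
		panic(err)
	}
	after, err := readAfter(data)
	if err != nil {
		panic(err)
	}
	fmt.Println("read during iteration:", during)
	fmt.Println("bytes readable after iteration:", after)
}
```

Buffering the data inside the handler, as `readDuring` does, avoids the problem but costs memory or disk proportional to the file size.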

One solution would be to run Extract on the archive again with just the required file name. That makes extracting every file O(n^2), though :-( In rclone's case it would mean reading the entire tar file off remote storage once for each file extracted, which is super inefficient. Rclone could use local caching for this, which would help somewhat.

Plain `.tar` files should be quite seekable in theory, but I don't think this is something that we can get from the standard library. For compressed archives such as `.tar.gz`, one would need to keep the state of the decompressor at the start of each file in order to seek, which is starting to sound very complicated. You can create seekable gzip files (rclone does this in its compress backend), but that is complicated and non-standard.

I think for rclone's purposes compressed tar files would mean you've got to read the entire file just to see the directory entries, and I don't think there is any way around that. In theory at least, though, you should be able to skip the actual file data when scanning an uncompressed `.tar`. I looked through the source of `archive/tar`, and if the reader is capable of seeking then it uses that to skip data where necessary, so that is good.

I think reading the seek position of the underlying reader for each file in an uncompressed `.tar` could be interesting, but that would need a fork of `archive/tar`.

Any ideas?

File.Open given to the handler for Tar.Extract can't be used after the handler has returned #371
