Issue
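Line 49 of filesystem_spec/fsspec/implementations/arrow.py (at commit 2b5ed0f) defines the root marker as /.
Then, in the parent class, the _parent method (filesystem_spec/fsspec/spec.py, lines 1266 to 1272 at 2b5ed0f) always returns a string with a leading slash.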
def _parent(cls, path):
    path = cls._strip_protocol(path)
    if "/" in path:
        parent = path.rsplit("/", 1)[0].lstrip(cls.root_marker)
        return cls.root_marker + parent
    else:
        return cls.root_marker
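To make the effect concrete, here is a minimal stand-alone sketch that copies only the _parent logic quoted above. `ParentSketch` is a hypothetical stand-in, not an fsspec class, and its `_strip_protocol` is a simplified assumption that returns the path unchanged (the real method does more normalisation).

```python
class ParentSketch:
    """Hypothetical stand-in that mirrors only the quoted _parent logic."""

    root_marker = "/"

    @classmethod
    def _strip_protocol(cls, path):
        # Assumption: pass the path through untouched for illustration.
        return path

    @classmethod
    def _parent(cls, path):
        path = cls._strip_protocol(path)
        if "/" in path:
            parent = path.rsplit("/", 1)[0].lstrip(cls.root_marker)
            return cls.root_marker + parent
        else:
            return cls.root_marker


# The input has no leading slash, but the result gains one.
nested_parent = ParentSketch._parent("test-bucket/path/file.txt")
# A path without any separator falls back to the bare root marker.
top_level_parent = ParentSketch._parent("file.txt")

print(nested_parent)
print(top_level_parent)
```

With root_marker set to /, the returned parent starts with a slash even when the input path did not.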
However, when the wrapper is used with S3, for example ArrowFSWrapper(pyarrow.fs.S3FileSystem()), pyarrow rejects paths that start with a slash. A call to put fails while creating the parent directory:
../.venv/lib/python3.12/site-packages/fsspec/spec.py:1106: in put
self.put_file(lpath, rpath, callback=child, **kwargs)
../.venv/lib/python3.12/site-packages/fsspec/spec.py:1031: in put_file
self.mkdirs(self._parent(os.fspath(rpath)), exist_ok=True)
../.venv/lib/python3.12/site-packages/fsspec/spec.py:1773: in mkdirs
return self.makedirs(path, exist_ok=exist_ok)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../.venv/lib/python3.12/site-packages/fsspec/implementations/arrow.py:23: in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
../.venv/lib/python3.12/site-packages/fsspec/implementations/arrow.py:195: in makedirs
self.fs.create_dir(path, recursive=True)
pyarrow/_fs.pyx:638: in pyarrow._fs.FileSystem.create_dir
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E pyarrow.lib.ArrowInvalid: Path cannot start with a separator ('/test-bucket-dcf315a3aff54e8db4eb2bf8e9e85f5b/path')
pyarrow/error.pxi:92: ArrowInvalid
If the root marker is reverted to an empty string with a subclass such as the following, pyarrow accepts the path.
class Wrapper(ArrowFSWrapper):
    root_marker = ""

Wrapper(pyarrow.fs.S3FileSystem())
Potential Fix
Since root_marker is a class attribute, a separate wrapper for S3 is probably necessary.
class ArrowS3FSWrapper(ArrowFSWrapper):
    root_marker = ""
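A rough sketch of why a subclass seems to be needed rather than a per-instance setting: assuming _parent is a classmethod (its first argument is cls), it reads root_marker from the class, so an attribute set on an instance is never consulted. `BaseSketch` and `S3Sketch` below are hypothetical stand-ins, not fsspec classes, and the `_strip_protocol` step is omitted for brevity.

```python
class BaseSketch:
    """Hypothetical stand-in for the wrapper with the "/" root marker."""

    root_marker = "/"

    @classmethod
    def _parent(cls, path):
        # Same logic as quoted in the issue, minus _strip_protocol.
        if "/" in path:
            parent = path.rsplit("/", 1)[0].lstrip(cls.root_marker)
            return cls.root_marker + parent
        return cls.root_marker


class S3Sketch(BaseSketch):
    """Hypothetical S3-specific subclass with an empty root marker."""

    root_marker = ""


instance = BaseSketch()
# Instance-level override: the classmethod still reads BaseSketch.root_marker.
instance.root_marker = ""
instance_result = instance._parent("bucket/dir/file")

# Class-level override through a subclass: the classmethod sees the new value.
subclass_result = S3Sketch()._parent("bucket/dir/file")

print(instance_result)
print(subclass_result)
```

Under these assumptions, only the subclass changes what _parent returns, which is consistent with proposing a dedicated ArrowS3FSWrapper.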