[R] Unable to disable url-encoding #41618

@r2evans

Describe the bug, including details regarding any error messages, version, and platform.

I have a local datamart of various table schemas using hive partitioning. Non-arrow (and non-R) tools also access the directories, so it would be nice not to have to search for names both with and without URL encoding. I cannot find an option or an argument that allows me to disable it. I recognize that S3 buckets might require it, but it seems like a bug (or a design flaw?) that we cannot disable this otherwise disruptive and undocumented behavior. Is it really silently hard-coded and required in all cases?

The datamart is on a local filesystem, and spaces are (obviously) fully permissible in directory names.

At a minimum, I feel this should be documented in write_dataset, though it would be really useful not to have to change all other utilities to work around this seemingly unnecessary behavior.

R-4.3.2 and arrow_15.0.1.

mt <- mtcars
mt$key <- paste(mt$cyl, mt$gear)
(td <- tempfile(fileext=".d"))
# [1] "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d"
dir.create(td)
res <- arrow::write_dataset(mt, path = td, partitioning = "key")
res
# NULL
Sys.glob(paste0(td, "/*/*"))
# [1] "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=4%203/part-0.parquet" "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=4%204/part-0.parquet"
# [3] "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=4%205/part-0.parquet" "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=6%203/part-0.parquet"
# [5] "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=6%204/part-0.parquet" "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=6%205/part-0.parquet"
# [7] "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=8%203/part-0.parquet" "/home/r2/tmp/RtmpAsPlcj/file185e1d942690.d/key=8%205/part-0.parquet"

There is nothing in the return value that suggests the partitioning keys were url-encoded.
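For anyone hitting the same thing, one possible stopgap is to rename the partition directories after the write. This is only a sketch in base R: `decode_partition_dirs` is a hypothetical helper and not part of arrow, and the demo directories are created by hand to mimic the layout above rather than written by `write_dataset`. It assumes that decoding a directory name never collides with an existing sibling.

```r
# Hypothetical helper (NOT an arrow function): rename URL-encoded
# partition directories under `root` to their decoded names.
decode_partition_dirs <- function(root) {
  # Relative paths of all subdirectories; the root itself is returned as "".
  rel <- list.dirs(root, full.names = FALSE, recursive = TRUE)
  rel <- rel[nzchar(rel)]
  # Handle the deepest directories first so that renaming a parent
  # never invalidates the path of a child still waiting to be renamed.
  depth <- lengths(strsplit(rel, "/", fixed = TRUE))
  rel <- rel[order(depth, decreasing = TRUE)]
  for (r in rel) {
    old <- file.path(root, r)
    decoded <- utils::URLdecode(basename(old))
    if (!identical(decoded, basename(old))) {
      file.rename(old, file.path(dirname(old), decoded))
    }
  }
  invisible(root)
}

# Demo on hand-made directories that mimic the encoded layout.
root <- tempfile(fileext = ".d")
dir.create(file.path(root, "key=4%203"), recursive = TRUE)
dir.create(file.path(root, "key=6%204"))
decode_partition_dirs(root)
list.files(root)
```

This only changes names on disk after the fact. A later `write_dataset` call into the same path would presumably create encoded directories again, and I have not verified how arrow reads the renamed directories back, so treat it as a workaround rather than a fix.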

Component(s)

R
