|
1 | 1 | Handling Paths |
2 | 2 | ============== |
3 | 3 |
|
4 | | -A great deal of convention deals with the handling of **paths**. Paths are |
5 | | -stored internally—in the database, for instance—as byte strings (i.e., ``bytes`` |
6 | | -instead of ``str`` in Python 3). This is because POSIX operating systems’ path |
7 | | -names are only reliably usable as byte strings—operating systems typically |
8 | | -recommend but do not require that filenames use a given encoding, so violations |
9 | | -of any reported encoding are inevitable. On Windows, the strings are always |
10 | | -encoded with UTF-8; on Unix, the encoding is controlled by the filesystem. Here |
11 | | -are some guidelines to follow: |
12 | | - |
13 | | -- If you have a Unicode path or you’re not sure whether something is Unicode or |
14 | | - not, pass it through ``bytestring_path`` function in the ``beets.util`` module |
15 | | - to convert it to bytes. |
16 | | -- Pass every path name through the ``syspath`` function (also in ``beets.util``) |
17 | | - before sending it to any *operating system* file operation (``open``, for |
18 | | - example). This is necessary to use long filenames (which, maddeningly, must be |
19 | | - Unicode) on Windows. This allows us to consistently store bytes in the |
20 | | - database but use the native encoding rule on both POSIX and Windows. |
21 | | -- Similarly, the ``displayable_path`` utility function converts bytestring paths |
22 | | - to a Unicode string for displaying to the user. Every time you want to print |
23 | | - out a string to the terminal or log it with the ``logging`` module, feed it |
24 | | - through this function. |
| 4 | +Historically, this chapter recommended the utilities ``syspath()``, |
| 5 | +``normpath()``, ``bytestring_path()``, and ``displayable_path()`` for handling |
| 6 | +file paths in beets. These ensured consistent behavior across Linux, macOS, and
| 7 | +Windows before Python’s ``pathlib`` offered a unified and reliable API. |
| 8 | + |
| 9 | +- ``syspath()`` worked around Windows Unicode and long-path issues by converting |
| 10 | + to a system-safe string (adding the ``\\?\`` prefix where needed). Modern |
| 11 | + Python (≥3.6) largely handles this through its wide-character APIs.
| 12 | +- ``normpath()`` normalized slashes and removed ``./`` or ``..`` parts but did |
| 13 | + not expand ``~``. It was used mainly for paths from user input or config |
| 14 | + files. |
| 15 | +- ``bytestring_path()`` converted paths to ``bytes`` for storage in the |
| 16 | + database. Paths in the database are still stored as bytes today, though there |
| 17 | + are plans to eventually store ``pathlib.Path`` objects directly. |
| 18 | +- ``displayable_path()`` converted byte paths to Unicode for display or logging. |
| 19 | + |
| 20 | +These utilities remain safe to use when maintaining older code, but new code and |
| 21 | +refactors should prefer ``pathlib.Path``: |
| 22 | + |
| 23 | +- Use the ``.filepath`` property on ``Item`` and ``Album`` to access paths as |
| 24 | + ``pathlib.Path``. This replaces ``displayable_path(item.path)``. |
| 25 | +- Normalize or expand paths using ``Path(...).expanduser().resolve()``, which |
| 26 | + correctly expands ``~`` and resolves symlinks. |
| 27 | +- Cross-platform details like path separators, Unicode handling, and long-path |
| 28 | + support are handled automatically by ``pathlib``. |
| 29 | + |
| 30 | +Examples |
| 31 | +-------- |
| 32 | + |
| 33 | +Old style |
| 34 | + |
| 35 | +.. code-block:: python |
| 36 | +
|
| 37 | + displayable_path(item.path) |
| 38 | + normpath("~/Music/../Artist") |
| 39 | + syspath(path) |
| 40 | +
|
| 41 | +New style |
| 42 | + |
| 43 | +.. code-block:: python |
| 44 | +
|
| 45 | + item.filepath |
| 46 | + Path("~/Music/../Artist").expanduser().resolve() |
| 47 | + Path(path) |
| 48 | +
|
| 49 | +When storing paths in the database |
| 50 | + |
| 51 | +.. code-block:: python |
| 52 | +
|
| 53 | + path_bytes = bytestring_path(item.filepath) |
| 54 | +
|
| 55 | +In short, the old utilities were necessary for cross-platform safety in early
| 56 | +beets versions, but ``pathlib.Path`` now provides these guarantees natively and
| 57 | +should be used for all new code. ``bytestring_path()`` remains in use, but only
| 58 | +for database storage.
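
The new-style workflow can be sketched with the standard library alone. This is an illustrative sketch, not beets code: the helper names ``normalize``, ``to_db_bytes``, and ``from_db_bytes`` are hypothetical, and ``os.fsencode`` only stands in for ``bytestring_path()`` on the assumption that paths are stored using the filesystem encoding.

```python
import os
import tempfile
from pathlib import Path


def normalize(raw: str) -> Path:
    """Expand a leading ``~``, make the path absolute, and collapse ``..``."""
    # resolve() also follows symlinks; by default it does not require
    # the final path to exist.
    return Path(raw).expanduser().resolve()


def to_db_bytes(path: Path) -> bytes:
    """Hypothetical stand-in for bytestring_path(): Path -> bytes."""
    return os.fsencode(path)


def from_db_bytes(raw: bytes) -> Path:
    """Inverse of to_db_bytes(): bytes -> Path."""
    return Path(os.fsdecode(raw))


# Work inside a throwaway directory so the example has no side effects.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Music").mkdir()
    messy = f"{tmp}/Music/../Artist"   # contains a ".." segment
    clean = normalize(messy)           # ".." is collapsed, path is absolute
    stored = to_db_bytes(clean)        # bytes, as kept in the database
    restored = from_db_bytes(stored)   # back to a Path for use in code
```

Keeping the conversion to ``bytes`` at the database boundary means the rest of the code can work with ``Path`` objects throughout.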