Skip to content

Conversation

@Louson
Copy link

@Louson Louson commented Oct 6, 2023

Description

Suggestion for musicbrainz substitution in relation with discussion #4660

Tests have been ran with command tox -e py311-test

To Do

  • Documentation. (If you've add a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst near the top of the document.)
  • Tests. (Encouraged but not strictly required.)

@Louson Louson force-pushed the dev/louis/mbzero branch 2 times, most recently from 62900d8 to 6351d61 Compare November 7, 2023 10:52
@stale
Copy link

stale bot commented Mar 17, 2024

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 17, 2024
@Louson
Copy link
Author

Louson commented Mar 18, 2024

Hello, it is still relevant but the merge is corrupted since the massive style conformance that was made last summer. Unfortunately I have a serious lack of availability to fix it.

However the replacement library mbzero is not sufficiently experimented and would expect some review.

To move forward, we could split this PR to prepare for a replacement and drop the last commit.

I would be happy to work on that when I have time, but if anyone wants to take the lead, it's also fine.

As musicbrainzngs is still unmaintained, I would recommend to keep that PR open

@Louson Louson force-pushed the dev/louis/mbzero branch from 6351d61 to b512c92 Compare May 29, 2024 20:04
@stale stale bot removed the stale label May 29, 2024
@Louson Louson force-pushed the dev/louis/mbzero branch from b512c92 to 110cbbb Compare May 29, 2024 20:07
@Louson Louson marked this pull request as draft May 29, 2024 20:12
@Louson Louson force-pushed the dev/louis/mbzero branch 2 times, most recently from 4280735 to 6cbc3f2 Compare May 29, 2024 20:50
@Louson Louson changed the title DRAFT: dev/louis/mbzero dev/louis/mbzero May 29, 2024
@Louson Louson marked this pull request as ready for review May 29, 2024 20:50
@Louson
Copy link
Author

Louson commented May 29, 2024

Hello, as the library mbzero evolved (test coverage got pretty good), I updated the request to match the new style.

Tests done: py312 py312-lint format format_check

Again, read commit per commit.

Suggestions are welcome

(ping @sampsyo)

@Louson Louson force-pushed the dev/louis/mbzero branch from 6cbc3f2 to a6f88ec Compare June 6, 2024 13:18
@Louson Louson changed the title dev/louis/mbzero MusicbrainzNGS replacement Jun 18, 2024
@Serene-Arc Serene-Arc mentioned this pull request Jun 25, 2024
@Serene-Arc
Copy link
Contributor

Serene-Arc commented Jun 25, 2024

Thanks for the PR! First off, I made a bunch of comments on the code but there's one issue to deal with before anything else: mbzero requires Python 3.10, which is not the minimum-supported python version for the beets project. That would be 3.8, soon to be 3.9. We won't support 3.10 until October, 2025 at the earliest. This is why the tests are failing.

Do you have a solution to that? Because if not, this PR will have to be placed on hold until 2025-10. I see you're the maintainer for mbzero; is there a reason that 3.10 is your MSV?

@Louson
Copy link
Author

Louson commented Jun 25, 2024

Hello,

I saw that last week. I don't think there is any blocker, 3.10 was my python version at the time of development. Let me check on 3.8 and 3.9.

Louis Rannou added 2 commits June 25, 2024 14:54
Some of the work that was done in musicbrainzngs requires to be done in beets to
lighten the musicbrainz API.

Signed-off-by: Louis Rannou <[email protected]>
The structure of data changes a bit

Signed-off-by: Louis Rannou <[email protected]>
@Louson Louson force-pushed the dev/louis/mbzero branch 4 times, most recently from d3147cf to 267c1af Compare June 25, 2024 14:10
Copy link
Contributor

@Serene-Arc Serene-Arc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking that! There's a couple things to address with the code though

Comment on lines 61 to 86
limit_interval = 1.0
limit_requests = 1
do_rate_limit = True


def set_rate_limit(limit_or_interval=1.0, new_requests=1):
"""Sets the rate limiting behavior of the module. Must be invoked
before the first Web service call.
If the `limit_or_interval` parameter is set to False then
rate limiting will be disabled. If it is a number then only
a set number of requests (`new_requests`) will be made per
given interval (`limit_or_interval`).
"""
global limit_interval
global limit_requests
global do_rate_limit
if isinstance(limit_or_interval, bool):
do_rate_limit = limit_or_interval
else:
if limit_or_interval <= 0.0:
raise ValueError("limit_or_interval can't be less than 0")
if new_requests <= 0:
raise ValueError("new_requests can't be less than 0")
do_rate_limit = True
limit_interval = limit_or_interval
limit_requests = new_requests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use something other than these types of global variables? It's doesn't fit with beet's OOP model. Perhaps a singleton class for the rate limiter would work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can undertsand that ! So far the PR s a 1st step. That's a simple copy from what used to be in musicbrainzngs.

If the whole idea is fine for beets, I'm ok to find a solution. Would you have any tip for the singleton ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would look up singletons in Python, but they're pretty simple. If you don't know what they are (and sorry for being presumptive, if you do know then ignore), they are classes that can only create a single instance. You store an instance in the class data and that is the only one that is every created, which allows you to keep track of state like with these global variables, but in an OOP way.

Here is an explanation of how to create a singleton in Python. Again, ignore if you're already aware.

For the implementation, I would suggest perhaps a class that manages requests and handles the rate limiting. The RateLim decorator would be the best place, you can make it a singleton and then just use the decorator. If that doesn't work, then there can be a separate object that manages the rate inside the decorator should be fine. It's basically just trying to keep track of a counter and maybe a lock for the sake of concurrency. That's prime singleton stuff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, as far as first steps go, this is good! My idea on PRs is that they should be complete enough to not need immediate changes. I think the global variables would be that type of thing; it's just a change that should be done, and there's no reason not to do it now rather than add them and then have to remove them in a separate PR.

Comment on lines 307 to 308
useragent = "beets/{} (https://beets.io/)".format(beets.__version__)
mbi = MbInterface(useragent)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More global variables. Is there a way to not have global variables for this? The user agent isn't much of a problem as a global but the MbInterface really shouldn't be one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do agree. I had the idea to create an other file mbinterface.py that would replace the MbInterface class. The drawback is that we would have two files for the musicbrainz connection.

I'm not sure what's good here. Could we opt for an acceptable solution until mb.py is restructured ? That file is not really OOP btw

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's a solution, properly. The issue is that the mbi object is being initialised in the module directly. Honestly a better solution, based on its structure and that the user-agent is a constant, we should just make a new object of the MbInterface whenever we call it, or even convert them to class functions. The latter is probably best. There's no actual reason to make an MbInterface object unless you're changing the source from the MusicBrainz servers to something else.

Even integrating it with the rate limiting code as a singleton could work. It's tricky. If we didn't offer support for other sources, I would just say make them class functions. Maybe passing in a destination object or reading that directly from the config could work.

What do you think?

Louis Rannou added 4 commits June 30, 2024 14:46
mbzero is much lighter and things have to be implemented in beets:

- there is no helper such as get_release_by_id
- no checking, so wrong requests will raise an exception
- there is nothing to limit the requests, it needs to be implemented
- mocks are moved to the interface

Signed-off-by: Louis Rannou <[email protected]>
mbzero 0.4 has support for python 3.8 and 3.9

Signed-off-by: Louis Rannou <[email protected]>
@Louson Louson force-pushed the dev/louis/mbzero branch from ff1befe to 9843ae2 Compare June 30, 2024 12:46
Comment on lines -1871 to +1901
"release": {
"title": releases[id_][0],
"id": id_,
"medium-list": [
{
"track-list": [
{
"id": "baz",
"recording": {
"title": "foo",
"id": "bar",
"length": 59,
},
"position": 9,
"number": "A2",
}
],
"position": 5,
}
],
"artist-credit": [
{
"artist": {
"name": releases[id_][1],
"id": "some-id",
},
}
],
"release-group": {
"id": "another-id",
},
"status": "Official",
}
"title": releases[id_][0],
"id": id_,
"media": [
{
"tracks": [
{
"id": "baz",
"recording": {
"title": "foo",
"id": "bar",
"length": 59,
},
"position": 9,
"number": "A2",
}
],
"position": 5,
}
],
"artist-credit": [
{
"artist": {
"name": releases[id_][1],
"id": "some-id",
},
}
],
"release-group": {
"id": "another-id",
},
"status": "Official",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to do a full review of the PR with these new changes, but there is one thing that could help me get through it a bit faster. Could you explain why the structure of the response dict has changes here? It just raises a red flag when the test fixture has this type of change.

It only works if everywhere in the code that expected the old structure has changed as well, which I'll need to check and can be time consuming. Because changing the tests without changing the codebase can lead to a mismatch between the tests and the actual code if the fixtures and mocks don't actually relate. Unnesting a value like this can cause big problems.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. For a better understanding, you may go commit by commit.

The structure of the response change because the library relies on the json interface instead of the xml as suggested in #4651 (comment)
The structure of the response is a bit different.

The difference is mainly :

  1. the "release" sublevel has disappeared
  2. "xxx-list" has become "xxxs"
  3. "medium-list" has become "media"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. Eventually, I my split the work and first have a replacement that keeps the xml interface. That would leave the test. The xml is supported by mbzero anyway.
When that's fine, we can think about dropping the xml in favor of json.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks! I still need to review, my semester has hit like a truck so I have had very, very little spare time or energy to do code review. I'll get to it soon hopefully :)

@quantenzitrone
Copy link

can i help with testing?
i would really like this pr to be merged soon (because it is a preliminary to #3549, which i would like to have)

@stale
Copy link

stale bot commented Apr 26, 2025

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 26, 2025
@Louson
Copy link
Author

Louson commented Apr 26, 2025 via email

@stale stale bot removed the stale label Apr 26, 2025
@nicomem
Copy link
Contributor

nicomem commented May 28, 2025

I tried to test this branch by importing some musics, but I got a couple of errors:

1st error

config:

library: ./beets.db
directory: ./imported-musics

command: poetry run beet -c config.yml import ./my-musics

Traceback
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/nicomem/projects/beets/beets/ui/__init__.py", line 1865, in main
    _raw_main(args)
  File "/home/nicomem/projects/beets/beets/ui/__init__.py", line 1852, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/home/nicomem/projects/beets/beets/ui/commands.py", line 1395, in import_func
    import_files(lib, paths, query)
  File "/home/nicomem/projects/beets/beets/ui/commands.py", line 1326, in import_files
    session.run()
  File "/home/nicomem/projects/beets/beets/importer.py", line 360, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 447, in run_parallel
    raise exc_info[1].with_traceback(exc_info[2])
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 312, in run
    out = self.coro.send(msg)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 195, in coro
    func(*(args + (task,)))
  File "/home/nicomem/projects/beets/beets/importer.py", line 1497, in lookup_candidates
    task.lookup_candidates()
  File "/home/nicomem/projects/beets/beets/importer.py", line 688, in lookup_candidates
    artist, album, prop = autotag.tag_album(
                          ^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/match.py", line 550, in tag_album
    for matched_candidate in hooks.album_candidates(
                             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/plugins.py", line 593, in decorated
    for v in generator(*args, **kwargs):
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/hooks.py", line 734, in album_candidates
    yield from invoke_mb(
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 1005, in match_album
    res = MbInterface().search_releases(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 312, in search_releases
    query=self._make_query(query, fields),
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 254, in _make_query
    value = util._unicode(value)
            ^^^^^^^^^^^^^
AttributeError: module 'beets.util' has no attribute '_unicode'. Did you mean: 'unidecode'?

2nd error

config:

library: ./beets.db
directory: ./imported-musics

import:
    languages: fr en

command: poetry run beet -c config.yml import ./my-musics

Traceback
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/nicomem/projects/beets/beets/ui/__init__.py", line 1865, in main
    _raw_main(args)
  File "/home/nicomem/projects/beets/beets/ui/__init__.py", line 1852, in _raw_main
    subcommand.func(lib, suboptions, subargs)
  File "/home/nicomem/projects/beets/beets/ui/commands.py", line 1395, in import_func
    import_files(lib, paths, query)
  File "/home/nicomem/projects/beets/beets/ui/commands.py", line 1326, in import_files
    session.run()
  File "/home/nicomem/projects/beets/beets/importer.py", line 360, in run
    pl.run_parallel(QUEUE_SIZE)
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 447, in run_parallel
    raise exc_info[1].with_traceback(exc_info[2])
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 312, in run
    out = self.coro.send(msg)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/util/pipeline.py", line 195, in coro
    func(*(args + (task,)))
  File "/home/nicomem/projects/beets/beets/importer.py", line 1497, in lookup_candidates
    task.lookup_candidates()
  File "/home/nicomem/projects/beets/beets/importer.py", line 688, in lookup_candidates
    artist, album, prop = autotag.tag_album(
                          ^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/match.py", line 512, in tag_album
    id_info = match_by_id(items)
              ^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/match.py", line 350, in match_by_id
    return hooks.album_for_mbid(first)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/hooks.py", line 661, in album_for_mbid
    album = mb.album_for_id(release_id)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 1152, in album_for_id
    release = album_info(res)
              ^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 684, in album_info
    artist_name, artist_sort_name, artist_credit_name = _flatten_artist_credit(
                                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 517, in _flatten_artist_credit
    artist_parts, artist_sort_parts, artist_credit_parts = _multi_artist_credit(
                                                           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/nicomem/projects/beets/beets/autotag/mb.py", line 486, in _multi_artist_credit
    cur_artist_name = alias["alias"]
                      ~~~~~^^^^^^^^^
KeyError: 'alias'

@nicomem
Copy link
Contributor

nicomem commented Jun 10, 2025

I have started working on rebasing and fixing this PR (not tested much for now, but I've re-added all applicable changes of this PR, with some fixes). When I have something that I'm satisfied of, I'll create a PR.

For those that want to check beforehand: my working branch

@Louson
Copy link
Author

Louson commented Jun 11, 2025 via email

@nicomem
Copy link
Contributor

nicomem commented Jun 12, 2025

@Louson

Sorry about the signoff, I forgot to add it (along with the commit author change) since I added the changes from master instead of rebasing (due to the many conflicts).


[...] However, could you split fixes, styling and improvements [...]

Do you mean on the commit containing your changes, or on my commits that comes after?

  • If the former, I've split the few fixes I did out of that commit so that the commit containing your changes contains all and only the changes of this PR
    • Due to the conflicts with the master branch, there will obviously be some differences with this PR, but I've tried my best to re-apply as-is the changes
  • If the later, my commits seem well split as-is: I don't think splitting them more would be better
    • For example for the Clean MbInterface API, it would be weird to split the commit into one that cleans the API, another that adds the documentation, etc.; or one commit per function modified
    • I've split the Fix BASE_URL handling and alias retrieval, but this was the only one that seemed "splittable" to me

@Louson
Copy link
Author

Louson commented Sep 7, 2025

Closing in favour of #5976

@Louson Louson closed this Sep 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants