Skip to content

Conversation

@snejus
Copy link
Member

@snejus snejus commented Nov 2, 2025

Generalize some of common tagging functionality to Info and Match base classes

This PR centralises some common tagging functionality between singletons and albums allowing to simplify ChangeRepresentation and importing functionality. This is prep work for a larger PR which refactors and simplifies the entire tagging workflow.

Changes

  • Core type changes: Changed mapping parameter from Mapping[Item, TrackInfo] to list[tuple[Item, TrackInfo]] in apply_metadata(), distance(), and assign_items() functions
  • Match dataclasses: Converted AlbumMatch and TrackMatch from NamedTuple to @dataclass, introducing base Match class with common functionality
  • New properties: Added name cached property to Info class for unified name access
  • ChangeRepresentation refactor: Converted to @dataclass with lazy property evaluation, replacing cur_album/cur_title with unified cur_name field
  • UI improvements: Simplified display logic by using match.info.name instead of type-specific field checks
  • Parameter renaming: Renamed search_album/search_title parameters to search_name for consistency across singleton and album workflows

The changes maintain backward compatibility in behavior while improving type safety and code clarity.

@snejus snejus requested a review from a team as a code owner November 2, 2025 23:43
@github-actions
Copy link

github-actions bot commented Nov 2, 2025

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@snejus snejus requested review from JOJ0 and removed request for semohr November 2, 2025 23:43
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Define a clear type alias (e.g., MappingList[Item, TrackInfo]) or a small wrapper for the list-of-tuples mapping to reduce repetitive conversion boilerplate and improve readability.
  • Ensure the Match.type classproperty that strips the "Match" suffix is robust (e.g., add a fallback or assert for unexpected class names) or document that all subclasses must end with 'Match'.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Define a clear type alias (e.g., MappingList[Item, TrackInfo]) or a small wrapper for the list-of-tuples mapping to reduce repetitive conversion boilerplate and improve readability.
- Ensure the Match.type classproperty that strips the "Match" suffix is robust (e.g., add a fallback or assert for unexpected class names) or document that all subclasses must end with 'Match'.

## Individual Comments

### Comment 1
<location> `beets/ui/commands.py:433-442` </location>
<code_context>
+        if self.cur_name:
</code_context>

<issue_to_address>
**suggestion:** The logic for printing Album/Title changes is now unified, but the fallback label is always 'Album'.

Since cur_name is used for both albums and singletons, consider making the fallback label dynamic to accurately reflect the type of match and prevent confusion for single track matches.

Suggested implementation:

```python
        if self.cur_name:
            # Determine label based on match type
            label = "Album" if self.match.is_album else "Track"
            name_l, name_r = self.cur_name or "", self.match.info.name
            if (
                self.cur_name != self.match.info.name
                and self.match.info.name != VARIOUS_ARTISTS
            ):

```

You will also need to update any print statements or usages of the fallback label within this block to use the new `label` variable instead of the hardcoded "Album". For example, if you have:
```python
print_(f"{self.indent_detail}*", "Album:", name_r)
```
change it to:
```python
print_(f"{self.indent_detail}*", label + ":", name_r)
```
Make sure to check the rest of the block for similar usages.
</issue_to_address>

### Comment 2
<location> `test/test_autotag.py:110` </location>
<code_context>
-        expected = dict(zip(items, trackinfo)), [], []
+        expected = list(zip(items, trackinfo)), [], []

         assert match.assign_items(items, trackinfo) == expected

</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for edge cases in assign_items, such as empty input lists and duplicate tracks/items.

Current tests only cover standard cases. Including tests for empty lists, duplicates, and mismatched lengths will improve coverage and reliability.

Suggested implementation:

```python
        expected = list(zip(items, trackinfo)), [], []

        assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_lists():
    items = []
    trackinfo = []
    expected = [], [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_items():
    items = []
    trackinfo = [info(1, "Track 1", 100)]
    expected = [], [], [trackinfo[0]]
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_tracks():
    items = [Item("Item 1")]
    trackinfo = []
    expected = [], [items[0]], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_duplicate_items():
    items = [Item("Item 1"), Item("Item 1")]
    trackinfo = [info(1, "Track 1", 100), info(2, "Track 2", 200)]
    expected_mapping = [(items[0], trackinfo[0]), (items[1], trackinfo[1])]
    expected = expected_mapping, [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_duplicate_tracks():
    items = [Item("Item 1"), Item("Item 2")]
    trackinfo = [info(1, "Track 1", 100), info(1, "Track 1", 100)]
    expected_mapping = [(items[0], trackinfo[0]), (items[1], trackinfo[1])]
    expected = expected_mapping, [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_more_items_than_tracks():
    items = [Item("Item 1"), Item("Item 2"), Item("Item 3")]
    trackinfo = [info(1, "Track 1", 100)]
    expected_mapping = [(items[0], trackinfo[0])]
    expected = expected_mapping, [items[1], items[2]], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_more_tracks_than_items():
    items = [Item("Item 1")]
    trackinfo = [info(1, "Track 1", 100), info(2, "Track 2", 200)]
    expected_mapping = [(items[0], trackinfo[0])]
    expected = expected_mapping, [], [trackinfo[1]]
    assert match.assign_items(items, trackinfo) == expected

```

These tests assume the existence of `Item` and `info` constructors, and that `match.assign_items` returns a tuple of (matched_pairs, extra_items, extra_tracks).
If your test file uses fixtures or a different setup for `Item` and `info`, you may need to adjust the instantiation accordingly.
Also, ensure that the test functions are properly discovered by your test runner (e.g., pytest).
</issue_to_address>

### Comment 3
<location> `test/test_autotag.py:119` </location>
<code_context>
+        mapping = list(zip(self.items, info.tracks))
         config["per_disc_numbering"] = per_disc_numbering
         config["artist_credit"] = artist_credit
         autotag.apply_metadata(info, mapping)
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for apply_metadata with an empty mapping.

This will verify that apply_metadata behaves correctly when given no data.
</issue_to_address>

### Comment 4
<location> `test/autotag/test_distance.py:185` </location>
<code_context>
     @pytest.fixture
     def get_dist(self, items):
         def inner(info: AlbumInfo):
-            return distance(items, info, dict(zip(items, info.tracks)))
+            return distance(items, info, list(zip(items, info.tracks)))

         return inner
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding tests for the distance function with non-standard mappings, such as missing or extra items/tracks.

Additional tests should include cases with extra items, missing tracks, and out-of-order pairs to ensure the distance function behaves correctly in these scenarios.

Suggested implementation:

```python
    @pytest.fixture
    def get_dist(self, items):
        def inner(info: AlbumInfo):
            return distance(items, info, list(zip(items, info.tracks)))

        return inner

    def test_distance_with_extra_items(self):
        items = ["track1", "track2", "track3"]
        info = AlbumInfo(tracks=["track1", "track2"])
        mapping = [("track1", "track1"), ("track2", "track2"), ("track3", None)]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior

    def test_distance_with_missing_tracks(self):
        items = ["track1", "track2"]
        info = AlbumInfo(tracks=["track1", "track2", "track3"])
        mapping = [("track1", "track1"), ("track2", "track2")]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior

    def test_distance_with_out_of_order_pairs(self):
        items = ["track1", "track2", "track3"]
        info = AlbumInfo(tracks=["track3", "track1", "track2"])
        mapping = [("track1", "track3"), ("track2", "track1"), ("track3", "track2")]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior

```

- You may need to adjust the assertions in each test to match the expected output of your `distance` function.
- Ensure that `AlbumInfo` and `distance` are imported or defined in the test file.
- If your test class or function structure is different, you may need to adapt the placement of these tests.
</issue_to_address>

### Comment 5
<location> `test/test_ui.py:1158` </location>
<code_context>
         """Return an unicode string representing the changes"""
         items = items or self.items
         info = info or self.info
-        mapping = dict(zip(items, info.tracks))
+        mapping = list(zip(items, info.tracks))
         config["ui"]["color"] = color
         config["import"]["detail"] = True
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for _show_change with mismatched items and tracks.

Testing with mismatched items and tracks will verify that the UI logic correctly handles irregular input.

Suggested implementation:

```python
def test_show_change_with_mismatched_items_and_tracks():
    # Setup: items and tracks have different lengths
    items = ["item1", "item2", "item3"]
    class Info:
        tracks = ["trackA", "trackB"]  # fewer tracks than items

    info = Info()
    config = {"ui": {}, "import": {}}
    color = "red"

    # Assuming _show_change is a method of a class, e.g., UIHandler
    # If it's a standalone function, adjust accordingly
    handler = UIHandler(items=items, info=info)
    try:
        result = handler._show_change(items=items, info=info, config=config, color=color)
    except Exception as e:
        result = str(e)

    # The test passes if no exception is raised and result is a string (or expected type)
    assert isinstance(result, str)

```

- If `_show_change` is not a public method or not directly accessible, you may need to expose it or test via a public interface.
- Replace `UIHandler` with the actual class name if different.
- Adjust the assertion to match the expected output type or value for your implementation.
</issue_to_address>

### Comment 6
<location> `beets/autotag/__init__.py:213` </location>
<code_context>
def apply_metadata(
    album_info: AlbumInfo, mapping: list[tuple[Item, TrackInfo]]
):
    """Set the items' metadata to match an AlbumInfo object using a
    mapping from Items to TrackInfo objects.
    """
    for item, track_info in mapping:
        # Artist or artist credit.
        if config["artist_credit"]:
            item.artist = (
                track_info.artist_credit
                or track_info.artist
                or album_info.artist_credit
                or album_info.artist
            )
            item.artists = (
                track_info.artists_credit
                or track_info.artists
                or album_info.artists_credit
                or album_info.artists
            )
            item.albumartist = album_info.artist_credit or album_info.artist
            item.albumartists = album_info.artists_credit or album_info.artists
        else:
            item.artist = track_info.artist or album_info.artist
            item.artists = track_info.artists or album_info.artists
            item.albumartist = album_info.artist
            item.albumartists = album_info.artists

        # Album.
        item.album = album_info.album

        # Artist sort and credit names.
        item.artist_sort = track_info.artist_sort or album_info.artist_sort
        item.artists_sort = track_info.artists_sort or album_info.artists_sort
        item.artist_credit = (
            track_info.artist_credit or album_info.artist_credit
        )
        item.artists_credit = (
            track_info.artists_credit or album_info.artists_credit
        )
        item.albumartist_sort = album_info.artist_sort
        item.albumartists_sort = album_info.artists_sort
        item.albumartist_credit = album_info.artist_credit
        item.albumartists_credit = album_info.artists_credit

        # Release date.
        for prefix in "", "original_":
            if config["original_date"] and not prefix:
                # Ignore specific release date.
                continue

            for suffix in "year", "month", "day":
                key = f"{prefix}{suffix}"
                value = getattr(album_info, key) or 0

                # If we don't even have a year, apply nothing.
                if suffix == "year" and not value:
                    break

                # Otherwise, set the fetched value (or 0 for the month
                # and day if not available).
                item[key] = value

                # If we're using original release date for both fields,
                # also set item.year = info.original_year, etc.
                if config["original_date"]:
                    item[suffix] = value

        # Title.
        item.title = track_info.title

        if config["per_disc_numbering"]:
            # We want to let the track number be zero, but if the medium index
            # is not provided we need to fall back to the overall index.
            if track_info.medium_index is not None:
                item.track = track_info.medium_index
            else:
                item.track = track_info.index
            item.tracktotal = track_info.medium_total or len(album_info.tracks)
        else:
            item.track = track_info.index
            item.tracktotal = len(album_info.tracks)

        # Disc and disc count.
        item.disc = track_info.medium
        item.disctotal = album_info.mediums

        # MusicBrainz IDs.
        item.mb_trackid = track_info.track_id
        item.mb_releasetrackid = track_info.release_track_id
        item.mb_albumid = album_info.album_id
        if track_info.artist_id:
            item.mb_artistid = track_info.artist_id
        else:
            item.mb_artistid = album_info.artist_id

        if track_info.artists_ids:
            item.mb_artistids = track_info.artists_ids
        else:
            item.mb_artistids = album_info.artists_ids

        item.mb_albumartistid = album_info.artist_id
        item.mb_albumartistids = album_info.artists_ids
        item.mb_releasegroupid = album_info.releasegroup_id

        # Compilation flag.
        item.comp = album_info.va

        # Track alt.
        item.track_alt = track_info.track_alt

        _apply_metadata(
            album_info,
            item,
            nullable_fields=config["overwrite_null"]["album"].as_str_seq(),
        )

        _apply_metadata(
            track_info,
            item,
            nullable_fields=config["overwrite_null"]["track"].as_str_seq(),
        )

        correct_list_fields(item)

</code_context>

<issue_to_address>
**issue (code-quality):** Low code quality found in apply\_metadata - 19% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

<br/><details><summary>Explanation</summary>The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

- Reduce the function length by extracting pieces of functionality out into
  their own functions. This is the most important thing you can do - ideally a
  function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
  sits together within the function rather than being scattered.</details>
</issue_to_address>

### Comment 7
<location> `beets/autotag/distance.py:422` </location>
<code_context>
def distance(
    items: Sequence[Item],
    album_info: AlbumInfo,
    mapping: list[tuple[Item, TrackInfo]],
) -> Distance:
    """Determines how "significant" an album metadata change would be.
    Returns a Distance object. `album_info` is an AlbumInfo object
    reflecting the album to be compared. `items` is a sequence of all
    Item objects that will be matched (order is not important).
    `mapping` is a dictionary mapping Items to TrackInfo objects; the
    keys are a subset of `items` and the values are a subset of
    `album_info.tracks`.
    """
    likelies, _ = get_most_common_tags(items)

    dist = Distance()

    # Artist, if not various.
    if not album_info.va:
        dist.add_string("artist", likelies["artist"], album_info.artist)

    # Album.
    dist.add_string("album", likelies["album"], album_info.album)

    preferred_config = config["match"]["preferred"]
    # Current or preferred media.
    if album_info.media:
        # Preferred media options.
        media_patterns: Sequence[str] = preferred_config["media"].as_str_seq()
        options = [
            re.compile(rf"(\d+x)?({pat})", re.I) for pat in media_patterns
        ]
        if options:
            dist.add_priority("media", album_info.media, options)
        # Current media.
        elif likelies["media"]:
            dist.add_equality("media", album_info.media, likelies["media"])

    # Mediums.
    if likelies["disctotal"] and album_info.mediums:
        dist.add_number("mediums", likelies["disctotal"], album_info.mediums)

    # Prefer earliest release.
    if album_info.year and preferred_config["original_year"]:
        # Assume 1889 (earliest first gramophone discs) if we don't know the
        # original year.
        original = album_info.original_year or 1889
        diff = abs(album_info.year - original)
        diff_max = abs(datetime.date.today().year - original)
        dist.add_ratio("year", diff, diff_max)
    # Year.
    elif likelies["year"] and album_info.year:
        if likelies["year"] in (album_info.year, album_info.original_year):
            # No penalty for matching release or original year.
            dist.add("year", 0.0)
        elif album_info.original_year:
            # Prefer matchest closest to the release year.
            diff = abs(likelies["year"] - album_info.year)
            diff_max = abs(
                datetime.date.today().year - album_info.original_year
            )
            dist.add_ratio("year", diff, diff_max)
        else:
            # Full penalty when there is no original year.
            dist.add("year", 1.0)

    # Preferred countries.
    country_patterns: Sequence[str] = preferred_config["countries"].as_str_seq()
    options = [re.compile(pat, re.I) for pat in country_patterns]
    if album_info.country and options:
        dist.add_priority("country", album_info.country, options)
    # Country.
    elif likelies["country"] and album_info.country:
        dist.add_string("country", likelies["country"], album_info.country)

    # Label.
    if likelies["label"] and album_info.label:
        dist.add_string("label", likelies["label"], album_info.label)

    # Catalog number.
    if likelies["catalognum"] and album_info.catalognum:
        dist.add_string(
            "catalognum", likelies["catalognum"], album_info.catalognum
        )

    # Disambiguation.
    if likelies["albumdisambig"] and album_info.albumdisambig:
        dist.add_string(
            "albumdisambig", likelies["albumdisambig"], album_info.albumdisambig
        )

    # Album ID.
    if likelies["mb_albumid"]:
        dist.add_equality(
            "album_id", likelies["mb_albumid"], album_info.album_id
        )

    # Tracks.
    dist.tracks = {}
    for item, track in mapping:
        dist.tracks[track] = track_distance(item, track, album_info.va)
        dist.add("tracks", dist.tracks[track].distance)

    # Missing tracks.
    for _ in range(len(album_info.tracks) - len(mapping)):
        dist.add("missing_tracks", 1.0)

    # Unmatched tracks.
    for _ in range(len(items) - len(mapping)):
        dist.add("unmatched_tracks", 1.0)

    dist.add_data_source(likelies["data_source"], album_info.data_source)

    return dist

</code_context>

<issue_to_address>
**issue (code-quality):** Low code quality found in distance - 16% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

<br/><details><summary>Explanation</summary>The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

- Reduce the function length by extracting pieces of functionality out into
  their own functions. This is the most important thing you can do - ideally a
  function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
  sits together within the function rather than being scattered.</details>
</issue_to_address>

### Comment 8
<location> `beets/autotag/match.py:240` </location>
<code_context>
def tag_album(
    items,
    search_artist: str | None = None,
    search_name: str | None = None,
    search_ids: list[str] = [],
) -> tuple[str, str, Proposal]:
    """Return a tuple of the current artist name, the current album
    name, and a `Proposal` containing `AlbumMatch` candidates.

    The artist and album are the most common values of these fields
    among `items`.

    The `AlbumMatch` objects are generated by searching the metadata
    backends. By default, the metadata of the items is used for the
    search. This can be customized by setting the parameters.
    `search_ids` is a list of metadata backend IDs: if specified,
    it will restrict the candidates to those IDs, ignoring
    `search_artist` and `search album`. The `mapping` field of the
    album has the matched `items` as keys.

    The recommendation is calculated from the match quality of the
    candidates.
    """
    # Get current metadata.
    likelies, consensus = get_most_common_tags(items)
    cur_artist: str = likelies["artist"]
    cur_album: str = likelies["album"]
    log.debug("Tagging {} - {}", cur_artist, cur_album)

    # The output result, keys are the MB album ID.
    candidates: dict[Any, AlbumMatch] = {}

    # Search by explicit ID.
    if search_ids:
        for search_id in search_ids:
            log.debug("Searching for album ID: {}", search_id)
            if info := metadata_plugins.album_for_id(search_id):
                _add_candidate(items, candidates, info)

    # Use existing metadata or text search.
    else:
        # Try search based on current ID.
        if info := match_by_id(items):
            _add_candidate(items, candidates, info)
            rec = _recommendation(list(candidates.values()))
            log.debug("Album ID match recommendation is {}", rec)
            if candidates and not config["import"]["timid"]:
                # If we have a very good MBID match, return immediately.
                # Otherwise, this match will compete against metadata-based
                # matches.
                if rec == Recommendation.strong:
                    log.debug("ID match.")
                    return (
                        cur_artist,
                        cur_album,
                        Proposal(list(candidates.values()), rec),
                    )

        # Search terms.
        if not (search_artist and search_name):
            # No explicit search terms -- use current metadata.
            search_artist, search_name = cur_artist, cur_album
        log.debug("Search terms: {} - {}", search_artist, search_name)

        # Is this album likely to be a "various artist" release?
        va_likely = (
            (not consensus["artist"])
            or (search_artist.lower() in VA_ARTISTS)
            or any(item.comp for item in items)
        )
        log.debug("Album might be VA: {}", va_likely)

        # Get the results from the data sources.
        for matched_candidate in metadata_plugins.candidates(
            items, search_artist, search_name, va_likely
        ):
            _add_candidate(items, candidates, matched_candidate)

    log.debug("Evaluating {} candidates.", len(candidates))
    # Sort and get the recommendation.
    candidates_sorted = _sort_candidates(candidates.values())
    rec = _recommendation(candidates_sorted)
    return cur_artist, cur_album, Proposal(candidates_sorted, rec)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Replace mutable default arguments with None ([`default-mutable-arg`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/default-mutable-arg/))
- Low code quality found in tag\_album - 25% ([`low-code-quality`](https://docs.sourcery.ai/Reference/Default-Rules/comments/low-code-quality/))

<br/><details><summary>Explanation</summary>
The quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

- Reduce the function length by extracting pieces of functionality out into
  their own functions. This is the most important thing you can do - ideally a
  function should be less than 10 lines.
- Reduce nesting, perhaps by introducing guard clauses to return early.
- Ensure that variables are tightly scoped, so that code using related concepts
  sits together within the function rather than being scattered.</details>
</issue_to_address>

### Comment 9
<location> `beets/autotag/match.py:327` </location>
<code_context>
def tag_item(
    item,
    search_artist: str | None = None,
    search_name: str | None = None,
    search_ids: list[str] | None = None,
) -> Proposal:
    """Find metadata for a single track. Return a `Proposal` consisting
    of `TrackMatch` objects.

    `search_artist` and `search_title` may be used to override the item
    metadata in the search query. `search_ids` may be used for restricting the
    search to a list of metadata backend IDs.
    """
    # Holds candidates found so far: keys are MBIDs; values are
    # (distance, TrackInfo) pairs.
    candidates = {}
    rec: Recommendation | None = None

    # First, try matching by the external source ID.
    trackids = search_ids or [t for t in [item.mb_trackid] if t]
    if trackids:
        for trackid in trackids:
            log.debug("Searching for track ID: {}", trackid)
            if info := metadata_plugins.track_for_id(trackid):
                dist = track_distance(item, info, incl_artist=True)
                candidates[info.track_id] = hooks.TrackMatch(dist, info)
                # If this is a good match, then don't keep searching.
                rec = _recommendation(_sort_candidates(candidates.values()))
                if (
                    rec == Recommendation.strong
                    and not config["import"]["timid"]
                ):
                    log.debug("Track ID match.")
                    return Proposal(_sort_candidates(candidates.values()), rec)

    # If we're searching by ID, don't proceed.
    if search_ids:
        if candidates:
            assert rec is not None
            return Proposal(_sort_candidates(candidates.values()), rec)
        else:
            return Proposal([], Recommendation.none)

    # Search terms.
    search_artist = search_artist or item.artist
    search_name = search_name or item.title
    log.debug("Item search terms: {} - {}", search_artist, search_name)

    # Get and evaluate candidate metadata.
    for track_info in metadata_plugins.item_candidates(
        item, search_artist, search_name
    ):
        dist = track_distance(item, track_info, incl_artist=True)
        candidates[track_info.track_id] = hooks.TrackMatch(dist, track_info)

    # Sort by distance and return with recommendation.
    log.debug("Found {} candidates.", len(candidates))
    candidates_sorted = _sort_candidates(candidates.values())
    rec = _recommendation(candidates_sorted)
    return Proposal(candidates_sorted, rec)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
</issue_to_address>

### Comment 10
<location> `beets/ui/commands.py:376` </location>
<code_context>
    def show_match_header(self) -> None:
        """Print out a 'header' identifying the suggested match (album name,
        artist name,...) and summarizing the changes that would be made should
        the user accept the match.
        """
        # Print newline at beginning of change block.
        print_("")

        # 'Match' line and similarity.
        print_(
            f"{self.indent_header}Match ({dist_string(self.match.distance)}):"
        )

        artist_album_str = f"{self.match.info.artist} - {self.match.info.name}"
        print_(
            self.indent_header
            + dist_colorize(artist_album_str, self.match.distance)
        )

        # Penalties.
        penalties = penalty_string(self.match.distance)
        if penalties:
            print_(f"{self.indent_header}{penalties}")

        # Disambiguation.
        disambig = disambig_string(self.match.info)
        if disambig:
            print_(f"{self.indent_header}{disambig}")

        # Data URL.
        if self.match.info.data_url:
            url = ui.colorize("text_faint", f"{self.match.info.data_url}")
            print_(f"{self.indent_header}{url}")

</code_context>

<issue_to_address>
**issue (code-quality):** Use named expression to simplify assignment and conditional [×2] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
</issue_to_address>

### Comment 11
<location> `beets/ui/commands.py:436-439` </location>
<code_context>
    def show_match_details(self) -> None:
        """Print out the details of the match, including changes in album name
        and artist name.
        """
        # Artist.
        artist_l, artist_r = self.cur_artist or "", self.match.info.artist
        if artist_r == VARIOUS_ARTISTS:
            # Hide artists for VA releases.
            artist_l, artist_r = "", ""
        left: Line
        right: Line
        if artist_l != artist_r:
            artist_l, artist_r = ui.colordiff(artist_l, artist_r)
            left = {
                "prefix": f"{self.changed_prefix} Artist: ",
                "contents": artist_l,
                "suffix": "",
            }
            right = {"prefix": "", "contents": artist_r, "suffix": ""}
            self.print_layout(self.indent_detail, left, right)

        else:
            print_(f"{self.indent_detail}*", "Artist:", artist_r)

        if self.cur_name:
            # Album
            name_l, name_r = self.cur_name or "", self.match.info.name
            if (
                self.cur_name != self.match.info.name
                and self.match.info.name != VARIOUS_ARTISTS
            ):
                name_l, name_r = ui.colordiff(name_l, name_r)
                left = {
                    "prefix": f"{self.changed_prefix} {self.match.type}: ",
                    "contents": name_l,
                    "suffix": "",
                }
                right = {"prefix": "", "contents": name_r, "suffix": ""}
                self.print_layout(self.indent_detail, left, right)
            else:
                print_(f"{self.indent_detail}*", "Album:", name_r)

</code_context>

<issue_to_address>
**suggestion (code-quality):** Combine two compares on same value into a chained compare ([`chain-compares`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/chain-compares/))

```suggestion
            if self.cur_name != self.match.info.name != VARIOUS_ARTISTS:
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +433 to +442
if self.cur_name:
# Album
album_l, album_r = self.cur_album or "", self.match.info.album
name_l, name_r = self.cur_name or "", self.match.info.name
if (
self.cur_album != self.match.info.album
and self.match.info.album != VARIOUS_ARTISTS
self.cur_name != self.match.info.name
and self.match.info.name != VARIOUS_ARTISTS
):
album_l, album_r = ui.colordiff(album_l, album_r)
name_l, name_r = ui.colordiff(name_l, name_r)
left = {
"prefix": f"{self.changed_prefix} Album: ",
"contents": album_l,
"prefix": f"{self.changed_prefix} {self.match.type}: ",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The logic for printing Album/Title changes is now unified, but the fallback label is always 'Album'.

Since cur_name is used for both albums and singletons, consider making the fallback label dynamic to accurately reflect the type of match and prevent confusion for single track matches.

Suggested implementation:

        if self.cur_name:
            # Determine label based on match type
            label = "Album" if self.match.is_album else "Track"
            name_l, name_r = self.cur_name or "", self.match.info.name
            if (
                self.cur_name != self.match.info.name
                and self.match.info.name != VARIOUS_ARTISTS
            ):

You will also need to update any print statements or usages of the fallback label within this block to use the new label variable instead of the hardcoded "Album". For example, if you have:

print_(f"{self.indent_detail}*", "Album:", name_r)

change it to:

print_(f"{self.indent_detail}*", label + ":", name_r)

Make sure to check the rest of the block for similar usages.

expected = dict(zip(items, trackinfo)), [], []
expected = list(zip(items, trackinfo)), [], []

assert match.assign_items(items, trackinfo) == expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding tests for edge cases in assign_items, such as empty input lists and duplicate tracks/items.

Current tests only cover standard cases. Including tests for empty lists, duplicates, and mismatched lengths will improve coverage and reliability.

Suggested implementation:

        expected = list(zip(items, trackinfo)), [], []

        assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_lists():
    items = []
    trackinfo = []
    expected = [], [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_items():
    items = []
    trackinfo = [info(1, "Track 1", 100)]
    expected = [], [], [trackinfo[0]]
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_empty_tracks():
    items = [Item("Item 1")]
    trackinfo = []
    expected = [], [items[0]], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_duplicate_items():
    items = [Item("Item 1"), Item("Item 1")]
    trackinfo = [info(1, "Track 1", 100), info(2, "Track 2", 200)]
    expected_mapping = [(items[0], trackinfo[0]), (items[1], trackinfo[1])]
    expected = expected_mapping, [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_duplicate_tracks():
    items = [Item("Item 1"), Item("Item 2")]
    trackinfo = [info(1, "Track 1", 100), info(1, "Track 1", 100)]
    expected_mapping = [(items[0], trackinfo[0]), (items[1], trackinfo[1])]
    expected = expected_mapping, [], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_more_items_than_tracks():
    items = [Item("Item 1"), Item("Item 2"), Item("Item 3")]
    trackinfo = [info(1, "Track 1", 100)]
    expected_mapping = [(items[0], trackinfo[0])]
    expected = expected_mapping, [items[1], items[2]], []
    assert match.assign_items(items, trackinfo) == expected

def test_assign_items_more_tracks_than_items():
    items = [Item("Item 1")]
    trackinfo = [info(1, "Track 1", 100), info(2, "Track 2", 200)]
    expected_mapping = [(items[0], trackinfo[0])]
    expected = expected_mapping, [], [trackinfo[1]]
    assert match.assign_items(items, trackinfo) == expected

These tests assume the existence of Item and info constructors, and that match.assign_items returns a tuple of (matched_pairs, extra_items, extra_tracks).
If your test file uses fixtures or a different setup for Item and info, you may need to adjust the instantiation accordingly.
Also, ensure that the test functions are properly discovered by your test runner (e.g., pytest).

@pytest.fixture
def get_dist(self, items):
def inner(info: AlbumInfo):
return distance(items, info, dict(zip(items, info.tracks)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding tests for the distance function with non-standard mappings, such as missing or extra items/tracks.

Additional tests should include cases with extra items, missing tracks, and out-of-order pairs to ensure the distance function behaves correctly in these scenarios.

Suggested implementation:

    @pytest.fixture
    def get_dist(self, items):
        def inner(info: AlbumInfo):
            return distance(items, info, list(zip(items, info.tracks)))

        return inner

    def test_distance_with_extra_items(self):
        items = ["track1", "track2", "track3"]
        info = AlbumInfo(tracks=["track1", "track2"])
        mapping = [("track1", "track1"), ("track2", "track2"), ("track3", None)]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior

    def test_distance_with_missing_tracks(self):
        items = ["track1", "track2"]
        info = AlbumInfo(tracks=["track1", "track2", "track3"])
        mapping = [("track1", "track1"), ("track2", "track2")]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior

    def test_distance_with_out_of_order_pairs(self):
        items = ["track1", "track2", "track3"]
        info = AlbumInfo(tracks=["track3", "track1", "track2"])
        mapping = [("track1", "track3"), ("track2", "track1"), ("track3", "track2")]
        result = distance(items, info, mapping)
        assert isinstance(result, (int, float)), "Distance should be a number"
        # Add more specific assertions based on expected behavior
  • You may need to adjust the assertions in each test to match the expected output of your distance function.
  • Ensure that AlbumInfo and distance are imported or defined in the test file.
  • If your test class or function structure is different, you may need to adapt the placement of these tests.

"""Return an unicode string representing the changes"""
items = items or self.items
info = info or self.info
mapping = dict(zip(items, info.tracks))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding a test for _show_change with mismatched items and tracks.

Testing with mismatched items and tracks will verify that the UI logic correctly handles irregular input.

Suggested implementation:

def test_show_change_with_mismatched_items_and_tracks():
    # Setup: items and tracks have different lengths
    items = ["item1", "item2", "item3"]
    class Info:
        tracks = ["trackA", "trackB"]  # fewer tracks than items

    info = Info()
    config = {"ui": {}, "import": {}}
    color = "red"

    # Assuming _show_change is a method of a class, e.g., UIHandler
    # If it's a standalone function, adjust accordingly
    handler = UIHandler(items=items, info=info)
    try:
        result = handler._show_change(items=items, info=info, config=config, color=color)
    except Exception as e:
        result = str(e)

    # The test passes if no exception is raised and result is a string (or expected type)
    assert isinstance(result, str)
  • If _show_change is not a public method or not directly accessible, you may need to expose it or test via a public interface.
  • Replace UIHandler with the actual class name if different.
  • Adjust the assertion to match the expected output type or value for your implementation.



def apply_metadata(album_info: AlbumInfo, mapping: Mapping[Item, TrackInfo]):
def apply_metadata(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Low code quality found in apply_metadata - 19% (low-code-quality)


ExplanationThe quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early.
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.

return dist


def distance(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (code-quality): Low code quality found in distance - 16% (low-code-quality)


ExplanationThe quality score for this function is below the quality threshold of 25%.
This score is a combination of the method length, cognitive complexity and working memory.

How can you solve this?

It might be worth refactoring this function to make it shorter and more readable.

  • Reduce the function length by extracting pieces of functionality out into
    their own functions. This is the most important thing you can do - ideally a
    function should be less than 10 lines.
  • Reduce nesting, perhaps by introducing guard clauses to return early.
  • Ensure that variables are tightly scoped, so that code using related concepts
    sits together within the function rather than being scattered.

Comment on lines 436 to 439
if (
self.cur_album != self.match.info.album
and self.match.info.album != VARIOUS_ARTISTS
self.cur_name != self.match.info.name
and self.match.info.name != VARIOUS_ARTISTS
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Combine two compares on same value into a chained compare (chain-compares)

Suggested change
if (
self.cur_album != self.match.info.album
and self.match.info.album != VARIOUS_ARTISTS
self.cur_name != self.match.info.name
and self.match.info.name != VARIOUS_ARTISTS
):
if self.cur_name != self.match.info.name != VARIOUS_ARTISTS:

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the autotag matching system to use list of tuples instead of dictionaries for item-to-track mappings, supporting a more flexible and consistent data structure throughout the codebase. It also consolidates naming between album and track matching to use a unified name property.

Key changes:

  • Changed mapping from dict[Item, TrackInfo] to list[tuple[Item, TrackInfo]] across the codebase
  • Converted AlbumMatch and TrackMatch from NamedTuple to @dataclass
  • Added unified name property to Info, AlbumInfo, and TrackInfo classes to simplify album/track name access
  • Added items property and type class property to Match classes for cleaner data access

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
beets/autotag/hooks.py Converted Match classes to dataclasses; added name property to Info classes and type/items properties to Match classes
beets/autotag/match.py Updated assign_items return type and renamed search_album/search_title parameters to search_name
beets/autotag/init.py Updated apply_metadata signature to use list of tuples
beets/autotag/distance.py Updated distance function signature to use list of tuples
beets/ui/commands.py Refactored ChangeRepresentation classes to dataclasses; updated to use new mapping structure and unified name property
beets/importer/tasks.py Updated to use match.items property instead of accessing mapping keys directly
beetsplug/mbsync.py Changed mapping construction from dict to list of tuples with .append()
test/test_ui.py Updated test to use list instead of dict for mapping
test/test_autotag.py Updated tests to iterate over mapping as list and create list of tuples
test/autotag/test_distance.py Updated test to use list of tuples for mapping

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

self.print_layout(self.indent_detail, left, right)
else:
print_(f"{self.indent_detail}*", "Title:", title_r)
print_(f"{self.indent_detail}*", "Album:", name_r)
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label should use self.match.type instead of the hardcoded string "Album:" to handle both album and singleton track matches correctly. This is inconsistent with line 442 which uses self.match.type for the changed case.

Suggested change
print_(f"{self.indent_detail}*", "Album:", name_r)
print_(f"{self.indent_detail}*", f"{self.match.type}:", name_r)

Copilot uses AI. Check for mistakes.
Comment on lines 666 to 667
# match is an AlbumMatch NamedTuple, mapping is a dict
# Sort the pairs by the track_info index (at index 1 of the NamedTuple)
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is outdated. AlbumMatch is now a dataclass (not NamedTuple) and mapping is a list of tuples (not a dict).

Suggested change
# match is an AlbumMatch NamedTuple, mapping is a dict
# Sort the pairs by the track_info index (at index 1 of the NamedTuple)
# match is an AlbumMatch dataclass; mapping is a list of (item, track_info) tuples
# Sort the pairs by the track_info index (at index 1 of the tuple)

Copilot uses AI. Check for mistakes.
# Sort the pairs by the track_info index (at index 1 of the NamedTuple)
pairs = list(self.match.mapping.items())
pairs.sort(key=lambda item_and_track_info: item_and_track_info[1].index)
pairs = sorted(self.match.mapping, key=lambda pair: pair[1]["index"])
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessing pair[1][\"index\"] treats TrackInfo as a dict, but it should use attribute access: pair[1].index. While TrackInfo extends AttrDict which supports both syntaxes, using attribute access is more consistent with the type annotations and other code (e.g., lines 676, 685 use attribute access).

Suggested change
pairs = sorted(self.match.mapping, key=lambda pair: pair[1]["index"])
pairs = sorted(self.match.mapping, key=lambda pair: pair[1].index)

Copilot uses AI. Check for mistakes.
info: Info

@cached_classproperty
def type(cls) -> str:
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal methods should have 'self', rather than 'cls', as their first parameter.

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Nov 2, 2025

Codecov Report

❌ Patch coverage is 97.05882% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.98%. Comparing base (adc0d9e) to head (95df036).
⚠️ Report is 10 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beetsplug/mbsync.py 50.00% 2 Missing ⚠️
beets/autotag/match.py 88.88% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6142      +/-   ##
==========================================
- Coverage   67.00%   66.98%   -0.02%     
==========================================
  Files         118      119       +1     
  Lines       18180    18284     +104     
  Branches     3079     3097      +18     
==========================================
+ Hits        12181    12248      +67     
- Misses       5340     5370      +30     
- Partials      659      666       +7     
Files with missing lines Coverage Δ
beets/autotag/__init__.py 87.71% <100.00%> (ø)
beets/autotag/distance.py 82.59% <100.00%> (ø)
beets/autotag/hooks.py 100.00% <100.00%> (ø)
beets/importer/tasks.py 90.68% <100.00%> (ø)
beets/ui/commands.py 74.97% <100.00%> (+0.12%) ⬆️
beets/autotag/match.py 78.80% <88.88%> (+0.14%) ⬆️
beetsplug/mbsync.py 81.81% <50.00%> (ø)

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@semohr
Copy link
Contributor

semohr commented Nov 3, 2025

Just had a short look and this will absolutly break beets-flask downstream. We use the mappings dict quite a bit... Can we move this change in particular to a different PR?

I think the other things are mainly no brainers.

I need to think abit about the implications a how to migrate this in beets-flask. Just a bit more time here would be highly appreciated :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants