Lastgenre: New config option keep_existing #4982

Open
wants to merge 44 commits into
base: master

JOJ0
Member

@JOJ0 JOJ0 commented Oct 29, 2023

Description (moved)

Initially this PR included the following fixes which moved to a separate PR:

Description

  • Fix the behavior of the force option. Previously, disabling the option still led to manipulation of existing tags:
    • If content was found, a whitelist check was issued and if valid the plugin exited early and logged ("keep").
    • This whitelist check was not aware of multiple genres (separated typically by a string like , ), thus it failed and erased genres.

This didn't feel like a typical behaviour of a force option, which this PR tries to improve as follows...

  • String-separated multi-genres are now compiled into a list and depending on the whitelist option are kept and enriched with freshly fetched last.fm genres.

  • A lot of refactoring was done, some absolutely required, some as a preparation for future work on the plugin.
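
As a rough illustration of the multi-genre handling described above, here is a minimal sketch. The helper names split_genres and keep_valid are hypothetical and are not the plugin's actual functions; the separator default is likewise an assumption.

```python
def split_genres(genre_field: str, separator: str = ", ") -> list[str]:
    """Split a separator-joined genre tag into a de-duplicated, lowercased list.

    Hypothetical helper for illustration only; not taken from the plugin.
    """
    seen = set()
    result = []
    for part in genre_field.split(separator):
        genre = part.strip().lower()
        # Skip empty fragments and repeats, but preserve the original order.
        if genre and genre not in seen:
            seen.add(genre)
            result.append(genre)
    return result


def keep_valid(genres, whitelist=None):
    """Keep every genre when no whitelist is given, else only whitelisted ones."""
    if whitelist is None:
        return list(genres)
    return [g for g in genres if g in whitelist]
```

With this sketch, split_genres("Filmi, World Music, Desi") yields ['filmi', 'world music', 'desi'], so each genre can be checked on its own instead of the whole string failing a single whitelist lookup.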

Details & Docs

Back in 2023-09 we decided on an additional option named keep_allowed; for details on what we came up with, see #4982 (comment).

My final conclusion is to change that option name to keep_existing, which feels slightly more self-explanatory. I also decided on Setup 3 (see below) as the default because:

  • force always was the plugin's default.
  • with keep_existing enabled, it is likely a pretty common use case.

Setup 1

Overwrite all. Only fresh last.fm genres remain.

force: yes
keep_existing: no

Setup 2

Add new last.fm genres when empty. Present tags stay untouched.

force: no
keep_existing: no

Setup 3 (default)

Add new last.fm genres. Combine genres in present tags with new ones
(depending on the whitelist setting, allowed or any).

force: yes
keep_existing: yes
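
The three setups can be sketched as one small decision function. This is only an illustration under stated assumptions: combine_genres is a hypothetical name, whitelist filtering is left out, and the combination force: no with keep_existing: yes (not listed as a setup here) is treated like Setup 2.

```python
def combine_genres(existing, fetched, force, keep_existing):
    """Return the genre list to write, given present tags and fresh last.fm genres.

    Hypothetical sketch of Setups 1 to 3; not the plugin's implementation.
    """
    if not existing:
        # Empty tag: every setup writes the freshly fetched genres.
        return list(fetched)
    if not force:
        # Setup 2: present tags stay untouched.
        return list(existing)
    if not keep_existing:
        # Setup 1: overwrite all, only fresh genres remain.
        return list(fetched)
    # Setup 3: keep present genres and append new ones without duplicates.
    return list(existing) + [g for g in fetched if g not in existing]
```

For example, with existing ['rock'] and fetched ['jazz', 'rock'], Setup 1 gives ['jazz', 'rock'], Setup 2 gives ['rock'], and Setup 3 gives ['rock', 'jazz'].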

To Do

  • Documentation
  • Changelog.
  • Fix existing tests.
  • Refactor _get_genre tests using pytest.mark.parametrize and add new test-cases.
  • Implement Case 1
  • Implement Case 2
  • Implement Case 3
  • Implement Case 4

@JOJ0 JOJ0 requested a review from sampsyo November 2, 2023 16:27
@JOJ0 JOJ0 marked this pull request as ready for review November 2, 2023 16:27
@JOJ0
Member Author

JOJ0 commented Nov 2, 2023

I'd request a review from you @sampsyo since I think you initially created it. Also @rain0r would be good since 5 years ago they added the -A option. Hi @rain0r , you wanna take a look? :-)

In short: I think I fixed the plugin to now really reflect what's documented. Any nitpicking in my code or functionality-wise is appreciated.

One question already. Here we do not state that a -a/--album option exists: https://beets.readthedocs.io/en/latest/plugins/lastgenre.html#running-manually

When I started out using this plugin, I was confused for a very long time about this option. As far as I understand it now, it doesn't do anything since it is the default. So why keep it? Or is having a -a option that is the default anyway a common thing in beets? I know we have a lot of -a commands, which streamlines usability, and that is a very good thing! Usually they change behaviour to operate on albums rather than items. I'm just not sure about this one... do we have such a pattern anywhere else? So, just leave it? Should I add some words to the docs?

I think the both of you decided these options should look like that around here: #3220 (comment)

JOJ0 added a commit to JOJ0/beets that referenced this pull request Nov 2, 2023
@sampsyo
Member

sampsyo commented Nov 3, 2023

Thanks for the extra context, @JOJ0!

About the existence of -a (the default mode) specifically: it's not too uncommon… for example, the beet import command has several flags that are opposites of each other, one of which is the default. Of course, it's important in that case because the default mode can be set in the config, so the user needs a way to override the default in either direction. That's not the case here, so maybe it at least makes sense to add "(default)" to the -a option's help string, or to remove it altogether?
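
To make the "pair of opposite flags, one of which is the default" pattern concrete, here is a small sketch. It uses Python's standard argparse purely as a stand-in (it is not how beets defines its commands), and the long option name --items as well as the help texts are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser with two opposite flags that share one destination."""
    parser = argparse.ArgumentParser(prog="lastgenre-sketch")
    mode = parser.add_mutually_exclusive_group()
    # -a is the default mode; saying so in the help string avoids confusion.
    mode.add_argument("-a", "--album", dest="album", action="store_true",
                      help="match albums instead of items (default)")
    mode.add_argument("-A", "--items", dest="album", action="store_false",
                      help="match items instead of albums")
    # Parser-level default: album mode unless -A is given.
    parser.set_defaults(album=True)
    return parser
```

Here build_parser().parse_args([]).album is True, and passing -A flips it to False. The -a flag only becomes functionally necessary once the default can be changed elsewhere, such as in a config file.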

Member

@sampsyo sampsyo left a comment

Thanks for the ping!! Here are a couple of straightforward comments.

beetsplug/lastgenre/__init__.py Outdated Show resolved Hide resolved
beetsplug/lastgenre/__init__.py Outdated Show resolved Hide resolved
beetsplug/lastgenre/__init__.py Outdated Show resolved Hide resolved
@JOJ0 JOJ0 marked this pull request as draft November 8, 2023 08:08
@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from 1e81209 to 89ae925 Compare November 16, 2023 12:33
JOJ0 added a commit to JOJ0/beets that referenced this pull request Nov 16, 2023
@JOJ0
Member Author

JOJ0 commented Nov 17, 2023

I'd like to pull out this conversation #4982 (comment) into a new thread, to make it more obvious for others as well. I think it could be a broader discussion of where this plugin should go. Basically we were talking about the current force: no behaviour being weird as well as the new behaviour I am initially proposing with this PR. I gave all this some thought and came up with this idea. Let's discuss it:

So from my point of view, the main problem with the current behaviour when force is disabled, is that it's not really what a user would typically expect. So what could we do to make force: no more predictable?

The following idea would require a new config setting as well as a whole new branch of behaviour (Case 3):

Case 1

force: yes
overwrite all, only fresh last.fm genres remain

Case 2

force: no

keep any string in present genre tag, only write last.fm genres when empty

Case 3

force: yes
keep_allowed: yes

keep present genres when whitelisted and add new last.fm genres (this is a new branch of behaviour and needs to be coded; I think there are open feature requests for it. Update: Something was feature-requested, but it might not be exactly as I'm proposing here: #4750)

Case 4

force: no
keep_allowed: yes

cleanup only - keep present genres when whitelisted but don't add new last.fm genres; Only when genre is empty, add last.fm genres.

That last combination is weird though....but it's what I proposed for force:no before!

Which of these would now make sense to be the new default? The new force: no (Case 2) would be the least invasive IMO...

@sampsyo brainstorming request 🧐

@JOJ0 JOJ0 changed the title Lastgenre: fix track-level handling, fix multi-genre keep, streamline singleton log Lastgenre: Fix track-level handling, multi-genre keep, force behaviour, logging Nov 17, 2023
@JOJ0
Member Author

JOJ0 commented Nov 17, 2023

Some more context / cross-linking:

The initial reason why I got my hands dirty with this plugin was when I realised that comma-separated multi-genres were not recognized: #4751 (comment)

Here @arsaboo requests a feature that goes in the direction of Case 3 above: #4750

@arsaboo
Contributor

arsaboo commented Nov 17, 2023

So, we have two config options - force and keep_allowed, i.e., 4 options in all. Given that, keep_allowed is no in cases 1 and 2. Thus, here's a slightly modified behavior in the 4 cases above:

Case 1: overwrite all, only fresh last.fm genres remain

force: yes
keep_allowed: no

Case 2: Since keep_allowed is no, we only write last.fm genres when empty. There may be incorrect genres in pre-existing tags even after this, as this option is not touching pre-existing tags

force: no
keep_allowed: no

Case 3: keep present genres when whitelisted and add new last.fm genres

force: yes
keep_allowed: yes

Case 4: keep any string in the present genre tag; only write last.fm genres when empty. This will not touch pre-existing genre tags.

force: no
keep_allowed: yes

Thus, Case 4 seems like the best default choice. It does not affect existing genre tags and updates the empty ones. Case 3, on the other hand, is the most useful one (at least for me).

@sampsyo
Member

sampsyo commented Nov 17, 2023

This brainstorming honestly sounds great, y'all. It is indeed really weird that the force: no mode can still update old genres; keeping all nonempty genres seems like it should at least be an option. I feel less specific about what the default should be, but I like your idea about decoupling the two aspects of the behavior (when to override existing, nonempty data and what to do to old data) into two different options.

@JOJ0
Member Author

JOJ0 commented Nov 18, 2023

Ähem I might be slow or too tired already. Which of those 4 cases are now different from my proposal @arsaboo ? Sorry I must have missed it! Help! :-)

@arsaboo
Contributor

arsaboo commented Nov 18, 2023

Not different....just a little more explicit about the force and keep_allowed config options. I think we have an agreement about the options.

@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from fb9f58d to c12b26b Compare September 17, 2024 16:34
@JOJ0 JOJ0 marked this pull request as ready for review September 17, 2024 16:38

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@JOJ0 JOJ0 marked this pull request as draft September 17, 2024 16:39
@JOJ0
Member Author

JOJ0 commented Sep 17, 2024

Hi @arsaboo! I finally managed to find time to almost finish this PR. The general behaviour and docs of the new config option combinations are finished. If you want to, an "early" review would be super helpful. Since it has probably been a long time for you as well, it would be interesting to hear what you think when you read through the docs. Is it 100% clear what the force/keep_allowed options do? And, only if you have the time, some playing around to check that it really works that way would be great. Thanks a ton!

@arsaboo
Contributor

arsaboo commented Sep 17, 2024

@JOJ0 this is AWESOME 🎉🎉

The docs look reasonably clear. I will play with this. The debug logs are great to see what is going on.

@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from 796a3bf to a56098f Compare October 31, 2024 14:47
@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch 2 times, most recently from 217aa33 to 8138708 Compare January 2, 2025 10:17
JOJ0 added 3 commits January 8, 2025 18:10
- Adapt tests to _resolve_genres returning a list with not yet formatted genres.
- Rename and adapt test_count -> test_to_delimited_string. Note that the
  new function does not apply whitelist, prefer anything. It just cuts
  to count and formats!
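
To illustrate what "just cuts to count and formats" could mean in practice, here is a minimal sketch. The signature and the title_case parameter are assumptions for illustration; only the name to_delimited_string and the count and separator settings come from this thread.

```python
def to_delimited_string(genres, count, separator=", ", title_case=True):
    """Cut a genre list to `count` entries and join it into one tag string.

    Sketch only: no whitelist check and no reordering happens here.
    """
    trimmed = genres[:count]
    if title_case:
        # "world music" becomes "World Music".
        trimmed = [g.title() for g in trimmed]
    return separator.join(trimmed)
```

Called with ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian'] and count 5, this returns 'Filmi, World Music, Desi, Bollywood, Hindi'.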
@JOJ0 JOJ0 force-pushed the lastgenre_fixes branch from 375b4a1 to d8ce25a Compare January 8, 2025 17:23
@JOJ0
Member Author

JOJ0 commented Jan 8, 2025

I think it is because of the multiple artists being queried:

album, get_album, ('Arijit Singh, Neelesh Mishra, Raftaar', 'Pagglait (Original Motion Picture Soundtrack)')

The artist for the above album is Arijit Singh on LastFM. The two additional artists are causing the issue.

You are absolutely right, this is the reason. We should note this and handle it in a future PR. Maybe note it in a new issue with a possible solution idea already: We have a multi-artist field in beets already: artists, it could be used without any string parsing to fetch a single artist.
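
A possible shape for that idea, as a sketch only: primary_artist is a hypothetical helper, a plain dict stands in for a beets item, and the fallback order is an assumption rather than anything decided in this PR.

```python
def primary_artist(item):
    """Pick a single artist name for a last.fm lookup.

    Prefers the first entry of the multi-valued `artists` field and falls
    back to the plain artist strings when that list is missing or empty.
    """
    artists = item.get("artists") or []
    if artists:
        return artists[0]
    return item.get("albumartist") or item.get("artist") or ""
```

For an item whose artists list is ['Arijit Singh', 'Neelesh Mishra', 'Raftaar'], this returns 'Arijit Singh' without parsing the combined artist string.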

When I modified the artist for one of the album to match the artist on LastFM, I saw:

lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _resolve_genres received: []
lastgenre: fetch_genre returns (whitelist checked): None
lastgenre: _last_lookup returns: None
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']

Exactly, only because of the artist "Pritam", genres could be found.

@JOJ0
Member Author

JOJ0 commented Jan 8, 2025

@arsaboo I think final testing is in order. If I haven't forgotten anything (except a changelog), I think this is finally done. Some testing with all the different possible (and impossible) configuration combinations lastgenre offers would be great (whitelist off/on, canonical off/on, prefer_specific, count, and so on). If you find the time, it would be super helpful! :-)

@snejus thanks to the pytest.mark.parametrize crash-course you gave me lately I managed to refactor the get_genre tests and added a lot of test-cases. This would be interesting to review in particular.

Note that I left a lot of debug logging in place for now, to make final testing easier. Those lines will be removed before merge, but if we find something particularly useful we could keep it in the final version.

@JOJ0 JOJ0 marked this pull request as ready for review January 8, 2025 17:59

github-actions bot commented Jan 8, 2025

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@arsaboo
Contributor

arsaboo commented Jan 8, 2025

It is working fine for the most part. Here's one that is not working fine. With whitelist, all_genres, and canonical disabled,

lastgenre:
    auto: no
    source: album
    count: 5
    separator: '\?'
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no

I am not sure why it still checks against the whitelist.

$ beet -v lastgenre album:"Rocky Aur Rani Kii Prem Kahaani"
user configuration: /home/arsaboo/.config/beets/config.yaml
data directory: /home/arsaboo/.config/beets
Sending event: pluginload
library database: /home/arsaboo/.config/beets/musiclibrary.blb
library directory: /data/music
Sending event: library_opened
Parsed query: AndQuery([SubstringQuery('album', 'Rocky Aur Rani Kii Prem Kahaani', fast=True)])
Parsed sort: NullSort()
lastgenre: Genre 'Filmi' allowed. FOUND in whitelist.
lastgenre: Genre 'World Music' allowed. FOUND in whitelist.
lastgenre: Genre 'Desi' allowed. FOUND in whitelist.
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india']
lastgenre: Genre 'filmi' allowed. FOUND in whitelist.
lastgenre: Genre 'world music' allowed. FOUND in whitelist.
lastgenre: Genre 'desi' allowed. FOUND in whitelist.
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['filmi', 'world music', 'desi']
lastgenre: Reducing ['filmi', 'world music', 'desi'] to configured count 5
lastgenre: Reduced and formatted tags to Filmi, World Music, Desi
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (keep + artist): Filmi, World Music, Desi

@JOJ0
Member Author

JOJ0 commented Jan 8, 2025

It is working fine for the most part. Here's one that is not working fine. With whitelist, all_genres, and canonical disabled,

lastgenre:
    auto: no
    source: album
    count: 5
    separator: '\?'
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no

I am not sure why it still checks against the whitelist.

Have a look at the defaults: https://github.com/beetbox/beets/pull/4982/files#diff-f85204832b1e3e76cf854983f1f1773be9f7857d94656cbb3a0f2a0ae6431140L99

and note that an empty string "" still will enable the built-in whitelist (same as True/on). To really disable it set whitelist: False (or "no")

Also note that the prefer_specific option uses the default canonical tree (genres-tree.txt) even if you have canonical disabled: it will load the default and use it (for sorting by specificity).

This all is expected behaviour (even unittested) and existed before this PR!
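
The whitelist rule described above (an empty string or True selects the built-in list, False disables it) can be sketched like this. The function name and the default_path placeholder are hypothetical, and treating every other falsy value as "off" is an assumption.

```python
def resolve_whitelist_path(setting, default_path="genres.txt"):
    """Map the `whitelist` config value to a file path, or None when disabled.

    Illustrative sketch; `default_path` is a placeholder for the built-in list.
    """
    if setting is True or setting == "":
        # Both mean "use the built-in whitelist".
        return default_path
    if not setting:
        # False (YAML `no`) turns the whitelist off.
        return None
    # Anything else is taken as a user-supplied file path.
    return str(setting)
```

So commenting out a custom whitelist path is not enough to switch filtering off: the option then falls back to its default value, and only an explicit False disables it.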

@arsaboo
Contributor

arsaboo commented Jan 8, 2025

hmm...even with whitelist: False

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: False
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no
lastgenre: Genre 'Filmi' allowed. FOUND in whitelist.
lastgenre: Genre 'World Music' allowed. FOUND in whitelist.
lastgenre: Genre 'Desi' allowed. FOUND in whitelist.
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india']
lastgenre: Genre 'filmi' allowed. FOUND in whitelist.
lastgenre: Genre 'world music' allowed. FOUND in whitelist.
lastgenre: Genre 'desi' allowed. FOUND in whitelist.
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['filmi', 'world music', 'desi']
lastgenre: Reducing ['filmi', 'world music', 'desi'] to configured count 5
lastgenre: Reduced and formatted tags to Filmi, World Music, Desi
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (keep + artist): Filmi, World Music, Desi

@JOJ0
Member Author

JOJ0 commented Jan 8, 2025

Interesting. I added a debug message in 2025a56

Could you try the same again?

@arsaboo
Contributor

arsaboo commented Jan 8, 2025

Ok...it is working 🎉🎉

Too embarrassed to admit that I had lastgenre config twice in my config 🙈.

lastgenre: The whitelist config setting is 'False'
lastgenre: The self.whitelist property after file parsing is 'set()'
Sending event: pluginload
library database: /home/arsaboo/.config/beets/musiclibrary.blb
library directory: /data/music
Sending event: library_opened
Parsed query: AndQuery([SubstringQuery('album', 'Rocky Aur Rani Kii Prem Kahaani', fast=True)])
Parsed sort: NullSort()
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india']
lastgenre: Genre 'filmi' allowed. Whitelist OFF.
lastgenre: Genre 'world music' allowed. Whitelist OFF.
lastgenre: Genre 'desi' allowed. Whitelist OFF.
lastgenre: Genre 'bollywood' allowed. Whitelist OFF.
lastgenre: Genre 'hindi' allowed. Whitelist OFF.
lastgenre: Genre 'indian' allowed. Whitelist OFF.
lastgenre: Genre 'pritam' allowed. Whitelist OFF.
lastgenre: Genre 'soundtrack' allowed. Whitelist OFF.
lastgenre: Genre 'world' allowed. Whitelist OFF.
lastgenre: Genre 'india' allowed. Whitelist OFF.
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india']
lastgenre: Reducing ['filmi', 'world music', 'desi', 'bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india'] to configured count 5
lastgenre: Reduced and formatted tags to Filmi, World Music, Desi, Bollywood, Hindi
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (keep + artist): Filmi, World Music, Desi, Bollywood, Hindi

I tried various combinations of keep_existing and force, and they work as expected.

In terms of debug logs, we should at least log:

  1. Genre returned by LastFm before whitelisting so users can update their whitelist if required.
  2. Genre filtered after whitelist and canonicalization
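
The two log points listed above could look roughly like this. It is a sketch with Python's standard logging module; the logger name, function name, and message wording are assumptions, and canonicalization is not modelled. An empty whitelist is treated as "off", matching the set() shown in the debug output above.

```python
import logging

log = logging.getLogger("lastgenre-sketch")


def filter_and_log(fetched, whitelist):
    """Filter fetched tags by whitelist, logging the state before and after."""
    # 1. What last.fm returned, so users can extend their whitelist if needed.
    log.debug("last.fm tags before whitelist: %s", fetched)
    if not whitelist:
        # Empty whitelist means filtering is off: everything is allowed.
        kept = list(fetched)
    else:
        kept = [tag for tag in fetched if tag in whitelist]
    # 2. What remains after filtering.
    log.debug("tags after whitelist filter: %s", kept)
    return kept
```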

Would love to see this merged ASAP so that we can work on improving the search.

@JOJ0
Member Author

JOJ0 commented Jan 8, 2025

Thanks! Very helpful and no problem :-) You tried the prefer_specific option too?

I agree with logging 1

2 is actually already logged in the current version and will improve in #5582

@arsaboo
Contributor

arsaboo commented Jan 8, 2025

I am unsure if this album is a good test for prefer_specific, but here are the results anyway.

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: no
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no
    prefer_specific: True
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): []
lastgenre: Reducing [] to configured count 5
lastgenre: Reduced and formatted tags to
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist):
lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: no
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no
    prefer_specific: False
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: Genre 'bollywood' allowed. Whitelist OFF.
lastgenre: Genre 'hindi' allowed. Whitelist OFF.
lastgenre: Genre 'indian' allowed. Whitelist OFF.
lastgenre: Genre 'pritam' allowed. Whitelist OFF.
lastgenre: Genre 'soundtrack' allowed. Whitelist OFF.
lastgenre: Genre 'world' allowed. Whitelist OFF.
lastgenre: Genre 'india' allowed. Whitelist OFF.
lastgenre: Genre 'world music' allowed. Whitelist OFF.
lastgenre: Genre 'desi' allowed. Whitelist OFF.
lastgenre: _resolve_genres (canonicalized and whitelist filtered): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: Reducing ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi'] to configured count 5
lastgenre: Reduced and formatted tags to Bollywood,Hindi,Indian,Pritam,Soundtrack
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist): Bollywood,Hindi,Indian,Pritam,Soundtrack

I am not sure if this album is a good test for prefer specific.

@JOJ0
Member Author

JOJ0 commented Jan 9, 2025

I am unsure if this album is a good test for prefer_specific, but here are the results anyway.

It is a difficult one yes, but that makes it an even more interesting one to see how the plugin handles it.

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: no
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no
    prefer_specific: True
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres (canonicalized and whitelist filtered): []
lastgenre: Reducing [] to configured count 5
lastgenre: Reduced and formatted tags to
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist):

First, a side note: my debug messages are not 100% accurate anymore. For example, _resolve_genres (canonicalized and whitelist filtered) rather means that the function may have whitelist-filtered, depending on the config.

Anyway, what I see here is that even though we found genres:

lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']

they are kicked out in the next step:

lastgenre: _resolve_genres (canonicalized and whitelist filtered): []

You have the whitelist off. One would assume this shouldn't happen.......

but you also have prefer_specific on,

which relies on the canonicalization tree and enables it, even though you have set canonical: no (the default).

Now I assume that this "kick out" happens because the prefer_specific sorting function relies on the canonicalization tree, where the found genres might not have been found.

I'm not 100% sure yet if this can be expected but from what I checked, the default genres-tree.txt does not have any of them. (Not even bollywood, soundtrack or world music - which I find odd and think generally should be there in our default -> maybe expand that in the future with some international basics?)
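
To show how such a "kick out" could come about, here is a sketch of a depth-based sort in which tags missing from the tree get no depth and are dropped. This is an assumed mechanism for illustration, not the plugin's code; find_depth, sort_by_depth, and the nested-dict tree layout are hypothetical.

```python
def find_depth(tree, tag, depth=0):
    """Return the depth of `tag` in a nested-dict tree, or None if absent."""
    for name, children in tree.items():
        if name == tag:
            return depth
        found = find_depth(children, tag, depth + 1)
        if found is not None:
            return found
    return None


def sort_by_depth(tags, tree):
    """Sort tags from most to least specific, dropping tags without a depth."""
    pairs = [(find_depth(tree, tag), tag) for tag in tags]
    known = [(d, t) for d, t in pairs if d is not None]
    # Deeper nodes are more specific, so they come first.
    known.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for _, tag in known]
```

With a tree such as {'rock': {'punk': {'hardcore punk': {}}}}, the tags ['rock', 'hardcore punk', 'punk'] sort to ['hardcore punk', 'punk', 'rock'], while ['bollywood', 'hindi'] comes back as an empty list because neither tag has a depth.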

Anyway, here is what would now be super interesting: what happens if you put in place a genres-tree.txt where you have a tree with (at least some of) these genres?

Thanks so far, very helpful!!!

@JOJ0
Member Author

JOJ0 commented Jan 9, 2025

Some debug messages to better understand what prefer_specific does here: d8fea77

@arsaboo
Contributor

arsaboo commented Jan 9, 2025

With canonical enabled and a genres tree that includes some of the genres returned by last.fm:

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: no
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no
    prefer_specific: True
Parsed query: AndQuery([SubstringQuery('album', 'Rocky Aur Rani Kii Prem Kahaani', fast=True)])
Parsed sort: NullSort()
lastgenre: _last_lookup receives: album, get_album, ('Pritam', 'Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)')
lastgenre: _tags_for result is: []
lastgenre: fetch_genre returns (whitelist checked): []
lastgenre: _last_lookup returns: []
lastgenre: _last_lookup receives: artist, get_artist, ('Pritam',)
lastgenre: _tags_for result is: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: fetch_genre returns (whitelist checked): ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _last_lookup returns: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _combine got type new_genres: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _resolve_genres received: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _get_depth returns depth:None
lastgenre: _sort_by_depth sorted depth_tag_paris: []
lastgenre: _resolve_genres (canonicalized and whitelist filtered): []
lastgenre: Reducing [] to configured count 5
lastgenre: Reduced and formatted tags to
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist):

Let me know if you want me to test anything else.

@JOJ0
Member Author

JOJ0 commented Jan 9, 2025

canonical: ~/.config/beets/genres/genres-trees.yaml

That doesn't seem to have changed anything compared to your run without a canonical file.

How are the genres we are trying to handle structured in the file? Maybe post a snippet showing how you put these into your genres-tree.txt:

['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']

Something else you could try: does this work on the master branch with your particular config?

@arsaboo
Contributor

arsaboo commented Jan 9, 2025

Here's part of the file:

- filmi
    - feature film soundtrack
    - indian playback
    - bollywood
    - indian film music composer
    - indian film music composers
    - soundtrack
    - film music
    - bolly
    - bollywood disco
    - bollywood film
    - bollywood funk
    - bollywood hindi
    - bollywood indian
    - bollywood legend
    - bollywood movies
    - bollywood music
    - bollywood songs
    - bollywood sound
    - bollywood soundtrack
    - bollywood soundtracks
    - stage & screen
    - soundtrack
    - modern bollywood
    - desi

@arsaboo
Contributor

arsaboo commented Jan 9, 2025

In any case, it doesn't look like you changed anything related to prefer_specific in this PR. If necessary, we can try to fix that in a separate PR.

JOJ0 added 3 commits January 9, 2025 22:21
The best place to log what we actually fetched from last.fm seems to be
here in _combine_and_label_genres. Leave out the existing genres we also
receive in this function - less is more.
@JOJ0
Member Author

JOJ0 commented Jan 9, 2025

In any case, it doesn't look like you changed anything related to prefer_specific in this PR. If necessary, we can try to fix that in a separate PR.

Well, actually yes: I tore apart _resolve_genres(), which did all the work at once (now several functions do that, and _resolve_genres is a little smaller). Anyhow, I did some testing myself and my tests went well. Along the way I fixed an issue in the "magic" for various-artists compilations that tries to pick a genre by figuring out the most popular track genre. I'm happy with that now, too.

I now also have an assumption about why your canonical file doesn't work with prefer_specific: _get_depth always returns None because "Filmi" is not a genre last.fm fetched.

You could do a final test and instead of "Filmi" use something last.fm definitely fetches for the "main genre". For example "world music". That should work, I think...

...AND everything must also be in the whitelist, of course.
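
To make the depth discussion easier to follow, here is a minimal sketch of the general mechanism only. It is not the plugin's code: `flatten_tree`, `get_depth` and `sort_by_depth` are hypothetical stand-ins, and the tree is a toy example with an assumed parent-to-children structure. It shows that a tag which cannot be found in the tree gets depth None and drops out of the depth-sorted result, so the result is empty if none of the fetched tags is found.

```python
# Hypothetical sketch, NOT the lastgenre implementation.

def flatten_tree(node, path, branches):
    """Collect every root-to-node path of a nested dict/list tree."""
    if isinstance(node, dict):
        for parent, children in node.items():
            branches.append(path + [parent])
            flatten_tree(children, path + [parent], branches)
    elif isinstance(node, list):
        for child in node:
            flatten_tree(child, path, branches)
    else:
        branches.append(path + [str(node)])


def get_depth(tag, branches):
    """Return how deep `tag` sits in the tree, or None if it is absent."""
    for branch in branches:
        if tag in branch:
            return branch.index(tag)
    return None


def sort_by_depth(tags, branches):
    """Deepest (most specific) tags first; tags outside the tree are dropped."""
    depth_tag_pairs = [(get_depth(t, branches), t) for t in tags]
    known = [(d, t) for d, t in depth_tag_pairs if d is not None]
    known.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in known]


# Toy tree (assumption): a parent genre maps to a list of its children.
tree = [{"world music": [{"filmi": ["bollywood", "desi"]}]}]
branches = []
flatten_tree(tree, [], branches)

fetched = ["bollywood", "hindi", "world music", "desi"]
print(sort_by_depth(fetched, branches))
print(sort_by_depth(["hindi", "pritam"], branches))
```

In this toy setup "hindi" and "pritam" are not in the tree, so the second call yields an empty list, similar in shape to the empty `_sort_by_depth` result in the debug log.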

@arsaboo
Contributor

arsaboo commented Jan 9, 2025

@JOJ0 Looks like something changed in the last couple of commits. With the following config:

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: ~/.config/beets/genres/genres.txt
    all_genres: ~/.config/beets/genres/all_genres.txt
    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no

I am getting:

Parsed query: AndQuery([SubstringQuery('album', 'Rocky Aur Rani Kii Prem Kahaani', fast=True)])
Parsed sort: NullSort()
lastgenre: fetched last.fm tags: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist):

Here's part of my genre-trees file:

- filmi
    - modern bollywood
    - desi

desi is not part of my whitelisted genres. I would have expected it to be canonicalized to filmi.
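
The expectation can be written down as a small sketch. This is an illustration under assumptions, not the plugin's implementation: `find_parents` and `canonicalize` are hypothetical helpers, and the branches are hand-written as if the tree had been parsed with desi as a child of filmi.

```python
# Hypothetical sketch, NOT the lastgenre implementation.

def find_parents(tag, branches):
    """Return [tag, parent, grandparent, ...], or [tag] if the tag is unknown."""
    for branch in branches:
        if tag in branch:
            idx = branch.index(tag)
            return list(reversed(branch[: idx + 1]))
    return [tag]


def canonicalize(tag, branches, whitelist):
    """Return the tag itself or its closest whitelisted ancestor, else None."""
    for candidate in find_parents(tag, branches):
        if candidate in whitelist:
            return candidate
    return None  # nothing on the path to the root is whitelisted


# Toy data (assumption): "desi" and "modern bollywood" sit below "filmi".
branches = [["filmi", "modern bollywood"], ["filmi", "desi"]]
whitelist = {"filmi"}

print(canonicalize("desi", branches, whitelist))
print(canonicalize("hindi", branches, whitelist))
```

Under these assumptions desi resolves to filmi, while a tag that is neither whitelisted nor in the tree (hindi here) resolves to nothing.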

With the exact same commit and the config I had earlier,

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ','
    whitelist: False
#    whitelist: ~/.config/beets/genres/genres.txt
#    all_genres: ~/.config/beets/genres/all_genres.txt
#    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: no

I am getting different results now:

Parsed query: AndQuery([SubstringQuery('album', 'Rocky Aur Rani Kii Prem Kahaani', fast=True)])
Parsed sort: NullSort()
lastgenre: fetched last.fm tags: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (artist): Bollywood,Hindi,Indian,Pritam,Soundtrack
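
For reference, the final string in this log is consistent with simply taking the first `count` fetched tags, title-casing them and joining them with `separator`. The sketch below reproduces that; `format_genres` is a hypothetical helper, and the title-casing step is inferred from the log output rather than taken from the plugin code.

```python
# Hypothetical sketch, NOT the lastgenre implementation.

def format_genres(tags, count, separator):
    """Keep the first `count` tags, title-case them, and join them."""
    return separator.join(tag.title() for tag in tags[:count])


fetched = ["bollywood", "hindi", "indian", "pritam", "soundtrack",
           "world", "india", "world music", "desi"]

print(format_genres(fetched, count=5, separator=","))
# Bollywood,Hindi,Indian,Pritam,Soundtrack
```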

With:

lastgenre:
    auto: no
    source: album
    count: 5
    separator: ', '
    whitelist: ~/.config/beets/genres/genres.txt
    all_genres: ~/.config/beets/genres/all_genres.txt
    canonical: ~/.config/beets/genres/genres-trees.yaml
    force: yes
    keep_existing: yes

I get:

lastgenre: fetched last.fm tags: ['bollywood', 'hindi', 'indian', 'pritam', 'soundtrack', 'world', 'india', 'world music', 'desi']
lastgenre: genre for album "Rocky Aur Rani Kii Prem Kahaani (Original Motion Picture Soundtrack)" (keep + artist): Filmi
