Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pathlib in tests and add symlink test #84

Merged
merged 3 commits into from Apr 17, 2024
Merged

Conversation

geigerzaehler
Copy link
Owner

No description provided.

@geigerzaehler geigerzaehler marked this pull request as ready for review April 5, 2024 11:39
Copy link
Collaborator

@wisp3rwind wisp3rwind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a disclaimer, the handling of filesystem encoding and locale belongs to the "dark magic" parts of beets that I don't really understand in depth.


In principle, it would be great to just have pathlib everywhere, but the benefits of starting to add it here are unclear to me. Could you comment a bit on your motivation to refactor this now? If I remember correctly, there was a project to move beets to pathlib, but that never went anywhere.

My impression is that this might introduce extra friction at the interface between beets and the plugin, given that beets still (tries to) consistently represents everything as bytes internally. I've marked a few places where I'm not sure that the expected types are being used.

I didn't review this PR in full; and I don't think I have the time to do so: That would require going back to the beets code quite a lot and checking my expectations about it.

If I were to make this change myself, I think I'd try to add typings to as much of beets and alternatives as possible to have the support of the type checker in getting it right.

test/cli_test.py Outdated
@@ -506,7 +493,7 @@ def test_convert_and_embed(self):
self.config["convert"]["embed"] = True

album = self.add_album(myexternal="true", format="m4a")
album.artpath = self.IMAGE_FIXTURE1
album.artpath = str(self.IMAGE_FIXTURE1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure beets expects bytes here.

@@ -518,14 +505,14 @@ def test_convert_write_tags(self):

# We "convert" by copying the file. Setting the title simulates
# a badly behaved converter
mediafile_converted = MediaFile(syspath(item.path))
mediafile_converted = MediaFile(item.path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

item.path should be the beets-internal representation (i.e. bytes), right? Then, from my understanding, the syspath is required as that represenation is not the correct one to pass to OS APIs on Windows.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t know enough about MediaFile to understand what’s going on. But this is test code and it works on both platforms on CI some I’m ok with just using it as is.

mediafile_converted.title = "WRONG"
mediafile_converted.save()

self.runcli("alt", "update", "myexternal")
item.load()

alt_mediafile = MediaFile(syspath(self.get_path(item)))
alt_mediafile = MediaFile(self.get_path(item))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In contrast to the above, this passes a pathlib.Path to Mediafile.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I said above, this is test code and works on both platforms so it’s fine

@geigerzaehler
Copy link
Owner Author

Could you comment a bit on your motivation to refactor this now?

Of course. I’ve been bugged by finding the cause for #74 and my suspicion is that it comes from beets’s path handling. In addition to that, I feel that beets’s path handling is hard to understand, difficult to use and a lot less readable than pathlib.

My impression is that this might introduce extra friction at the interface between beets and the plugin, given that beets still (tries to) consistently represents everything as bytes internally.

I understand your concern. But in my experience the main friction comes from how beets handles paths and how hard that is to understand and predict. For example, you said that beets “tries to“ use bytes for paths but that is not actually the case on Windows where syspath returns a str. On all other platforms it just passes through the argument, so it is up to the user to ensure the correct type.

This is also why improved type hints don’t solve the underlying problem. (The would help improve understanding in a lot of other areas, though.)

I didn't review this PR in full; and I don't think I have the time to do so: That would require going back to the beets code quite a lot and checking my expectations about it.

That’s ok, thanks for your effort anyway. Since this only touches test code (for now) I think it’s ok to play a bit loose with things like the type passed to Mediafile since it just works™.

@geigerzaehler geigerzaehler force-pushed the pathlib-tests branch 2 times, most recently from a69b59c to 9ca498d Compare April 11, 2024 09:27
@wisp3rwind
Copy link
Collaborator

Could you comment a bit on your motivation to refactor this now?

Of course. I’ve been bugged by finding the cause for #74 and my suspicion is that it comes from beets’s path handling. In addition to that, I feel that beets’s path handling is hard to understand, difficult to use and a lot less readable than pathlib.

It definitely is, and I don't think there's any documentation on how it's supposed to work. However, I'm not sure whether pathlib would really resolve all issues on its own, or whether it would just shift input validation/normalization to different places in the code.

My impression is that this might introduce extra friction at the interface between beets and the plugin, given that beets still (tries to) consistently represents everything as bytes internally.

I understand your concern. But in my experience the main friction comes from how beets handles paths and how hard that is to understand and predict. For example, you said that beets “tries to“ use bytes for paths but that is not actually the case on Windows where syspath returns a str. On all other platforms it just passes through the argument, so it is up to the user to ensure the correct type.

It should be the case on Windows as well: From my understanding, the intended usage is

  • any input from OS APIs -> bytestringpath -> internal representation as bytes (no-op on *nix, conversion on Windows)
  • internal representation -> syspath -> OS APIs and third-party libraries (no-op on *nix, conversion on Windows. There's a bit of defensive programming here when passing in str, but that should really be considered a programming error.)
  • internal representation -> py3_path -> APIs that only accept str, even on Linux where the native path type is bytes
  • displayable_path takes the internal bytes representation and returns a string that can safely be printed (but this might be lossy)
  • the DB stores BLOBs, i.e. bytes

Thus, overall, except for input and at API boundaries, any path handled by beets should always be represented as bytes. I'm pretty sure that there are still some bugs lurking in beets, definitely in some of the vendored plugins and possibly even in the core.

This is also why improved type hints don’t solve the underlying problem. (The would help improve understanding in a lot of other areas, though.)

One idea that I mentioned at some point is using newtypes for InternalPath, SysPath (which would have a platform-dependent inner type) and thereby help verification by the type checker. But that on its own would already by a rather big project in beets.

I didn't review this PR in full; and I don't think I have the time to do so: That would require going back to the beets code quite a lot and checking my expectations about it.

That’s ok, thanks for your effort anyway. Since this only touches test code (for now) I think it’s ok to play a bit loose with things like the type passed to Mediafile since it just works™.

I'm still not fully convinced, since this means that the testing code might end up using different types/code paths than the real usage in beets.

That said and given that I'm not going to review this in more depth, please ignore my comments and go with your judgement if you're convinced that this should be done!

@geigerzaehler
Copy link
Owner Author

Thanks for the reply. I appreciate your input on this.

I’ve reflected on the change again but I think the main argument is still very strong: There’s a standard, pythonic way to deal with paths robustly and not care about the OS. That way is different from beets’s but I trust the former more and I believe it is the way forward. I’m willing to deal with the friction that comes from using both.

With that in mind, I’ll go ahead and merge this change. I’ll have another PR ready soon that also switches to pathlib in the actual plugin implementation.

@geigerzaehler geigerzaehler merged commit e5d8dd2 into main Apr 17, 2024
9 of 11 checks passed
@geigerzaehler geigerzaehler deleted the pathlib-tests branch April 17, 2024 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants