Polish support for ISO9660 file format #264

Open
2 of 7 tasks
blackwind opened this issue Feb 3, 2023 · 23 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed)

Comments

@blackwind

blackwind commented Feb 3, 2023

  • Support extracting files larger than 2^32 bytes
  • Don't truncate extracted filenames
  • Preserve casing for extracted filenames
  • Apply timestamp metadata to extracted files
  • Extract to root _unpackerred folder instead of creating a subfolder based on ISO's filename
  • Delete ISO with other archives after extraction if the associated option is enabled
  • Confirm exotic ISO formats can be extracted (UDF 1.02 seems to be most common)
$ ls -l extracted-by-unpackerr/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 ARTBOOK
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 GUIDE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 MANUAL
drwxr-xr-x 4 docker everyone       4096 2023-02-01 21:43 OST
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 POSTER
-rw-r--r-- 1 docker everyone  243729046 2023-02-01 21:44 SETUP~01.BIN
-rw-r--r-- 1 docker everyone 4294081022 2023-02-01 21:44 SETUP_X-.BIN
-rw-r--r-- 1 docker everyone     896112 2023-02-01 21:44 SETUP_X-.EXE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 WALLPAPE

$ ls -l extracted-by-winrar/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 artbook
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 guide
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 manual
drwxr-xr-x 4 docker everyone       4096 2022-12-16 08:20 ost
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 poster
-rw-r--r-- 1 docker everyone 4294081022 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-1.bin
-rw-r--r-- 1 docker everyone  243729046 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-2.bin
-rw-r--r-- 1 docker everyone     896112 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327).exe
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 wallpapers
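
For the "apply timestamp metadata" task in the list above, a minimal Go sketch follows. It assumes the extractor has access to the raw 7-byte "Recording Date and Time" field of each directory record (ECMA-119 9.1.5); `parseRecordingTime` and `applyTimestamp` are hypothetical helper names, not part of Unpackerr or of any ISO library.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// parseRecordingTime decodes the 7-byte recording date of an ISO 9660
// directory record: years since 1900, month, day, hour, minute, second,
// and a signed offset from GMT in 15-minute intervals.
func parseRecordingTime(b [7]byte) time.Time {
	offsetSeconds := int(int8(b[6])) * 15 * 60
	zone := time.FixedZone("", offsetSeconds)
	return time.Date(1900+int(b[0]), time.Month(b[1]), int(b[2]),
		int(b[3]), int(b[4]), int(b[5]), 0, zone)
}

// applyTimestamp sets the access and modification times of an extracted file.
func applyTimestamp(path string, recorded time.Time) error {
	return os.Chtimes(path, recorded, recorded)
}

func main() {
	// Illustrative bytes for 2022-12-16 08:20:00 with a zero GMT offset.
	raw := [7]byte{122, 12, 16, 8, 20, 0, 0}
	fmt.Println(parseRecordingTime(raw).UTC())
}
```

Calling `applyTimestamp` after each file is written (and after a directory's contents are complete) would keep the recorded dates instead of the extraction time. Rock Ridge images may carry more precise timestamps in their own entries, which this sketch does not read.
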
@davidnewhall
Collaborator

Thank you! I have a few points to make, but I'm super busy and will catch up on this soon!

@davidnewhall
Collaborator

Found this in someone else's log.

unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

davidnewhall added the bug and help wanted labels Feb 13, 2023
@davidnewhall
Collaborator

davidnewhall commented Apr 28, 2023

Man, this is a rough one. I thought, back when you opened this issue, I found another ISO library for Go (seo: golang). Today, I'm only finding 3 libraries, and I seem to be using the 'best' one. None of them support Joliet file extensions, which means file names over 32 characters are out. The bugs you've run into seem to be directly in this library. I don't believe I can fix them myself. I'm also afraid the 4 GB file limitation is built into the library, but I think it may be inadvertently used for extractions when it should be used for compression. Not entirely sure yet.

Question for ya @blackwind .. if I give you a spot to upload, can you send me an ISO file or two that didn't work? I'll try to engage with @kdomanski once I have a reproducible example to share with him.

This is the library I'm using now:

These are the other two I found:

EDIT: Found more that are 2+ years old:

If anyone finds a good ISO9660 library for Go.. lemme know.

@blackwind
Author

Proper support for these files will be a huge time-saver for me, so absolutely, I'm happy to help in any way I can. The one I used in my log (X-Blades_HD-DINOByTES) is a good example of all mentioned issues and is available in the obvious places, but I'll do the legwork if you need me to for whatever reason.

@davidnewhall
Collaborator

I'll try that file with a few of these libraries. Will see if anything can extract it.

@kdomanski

> Found this in someone else's log.
>
> unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

That's a UDF descriptor.
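
For context, a minimal Go sketch of the check behind that error. Volume descriptors start at sector 16 (byte offset 32768) in 2048-byte sectors, and bytes 1 to 5 of each descriptor hold a standard identifier: `CD001` for ISO 9660, `BEA01` (followed by `NSR02` or `NSR03`) for a UDF volume recognition sequence. The function and helper names are hypothetical and this is not the library's actual code; it also looks only at the first descriptor, so a bridge image that starts with `CD001` and carries UDF later would be reported as ISO 9660.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const (
	sectorSize  = 2048
	firstSector = 16 // sectors 0-15 are the system area
)

// identifyVolume classifies an image by the identifier of its first descriptor.
func identifyVolume(r io.ReaderAt) (string, error) {
	id := make([]byte, 5)
	// Byte 0 of a descriptor is its type; bytes 1-5 are the identifier.
	if _, err := r.ReadAt(id, firstSector*sectorSize+1); err != nil {
		return "", err
	}
	switch string(id) {
	case "CD001":
		return "ISO 9660", nil
	case "BEA01", "NSR02", "NSR03":
		return "UDF", nil
	default:
		return "", fmt.Errorf("unknown volume identifier %q", id)
	}
}

// fakeImage builds a tiny synthetic image for the demo (hypothetical helper).
func fakeImage(id string) *bytes.Reader {
	img := make([]byte, (firstSector+1)*sectorSize)
	copy(img[firstSector*sectorSize+1:], id)
	return bytes.NewReader(img)
}

func main() {
	kind, err := identifyVolume(fakeImage("BEA01"))
	fmt.Println(kind, err)
}
```
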

@davidnewhall
Collaborator

It's funny, because the ISO9660 library I currently use just released a new version. Literally the only significant change they made was to add an error that says "UDF volumes are not supported." rip. I'm messing with this a little bit today, but I'm not very optimistic. :(

@davidnewhall
Collaborator

I'm stumped at this point. The only actively maintained libraries I can find do not support 2 or more of:

  • Joliet (Unicode / non-Latin + 100 character file names)
  • Rock Ridge (POSIX timestamps and permissions + 255 byte file names)
  • UDF (DVD ISOs, generally)

I did find 1 old library that has Rock Ridge support, and 1 that supports UDF, but I don't think I found any that support Joliet.

Ideally, the rock ridge support can be ported into https://github.com/kdomanski/iso9660 or https://github.com/diskfs/go-diskfs or both.
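
As a rough illustration of how a tool could tell whether an image carries Joliet names at all, here is a Go sketch that walks the volume descriptor set and looks for a supplementary volume descriptor (type 2) whose escape sequence field starts with `%/@`, `%/C`, or `%/E`, the three UCS-2 levels Joliet defines. `hasJoliet`, `descriptor`, and `image` are hypothetical names; neither library mentioned above is used here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

const sectorSize = 2048

// hasJoliet scans descriptors from sector 16 until the set terminator (type 255).
func hasJoliet(r io.ReaderAt) (bool, error) {
	buf := make([]byte, sectorSize)
	for sector := int64(16); ; sector++ {
		if _, err := r.ReadAt(buf, sector*sectorSize); err != nil {
			return false, err
		}
		if string(buf[1:6]) != "CD001" {
			return false, fmt.Errorf("sector %d: not an ISO 9660 descriptor", sector)
		}
		switch buf[0] {
		case 255: // volume descriptor set terminator
			return false, nil
		case 2: // supplementary volume descriptor
			esc := string(buf[88:91])
			if esc == "%/@" || esc == "%/C" || esc == "%/E" {
				return true, nil
			}
		}
	}
}

// descriptor builds a synthetic volume descriptor for the demo.
func descriptor(typ byte, escape string) []byte {
	d := make([]byte, sectorSize)
	d[0] = typ
	copy(d[1:], "CD001")
	copy(d[88:], escape)
	return d
}

// image prepends the 16-sector system area to the given descriptors.
func image(descs ...[]byte) *bytes.Reader {
	img := make([]byte, 16*sectorSize)
	for _, d := range descs {
		img = append(img, d...)
	}
	return bytes.NewReader(img)
}

func main() {
	img := image(descriptor(1, ""), descriptor(2, "%/E"), descriptor(255, ""))
	found, err := hasJoliet(img)
	fmt.Println(found, err)
}
```

Rock Ridge cannot be detected this way, since its data lives in the System Use area of individual directory records rather than in a volume descriptor.
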

@kdomanski

kdomanski commented May 7, 2023

Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

The 1 library with Rock Ridge support that you linked, it only gets the full filename from the RR data, but not timestamps. Looks like the master branch of the library you use already has RR test fixture added, so full RR support might drop any time. Supporting Joliet might then be redundant for your usecase, we'll see.

As for files larger than 4GB, this requires support for multi-extent descriptors. It's not hard to implement, but it requires a bit of free time.

@davidnewhall
Collaborator

davidnewhall commented May 7, 2023

> Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

Love it. Thanks for stopping by!

> Looks like the master branch of the library you use already has RR test fixture added,

haha, don't be so modest. I see your recent commits (now), and am very pleased!

> Supporting Joliet might then be redundant for your usecase

I hope so. Seems like rock ridge will give us what we're missing.

> As for files larger than 4GB, this requires support for multi-extent descriptors.

Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

EDIT: derp moment. Just realized who I replied to earlier. haha

EDIT2: and now realizing the new release you made was probably because of this issue, and that error message you quoted. Thank you :)

@kdomanski

kdomanski commented May 7, 2023

> Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

ECMA-119 9.1.6: multi-extent flag.
ECMA-119 6.5.1 "Each file shall consist of one or more File Sections."

It's not very explicit, but I infer that maybe it means a multi-extent file has several consecutive Directory Records and the flag turned on.

The Linux Kernel's code for this appears to interpret this flag as an indication of the given DE not being the last one for the file.
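
Under that interpretation, a reader could stitch file sections together roughly as in the Go sketch below. The `extent` and `file` types are hypothetical, simplified views of a directory record and are not the library's types; the only spec-derived detail is that the multi-extent flag is bit 7 (0x80) of the File Flags byte, and that each record's data length is a 32-bit field, which is why a single record cannot describe 4 GiB or more.

```go
package main

import "fmt"

const flagMultiExtent = 0x80 // bit 7 of File Flags (ECMA-119 9.1.6)

// extent is a simplified view of one directory record.
type extent struct {
	Name   string
	Flags  byte
	Start  uint32 // logical block address of this file section
	Length uint32 // data length of this section in bytes
}

// file is one logical file made of one or more sections.
type file struct {
	Name    string
	Size    uint64
	Extents []extent
}

// mergeExtents groups consecutive records into files. A record with the
// multi-extent flag set is treated as "not the last record for this file".
// A trailing chain that never clears the flag is dropped as incomplete.
func mergeExtents(records []extent) []file {
	var files []file
	var cur *file
	for _, rec := range records {
		if cur == nil {
			cur = &file{Name: rec.Name}
		}
		cur.Extents = append(cur.Extents, rec)
		cur.Size += uint64(rec.Length)
		if rec.Flags&flagMultiExtent == 0 {
			files = append(files, *cur)
			cur = nil
		}
	}
	return files
}

func main() {
	records := []extent{
		{Name: "SETUP.BIN", Flags: flagMultiExtent, Start: 100, Length: 0xFFFFF800},
		{Name: "SETUP.BIN", Flags: 0, Start: 2097251, Length: 1000},
		{Name: "README.TXT", Flags: 0, Start: 50, Length: 10},
	}
	for _, f := range mergeExtents(records) {
		fmt.Println(f.Name, f.Size, len(f.Extents))
	}
}
```

Extraction would then copy each section in order into the same output file, accumulating the size in a 64-bit integer.
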

> > Supporting Joliet might then be redundant for your usecase
>
> I hope so. Seems like rock ridge will give us what we're missing.

Looks like (outside of some edge cases) Linux will use RR and ignore Joliet if both are present.

@davidnewhall
Collaborator

@kdomanski You're right, overall this doesn't look too hard. It's going to take me a bit to come up to speed on this, but I've got a couple hours into it now and may be able to get there. Here's where I'm at...

None of the files have dirFlagMultiExtent set in their FileFlags. This image doesn't actually seem to have any files larger than 4 GB, so I will keep looking for one.

de.SystemUse is also empty, so I don't seem to get Rock Ridge file names. It could be that this image has two volumes and doesn't do rock ridge. Have you figured out how to access that second volume yet?
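
For reference, when the System Use bytes are present, the Rock Ridge alternate name sits in `NM` entries. A minimal Go sketch of pulling it out, assuming the raw System Use bytes of one directory record are available as a slice: each entry is a 2-byte signature, a 1-byte total length, a 1-byte version, then data, and for `NM` the first data byte holds flags with the name after it. `rockRidgeName` and `suspEntry` are hypothetical names, and the sketch does not follow `CE` continuation areas or handle the flags for the current and parent directory.

```go
package main

import "fmt"

// rockRidgeName concatenates the name content of all NM entries found in
// the System Use bytes. It reports false when no NM entry is present.
func rockRidgeName(systemUse []byte) (string, bool) {
	var name []byte
	found := false
	for len(systemUse) >= 4 {
		length := int(systemUse[2])
		if length < 4 || length > len(systemUse) {
			break // padding or malformed entry
		}
		entry := systemUse[:length]
		if string(entry[:2]) == "NM" && length >= 5 {
			name = append(name, entry[5:]...) // skip the flags byte at index 4
			found = true
		}
		systemUse = systemUse[length:]
	}
	return string(name), found
}

// suspEntry builds a synthetic entry for the demo: signature, length, version 1, data.
func suspEntry(sig string, data []byte) []byte {
	e := []byte{sig[0], sig[1], byte(4 + len(data)), 1}
	return append(e, data...)
}

func main() {
	su := suspEntry("NM", append([]byte{0}, "setup_x-blades_hd_1.0_(60327).exe"...))
	fmt.Println(rockRidgeName(su))
}
```

An empty System Use field, as seen here, makes this return false, which is consistent with the image not carrying Rock Ridge data at all.
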

Here's my "update" to do some debugging: kdomanski/iso9660@45c0c7d

I ran this new code against the ISO file mentioned earlier in the thread. Here's the whole output:
https://gist.github.com/davidnewhall/b67c6fdf1c942fb8d8026ba1a42fad25

This is what it looks like mounted on my Mac:
[Screenshot of the mounted image, 2023-05-18]

...which makes me want to ask: Is the volume name exposed by this library yet? (the name in the title)

@kdomanski

Hmm, this might be a Joliet-only image. I'll look into the dump you provided.

> Is the volume name exposed by this library yet? (the name in the title)

it is now. ;-) https://github.com/kdomanski/iso9660/releases/tag/v0.3.5

@davidnewhall
Collaborator

amazing!

@blackwind
Author

Any further progress on this? Or are we blocked indefinitely?

@davidnewhall
Collaborator

No one has ever extracted or created these 'advanced' format ISO images with Go apps. This is all new. kdomanski is the only person that's put together a comprehensive library that will one day provide these features. Today, it does not. I haven't had time to visit this. I have dozens of projects, and this feature is a lot of work, so it will be a while before I'm intrigued enough to spend the time required.

There has been no further progress at this time.

@blackwind
Author

If you detected impatience in my tone, none was intended. I appreciate the update and all the work done on this so far.

@kdomanski

Sup. Release v0.4.0 can read Rock Ridge filenames.
Looking forward to your feedback (and bug reports 😉 ).

@davidnewhall
Collaborator

davidnewhall commented Aug 20, 2023

There's probably more I can do here, but I updated the library and pushed some updates. You can download it here https://unstable.golift.io - thanks Kamil!

EDIT: Docker is ready.

@blackwind
Author

Unable to test until the Docker image is available, but it sounds like no more filename truncation, no more incorrect filename casing, but the other issues persist. I've marked the completed tasks in the first post.

@blackwind
Author

Currently just getting "UDF volumes are not supported", which I guess is an improvement over the old behavior.

@davidnewhall
Collaborator

What is the image you're testing? UDF is probably another problem that needs a solution.

@blackwind
Author

Tried a few, but Stray.v1.5-Razor1911 is a well sized one for testing the 4GB issue as well.


3 participants