7z: failed extraction due to unsupported header/data type #2076

Open
mehrabiworkmail opened this issue Feb 26, 2024 · 2 comments


@mehrabiworkmail

mehrabiworkmail commented Feb 26, 2024

When attempting to use libarchive to extract a corpus of malicious SFX files, I found that libarchive fails to extract many SFX files that the 7z tool can extract just fine. I investigated the issue and found the following to be one of the causes.

After reading archive_read_support_format_7zip.c, I realized that libarchive currently does not support the following:

  1. Parsing the kAdditionalStreamsInfo header type (and extracting data from that stream, for that matter)
  2. Parsing kName values from FilesInfo that have external flag set (i.e., the data is external to the header)
  3. Parsing kAttributes when the external flag is set

Unfortunately, it seems that when libarchive encounters one such data/header type, it simply gives up processing the file, even if it is able to extract data from the main stream just fine. This might not be the desired behaviour, depending on the application (e.g., when processing malicious files it is often desirable to extract as much info as possible, even if it means skipping some unsupported files).

Obviously, the ideal solution is to implement support for these header/data types. However, a workaround is to make libarchive skip such data/header types, but only if the user so chooses by setting specific compiler flags. Below are screenshots of such a solution I implemented.

Firstly, as the kAdditionalStreamsInfo header has basically the same structure as the kMainStreamsInfo header, we can reuse the latter's code to read the former:
image

image
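Since the screenshots are not reproduced here, the idea of sharing one parser between the two header types can be sketched roughly as follows. This is not the code from the patch. The property IDs come from the 7z format description, but `parse_streams_info`, `parse_header`, and both structs are hypothetical names, and the streams-info body is reduced to a toy layout (a count byte, that many size bytes, then kEnd) purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Property IDs as listed in the 7z format description. */
enum {
    kEnd = 0x00,
    kAdditionalStreamsInfo = 0x03,
    kMainStreamsInfo = 0x04
};

/* ASSUMPTION: toy stand-in for a StreamsInfo body. The real structure
 * (PackInfo, UnPackInfo, SubStreamsInfo) is considerably richer. */
struct streams_info {
    size_t count;
    uint8_t sizes[16];
};

/* Hypothetical top-level header state for this sketch. */
struct header {
    int has_additional;
    int has_main;
    struct streams_info additional;
    struct streams_info main_streams;
};

/* Parses the toy body; returns bytes consumed, or -1 if malformed. */
static long parse_streams_info(const uint8_t *p, size_t len,
                               struct streams_info *si)
{
    assert(p != NULL && si != NULL);
    if (len < 1)
        return -1;
    size_t n = p[0];
    if (n > sizeof(si->sizes) || len < n + 2)
        return -1;
    for (size_t i = 0; i < n; i++)
        si->sizes[i] = p[1 + i];
    if (p[1 + n] != kEnd)
        return -1;
    si->count = n;
    return (long)(n + 2);
}

/* Walks the top-level properties. Both streams-info IDs go through the
 * same parser and differ only in where the result is stored.
 * Returns 0 on success, -1 on malformed or unknown input. */
static int parse_header(const uint8_t *p, size_t len, struct header *h)
{
    size_t pos = 0;
    assert(p != NULL && h != NULL);
    h->has_additional = 0;
    h->has_main = 0;
    while (pos < len) {
        uint8_t id = p[pos++];
        long used;
        switch (id) {
        case kEnd:
            return 0;
        case kAdditionalStreamsInfo:
            used = parse_streams_info(p + pos, len - pos, &h->additional);
            if (used < 0)
                return -1;
            h->has_additional = 1;
            break;
        case kMainStreamsInfo:
            used = parse_streams_info(p + pos, len - pos, &h->main_streams);
            if (used < 0)
                return -1;
            h->has_main = 1;
            break;
        default:
            return -1;
        }
        pos += (size_t)used;
    }
    return -1; /* input ended without a kEnd marker */
}
```

Calling `parse_header` on a buffer that contains a 0x03 record followed by a 0x04 record fills in both `additional` and `main_streams`, which is the reuse described above.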

Similarly, libarchive already has partial support for skipping external data in the read_Times function in the same file (archive_read_support_format_7zip.c:2753). So, we can simply reuse the same code to skip external values in other places:
image

image
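For readers without the screenshots, here is a rough, standalone sketch of what "skipping an external value" could look like. It is not libarchive's code and not the patch itself: `read_7z_number` and `check_external` are hypothetical helpers. The only parts taken from the 7z format description are the one-byte external flag, the data index that follows it when the flag is non-zero, and the variable-length number encoding used for that index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum ext_result { EXT_ERROR = -1, EXT_INLINE = 0, EXT_SKIPPED = 1 };

/* Decodes a 7z variable-length number: the leading 1 bits of the first
 * byte give the number of extra bytes, which are stored little-endian;
 * the remaining bits of the first byte form the high part.
 * Returns 0 on success, -1 if the buffer is too short. */
static int read_7z_number(const uint8_t *p, size_t len, size_t *pos,
                          uint64_t *out)
{
    if (*pos >= len)
        return -1;
    uint8_t first = p[(*pos)++];
    uint8_t mask = 0x80;
    uint64_t value = 0;
    for (int i = 0; i < 8; i++) {
        if ((first & mask) == 0) {
            uint64_t high = (uint64_t)(first & (mask - 1));
            value |= high << (8 * i);
            *out = value;
            return 0;
        }
        if (*pos >= len)
            return -1;
        value |= (uint64_t)p[(*pos)++] << (8 * i);
        mask >>= 1;
    }
    *out = value;
    return 0;
}

/* Reads the external flag. If it is set, consumes the data index and
 * reports EXT_SKIPPED so the caller can leave the property unset rather
 * than abort; otherwise the values follow inline at *pos. */
static int check_external(const uint8_t *p, size_t len, size_t *pos,
                          uint64_t *data_index)
{
    assert(p != NULL && pos != NULL && data_index != NULL);
    if (*pos >= len)
        return EXT_ERROR;
    if (p[(*pos)++] == 0)
        return EXT_INLINE;
    if (read_7z_number(p, len, pos, data_index) < 0)
        return EXT_ERROR;
    return EXT_SKIPPED;
}
```

A caller handling names or attributes would check the result first and only parse inline values when it gets `EXT_INLINE`; on `EXT_SKIPPED` it keeps the data index around (or ignores it) and carries on with the next property.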

Here's the git patch for my solution:
0002-7z-optional-compiler-flag-to-skip-additonalstreamsin.patch

Please let me know if you have any feedback or other suggestions.

@kientzle
Contributor

I prefer to avoid compiler flags for cases like this: It makes testing much more complex (do we now need to build and test both versions?) and means that only the few people who are able to compile their own library can benefit.

There is precedent elsewhere in libarchive for skipping unrecognized or malformed data with a warning. I think that would be the better approach here. (Obviously, the best answer is to implement support where possible.)
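To make the skip-with-a-warning idea concrete, here is a small standalone sketch of a runtime alternative to a compiler flag. Everything in it is an assumption for illustration: the `reader` struct, `read_properties`, the status values (loosely modelled on the OK/WARN/FATAL style of return codes), and the simplified record layout of a one-byte type, a one-byte size, and a payload. It only shows the control flow: an unknown property is stepped over using its size, a warning is recorded, and parsing continues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum { ST_OK = 0, ST_WARN = -20, ST_FATAL = -30 };

/* Hypothetical reader state that collects warnings. */
struct reader {
    int warnings;
    char last_warning[96];
};

/* ASSUMPTION: each record is a 1-byte type, a 1-byte payload size, and
 * the payload; type 0x00 terminates the list. Types 0x11 (names) and
 * 0x14 (mtime) are treated as "supported" here. Returns ST_OK, ST_WARN
 * if something was skipped, or ST_FATAL on truncated input. */
static int read_properties(struct reader *r, const uint8_t *p, size_t len,
                           unsigned *handled)
{
    size_t pos = 0;
    int status = ST_OK;
    assert(r != NULL && p != NULL && handled != NULL);
    while (pos < len) {
        uint8_t type = p[pos++];
        if (type == 0x00)
            return status;
        if (pos >= len)
            return ST_FATAL;
        size_t size = p[pos++];
        if (size > len - pos)
            return ST_FATAL;
        switch (type) {
        case 0x11:
        case 0x14:
            (*handled)++; /* a real reader would parse the payload here */
            break;
        default:
            /* Unsupported: note it, downgrade the status, keep going. */
            r->warnings++;
            snprintf(r->last_warning, sizeof r->last_warning,
                     "skipping unsupported property 0x%02x (%zu bytes)",
                     (unsigned)type, size);
            status = ST_WARN;
            break;
        }
        pos += size; /* the size field makes skipping safe */
    }
    return ST_FATAL; /* no terminator found */
}
```

With this shape there is a single build to test, and callers who care can check for the warning status while everyone else still gets whatever entries could be read.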

@mehrabiworkmail
Author

I have created a pull request for this issue that contains my kAdditionalStreamsInfo skip code (just in case someone is interested). It is closed for now, as I intend to implement support for the header instead. I will reopen the pull request once that is done.
