intermittent issue unzipping a decrypted file previously zipped+encrypted with libarchive. #2090

Open
liquidjorgeb13 opened this issue Mar 13, 2024 · 0 comments

sample_file.zip

Zipping and encrypting with libarchive intermittently produces a zip file that libarchive itself cannot unzip and decrypt. A "bad" zip does unzip fine from the terminal, though. The program has to be run multiple times, and most runs show no issue. I isolated the relevant libarchive code as much as possible while still reproducing the issue.

I attached the sample file I've been using. It contains a single text file with the text "this is a test". Its password is "password".

In a nutshell, the sample code below:

  1. unzips sample_file.zip
  2. zips and encrypts it again using the original password
    - this is where I suspect the error intermittently happens
  3. unzips again (this doesn't seem to be where the error happens, but it reveals whether something went wrong in the previous zip+encrypt step)

Observations:

  • On a run that fails, the passphrase callback (during the second unzip, decrypting the "bad" file) does not seem to iterate through all the passphrases needed to reach the file's actual password ("password", the 4th one in the list).
  • The zip file that breaks libarchive works fine when extracted from the terminal. (The program saves this file to disk as zip_that_breaks_decrypting.zip.)
  • If you use the zip file from a bad run as the sample file (e.g. cp zip_that_breaks_decrypting.zip sample_file.zip, then run compile_and_loop.sh), libarchive always fails on that file.
#include <iostream>
#include <stdint.h>
#include <vector>
#include <string>
#include <string.h>
#include <archive.h>
#include <archive_entry.h>
#include <fstream>
 
struct CallbackData {
    uint64_t idx{0};
    std::vector<std::string> passphrases;
    std::string last_passphrase_used{};
};
 
static const char* passphrase_callback(struct archive* /* archive */, void* client_data)
{
    std::cout << "inside passphrase callback" << std::endl;
    auto* cd = static_cast<CallbackData*>(client_data);
 
    if (cd->idx < cd->passphrases.size()) {
        std::cout << "testing passphrase: " << cd->passphrases[cd->idx] << std::endl;
        const char* ret = cd->passphrases[cd->idx].c_str();
        cd->last_passphrase_used = cd->passphrases[cd->idx];
        cd->idx++;
        return ret;
    }
    return nullptr;
}
 
int decrypt_wrapper(uint8_t* in_zip, const size_t& in_size, uint8_t*& unzip_buffer, size_t& read_file_size, std::string& guessed_pswd,
                                struct archive_entry*& entry, std::vector<int>& filters) {
    struct archive* la_archive{archive_read_new()};
    archive_read_support_format_zip(la_archive);
 
    struct archive_entry* la_archive_entry{nullptr};
 
    // Passphrase callback
    std::vector<std::string> pswds{"not it", "still not it", "infected", "password", "passphrase", "P@ssw0rd", "1234"};
    CallbackData cd{0, pswds, ""};
    archive_read_set_passphrase_callback(la_archive, static_cast<void*>(&cd), passphrase_callback);
 
    // Open archive file
    std::cout << "archive_read_open_memory result: " <<
    archive_read_open_memory(la_archive, reinterpret_cast<void*>(in_zip), in_size) << std::endl;
                                  
    // Filters
    for (int idx = 0; idx < archive_filter_count(la_archive); idx++) {
        filters.emplace_back(archive_filter_code(la_archive, idx));
    }
 
    constexpr ssize_t unarchived_max_size{30};
    unzip_buffer = new uint8_t[unarchived_max_size];
 
    // Header metadata
    archive_read_next_header(la_archive, &la_archive_entry);
    entry = archive_entry_clone(la_archive_entry);
 
    // Read archive data
    ssize_t read_bytes{0}; // store read return
    int64_t total_bytes_processed{0};
    while ((read_bytes = archive_read_data(la_archive, unzip_buffer + read_file_size,
                                           unarchived_max_size - read_file_size)) > 0) {
        // increment current size
        std::cout << "Looping - Inside archive_read_data loop. " << std::endl;
        read_file_size += static_cast<size_t>(read_bytes);
        total_bytes_processed += read_bytes;
    }
 
    if (read_bytes < 0) {
        std::cout << "------------------------------------- BAD RUN ----------------------------------" << std::endl;
        std::cout << "  Got an instance of archive_read_data() returning -30. Bad zip created with libarchive?" << std::endl;
        std::cout << "  Saving input file to zip_that_breaks_decrypting.zip. Expect this file to always make archive_read_data() fail and return -30" << std::endl;
        const std::string bad_name{"zip_that_breaks_decrypting.zip"};
        std::ofstream bad_zip(bad_name, std::ios::binary);
        bad_zip.write(reinterpret_cast<const char*>(in_zip), in_size);
        bad_zip.close();
        // Release the reader on the failure path too.
        archive_read_free(la_archive);
        return 1;
    }
 
    std::cout << "in_size: " << in_size << std::endl;
    std::cout << "format is zip?: " <<  (ARCHIVE_FORMAT_ZIP == archive_format(la_archive)) << std::endl;
    std::cout << "read_file_size: " << read_file_size << std::endl;
    std::cout << "read_bytes: " << read_bytes << std::endl;
    std::cout << "total_bytes_processed: " << total_bytes_processed << std::endl;
    std::cout << "last_password_used: " << cd.last_passphrase_used << std::endl;
 
    guessed_pswd = cd.last_passphrase_used;
                         
    archive_read_close(la_archive);
    archive_read_free(la_archive);
    return 0;
}
 
void encrypt_wrapper(uint8_t*& out_zip, size_t& out_zip_size, uint8_t*& decrypted_file,
                                      size_t& decrypted_file_size, const std::string& guessed_pswd,
                                      struct archive_entry*& entry, const std::vector<int>& archive_filters) {
 
    // Now we got a decrypted text file
    ///////////////////////////////////
 
    // Let's zip and encrypt it back
    struct archive* archive{archive_write_new()};
 
    for (int filter : archive_filters) {
        archive_write_add_filter(archive, filter);
    }
 
    archive_write_set_format(archive, ARCHIVE_FORMAT_ZIP);
    archive_write_set_options(archive, "zip:encryption=zipcrypt");
    archive_write_set_passphrase(archive, guessed_pswd.c_str());
       
    size_t data_size{20480};
    size_t buff_used{0};
    uint8_t write_buffer[20480]; // byte buffer (not an array of pointers) matching data_size
 
    archive_write_open_memory(archive, write_buffer, data_size, &buff_used);
    archive_entry_set_size(entry, decrypted_file_size);
    archive_write_header(archive, entry);
    archive_write_data(archive, decrypted_file, decrypted_file_size);
    archive_write_close(archive);
    archive_write_free(archive);
     
    uint8_t* out_buffer = new uint8_t[buff_used];
    memcpy(out_buffer, write_buffer, buff_used);
 
    out_zip = out_buffer;
    out_zip_size = buff_used;
}
 
 
int main() {
    std::cout << "----------------------------------- NEW RUN ---------------------------------" << std::endl;
    // Zip file to buffer
    std::ifstream original_zip("sample_file.zip", std::ios::binary);
    original_zip.seekg(0, std::ifstream::end);
    const size_t zip_file_size = original_zip.tellg();
    original_zip.seekg(0, std::ifstream::beg);
    uint8_t* original_in_zip = new uint8_t[zip_file_size];
    original_zip.read((char*)original_in_zip, zip_file_size);
    original_zip.close();
 
 
    // Return data from decrypt
    uint8_t* decrypted_file{nullptr};
    size_t decrypted_file_size{0};
    std::string guessed_pswd{};
    struct archive_entry* entry{nullptr};
    std::vector<int> archive_filters;
 
    decrypt_wrapper(original_in_zip, zip_file_size, decrypted_file, decrypted_file_size, guessed_pswd, entry, archive_filters);
 
    uint8_t* out_zip{nullptr};
    size_t out_zip_size{0};
    encrypt_wrapper(out_zip, out_zip_size, decrypted_file, decrypted_file_size, guessed_pswd, entry, archive_filters);
 
     
    uint8_t* decrypted_file2{nullptr};
    size_t decrypted_file_size2{0};
    struct archive_entry* entry2{nullptr};
    std::vector<int> archive_filters2;
    int returnVal{decrypt_wrapper(out_zip, out_zip_size, decrypted_file2, decrypted_file_size2, guessed_pswd, entry2, archive_filters2)};
 
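    // Release every buffer and cloned entry allocated during the run.
    delete[] original_in_zip;
    delete[] decrypted_file;
    delete[] decrypted_file2;
    delete[] out_zip;
    archive_entry_free(entry);
    archive_entry_free(entry2);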
    return returnVal;
}
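
The first observation (the callback stopping before it reaches "password") can be checked without libarchive in the loop. The following is a minimal sketch that reuses CallbackData and passphrase_callback from the program above. The helpers simulate_reader and was_offered are hypothetical names introduced here for illustration; they are not part of libarchive, and simulate_reader only stands in for a reader that stops asking once it accepts a candidate.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct archive;  // opaque forward declaration; only the pointer type is needed here

struct CallbackData {
    uint64_t idx{0};                       // number of passphrases handed out so far
    std::vector<std::string> passphrases;
    std::string last_passphrase_used{};
};

// Same logic as the callback in the program above, minus the logging.
static const char* passphrase_callback(struct archive* /* archive */, void* client_data) {
    auto* cd = static_cast<CallbackData*>(client_data);
    if (cd->idx < cd->passphrases.size()) {
        cd->last_passphrase_used = cd->passphrases[cd->idx];
        return cd->passphrases[cd->idx++].c_str();
    }
    return nullptr;
}

// Hypothetical helper: true if `expected` was among the passphrases handed out.
static bool was_offered(const CallbackData& cd, const std::string& expected) {
    assert(cd.idx <= cd.passphrases.size());
    for (uint64_t i = 0; i < cd.idx; ++i) {
        if (cd.passphrases[i] == expected) return true;
    }
    return false;
}

// Hypothetical stand-in for a reader that accepts the candidate at position
// `accept_at` (0-based) and then stops asking for more passphrases.
static void simulate_reader(CallbackData& cd, uint64_t accept_at) {
    for (uint64_t i = 0; i <= accept_at; ++i) {
        if (passphrase_callback(nullptr, &cd) == nullptr) break;
    }
}
```

In the real program, the same check would amount to calling was_offered(cd, "password") after archive_read_data() fails: a false result means the reader stopped asking before the correct passphrase was ever handed to it.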

How I'm building (it behaves the same in a more complex build environment):

  1. Clone the libarchive-src git repo into libarchive_question/build_materials (you should end up with libarchive_question/build_materials/libarchive-src).
  2. From libarchive_question/build_materials, run cmake and make.
  3. From libarchive_question/, run:

g++ zip_and_encrypt_bug.cpp -Ibuild_materials/libarchive-src/libarchive -Lbuild_materials/libarchive -larchive 
export LD_LIBRARY_PATH=./build_materials/libarchive:$LD_LIBRARY_PATH
while  ./a.out  ; do echo "heyo" ; done

The program will run several times until it fails and breaks the loop.
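
One hypothesis that would fit this run-to-run variation, and which this report does not confirm: traditional ZipCrypto (the zip:encryption=zipcrypt option used above) validates a passphrase against a single check byte at the end of a 12-byte encryption header whose other bytes are random, as described in the PKWARE APPNOTE. A wrong passphrase can therefore pass the check by chance. The sketch below implements the APPNOTE key schedule in plain C++ and measures how often that happens. The names Keys, encrypt_header, passes_check and false_accept_rate are made up for this illustration, the check byte 0x5A is arbitrary, and whether libarchive's reader really stops at the first candidate that passes this check is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// One-byte CRC-32 update (reflected polynomial 0xEDB88320), no pre/post inversion.
static uint32_t crc32_update(uint32_t crc, uint8_t b) {
    crc ^= b;
    for (int i = 0; i < 8; ++i) {
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

// Traditional PKWARE (ZipCrypto) key state, following the APPNOTE description.
struct Keys {
    uint32_t k0{0x12345678u}, k1{0x23456789u}, k2{0x34567890u};
    explicit Keys(const std::string& pw) {
        for (unsigned char c : pw) update(c);
    }
    void update(uint8_t plain) {
        k0 = crc32_update(k0, plain);
        k1 = (k1 + (k0 & 0xFFu)) * 134775813u + 1u;
        k2 = crc32_update(k2, static_cast<uint8_t>(k1 >> 24));
    }
    uint8_t stream_byte() const {
        const uint32_t t = (k2 | 2u) & 0xFFFFu;
        return static_cast<uint8_t>((t * (t ^ 1u)) >> 8);
    }
};

using Header = std::array<uint8_t, 12>;

// 11 random bytes followed by the check byte, encrypted under `pw`.
static Header encrypt_header(const std::string& pw, uint8_t check, std::mt19937& rng) {
    Keys k(pw);
    Header out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        const uint8_t plain =
            (i + 1 == out.size()) ? check : static_cast<uint8_t>(rng() & 0xFFu);
        out[i] = static_cast<uint8_t>(plain ^ k.stream_byte());
        k.update(plain);
    }
    return out;
}

// Decrypt the header under `pw` and compare only the last byte with `check`.
static bool passes_check(const std::string& pw, const Header& h, uint8_t check) {
    Keys k(pw);
    uint8_t plain = 0;
    for (uint8_t c : h) {
        plain = static_cast<uint8_t>(c ^ k.stream_byte());
        k.update(plain);
    }
    return plain == check;
}

// Fraction of freshly encrypted headers for which `wrong_pw` passes the check.
// ASSUMPTION: models the one-byte check only, not libarchive's actual reader.
static double false_accept_rate(const std::string& real_pw, const std::string& wrong_pw,
                                int trials, uint32_t seed) {
    assert(trials > 0);
    std::mt19937 rng(seed);
    int hits = 0;
    for (int t = 0; t < trials; ++t) {
        const Header h = encrypt_header(real_pw, 0x5A, rng);
        if (passes_check(wrong_pw, h, 0x5A)) ++hits;
    }
    return static_cast<double>(hits) / trials;
}
```

A call such as false_accept_rate("password", "not it", 200000, 42u) should come out in the neighborhood of 1/256. If the reader did accept the first candidate that passes, three wrong candidates ahead of "password" would give roughly a 1.2% chance of a bad run per freshly written archive, and a given bad archive would fail every time because its header is fixed. That would match the observations above, but it remains a guess until checked against libarchive's zip reader.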
