Gaps in the ext vhd map seem to cause problems... #276

Open
rtestardi opened this issue Mar 2, 2023 · 3 comments

@rtestardi

rtestardi commented Mar 2, 2023

Hi.

We're running the test program at the bottom of this post...

Basically ext filesystem on top of vhd...

For our ext vhd, we find that the ext vhd map has gaps in it (we are not sure if there is another bug causing this)...

There seem to be multiple bugs that cause us to end up in an infinite loop, with a "toRead" value of 0, never making progress.

((1)) in DiscUtils\Library\DiscUtils.Ext\Extent.cs

The first bug is that many map entries have "NumBlocks" entries with the high bit (bit 15 of a ushort) set -- this seems very consistent and if we strip off that bit when reading data structures in from the underlying vhd, things are better. Otherwise, we totally goof the extent lengths as if we had many overlapping extents, which is impossible.

I think this code:

NumBlocks = EndianUtilities.ToUInt16LittleEndian(buffer, offset + 4);

Needs to be:

NumBlocks = (ushort)(EndianUtilities.ToUInt16LittleEndian(buffer, offset + 4) & 0x7fff);

Possibly the VHD format has changed to use this high bit for something?

((2)) in DiscUtils\Library\DiscUtils.Ext\ExtentsFileBuffer.cs

The second bug: when FindExtent() is asked for a block that falls in a "gap" in the map, it returns the previous extent (because the next extent starts after the block we are looking for). The code in the calling function then has a broken "if" that can never be true:

if (extent.FirstLogicalBlock > logicalBlock)

Inside this "if" is where we "clear" data rather than reading it from the underlying stream. I believe a quick examination of FindExtent() will show this code can never run today.

I believe this "if" should be:

if (logicalBlock >= extent.FirstLogicalBlock + extent.NumBlocks)

Which would be an appropriate time to "clear" data rather than reading it...

((3)) also in DiscUtils\Library\DiscUtils.Ext\ExtentsFileBuffer.cs

Finally, the code that computes how much data to "clear" for a gap in the map computes an impossible (negative) number:

numRead = (int)Math.Min(totalBytesRemaining, (extent.FirstLogicalBlock - logicalBlock) * blockSize - blockOffset);

Given that FindExtent() returns the previous extent and not the next extent in the case of a gap, it is impossible to know how much is really safe to clear (without a much larger change)...

So the simplest fix here is to just iterate blocksize at a time:

numRead = (int)blockSize;
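
For comparison, here is a rough sketch of what the "much larger change" could look like: instead of stepping one block at a time, look at the next extent to learn how long the gap is. This is not the DiscUtils code. `SimpleExtent`, `GapAwareLookup`, `HoleBlocksAt` and the `fileBlocks` parameter are hypothetical names made up for illustration, and the sketch assumes a flat, sorted, non-overlapping extent list rather than the real extent tree.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an ext extent; not the DiscUtils Extent type.
public readonly struct SimpleExtent
{
    public SimpleExtent(uint firstLogicalBlock, ushort numBlocks)
    {
        FirstLogicalBlock = firstLogicalBlock;
        NumBlocks = numBlocks;
    }

    public uint FirstLogicalBlock { get; }
    public ushort NumBlocks { get; }
}

public static class GapAwareLookup
{
    // Returns the number of whole blocks, starting at logicalBlock, that are
    // not covered by any extent (a hole that should read back as zeros).
    // Returns 0 when logicalBlock is covered by an extent.
    // Assumes extents are sorted by FirstLogicalBlock and do not overlap.
    public static long HoleBlocksAt(
        IReadOnlyList<SimpleExtent> extents, uint logicalBlock, long fileBlocks)
    {
        long holeEnd = fileBlocks; // hole runs to end of file unless an extent follows

        foreach (var e in extents)
        {
            long start = e.FirstLogicalBlock;
            long end = start + e.NumBlocks;

            if (logicalBlock >= start && logicalBlock < end)
            {
                return 0; // block is mapped, read it from the underlying stream
            }

            if (start > logicalBlock)
            {
                holeEnd = Math.Min(holeEnd, start); // hole ends where the next extent begins
                break;
            }
        }

        return Math.Max(0, holeEnd - logicalBlock);
    }
}
```

With a hole length in hand, the number of bytes to clear could be computed along the lines of `Math.Min(totalBytesRemaining, holeBlocks * blockSize - blockOffset)`, which stays positive as long as `holeBlocks` is at least 1.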

Might this ring a bell for anyone???

Thank you!

-- Rich

Our test code is:

```csharp
namespace DiscUtilsTestProject
{
    using DiscUtils;
    using DiscUtils.Ext;
    using DiscUtils.Setup;
    using System.IO;

    public class testproject
    {
        public static void Main(string[] args)
        {
            System.Diagnostics.Debugger.Launch();

            var diskPath = @"C:\Users\richardt\Desktop\8c6f6c2f-5c00-47fc-ad37-6995a2649586_ForGuestLogExtractor.vhd";
            var targetFile = @"\var\log\journal\dd4975ec7ef74a6d840979485e422853\system@e01041e7056c4ecb981f09a6fedaf662-0000000000000001-0005f4d29d44cd63.journal";
            var outputPath = @"c:\temp\outputfile.log";

            SetupHelper.RegisterAssembly(typeof(DiscUtils.Vhd.Disk).Assembly);
            SetupHelper.RegisterAssembly(typeof(DiscUtils.Ext.ExtFileSystem).Assembly);

            using (var virtualDisk = DiscUtils.VirtualDisk.OpenDisk(diskPath, FileAccess.Read))
            {
                var virtualDiskPhysicalVolumes = VolumeManager.GetPhysicalVolumes(virtualDisk);

                using (var stream = virtualDiskPhysicalVolumes[1].Open())
                {
                    using (var extFileSystem = new ExtFileSystem(stream))
                    {
                        //var fileInfo = extFileSystem.GetFileInfo(targetFile);

                        using (var readStream = extFileSystem.OpenFile(targetFile, FileMode.Open))
                        using (var writeStream = File.Create(outputPath))
                        {
                            readStream.CopyTo(writeStream);
                        }

                        //File.Delete(outputPath);
                    }
                }
            }
        }
    }
}
```

@LTRData

LTRData commented Mar 2, 2023

I have found some strange behaviors as well in the current implementation of Ext file systems; however, I do not remember whether I have seen this exact problem. It does seem somewhat related to problems with sparsely allocated files that we found a while ago.

Looking at the changes that I did back then, it seems it could be related. But I am not sure if it really solves the problem you have seen here.
My changes: LTRData@c44eaee

@rtestardi
Author

rtestardi commented Mar 2, 2023

> Possibly the VHD format has changed to use this high bit for something?

PS I goofed part of the analysis -- the "high bit" bug is in the ext parser, not the vhd parser!

Ext4 changed things:

> Extents
>
> Extents replace the traditional block mapping scheme used by ext2 and ext3. An extent is a range of contiguous physical blocks, improving large-file performance and reducing fragmentation. A single extent in ext4 can map up to 128 MiB of contiguous space with a 4 KiB block size.[[4]](https://en.wikipedia.org/wiki/Ext4#cite_note-Mathur-4) There can be four extents stored directly in the inode. When there are more than four extents to a file, the rest of the extents are indexed in a tree.[14]

Source: ext4 - Wikipedia

For a 4KiB block size, a short can map up to 256MiB of data, and the extent size is limited to 128MiB, freeing up the high bit!

The high bit is used for:

> There's one more big concept we need to cover before you can really start decoding EXT4 file systems. As I mentioned in Part 1 of this series, you can only have a maximum of 4 extent structures per inode. Furthermore, there are only 16 bits in each extent structure for representing the number of blocks in the extent, and in fact the upper bit is reserved (it's used to mark the extent as "reserved but initialized", part of EXT4's pre-allocation feature). That means each extent can only contain a maximum of 2^15 blocks- which is 128MB assuming 4K blocks.

Source: SANS Digital Forensics and Incident Response Blog | Understanding EXT4 (Part 3): Extent Trees | SANS Institute (archive.org)

So DiscUtils.Ext\Extent.cs definitely needs to ignore this, as proposed above in ((1))!
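
One caveat on the plain `& 0x7fff` mask: as far as I understand the Linux kernel's ext4 convention, a raw length of exactly 0x8000 (32768) is a normal, initialized extent of 32768 blocks, and only values above 32768 mark a preallocated ("unwritten") extent whose real length is the raw value minus 32768. A plain mask would turn a raw 0x8000 into a length of 0. A minimal sketch of that decoding follows; `ExtentLength`, `IsUnwritten` and `ActualLength` are hypothetical names for illustration, not existing DiscUtils members.

```csharp
using System;

// Hypothetical helper, not part of DiscUtils: decodes the raw 16-bit
// block-count field of an ext4 extent.
public static class ExtentLength
{
    // Largest length an initialized extent can have (2^15 blocks).
    private const ushort InitMaxLen = 0x8000;

    // Raw values above 32768 mark an unwritten (preallocated) extent.
    public static bool IsUnwritten(ushort raw) => raw > InitMaxLen;

    // Number of blocks the extent actually covers, with the marker removed.
    public static ushort ActualLength(ushort raw) =>
        raw <= InitMaxLen ? raw : (ushort)(raw - InitMaxLen);
}
```

If this reading is right, `Extent.cs` could store `ActualLength(raw)` in `NumBlocks`, and a reader would probably also want to return zeros for extents where `IsUnwritten(raw)` is true, since preallocated blocks have never been written.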

@LTRData

LTRData commented Mar 2, 2023

Yes, the vhd implementation in DiscUtils is most probably very stable and correct these days. It is, after all, an open and well-documented format, and the logic in its official documentation is very easy to follow.

Theoretically, the same should of course also apply to Ext. But it is a lot more complex and the DiscUtils implementation for it is not nearly as well tested and stable as the vhd one.

@rtestardi rtestardi changed the title Gaps in the vhd map seem to cause problems... Gaps in the ext vhd map seem to cause problems... Mar 2, 2023