Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Bilevel support (Gray(1) ColorType) #217

Open
samsieber opened this issue Oct 16, 2023 · 8 comments
Open

Adding Bilevel support (Gray(1) ColorType) #217

samsieber opened this issue Oct 16, 2023 · 8 comments

Comments

@samsieber
Copy link

I'm very interested in using this library for ingesting some document into a document processing system I work on. I'd like to be able to handle bilevel tiffs - BW tiffs with 1 bit per sample. Like this:

=== TIFF directory 0 ===
TIFF Directory at offset 0x101024 (1052708)
  Subfile Type: multi-page document (2 = 0x2)
  Image Width: 2550 Image Length: 3300
  Resolution: 300, 300 pixels/inch
  Position: 0, 0
  Bits/Sample: 1
  Compression Scheme: None
  Photometric Interpretation: min-is-black
  FillOrder: msb-to-lsb
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 3280
  Planar Configuration: single image plane
  Page Number: 0-12

As I understand it, you guys are open to expanding format support in this library, right? I'd be perfectly fine just being able to convert this to a Gray(8) image for manipulation in the main image library.

How hard would adding that support be? I'd be happy to work on it if someone's willing to mentor me. I've poked around a bit but I'm not very familiar with the tiff spec. I got as far as trying to read a multi-page tiff like so, but part-way through it panicked because it couldn't fill the buffer when reading, and I assume that's because it was expected 1 byte per pixel instead of one bit.

  let contents = std::fs::read("/Users/sam/multipage-bilevel.tif").expect("Could not read file");
  let cursor = Cursor::new(&contents);
  let mut reader = tiff::decoder::Decoder::new(cursor).unwrap();
  let mut pages = 0;
  let mut read_first_page = false;
  while !read_first_page || reader.more_images() {
      println!("Got color of {:?}", reader.colortype().unwrap());
      println!("Got dimensions of {:?}", reader.dimensions().unwrap());

      if read_first_page {
          reader.next_image().unwrap();
      } else {
          read_first_page = true;
      }
      pages += 1;
      let page = reader.read_image().unwrap();
  }
@fintelia
Copy link
Contributor

Could you provide the exact output and if possible the failing image? We should already support Gray(1) images, though it is possible you're hitting a bug or the image file uses some other feature we don't support.

@samsieber
Copy link
Author

I can't provide the file, sorry :(. The exact fail is in the read_exact call inside of the expand_chunk() call. Something like read_image() -> expand_chunk() -> read_exact(). On my 12 page multitiff, this happens when reading page 6 of 12. I was able to use tiffinfo to break up the multipage tiff into single pages, and the same crash happens on the first chunk of any single page file.

I've cloned the repo locally and added some print statements to the code to help track down the issue. Here's the new test harness:

#[test]
fn test_read_bad_tiff() {
    let contents = std::fs::read("/Users/sam/Downloads/tiffs/xaaa.tif")
        .expect("Could not read file");
    let cursor = Cursor::new(&contents);
    let mut reader = tiff::decoder::Decoder::new(cursor).unwrap();

    println!("Got color of {:?}", reader.colortype().unwrap());
    println!("Got dimensions of {:?}", reader.dimensions().unwrap());
    reader.read_image().unwrap();
}

And here's the output:

Got color of Gray(1)
Got dimensions of (2550, 3300)

Inside read_image()
chunk_dimensions (2550, 3280)
samples 1
chunks_across 1
strip_samples 8364000
image_chunks 2

Inside for chunk in 0..image_chunks { // inside read_image()
Starting chunk 0
self.image().chunk_offsets[chunk]: 8
self.image.expand_chunk(
        &mut self.reader,
        result.as_buffer(0).copy(),
        2550 as usize,
        LittleEndian,
        0 as u32,
        &self.limits,
    )?;

Inside of expand_chunk, these two lines:
let total_samples = data_dims.0 as usize * data_dims.1 as usize * samples
let tile = &mut buffer.as_bytes_mut()[..total_samples * byte_len]
Use the following values:
let 8364000 = 2550 as usize * 3280 as usize * 1
let tile = &mut buffer.as_bytes_mut()[..8364000 * 1]

That output makes me think it's trying to read one byte per pixel instead of one bit.

@samsieber
Copy link
Author

Oh, I forgot the actual panic message. Here it is:

called `Result::unwrap()` on an `Err` value: IoError(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })
thread 'test_read_bad_tiff' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })', core/tests/test.rs:318:38
stack backtrace:
   0: rust_begin_unwind
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/panicking.rs:67:14
   2: core::result::unwrap_failed
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/result.rs:1651:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/result.rs:1076:23
   4: test::test_read_bad_tiff
             at ./tests/test.rs:318:18
   5: test::test_read_bad_tiff::{{closure}}
             at ./tests/test.rs:309:25
   6: core::ops::function::FnOnce::call_once
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/ops/function.rs:250:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/d5c2e9c342b358556da91d61ed4133f6f50fc0c3/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

@samsieber
Copy link
Author

With that said, it does appear that it's reading bytes instead of bits. If I go around and divide sample counts by 8 in a few places in the image-tiff library, it mostly works and I'm able to get out an almost identical png with the following test harness:

fn bits(value: u8) -> [bool; 8] {
    [
        value & 128 == 128,
        value & 64 == 64,
        value & 32 == 32,
        value & 16 == 16,
        value & 8 == 8,
        value & 4 == 4,
        value & 2 == 2,
        value & 1 == 1,
    ]
}

#[test]
fn test_read_bad_tiff() {
    let contents = std::fs::read("/Users/sam/Downloads/tiffs/xaaa.tif")
        .expect("Could not read file");
    let cursor = Cursor::new(&contents);
    let mut reader = tiff::decoder::Decoder::new(cursor).unwrap();

    println!("Got color of {:?}", reader.colortype().unwrap());
    println!("Got dimensions of {:?}", reader.dimensions().unwrap());

    let result = reader.read_image().unwrap();
    let bit_pixels = match result {
        DecodingResult::U8(pixels) => pixels,
        _ => panic!("Unsupported!")
    };

    let mut img: RgbImage = ImageBuffer::new(reader.dimensions().unwrap().0, reader.dimensions().unwrap().1);

    let width = reader.dimensions().unwrap().0;
    let height = reader.dimensions().unwrap().1;
    let mut x = 0;
    let mut y = 0;
    for eight_pixels in bit_pixels {
        let pixels = bits(eight_pixels);
        for value in pixels {
            let px = if value {[255, 255, 255]} else {[0,0,0]};
            img.put_pixel(x, y, Rgb(px));
            x += 1;
            if x == width {
                x = 0;
                y += 1;
                break;
            }
        }
        if y == height {break; }
    }
    img.save("/Users/sam/Downloads/tiffs/tiff-as-png.png").unwrap();
}

I say mostly works because the width of the image is 2550, and 2550 % 8 = 6, so a row doesn't fit into a chunk of bytes nicely. My first naive attempt at rendering had the image slowing sliding to the right as you went down the page, since it was drawing an extra two black pixels that should have been skipped. And right now I'm trying to figure out why there's a bit of a black bar at the bottom... but I'm close.

Here's the relevant changes so far; the excessive parenthesis are me trying to integer division where it's rounding up instead of down (see rust-lang/rfcs#2844 (comment) if you're curious about that..). I'm not sure I'm doing the division in quite the right spots since I still have artifacts. And I'm not sure how it'll play with different stripe and chunk layouts either. And even then, after figuring out how to get it working correctly with the hard-coded 8, it'll take a bit of work to migrate it to be based on the bits_per_sample, unless there's something wildly incorrect I'm missing.

// src/decoder/image.rs
-            let total_samples = data_dims.0 as usize * data_dims.1 as usize * samples;
+            let total_samples = ((data_dims.0 as usize+ 8 -1 ) / 8) * data_dims.1 as usize * samples;
// src/decoder/image.rs
-            let buffer_offset = y * strip_samples + x * chunk_dimensions.0 as usize * samples;
+            let buffer_offset = y * strip_samples / 8 + ((x * chunk_dimensions.0 as usize + 8 -1 ) / 8) * samples ;

@samsieber
Copy link
Author

I have found that the following ImageMagick command generates a bilevel tiff:

convert input-image.png -colorspace Gray -type Grayscale -depth 1 output-tiff.png

I've run that on this: https://en.wikipedia.org/wiki/Binary_image#/media/File:Neighborhood_watch_bw.png and ended up with the following tiffinfo:

=== TIFF directory 0 ===
TIFF Directory at offset 0xdb4 (3508)
  Image Width: 200 Image Length: 140
  Bits/Sample: 1
  Compression Scheme: None
  Photometric Interpretation: min-is-black
  FillOrder: msb-to-lsb
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 140
  Planar Configuration: single image plane
  Page Number: 0-1

And it exhibits the same issue I've seen. Equipped with this I'm going to see what it would take to alter the data extraction portion to work with sub-byte bits-per-sample. I see a few spots, and perhaps I'll be able to exercise the relevant code paths.

@fintelia
Copy link
Contributor

fintelia commented Oct 21, 2023

Thanks for looking into this. It seems that expand_chunks isn't sub-byte bits-per-sample aware, and that's where the main changes are needed. In particular, all uses of byte_len should instead be using the true bits-per-sample.

I just double checked the TIFF spec, one thing to be aware is that rows always start on byte boundaries and padding is inserted to ensure that. Which means we'll probably want to make sure to test on images with odd widths.

@samsieber
Copy link
Author

samsieber commented Oct 30, 2023

OK, that makes sense. So at this point I think I'd be comfortable working on PR for this, but I have a few questions before I get started:

  • Are we preserving end-of-row padding in the output buffer? My initial thought is to include as not including it would make copying the bytes out a lot more expensive.
  • What bits-per-sample do we want to support? 1,2 & 4 seem like obvious choices, but I don't seen any reason not to allow 3,5,6,7 if we're preserving padding.
  • In the Image.expand_chunk method, there's 3 branches at the end; it seems like the first two is for stripped (the normal path and then the FP decoder), and the last one is for tiled tifs; is that correct?
  • We don't need a new DecodingBuffer enum discriminant, correct?

@fintelia
Copy link
Contributor

Are we preserving end-of-row padding in the output buffer?

I'd say to keep the end-of-row padding. One annoying edge case would be tiled images where every tile has end-of-row padding. Handling that properly would require a bunch of bit shifting to join the tiles together, but I'd say it would be fine to return "unsupported" if you hit that. (In practice it should be very rare because tiled images frequently have power-of-two tile sizes)

What bits-per-sample do we want to support?

1 and 4 are the important cases. If it isn't too much extra work, I'm on board with supporting 2, 3, 5, 6, 7 bits per sample as well.

In the Image.expand_chunk method, there's 3 branches at the end; it seems like the first two is for stripped (the normal path and then the FP decoder), and the last one is for tiled tifs; is that correct?

The first is for stripped (or tiled with only a single tile in the horizontal direction), and the next two are for tiled.

We don't need a new DecodingBuffer enum discriminant, correct?

Yep. Just use DecodingBuffer::U8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants