Remove `BinRead::after_parse` and add offset-table helpers #210

csnover · 2023-07-02T03:47:25Z

This trait method exists pretty much exclusively to support deferred reads of FilePtr values but causes ongoing trouble
elsewhere and the benefits do not seem to outweigh the problems:

It requires T::Args to be cloneable in many cases where it should not be the case, which causes confusion, has a strong chance of causing accidental slowness, and makes it unnecessarily hard to move from imports;
An analysis I ran of binrw users on GitHub showed that pretty much all cases of FilePtr were using FilePtr::parse or deref_now, so any potential performance benefit does not seem to be realised by real-world projects;
Because there is no hard requirement to call after_parse and it mostly will not break anything, it is too easy for
users to write custom implementations that do not do this and so are subtly broken. From the same GH analysis, there
was only one case where I found someone who wrote a custom implementation that correctly called after_parse;
Since after_parse does not build a stack of objects to revisit later, its ability to avoid non-linear reads of data is limited to at most one struct or enum at a time anyway.

Given these things (and probably others that I forget), IMO the existence of this feature is not justified. Instead, I think that a design that reads offsets into a Vec<{integer}> and then iterates over them later to convert into Vec<T> is
preferable.

To that end, this patch set includes some new helper functions in file_ptr that enable efficient offset table pattern, with the added benefit of (I think?) solving the absolute/relative offset issue: now, a user puts the data field in the correct place in the struct and it Just Works, or else can use #[br(seek_before)] on the data field if the offsets are relative to a different position in the file, which means the offset position can be relative or absolute and there is no need for special offset plumbing.

I think this also suggests a path forward for offset table writing, where a temporary marker for the offset table can be created that zero-fills the space, and then the data field and the marker are passed to a helper function that writes the data and then writes out the offset table, but that isn’t something I’ve actually prototyped yet.

This trait method exists pretty much exclusively to support deferred reads of `FilePtr` values but causes ongoing trouble elsewhere and the benefits do not seem to outweight the problems: 1. It requires `T::Args` to be cloneable in many cases where it should not be the case, which causes confusion, has a strong chance of causing accidental slowness, and makes it unnecessarily hard to move from imports; 2. An analysis I ran of binrw users on GitHub showed that pretty much all cases of `FilePtr` were using `FilePtr::parse` or `deref_now`, so any potential performance benefit does not seem to be realised by real-world projects; 3. Because there is no hard requirement to call `after_parse` and it mostly will not break anything, it is too easy for users to write custom implementations that do not do this and so are subtly broken. From the same GH analysis, there was only one case where I found someone who wrote a custom implementation that correctly called `after_parse`; 4. Since `after_parse` does not build a stack of objects to revisit later, its ability to avoid non-linear reads of data is limited to at most one struct or enum at a time anyway. Given these things (and probably others that I forget), IMO the existence of this feature is not justified. Instead, I think that a design that reads offsets into a `Vec<{integer}>` and then iterates over them later to convert into `Vec<T>` is preferable; a subsequent patch includes some helper functions to do this, but also right now it can be done (with some verbosity) using the `args_iter` helper. Closes jam1garner#17, jam1garner#119. Fixes jam1garner#185, jam1garner#197.

csnover · 2023-09-21T21:14:27Z

This is merged now.

csnover added 3 commits July 1, 2023 22:14

Allow mutable borrows in map functions

924e094

Add helper functions for reading offset tables

f823c31

csnover requested a review from jam1garner July 2, 2023 03:47

csnover closed this Sep 21, 2023

csnover deleted the rip-after-parse branch July 5, 2024 21:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove `BinRead::after_parse` and add offset-table helpers #210

Remove `BinRead::after_parse` and add offset-table helpers #210

csnover commented Jul 2, 2023

csnover commented Sep 21, 2023

Remove BinRead::after_parse and add offset-table helpers #210

Remove BinRead::after_parse and add offset-table helpers #210

Conversation

csnover commented Jul 2, 2023

csnover commented Sep 21, 2023

Remove `BinRead::after_parse` and add offset-table helpers #210

Remove `BinRead::after_parse` and add offset-table helpers #210