Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is there a reason for SegmentHistory to internally use a Vec vs BTreeMap? #763

Open
vortexofdoom opened this issue Jan 25, 2024 · 3 comments
Labels
needs further discussion It is unclear how to progress without making further decisions. performance Affects the performance of the code.

Comments

@vortexofdoom
Copy link

vortexofdoom commented Jan 25, 2024

SegmentHistory uses a Vec<(i32, Time)> internally while essentially mandating the interface and structure of a BTreeMap<i32, Time>.

Using a BTreeMap instead would theoretically improve insertion/removal costs to be logarithmic rather than linear, and it would statically guarantee that the list was sorted. It would also prevent mutating the keys when using iter_mut(), rather than only having a warning. (Given the way run data is structured, allowing this currently is probably not a great idea, and I think would virtually always constitute an error, as opposed to removing and inserting with the changed key.)

It would mean that the collection would no longer be contiguous in memory, but unless that severely impacts performance, I propose changing it to use BTreeMap instead of a Vec of tuples. This should be able to be done without changing the public interface, since the internal type is not publicly exposed. I'm willing to submit a pull request.

Edit: After looking into some of the function signatures, true safety for iter_mut would likely require a "breaking" change (the Item would be (&i32, &mut Time) instead of (&mut i32, &mut Time)), but it's only used in one place in the rest of the library, and it's unlikely to be used often in userland code, and will coerce just fine if it's not mutated (like in the core library) so I think it's still worth doing eventually.

@CryZe
Copy link
Collaborator

CryZe commented Feb 16, 2024

Yeah it's a trade-off between iteration performance and insertion / removal performance. I'd say that iteration is the main use case, and insertion / removal rarely happen (and iirc it almost always inserts at the end). If you make any changes, I'd like to see some benchmarks that confirm that it's faster for our use case.

@vortexofdoom
Copy link
Author

That makes sense. My initial reason for digging into it was trying to build a SoB that only kept track of full splits (and not subsplits,) so when moving through subsplits and finding that a segment didn't complete, it'd be removed from that SegmentHistory. This being a case for regularly removing was what prompted me to look at it. Obviously there are other ways to handle it.

I do have my doubts that iteration speed is a bottleneck at a higher level for any of the operations where it's used, but benchmarks would prove/disprove that as well. I also think using a custom iter_mut that doesn't allow changing the index is something that's worth considering regardless of the backing data structure.

Tangentially, I DO think that using a BTreeMap for segments in Run could improve things like re-ordering splits a great deal. I just had to move segments around, and moving a couple adjacent splits down one spot made the program hang for over a minute, with absolutely no split history or times to deal with. I'm fairly sure this is caused by the inefficiency of array insertion, and I think that's a use case that is common enough to warrant looking at further.

@CryZe
Copy link
Collaborator

CryZe commented Feb 19, 2024

one spot made the program hang for over a minute

Are you talking about the web LSO or the original LiveSplit or what? And if so, how do I reproduce that?

LSO with 1500 attempts does not even remotely take a single millisecond to move a segment up or down.

In fact moving segments up or down doesn't even use insert or remove from the segment history.

@CryZe CryZe added needs further discussion It is unclear how to progress without making further decisions. performance Affects the performance of the code. labels Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs further discussion It is unclear how to progress without making further decisions. performance Affects the performance of the code.
Projects
None yet
Development

No branches or pull requests

2 participants