
Performance Issue #2

Open
garethgeorge opened this issue Jul 3, 2018 · 3 comments

Comments

@garethgeorge

This implementation traverses the list of blocks on every access, checking the size of each block. It would be MUCH faster if the block size were known in advance. It is presumably also acceptable to sacrifice allocation time for indexing time; if we assume all of the memory will be used, this can result in gains in amortized time.

It would be preferable to allocate a list of fixed-size regions, and if any of those allocations fails, reduce the size by a factor of two (throwing away all previously allocated segments that may have succeeded) and retry, then index into this structure.

This would allow the block to be computed as `index / block_size` and the offset as `index % block_size`.
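A minimal C sketch of this scheme (names like `fixed_pool` are hypothetical, not from this repository; it keeps the chunk count fixed and halves the per-chunk size after any failure):

```c
#include <stdlib.h>

/* Hypothetical sketch: allocate num_chunks equally sized chunks, halving
 * the chunk size whenever any allocation fails, then index in O(1). */
typedef struct {
    char **chunks;     /* array of equally sized chunks */
    size_t num_chunks;
    size_t chunk_size; /* bytes per chunk, same for every chunk */
} fixed_pool;

static int fixed_pool_init(fixed_pool *p, size_t num_chunks, size_t chunk_size) {
    p->chunks = malloc(num_chunks * sizeof(char *));
    if (!p->chunks)
        return -1;
    while (chunk_size > 0) {
        size_t i;
        for (i = 0; i < num_chunks; i++) {
            p->chunks[i] = malloc(chunk_size);
            if (!p->chunks[i])
                break;
        }
        if (i == num_chunks) { /* every chunk succeeded */
            p->num_chunks = num_chunks;
            p->chunk_size = chunk_size;
            return 0;
        }
        while (i > 0)          /* discard partial progress, retry smaller */
            free(p->chunks[--i]);
        chunk_size /= 2;
    }
    free(p->chunks);
    return -1;
}

/* O(1) lookup: block = index / block_size, offset = index % block_size */
static char *fixed_pool_at(const fixed_pool *p, size_t index) {
    return &p->chunks[index / p->chunk_size][index % p->chunk_size];
}
```

For example, with 8-byte chunks, index 13 lands in chunk `13 / 8 = 1` at offset `13 % 8 = 5`.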

Alternatively, a tree-like approach could be applied to reduce the theoretical O(n), but in practice any performance gain would likely be overwhelmed by the increased constant factors of such an algorithm.

@soryy708
Owner

soryy708 commented Jul 4, 2018

> It would be MUCH faster if the block size was known in advance

So you mean that instead of each block storing just its own length, it should store the sum of the lengths of all blocks before it?
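For context, one way such cumulative lengths are typically used (an illustrative sketch, not code from this repository) is a binary search over per-block end offsets, which brings lookup from O(n) down to O(log n):

```c
#include <stddef.h>

/* Illustrative: cum_end[i] is the cumulative end offset of block i,
 * i.e. the sum of the lengths of blocks 0..i. Binary search returns
 * the first block whose end offset exceeds the requested index. */
static size_t find_block(const size_t *cum_end, size_t num_blocks, size_t index) {
    size_t lo = 0, hi = num_blocks;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (index < cum_end[mid])
            hi = mid;      /* index lies in this block or an earlier one */
        else
            lo = mid + 1;  /* index lies in a later block */
    }
    return lo;
}
```

With blocks of lengths 4, 6, and 6 (`cum_end = {4, 10, 16}`), index 4 falls in block 1 and index 15 in block 2.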

> if we assume all of the memory will be used

Why assume that?

> this can result in gains in amortized time

I'd like to benchmark that.

> It should be preferable to allocate a list of fixed size regions

I disagree. Wouldn't it be better to allocate regions that are as big as we can get away with? The fewer regions we have, the better we'll perform.

> throwing away all previously allocated segments that may have succeeded

Sounds wasteful.

> a tree-like approach could be applied to reduce the theoretical O(n) but any performance increase, in reality, would likely be overwhelmed by increased constant runtime components of any algorithm.

Need to benchmark that.

@garethgeorge
Author

Thanks for getting back to me; you raise some good points. Would you have any objections to my proposing a pull request? I found your post via a Quora article and have been giving the problem some thought.

I'm not sure the number of regions matters, except if they become small enough to affect cache locality; what really matters is the speed of determining which block of memory to index into.

If you use fixed-size chunks, it becomes very fast to calculate which chunk to index into.

You are right that throwing away successfully allocated chunks is wasteful, so I gave that some thought. Those chunks could actually be subdivided rather than discarded, and this maintains great cache locality.
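A rough sketch of the subdivision idea (hypothetical code, assuming the chunk array has room for the doubled count): when a full-size allocation fails and the chunk size is halved, each already-allocated chunk is split into two half-size chunks instead of being freed.

```c
#include <stddef.h>

/* Illustrative sketch: split each of num_chunks chunks of old_size bytes
 * into two chunks of old_size / 2 bytes, in place. `chunks` must have
 * capacity for 2 * num_chunks entries. Returns the new chunk count. */
static size_t subdivide(char **chunks, size_t num_chunks, size_t old_size) {
    size_t half = old_size / 2;
    for (size_t i = num_chunks; i-- > 0; ) { /* backwards: don't clobber */
        chunks[2 * i]     = chunks[i];        /* first half keeps the base */
        chunks[2 * i + 1] = chunks[i] + half; /* second half starts midway */
    }
    return 2 * num_chunks;
}
```

No memory is freed or reallocated; the halves of one original chunk stay adjacent, which is what preserves cache locality.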

The general goal, though, would be to simplify the algorithm by making the chunks all the same size, reducing complexity and gaining back speed.

@soryy708
Owner

soryy708 commented Jul 5, 2018

> I'm not sure the number of regions matters except if they become small enough to affect cache locality, what really matters is the speed of determining what block of memory to index into.

Yes, but the speed of determining which block of memory to index into depends on the number of blocks. Given a constant total length, smaller blocks mean a larger number of them.

> If you use fixed size chunks then it becomes very fast to calculate what chunk to index into.

Yes, but how do you determine what the size should be?

> I gave that some thought. These chunks could actually be sub-divided rather than being discarded

Yes, but that would complicate the housekeeping logic behind the scenes.

> The general goal though would be to simplify the algorithm by making the chunks all the same size, reducing complexity, and gaining back speed.

A noble goal, but the proposed method poses new challenges that may be fun to tackle.

> Would you have any objections to my proposing a pull request?

No objections at all! Feel free to fork the repository and make all the changes you need.
It's under the MIT license anyway.
When you're done, open a pull request and I'll review your code.

I've added a "how to contribute" section to the README.md file.
