
Performance Issue #2

Open
garethgeorge opened this issue Jul 3, 2018 · 3 comments

Comments

@garethgeorge

This implementation traverses the list of blocks on every access, checking the size of each block. It would be MUCH faster if the block size were known in advance. It is presumably also acceptable to sacrifice allocation time for indexing time; if we assume all of the memory will be used, this can result in gains in amortized time.

It would be preferable to allocate a list of fixed-size regions, and if any of those allocations fails, reduce the size by a factor of two (throwing away all previously allocated segments that may have succeeded) and retry, then index into this structure.

This would allow the block to be computed as `index / block_size` and the offset as `index % block_size`.
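A minimal C sketch of this scheme (names like `fixed_pool` are hypothetical, not from this repository; it keeps the chunk count fixed and halves the per-chunk size after any failure):

```c
#include <stdlib.h>

/* Hypothetical sketch: allocate num_chunks equally sized chunks, halving
 * the chunk size whenever any allocation fails, then index in O(1). */
typedef struct {
    char **chunks;     /* array of equally sized chunks */
    size_t num_chunks;
    size_t chunk_size; /* bytes per chunk, same for every chunk */
} fixed_pool;

static int fixed_pool_init(fixed_pool *p, size_t num_chunks, size_t chunk_size) {
    p->chunks = malloc(num_chunks * sizeof(char *));
    if (!p->chunks)
        return -1;
    while (chunk_size > 0) {
        size_t i;
        for (i = 0; i < num_chunks; i++) {
            p->chunks[i] = malloc(chunk_size);
            if (!p->chunks[i])
                break;
        }
        if (i == num_chunks) { /* every chunk succeeded */
            p->num_chunks = num_chunks;
            p->chunk_size = chunk_size;
            return 0;
        }
        while (i > 0)          /* discard partial progress, retry smaller */
            free(p->chunks[--i]);
        chunk_size /= 2;
    }
    free(p->chunks);
    return -1;
}

/* O(1) lookup: block = index / block_size, offset = index % block_size */
static char *fixed_pool_at(const fixed_pool *p, size_t index) {
    return &p->chunks[index / p->chunk_size][index % p->chunk_size];
}
```

For example, with 8-byte chunks, index 13 lands in chunk `13 / 8 = 1` at offset `13 % 8 = 5`.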

Alternatively, a tree-like approach could be applied to reduce the theoretical O(n), but in practice any performance gain would likely be overwhelmed by the increased constant factors of such an algorithm.

@soryy708
Owner

soryy708 commented Jul 4, 2018

> It would be MUCH faster if the block size was known in advance

So you mean that instead of each block storing just its own length, it should store the sum of the lengths of all blocks before it?
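For context, one way such cumulative lengths are typically used (an illustrative sketch, not code from this repository) is a binary search over per-block end offsets, which brings lookup from O(n) down to O(log n):

```c
#include <stddef.h>

/* Illustrative: cum_end[i] is the cumulative end offset of block i,
 * i.e. the sum of the lengths of blocks 0..i. Binary search returns
 * the first block whose end offset exceeds the requested index. */
static size_t find_block(const size_t *cum_end, size_t num_blocks, size_t index) {
    size_t lo = 0, hi = num_blocks;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (index < cum_end[mid])
            hi = mid;      /* index lies in this block or an earlier one */
        else
            lo = mid + 1;  /* index lies in a later block */
    }
    return lo;
}
```

With blocks of lengths 4, 6, and 6 (`cum_end = {4, 10, 16}`), index 4 falls in block 1 and index 15 in block 2.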

> if we assume all of the memory will be used

Why assume that?

> this can result in gains in amortized time

I'd like to benchmark that.

> It should be preferable to allocate a list of fixed size regions

I disagree. Wouldn't it be better to allocate regions that are as big as we can get away with? The fewer regions we have, the better we'll perform.

> throwing away all previously allocated segments that may have succeeded

Sounds wasteful.

> a tree-like approach could be applied to reduce the theoretical O(n) but any performance increase, in reality, would likely be overwhelmed by increased constant runtime components of any algorithm.

Need to benchmark that.

@garethgeorge
Author

Thanks for getting back to me; you raise some good points. Would you have any objections to my proposing a pull request? I found your post via a Quora article and have been giving the problem some thought.

I'm not sure the number of regions matters, except if they become small enough to affect cache locality; what really matters is the speed of determining which block of memory to index into.

If you use fixed-size chunks, it becomes very fast to calculate which chunk to index into.

You are right that throwing away successfully allocated chunks is wasteful, so I gave that some thought. Those chunks could actually be subdivided rather than discarded, and this maintains great cache locality.
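A rough sketch of the subdivision idea (hypothetical code, assuming the chunk array has room for the doubled count): when a full-size allocation fails and the chunk size is halved, each already-allocated chunk is split into two half-size chunks instead of being freed.

```c
#include <stddef.h>

/* Illustrative sketch: split each of num_chunks chunks of old_size bytes
 * into two chunks of old_size / 2 bytes, in place. `chunks` must have
 * capacity for 2 * num_chunks entries. Returns the new chunk count. */
static size_t subdivide(char **chunks, size_t num_chunks, size_t old_size) {
    size_t half = old_size / 2;
    for (size_t i = num_chunks; i-- > 0; ) { /* backwards: don't clobber */
        chunks[2 * i]     = chunks[i];        /* first half keeps the base */
        chunks[2 * i + 1] = chunks[i] + half; /* second half starts midway */
    }
    return 2 * num_chunks;
}
```

No memory is freed or reallocated; the halves of one original chunk stay adjacent, which is what preserves cache locality.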

The general goal, though, would be to simplify the algorithm by making the chunks all the same size, reducing complexity and gaining back speed.

@soryy708
Owner

soryy708 commented Jul 5, 2018

> I'm not sure the number of regions matters except if they become small enough to affect cache locality, what really matters is the speed of determining what block of memory to index into.

Yes, but the speed of determining which block of memory to index into depends on the number of blocks. Given a constant total length, smaller blocks mean a larger number of them.

> If you use fixed size chunks then it becomes very fast to calculate what chunk to index into.

Yes, but how do you determine what the size should be?

> I gave that some thought. These chunks could actually be sub-divided rather than being discarded

Yes, but that would complicate the housekeeping logic behind the scenes.

> The general goal though would be to simplify the algorithm by making the chunks all the same size, reducing complexity, and gaining back speed.

A noble goal, but the proposed method poses new challenges that may be fun to tackle.

> Would you have any objections to my proposing a pull request?

No objections at all! Feel free to fork the repository and make all the changes you need.
It's under the MIT license anyway.
When you're done, open a pull request and I'll review your code.

I've added a "how to contribute" section to the README.md file.
