Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Priority queue / self-sorting list [can PR] #1791

Open
mcclure opened this issue Oct 8, 2020 · 8 comments
Open

Feature request: Priority queue / self-sorting list [can PR] #1791

mcclure opened this issue Oct 8, 2020 · 8 comments
Labels
awaiting-response Awaiting response from creator. Can't fix without it. enhancement

Comments

@mcclure
Copy link
Contributor

mcclure commented Oct 8, 2020

Hi, I have a need for a "priority queue" style data structure, or more specifically I need an immutable structure that I can add items to in arbitrary order and then repeatedly efficiently iterate over according to some fixed property (for example if I have a list of objects obj, I might want to keep it sorted by the key obj.date).

I need this enough I plan to attempt an implementation myself. I would be happy to write a PR if there is a specific way I could write it that you would be willing to accept.

I find a number of algorithms online for immutable priority queues, but it seems like it might be a good idea to base my implementation on the existing list structure rather than implementing a totally new structure. For this reason I am curious if there is any "contributor documentation" that would help me understand the underlying principles of immutable.js List. I am pawing through the source but not having much luck understanding where the important parts are. List.js seems to be describing something like a linked list but each node has this "array" member rather than a next reference and I'm not sure what "array" contains.

@mcclure
Copy link
Contributor Author

mcclure commented Oct 9, 2020

Got some help on Twitter, here's the best I can understand: List is an "Vector Trie" apparently patterned on the Scala vector type, meaning it's a tree where every leaf is an array of maximum 32 elements. Bottom leaves other than the rightmost are assumed to be filled and when you lookup an index the node you're looking for is found by pure math, which is why midlist insert is O(N).

My plan is:

  • Subclass List; take the "key function" that turns values into keys as an constructor argument
  • Change SortedList so every node at every level of the tree knows its max and min relative to the "key value"
  • Kill the random-access get() and set() functions (maybe this is a sign List should not be a superclass), only pop front, pop back and iterate are allowed
  • Since every level knows its min/max, and random access is disabled, index arithmetic is no longer needed to calculate the leaf node corresponding to a value. This means when I remove an item from a leaf node I don't have to repack, and if a mid-list insertion would push a leaf node above 32 elements I can simply split it into two 16-leaf nodes.

On top of this basic plan, I may attempt the following two nice-to-haves:

  • If I get many pushes at once via withMutations, rather than doing a search for each push I can sort the list of elements-to-push conventionally and then do something resembling a mergesort merge.
  • Since "indexes" are not meaningful with this approach, some variants of slice() or toSliceSeq() might be nice that let you say, for example, "give me all of this list newer than key X" or "give me the ten values less than or equal to key Y".

I haven't done any proofs this is actually efficient, but my intuition is that because removal is generally only done at the beginning and end and insertion cannot reduce leaf size below 16, that this will give between O(log_16(n)) and O(log_32(n)) performance on insertions, front pops and end pops, and this is provable.

This approach would resolve the problem in my application. If I get it working, would you be interested in merging it? What requirements do you have that a new feature like this has adequate perf (ie, do you require proofs, or just verification of good average-case performance in testing?)

Things I'm not clear on yet:

  • What is the "_origin" member in the List.js implementation?
  • What is the "__hash" member in the List.js implementation?
  • Is there a difference between butLast() and pop()?

I am also seeing a problem where if I run npm install in an immutable-js checkout, I get node-gyp errors related to the v8 headers and ending in "Failed at the microtime@2.1.8 install script". This is on OS X 10.13.6 with npm 6.14.8, node v14.10.1. I have not investigated this yet. I do not get this error when npm installing a project with immutable-js as a dependency so I assume this is a problem with one of the devDependencies. EDIT: fixed this with PR #1792. However I am still unable to run npm run build or npm run test because I get "Error: Cannot find module 'react/addons'", and I am seeing this mysterious error when I run git run perf:

Pearl:immutable-js mcc$ npm run perf

> immutable@4.0.0-rc.12 perf /Users/mcc/work/h/tempt2/fork/immutable-js
> node ./resources/bench.js

ugh Error: Command failed: git show master:dist/immutable.js
fatal: Path 'dist/immutable.js' exists on disk, but not in 'master'.

    at ChildProcess.exithandler (child_process.js:308:12)
    at ChildProcess.emit (events.js:314:20)
    at maybeClose (internal/child_process.js:1047:16)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:287:5)

I can't even figure out what script is trying to invoke git show.

@mcclure
Copy link
Contributor Author

mcclure commented Oct 9, 2020

(One other small thing: an alternative to using the "key function" would be make this a key/value store and sort on the key. This would result in the same implementation but a slightly different external interface. I think this approach is a little weird because you can have multiple values sharing a single key, but I don't have a strong opinion either way about it.)

@mcclure
Copy link
Contributor Author

mcclure commented Oct 10, 2020

ownerID in List is only needed for the implementation of mutable mode, right?

@mcclure
Copy link
Contributor Author

mcclure commented Oct 10, 2020

Also, what are the "@@transducer/" functions for? I do not find any description of them in the docs.

@mcclure
Copy link
Contributor Author

mcclure commented Oct 12, 2020

Made a StackOverflow question about my difficulties running npm run test / npm run perf: https://stackoverflow.com/questions/64325833/how-to-run-perf-tests-automated-tests-in-an-immutable-js-repo

@mcclure
Copy link
Contributor Author

mcclure commented Oct 13, 2020

I've started on an implementation of this, feature branch is here https://github.com/mcclure/immutable-js/tree/pq

This adds an exported SortedList and isSortedList. SortedList has a complete implementation of one function, add(). If you call it, it crashes. (The implementation of "add" turned out to be more complicated than I thought! I don't actually know whether my approach is actually more efficient than just using a balanced tree or something.)

Will make an initial PR once I have add, pop and shift working.

@jdeniau
Copy link
Member

jdeniau commented Jul 19, 2021

@mcclure can you re-open [your PR][https://github.com/immutable-js-oss/pull/183) here as you said that it is not ready to be merged in the old fork ?

@jdeniau jdeniau added the awaiting-response Awaiting response from creator. Can't fix without it. label Jul 19, 2021
@mcclure
Copy link
Contributor Author

mcclure commented Jul 19, 2021

I have a version of the PR here also, actually, it is this:
#1794

There are some questions (i think some of the Set methods don't work), it would be good for someone with more familiarity to look at which standard interface bits are not present. I think I failed to implement the cursor method?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-response Awaiting response from creator. Can't fix without it. enhancement
Projects
None yet
Development

No branches or pull requests

3 participants