Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How are you going to implement ordered iteration? #1

Open
kmike opened this issue Jul 20, 2012 · 4 comments
Open

How are you going to implement ordered iteration? #1

kmike opened this issue Jul 20, 2012 · 4 comments

Comments

@kmike
Copy link

kmike commented Jul 20, 2012

How are you going to implement ordered iteration for HAT-Trie? It is not implemented in both C and C++ library mentioned in README.

@dlecocq
Copy link
Owner

dlecocq commented Jul 20, 2012

The original paper (as far as I can tell) is a little vague on this subject, suggesting that sorting be done on the fly for the array-hash buckets. Certainly not the most glamorous approach, though.

@kmike
Copy link
Author

kmike commented Jul 20, 2012

So is it the limitation of HAT-Tries that sorted iteration will be inefficient and some standard trie operations (such as prefix searches) will be slowish (at least much slower than using standard tries, PAT-trees or DA-tries)? This makes the application of HAT-tries quite limited. I've read the original paper this way but maybe you have some ideas?

Please excuse me for polluting the bug tracker. I'm discussing this in github issues because you seems to be actively interested in this topic and github doesn't have private messages :)

@dlecocq
Copy link
Owner

dlecocq commented Jul 20, 2012

It's hard to say definitively what the performance impact will be until it's implemented, of course. In particular, Askitis' intuition was that the improved caching of a shorter tree and the fewer levels of indirection in leaves would improve performance.

In the case of prefix searches, I believe the number of non-matching items to search would be limited to 2 * max_bucket_size, though in-order traversal may prove slow. I'm inclined to think that the caching benefits aren't as profound as advertised, but lacking more than a couple of implementations to benchmark (and in particular lacking the Askitis' implementation), it's anyone's guess.

I think issues are a perfectly fine venue for this kind of discussion!

@kmike
Copy link
Author

kmike commented Jul 21, 2012

You're right, there is nothing to talk about without an implementation or 2. Thanks for the discussion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants