Optimize for parallel I/O #82
Comments
To clarify, when you say "I/O" do you mean both reading and writing in parallel? Or are you strictly talking about reading?
I was mostly thinking about reading. Writing in parallel would require chunks, I think? It seems to me that would require a radically different approach, since features can vary a lot in size.
I agree, just making sure I understood!
Thinking about this again, the current format does support concurrent writing. It is most efficient in the non-indexed form, where order does not matter. Even in the indexed form, once the order has been determined, features could be written in buckets and assembled in the right order at the end. Reading the non-indexed form concurrently is still problematic, but it doesn't really need a full feature offset index. It only needs a set of offsets (up to the maximum concurrency), and that could be a new optional, backwards-compatible array in the header. I'll think on it for a while and perhaps introduce that. 🙂
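To make the "set of offsets" idea concrete, here is a minimal sketch. It is not FlatGeobuf's actual API or layout: it assumes only that each record carries a 4-byte little-endian size prefix, and `write_records`, `read_range`, `read_parallel` and the `split_offsets` array are hypothetical names standing in for the proposed optional header array.

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def write_records(payloads, max_splits):
    """Serialize size-prefixed records and pick up to max_splits start offsets."""
    buf = bytearray()
    starts = []
    for payload in payloads:
        starts.append(len(buf))
        buf += struct.pack("<I", len(payload)) + payload
    # Keep evenly spaced record boundaries (hypothetical header array).
    n = min(max_splits, len(starts))
    step = len(starts) / n if n else 0
    split_offsets = [starts[int(i * step)] for i in range(n)]
    return bytes(buf), split_offsets


def read_range(buf, start, end):
    """Sequentially decode the records stored in buf[start:end]."""
    out = []
    pos = start
    while pos < end:
        (size,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        out.append(buf[pos:pos + size])
        pos += size
    return out


def read_parallel(buf, split_offsets):
    """Decode each range on its own worker, then concatenate in file order."""
    bounds = list(zip(split_offsets, split_offsets[1:] + [len(buf)]))
    if not bounds:
        return []
    with ThreadPoolExecutor(max_workers=len(bounds)) as pool:
        parts = pool.map(lambda b: read_range(buf, *b), bounds)
        return [rec for part in parts for rec in part]
```

Each worker still scans its own range sequentially, so the array only needs as many entries as the desired concurrency, not one per feature.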
Non-indexed FlatGeobuf files under the v3 spec aren't suitable for massively parallel I/O. I think what is needed to do this is one of:
I'm leaning toward a feature index, possibly as a post-data section to allow streaming writes.
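A rough sketch of what a post-data feature index could look like, under assumptions that are mine rather than anything in the spec: size-prefixed records, followed by one little-endian uint64 offset per record, followed by a uint64 count as a footer. The function names are hypothetical, and offsets are relative to the start of the record section.

```python
import io
import struct


def write_with_trailing_index(stream, payloads):
    """Stream records out, then append the offsets and a count footer."""
    offsets = []
    pos = 0
    for payload in payloads:  # any iterable works, including a generator
        offsets.append(pos)
        record = struct.pack("<I", len(payload)) + payload
        stream.write(record)
        pos += len(record)
    # The index is only written once all records are out, so the writer
    # never needs to seek back or know the feature count up front.
    for off in offsets:
        stream.write(struct.pack("<Q", off))
    stream.write(struct.pack("<Q", len(offsets)))


def read_feature(data, i):
    """Random access to record i via the trailing index."""
    (count,) = struct.unpack_from("<Q", data, len(data) - 8)
    if not 0 <= i < count:
        raise IndexError(i)
    index_start = len(data) - 8 - 8 * count
    (off,) = struct.unpack_from("<Q", data, index_start + 8 * i)
    (size,) = struct.unpack_from("<I", data, off)
    return data[off + 4:off + 4 + size]
```

The trade-off is that a reader has to fetch the end of the file first to locate the index, which costs an extra request when reading over HTTP.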