Change topojson serialization format #1099
Conversation
(branch updated from 3ac8ed3 to 16ba9c0)
Looks good 👍
I deleted the `topo.buf` for the NJ config I have in my database, then recreated it and confirmed that it loaded and I could make a map.

Re performance: I found this comparison of JavaScript encoding tools, and this PR that adds `v8.serialize` to the mix. It seems like the tools that are notably faster than `v8.serialize` require a predefined schema, which is probably more trouble than it would be worth for our purposes.
```js
await asyncLoop(
  sortedRegions.map(region => () => {
    const promise = this.fetchLayer(region.s3URI, region.archived);
    this._layers = { ...this._layers, [region.s3URI]: promise };
```
I wish `asyncLoop` could work as a throttled `map` rather than a throttled `forEach`, so we wouldn't have to collect the results as a side effect inside this mapped function. But my attempts to mentally revise it to work that way have not gone well, so this will do just fine.
Yeah I tried / failed at this as well - good to know it wasn't just me 😅
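A throttled `map` along the lines discussed above might be sketched like this (a hypothetical helper, not the project's `asyncLoop`): a fixed pool of workers pulls the next index, and results land in order rather than as side effects.

```javascript
// Hypothetical throttled map: run fn over items with at most `concurrency`
// tasks in flight, returning results in input order.
async function throttledMap(items, fn, concurrency = 4) {
  const results = new Array(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index. Index claiming
  // is safe without locks because JS runs this code on a single thread.
  async function worker() {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }

  // Spawn at most `concurrency` workers (fewer if there are fewer items).
  await Promise.all(
    Array.from({ length: Math.min(concurrency, items.length) }, worker)
  );
  return results;
}
```

With this shape, the region-loading loop could return its layer promises directly instead of stitching them into `this._layers` inside the mapped function.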
Uses a queue to load states concurrently without overwhelming the application by attempting to load all states at once. Tried a few different combinations, and this seemed to be the most stable.
(branch updated from 00380cf to cb22be0)
Overview
Upgrading to Node.js 16 in #602 caused a regression in data loading: the maximum size of a string was reduced from ~1 GB to ~512 MB, which meant we could no longer load our two largest regions via `JSON.parse(buffer.toString())`.
This PR takes a multi-step approach to tackling this problem:
`load-dev-data`, but I still need to update production, and any one-off regions will no longer work unless updated.

After testing this branch on staging, I encountered a lot of resource contention and a large number of layers failed to load the first time, so I added 807dc9d & e8ea4f1 to make the `TopologyService` more robust.

Checklist
`CHANGELOG.md` and grouped with similar changes, if possible

Demo
Updated healthcheck, showing which layers are loading / pending.
Notes
I was hopeful this would decrease our load times, but it hasn't done so to any meaningful degree. We did buy a lot of headroom in terms of TopoJSON file size: the max file size for `v8.deserialize` is ~4GB.

Testing Instructions
Try running `./scripts/manage serialize-topojson ...` locally, and make sure you can run the server.

I tested this against staging, and was able to get staging to load successfully.
Closes #1095