Run reindex on monthly schedule #1101

Merged: 3 commits, Jan 20, 2022

Conversation

@maurizi (Contributor) commented Jan 11, 2022

Overview

Our largest table receives frequent updates, which leads to index bloat over time.
I have been running REINDEX TABLE CONCURRENTLY project by hand, but we would prefer to automate this.

This PR automates that by adding a stand-alone Nest.js application that runs REINDEX, which is then invoked by a scheduled ECS task.
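For readers who want a picture of what such a script does, here is a minimal sketch in TypeScript. It is not the code from this PR: the QueryRunner interface and the function names are hypothetical stand-ins for whatever database client the Nest.js application actually uses, and the Nest.js bootstrapping itself is left out.

```typescript
// Hypothetical minimal interface for the database client; the real
// application's client (for example a TypeORM connection) is assumed
// to expose something equivalent.
interface QueryRunner {
  query(sql: string): Promise<unknown>;
}

// Accept only plain lowercase identifiers, then quote them, so the table
// name can never inject extra SQL into the statement.
function quoteIdentifier(name: string): string {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error(`Unsafe table name: ${name}`);
  }
  return `"${name}"`;
}

// Build the same statement that was previously run by hand.
function buildReindexStatement(table: string): string {
  return `REINDEX TABLE CONCURRENTLY ${quoteIdentifier(table)}`;
}

// REINDEX ... CONCURRENTLY cannot run inside a transaction block, so the
// statement is sent on its own rather than wrapped in BEGIN/COMMIT.
async function reindexTable(db: QueryRunner, table: string): Promise<void> {
  await db.query(buildReindexStatement(table));
}
```

The scheduled task would then only need to call reindexTable(db, "project") once and exit, with a non-zero exit code on failure so the failed run is visible in the task's status.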

Checklist

  • Description of PR is in an appropriate section of CHANGELOG.md and grouped with similar changes, if possible

Notes

  • This is both our first scheduled ECS task and our first stand-alone Nest.js application, so I had no internal examples to draw on. Besides the documentation, I used this blog post for inspiration.
  • This builds on Change topojson serialization format #1099. I'll rebase this and change the PR's base branch once that's merged.

Testing Instructions

I'm not quite sure how to test this end-to-end.

I've run the reindex script locally. I also ran scripts/infra plan and saw the new resources I expected in the plan output, but I haven't run scripts/infra apply yet, and I don't know of a good way to verify that this works other than waiting until February.

Closes #1086

@maurizi force-pushed the test/mvm/change-topojson-serialization branch from 00380cf to cb22be0 on January 12, 2022 14:11
Base automatically changed from test/mvm/change-topojson-serialization to develop on January 12, 2022 14:24
@nanotubing (Collaborator) left a comment

I reviewed the reindex-related code in this commit, and it seems reasonable to me, though I agree that this is hard to test.

You could create a very small test table and schedule a job to reindex it hourly. That would give you a test job that repeats often enough to troubleshoot a potential issue while putting only a trivial load on the server.
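One way to support that kind of test job is to make the target table configurable, so the same script can be pointed at a tiny table on a faster schedule. The sketch below assumes a hypothetical REINDEX_TABLE environment variable and a hypothetical reindex_smoke_test table; neither appears in this PR.

```typescript
// Settings for one scheduled reindex job (hypothetical shape).
interface ReindexJobConfig {
  table: string;
}

// Read the target table from the environment, defaulting to the production
// table. REINDEX_TABLE is an assumed variable name, not one defined by the PR.
function jobConfigFromEnv(env: Record<string, string | undefined>): ReindexJobConfig {
  const table = (env["REINDEX_TABLE"] ?? "project").trim();
  // Accept only plain lowercase identifiers so the value is safe to place
  // in a REINDEX statement.
  if (!/^[a-z_][a-z0-9_]*$/.test(table)) {
    throw new Error(`Invalid REINDEX_TABLE value: ${table}`);
  }
  return { table };
}
```

An hourly test task could set REINDEX_TABLE=reindex_smoke_test in its container environment, while the monthly task leaves the variable unset and keeps targeting project.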

@maurizi force-pushed the feature/mvm/reindex-on-schedule branch from e8224a0 to 21f39b0 on January 19, 2022 20:47
@maurizi merged commit bc6a8ab into develop on Jan 20, 2022
@maurizi deleted the feature/mvm/reindex-on-schedule branch on January 20, 2022 15:12
Linked issue: Add cron job to run concurrent REINDEX on project TABLE on a regular basis