SIGTERM no longer reloads data properly - just adds more workers #178

Open
jakubklimek opened this issue Feb 23, 2023 · 2 comments

@jakubklimek

For a couple of years, I have had an LDF server set up as a Linux service (systemd), similar to the guide in the wiki:

[Unit]
Description=Linked Data Fragments Server
After=network.target

[Service]
Type=simple
User=ldf
WorkingDirectory=/opt/ldf-server
ExecStart=/usr/bin/ldf-server config.json 5000 8
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target

Note the ExecReload=/bin/kill -HUP $MAINPID line. Although I can no longer find where this is documented, I had figured that SIGTERM causes the server to reload its data. However, after an update some time ago (I am not sure exactly when, maybe a year ago), this started to cause ldf-server to spin up additional workers. I start with 8 workers; after SIGTERM there are 16, and so on. Eventually, the VM runs out of memory and a process gets killed (not necessarily ldf-server).
In syslog, I see entries like these (and many more):

Feb 23 08:23:10 nkod-test-db ldf-server[91215]: Worker 91215 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:10 nkod-test-db ldf-server[90236]: Worker 90251died with SIGTERM. Starting new worker.
Feb 23 08:23:10 nkod-test-db ldf-server[90236]: Worker 91215 replaces killed worker 90251.
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Predicate Bitmap in 29 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Count predicates in 11 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Count Objects in 6 us Max was: 4
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Bitmap in 6 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Object references in 26 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Sort lists in 16 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Index generated in 152 us
Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Worker 91214 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Predicate Bitmap in 32 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Count predicates in 11 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Count Objects in 7 us Max was: 4
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Bitmap in 5 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Object references in 26 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Sort lists in 28 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Index generated in 172 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Predicate Bitmap in 66 us
Feb 23 08:23:11 nkod-test-db ldf-server[91237]: Worker 91237 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Count predicates in 41 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Count Objects in 30 us Max was: 4
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Bitmap in 28 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Object references in 32 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Sort lists in 19 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Index generated in 1 ms 299 us
Feb 23 08:23:11 nkod-test-db ldf-server[91238]: Worker 91238 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:11 nkod-test-db ldf-server[90236]: Worker 90249died with SIGTERM. Starting new worker.
Feb 23 08:23:11 nkod-test-db ldf-server[90236]: Worker 91238 replaces killed worker 90249.
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Predicate Bitmap in 39 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Count predicates in 12 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Count Objects in 7 us Max was: 4
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Bitmap in 6 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Object references in 24 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Sort lists in 15 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Index generated in 166 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Predicate Bitmap in 31 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Count predicates in 12 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Count Objects in 6 us Max was: 4
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Bitmap in 6 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Object references in 25 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Sort lists in 20 us
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Index generated in 163 us
Feb 23 08:23:13 nkod-test-db ldf-server[91259]: Worker 91259 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:13 nkod-test-db ldf-server[91260]: Worker 91260 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:13 nkod-test-db ldf-server[90236]: Worker 90248died with SIGTERM. Starting new worker.
Feb 23 08:23:13 nkod-test-db ldf-server[90236]: Worker 91260 replaces killed worker 90248.
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Predicate Bitmap in 43 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Count predicates in 11 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Count Objects in 6 us Max was: 4
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Bitmap in 6 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Object references in 24 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Sort lists in 16 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Index generated in 169 us
Feb 23 08:23:14 nkod-test-db ldf-server[91281]: Worker 91281 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Predicate Bitmap in 34 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Count predicates in 12 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Count Objects in 7 us Max was: 4
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Bitmap in 5 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Object references in 25 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Sort lists in 15 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Index generated in 159 us
Feb 23 08:23:14 nkod-test-db ldf-server[91282]: Worker 91282 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).
Feb 23 08:23:14 nkod-test-db ldf-server[90236]: Worker 90247died with SIGTERM. Starting new worker.
Feb 23 08:23:14 nkod-test-db ldf-server[90236]: Worker 91282 replaces killed worker 90247.
Feb 23 08:23:14 nkod-test-db ldf-server[90236]: Respawned all workers of master 90236.
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Predicate Bitmap in 40 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Count predicates in 12 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Count Objects in 7 us Max was: 4
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Bitmap in 6 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Bitmap bits: 97 Ones: 73
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Object references in 30 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Sort lists in 17 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Index generated in 174 us
Feb 23 08:23:15 nkod-test-db ldf-server[91303]: Worker 91303 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).

Is there another preferred way of telling a running ldf-server to reload its data (after a data update), or is this the intended way and there is a bug? Or should I just shut it down and start it again?
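
To gauge the growth from syslog, the excerpt can be tallied with a short script. This is a minimal sketch, assuming the message formats shown in the excerpt above; `tallyWorkers` and its regular expressions are hypothetical helpers and not part of ldf-server:

```javascript
// Tally worker starts and deaths in ldf-server syslog lines.
// The regular expressions are assumptions based on the excerpt above.
function tallyWorkers(logText) {
  const started = new Set();
  const died = new Set();
  for (const line of logText.split('\n')) {
    const run = line.match(/Worker (\d+) running on /);
    if (run) started.add(run[1]);
    // The log prints "Worker 90251died" without a space, so the space is optional.
    const dead = line.match(/Worker (\d+)\s*died with (\w+)/);
    if (dead) died.add(dead[1]);
  }
  return { started: started.size, died: died.size, net: started.size - died.size };
}

// Four lines copied from the excerpt above.
const sample = [
  'Feb 23 08:23:10 nkod-test-db ldf-server[91215]: Worker 91215 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).',
  'Feb 23 08:23:10 nkod-test-db ldf-server[90236]: Worker 90251died with SIGTERM. Starting new worker.',
  'Feb 23 08:23:10 nkod-test-db ldf-server[90236]: Worker 91215 replaces killed worker 90251.',
  'Feb 23 08:23:10 nkod-test-db ldf-server[91214]: Worker 91214 running on http://localhost:5000/ (URL: https://pod-test.mvcr.gov.cz/ldf/).',
].join('\n');

console.log(tallyWorkers(sample)); // { started: 2, died: 1, net: 1 }
```

A positive `net` after a reload means more workers were started than were terminated, which matches the growth from 8 to 16 described above.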

@rubensworks
Member

That definitely sounds like an issue.
Probably a bug in here somewhere: https://github.com/LinkedDataFragments/Server.js/blob/master/packages/core/lib/CliRunner.js#L48

Could you check if you encounter the same problem with SIGHUP?

> Or should I just shut it down and start again?

It should not be needed, but it sounds like a good fallback until this bug is resolved.

@jakubklimek
Author

Actually, the script (see the systemd snippet above) uses SIGHUP, but in /var/log/syslog I see messages about receiving SIGTERM.
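
One possible reading of this, not confirmed in this thread: the master receives SIGHUP and then terminates its old workers itself, and Node's `worker.kill()` sends SIGTERM by default, which would explain the "died with SIGTERM" messages. The log also shows both "Starting new worker" and "replaces killed worker" for the same old worker, which suggests that each terminated worker ends up with two replacements. The toy model below illustrates that bookkeeping under these assumptions; it starts no real processes, and every name in it is hypothetical rather than taken from CliRunner.js:

```javascript
// Toy model of a cluster master; no real processes are started.
// All names are hypothetical and are not taken from CliRunner.js.
function simulateReload(workerCount, respawnOnDeliberateKill) {
  let nextPid = 1;
  const alive = new Set();
  const fork = () => { const pid = nextPid++; alive.add(pid); return pid; };

  // Exit handler: meant to replace crashed workers only.
  const onExit = (pid, deliberate) => {
    alive.delete(pid);
    if (!deliberate || respawnOnDeliberateKill) fork();
  };

  // Initial start-up.
  for (let i = 0; i < workerCount; i++) fork();

  // Reload path: for each old worker, start a replacement,
  // then terminate the old worker on purpose.
  for (const oldPid of [...alive]) {
    fork();
    onExit(oldPid, true);
  }
  return alive.size;
}

console.log(simulateReload(8, false)); // 8
console.log(simulateReload(8, true));  // 16
```

If the exit handler cannot tell a deliberate kill from a crash, every reload doubles the pool, which is the 8 to 16 pattern reported in the first post.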
