Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in phase 5 #141

Closed
theseion opened this issue Mar 18, 2023 · 11 comments
Closed

Race condition in phase 5 #141

theseion opened this issue Mar 18, 2023 · 11 comments
Labels
bug Something isn't working

Comments

@theseion
Copy link
Collaborator

Tests for rules in phase 5 are subject to a rare race condition, as exhibited in the following log for tests 980170-2 and 980170-3 (line breaks between rules added for readability):

[Sat Mar 18 16:47:21.474075 2023] [security2:error] [pid 193:tid 140523746522880] [client 172.18.0.1:39150] [client 172.18.0.1] ModSecurity: Warning. Pattern match "^.*$" at REQUEST_HEADERS:X-CRS-Test. [file "/etc/modsecurity.d/owasp-crs/crs-setup.conf"] [line "737"] [id "999999"] [msg "04709bb3-6b2b-477d-aa14-76a89363b2dd"] [tag "modsecurity"] [hostname "localhost"] [uri "/status/200"] [unique_id "ZBXrGVXqtKqlnATxdUEg7QAAANg"]
[Sat Mar 18 16:47:21.476378 2023] [security2:error] [pid 193:tid 140524082149120] [client 172.18.0.1:39164] [client 172.18.0.1] ModSecurity: Warning. Pattern match "(?:^([\\\\d.]+|\\\\[[\\\\da-f:]+\\\\]|[\\\\da-f:]+)(:[\\\\d]+)?$)" at REQUEST_HEADERS:Host. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] [line "761"] [id "920350"] [msg "Host header is a numeric IP address"] [data "127.0.0.1"] [severity "WARNING"] [ver "OWASP_CRS/4.0.0-rc1"] [tag "modsecurity"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-protocol"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/210/272"] [tag "PCI/6.5.10"] [hostname "127.0.0.1"] [uri "/"] [unique_id "ZBXrGVXqtKqlnATxdUEg7gAAAMQ"]
[Sat Mar 18 16:47:21.480771 2023] [security2:error] [pid 193:tid 140524098930432] [client 172.18.0.1:39172] [client 172.18.0.1] ModSecurity: Warning. Pattern match "^.*$" at REQUEST_HEADERS:X-CRS-Test. [file "/etc/modsecurity.d/owasp-crs/crs-setup.conf"] [line "737"] [id "999999"] [msg "04709bb3-6b2b-477d-aa14-76a89363b2dd"] [tag "modsecurity"] [hostname "localhost"] [uri "/status/200"] [unique_id "ZBXrGVXqtKqlnATxdUEg7wAAAMM"]


[Sat Mar 18 16:47:21.483110 2023] [security2:error] [pid 35:tid 140523931117312] [client 172.18.0.1:39180] [client 172.18.0.1] ModSecurity: Warning. Pattern match "^.*$" at REQUEST_HEADERS:X-CRS-Test. [file "/etc/modsecurity.d/owasp-crs/crs-setup.conf"] [line "737"] [id "999999"] [msg "c50b1e8e-e8a5-46d1-afc8-bf82368390f4"] [tag "modsecurity"] [hostname "localhost"] [uri "/status/200"] [unique_id "ZBXrGcs3WOXFKxhOjQ9D-AAAAE0"]
[Sat Mar 18 16:47:21.483333 2023] [security2:error] [pid 193:tid 140524082149120] [client 172.18.0.1:39164] [client 172.18.0.1] ModSecurity: Warning. Unconditional match in SecAction. [file "/etc/modsecurity.d/owasp-crs/rules/RESPONSE-980-CORRELATION.conf"] [line "96"] [id "980170"] [msg "Anomaly Scores: (Inbound Scores: blocking=3, detection=3, per_pl=3-0-0-0, threshold=5) - (Outbound Scores: blocking=0, detection=0, per_pl=0-0-0-0, threshold=4) - (SQLI=0, XSS=0, RFI=0, LFI=0, RCE=0, PHPI=0, HTTP=0, SESS=0, COMBINED_SCORE=3)"] [ver "OWASP_CRS/4.0.0-rc1"] [tag "modsecurity"] [tag "reporting"] [hostname "127.0.0.1"] [uri "/"] [unique_id "ZBXrGVXqtKqlnATxdUEg7gAAAMQ"]
[Sat Mar 18 16:47:21.489998 2023] [security2:error] [pid 193:tid 140523830429440] [client 172.18.0.1:39206] [client 172.18.0.1] ModSecurity: Warning. Pattern match "^.*$" at REQUEST_HEADERS:X-CRS-Test. [file "/etc/modsecurity.d/owasp-crs/crs-setup.conf"] [line "737"] [id "999999"] [msg "c50b1e8e-e8a5-46d1-afc8-bf82368390f4"] [tag "modsecurity"] [hostname "localhost"] [uri "/status/200"] [unique_id "ZBXrGVXqtKqlnATxdUEg8AAAANM"]

Rule 920350 is triggered as expected for the first test but rule 980170 is not. However, looking at the unique_id field it becomes clear that rule 980170 did in fact trigger but was reported as part of the second rule (the msg field of rule 999999 denotes start and stop ID of a rule).

The logic for finding triggered rules in the log is:

  1. send test request to mark start
  2. run test
  3. send test request to mark end
  4. while end marker not in log
    1. send test request to flush log
  5. check test against lines between start and end markers

Apparently, the log entry for rule 980170 may under some circumstances be written to the log much later and, which is worse, independently of other log entries of the same phase (both 999999 and 980170 run in phase 5), otherwise it would have shown up before the start marker for test 980170-3.

@fzipi
Copy link
Member

fzipi commented Mar 29, 2023

Is there an easy fix for this?

@theseion
Copy link
Collaborator Author

Not that I'm aware of. I didn't dive deeply though.

@fzipi
Copy link
Member

fzipi commented May 24, 2023

One thing that I see in common there is the unique_id. Maybe we can use it?

@theseion
Copy link
Collaborator Author

Maybe... Does that work reliably?

@RedXanadu
Copy link
Member

This happened again and caused an automated test run to fail on the CRS repo the other day. So, this definitely remains an issue.

@theseion
Copy link
Collaborator Author

We can't use unique_id because we don't know how many lines to expect.

I have an idea how to solve this though. Unfortunately, it will require replacing the current mechanism (maybe with a transition phase). The issue we are facing is that a test request will run in a thread / worker that is different from the marker requests. Hence, the marker requests may complete while processing of the test request is still ongoing. The fix thus has to be to use the same thread / worker as the test request.

For this to work we will need an additional rule that logs in phase 5 of every test request. Then we know that we can look for this special log entry to find the end of the log lines for a given test request. Essentially, the end marker we currently use would be replaced with a different end marker. The start marker can still be used because there's no race condition in that case.

I'm pretty sure this will work, however, I'm not sure wether a drop action would skip phase 5. If that is the case, then we're pretty much screwed... @dune73?

@theseion
Copy link
Collaborator Author

theseion commented Nov 5, 2023

Ping @dune73.

@dune73
Copy link
Member

dune73 commented Nov 6, 2023

Sorry, I overlooked this so far.

Unfortunately, phase 5 is indeed skipped when the drop action is used. I suspect this is because the idea with drop is to free the thread immediately from the connection. I wonder how garbage collection works in this case ...

Either way, it's an unfortunate situation. But then I would not invest too much into phase 5. The only rule we use in phase 5 is 980170 (CRSv4) and it's not a very important rule. I would probably be pragmatic and test at reporting level 0 where the rule is inactive.

@RedXanadu
Copy link
Member

RedXanadu commented Nov 9, 2023

I have been encountering this problem more and more frequently over the past week or so, for what it's worth. Several PRs have been marked as 'failing tests' in error.

@theseion
Copy link
Collaborator Author

theseion commented Nov 9, 2023

Thanks. I'm working on a workaround.

theseion added a commit to theseion/coreruleset that referenced this issue May 20, 2024
@theseion
Copy link
Collaborator Author

retry_once has been implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants