Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clustering_range intersection() can cause an infinite loop #18688

Closed
nyh opened this issue May 15, 2024 · 0 comments
Closed

clustering_range intersection() can cause an infinite loop #18688

nyh opened this issue May 15, 2024 · 0 comments

Comments

@nyh
Copy link
Contributor

nyh commented May 15, 2024

A user reported Scylla hanging for an unknown reason, and a stall report pointed to the intersection() function in statement_restrictions.cc.

@michoecho looked at the intersection(r1, r2, cmp) function and spotted the following suspicious recursion:

    if (starts_before_start(r2, r1, cmp)) {
        return intersection(r2, r1, cmp);
    }

So if starts_before_start() can return true for both (r1,r2) and (r2,r1), we will have infinite recursion - which because this is tail recursion will look like an infinite loop and not a stack overflow (which is unfortunate, by the way, because Scylla didn't crash and didn't provide a core - it just hung forever, which is worse).

The comment on starts_before_start() suggests (r1,r2) and (r2,r1) can't both be true because it returns "True iff r1 start is strictly before r2 start."

However, @michoecho found such a bug in starts_before_start() - it has:

    } else { // r2 start is a prefix of r1 start.
        // (a,b)>=(1,1) starts before (a)>(1) but after (a)>=(1).
        return r2.start()->is_inclusive();
    }

The logic is reversed - as the comment says, if r2 is inclusive (>=) like that (a)>=(1), then r1 like (a,b)>=(1,1) does not start before it. So it needs to be return ! r2.start()->is_inclusive().

@michoecho also found a reproducer that reaches this infinite loop. Here it is in cql-pytest syntax:

@pytest.fixture(scope="module")
def table4(cql, test_keyspace):
    with new_test_table(cql, test_keyspace, "a int, b int, c int, PRIMARY KEY (a, b, c)") as table:
        yield table
def test_intersection_hang(cql, table4):
    cql.execute(f'SELECT * FROM {table4} WHERE (b) < (1) AND (b) >= (0) AND (b, c) > (0,0) ALLOW FILTERING')

This test hangs Scylla.

We don't know if this is the only bug in intersect() or there are more, and whether there are different types of CQL requests which can reach the same bug. We should consider writing a unit test for the intersect() function because its implementation it is very far from obvious. Even worse, we have another open issue on the same function (#8157) but I can't find any reproducer or unit test demonstrating that other issue.

@nyh nyh self-assigned this May 15, 2024
nyh added a commit to nyh/scylla that referenced this issue May 15, 2024
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.

As noted in issue scylladb#18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.

This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.

While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.

The bug, the fix and the test were originally discovered by
Michał Chojnowski.

Fixes scylladb#18688
Refs scylladb#18690

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
mergify bot pushed a commit that referenced this issue May 16, 2024
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.

As noted in issue #18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.

This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.

While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.

The bug, the fix and the test were originally discovered by
Michał Chojnowski.

Fixes #18688
Refs #18690

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 2f6cd04)

# Conflicts:
#	test/cql-pytest/test_restrictions.py
mergify bot pushed a commit that referenced this issue May 16, 2024
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.

As noted in issue #18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.

This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.

While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.

The bug, the fix and the test were originally discovered by
Michał Chojnowski.

Fixes #18688
Refs #18690

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 2f6cd04)
denesb pushed a commit that referenced this issue May 21, 2024
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.

As noted in issue #18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.

This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.

While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.

The bug, the fix and the test were originally discovered by
Michał Chojnowski.

Fixes #18688
Refs #18690

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 27ab560)

Closes #18717
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants