fix: add expected errors in MariaDB #834

ZhengLin-Li · 2023-06-21T10:11:07Z

fix: add expected errors in MariaDB

bajinsheng · 2023-06-21T11:26:14Z

It seems SQLancer triggers another issue in MariaDB?

ZhengLin-Li · 2023-06-21T12:05:51Z

I reproduced the queries on Ubuntu 22.04 and found the following:

The statements I run are:

DROP DATABASE IF EXISTS database11;
CREATE DATABASE database11;
USE database11;
CREATE OR REPLACE TABLE t0(c0 MEDIUMINT  UNIQUE, c1 VARCHAR(100) , c2 REAL  UNIQUE PRIMARY KEY);
CREATE OR REPLACE TABLE t1(c0 VARCHAR(100)  NOT NULL) engine=Aria;
CREATE OR REPLACE TABLE t2 LIKE t1;
SET SESSION autocommit = 1;
TRUNCATE t0  NOWAIT;
OPTIMIZE TABLE t0 NOWAIT;

Question 1: It seams that table t0 does not support optimize. How do we deal with this?
Question 2: I can suucessfully run TRUNCATE t0 NOWAIT; on my Ubuntu 22.04 but it caused an error in CI, what's the reason?

bajinsheng · 2023-06-21T12:20:51Z

1: I think it can be solved by checking whether the table uses the InnoDB storage engine. If it is difficult, we can simply ignore it as it is not a significant issue.
2. Perhaps you can try to run the full log, like database0-cur.log? I guess some statements that are ignored in the crash log, may incur this issue.

ZhengLin-Li · 2023-06-21T12:38:33Z

1: I think it can be solved by checking whether the table uses the InnoDB storage engine. If it is difficult, we can simply ignore it as it is not a significant issue. 2. Perhaps you can try to run the full log, like database0-cur.log? I guess some statements that are ignored in the crash log, may incur this issue.

Yes, it uses InnoDB engine. But how which engine will affact optimize statement?
Below is the database11.log. In table t0, we have only one boolean column, but we insert a value of Integer? It does not meet expectation.

--java.lang.AssertionError: INSERT INTO t0 VALUES (-1497191741);
--	at sqlancer.common.query.SQLQueryAdapter.checkException(SQLQueryAdapter.java:111)
--	at sqlancer.common.query.SQLQueryAdapter.execute(SQLQueryAdapter.java:93)
--	at sqlancer.Main$QueryManager.execute(Main.java:265)
--	at sqlancer.GlobalState.executeStatement(GlobalState.java:108)
--	at sqlancer.mariadb.MariaDBProvider.generateDatabase(MariaDBProvider.java:151)
--	at sqlancer.mariadb.MariaDBProvider.generateDatabase(MariaDBProvider.java:1)
--	at sqlancer.ProviderAdapter.generateAndTestDatabase(ProviderAdapter.java:52)
--	at sqlancer.Main$DBMSExecutor.run(Main.java:364)
--	at sqlancer.Main$2.run(Main.java:559)
--	at sqlancer.Main$2.runThread(Main.java:541)
--	at sqlancer.Main$2.run(Main.java:532)
--	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
--	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
--	at java.base/java.lang.Thread.run(Thread.java:829)
--Caused by: java.sql.SQLSyntaxErrorException: (conn=570) Table 'database11.t0' doesn't exist
--	at org.mariadb.jdbc.export.ExceptionFactory.createException(ExceptionFactory.java:282)
--	at org.mariadb.jdbc.export.ExceptionFactory.create(ExceptionFactory.java:370)
--	at org.mariadb.jdbc.message.ClientMessage.readPacket(ClientMessage.java:134)
--	at org.mariadb.jdbc.client.impl.StandardClient.readPacket(StandardClient.java:855)
--	at org.mariadb.jdbc.client.impl.StandardClient.readResults(StandardClient.java:794)
--	at org.mariadb.jdbc.client.impl.StandardClient.readResponse(StandardClient.java:713)
--	at org.mariadb.jdbc.client.impl.StandardClient.execute(StandardClient.java:637)
--	at org.mariadb.jdbc.Statement.executeInternal(Statement.java:941)
--	at org.mariadb.jdbc.Statement.execute(Statement.java:1067)
--	at org.mariadb.jdbc.Statement.execute(Statement.java:458)
--	at sqlancer.common.query.SQLQueryAdapter.execute(SQLQueryAdapter.java:87)
--	... 12 more
---- Time: 2023/06/21 12:30:12
-- Database: database11
-- Database version: 10.3.38-MariaDB-0ubuntu0.20.04.1
-- seed value: 11
DROP DATABASE IF EXISTS database11;
CREATE DATABASE database11;
USE database11;
CREATE OR REPLACE TABLE t0(c0 BOOLEAN  UNIQUE);
SET SESSION myisam_stats_method = nulls_unequal;
SET SESSION join_cache_level = 5;
INSERT INTO t0 VALUES (299405238);
CHECKSUM TABLE t0;
INSERT INTO t0 VALUES (-486923179);
INSERT INTO t0 VALUES (-471262417);
SET SESSION optimizer_search_depth = 4;
INSERT INTO t0 VALUES (618607252);
INSERT INTO t0 VALUES (-433413318);
INSERT INTO t0 VALUES (266645617);
SET SESSION max_length_for_sort_data = 3932438;
SET SESSION sql_auto_is_null = ON;
INSERT INTO t0 VALUES (916765166);
SET SESSION sql_log_off = ON;
SET SESSION max_sort_length = 6112935;
INSERT INTO t0 VALUES (550488471);
SET SESSION optimizer_prune_level = 0;
INSERT INTO t0 VALUES (499498418);
INSERT INTO t0 VALUES (1540761514);
SET SESSION autocommit = 1;
SET SESSION unique_checks = ON;
SET SESSION join_cache_level = 5;
INSERT INTO t0 VALUES (1476203969);
INSERT INTO t0 VALUES (-2079398998);
SET SESSION join_cache_level = 1;
INSERT INTO t0 VALUES (-839316519);
TRUNCATE t0  WAIT 1;
SET SESSION sql_buffer_result = OFF;
INSERT INTO t0 VALUES (1712230220);
SET SESSION sql_quote_show_create = OFF;
INSERT INTO t0 VALUES (1984650463);
INSERT INTO t0 VALUES (NULL);

bajinsheng · 2023-06-21T13:40:35Z

1 I guess only innodb engines can not use optimization, so perhaps you can use storage engine to identify it.
2 Is it the full log? Sometimes the column data types are weak, and accept different data types. Or the staement is rejected by database and sqlancer does not show the error information.

ZhengLin-Li · 2023-06-22T07:06:25Z

It seams that the error message in CI is different from before (after force push), now the only CI error message is:
AssertionError: Lock wait timeout exceeded; try restarting transaction

How will we deal with it? Still add to expected errors?

bajinsheng · 2023-06-22T12:33:40Z

Seems it is a flaky issue that cannot be reproduced stably.
Can you look into the root reason of this issue? It's definitely ideal to avoid this error message. If it is impossible or too hard, we can consider adding the error message to the expected error class.

ZhengLin-Li · 2023-06-22T13:50:40Z

I tested it on ubuntu 22.04 and it result in build success FOR TWO TIMES.

First run:

Second run:

I will test for serveral times and see what's the result.

ZhengLin-Li · 2023-06-22T13:57:13Z

@bajinsheng could you return this workflow accroding to https://docs.github.com/en/actions/managing-workflow-runs/re-running-workflows-and-jobs. I can not see the re-run bottom.

bajinsheng · 2023-06-22T14:47:39Z

Done.

ZhengLin-Li · 2023-06-23T07:43:09Z

I can not reproduce the error besides GitHub Action.

Acutally I found there are too many errors in the SQL queries we excuted. For example, we intern a null value to a NOT NULL column. And we optimize a InnoDB engine table.

I think the optimize operation is the cause but can not determine the root cause. How about adding one SHOW ENGINE INNODB STATUS statement before the optimize operation, this will let us see the InnoDB engine status? Accroding to https://stackoverflow.com/a/5837308

mrigger · 2023-06-23T10:54:07Z

Acutally I found there are too many errors in the SQL queries we excuted. For example, we intern a null value to a NOT NULL column. And we optimize a InnoDB engine table.

I think we are making a deliberate tradeoff here. Checking for all these different things is quite tedious, and by inserting invalid values, we can also test the DBMSs. Also see https://github.com/sqlancer/sqlancer/blob/main/CONTRIBUTING.md#expected-errors regarding data types.

mrigger · 2023-06-23T10:56:47Z

I tested it on ubuntu 22.04 and it result in build success FOR TWO TIMES.

Did you use the same MariaDB version we use in the CI? I assume you also copied the statements for running the tests?

ZhengLin-Li · 2023-06-23T12:47:21Z

I tested it on ubuntu 22.04 and it result in build success FOR TWO TIMES.

Did you use the same MariaDB version we use in the CI? I assume you also copied the statements for running the tests?

Yes, the same version. Yes, copied the logs from CI.

I tested again and found the reason for these two seccuss #834 (comment) here is that, SQLancer on my local mechine ran 0 queries and 0 was succesful, resulting the build success.

So, the conclusion is that, the problem still exists. It can be reproduced in CI, although it can not be reproduced on my local ubuntu.

I will try to reproduce it on my local ubuntu again.

mrigger · 2023-06-23T13:06:04Z

SQLancer on my local mechine ran 0 queries and 0 was succesful, resulting the build success.

It seems like some kind of deadlock is triggered?

ZhengLin-Li · 2023-06-23T13:37:16Z

I tested again in GitHub's Codespace (Ubuntu 22.04) and got 0 queries executed again. I am wondering if there are any errors with my commands. Could anyone help take a look?

The commands I run are, (same as they in ci file):

sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
sudo add-apt-repository 'deb [arch=amd64,arm64,ppc64el] http://sfo1.mirrors.digitalocean.com/mariadb/repo/10.3/ubuntu bionic main'   
sudo apt update
sudo apt install mariadb-server
sudo service mysql start
sudo mysql -uroot -proot -e "CREATE USER 'sqlancer'@'localhost' IDENTIFIED BY 'sqlancer'; GRANT ALL PRIVILEGES ON * . * TO 'sqlancer'@'localhost';"

git clone https://github.com/sqlancer/sqlancer.git
cd sqlancer
MARIADB_AVAILABLE=true mvn -Dtest=TestMariaDB test

ZhengLin-Li · 2023-06-23T13:39:41Z

SQLancer on my local mechine ran 0 queries and 0 was succesful, resulting the build success.

It seems like some kind of deadlock is triggered?

Not sure what's the reason. Could you help take a look at the commands I ran above?

bajinsheng · 2023-06-23T14:32:49Z

It looks like a deadlock bug. If so, it is hard to bypass this bug in the CI.

mrigger · 2023-06-24T01:06:49Z

According to

sqlancer/test/sqlancer/dbms/TestMariaDB.java

Line 18 in 1f80de1

"--num-queries", "0", "mariadb" }));

, it seems expected that we don't generate any queries. I remember that queries often crashed MariaDB, so this might be the reason why we disabled it. So perhaps this is an issue with multiple threads and generating databases? @ZhengLin-Li, perhaps you could try to set the number of threads to 1 here, to see if we can avoid the deadlock issue?

ZhengLin-Li · 2023-06-24T02:51:57Z

It looks like a deadlock bug. If so, it is hard to bypass this bug in the CI.

You mean the 0 query bug? I solved it by running setting --num-threads 1. This bug only happens on my Ubuntu, and it did not occur in CI.

According to

sqlancer/test/sqlancer/dbms/TestMariaDB.java

Line 18 in 1f80de1

"--num-queries", "0", "mariadb" }));

, it seems expected that we don't generate any queries. I remember that queries often crashed MariaDB, so this might be the reason why we disabled it. So perhaps this is an issue with multiple threads and generating databases? @ZhengLin-Li, perhaps you could try to set the number of threads to 1 here, to see if we can avoid the deadlock issue?

Oh, yes, setting --num-threads 1 solved the issue. I tested it again with the command java -jar sqlancer-*.jar --num-threads 1 mariadb and it can successfully run queries.

But the problem is that I still cannot reproduce the Lock wait timeout exceeded errors as those in CI.

I analyzed the CI log and found all assertion errors were triggered by the optimize table statement, but all optimize table statements on my Ubuntu work fine.

mrigger · 2023-06-24T04:35:26Z

In this case, what about disabling OPTIMIZE for now using the pattern described in https://github.com/sqlancer/sqlancer/blob/main/CONTRIBUTING.md#unfixed-bugs?

ZhengLin-Li · 2023-06-25T13:10:13Z

In this case, what about disabling OPTIMIZE for now using the pattern described in https://github.com/sqlancer/sqlancer/blob/main/CONTRIBUTING.md#unfixed-bugs?

So in this case, we need to report this issue to the MariaDB team and refer to their issue page?

bajinsheng · 2023-06-26T00:44:30Z

If you can produce a minimal test case to reproduce the issue, say executing several SQL statements triggers this deadlock, you can report it to MySQL.

mrigger · 2023-06-26T05:19:52Z

In this case, what about disabling OPTIMIZE for now using the pattern described in https://github.com/sqlancer/sqlancer/blob/main/CONTRIBUTING.md#unfixed-bugs?

So in this case, we need to report this issue to the MariaDB team and refer to their issue page?

This would be ideal, as suggested by @bajinsheng, but we could also just refer to this PR for now (or create a separate issue in the repo so someone can investigate it in the future).

ZhengLin-Li · 2023-06-26T07:07:39Z

Do not know why it is still reporting errors after we using https://github.com/sqlancer/sqlancer/blob/main/CONTRIBUTING.md#unfixed-bugs

bajinsheng · 2023-06-26T08:41:24Z

This error happens in the TRUNCATE statement, so you should add expected errors to https://github.com/sqlancer/sqlancer/blob/main/src/sqlancer/mariadb/gen/MariaDBTruncateGenerator.java
Otherwise, the error can not be caught/

mrigger · 2023-06-26T12:57:27Z

src/sqlancer/mariadb/gen/MariaDBIndexGenerator.java

+
+        if (MariaDBBugs.bug835) {
+            errors.add("Specified key was too long; max key length is 2300 bytes");
+            errors.add("Lock wait timeout exceeded; try restarting transaction");


I thought rather than setting the deadlock as expected, we could just skip generating the statement(s) causing it. Also, the ("Specified key was too long; max key length is 2300 bytes" statement doesn't seem to be part of the bug?

avoiding the generation of such statements also make sense. will try to do this

Specified key was too long; max key length is 2300 bytes is a bug here: https://github.com/sqlancer/sqlancer/actions/runs/5357089828/jobs/9717469891

I think this looks like a new limitation? In this case, we shouldn't consider it as a bug, but just add it as an expected error.

bajinsheng · 2023-06-30T07:35:47Z

You disabled the TRUNCATE statements? I meant adding expected errors to the generator of `TRUNCATE1

mrigger · 2023-06-30T07:38:35Z

You disabled the TRUNCATE statements? I meant adding expected errors to the generator of `TRUNCATE1

Does this statement result in a deadlock? If so, we might want to avoid generating the statement, since otherwise, SQLancer cannot generate any additional tests. If not, adding an expected error might be better.

mrigger · 2023-06-30T07:39:43Z

src/sqlancer/mariadb/MariaDBProvider.java

@@ -126,6 +129,9 @@ public void generateDatabase(MariaDBGlobalState globalState) throws Exception {
                    query = MariaDBInsertGenerator.insert(globalState.getSchema(), globalState.getRandomly());
                    break;
                case OPTIMIZE:
+                    if (MariaDBBugs.bug834) {
+                        break;


The logic here looks and below quite complex. Can we just set nrPerformed to 0?

ZhengLin-Li · 2023-06-30T11:40:59Z

You disabled the TRUNCATE statements? I meant adding expected errors to the generator of `TRUNCATE1

Does this statement result in a deadlock? If so, we might want to avoid generating the statement, since otherwise, SQLancer cannot generate any additional tests. If not, adding an expected error might be better.

After analsis, I think such a TRUNCATE statement will not result in deadlock, so maybe we can add it to the expected errors.

ZhengLin-Li · 2023-06-30T12:14:01Z

From the CI, it turned out that, presently, we got:

198 java.lang.AssertionError:
- 46 java.lang.AssertionError: Got error -2 "Internal error < 0 (Not system error)" from storage engine InnoDB
- 148 java.lang.AssertionError: Operation failed
- 4 java.lang.AssertionError: Error updating stats for table 't0' after table rebuild: Lock wait timeout

I think we also need add this errors into exected errors?

mrigger · 2023-06-30T12:56:12Z

Some of these seem like potential bugs that we should look into and potentially report. I guess this might take a while, so not sure if you might want to focus on StoneDB first?

bajinsheng · 2023-06-30T13:01:08Z

I asked @ZhengLin-Li to fix this ERROR in CI so that I can also merge his PR. Since this bug is hard to fix and you can help to merge pr, I think we can lower its priority now.

ZhengLin-Li · 2023-06-30T14:40:23Z

ok, I will just add it to the todo list.

ZhengLin-Li force-pushed the fix_MariaDB branch from c226590 to 08c07cb Compare June 21, 2023 10:22

fix: add expected errors in MariaDB

ac54fd3

ZhengLin-Li force-pushed the fix_MariaDB branch from 08c07cb to ac54fd3 Compare June 22, 2023 13:57

fix: add bug834

dfc1099

mrigger reviewed Jun 26, 2023

View reviewed changes

ZhengLin-Li added 3 commits June 30, 2023 14:27

fix: do not generate truncate and optimize statement

1ef8073

fix: null pointer exception

bfc6b60

fix: 2400 and 1000 bytes

089f1ea

mrigger reviewed Jun 30, 2023

View reviewed changes

ZhengLin-Li added 5 commits June 30, 2023 19:42

fix: do not pass TRUNCATE statement

017941d

fix: add expected errors in optimize statement

e5db7f1

style: run formatter

5aeab57

fix: mvn verify

4d0d30a

fix: add expected errors when truncate

d86db5e

malwaregarry mentioned this pull request May 18, 2024

[MariaDB] Update expected errors and pin to version 11.3.2 #945

Open

fix: add expected errors in MariaDB #834

Are you sure you want to change the base?

fix: add expected errors in MariaDB #834

Conversation

ZhengLin-Li commented Jun 21, 2023

bajinsheng commented Jun 21, 2023

ZhengLin-Li commented Jun 21, 2023 • edited

bajinsheng commented Jun 21, 2023

ZhengLin-Li commented Jun 21, 2023 • edited

bajinsheng commented Jun 21, 2023

ZhengLin-Li commented Jun 22, 2023 • edited

bajinsheng commented Jun 22, 2023

ZhengLin-Li commented Jun 22, 2023 • edited

ZhengLin-Li commented Jun 22, 2023

bajinsheng commented Jun 22, 2023

ZhengLin-Li commented Jun 23, 2023

mrigger commented Jun 23, 2023

mrigger commented Jun 23, 2023

ZhengLin-Li commented Jun 23, 2023 • edited

mrigger commented Jun 23, 2023

ZhengLin-Li commented Jun 23, 2023 • edited

ZhengLin-Li commented Jun 23, 2023

bajinsheng commented Jun 23, 2023

mrigger commented Jun 24, 2023

ZhengLin-Li commented Jun 24, 2023

mrigger commented Jun 24, 2023

ZhengLin-Li commented Jun 25, 2023

bajinsheng commented Jun 26, 2023

mrigger commented Jun 26, 2023

ZhengLin-Li commented Jun 26, 2023

bajinsheng commented Jun 26, 2023

mrigger Jun 26, 2023 • edited

Choose a reason for hiding this comment

ZhengLin-Li Jun 26, 2023

Choose a reason for hiding this comment

mrigger Jun 27, 2023

Choose a reason for hiding this comment

bajinsheng commented Jun 30, 2023

mrigger commented Jun 30, 2023

mrigger Jun 30, 2023

Choose a reason for hiding this comment

ZhengLin-Li commented Jun 30, 2023

ZhengLin-Li commented Jun 30, 2023 • edited

mrigger commented Jun 30, 2023

bajinsheng commented Jun 30, 2023

ZhengLin-Li commented Jun 30, 2023

ZhengLin-Li commented Jun 21, 2023 •

edited

ZhengLin-Li commented Jun 21, 2023 •

edited

ZhengLin-Li commented Jun 22, 2023 •

edited

ZhengLin-Li commented Jun 22, 2023 •

edited

ZhengLin-Li commented Jun 23, 2023 •

edited

ZhengLin-Li commented Jun 23, 2023 •

edited

mrigger Jun 26, 2023 •

edited

ZhengLin-Li commented Jun 30, 2023 •

edited