paf91 changed the title from "Columns with spaces cause double quotes when filtering" to "Columns with spaces cause double quotes when filtering, getting errors" on Apr 10, 2024
Describe the bug
Hi. I'm facing an issue using PySpark + the ClickHouse JDBC driver when column names contain spaces, like "Test Column". For example, when filtering a table through the pure JDBC driver, or through spark-clickhouse-connector with the same JDBC driver:
tt = spark.table("test.table1")
tt.where(col("Test Column") == '123').count()
translates into:
SELECT COUNT(*) FROM test.table1 WHERE (1=1) AND ((1=1) AND (``some ID`` = '123') AND (1=1))
This fails in ClickHouse because the column name is wrapped in backticks a second time, even though it is already quoted.
Meanwhile, tt.show() works fine.
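The root cause looks like non-idempotent identifier quoting: the pushed-down filter wraps the column name in backticks even when it is already quoted. A minimal illustration of the difference (my own sketch, not the connector's actual code):

```python
def quote_naive(ident: str) -> str:
    # always wrap -- turns an already-quoted name into ``some ID``
    return f"`{ident}`"

def quote_idempotent(ident: str) -> str:
    # wrap exactly once, escaping any embedded backticks
    if ident.startswith("`") and ident.endswith("`"):
        return ident
    return "`" + ident.replace("`", "``") + "`"

print(quote_naive("`some ID`"))       # ``some ID``  -> SYNTAX_ERROR in ClickHouse
print(quote_idempotent("`some ID`"))  # `some ID`
```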
Steps to reproduce
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
test_table = spark.table("test.table1")  # any table with a column name containing a space
test_table.where(col("Column Test").isNotNull()).show()
Expected behaviour
No error; the DataFrame returns results.
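Until the quoting is fixed, one possible workaround may be to disable JDBC predicate pushdown via Spark's `pushDownPredicate` reader option, so the filter is evaluated on the Spark side instead of being translated into ClickHouse SQL. A sketch, with placeholder connection details:

```python
# JDBC reader options -- URL and table name are placeholders
jdbc_options = {
    "url": "jdbc:clickhouse://localhost:8123/test",
    "dbtable": "table1",
    "pushDownPredicate": "false",  # keep filters on the Spark side
}
# With a live session:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
# df.where(col("Test Column") == "123").count()
```

Since Spark then evaluates the filter in memory, no WHERE clause with double backticks is ever sent to ClickHouse (at the cost of transferring more rows).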
Code example
Error log
LIMIT 21
. Expected one of: literal, NULL, number, Bool, true, false, string literal, SELECT query, possibly with UNION, list of union elements, SELECT query, subquery, possibly with UNION, SELECT subquery, SELECT query, WITH, FROM, SELECT, EXPLAIN, token, Comma, ClosingRoundBracket, CAST operator, NOT, INTERVAL, CASE, DATE, TIMESTAMP, tuple, collection of literals, array, asterisk, qualified asterisk, compound identifier, list of elements, identifier, COLUMNS matcher, COLUMNS, qualified COLUMNS matcher, substitution, MySQL-style global variable. (SYNTAX_ERROR), Stack trace (when copying this message, always include the lines below):
(version 23.12.2.59 (official build))
Configuration
PySpark 3.3.4
ClickHouse JDBC Driver 0.6
Environment
ClickHouse server