Columns with spaces cause double quotes when filtering, getting errors #1608

Open
paf91 opened this issue Apr 10, 2024 · 0 comments

Describe the bug

Hi. I'm facing an issue using PySpark with the ClickHouse JDBC driver when column names contain spaces, like "Test Column". It happens when filtering a table through either the plain JDBC driver or spark-clickhouse-connector with the same JDBC driver:

tt = spark.table("test.table1")
tt.where(col("Test Column") == '123').count()

Translates into:
SELECT COUNT(*) FROM test.table1 WHERE (1=1) AND ((1=1) AND (``some ID`` = '123') AND (1=1))
This causes a syntax error in ClickHouse because the column name ends up wrapped in backticks twice (``...``), even though it was already quoted.
Meanwhile, tt.show() without a filter works fine.
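
The doubled backticks in the generated SQL look like what happens when an identifier is quoted by two layers in a row. A minimal sketch of that suspected failure mode, and of a quoting step that is safe to apply twice, is below. Both helper names are hypothetical and are not taken from Spark or the JDBC driver; where the second round of quoting actually happens is not confirmed in this report.

```python
def quote_naive(name: str) -> str:
    """Always wrap the name in backticks (hypothetical helper)."""
    return f"`{name}`"


def quote_once(name: str) -> str:
    """Wrap the name in backticks unless it is already wrapped.

    Assumption: embedded backticks inside the name are not handled here.
    """
    already_quoted = len(name) >= 2 and name.startswith("`") and name.endswith("`")
    return name if already_quoted else f"`{name}`"


if __name__ == "__main__":
    column = "Test Column"
    # Two naive quoting passes reproduce the shape seen in the error log.
    print(quote_naive(quote_naive(column)))
    # The idempotent variant yields a single pair of backticks.
    print(quote_once(quote_once(column)))
```

If the double quoting comes from two such layers, making either one idempotent would be enough to produce valid SQL for column names with spaces.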

Steps to reproduce

  1. Initialize the Spark session
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col
    spark = SparkSession.builder.getOrCreate()
  2. Connect to ClickHouse
    host = 'somehost'
    port = '8443'
    database = 'test'
    url = f'jdbc:clickhouse://{host}:{port}/{database}?&ssl=true'
    user = 'test.user'
    password = '<password>'  # placeholder; set to the real password
    dbtable = 'test.test_table'
    driver = 'com.clickhouse.jdbc.ClickHouseDriver'
    # Read the table through Spark's generic JDBC data source
    test_table = (
        spark.read.format('jdbc')
        .option('driver', driver)
        .option('url', url)
        .option('user', user)
        .option('password', password)
        .option('dbtable', dbtable)
        .load()
    )
  3. Execute the query
    test_table.where(col("Column Test").isNotNull()).show()
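
A possible workaround, not verified against this driver version: Spark's JDBC data source has a `pushDownPredicate` option (enabled by default). Setting it to `false` keeps the filter in Spark, so no WHERE clause containing the quoted column should be sent to ClickHouse. The sketch below assumes the same connection values as the steps above; the two function names are hypothetical helpers.

```python
def jdbc_options_without_pushdown(url, user, password, dbtable, driver):
    """Build Spark JDBC reader options with predicate pushdown disabled."""
    return {
        "driver": driver,
        "url": url,
        "user": user,
        "password": password,
        "dbtable": dbtable,
        # Spark JDBC option: evaluate filters in Spark, not in the database.
        "pushDownPredicate": "false",
    }


def read_without_pushdown(spark, options):
    """Load a DataFrame from an existing SparkSession using the options above."""
    return spark.read.format("jdbc").options(**options).load()


# Usage (requires a running SparkSession and a reachable ClickHouse server):
#   opts = jdbc_options_without_pushdown(url, user, password, dbtable, driver)
#   test_table = read_without_pushdown(spark, opts)
#   test_table.where(col("Column Test").isNotNull()).show()
```

The trade-off is that every row is transferred to Spark before filtering, so this is likely only practical for small tables or for confirming that the quoting of pushed-down filters is the cause.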

Expected behaviour

No error; the DataFrame returns results.

Code example

tt = spark.table("test.table1")
tt.select(col("Test Column")).where(col("Test Column") == '123').count()

Error log

<Error> DynamicQueryHandler: Code: 62. DB::Exception: Syntax error: failed at position 83 ('``') (line 3, col 29): ``Test Column`` = '123') AND (1=1))

LIMIT 21
. Expected one of: literal, NULL, number, Bool, true, false, string literal, SELECT query, possibly with UNION, list of union elements, SELECT query, subquery, possibly with UNION, SELECT subquery, SELECT query, WITH, FROM, SELECT, EXPLAIN, token, Comma, ClosingRoundBracket, CAST operator, NOT, INTERVAL, CASE, DATE, TIMESTAMP, tuple, collection of literals, array, asterisk, qualified asterisk, compound identifier, list of elements, identifier, COLUMNS matcher, COLUMNS, qualified COLUMNS matcher, substitution, MySQL-style global variable. (SYNTAX_ERROR), Stack trace (when copying this message, always include the lines below):

  1. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c6d5d7b in /usr/bin/clickhouse
  2. DB::Exception::createDeprecated(String const&, int, bool) @ 0x000000000c72de4d in /usr/bin/clickhouse
  3. DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, String const&, bool, unsigned long, unsigned long) @ 0x0000000012ebc03c in /usr/bin/clickhouse
  4. DB::executeQueryImpl(char const*, char const*, std::shared_ptr<DB::Context>, DB::QueryFlags, DB::QueryProcessingStage::Enum, DB::ReadBuffer*) @ 0x00000000117221c5 in /usr/bin/clickhouse
  5. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, DB::QueryFlags, std::optional<DB::FormatSettings> const&, std::function<void (DB::IOutputFormat&)>) @ 0x0000000011729bea in /usr/bin/clickhouse
  6. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&) @ 0x0000000012616f8d in /usr/bin/clickhouse
  7. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&) @ 0x000000001261bdb6 in /usr/bin/clickhouse
  8. DB::HTTPServerConnection::run() @ 0x0000000012696c12 in /usr/bin/clickhouse
  9. Poco::Net::TCPServerConnection::start() @ 0x00000000150f4e52 in /usr/bin/clickhouse
  10. Poco::Net::TCPServerDispatcher::run() @ 0x00000000150f5c51 in /usr/bin/clickhouse
  11. Poco::PooledThread::run() @ 0x00000000151ece67 in /usr/bin/clickhouse
  12. Poco::ThreadImpl::runnableEntry(void*) @ 0x00000000151eb45c in /usr/bin/clickhouse
  13. ? @ 0x00007f2c94dbcac3 in ?
  14. ? @ 0x00007f2c94e4e850 in ?
    (version 23.12.2.59 (official build))

Configuration

PySpark 3.3.4
ClickHouse JDBC Driver 0.6

Environment

  • Client version: ClickHouse JDBC Driver 0.6
  • Language version: Python 3.10
  • OS: Ubuntu

ClickHouse server

  • ClickHouse Server version: 23.12.2.59
paf91 added the bug label Apr 10, 2024
paf91 changed the title from "Columns with spaces cause double quotes when filtering" to "Columns with spaces cause double quotes when filtering, getting errors" Apr 10, 2024