-
Notifications
You must be signed in to change notification settings - Fork 819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unicode whitespace is removed from unquoted elements of text[] columns #1257
Comments
@kevinkyyro , nice catch.
+1
Could you share the tests? For instance: https://github.com/pgjdbc/pgjdbc/tree/master/test-anorm-sbt |
Here's relevant code from the backend: https://github.com/postgres/postgres/blob/f2c587067a8eb9cf1c8f009262381a6576ba3dd0/src/backend/utils/adt/arrayfuncs.c#L421-L438 |
The comments there indicate there used to be different behavior, possibly similar to what java is doing now. |
#1194 Includes unit tests for arrays. Adding some cases including non ascii whitespace should be fairly easy. |
Here's the comments from the commit on the backend.
I think we either stick with the backend behaviour. Unless of course we have a convincing argument to change that as well ? |
The postgresql backend only quotes ascii whitespace. pgjdbc#1257
@vlsi I would propose that we change the java driver to only look for the 6 specific chars (PR #1266). Would it also make sense to have the back end more liberally use quotes?
Is there any downside to erring on side of more liberal quoting when returning values? |
Just want to point out that regardless of the set of characters considered whitespace, it says that it only trims leading and trailing whitespace on input. The driver should certainly not be doing anything to non-leading-or-trailing, non-escape characters, correct? It seems likely that no whitespace should be dropped at all on output, since it was already dropped on input. Looking at my example above, the delimiter seems to be a comma without any added space (hard to check for sure on my phone). |
* fix: don't ignore unquoted non-ascii whitespace The postgresql backend only quotes ascii whitespace. #1257 add comment to existing isSpace method change from switch to chained ORs * test: fix comment in ArrayTest.testDirectFieldString * fix: revert spaces in array literal * add changelog entry add test showing inner white spaces not dropped on unquoted strings by backend. Co-authored-by: Vladimir Sitnikov <sitnikov.vladimir@gmail.com> Co-authored-by: BO8979 <BO8979@W1971362.northamerica.cerner.net>
) * fix: don't ignore unquoted non-ascii whitespace The postgresql backend only quotes ascii whitespace. pgjdbc#1257 add comment to existing isSpace method change from switch to chained ORs * test: fix comment in ArrayTest.testDirectFieldString * fix: revert spaces in array literal * add changelog entry add test showing inner white spaces not dropped on unquoted strings by backend. Co-authored-by: Vladimir Sitnikov <sitnikov.vladimir@gmail.com> Co-authored-by: BO8979 <BO8979@W1971362.northamerica.cerner.net>
pgjdbc/pgjdbc/src/main/java/org/postgresql/jdbc/PgArray.java
Lines 489 to 491 in b1b1afb
Versions
PgJDBC driver:
42.2.2
PostgreSQL server:
10.4 (Debian 10.4-2.pgdg90+1)
Java:
1.8.0_161-b12
Context
I have a table with a
text[]
column and wrote some property tests to verify selecting after inserting resulted in the same value, and I came across a strange issue.For some reason, the server returns
text[]
columns without quoting elements that only contain non-ASCII UTF-8 whitespace (ex:\u2001
). As a result, those elements have all their whitespace (UTF-8 or otherwise) stripped out at the linked line, which I identified by stepping through with the debugger.I am reporting this as a driver bug because it seems fixable there, but it may also be undesirable behavior for the server.
SQL Example
Code example
Here's an example based on my setup, which is using jooq (
3.11.0
).It should be a pretty straight forward conversion to java 8, but in either case running this would require either wrapping a main method or throwing it into a script and getting jooq on the classpath. I'm happy to help with that, but I assume you will have a preferred way of doing that.
The output should be as follows, demonstrating that the whitespace on the first array element has been removed (the printing format here varies from psql, so neither are quoted, but that's not a significant detail):
The text was updated successfully, but these errors were encountered: