New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: preserve unquoted unicode whitespace in array literals #1266
Conversation
The postgresql backend only quotes ascii whitespace. pgjdbc#1257
This reverts commit 8281083.
This reverts commit 8b5072d.
…ce""" This reverts commit e407aaf.
Codecov Report
@@ Coverage Diff @@
## master #1266 +/- ##
============================================
+ Coverage 68.78% 68.78% +<.01%
- Complexity 3908 3914 +6
============================================
Files 179 179
Lines 16420 16421 +1
Branches 2674 2674
============================================
+ Hits 11295 11296 +1
+ Misses 3880 3879 -1
- Partials 1245 1246 +1 |
more unit tests
Please create a method near pgjdbc/pgjdbc/src/main/java/org/postgresql/core/Parser.java Lines 777 to 778 in da831de
|
Tests for leading/trailing/multiple whitespaces are needed as well |
@@ -529,6 +529,32 @@ private synchronized void buildArrayList() throws SQLException { | |||
} | |||
} | |||
|
|||
/** | |||
* Is whitespace which postgresql will force quotes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we elaborate a bit here. I my not have enough coffee yet but I have difficulty reading this line.
Should the If we do need a different method, do you prefer the style of the switch statement or the chained OR equals? We should probably have both methods use same style. |
Is this better: Identifies white space characters which the backend uses to determine if a {@code String} value needs to be quoted in array representation. |
@bokken yes, thanks! |
more unit tests relocate utility method to Parser
I think it should be left as is and a comment pointing to https://github.com/postgres/postgres/blob/17bb62501787c56e0518e61db13a523d47afd724/src/backend/parser/scan.l#L194-L198 should be added to existing |
I will add the comment. |
I would follow the style of |
add comment to existing isSpace method change from switch to chained ORs
PostgreSQL does drop whitespace at the start and at the end of the value:
|
Just to clarify: sending data to the backend is not affected. |
Correct, this only affects the parsing of strings, presumably coming from the backend. Nothing is changed in assembling strings to send to backend. |
@@ -416,8 +416,7 @@ public void testDirectFieldString() throws SQLException { | |||
assertEquals(" \fnew \n ", array[2]); | |||
assertEquals(" ", array[3]); | |||
|
|||
// this is likely undesired behavior of dropping white spaces mid-value (not | |||
// leading/trailing) | |||
// PostgreSQL drops leading and trailing whitespace, so does the driver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test asserts that the driver is also dropping ascii white space in the middle of an unquoted value.
This is not consistent with the backend. However, the backend should never produce this type of value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vlsi thoughts?
This is asserting dropping of mid value white spaces, which is probably undesired, but practically a non issue as the backend should never produce such a value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not consistent with the backend
Why do you think so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test when the same values get passed to the backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is still different from testDirectFieldString
add test showing inner white spaces not dropped on unquoted strings by backend.
In that the contents of the entire array are not the same?
The specific element element being asserted with all sorts of white space
characters is the same.
…On Tue, Jul 24, 2018 at 12:17 PM Vladimir Sitnikov ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pgjdbc/src/test/java/org/postgresql/test/jdbc2/ArrayTest.java
<#1266 (comment)>:
> @@ -416,8 +416,7 @@ public void testDirectFieldString() throws SQLException {
assertEquals(" \fnew \n ", array[2]);
assertEquals(" ", array[3]);
- // this is likely undesired behavior of dropping white spaces mid-value (not
- // leading/trailing)
+ // PostgreSQL drops leading and trailing whitespace, so does the driver
It is still different from testDirectFieldString
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1266 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AC26bWWmzGF9HFfe8ofbP9ULiJaN0neVks5uJ1a3gaJpZM4VZciM>
.
|
The tests in this PR are hard to review. I agree the existing tests in that area are weak, however if you think there's a mismatch between pgjdbc and backend behavior it would be great if you could create a test that highlights that diff. |
I am out on vacation for next week+. I can add more targeted tests when I
return.
There is a difference in how unquoted values are parsed. The backend only
trims leading/trailing whitespace. The driver drops all whitespace, even
those surrounded by other letters/words. Thus `un quote` becomes `unquote`.
But this difference seems largely academic, as the driver is only expected
to parse strings produced by the back end and the backend quotes strings
with whitespace anywhere. This was the point of my comment that was
asserting all white spaces, including inner, were dropped.
…On Wed, Jul 25, 2018 at 11:41 PM Vladimir Sitnikov ***@***.***> wrote:
The tests in this PR are hard to review.
It is not obvious what spaces get trimmed by the driver, which are trimmed
(or not) by the backend and it is not clear if the behaviour of the both
agree.
I agree the existing tests in that area are weak, however if you think
there's a mismatch between pgjdbc and backend behavior it would be great if
you could create a test that highlights that diff.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1266 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AC26bTFot4R03kzF02x-c6knFR_1xUicks5uKY6vgaJpZM4VZciM>
.
|
Ah, I see now. |
add test explicitly showing differences between driver and backend handling unquoted string array values
more unit tests
more unit tests relocate utility method to Parser
add comment to existing isSpace method change from switch to chained ORs
add test showing inner white spaces not dropped on unquoted strings by backend.
add test explicitly showing differences between driver and backend handling unquoted string array values
✅ Build pgjdbc 1.0.228 completed (commit 636f9c9b39 by @bokken) |
c25d807
to
adcb194
Compare
# Conflicts: # pgjdbc/src/main/java/org/postgresql/jdbc/PgArray.java
@davecramer I updated this to be able to merge. |
* fix: don't ignore unquoted non-ascii whitespace The postgresql backend only quotes ascii whitespace. #1257 add comment to existing isSpace method change from switch to chained ORs * test: fix comment in ArrayTest.testDirectFieldString * fix: revert spaces in array literal * add changelog entry add test showing inner white spaces not dropped on unquoted strings by backend. Co-authored-by: Vladimir Sitnikov <sitnikov.vladimir@gmail.com> Co-authored-by: BO8979 <BO8979@W1971362.northamerica.cerner.net>
) * fix: don't ignore unquoted non-ascii whitespace The postgresql backend only quotes ascii whitespace. pgjdbc#1257 add comment to existing isSpace method change from switch to chained ORs * test: fix comment in ArrayTest.testDirectFieldString * fix: revert spaces in array literal * add changelog entry add test showing inner white spaces not dropped on unquoted strings by backend. Co-authored-by: Vladimir Sitnikov <sitnikov.vladimir@gmail.com> Co-authored-by: BO8979 <BO8979@W1971362.northamerica.cerner.net>
The postgresql backend only quotes ascii whitespace.
#1257