problem with decode regexp #466

fnnzzz · 2018-09-12T20:28:58Z

Hi, I found a problem in case, when to me from the back-end comes a cookie query-string like, where a part of the value is encoded in Cyrillic and is presented in lower case.

I could not understand why the get() method returns undefined.

A little debugged made the situation clear - RegExp used in decodeURIComponent matches only lowerCase parts, therefore inside decode() function is throw an error URI malformed (because of an incorrect string).

So I send you my PR with a simple solution to this problem.

… is case insensitive

mac2000 · 2018-09-13T04:40:14Z

And where is test to catch regression in future?

mac2000 · 2018-09-13T04:48:40Z

Just to clarify problem exists in .net implementation of url encode - https://stackoverflow.com/questions/918019/net-urlencode-lowercase-problem

And RFC https://tools.ietf.org/html/rfc3986 says:

The uppercase hexadecimal digits 'A' through 'F' are equivalent to
the lowercase digits 'a' through 'f', respectively.

FagnerMartinsBrack · 2018-09-14T20:42:07Z

@mac2000 Thank you for the research!
@fnnzzz Thanks for the very clear report!

The RFC says:

For consistency, URI producers and
normalizers should use uppercase hexadecimal digits for all percent-
encodings

But it also says:

The uppercase hexadecimal digits 'A' through 'F' are equivalent to
the lowercase digits 'a' through 'f', respectively. If two URIs
differ only in the case of hexadecimal digits used in percent-encoded
octets, they are equivalent.

I can see a coupe of options:

Accept lowercase percent-encoding data to make it possible for backend systems to integrate with the standard js-cookie encoding algorithm. We still need a test for it as @mac2000 pointed out and fix the tests that are failing the build.
Do not accept lowercase percent-encoding data. Document somewhere that if you have a custom server, we recommend the user to use a converter instead and leave the standard js-cookie encoding/decoding algorithm only for compliant servers or to/from js-cookie interchange.

Be aware that option 1 might have other edge-case regressions.

Which one should we go for?
Ping @carhartl

mac2000 · 2018-09-14T21:19:58Z

@FagnerMartinsBrack thanks for your participation but I am kind of not agree with your point about relaying on undefined

Both decodeURIComponent('%D0%A5') and decodeURIComponent('%d0%a5') do the work - this is what should be expected

And if someone relaying on this - they are definitely doing something wrong

The right question is what should be done to regressions, but this is a place where only you guys can help

mac2000 · 2018-09-14T21:23:31Z

While writing this I did realize to my self that there is a chance to solve a problem in a following way - hexies are going by two pairs (at least for cyrillic) so may be we can figure out how to make regexp something like (%[0-9A-F]{2}){2}? (but still not sure if this is a case for everything)

Lowercase escapes are legal, `decodeURIComponent()` supports them, and we should too. The most difficult part is to know when we are dealing with hexadecimal pairs and when not. Supporting only uppercase characters made interpreting strings slightly more constrained, nevertheless even in this case we might run into a scenario where we encounter strings that look like encoded characters but aren't, for instance "%A1". The solution to this is to no longer permit strings with mixed unencoded and encoded characters, which seems like a rather rare edge case to support. Either a cookie value is a fully url encoded string (which is ensured when we're writing the cookie via our own api) or we treat it as non-readable otherwise. In such fully encoded strings "%" would appear as "%25", e.g. "%A1" would look like "%25A1". In case someone needs to work with such mixed encoding cookie values, they would need to resort to implementing their own specialized converter. Closing #466

carhartl · 2019-09-23T11:17:31Z

Fixed: 18ed0fd

fix an error throw on decode() method in the case where decoded value…

1121426

… is case insensitive

carhartl added this to the v3.0.0 milestone Sep 16, 2019

carhartl self-assigned this Sep 19, 2019

carhartl closed this Sep 23, 2019

carhartl mentioned this pull request Oct 6, 2019

s.replace(/(%[0-9A-Z]{2})+/g, decodeURIComponent) error #530

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

problem with decode regexp #466

problem with decode regexp #466

fnnzzz commented Sep 12, 2018 •

edited

mac2000 commented Sep 13, 2018

mac2000 commented Sep 13, 2018

FagnerMartinsBrack commented Sep 14, 2018 •

edited

mac2000 commented Sep 14, 2018

mac2000 commented Sep 14, 2018

carhartl commented Sep 23, 2019

problem with decode regexp #466

problem with decode regexp #466

Conversation

fnnzzz commented Sep 12, 2018 • edited

mac2000 commented Sep 13, 2018

mac2000 commented Sep 13, 2018

FagnerMartinsBrack commented Sep 14, 2018 • edited

mac2000 commented Sep 14, 2018

mac2000 commented Sep 14, 2018

carhartl commented Sep 23, 2019

fnnzzz commented Sep 12, 2018 •

edited

FagnerMartinsBrack commented Sep 14, 2018 •

edited