Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The behavior of query_keywords and query_parameters seems orthogonal, but in actuality it is not. #134

Open
asparkman opened this issue May 11, 2016 · 4 comments

Comments

@asparkman
Copy link

Been using Catalyst for quite some time, and have found it to be a really great framework. Unfortunately, I came across something the other day that seemed a little confusing. I feel a code change, or documentation change may be appropriate.

I am using version 5.90092, but I believe the behavior is the same on the latest.

A data dump of the following:
my( $res, $c) = ctx_request(GET '/test_psgi_keys?x&a=1&b=2');

produces:
$c->req->query_parameters = { x => undef, a = 1, b = 2 }
$c->req->query_keywords = undef

A data dump of the following:
my( $res, $c) = ctx_request(GET '/test_psgi_keys?x');

produces:
$c->req->query_parameters = {}
$c->req->query_keywords = "x"

I feel like in the second case that the query_parameter member should instead be:
$c->req->query_parameters = { x => undef }

The query_keywords documentation states:
"Contains the keywords portion of a query string, when no '=' signs are present."

The query_parameters documentation states:
"Returns a reference to a hash containing query string (GET) parameters. Values can be either a scalar or an arrayref containing scalars."

The query_keywords member appears to have been created to support the isindex element, which is going through various states of deprecation. I understand there are a lot of finer points to this discussion, but I felt it pertinent to open this up for discussion.

@jjn1056
Copy link
Member

jjn1056 commented May 11, 2016

old docs around isindex tag: http://www.w3.org/MarkUp/html-spec/html-spec_7.html#SEC7.5

@wolfsage
Copy link
Contributor

It does appear that this functionality (query_keywords) was intended to support ISINDEX. However, it indiscriminately sets query_keywords if there are no '=' signs found in the query string, when it should only be doing that if the isindex flag is set.

See the decode section of https://www.w3.org/TR/html5/forms.html#url-encoded-form-data

@asparkman
Copy link
Author

asparkman commented May 26, 2016

I did some more research into the problem. This is mainly to document my intended actions, and their reasoning. The changes that I believe are the best are as follows along with their supporting sections.

  1. The query_keywords member should remain functionally equivalent.
    • Partial isindex Support
    • The History of query_keywords
  2. The query_parameters and body_parameters members should be modified to interpret name-value pairs without an equal sign as a name with an undefined value.
    • Fading isindex Support
    • Consistency Over isindex Support

Partial isindex Support

If this member was originally added to support the isindex tag, it does so in a very incomplete way. As a result, this member should not be interpretted to support the isindex tag.

The special isindex encoding outlined in the HTML 5 algorithms for encoding and dedcoding application/x-www-form-urlencoded payloads [1], is applicable to both GET and POSTs according to step 18 of the Form submission algorithm [2]. No body_keywords member is present, and query_keywords will not be populated if the body is absent of the equal sign. The query_keywords member is populated through the prepare_query_parameter subroutine defined in Engine.pm. This subroutine only uses the query string.

Furthermore, the encoding algorithm states the isindex value may exist with other name-value pairs. It should only be encoded/decoded as such if it is first in the form, query string, or request body. This means that the absence of the equal sign does not determine the use of the isindex tag.

The History of query_keywords

The query_keywords member was added with the following commit.

commit 3b4d12511c59793e85feca1ac1b4a8c2c5f1a6ae
Author: Andy Grundman <andy@hybridized.org>
Date:   Fri Aug 3 01:53:17 2007 +0000

    Added req->query_keywords method

It was documented as behaving as follows.

Contains the keywords portion of a query string, when no '=' signs are present.

All changes happening since then have been for the following reasons.

  • Whitespace formatting
  • The introduction of Moose
  • The addition of a decoding step after unescaping the query string and before assigning the query keywords.

None of those changes made query_keywords more or less supportive of the isindex tag.

The query_keywords member is documented as being release with 5.7008 [3].

5.7008  2008-08-10 08:09:00
        - Added $c->request->query_keywords for getting the keywords
          (a query string with no parameters).

John Napiorkowski indicated that an alternative intention for the query_keywords could not be determined through various talks in #catalyst.

Before query_keywords was present, the absence of an equal sign in the query string would prevent query_parameters from being populated. Users would have to rely on the $c->req->{uri}->query instead.

Fading isindex Support

The isindex element was deprecated in HTML 4.01 [4].

Browsers are moving towards no longer supporting it [5].

See section Partial isindex Support.

Consistency Over isindex Support

The isindex tag is only treated specially if it is the first entry in the form data set. Otherwise, it is encoded as normal. This special treatment causes the value of the isindex tag to be added to the query string or payload without an equal sign, or name. When it is decoded, the string found by splitting it is meant to be treated as a value, not a name [1].

Perl provides the query_parameters and body_parameters as a hash. Hashes do not support using undefined as a key. As a result, it would require changing the return type of those members.

Instead, Catalyst should choose to skip special treatment when the isindex flag is set, and decode it as a name with an undefined value.

my( $res, $c) = ctx_request(GET '/test_psgi_keys?x&a=1&b=2');

would produce:

$c->req->query_parameters = { x => undef, a = 1, b = 2 }
$c->req->query_keywords = undef

my( $res, $c) = ctx_request(GET '/test_psgi_keys?x');

would produce:

$c->req->query_parameters = { x => undef }
$c->req->query_keywords = "x"

References

[1] - https://www.w3.org/TR/html5/forms.html#url-encoded-form-data
[2] - https://www.w3.org/TR/html5/forms.html#form-submission-algorithm
[3] - https://metacpan.org/source/MRAMBERG/Catalyst-Runtime-5.7008/Changes
[4] - https://www.w3.org/TR/html401/interact/forms.html#h-17.8
[5] - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/isindex

@asparkman
Copy link
Author

asparkman commented May 27, 2016

HTTP::Body which appears to be responsible for parsing the request body does not appear to support isindex. A name-value pair must have a name and a value for it to be saved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants