
Message affected by 'use utf8', breaks binary POSTs [rt.cpan.org #77403] #70

Open
oalders opened this issue Mar 31, 2017 · 1 comment

Comments


oalders commented Mar 31, 2017

Migrated from rt.cpan.org#77403 (status was 'open')

Requestors:

Attachments:

From henrik.pauli@gmail.com on 2012-05-24 13:30:09:

It appeared to us that POSTing binary data with LWP corrupted the data
when (and only when) we had "use utf8" enabled in the script using LWP.

This bug was present in LWP 5.833 as well as the newest HTTP::Message 6.03.

"use utf8" doesn't do anything but turn the strings in the source code
into strings of characters, rather than octets -- it seems that
HTTP::Request::Common is completely encoding (and u-string) agnostic,
which is VERY dangerous in a place where you manipulate octet streams.

The source of the problem is that you have strings in the source code
(eg. where you add the Content-Disposition header[1]), and *also* read
bytes from the file into the same buffer later on[2].  One is easily a
character string, the other is definitely an octet stream.
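
The mixing described above can be reproduced with core Perl alone. This is a minimal sketch, not the code in HTTP::Request::Common: the header text, the file name and the four sample octets are made up for illustration, and the final utf8::encode only stands in for whatever layer might later serialize the buffer as UTF-8.

```perl
use strict;
use warnings;

# Hypothetical multipart header fragment with a non-ASCII file name (U+00F6),
# the kind of character string a literal under "use utf8" can produce.
my $header = "Content-Disposition: form-data; name=\"file\"; filename=\"r\x{F6}d.jpg\"\r\n\r\n";
utf8::upgrade($header);    # force the character (UTF8-flagged) representation

# Four sample octets standing in for data read from a binary file.
my $binary = "\xFF\xD8\xFF\xE0";

# Concatenation yields a character string; the octets are now code points.
my $buffer = $header . $binary;

# Assumption: something downstream serializes the buffer as UTF-8.
# Every code point from 0x80 to 0xFF then becomes two bytes.
my $wire = $buffer;
utf8::encode($wire);

my $tail = join ' ', map { sprintf '%02X', $_ } unpack 'C*', substr($wire, -8);
print "binary part on the wire: $tail\n";
```

In this sketch the four octets FF D8 FF E0 come out as the eight octets C3 BF C3 98 C3 BF C3 A0, which is the kind of corruption a binary upload would show.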

Not sure what the right solution is, but the module should safeguard
itself against these kinds of things.

[1]
https://metacpan.org/source/GAAS/HTTP-Message-6.03/lib/HTTP/Request/Common.pm#L135
[2]
https://metacpan.org/source/GAAS/HTTP-Message-6.03/lib/HTTP/Request/Common.pm#L243

P.S. Possibly a related issue: we also recently noticed that combining
https with use utf8 breaks an HTTP request, while dropping either one
(or both) does not.

PPS. Perl 5.10.1, Linux 3.1 x86.

From gaas@cpan.org on 2012-05-27 11:48:49:

It would be helpful if you can provide a small test script that demonstrates
the problem.

From gortan@cpan.org on 2015-05-13 15:51:32:

On Sun May 27 07:48:49 2012, GAAS wrote:
> It would be helpful if you can provide a small test script that demonstrates the problem.

I think I just ran into the same issue, and tried to come up with two minimal scripts. Both have a constant value 'öööö' in their source code, which they pass to HTTP::Request::Common::POST and print as application/x-www-form-urlencoded. One of the scripts is saved as latin-1; the other is saved as utf-8 and has "use utf8" set.
I would expect the output of both scripts to be identical. However, while the latin1 script produces the expected:
text=%F6%F6%F6%F6%F6%F6%F6%F6%F6%F6%F6
the utf8 script (imho incorrectly) produces:
text=%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6%C3%B6

$HTTP::Request::Common::VERSION is 6.04, perl v5.20.2 built for x86_64-linux.
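
The two outputs above correspond to two different byte encodings of the same characters. As a hedged sketch (core Perl only, not using HTTP::Request::Common), the following encodes the text explicitly before percent-encoding it, so the result no longer depends on how the script file is saved or on "use utf8". The helper percent_encode_octets is hypothetical and, for simplicity, escapes every octet; the string is eleven times U+00F6 to match the quoted output.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper for this sketch: percent-encode every octet of a byte string.
sub percent_encode_octets {
    my ($octets) = @_;
    return join '', map { sprintf '%%%02X', $_ } unpack 'C*', $octets;
}

# Eleven times U+00F6, written as an escape so that the encoding of
# this source file does not matter.
my $text = "\x{F6}" x 11;

# Choose the wire encoding explicitly instead of relying on Perl's
# internal string representation.
my $as_latin1 = 'text=' . percent_encode_octets(encode('ISO-8859-1', $text));
my $as_utf8   = 'text=' . percent_encode_octets(encode('UTF-8', $text));

print "$as_latin1\n";
print "$as_utf8\n";
```

Passing already-encoded octets to the request builder is one way to make both scripts behave the same; which of the two encodings the server expects is a separate question.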

aero commented Jun 3, 2017

Hi,
I guess this issue is similar to the failures shown at http://matrix.cpantesters.org/?dist=WebService-KoreanSpeller+0.014
The tests fail only on perl 5.10.1 and below.

I tested it myself.
I changed the URL at https://metacpan.org/source/AERO/WebService-KoreanSpeller-0.014/lib/WebService/KoreanSpeller.pm#L25 to localhost
and captured the raw request with nc.

Why does an LWP POST send a different request when the LWP-related modules are the same version and only the Perl version differs?

Case A: Perl 5.10.1 , LWP 6.26, HTTP::Request 6.11
Case B: Perl 5.20.3 , LWP 6.26, HTTP::Request 6.11

Case A

POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: localhost:88888
User-Agent: libwww-perl/6.26
Content-Length: 96
Content-Type: application/x-www-form-urlencoded

text1=%C3%AC%C2%95%C2%88%C3%AB%C2%87%C2%BD%C3%AD%C2%95%C2%98%C3%AC%C2%84%C2%B8%C3%AC%C2%9A%C2%94

Case B

POST / HTTP/1.1
TE: deflate,gzip;q=0.3
Connection: TE, close
Host: localhost:88888
User-Agent: libwww-perl/6.26
Content-Length: 51
Content-Type: application/x-www-form-urlencoded

text1=%EC%95%88%EB%87%BD%ED%95%98%EC%84%B8%EC%9A%94
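
The two bodies differ in a regular way: Case A looks like the UTF-8 octets of Case B treated as Latin-1 characters and then encoded as UTF-8 a second time. The sketch below reproduces both bodies with core Perl only. It makes no claim about where in LWP or Perl 5.10.1 the second encoding happens, and the helper percent_encode_octets is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper for this sketch: percent-encode every octet of a byte string.
sub percent_encode_octets {
    my ($octets) = @_;
    return join '', map { sprintf '%%%02X', $_ } unpack 'C*', $octets;
}

# The five characters behind Case B, written as escapes so that the
# encoding of this source file does not matter.
my $chars = "\x{C548}\x{B1FD}\x{D558}\x{C138}\x{C694}";

# Encoded once: 15 UTF-8 octets, as in Case B.
my $utf8_octets = encode('UTF-8', $chars);
my $case_b = 'text1=' . percent_encode_octets($utf8_octets);

# Encoded twice: read the 15 octets as Latin-1 characters, then encode
# those characters as UTF-8 again, giving 30 octets as in Case A.
my $as_latin1_chars = decode('ISO-8859-1', $utf8_octets);
my $case_a = 'text1=' . percent_encode_octets(encode('UTF-8', $as_latin1_chars));

printf "%s (%d bytes)\n", $case_b, length $case_b;
printf "%s (%d bytes)\n", $case_a, length $case_a;
```

The resulting lengths, 51 and 96, match the Content-Length headers of Case B and Case A.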
