Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Http server message.url does not follow the whatwg url standard and does not evaluate ../ #51311

Open
stefanbeigel opened this issue Dec 29, 2023 · 7 comments

Comments

@stefanbeigel
Copy link

stefanbeigel commented Dec 29, 2023

Version

18.17.1

Platform

All

Subsystem

No response

What steps will reproduce the bug?

I would like to share with you a very common scenario:

  1. A request is recevied via NodeJs Express or Fastify server using http module.
  2. Request is forwarded to another service using an http client that uses the whatwg URL class to build the target URL using the service hostname + the incoming message url

Express and fastify route matching is done on the incoming message url, which is not whatwg compliant url, see also discussion here This scenario can lead to path traversal vulnerabilities as the whatwg URL class will evaluate ../ and ./ but the server has matched on another path.

This situation is not good at all, because the developer need to know about the different parsing logic. I created this issue at Nodejs as the root cause of this the url property of the incoming message. As you can see in the linked property the fastify maintainer says that

Fastify never aimed to be compatible with the WHATWG Url parsing standard.

Having different url representations in one framework/language is not good at all.

Example application

I have prepared a sample application using fastify:
https://github.com/stefanbeigel/whatwg-fastify-path-traversal/blob/main/index.mjs
Call the app with curl --path-as-is localhost:3000/abc/../foobar

How often does it reproduce? Is there a required condition?

What is the expected behavior? Why is that the expected behavior?

  1. Parse the incoming message url with WHATWG url class

OR

  1. Drop the whatwg standard for the URL class, as it mainly used for browsers.

What do you see instead?

Additional information

Fastify: Different URL parsing between fastify and URL whatwg standard
WhatWG: URL path shortening for ../ creates problem with other URL parsers that do not follow the whatwg standard
Discussion: http(s) server why is request.url actually not an url, but a pathname. And why is parsing it now so hard?

@aduh95
Copy link
Contributor

aduh95 commented Dec 29, 2023

Can you provide a repro that does not involve downloading code from the internet, specifying what is the expected behavior and what you see instead?

Enter details about your bug, preferably a simple code snippet that can
be run using `node` directly without installing third-party dependencies
or downloading code from the internet (i.e. no ZIP archive, no GitHub
repository, etc.).

@stefanbeigel
Copy link
Author

stefanbeigel commented Jan 5, 2024

Hi @aduh95
created a minimal http server example:

import http from 'http';

http.createServer((request, response) => {
    let targetUrl = new URL(`https://abc.com${request.url}`);
    console.log(`incoming url: ${request.url}`);
    console.log(`target url: ${targetUrl}`);

    response.writeHead(200, {'Content-Type': 'text/plain'});
    response.end('Hello World\n');
}).listen(3000);

When calling the serer with:

curl --path-as-is "localhost:3000/abc/../foobar"

The output is

incoming url: /abc/../foobar
target url: https://abc.com/foobar

So the incoming url does not follow the whatwg standard as dot-segments are not removed.
When using the Nodejs whatwg URL class this dot-segments are removed.
This leads to the problem that most http frameworks like express and fastify don't match the right route, as they use message.url

As a solution I could image to add an option to http.createServer that the message.url is parsed with the URL class.
For example:

    if(parseIncomingUrlwithWhatWg) {
      const newUrl = (new URL(`htp://localhost${request.url}`));
      request.url = newUrl.pathname + newUrl.search;
    }

Please also check out the WhatWg issue discussion

@marco-ippolito
Copy link
Member

cc @anonrig

@marco-ippolito
Copy link
Member

I think that changing the behavior would break the ecosystem so the only possibility would be in passing an option to http.createServer. PR welcomed

@wesleytodd
Copy link
Member

We discussed this a bit today on the web server frameworks team. Here is a recap:

  1. The breaking nature of a change to req.url would make it really difficult to land
  2. The idea to make it an option to http.createServer would mean that libraries in your framework stack would need to know which mode they were operating in which might be difficult to achieve and is less than idea.
  3. Alternatives like a new property (bad example: req.newURL) would make it easier to land but less useful to the ecosystem
  4. Ideally we get to a place with the new http apis where we are only exposing a URL instance, that work is being tracked here: URL's on incoming requests web-server-frameworks#71

So for me, I would say that maybe the current options on the table for this are worse than the unexpected behavior and that we should work hard toward the new http api's being proposed instead.

@stefanbeigel
Copy link
Author

Thanks @wesleytodd for the recap and linking of the web-server-frameworks issue, good to know that it was even discussed there before.
As you might have seen I also tried to add it to the http-server module, where I received different options.
The original problem would only be solved if everyone (all web frameworks) would use the parsed url for routing and not the raw message header (path + query).
So I guess the biggest concern is the performance drawback when using whatWG for parsing.
Maybe it would be possible to parse the url (only path and query) more efficient.
At the moment I have the feeling I can't really help there as you need to discuss this further in the web-server-frameworks round.

@wesleytodd
Copy link
Member

I also tried to add it to the http-server module

I did see that and read through the comments, I agree they gave different input but also I agree mostly with what was said over there. Honestly thinking about it maybe this comment should have been on that PR to keep the conversation together, this was just the link shared in the meeting.

The original problem would only be solved if everyone (all web frameworks) would use the parsed url for routing and not the raw message header (path + query).

Yeah this is, IMO, a valid goal. But I think different frameworks would have different targets, for example Fastify highly values performance so might decide not to use the parsed URL. On the other hand, Express might decide to use the URL instance as performance is a secondary concern compared to user expectations around web platform support. I say this because ideally I think Node core should offer both in easily consumed ways.

So I guess the biggest concern is the performance drawback when using whatWG for parsing.

I mean that is some folks biggest concern, it is not mine (as I said above).

At the moment I have the feeling I can't really help there as you need to discuss this further in the web-server-frameworks round.

That is an open group, we would love to have you in those issues and helping us drive the decisison making on this topic. Feel free to hop in there!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants
@wesleytodd @aduh95 @marco-ippolito @stefanbeigel and others