Path is decoded in createLocation #505

ArthurKnaus · 2017-07-26T06:13:39Z

This results in an invalid URL being pushed to the HTML5 history element when there are encoded percent signs in the pathname.

E.g.:

history.push({
  pathname: `randomlocation/${ encodeURIComponent("AB%C") }/furthermore`
});

results in "randomlocation/AB%C/futhermore" being pushed to the HTML5 history.

The text was updated successfully, but these errors were encountered:

pshrmn · 2017-07-26T17:08:13Z

Alright, so just to be clear, the reasoning behind decoding in createLocation is that it allows us to have location object's whose pathname is decoded. That means that React Router users can use decoded strings in their route paths.

// with decoding
<Route path='/역사' />
// without decoding
<Route path='/%EC%97%AD%EC%82%AC' />

Another effect of this is that params will be automatically decoded.

<Route path='/greeting/:word' />
// user visits /greeting/안녕하세요
match === {
  ...,
  params: { word: '안녕하세요' }
  // without decoding this would be %EC%95%88%EB%85%95%ED%95%98%EC%84%B8%EC%9A%94
}

History then relies on the browser to automatically encode the pathname when we push to the browser. This appears to work for all characters except for the percent sign.

There are a few thought I have on possible ways to deal with this:

Stop all decoding

I don't think that this is desirable, but I'll at least throw it out there. We could leave the decoding to the program that is using history's locations. location.pathname would just be what the user provides. Of course, this would lead to to inconsistencies between the locations created from window.location and locations created from values passed by the user.

// URI = '/역사'
// window.location.pathname = '/%EC%97%AD%EC%82%AC'
{ pathname: '/%EC%97%AD%EC%82%AC', ...}

history.push('/역사')
{ pathname: '/역사', ...}

Add a `raw` property

Before we decode the pathname, we can store the original value as location.raw (or maybe rawPathname, name isn't important here). Then, when we create a URI, we would concatenate using location.raw instead of location.pathname.

Prevent decoding of `%25`

Instead of just calling location.pathname = decodeURI(location.pathname), we could modify that to skip decoding the character string %25, which is % encoded.

const decodeButSkipPercent = pathname => (
  pathname.split('%25')
    .map(decodeURI)
    .join('%25')
)

That would work, but we would also end up with location's whose pathname is not necessarily fully decoded. The exception should only apply to the encoded percent sign, but it is still an exception.

I'm not sure which of these is the best choice (or maybe there is a better alternative that I did not think of?) 🤷‍♀️

dominykas · 2017-08-31T08:12:25Z

Of course, this would lead to to inconsistencies between the locations created from window.location and locations created from values passed by the user.

I'm sorry, I'm not familiar with the code, but is it not possible to decodeURI only and only when creating a location from window.location? I'd assume the apps/devs would never read the location directly from there, because they access it via this lib? This is the place where history does know the context and can behave accordingly?

kellyrmilligan · 2017-10-10T21:37:41Z

what is the recommended workaround for this? in components, the route params are only partially decoded, see referenced issue. we have file names that are coming from users, and sometimes they have ? in the name.

janhoeck · 2017-10-24T06:01:38Z

I have the same problem.
I try to push this to the history:
history.push('/data/Amazing%20Example')

and history converts this to the url: http://localhost:3000/data/Amazing Example

History removes my empty space! But why :)

Kapaacius · 2017-10-24T10:53:41Z

I guess I have a similar issue as @janhoeck
I want to encode part of my link because it contains stringified object
to={ /123/${encodeURIComponent(JSON.stringify({ encode: "me" }))} }
But when I now click on the link, it ends up being
#/123/{"encode"%3A"me"}
this on the url...it kinda works but what dies for me is when someone pastes such URL to slack or some other location. Clicking on that does another set of encoding magic and then the url does not work anymore.

How can I work around it? ( 2x encodeURIComponent is not really a nice fix...but would kinda help right now :D )

OliverJAsh · 2018-01-18T17:35:06Z

@pshrmn

History then relies on the browser to automatically encode the pathname when we push to the browser. This appears to work for all characters except for the percent sign.

Do you know why browsers don't encode the % character, but encode everything else? Is it a bug or intentional for some reason? Is it applicable in all browsers?

pshrmn · 2018-01-18T17:54:08Z

@OliverJAsh Browsers use percent-encoding, so if they encoded the %, that could break pre-encoded paths. From https://tools.ietf.org/html/rfc3986#section-2.4 (emphasis mine).

Because the percent ("%") character serves as the indicator for
percent-encoded octets, it must be percent-encoded as "%25" for that
octet to be used as data within a URI. Implementations must not
percent-encode or decode the same string more than once, as decoding
an already decoded string might lead to misinterpreting a percent
data octet as the beginning of a percent-encoding, or vice versa in
the case of percent-encoding an already percent-encoded string.

OliverJAsh · 2018-01-18T18:05:16Z

Ah. I've noticed browsers also do this when you enter a URL in the address bar, e.g. https://google.com/?search=sale 50% off redirects to https://google.com/?search=sale%2050%%20off instead of the correct https://google.com/?search=sale%2050%25%20off.

IIUC: as a user, when entering the URL in the address bar, we know the query params are not encoded, but the computer does not!

OliverJAsh · 2018-01-18T18:10:05Z

Perhaps one argument in favour of stopping all decoding in this module is for consistency with HTML5's native history. Currently we have this inconsistency:

history.push(`/foo/${encodeURIComponent('foo % bar')}`)
// => URL is "/foo/foo%20%%20bar"
// => `history.location.pathname` is `/foo/foo % bar`

window.history.pushState(null, '', `/foo/${encodeURIComponent('foo % bar')}`)
// => URL is "/foo/foo%20%25%20bar"
// => `window.location.pathname` is `/foo/foo%20%25%20bar`

Both the generated URL and pathname are different in these two cases.

mjackson · 2018-01-19T02:23:43Z

I think my vote is to make our push as consistent with pushState as we can. That just makes sense.

@pshrmn Isn't there some other way we can provide decoded parameters to RR users? Instead of decoding the pathnames in history, can we do it later in the router (i.e. just before we need to match with path-to-regexp)?

pshrmn · 2018-01-19T05:01:28Z

RR could decode in matchPath (or maybe higher up the chain/memoized so it isn't called as often). Off the top of my head I can't recall if there was a reason that was avoided before, but I want to say that the idea was that doing it in history would make it available to all history users, not just RR users.

If decoding is removed, there will be an inconsistency in location objects unless history expects to be given encoded pathnames. Whether or not there are real consequences to this, I am not sure (pushing the same location and whether that pushes/replaces is the one that comes to mind).

pshrmn · 2018-01-19T05:08:35Z

I haven't thought through the implications, but if you wanted to enforce that pathnames are encoded for consistency, that could be done using an anchor. That would leave encoded ones alone, but ensure anything that should be encoded is.

function encode(pathname) {
  const a = document.createElement('a');
  a.setAttribute('href', pathname);
  return a.pathname;
}

OliverJAsh · 2018-01-22T17:32:29Z

@pshrmn

This appears to work for all characters except for the percent sign.

It also appears this doesn't work for & characters. To test this:

<a href="https://google.com/foo & bar"/>go</a>

When clicked, the browser navigates to https://google.com/foo%20&%20bar (& is not URI encoded), instead of the https://google.com/foo%20%26%20bar (& is URI encoded).

I don't know whether this needs to be considered as part of this or not—just pointing it out.

pshrmn · 2018-01-22T17:40:18Z

@OliverJAsh that is the search segment of a URI, which the history package doesn't touch.

OliverJAsh · 2018-01-22T17:42:04Z

@pshrmn I've updated my example to show the same problem for pathnames.

OliverJAsh · 2018-01-22T17:44:59Z

@pshrmn In fact it would seem the browser only URI encodes spaces, unlike encodeURIComponent. Example: http://output.jsbin.com/ceqerer/1.

I presume it's using encodeURI under the hood, which unlike encodeURIComponent doesn't encode reserved characters (&, …).

(My example uses query parameters for easy testing, but the same browser mechanism is used for pathnames.)

pshrmn · 2018-01-22T19:31:24Z

I'm trying to remember exactly what I was talking about in that comment (it has been six months since I wrote it). I believe that what I meant by "this appears to work for all characters except for the percent sign" is that the percent sign is the only character that we cannot safely decode and expect the browser to properly encode.

From MDN, decodeURI (which is what history is using to decode pathnames) does the following:

Replaces each escape sequence in the encoded URI with the character that it represents, but does not decode escape sequences that could not have been introduced by encodeURI. The character “#” is not decoded from escape sequences.

Also from MDN, the following characters (which does not include the percent sign) are not encoded by encodeURI (so the encoded sequences for these characters will not be decoded by decodeURI):

A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #

history.push('/test&ing');
const decoded = decodeURI('/test&ing'); // /test&ing
history.pushState(null, '', decoded)
// window.location.pathname = '/test&ing'

decodeURI won't change the escape sequence %26 because it could not have been introduced by encodeURI

history.push('/test%26ing');
const decoded = decodeURI('/test%26ing'); // /test%26ing
// window.location.pathname = '/test%26ing'

encodeURI can encode a percent sign (encodeURI('%') === '%25'), so decodeURI will decode %25. However, the browser will not encode the percent sign (because it would not want to encode %26 into %2526), which is the source of this issue.

history.push('/test%25ing'); // %25 = encodeURI('%')
const decoded = decodeURI('/test%25ing'); // /test%ing
// window.location.pathname = '/test%ing'

OliverJAsh · 2018-01-22T20:13:38Z

@pshrmn That makes sense!

IIUC, this issue only concerns the percent symbol and not other characters that will be decoded by decodeURI because the percent symbol is particularly problematic as the browser will not re-encode it, unlike other characters.

However, it is quite surprising to find that some characters in location.pathname (and location params in React Router) are decoded whilst others are not. As you have explained, this is because decodeURI is only permitted to decode some characters. However, I would expect consistency with window.location.pathname:

const myLocation = history.createLocation({ pathname: `/search/${encodeURIComponent('foo & bar')}` });

historyInstance.push(myLocation);

myLocation.pathname
// => "/search/foo %26 bar"

window.location.pathname
// => "/search/foo%20%26%20bar"

Even if most of the decoded characters are correctly re-encoded by the browser, this is still an issue when working with location.pathname (or location params in React Router) directly.

I think this is also called out in remix-run/react-router#5607.

OliverJAsh · 2018-01-26T20:35:30Z

For anyone who runs into this issue and is looking for workarounds…

On unsplash.com we are working around this by double URI encoding:

history.push({
  pathname: `randomlocation/${ encodeURIComponent(encodeURIComponent("AB%C")) }/furthermore`
});

This results in randomlocation/AB%25C/futhermore being pushed to the HTML5 history, which is as expected.

However, this has unintended side effects.

If a user navigates directly to randomlocation/AB%25C/futhermore, location.pathname will be randomlocation/AB%C/futhermore (decoded).

If we push randomlocation/AB%25C/futhermore to the user's history using the above workaround (as randomlocation/AB%2525C/furthermore), location.pathname will be randomlocation/AB%25C/futhermore (encoded).

(The same applies to location params.)

To workaround this side effect, we run a "safe decode":

const safeDecode = (pathname) => {
  try {
    return decodeURIComponent(pathname);
  } catch (_error) {
    return pathname;
  }
}

I'm sure this workaround has further implications…

I'd be very happy to help getting this issue fixed. Is there enough consensus for us to go with one of the proposed solutions?

kevinliang · 2018-01-30T00:44:49Z

@OliverJAsh Double encoding doesn't work because if the user hits the back button then the forward button, the error will throw again.

OliverJAsh · 2018-01-30T10:32:20Z

@kevinliang What exactly would throw again? I tried what you said and I didn't get any exception.

cephirothdy2j · 2018-10-19T18:16:13Z

Here's how we're getting around the issue in our project:

Switch your project to use Yarn if necessary.
In package.json, add "history": "4.5.1" to your project's "dependencies".
Also add "history": "4.5.1" to your "resolutions".
You likely will have to encodeURIComponent and decodeURIComponent as you push entries to history and read params off of the URL.

Per the library's changelog, the encoding / decoding stuff was first added in v4.6.0, and v4.5.1 is the most recent version published without these changes.

By making history a direct dependency of our project and having any libraries that rely on it resolve to our selective dependency, we get around the encoding / decoding mess. Yes, we do miss out on the benefits this functionality provides - as well as any other upgrades and fixes published to history since 4.5.1 - but we were not using any of those other features anyways.

Andrekra · 2018-11-21T09:30:35Z

For me it seems the decodeURI is the culprit.

I tried to visit an url with reserved characters (? # % / ) so I use encodeURIComponent to build the url. /the/path%23%25%3F%2F. But the decodeURI in createLocation, changes it to /the/path/%23%%3F%2F Which cant be decoded with decodeURIComponent or used re-encode with encodeURI in the hope to get the original string back.

// User
let pathname = encodeURIComponent('%?#') // %25%3F%23

// History
pathname = decodeURI(patname) // %%3F%23

// User Trying to get the ```%?#``` back
decodeURIComponent(pathname) // Uncaught URIError: URI malformed
// Another attempt, encode it back to get the original pathname
ecodeURI(pathname) // %25%253F%2523 Huh? Thats not my original string. Double encode the %

From random googling, people said the decodeURI/encodeURI should be used on full url's, while decodeURIComponent/encodeURIComponent on parts (component) of the url. The pathname is a component. More info here for the difference between them https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI#Description

I've wrote a few tests. If you change decodeURI to decodeURIComponent these will pass:

  describe('with a path that can be decoded', () => {
    it('handles encoded path', () => {
      expect(
        createLocation({pathname: '/the/path%23%25%3F%2F'})
      ).toMatchObject({
        pathname: '/the/path#%?/',
        search: '',
        hash: ''
      });
    });

    it('handles not encoded path', () => {
      expect(
        createLocation('/the/path?the=query#the-hash')
      ).toMatchObject({
        pathname: '/the/path',
        search: '?the=query',
        hash: '#the-hash'
      });
    });
  });

There are 3 (well 1 for 3 histories) tests failing, but I think that would be a false negative:

        Expected value to match object:
          {"pathname": "/view/%23abc"}
        Received:
          {"hash": "", "key": "rw3pt0", "pathname": "/view/#abc", "search": "", "state": undefined}

A problem would arise, when you decode %23 as an hash, and later on would re-interpertate the hash as a valid fragment (if this would be possible). Safest way would be to store the original path somewhere.

Edit:

I also discovered a second issue, related to this.
If you refresh on an URL with a safe character encoded, it's decoded (PUSH). If you use POP, the path stays as is.

const history = createBrowserHistory()
console.log('INIT:', history.location.pathname)
history.listen(location => {
  console.log(history.location.pathname)
})

// Refresh on page /the/path%23%25%3F%2F
=> INIT /the/path/%23%%3F%2F (POP, decoded, notice the double %)
// Go back and forth in in browser history
=>  /the/path/%23%25%3F%2F (PUSH, no decode)

This means I sometime have to decode twice, and sometimes once, if I use the double encode method described.

My current workaround:

const decode = str => {
  try {
    return decodeURIComponent(decodeURIComponent(str))
  } catch (e) {
    try {
      return decodeURIComponent(str)
    } catch {
      return str
    }
  }
}

const encode = str => encodeURIComponent(encodeURIComponent(str))

Dalzhim · 2018-12-21T01:17:07Z

Invoking decodeURI(location.pathname) on URLs before handling them to the browser's history is a major issue. First of all, it forces the browser to store malformed URLs (i.e. the original url https://mydomain.com/test%25 is modified to be https://mydomain.com/test%). Any browser vendor could disregard such blatant misuse of the API and completely break every user of this library when modifying their API to throw when pushing/replacing invalid URLs. They could also very well silently discard malformed URLs and never allow popping them back.

In this thread, workarounds were suggested as to decode everything except % signs. I would argue that it is a bad idea because you're going to surprise some of your indirect users who are going to have a hard time when their RR matcher doesn't work the same way once the % character is part of the URL (i.e. URLs are partially decoded except %25, which means you need to match against %25 rather than %).

Even though this represents a breaking change for people who came to rely on this implicit decoding, it is also a mandatory bug fix as applications relying directly or indirectly on this library are broken in regards to the % character and there is no reasonable workaround without fixing the root cause.

In conclusion, better figure out a path to make the breaking change at the time of your choosing rather than wait for a browser vendor to force you into it. In the mean time, the main Readme should list this as a known issue.

Prophet32j · 2018-12-21T02:58:18Z

We are ditching RR for @reach router because of this massive issue.

Fixes remix-run#505

OliverJAsh · 2019-01-10T17:11:10Z

Trying to figure out where this issue now stands and how we can move forward.

I think you're right, @OliverJAsh. We should have left the pathname encoded and then only decoded it when we needed to make the match in React Router. This is the way previous versions of the router worked.

Now, the question is how to get there from here w/out breaking everyone's apps. This would be a breaking change, so we would need to push a new major version.

#505 (comment)

Plan:

Raise PR to remove decoding from history
Release a new version of history with this change
Raise PR to upgrade history in react-router and handle decoding internally
Release a new version of react-router with this change

I've started 1 in #656.

kevinkyyro · 2019-01-22T18:36:32Z

As a matter of API design, if you want to allow users to provide decoded paths, you could accept a list instead of a string:

{
  pathname: '/search/a%2Fb%20testing/books'
}
// should be equivalent to something like
{ 
  pathname: ['search', 'a/b testing', 'books'] 
}

The relationship between each form should be approximately:

listPathname = stringPathname.replace(/^\/|\/$/g, '').split('/').map(decodeURIComponent)
stringPathname = '/' + listPathname.map(encodeURIComponent).join('/')

OliverJAsh · 2019-02-11T13:32:16Z

Update on #505 (comment). We've been waiting for a few weeks now for feedback from a maintainer of this repo. No signs yet. :-(

In the meantime, I thought I'd share how we're working around this at Unsplash. We use patch-package, which lets us modify the contents of node_modules. Using this, we can remove the decoding. This has fixed all of our problems.

patch-package
--- a/node_modules/history/LocationUtils.js
+++ b/node_modules/history/LocationUtils.js
@@ -44,15 +44,9 @@ var createLocation = exports.createLocation = function createLocation(path, stat
     if (state !== undefined && location.state === undefined) location.state = state;
   }
 
-  try {
-    location.pathname = decodeURI(location.pathname);
-  } catch (e) {
-    if (e instanceof URIError) {
-      throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
-    } else {
-      throw e;
-    }
-  }
+  // Waiting for upstream PR to be merged
+  // https://github.com/ReactTraining/history/issues/505#issuecomment-453175833
+  // https://github.com/ReactTraining/history/pull/656
 
   if (key) location.key = key;
 
--- a/node_modules/history/es/LocationUtils.js
+++ b/node_modules/history/es/LocationUtils.js
@@ -31,16 +31,6 @@ export var createLocation = function createLocation(path, state, key, currentLoc
     if (state !== undefined && location.state === undefined) location.state = state;
   }
 
-  try {
-    location.pathname = decodeURI(location.pathname);
-  } catch (e) {
-    if (e instanceof URIError) {
-      throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
-    } else {
-      throw e;
-    }
-  }
-
   if (key) location.key = key;
 
   if (currentLocation) {
--- a/node_modules/history/umd/history.js
+++ b/node_modules/history/umd/history.js
@@ -157,16 +157,6 @@ return /******/ (function(modules) { // webpackBootstrap
 	    if (state !== undefined && location.state === undefined) location.state = state;
 	  }
 
-	  try {
-	    location.pathname = decodeURI(location.pathname);
-	  } catch (e) {
-	    if (e instanceof URIError) {
-	      throw new URIError('Pathname "' + location.pathname + '" could not be decoded. ' + 'This is likely caused by an invalid percent-encoding.');
-	    } else {
-	      throw e;
-	    }
-	  }
-
 	  if (key) location.key = key;
 
 	  if (currentLocation) {

@XavierM

This PR covers the Routing + Page Skeleton of the Network IP Details page, including: * New route `app/secops#/network/ip/:ipv4_or_ipv6` * Empty page `ip_details` following convention from `host_details` * Links to IP fields for following components: * Authentications Table * Events Table * Host Summary * Network Top N Flow Table * Timeline Table * HoC `HostDetailsLink` and `IPDetailsLink` for easily linking to the `HostDetails` or `IP Details` pages * Refactored `Breadcrumb` component and navigation logic using react-router's `withRouter` * Added `encodeIpv6` & `decodeIpv6` helper functions (see note below) * Added tests for: * HeaderBreadcrumbsComponent * Custom links `HostDetailsLink` and `IPDetailsLink` * ipv6 helper functions Includes fixes for: * https://github.com/elastic/ingest-dev/issues/310 * https://github.com/elastic/ingest-dev/issues/234 Note: Due to remix-run/history#505 and the way SecOp's react-router is nested within Kibana's native routing (via `link-to`), URL's that contain characters that require URL Encoding (like colons in ipv6) can break the back button. This occurs as react-router adds a history entry for the decoded URL, so when pressing the back button the user returns to the decoded `link-to` url, which in turns re-routes them back to the page they were on. Speaking with @XavierM, we decided an initial solution for this would be to replace the `:`'s in ipv6's with `-`'s. As referenced in the issue above, it looks as if there is potential for a more robust solution we could implement at the `link-to` layer of our application. In the meantime, our current solution maintains the functionality of the back button, and also allows the user to manually input ipv6's with `:`'s into the address bar without breaking navigation. ![host_ip](https://user-images.githubusercontent.com/2946766/53674424-18ee6080-3c4b-11e9-8149-cb4dd048ca70.gif)

OliverJAsh · 2019-03-25T14:31:01Z

There's some discussion in #656 about what the correct behaviour is (as opposed to decoding pathname). Our options are:

Just use whatever pathname is provided (i.e. leave it to the browser/user to encode).
Try to normalize pathnames so that they are encoded. If the provided pathname is fully encoded, use as is, otherwise (if the pathname is not encoded at all, or some parts are not) encode the parts that are not yet encoded.

What do people here think? Can anyone foresee 2 causing issues?

With 2, I'm worried about bugs in the normalization process. This would be off loaded to encodeurl. Is it a completely safe operation? How likely is it to go wrong?

/cc @pshrmn

bripkens · 2019-03-25T14:43:53Z

Try to normalize pathnames so that they are encoded. If the provided pathname is fully encoded, use as is, otherwise (if the pathname is not encoded at all, or some parts are not) encode the parts that are not yet encoded.

IMHO this is impossible to implement. Let's say I have a path template like /article/:name with the name parameter having the value the best 10%. Since we are all fine developers we encode the value and build the full path. This results in:

$ '/article/' + encodeURIComponent('the best 10%')
'/article/the%20best%2010%25'

A generic piece of code cannot identify that this is already properly encoded. This is the whole reason why this issue here came up. encodeURI is already attempting such an implementation. For completeness:

$ encodeURI('/article/the%20best%2010%25')
/article/the%2520best%252010%2525

Just use whatever pathname is provided (i.e. leave it to the browser/user to encode).

The value is called pathname which translates in my head to path and thus to the way it is defined in the specification. IMHO it is fair to assume / require that developers properly escape when fiddling around with dynamic paths.

P.S. We are making heavy use of matrix parameters at Instana. Due to this bug we are operating a fork of history for the last 1.5 years which just removes the double encoding. It is working perfectly fine for us.

pshrmn · 2019-03-25T14:51:35Z

@bripkens encodeurl is designed to work with already/partially encoded strings.

With this sandbox you can see that it handles fully encoded, non-encoded, and partially encoded strings.

Dalzhim · 2019-03-25T15:43:02Z

@bripkens is right when he says it is impossible to implement. In order to correctly encode an URL, you must be able to prove the reverse transformation is always possible without any loss of information. Otherwise you might be causing prejudice to the application that expects to see the same URL that was given to the system originally.

Your sandbox demonstrates this impossibility. Three different source URLs are encoded to the same resulting URL. When you decode the final URL, you cannot possibly retrieve the correct source URL all of the time because you cannot guess the intent of the user. This will inevitably lead to subtle bugs that often go unnoticed.

One final example to demonstrate how encodeurl will screw up some URLs is shown in this sandbox I derived from yours. Assuming I create an article with the following title : %20 is a space %-encoded, the urlencode library sees it as being partially encoded when really, it is not encoded at all. The resulting encoded string is the following : %20%20is%20a%20space%20%25-encoded which then decodes to ** is a space %-encoded**, corrupting the title of my article by decoding the litteral %20 I originally used.

pshrmn · 2019-03-25T17:04:41Z

@Dalzhim I'll start out with your final example; please let me know if I am misinterpreting your point.

Let's start out by completely ignoring the encodeurl package.

If we use history.pushState to navigate to this /%20 is a space %-encoded, we end up with the following pathname:

history.pushState(null, "", "/%20 is a space %-encoded")
window.location.pathname === "/%20%20is%20a%20space%20%-encoded"

The browser does not encode percentage signs, so the initial %20 octet is used as provided. We also have an invalid URI because %-e is an invalid octet.

decodeURI(window.location.pathname) // this throws an URIError

To make the URI valid, we can pass it to an encodeURI call. This will encode %-en to %25-en as well as convert %20 to %2520%20. This will result in a pathname that can be decoded to the desired result.

history.pushState(null, "", encodeURI("/%20 is a space %-encoded"))
window.location.pathname === "/%2520%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/%20 is a space %-encoded" CORRECT

Now, getting back to encodeurl, what happens if we wrap the pathnames in encodeurl calls?

history.pushState(null, "", encodeurl("/%20 is a space %-encoded"))
window.location.pathname === "/%20%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/ is a space %-encoded" INCORRECT

history.pushState(null, "", encodeurl(encodeURI("/%20 is a space %-encoded")))
window.location.pathname === "/%2520%20is%20a%20space%20%25-encoded"
decodeURI(window.location.pathname) // "/%20 is a space %-encoded" CORRECT

In the first case, we still have the incorrect pathname, but it is no longer an invalid URI because encodeurl correctly encodes %-en to %25-en. The issue is that encodeurl interpets the initial %20 as a valid octet.

In the second case, we get the same (correct) result as the plain encodeURI result; wrapping an encodeURI call in an encodeurl call does not have any issues with double encoding.

encodeurl offers partial protection, but is not capable of determining the users intent. The conclusion that I draw from this is that the user should always encode percentage signs themselves.

Dalzhim · 2019-03-25T18:21:59Z

@pshrmn : Your previous message is 100% correct. The user is the only one who has enough information to disambiguate any malformed URL. The first case ended up with a valid, but meaningless URL whereas the second example displays proper construction of the URL by the user, which makes it possible for the library to handle in a generic way. In other words, decodeURI(encodeURI(url)) === url must always be true, otherwise the error is the user's responsibility.

OliverJAsh · 2019-03-26T08:41:28Z

Someone who knows better than me should raise a PR to encodeurl to mention these caveats…

kelly-tock · 2019-03-26T11:07:45Z

Can I ask before this issue, maybe rr 3 and below, what were the isssues with how it treated path params and what not? I don’t remember tbh having an issue until 4.

Fixes #505

The `history` package automatically tries to decode the URI and will break if the URI has an unencoded `%`. This problem usually props up when users manually manipulate the URL since in-app navigation will properly handle encoding. This change will redirect the user to the current path and strip all query strings. See remix-run/history#505 - They merged a change that would make the `history` package not always attempt to decode the URI, but requires a major version bump (and corresponding updates to react-router). Fixes JAVASCRIPT-7W0

…#13306) The `history` package automatically tries to decode the URI and will break if the URI has an unencoded `%`. This problem usually props up when users manually manipulate the URL since in-app navigation will properly handle encoding. Currently, when this happens, the user is greeted with a big empty page. Instead, this change will redirect the user to the current path and strip all query strings so that we're able to at least load the Sentry app. See remix-run/history#505 - They merged a change that would make the `history` package not always attempt to decode the URI, but requires a major version bump (and corresponding updates to react-router). Fixes JAVASCRIPT-7W0 Fixes JAVASCRIPT-5XJ

Fixes remix-run#505

This was referenced Aug 28, 2017

Partially encoded parameter remix-run/react-router#5472

Closed

matchPath failing at unicode paths remix-run/react-router#5474

Closed

This was referenced Aug 28, 2017

Was automatic decoding a mistake? #516

Closed

Add rawPathname to location #517

Closed

timdorr mentioned this issue Oct 10, 2017

Route parameter value with an encoded '?' as %3F not decoded in params remix-run/react-router#5607

Closed

mjackson mentioned this issue Jan 19, 2018

decodeUri the path breaks encoding #551

Closed

timdorr mentioned this issue Oct 10, 2018

Using Unicode routes remix-run/react-router#6386

Closed

timdorr mentioned this issue Dec 11, 2018

When I try to use replace, will the location search value be automatically transcoded? remix-run/react-router#6506

Closed

OliverJAsh added a commit to OliverJAsh/history that referenced this issue Jan 10, 2019

Remove location pathname decoding

11bde7a

Fixes remix-run#505

OliverJAsh mentioned this issue Jan 10, 2019

Remove location pathname decoding #656

Merged

1 task

spong mentioned this issue Mar 7, 2019

Add Network IP Details Page elastic/kibana#32365

Merged

Dalzhim mentioned this issue Mar 12, 2019

Inform users about known issues #675

Closed

jacobwgillespie mentioned this issue Mar 21, 2019

Don't decode paths in locations (keep the browser-native value) frontarm/navi#69

Closed

mjackson closed this as completed in #656 Mar 26, 2019

mjackson pushed a commit that referenced this issue Mar 26, 2019

Remove location pathname decoding

78f016f

Fixes #505

billyvg mentioned this issue May 21, 2019

fix(ui): Sentry completely breaks if there is an unencoded % in URL getsentry/sentry#13306

Merged

lock bot locked as resolved and limited conversation to collaborators May 25, 2019

forl pushed a commit to forl/history that referenced this issue Aug 12, 2019

Remove location pathname decoding

930f0f7

Fixes remix-run#505

Path is decoded in createLocation #505

Path is decoded in createLocation #505

Comments

ArthurKnaus commented Jul 26, 2017

pshrmn commented Jul 26, 2017

Stop all decoding

Add a raw property

Prevent decoding of %25

dominykas commented Aug 31, 2017

kellyrmilligan commented Oct 10, 2017

janhoeck commented Oct 24, 2017

Kapaacius commented Oct 24, 2017

OliverJAsh commented Jan 18, 2018

pshrmn commented Jan 18, 2018

OliverJAsh commented Jan 18, 2018 • edited

OliverJAsh commented Jan 18, 2018 • edited

mjackson commented Jan 19, 2018

pshrmn commented Jan 19, 2018

pshrmn commented Jan 19, 2018

OliverJAsh commented Jan 22, 2018 • edited

pshrmn commented Jan 22, 2018

OliverJAsh commented Jan 22, 2018

OliverJAsh commented Jan 22, 2018 • edited

pshrmn commented Jan 22, 2018

OliverJAsh commented Jan 22, 2018 • edited

OliverJAsh commented Jan 26, 2018 • edited

kevinliang commented Jan 30, 2018

OliverJAsh commented Jan 30, 2018

cephirothdy2j commented Oct 19, 2018 • edited

Andrekra commented Nov 21, 2018 • edited

Dalzhim commented Dec 21, 2018 • edited

Prophet32j commented Dec 21, 2018

OliverJAsh commented Jan 10, 2019 • edited

kevinkyyro commented Jan 22, 2019 • edited

OliverJAsh commented Feb 11, 2019

OliverJAsh commented Mar 25, 2019

bripkens commented Mar 25, 2019 • edited

pshrmn commented Mar 25, 2019

Dalzhim commented Mar 25, 2019

pshrmn commented Mar 25, 2019

Dalzhim commented Mar 25, 2019 • edited

OliverJAsh commented Mar 26, 2019

kelly-tock commented Mar 26, 2019

Add a `raw` property

Prevent decoding of `%25`

OliverJAsh commented Jan 18, 2018 •

edited

OliverJAsh commented Jan 18, 2018 •

edited

OliverJAsh commented Jan 22, 2018 •

edited

OliverJAsh commented Jan 22, 2018 •

edited

OliverJAsh commented Jan 22, 2018 •

edited

OliverJAsh commented Jan 26, 2018 •

edited

cephirothdy2j commented Oct 19, 2018 •

edited

Andrekra commented Nov 21, 2018 •

edited

Dalzhim commented Dec 21, 2018 •

edited

OliverJAsh commented Jan 10, 2019 •

edited

kevinkyyro commented Jan 22, 2019 •

edited

bripkens commented Mar 25, 2019 •

edited

Dalzhim commented Mar 25, 2019 •

edited