Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the raw url path when proxying upstream requests #997

Merged
merged 7 commits into from
Sep 17, 2021

Conversation

FStelzer
Copy link
Contributor

@FStelzer FStelzer commented Jan 18, 2021

This way escaped char's like /%2F/ will still work instead of returning
301. The proxy should ideally touch the request as little as possible /
needed.

Description

Adds a new configuration flag to enable passing of the raw path to upstreams

Motivation and Context

#844

How Has This Been Tested?

a couple of new tests and running in our production setup for ~6month now.

Checklist:

  • My change requires a change to the documentation or CHANGELOG.
  • I have updated the documentation/CHANGELOG accordingly.
  • I have created a feature (non-master) branch for my PR.

@JoelSpeed
Copy link
Member

I think this is actually a reasonable fix, did you ever look further into possible requests that it would break?

@FStelzer
Copy link
Contributor Author

FStelzer commented Feb 8, 2021

I've deployed this change into our production which proxies various admin tools / frontends and had no issue with any of them.
If something breaks by this change then it was probably broken before putting a proxy in front of it. Generally i'm in favor of having the proxy touch the user request as little as possible.

@JoelSpeed
Copy link
Member

Ok well that seems reasonable, I've had a bit of a think about this and I think it's probably the right way to go, would like to get @NickMeves opinion as well though, Nick can you think of any negative affects of this change?

@NickMeves
Copy link
Member

I've definitely run into this in the past (with Apache & python webservers if I recall). This auto %2F to / conversion is a widespread problem.

My only concern is the breaking aspect of this change -- us flipping this over would break upstream systems that potentially accounted for this issue in their logic?

Is this minor enough of a breaking change to squeeze into a minor release?

@FStelzer
Copy link
Contributor Author

To rely on the current behaviour you'd have to explicitly link to sth like /%2F/ and expect the backend to just get /
That doesen't really make sense since you could link / in the first place.
If you were working around this behaviour you would probably change the encoding for the / altogether (or put it into a parameter) and that should continue to work.
I think the behaviour change regarding %2F and / is rather safe. I'm not sure though which other encoded char's might now be passed that were removed/decoded/rewritten by the go url parser before.

@NickMeves
Copy link
Member

To rely on the current behaviour you'd have to explicitly link to sth like /%2F/ and expect the backend to just get /
That doesen't really make sense since you could link / in the first place.
If you were working around this behaviour you would probably change the encoding for the / altogether (or put it into a parameter) and that should continue to work.
I think the behaviour change regarding %2F and / is rather safe. I'm not sure though which other encoded char's might now be passed that were removed/decoded/rewritten by the go url parser before.

I ran into this a lot build security threat indicator systems. The threat indicator (IP, email, domain, URL) would be the last dynamic part of a path in a REST API on the graph database structure (e.g. a route in my backend like this - /vertex/indicator/:threat)

Where :threat being the last part, I recall I could handle any nested / in there I received from my frontend proxy layers. Not saying that was the best (or even a good design 😅 ) -- but that's a sample scenario I've had where you can expect dynamic %2F to / conversions where I worked around it on the backend because I didn't have access to tweak the corporate hosting layer I was stuck with.

If this change happened under my nose without being marked as breaking - I would've needed to add URL decoding logic on the :threat variable in the backend.

Now the question - is this situation minor enough of a breaking change to warrant this sneaking into a minor release instead of waiting for the next major?

@NickMeves
Copy link
Member

This got me curious for why all these webserver implementations do this by default and don't allow encoded slashes. I came across this blurb:

Both Apache and Tomcat intentionally reject URIs with an encoded slash (%2F for / and %5C for ) to prevent possible security vulnerabilities such as CVE-2007-0450 and CVE-2007-0450 related attacks.

I'll look into those CVEs and think about the threat vectors

@FStelzer
Copy link
Contributor Author

FStelzer commented Feb 20, 2021 via email

@NickMeves
Copy link
Member

Hm. Let me check my patch again. My intention was not to decode the %2F in the url but pass it through as is. I believe the proxy should not touch the incoming request at all.

Yeah, I think that's the attack vector various web servers have in mind when they refuse to proxy %2F and auto-encode it to / -- attackers stuffing in %2F to try to do directory traversal to a backend layer that turns it into a / potentially where a frontend security filtering layer would miss it.

You can see that attack behavior a ton in our open redirect attack test cases: https://github.com/oauth2-proxy/oauth2-proxy/blob/master/testdata/openredirects.txt

@NickMeves
Copy link
Member

I think we need to look at the underlying Go HTTP server and see how it turns raw HTTP protocol bytes into its req.URL object we receive to better understand what is in the URL.Path and URL.RawPath.

Then if we went down this route - I think we would want a flag (e.g. something like allowEncodedSlashes that other webserver use for users to opt out of the security protections).

And finally, we would likely need our own function that only operates on encoded slashes -- I think URL.EscapedPath() will be mutating a lot of other characters looking at the source code.

I'm thinking since Go opted to turning %2F into / automatically in http.Request in their implementation, they must've been mindful of the same CVEs & security implecations that caused all other proxy servers to behave like this 🤔

@NickMeves
Copy link
Member

NickMeves commented Feb 20, 2021

I think we need to look at the underlying Go HTTP server and see how it turns raw HTTP protocol bytes into its req.URL object we receive to better understand what is in the URL.Path and URL.RawPath.

It looks like ReadRequest in net/http/request.go is the place to look. And that's usage of ParseRequestURI for getting our req.URL here that we want to consume.

@NickMeves
Copy link
Member

This seems to be Go's implementation that we want to be mindful of - its comment has the description of our scenario: https://github.com/golang/go/blob/d4b26382342c98a95b85140b2863bc30c48edd68/src/net/url/url.go#L676

@@ -51,6 +51,7 @@ type multiUpstreamProxy struct {

// ServerHTTP handles HTTP requests.
func (m *multiUpstreamProxy) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
req.URL.Path = req.URL.EscapedPath()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the implementation here: https://github.com/golang/go/blob/d4b26382342c98a95b85140b2863bc30c48edd68/src/net/url/url.go#L700

If we only want to be aware of force encoded slashes resulting in a difference during the building of URL.Path and URL.RawPath we should do this I think:

Suggested change
req.URL.Path = req.URL.EscapedPath()
if req.URL.RawPath != "" {
req.URL.Path = req.URL.EscapedPath()
}

Otherwise EscapedPath() goes down another codepath where it will do escaping on req.URL.Path even though the original request builder didn't make a RawPath because it wasn't needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about a more generic flag switching between the sanitized go Path and the escaped raw path instead of just allowEncodedSlashes?

And finally, we would likely need our own function that only operates on encoded slashes -- I think URL.EscapedPath() will be mutating a lot of other characters looking at the source code.
that's what i was trying to avoid. EscapedPath already considers a few edge cases and validates it. i don't wanna copy/reimplement this (and add new/other bugs).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about a more generic flag switching between the sanitized go Path and the escaped raw path instead of just allowEncodedSlashes?

That seems good to me - since looking at the Go implementation, we are getting forced into more than just %2F auto-decoding.

@FStelzer
Copy link
Contributor Author

interestingly i looked into adding this config flag per upstream and found the following comment:
https://github.com/oauth2-proxy/oauth2-proxy/blob/master/pkg/upstream/http.go#L147

@JoelSpeed do you remember what this referred to? this looks to me like it tried to fix the same thing?
I'm not sure about the go http/proxy stuff but i guess the client side http server part already did the rewriting/redirect of /%2F/ to / at this point?
If thats the case it would be difficult to add the setting per upstream :/

@JoelSpeed
Copy link
Member

I didn't actually add that comment, just moved it from one package to another, it was originally introduced in bitly/oauth2_proxy#17 which does seem to be trying to resolve the same issue, but maybe is now broken?

@FStelzer
Copy link
Contributor Author

maybe back then the proxy did not allow for several upstreams / static file handlers and was not using a mux at all but proxying straight through?
anyway, i can add a flag to add the desired behavior to the mux (but not individual upstreams). Is there a clever way of setting specific options in the test framework? i'm not familiar with ginkgo - but maybe there are other tests already that test flags in both directions?

@JoelSpeed
Copy link
Member

anyway, i can add a flag to add the desired behavior to the mux (but not individual upstreams). Is there a clever way of setting specific options in the test framework? i'm not familiar with ginkgo - but maybe there are other tests already that test flags in both directions?

Rather than relying on the flag, implement a new option on the Upstreams struct (you might need to change this from a list to a struct with a list inside it), then in the tests you can just set that value as you construct the new upstream proxy

@FStelzer
Copy link
Contributor Author

Sorry for the late reply. I was on vacation and offline for the last 2 weeks :)

Rather than relying on the flag, implement a new option on the Upstreams struct (you might need to change this from a list to a struct with a list inside it), then in the tests you can just set that value as you construct the new upstream proxy

Hmm. changing it to a struct would actually change the existing config file format. There might be more mux related options in the future so maybe a seperate config struct UpstreamOptions for general settings might make migration easier?

i can set these easily via the test (and already do with the simple flag i added for now). My problem was more that the test framework sets everything up first (including all the options) and then runs all the tests on the same options. What i would like would be to run a different set of tests with different options and i'm not sure how to do this ginkgo. Do i put another Describe("Proxy Suite - flagX") in the same test file?

@JoelSpeed
Copy link
Member

changing it to a struct would actually change the existing config file format.

The structured config is currently in alpha stage, so we can make breaking changes and that's ok! (I put a big red warning at the top of the docs about it! :D)

We could do something like

upstreams:
  option1: foo
  configs:
    - <existing upstreams>

Not sure about how to name this tbh, @NickMeves any thoughts on this change? Another option is upstreamConfig and then upstreams within that

Do i put another Describe("Proxy Suite - flagX") in the same test file?

Yeah i think in this case, if it is going to affect all of the upstreams, then you would want it to.

I guess the alternative to all of this, is whether this should be an option per upstream, that would then make it not a breaking change and would be easier to set up a new backend for the test suite. Does this actually effect file or static paths? Or is it really just that we need to set something in the proxy director of the http proxy?

I guess we need to work out exactly where the 301 is coming from 🤔

@JoelSpeed
Copy link
Member

Thinking about this, I was recently investigating replacing the mux with gorilla/mux, I did some work on this here #1060, I wonder if it might be easier to resolve this issue there. I notice they have a StrictSlash option, they have a SkipClean option for various routes, which we may be able to use to prevent the cleaning behaviour that is breaking these requests

@JoelSpeed
Copy link
Member

Looked into this a bit today and while it would work with gorilla mux, we can only set it for every upstream, we can't set it for individual upstreams at the moment. Though, there is a stale PR in place that would allow this to work gorilla/mux#463. I'll see if there's anything I can do to help get that PR merged

@FStelzer
Copy link
Contributor Author

changed the upstream config to a struct with the upstream list inside. the current change has no effect yet and all tests succeed (not sure why the github ci fails). If this is ok i can readd the rawpath change and tests at least as a setting for all upstreams.
when gorilla/mux supports the change per upstream the config also allows setting this then.

let me know if i can proceed. just wanted to check before adjusting the rest.

regarding the tests: there really should be a gomega wildcard/regex matcher used instead of generic errors that need to fully match. otherwise quite a few tests will keep breaking all the time (e.g. the main_test check for a broken yaml that references a specific test config line number)
I'm not familiar with ginkgo/gomega but when adding tests for the new flag i can look into it.

@FStelzer
Copy link
Contributor Author

FStelzer commented Apr 12, 2021

hm... weird... i've readded the escapedpath line and the unit tests work fine but i still get the 301 redirect from the oauth proxy....
has something about the request path changed in the meantime and we hit the same problem somewhere else as well?

EDIT: Forget what i wrote... something about my test was broken (or my browser had the redirect cached). The flag works as expected

@FStelzer FStelzer marked this pull request as ready for review April 14, 2021 15:18
@FStelzer FStelzer requested a review from a team as a code owner April 14, 2021 15:18
@JoelSpeed
Copy link
Member

Before we proceed with this, I want to spend a little time working out how this will interact with #1128 and #1060. We are basically ripping out and replacing the mux's in the regular and upstream cases, so I don't want to introduce this change now unless we can keep the behaviour when we merge these. Apologies for the delays in this, will hopefully get to fixing those up in the next couple of weeks

@FStelzer
Copy link
Contributor Author

the actual change to the upstream proxy is just a single line so i don't think there will be much of an impact but we can wait for the others to merge and i can rebase this one

@JoelSpeed
Copy link
Member

One of my concerns/thoughts here is that gorilla mux behaves very differently to the standard mux. I think it may even "just work" and not return the 301 redirects when it finds encoded characters, so we may not need this. But as I say, will keep this in mind and double check what we are doing

@github-actions
Copy link
Contributor

This pull request has been inactive for 60 days. If the pull request is still relevant please comment to re-activate the pull request. If no action is taken within 7 days, the pull request will be marked closed.

@FStelzer
Copy link
Contributor Author

FStelzer commented Aug 9, 2021

ok. i have pushed a new version against current master.
this diff changes Upstreams to a struct to enable config options for the mux and adds the first one for enabling UseEncodedPath()
The proxy_test rewrite might look a bit confusing at first. But by generating the upstreams for every test we can test different mux settings here as well.

@FStelzer FStelzer force-pushed the issue844 branch 2 times, most recently from 9c24bc2 to 13d9709 Compare August 18, 2021 07:24
@FStelzer
Copy link
Contributor Author

ok, this should fix the checked in docs. i had no idea these were generated automatically...

@github-actions github-actions bot closed this Aug 26, 2021
@FStelzer
Copy link
Contributor Author

Hm :/
Why was this closed @JoelSpeed?

@JoelSpeed JoelSpeed removed the Stale label Aug 26, 2021
@JoelSpeed
Copy link
Member

Absolutely no idea, it shouldn't have been

@JoelSpeed JoelSpeed reopened this Aug 26, 2021
@FStelzer
Copy link
Contributor Author

anything i can do to help this along?
without this patch the proxy is unusable for something like the rabbitmq admin panel.

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in reviewing this, I've been unable to get much time for this project lately due to other commitments and holidays

@@ -230,7 +236,7 @@ redirect_url="http://localhost:4180/oauth2/callback"
configContent: testCoreConfig,
alphaConfigContent: testAlphaConfig + ":",
expectedOptions: func() *options.Options { return nil },
expectedErr: errors.New("failed to load alpha options: error unmarshalling config: error converting YAML to JSON: yaml: line 49: did not find expected key"),
expectedErr: fmt.Errorf("failed to load alpha options: error unmarshalling config: error converting YAML to JSON: yaml: line %d: did not find expected key", strings.Count(testAlphaConfig, "\n")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix, thanks! :D

oauthproxy.go Show resolved Hide resolved
Comment on lines 16 to 18
// Upstream represents the configuration for an upstream server.
// Requests will be proxied to this upstream if the path matches the request path.
Configs []Upstream `json:"configs,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment should start with the name of the field, so in this case Configs instead of upstreams.

One quick question, what's a better name here in terms of UX, the three possibilities I can think of are configs, backends or servers, did you have any thoughts about why configs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we call the surrounding struct UpstreamConfig and this one Upstreams?
Easy enough to change.

@@ -8,7 +8,15 @@ const (
)

// Upstreams is a collection of definitions for upstream servers.
type Upstreams []Upstream
type Upstreams struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit weird to be called Upstreams now right? I keep reading upstreams.configs and I think it would be better if we had it as upstream.configs.

I wonder if we should rename this to Upstream and the current Upstream struct to be something like UpstreamConfig, this could be a follow up but should be done before we release I think 🤔

Comment on lines +41 to +42
// Allows for specifying settings and even individual upstreams for specific tests and uses the default upstreams/configs otherwise
upstreams := in.upstreams
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good approach here!

Comment on lines 104 to 105
//upstreamServer.ServeHTTP(rw, req)
var err error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these are needed right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. i'll remove these

Copy link
Contributor Author

@FStelzer FStelzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your review :)
I will push an updated version tomorrow i think

Comment on lines 104 to 105
//upstreamServer.ServeHTTP(rw, req)
var err error
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right. i'll remove these

@FStelzer
Copy link
Contributor Author

i just pushed a new version leaving the fixup commits in place and adding a rename commit.
it's probably easier to leave the rename on its own instead of squashing it into the one changing the struct.
if we want a really clean history i could rename it first and then change it into a struct afterwards with the new name for the member. doesn't really make a big difference though and i think this is easier to review. the fixups can of course be squashed though

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think we are nearly there, please see my comment about the user facing naming as well. Once you've fixed that, if you can make sure we have a changelog entry and the commits are squashed into logical commits then I'm happy to get this merged, thanks for working with me on this one :)

@@ -12,7 +12,7 @@ type AlphaOptions struct {
// Upstreams is used to configure upstream servers.
// Once a user is authenticated, requests to the server will be proxied to
// these upstream servers based on the path mappings defined in this list.
Upstreams Upstreams `json:"upstreams,omitempty"`
Upstreams UpstreamConfig `json:"upstreams,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming would be good to keep consistent. My last comment was primarily a UX based thing, so I'm thinking that in the yaml it would be good to be clear

usptreamConfig:
  proxyRawPath: true
  upstreams:
  - type: ...

Does that make sense?

Suggested change
Upstreams UpstreamConfig `json:"upstreams,omitempty"`
UpstreamConfig UpstreamConfig `json:"upstreamConfig,omitempty"`


// Upstreams represents the configuration for the upstream servers.
// Requests will be proxied to this upstream if the path matches the request path.
Upstreams []Upstream `json:"configs,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go with my suggestion above this would stay as upstreams in the json tag

Suggested change
Upstreams []Upstream `json:"configs,omitempty"`
Upstreams []Upstream `json:"upstreams,omitempty"`

This allows urls with encoded characters (e.g.: /%2F/) to pass to the
upstream mux instead of triggering a HTTP 301 from the frontend.
Otherwise a /%2F/test/ will result in a HTTP 301 -> /test/
This commit changes Upstreams from []Upstream to a struct{}
moving the previous []Upstream into .Configs and adjusts all uses of it.
Adding a new option to the yaml alpha config will result in failed tests
unless you manually increment the line count. This commit computes this
dynamically.
Setting this flag will configure the upstream proxy to pass encoded urls
as-is.
Refactor proxy_test to set mux/upstream options for each test
individually and add tests for encoded urls with ProxyRawPath set and
unset.
@FStelzer
Copy link
Contributor Author

ok, you are right. i missed these...
i've adjusted these as well now and also regenerated the docs, squashed my fixups and added the changelog entry.

Copy link
Member

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @FStelzer, looks great!

@JoelSpeed JoelSpeed merged commit ee7c405 into oauth2-proxy:master Sep 17, 2021
@FStelzer
Copy link
Contributor Author

Thanks @FStelzer, looks great!

Thanks a lot for your support on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants