Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for index.php in subdirectories #259

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

maciekpaprocki
Copy link

#237

  • Added support for index.php in subdirectories ( without index.php being in the URL )
  • Added a bunch of tests

It's my first go PR and I wasn't able to find any contributor rules or anything like that, so it's straightforward PR. I know I am going to have to change a few things, so don't worry about bashing it :)

Copy link
Contributor

@tucksaun tucksaun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code might be necessary (but I'm not sure) but is not enough to fix the linked issue because the HTTP handler (https://github.com/symfony-cli/symfony-cli/blob/75f91e31404e/local/http/http.go#L201) will not forward the call to PHP at all.
You first need to make sure the handler detects such PHP scripts and then forward the request to PHP

@maciekpaprocki
Copy link
Author

maciekpaprocki commented Feb 3, 2023

Hey. Thanks for quick review. Just to make sure we are talking about this same. This script is now returning the correct values:

mkdir -p web/subdirectory
echo "<?php echo 'root'; " > web/index.php
echo "<?php echo 'subdirectory'; " > web/subdirectory/index.php
go run main.go server:start -d
curl localhost:8000 # returns root
curl localhost:8000/subdirectory # returns subdirectory

I think the issue you might be talking about though is that command will refuse to enable PHP if the root PHP file is not available. Is that it?

@maciekpaprocki
Copy link
Author

Actually, i think we are talking about different things.

Currently the behaviour of a server is that it will ignore PHP all together if there's no root index.php, so i didn't change anything there.

Going back to your issue. I am crap at GO so forgive me if I say something stupid, but seems that Handler is used for static files and then as a fallback it calls s.callback. The original part of the routing for other .php files is included in envs file, so i guess this one should be too.

local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
@tucksaun
Copy link
Contributor

tucksaun commented Feb 8, 2023

Hey. Thanks for quick review. Just to make sure we are talking about this same. This script is now returning the correct values:

mkdir -p web/subdirectory
echo "<?php echo 'root'; " > web/index.php
echo "<?php echo 'subdirectory'; " > web/subdirectory/index.php
go run main.go server:start -d
curl localhost:8000 # returns root
curl localhost:8000/subdirectory # returns subdirectory

I think the issue you might be talking about though is that command will refuse to enable PHP if the root PHP file is not available. Is that it?

Hum, sorry I was wrong: I misread the Handler part. Everything is fine there.

@maciekpaprocki
Copy link
Author

Great. Thanks for reviewing. I am happy for that to go. It would be good to know it works on windows too, but it shouldn't have any adverse effects if it doesn't, the feature just wouldn't be available.

local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
@maciekpaprocki maciekpaprocki force-pushed the feature/add-index-in-subdirectories branch 2 times, most recently from c777a4c to 7b158c8 Compare February 8, 2023 19:00
@maciekpaprocki
Copy link
Author

Few fixes

  • removed issues with // in URLs
  • fixed few code style issues mentioned
  • added more tests for edge cases

local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Show resolved Hide resolved
local/php/envs.go Show resolved Hide resolved
local/php/envs.go Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
@maciekpaprocki maciekpaprocki force-pushed the feature/add-index-in-subdirectories branch from c8aa21f to 2a2401d Compare February 13, 2023 14:09
@maciekpaprocki
Copy link
Author

Thanks for the fix, I think you removed the empty lines too, so I just squashed it and it's ready to go.

Thank you

local/php/envs.go Outdated Show resolved Hide resolved
Copy link
Contributor

@tucksaun tucksaun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two last minor comments.
Tomorrow I will play a bit with this to check the performance to be sure this patch does not brings a big hit

local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
@maciekpaprocki
Copy link
Author

Added fixes requested.

@maciekpaprocki
Copy link
Author

maciekpaprocki commented Feb 15, 2023

Had a quick look with ab:

mkdir -p web
echo "<?php echo 'root'; " > web/index.php
go run main.go server:start -d
ab -n 7000 -c 20 http://localhost:8000/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo/foo

Both benchmarks had pretty much similar footprint, even that this is pretty hardcore case.

Examples:

Before

Percentage of the requests served within a certain time (ms)
50% 5
66% 5
75% 5
80% 6
90% 6
95% 6
98% 7
99% 9
100% 656 (longest request)

After

Percentage of the requests served within a certain time (ms)
50% 5
66% 6
75% 6
80% 6
90% 6
95% 7
98% 8
99% 9
100% 599 (longest request)

I run it a bunch of times, and the results were not really showing any major differences. Over 7k requests in both branches had problems, but I guess it would be probably logging overflowing or some mac process kicking in.

Obviously, it's not really a reliable test, but better than nothing.

macbook pro
2.4 GHz 8-Core Intel Core i9
64 GB 2667 MHz DDR4

local/php/envs_test.go Outdated Show resolved Hide resolved
@maciekpaprocki maciekpaprocki force-pushed the feature/add-index-in-subdirectories branch from b86b610 to 18fc8ce Compare February 16, 2023 07:56
local/php/envs.go Outdated Show resolved Hide resolved
local/php/envs.go Outdated Show resolved Hide resolved
@maciekpaprocki
Copy link
Author

maciekpaprocki commented Feb 25, 2023

OK. I sent over a solution that addresses both of the issues as well as a previous issue with PHP files outside of the document root being executed.

I left comments in the code, but I think that's one of the rare situations they might actually be useful going forward.

Pretty important thing:

  • I changed some tests I added before, as they were not really correct. It would be definitely good to get the second pair of eyes on it.
  • DOCUMENT_URI is currently almost definitely wrong. If it's supposed to mimic nginx one ( it's not a standard PHP variable ) and by nginx documentation, I found it's cleared version of the URI. Should I put it in another ticket or solve it here?

Gonna fix the commits once reviews goes through, as merging them for now is actually causing problems.

EDIT:

Docs for fastcgi and document_uri in nginx:

https://www.nginx.com/resources/wiki/start/topics/examples/phpfcgi/
http://nginx.org/en/docs/http/ngx_http_core_module.html

$document_uri is same as $uri and $uri is "normalised":

"The matching is performed against a normalized URI, after decoding the text encoded in the “%XX” form, resolving references to relative path components “.” and “..”, and possible compression of two or more adjacent slashes into a single slash."

@maciekpaprocki
Copy link
Author

maciekpaprocki commented Feb 25, 2023

One more thing @tucksaun. Seen your commit:
7abd242

And it solves the execution outside of the index, but it will probably accept things like "/foo/../update.php" which is not what we want. My code looks crap though with nested if statements :)

I added it to tests.

@tucksaun
Copy link
Contributor

tucksaun commented Mar 3, 2023

Hey sorry for the delay.
Regarding DOCUMENT_URI, so far we never got any feedback about routing issues with any major PHP framework so I would rather stick with the current logic. That would be another issue anyway IMO.

Going to review the code now

@tucksaun
Copy link
Contributor

tucksaun commented Mar 3, 2023

code looks quite complicated indeed :/

what I drafted last week was something similar to this:

func (p *Server) resolveScriptName(pathInfo string) (string, string) {
	if pos := strings.Index(strings.ToLower(pathInfo), ".php"); pos != -1 {
		file := path.Clean(pathInfo[:pos+4])
		if _, err := os.Stat(filepath.Join(p.documentRoot, file)); err == nil {
			return file, pathInfo[pos+4:]
		}
	}

	for pos := len(pathInfo); pos > 1; pos-- {
		// virtually trim trailing slashes from pathinfo
		if pathInfo[pos-1] == '/' {
			continue
		}

		file := filepath.Join(pathInfo[:pos], p.passthru)
		if _, err := os.Stat(filepath.Join(p.documentRoot, file)); err == nil {
			return file, pathInfo[pos:]
		}
	}

	return p.passthru, pathInfo
}

EDIT: cleaner version

The idea is to explore the path bits by bits (parts separated by / to be more precise) starting from the right end and going upwards so that as soon as we reach a directory where an index.php file exists we stop here.
As we do the look for the / ourselves we can easily "trim' or normalize them on the go.
I can submit a PR to your fork if you wish.

Regarding the ".." we don't really care if they allow relative navigation in the path, this will not end up with a path one could have send directly.
The only important bit is that it should not let one go outside of the document root. here, because we do a filepath.Join (calling Clean) before doing the filepath.Join with document root, the path joined with document root is already safe and will not include relative bits.

The same is probably we true for repeated slashes.
As far as I know, because those are valids when dealing with file paths, webservers will accept them. But probably most web frameworks won't. So if we end up compacting the ones that are part of the script filenames I don't really care much.

@maciekpaprocki
Copy link
Author

maciekpaprocki commented Mar 5, 2023

I mean, if we don't care about the path support for ".." and "//" then we can just do a quick break.

if( pathInfo != path.Clean(pathInfo) ){
	return p.passthru, pathInfo
}

And that will clean a lot of code. Said that, not sure if we shouldn't support it, so people can test their frameworks with it to actually make sure it's resolved.

When it comes to the loop, I think having a split might be a bit easier for people to figure it out and when we actually add all the edge cases ( sub/index.php/ vs sub/index.php, // and ../ support ) I think it will end up with the similar amount of code. Current code currently also starts from the back and moves to the start, just does that from the array.

I can probably try to optimise a few things, but it would be a line or two tops if we want to keep all the features and I am not sure if it will actually simplify code.

local/php/envs_test.go Outdated Show resolved Hide resolved
local/php/envs_test.go Show resolved Hide resolved
local/php/envs_test.go Show resolved Hide resolved
local/php/envs_test.go Show resolved Hide resolved
local/php/envs_test.go Show resolved Hide resolved
@tucksaun
Copy link
Contributor

tucksaun commented Mar 7, 2023

Said that, not sure if we shouldn't support it, so people can test their frameworks with it to actually make sure it's resolved.

I'm not saying we should clean the pathinfo and not support having .. or repeated / in it, but the part of the URL leading to the script name could happen to be cleaned and we should not care IMO (and the framework behind the script as well): a typical request should not even arrive with some /../ because most HTTP clients will compact them before sending the request. Some web servers even just plainly reject requests when /../ is included in the URI.
This also means we are consistent with what will happen if one adds the script filename.
My last review goes with this logic in mind

When it comes to the loop, I think having a split might be a bit easier for people to figure it out and when we actually add all the edge cases

That's the reason why the code becomes complicated: there are a lot of conditions to deal with edge cases
But IMHO by not using spit we remove most of the edge cases and needs for string manipulations

@maciekpaprocki
Copy link
Author

OK. I had time to set up a basic wp nginx example and actually learned that all the paths will be cleaned up by default, which is completely against what I got from docs :)

I am not sure how it is with other servers, but I found some mentions of path normalisation in apache too. I guess that would mean, that if we want to make the server similar to that of two main ones we should probably always normalise path. That will remove a lot of if statements, but at this same time, it will make it not back-compatible for those not sanitised URLs currently supported. I am not really sure if that's a problem as I can see a small chance someone on purpose used ../ or // in their routing. in fact i am sure symfont does it too.

I am gonna send an update to the code tomorrow with normalising path for all requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants