Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chunked file uploads #6421

Open
wants to merge 7 commits into
base: develop-minor
Choose a base branch
from
Open

Conversation

distantnative
Copy link
Member

@distantnative distantnative commented Apr 28, 2024

This PR …

  • New Api\Upload class
    • Takes over all the nested code from Api\Api::upload() to make it more easily to read and test
    • Adds support for uploads in chunks

Features

Enhancements

  • New uploadAsChunks JS helper function

Refactored

  • New Kirby\\Api\Upload class to handle file uploads via the REST API

Ready?

  • In-code documentation (wherever needed)
  • Unit tests for fixed bug/feature
  • Tests and checks all pass

For review team

  • Squash commits
  • Add changes & docs to release notes draft in Notion

@distantnative distantnative added the type: enhancement ✨ Suggests an enhancement; improves Kirby label Apr 28, 2024
@distantnative distantnative self-assigned this Apr 28, 2024
@afbora
Copy link
Member

afbora commented Apr 30, 2024

I've tried to upload a EXE file. Kirby 4.2.0 throws a error The extension "exe" is not allowed but nothing happens for this PR. Seems something broken.

@afbora
Copy link
Member

afbora commented Apr 30, 2024

Also I realized that file is still remains when I cancel the upload. Should be deleted. But should be careful when file replacing.

@distantnative
Copy link
Member Author

@afbora We cannot delete the tmp file on cancel. But we clean up the files with a garbage collection.

@distantnative distantnative marked this pull request as ready for review May 2, 2024 09:57
src/Panel/Panel.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
config/api/routes/files.php Outdated Show resolved Hide resolved
panel/src/helpers/upload.js Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
config/api/routes/files.php Outdated Show resolved Hide resolved
@lukasbestle lukasbestle added the needs: two-person review 🧑‍🤝‍🧑 PR must only be merged with two approvals label May 2, 2024
@lukasbestle
Copy link
Member

Adding the "two-person review" label as I realized we need to be super careful about the security implications here. There's a lot that could go horribly wrong if we miss anything. I think we are heading in a great direction though 👍

@distantnative
Copy link
Member Author

@lukasbestle makes sense. And as written in the initial post, I think we should have a few review rounds as well. Now getting the concept right, later adding unit tests etc. before final review.

@afbora
Copy link
Member

afbora commented May 3, 2024

We cannot delete the tmp file on cancel. But we clean up the files with a garbage collection.

@distantnative What if we upload to the temp folder, does it make sense? If upload successful, we can move the file to the model root, if not garbage collector take care.

@distantnative
Copy link
Member Author

@afbora That's what we are doing here. Just not the system temp folder as that doesn't persist across requests. So the first parts aren't there anymore when the last chunk is received.

@tobimori
Copy link
Contributor

tobimori commented May 3, 2024

Is it theoretically possible to add a config option that will limit the file size?

@afbora
Copy link
Member

afbora commented May 3, 2024

I've found few issues, could you confirm that?

  • File not uploaded without error or log (dialog closed and no file uploaded,) with big files. For me, ~100 MB success, ~300 MB fails.
  • I'm getting A file with the name "filename.zip" already exists error when big files upload from textarea field

@distantnative
Copy link
Member Author

@afbora the PR currently is not functional as it's in the middle of implementing changes based on Lukas' feedback and some troubles I encountered (see our comments discussion). Sorry that you tried, but I think any upload fails currently that needs more than one chunk.

@afbora
Copy link
Member

afbora commented May 3, 2024

@distantnative No problem. I'll test again after PR is ready.

@lukasbestle
Copy link
Member

Is it theoretically possible to add a config option that will limit the file size?

Good point. I think it would make sense to make this an option for the file blueprint.

@distantnative
Copy link
Member Author

@lukasbestle @tobimori are you thinking of something different than the existing accept maxsize option?

@tobimori
Copy link
Contributor

tobimori commented May 5, 2024

I didn't know there was an option already. Consider my request as solved 😀

@lukasbestle
Copy link
Member

Also forgot about maxsize. 😆

@distantnative
Copy link
Member Author

@lukasbestle added some more validations – skipping MIME types (will still checked for the full file), but at least basic ones as allowed extensions.

src/Api/Upload.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
panel/src/helpers/upload.js Outdated Show resolved Hide resolved
panel/src/helpers/upload.js Outdated Show resolved Hide resolved
@lukasbestle
Copy link
Member

lukasbestle commented May 15, 2024

About maxsize: We need to ensure that we check for that in three ways:

  • The frontend should check for it before even uploading the first chunk (avoids user frustration). Maybe this already happens.
  • The backend should also ensure it by checking the Upload-Length against it.
  • Also there should be a validation for the actual uploaded chunk size. So the size of the current tmp file + the size of the temporary chunk file provided by PHP should be compared to maxsize before streaming over the chunk to our tmp file. Otherwise one could provide an Upload-Length with an allowed size (e.g. 5 MB) but send a chunk with 100 MB. The file would then already take up double that size on the server before Kirby finally validates the size in $page->createFile().

@distantnative
Copy link
Member Author

@lukasbestle I am not sure we can follow your suggestions for maxsize. This would require us parsing the blueprints on each of those requests (Panel view, chunk upload). And given that our blueprints currently are always fully resolved (sadly still), this could create real performance issues. Because of that, I think until we resolve blueprints more lazily, we have to stick to the (not ideal) behavior that the accept rules only get evaluated at the end of the full upload.

@lukasbestle
Copy link
Member

About maxsize: You are right that it won't work out to do the first check in the frontend as we would need to provide all possible maximum sizes for all allowed file blueprints before we even know if a file will be uploaded at all. However during an active upload, this is different: There I feel like the performance impact of checking the maxsize is negligible compared to the duration of the upload (unless the uploader is somewhere in Switzerland with a symmetric 1 Gbit/s connection 😆).

However with these checks we could prevent attackers from stuffing our custom temp dir with insanely large partially uploaded files that are so far prevented by the blueprint. At the moment, PHP will accept the file first, store it to the system temp folder and then Kirby decides not to copy it elsewhere. So the damage is limited to the upload_max_filesize for a brief moment in time (PHP auto-cleans uploaded files that were not copied during the request). With chunked uploads, an attacker could however initiate an upload of a 1 GB file and upload 999 MB of that. Those 999 MB will reside on the server until cleanup. This could lead to denial of service if done until the disk on the server or quota is full.

@distantnative
Copy link
Member Author

@lukasbestle I added a check on the backend for maxsize: both, checking the total length from the header as well as existing tmp plus chunk size.

Copy link
Member

@lukasbestle lukasbestle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I feel like the logic is really solid now. My remaining comments are just minor fixes and refactorings.

@afbora I think now is the time when you can test it in detail as behavior changes are not too be expected anymore.

src/Api/Upload.php Outdated Show resolved Hide resolved
config/api/routes/files.php Outdated Show resolved Hide resolved
src/Api/Upload.php Outdated Show resolved Hide resolved
tests/Api/UploadTest.php Outdated Show resolved Hide resolved
lukasbestle
lukasbestle previously approved these changes May 25, 2024
Copy link
Member

@lukasbestle lukasbestle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, that's much better. 👍

Code-wise I think this is great now. To be on the safe side, it would be good to get another code review by someone uninvolved (@bastianallgeier if you have the time) and also @afbora's testing skills will help us a lot.

@distantnative
Copy link
Member Author

Would be great indeed. I've rebased and squashed the commits already.

@distantnative distantnative requested a review from afbora May 26, 2024 09:30
@afbora
Copy link
Member

afbora commented May 27, 2024

@distantnative I'm getting A file with the name "filename.zip" already exists error when big files upload from textarea field. Same file uploads without error from files section.

@distantnative
Copy link
Member Author

@afbora good catch. Can reproduce myself. Somehow already after the first chunk they get moved to the content folder. But need a fresher mind to investigate why.

@lukasbestle
Copy link
Member

Maybe a different route processes these so the Upload-Length header is ignored and the new chunk logic on the backend doesn't get executed?

@distantnative
Copy link
Member Author

Indeed it is. So I am thinking to actually move the Upload::chunk() call here: https://github.com/getkirby/kirby/blob/main/src/Api/Api.php#L697

So any place using $api->upload() would work with it.

Furthermore, I think I would like to move most of the Code from Api::upload() to the new Upload class. I think that could als help us to simplify that code into several single-purpose methods, e.g. Upload::response(), Upload::error(), Upload::filename() ...

@lukasbestle
Copy link
Member

Makes sense to me. The Api::upload() method is indeed a spaghetti monster at the moment.

Behavior would be that Api::upload() would check for chunks in the place you linked to and only call the callback on the last chunk when the upload is complete, right?

@distantnative
Copy link
Member Author

Yes, exactly. Have a version working locally. It really helps breaking this up but also having it all in the one class then.

Now only need to add unit tests to all new parts of the Upload class. Bit of work but also is nice cause we'll cover most of this with proper unit tests now.

@distantnative distantnative added type: feature 🎉 Adds a feature (requests → feedback.getkirby.com) and removed type: enhancement ✨ Suggests an enhancement; improves Kirby labels Jun 1, 2024
@distantnative distantnative added this to the 4.4.0 milestone Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs: two-person review 🧑‍🤝‍🧑 PR must only be merged with two approvals type: feature 🎉 Adds a feature (requests → feedback.getkirby.com)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants