Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CI test for workspace file initialization #9855

Merged
merged 20 commits into from
May 28, 2024

Conversation

jonatanschroeder
Copy link
Member

@jonatanschroeder jonatanschroeder commented May 15, 2024

Resolves #9846.

  • Static files
  • Template files
  • Dynamic files - content in dictionary
  • Dynamic files - link to questionFile
  • Files in more than one location
  • Check if no issues are created in this scenario.
  • Question with invalid workspace_files dictionary:
    • Missing file name
    • Missing contents or questionFile
    • Invalid encoding
    • File name outside home directory
    • questionFile path outside question directory

Copy link
Contributor

github-actions bot commented May 15, 2024

All images

Image Platform Old Size New Size Change
prairielearn/executor:27572ce077e74347e1be2608a08f5ada9d9802e9 null 1187.91 MB 1187.77 MB -0.01%
prairielearn/prairielearn:27572ce077e74347e1be2608a08f5ada9d9802e9 linux/amd64 1187.74 MB 1187.77 MB 0.00%

@@ -444,7 +448,7 @@ async function initialize(workspace_id: string): Promise<InitializeResult> {
}
try {
// Discard names with directory traversal outside the home directory
if (!contains(remotePath, path.join(remotePath, file.name), false)) {
if (!contains(sourcePath, path.join(sourcePath, file.name), false)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugfix: contains requires the parent directory to be an absolute path to work as expected.

@@ -553,7 +553,7 @@ async function initialize(workspace_id: string): Promise<InitializeResult> {

if (fileGenerationErrors.length > 0) {
const output = fileGenerationErrors.map((error) => `${error.file}: ${error.msg}`).join('\n');
issues.insertIssue({
await issues.insertIssue({
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugfix: floating promise.

@jonatanschroeder jonatanschroeder marked this pull request as ready for review May 15, 2024 21:47
@jonatanschroeder
Copy link
Member Author

@nwalters512 as mentioned in the meeting, I added (in 9f87ae0) the ability for instructors to set a dynamic file name as absolute based on the workspace home directory. Can you please review that portion to make sure it looks fine?

@nwalters512
Copy link
Contributor

@jonatanschroeder for posterity, can you comment here stating why that change is being made? What's the instructor use case?

I thought about this a bit more and I have vague worries about how that would behave if an instructor changed the home directory for a question, and then a student reset their workspace. That's probably unlikely to happen in practice, but that's the kind of thing that would have worked if absolute paths to the home directory weren't used.

@jonatanschroeder
Copy link
Member Author

@jonatanschroeder for posterity, can you comment here stating why that change is being made? What's the instructor use case?

I thought about this a bit more and I have vague worries about how that would behave if an instructor changed the home directory for a question, and then a student reset their workspace. That's probably unlikely to happen in practice, but that's the kind of thing that would have worked if absolute paths to the home directory weren't used.

That is a good point. In essence, depending on the question's functionality, some interesting patterns may arise in terms of file name structure. So these files should be accepted:

  • file_name.txt (i.e., no path);
  • subdir/file_name.txt (with a path, but all within the home directory);
  • subdirA/../subdirB/file_name.txt (while unconventional, depending on the complexity of the generate process, I see no reason to deny it).

While these files should be denied, mostly because they are not mapped in the volume and should thus be explicitly listed to instructors as an error, but also to avoid problems trying to save these files:

  • ../file_name.txt;
  • /etc/file_name.txt (absolute path that has nothing to do with the home directory);
  • subdirA/../../file_name.txt (resolution goes outside the home directory).

The handling of the files above was supposed to cause an issue, but the current implementation in master causes a weird behaviour that allows it to proceed, mostly because the contains function expects the parent directory to be an absolute path, while the function was sending a relative path. This was fixed in the current implementation.

Now this leaves some fuzzy options. Assuming that the workspace home directory is /home/prairie, what should be done with directories like these?

  • ../prairie/file_name.txt (navigates outside the home directory, but resolves inside);
  • /home/prairie/file_name.txt (absolute path that includes the home directory itself).

The current implementation allows both to proceed, but I'm perfectly happy to also change it so they are both denied. I don't see a strong reason in either direction. Allowing them would not cause a significant add of functionality and the corresponding functionality is easily replaced with a similar alternative, while the rationale for denying it would only cause a problem if there is a change in home directory that does not align with the code in generate, but changing the home directory would typically also rely on corresponding changes in the workspace itself, so it's unlikely to happen often.

One alternative implementation to deny the fuzzy options above would be to replace path.resolve with path.normalize, and then deny if path.isAbsolute or if the normalized path starts with ... This could be implemented as a special case of contains if the parent directory is null.

@nwalters512
Copy link
Contributor

I would lean towards being as restrictive as possible by default, which will also push people towards doing what is probably the most correct thing. That is: their paths should always be relative to whatever the home directory might be. Your suggestion to reject absolute paths entirely sounds like a good one.

Copy link

codecov bot commented May 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.30%. Comparing base (c5ecd8a) to head (27572ce).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9855      +/-   ##
==========================================
+ Coverage   67.00%   67.30%   +0.30%     
==========================================
  Files         456      456              
  Lines       71318    71363      +45     
  Branches     5714     5751      +37     
==========================================
+ Hits        47787    48033     +246     
+ Misses      23102    22903     -199     
+ Partials      429      427       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jonatanschroeder
Copy link
Member Author

I would lean towards being as restrictive as possible by default, which will also push people towards doing what is probably the most correct thing. That is: their paths should always be relative to whatever the home directory might be. Your suggestion to reject absolute paths entirely sounds like a good one.

Done in 1ab3637. I've extended the functionality of the path-utils package. However I tried to add a changeset to it but it didn't list path-utils as an option. Is this expected?

@nwalters512
Copy link
Contributor

path-utils is a private package, meaning it won't ever be published to npm:

Since it's not published, it doesn't enforce a changelog.

Thanks for making the changes, I'll review them next week!

@jonatanschroeder
Copy link
Member Author

@nwalters512 just making sure this doesn't fall through the cracks, I'm looking for a review of the changes done after the approval above.

@@ -3,13 +3,20 @@ import path from 'path';
/**
* Returns true if the parent path contains the child path. Used to allow code to make checks that
* prevent directory traversal attacks.
* @param parentPath The path of the parent directory. Assumed to be absolute.
* @param childPath The path of the child directory. Usually absolute, but if relative, resolved in relation to the parent directory.
* @param parentPath The path of the parent directory. If provided, assumed to be absolute. If set to null, the child path is normalized and checked against an arbitrary parent directory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually know what it means for a path to be "checked against an arbitrary parent directory". It feels like we're trying to make contains do something it's not meant to do; I also worry that allowing null values here makes it easier for someone to use this function incorrectly.

Do we need another function that performs a different kind of check, ideally one that only takes a single string argument instead of two?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a separate isContainedRelativePath function in f39a440.

apps/prairielearn/src/lib/workspace.ts Outdated Show resolved Hide resolved
@jonatanschroeder jonatanschroeder added this pull request to the merge queue May 28, 2024
Merged via the queue into master with commit c5d4bbd May 28, 2024
9 checks passed
@jonatanschroeder jonatanschroeder deleted the ci-workspace-dynamic-files branch May 28, 2024 22:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add CI tests for dynamic workspace files
2 participants