Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ETag should be a hash #2948

Closed
Joannis opened this issue Jan 30, 2023 · 11 comments · Fixed by #3015
Closed

ETag should be a hash #2948

Joannis opened this issue Jan 30, 2023 · 11 comments · Fixed by #3015
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@Joannis
Copy link
Member

Joannis commented Jan 30, 2023

Describe the bug

Currently, ETag for file serving is generated from the modified date and file size. A hash is a stronger alternative, and can be precomputed if available. See #2943.

@Joannis Joannis added the bug Something isn't working label Jan 30, 2023
@0xTim 0xTim added help wanted Extra attention is needed good first issue Good for newcomers labels Jan 31, 2023
@VaporBot VaporBot added this to To do in Beginner Issues Jan 31, 2023
@VaporBot VaporBot added this to To do in Help Wanted Issues Jan 31, 2023
@jhoughjr
Copy link

"ETag should be a hash of the contents (not just (lastModificationTime)-(fileLength))" re precomputing, precomputing when?

@jhoughjr
Copy link

if the hash is to be based on the file contents, what algorithm should be used?

@jhoughjr
Copy link

currently im testing an md5 hash on the contents.

@jhoughjr
Copy link

I dont think this is teastable.

@Joannis
Copy link
Member Author

Joannis commented Feb 1, 2023

I think that's mostly up for debate / deferred to the end user implementation. Many people prefer not using MD5, but rather SHA1, or a modern standard like SHA2 (whichever variant). And since hashes vary depending on the user and their infrastructure, I'd say cached hashes should not be mandated to use a specific algorithm.

I think it would be nice to support non-cached hashing out of the box, however that might be a two-step process. I'd love us to have an algorithm option in the case of non-cached hashes such as etag: .md5 or etag: .sha1

@jhoughjr
Copy link

jhoughjr commented Feb 1, 2023

can do that tonight I think.

@linus-hologram
Copy link
Contributor

I am going to give this a try as my first vapor contribution :)

@linus-hologram
Copy link
Contributor

As discussed on discord, I'll try to compute a hash whenever data is written to the file system and store it in a file attribute. As for the hashing technique, SHA-1, SHA-256 or similar will be used.

@linus-hologram
Copy link
Contributor

We're using a different approach now. Creating the hash solely when a file is written to the file system using Vapor turned out to be unviable since that would not create any file hashes for manually added files.

Instead, the hash is now generated whenever a file is streamed back to the client, provided the hash is not already cached.

@linus-hologram
Copy link
Contributor

@0xTim will the PR #3015 be merged some time soon?

@linus-hologram
Copy link
Contributor

@0xTim any update on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
Development

Successfully merging a pull request may close this issue.

4 participants