Consider implementing server-side md5 #45

Open
bkanuka opened this issue Jun 22, 2021 · 0 comments
bkanuka commented Jun 22, 2021

GCP can compute the md5 hash server-side: https://googleapis.dev/python/storage/latest/blobs.html#google.cloud.storage.blob.Blob.md5_hash

hash() is not typically overridden by PyFilesystem implementers, according to: https://docs.pyfilesystem.org/en/latest/implementers.html#helper-methods. However, in this case there is arguably a performance benefit to getting the hash from the server rather than downloading the file and computing it client-side.

Note that this only works for md5 and crc32c, so the implementation could just fall back to the parent class for any other algorithm. An implementation might look like this (completely untested):

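import base64

def hash(self, path, name):
    # Only md5 is served from GCS metadata; everything else uses the parent class.
    if name.lower() == 'md5':
        _path = self.validatepath(path)
        _key = self._path_to_key(_path)
        blob = self._get_blob(_key)
        if not blob:
            raise errors.ResourceNotFound(path)
        # GCS reports md5_hash base64-encoded (and omits it for composite objects),
        # while FS.hash() is expected to return a hex digest.
        if blob.md5_hash:
            return base64.b64decode(blob.md5_hash).hex()
    return super().hash(path, name)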
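
One detail worth checking before relying on the server-side value: GCS reports `md5_hash` as a base64-encoded digest, whereas PyFilesystem's `hash()` returns a hex digest, so the two are not directly comparable. The sketch below shows the conversion using only the standard library. The helper name `gcs_md5_to_hex` is hypothetical, and the "server" value is simulated locally rather than fetched from a real bucket.

```python
import base64
import hashlib


def gcs_md5_to_hex(md5_b64: str) -> str:
    """Convert a base64-encoded MD5 digest (the form GCS reports) to hex."""
    return base64.b64decode(md5_b64).hex()


# Simulate the value the server would report for some content
# (assumption: no real blob is involved here).
data = b"hello world"
server_value = base64.b64encode(hashlib.md5(data).digest()).decode("ascii")

# After conversion, the value matches a client-side hex digest.
assert gcs_md5_to_hex(server_value) == hashlib.md5(data).hexdigest()
```

If the conversion is skipped, callers comparing the result against a locally computed `hexdigest()` would see a mismatch even when the contents are identical.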