Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage emulator fixes and added rewriteTo support #3647

Merged
merged 41 commits into from Mar 15, 2022

Conversation

rhodgkins
Copy link
Contributor

@rhodgkins rhodgkins commented Aug 6, 2021

Description

Downloading (using File#download())

When using the emulator against @google-cloud/storage, I noticed a few of the headers were missing, this mainly caused any validation on download to fail (as the emulated endpoint was missing the x-goog-hash header).

I've included some values for the other x-goog headers.

Support for rewriteTo / copyTo

This is so File#copy() is supported.

The same endpoint/logic is used for both copyTo and rewriteTo.

For copyTo it works as is - just creates a new file based on the source files data/metadata.

The same thing is done for rewriteTo, which to the end user (calling #copy()) is fine, it doesn't handle the complexity of the rewriteToken. If the rewriteTo endpoint is called with a rewriteToken a 501 status code is returned (I can't see (at least for the moment!) any issues with that as the copying is done in one go).

Flattened object storage for buckets

GCS doesn't really have directories so in the emulator if you store a file at filename and then try to store another file at filename/child.txt it would fail. I've update the Persistence to store files for a bucket in a single directory, but URL encoded the filenames.

Fixed Objects: list endpoint

This didn't work properly at all (and it didn't return prefixes), so added some test cases and updated the logic to match a real life objects: list on a bucket.
Also added basic support for pageToken and maxResults.

Scenarios Tested

See updated / added unit tests.

Sample Commands

See comments - I've given allUsers the objects.list and objects.get permissions for the bookcreator-dev-firetool-tests bucket given in the examples.

Linked issues

Fixes #3469
Fixes #3751
Fixes #3778
Fixes #3695
Fixes #3764

@google-cla google-cla bot added the cla: yes Manual indication that this has passed CLA. label Aug 6, 2021
@rhodgkins rhodgkins changed the title Emulator fixes Storage emulator fixes Aug 6, 2021
Comment on lines +308 to +393
res.setHeader("x-goog-generation", `${md.generation}`);
res.setHeader("x-goog-metadatageneration", `${md.metageneration}`);
res.setHeader("x-goog-storage-class", md.storageClass);
res.setHeader("x-goog-hash", `crc32c=${crc32cToString(md.crc32c)},md5=${md.md5Hash}`);
Copy link
Contributor Author

@rhodgkins rhodgkins Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've matched these extra headers to a real file:

curl -I 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/testing%2Fshoveler.svg?alt=media'

HTTP/2 200 
x-guploader-uploadid: ADPycdtb0wlKL1b531e7pcvB5BwHSRnLmGtHmc9rd3A2lpjNiGegBft5t_ClHASXwONq7hhh2f6n7WHs8fp4q97YweY
etag: CKf07Ir4o/ICEAE=
x-goog-hash: crc32c=t0Xs5w==,md5=Z0RBlgyhui3gitTlDJ/emA==
content-type: application/json
x-goog-generation: 1628512034961959
x-goog-metageneration: 1
x-goog-storage-class: STANDARD
content-disposition: attachment
date: Mon, 09 Aug 2021 15:26:16 GMT
vary: Origin
vary: X-Origin
cache-control: no-cache, no-store, max-age=0, must-revalidate
expires: Mon, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
content-length: 5
server: UploadServer
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

Comment on lines 483 to 493

function generateETag(bytes: Buffer): string {
const hash = crypto.createHash("sha1");
hash.update(bytes);
// Trim padding
return hash.digest("base64").slice(0, -1);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure how GCS calculates these, but I guess it doesn't matter as long as its consistent for the emulator?

For now I've based it off the etag package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GCS ensures etags are unique across resources and versions of those resources. If you simply hash the bytes, you guarantee collisions between objects with the same file content.

The generation metadata field is guaranteed to be unique across objects in GCS. It seems like that mostly holds true in the emulator today. I would recommend hashing the generation and metageneration fields here instead of the file contents.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can use a combination of those 2 things, but as the generation is based on the current timestamp, it might not be always be unique in the emulator context - it's a small possibility but could happen.

What about a combination of name, generation, metageneration and file contents?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry I think I misread unique across it objects in GCS as unique across GCS 😅
Combo of generation and metageneration sounds good 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - etag now based on generation/metadatageneration

@rhodgkins rhodgkins changed the title Storage emulator fixes Storage emulator fixes and added rewriteTo support Aug 6, 2021
Comment on lines 382 to 431
function sendObjectNotFound(req: Request, res: Response): void {
res.status(404);
const message = `No such object: ${req.params.bucketId}/${req.params.objectId}`;
if (req.method === "GET" && req.query.alt == "media") {
res.send(message);
} else {
res.json({
error: {
code: 404,
message,
errors: [
{
message,
domain: "global",
reason: "notFound",
},
],
},
});
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with (GET with alt=media):

curl -i 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/blah?alt=media'

HTTP/2 404 
x-guploader-uploadid: ADPycdst7ZSZzN7H2Yw38zD4PTx9Zu6wMutI4cgUbUWHUS366qbxzy8tklhFeSs9HeMXJj9-JWuFLhxj8cQ9X8GT88LqNALIfA
content-type: text/html; charset=UTF-8
date: Mon, 09 Aug 2021 12:22:42 GMT
vary: Origin
vary: X-Origin
expires: Mon, 09 Aug 2021 12:22:42 GMT
cache-control: private, max-age=0
content-length: 51
server: UploadServer
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

No such object: bookcreator-dev-firetool-tests/blah

Tested with (GET without alt=media):

curl -i 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/blah'

HTTP/2 404 
x-guploader-uploadid: ADPycdvIQMxtmw4ACLIf5rrG3GAMTQY7ST_w_BAZtGCGGke00EPB1bca_SAuLCeTYG9UeDx_625W-N78NwaiUtKPIjtWPU_mdw
content-type: application/json; charset=UTF-8
date: Mon, 09 Aug 2021 12:23:29 GMT
vary: Origin
vary: X-Origin
cache-control: no-cache, no-store, max-age=0, must-revalidate
expires: Mon, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
content-length: 277
server: UploadServer
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

{
  "error": {
    "code": 404,
    "message": "No such object: bookcreator-dev-firetool-tests/blah",
    "errors": [
      {
        "message": "No such object: bookcreator-dev-firetool-tests/blah",
        "domain": "global",
        "reason": "notFound"
      }
    ]
  }
}

And (non-GET):

curl -i 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/blah' -X DELETE

HTTP/2 404 
x-guploader-uploadid: ADPycdt_7t20l4J1rLO0hosF99ICr446vfIK8TBzbt0cf9nQ_eO-0Rlsc_pegIV8VzcCJd3KcYgco2gQjKnOjrzk9P1vN0PN8A
content-type: application/json; charset=UTF-8
date: Mon, 09 Aug 2021 12:22:57 GMT
vary: Origin
vary: X-Origin
cache-control: no-cache, no-store, max-age=0, must-revalidate
expires: Mon, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
content-length: 277
server: UploadServer
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

{
  "error": {
    "code": 404,
    "message": "No such object: bookcreator-dev-firetool-tests/blah",
    "errors": [
      {
        "message": "No such object: bookcreator-dev-firetool-tests/blah",
        "domain": "global",
        "reason": "notFound"
      }
    ]
  }
}

@rhodgkins
Copy link
Contributor Author

rhodgkins commented Aug 10, 2021

Test cases for @google-cloud/storages Bucket#getFiles based on querying against real life bucket with same files:

curl -i 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o'

@rhodgkins
Copy link
Contributor Author

The integration tests are only being run for pushes - is there a reason why?

if: github.event_name == 'push'

Comment on lines 726 to 820
it("should set null custom metadata values to empty strings", async () => {
const [, source] = await testBucket.upload(smallFilePath);

const file = testBucket.file(copyDestinationFile);
const metadata = { foo: "bar", nullMetadata: null };
const cacheControl = "private,max-age=10,immutable";
// Types for CopyOptions are wrong (@google-cloud/storage sub-dependency needs
// update to include https://github.com/googleapis/nodejs-storage/pull/1406
// and https://github.com/googleapis/nodejs-storage/pull/1426)
const copyOpts: CopyOptions & { [key: string]: unknown } = {
metadata,
cacheControl,
};
const [, { resource: metadata1 }] = await testBucket
.file(smallFilePath.split("/").slice(-1)[0])
.copy(file, copyOpts);

expect(metadata1).to.deep.include({
bucket: source.bucket,
contentType: source.contentType,
etag: source.etag,
crc32c: source.crc32c,
metadata: {
foo: "bar",
// Sets null metadata values to empty strings
nullMetadata: "",
},
cacheControl,
});

// Also double check with a new metadata fetch
const [metadata2] = await file.getMetadata();
expect(metadata2).to.deep.equal(metadata1);
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird behaviour but can be confirmed with:

curl -i 'https://storage.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/prefix/rewriteTo/b/bookcreator-dev-firetool-tests/o/copied' -H "authorization: Bearer $(gcloud auth print-access-token)" -H 'content-type: application/json' --data-raw '{ "metadata": { "hello": "world", "foo": null  } }' 

HTTP/2 200 
x-guploader-uploadid: ADPycdtAGoz55T_cY81t5UnGT_MZlTnOeXiBSkCyL6ydj5-1yIPxz4zkg8dPzdIkjZww3q5eKzJN7n_nnVpii6K-iw
content-type: application/json; charset=UTF-8
date: Wed, 11 Aug 2021 08:13:45 GMT
vary: Origin
vary: X-Origin
cache-control: no-cache, no-store, max-age=0, must-revalidate
expires: Mon, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
content-length: 974
server: UploadServer
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000,h3-T051=":443"; ma=2592000,h3-Q050=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,quic=":443"; ma=2592000; v="46,43"

{
  "kind": "storage#rewriteResponse",
  "totalBytesRewritten": "5",
  "objectSize": "5",
  "done": true,
  "resource": {
    "kind": "storage#object",
    "id": "bookcreator-dev-firetool-tests/copied/1628669625765894",
    "selfLink": "https://www.googleapis.com/storage/v1/b/bookcreator-dev-firetool-tests/o/copied",
    "mediaLink": "https://storage.googleapis.com/download/storage/v1/b/bookcreator-dev-firetool-tests/o/copied?generation=1628669625765894&alt=media",
    "name": "copied",
    "bucket": "bookcreator-dev-firetool-tests",
    "generation": "1628669625765894",
    "metageneration": "1",
    "storageClass": "STANDARD",
    "size": "5",
    "md5Hash": "Z0RBlgyhui3gitTlDJ/emA==",
    "crc32c": "t0Xs5w==",
    "etag": "CIaggJTDqPICEAE=",
    "timeCreated": "2021-08-11T08:13:45.767Z",
    "updated": "2021-08-11T08:13:45.767Z",
    "timeStorageClassUpdated": "2021-08-11T08:13:45.767Z",
    "metadata": {
      "hello": "world",
      "foo": ""
    }
  }
}

(Confirmed same behaviour with copyTo.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is correct, yes.

@rhodgkins rhodgkins force-pushed the emulator-fixes branch 4 times, most recently from 8f00187 to 7165bdc Compare August 19, 2021 07:38
@rhodgkins rhodgkins force-pushed the emulator-fixes branch 2 times, most recently from fb04288 to 2d030f7 Compare August 31, 2021 15:13
@rhodgkins
Copy link
Contributor Author

Sorry to pester @inlined / @taeold / @mbleigh but is there anyone who can have a look over this please?

@inlined
Copy link
Member

inlined commented Sep 2, 2021

Thanks for tagging me. I'll poke around internally to see if I can find a good reviewer for you.

@abeisgoat abeisgoat self-requested a review September 2, 2021 18:50
@abeisgoat
Copy link
Contributor

Wow awesome work! Sorry for having this slip under my radar. It's a lot to digest, but I'll dig through and send thoughts!

@rhodgkins
Copy link
Contributor Author

Thanks @abeisgoat, let me know if you want me to rebase the latest changes from master

@rhodgkins
Copy link
Contributor Author

Bump @abeisgoat & @inlined

@tonyjhuang
Copy link
Contributor

I'm checking in with @rachelmyers to see if she needs to give a pass. @sam-gc's approval is optional but please hold off on merging until Rachel chimes in, thanks.

@rachelmyers rachelmyers removed their request for review March 14, 2022 15:42
@tonyjhuang
Copy link
Contributor

tonyjhuang commented Mar 14, 2022

@rhodgkins Looks like you're good to go on approvals, thanks again for the high quality PR! Apologies for the very, very long turnaround time. We appreciate you volunteering your time to improve the Firebase Storage emulator!

@rhodgkins
Copy link
Contributor Author

Thanks! Look forward to the release containing this 👍 (I don't have permission to merge this in btw - thought I'd mention as it sounds like, from your comments, that I can click that button!)

@sam-gc
Copy link
Contributor

sam-gc commented Mar 14, 2022

Looks like it can't merge at the moment because this branch is outdated. May need a quick merge/rebase

@rhodgkins
Copy link
Contributor Author

@sam-gc / @tonyjhuang done

@tonyjhuang tonyjhuang enabled auto-merge (squash) March 14, 2022 17:37
@tonyjhuang
Copy link
Contributor

@rhodgkins Looks like an integration test failed. Please fix.

auto-merge was automatically disabled March 15, 2022 08:53

Head branch was pushed to by a user without write access

@rhodgkins
Copy link
Contributor Author

Fixed - tested locally with npm run clean && npm run test:triggers-end-to-end

@rhodgkins
Copy link
Contributor Author

I've resolved the conflicts too but ping me if there's more by the time you come back to this. (Some what frustrating having to resolve conflicts in a simple CHANGELOG each time 😬)

@tonyjhuang
Copy link
Contributor

Sounds good, running integration tests now.

@tonyjhuang tonyjhuang merged commit 9795181 into firebase:master Mar 15, 2022
@tonyjhuang
Copy link
Contributor

And it's in, thanks again for the high quality contribution, @rhodgkins 🥳

@rhodgkins
Copy link
Contributor Author

rhodgkins commented Mar 15, 2022

Thanks @tonyjhuang - any ideas on when they'll be a next a release? ignore me! Thanks again!

@tonyjhuang
Copy link
Contributor

Yes I believe this should be included in https://github.com/firebase/firebase-tools/releases/tag/v10.3.0 which we just made the cut off for :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Manual indication that this has passed CLA.
Projects
None yet
9 participants