Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aws-s3-multipart: make headers part indexed too in prepareUploadParts #3895

Merged
merged 11 commits into from Jul 26, 2022
2 changes: 1 addition & 1 deletion package.json
Expand Up @@ -163,7 +163,7 @@
"test:locale-packs:warnings": "yarn workspace @uppy-dev/locale-pack test warnings",
"test:type": "yarn workspaces foreach -piv --include '@uppy/*' --exclude '@uppy/{angular,react-native,locales,companion,provider-views,robodog,svelte}' exec tsd",
"test:unit": "yarn run build:lib && NODE_OPTIONS=--experimental-vm-modules jest --env jsdom",
"test:watch": "jest --env jsdom --watch",
"test:watch": "NODE_OPTIONS=--experimental-vm-modules jest --env jsdom --watch --coverage false",
Murderlon marked this conversation as resolved.
Show resolved Hide resolved
"test:size": "yarn build:lib && size-limit --why",
"test": "npm-run-all lint test:locale-packs:unused test:locale-packs:warnings test:unit test:type test:companion",
"uploadcdn": "yarn node ./bin/upload-to-cdn.js",
Expand Down
77 changes: 42 additions & 35 deletions packages/@uppy/aws-s3-multipart/src/MultipartUploader.js
Expand Up @@ -162,39 +162,48 @@ class MultipartUploader {
return
}

// For a 100MB file, with the default min chunk size of 5MB and a limit of 10:
//
// Total 20 parts
// ---------
// Need 1 is 10
// Need 2 is 5
// Need 3 is 5
const need = this.options.limit - this.partsInProgress
const completeChunks = this.chunkState.filter((state) => state.done).length
const remainingChunks = this.chunks.length - completeChunks
let minNeeded = Math.ceil(this.options.limit / 2)
if (minNeeded > remainingChunks) {
minNeeded = remainingChunks
}
if (need < minNeeded) return
const getChunkIndexes = () => {
Murderlon marked this conversation as resolved.
Show resolved Hide resolved
Murderlon marked this conversation as resolved.
Show resolved Hide resolved
// For a 100MB file, with the default min chunk size of 5MB and a limit of 10:
//
// Total 20 parts
// ---------
// Need 1 is 10
// Need 2 is 5
// Need 3 is 5
const need = this.options.limit - this.partsInProgress
const completeChunks = this.chunkState.filter((state) => state.done).length
const remainingChunks = this.chunks.length - completeChunks
let minNeeded = Math.ceil(this.options.limit / 2)
if (minNeeded > remainingChunks) {
minNeeded = remainingChunks
}
if (need < minNeeded) return []

const candidates = []
for (let i = 0; i < this.chunkState.length; i++) {
const state = this.chunkState[i]
// eslint-disable-next-line no-continue
if (state.done || state.busy) continue
const chunkIndexes = []
for (let i = 0; i < this.chunkState.length; i++) {
const state = this.chunkState[i]
// eslint-disable-next-line no-continue
if (state.done || state.busy) continue

candidates.push(i)
if (candidates.length >= need) {
break
chunkIndexes.push(i)
if (chunkIndexes.length >= need) {
break
}
}

return chunkIndexes
}
if (candidates.length === 0) return

this.#prepareUploadParts(candidates).then((result) => {
candidates.forEach((index) => {
const chunkIndexes = getChunkIndexes()

if (chunkIndexes.length === 0) return

return this.#prepareUploadParts(chunkIndexes).then((result) => {
const { presignedUrls, headers } = result

chunkIndexes.forEach((index) => {
const partNumber = index + 1
const prePreparedPart = { url: result.presignedUrls[partNumber], headers: result.headers }
const prePreparedPart = { url: presignedUrls[partNumber], headers: headers?.[partNumber] }
this.#uploadPartRetryable(index, prePreparedPart).then(() => {
this.#uploadParts()
}, (err) => {
Expand Down Expand Up @@ -238,21 +247,19 @@ class MultipartUploader {
})
}

async #prepareUploadParts (candidates) {
candidates.forEach((i) => {
async #prepareUploadParts (chunkIndexes) {
chunkIndexes.forEach((i) => {
this.chunkState[i].busy = true
})

const result = await this.#retryable({
attempt: () => this.options.prepareUploadParts({
key: this.key,
uploadId: this.uploadId,
partNumbers: candidates.map((index) => index + 1),
chunks: candidates.reduce((chunks, candidate) => ({
...chunks,
// Use the part number as the index
[candidate + 1]: this.chunks[candidate],
}), {}),
parts: chunkIndexes.map((index) => ({
number: index + 1, // Use the part number as the index
chunk: this.chunks[index],
})),
}),
})

Expand Down
35 changes: 22 additions & 13 deletions packages/@uppy/aws-s3-multipart/src/index.test.js
Expand Up @@ -62,7 +62,7 @@ describe('AwsS3Multipart', () => {
partNumber
] = `https://bucket.s3.us-east-2.amazonaws.com/test/upload/multitest.dat?partNumber=${partNumber}&uploadId=6aeb1980f3fc7ce0b5454d25b71992&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIATEST%2F20210729%2Fus-east-2%2Fs3%2Faws4_request&X-Amz-Date=20210729T014044Z&X-Amz-Expires=600&X-Amz-SignedHeaders=host&X-Amz-Signature=test`
})
return { presignedUrls }
return { presignedUrls, headers: { 1: { 'Content-MD5': 'foo' } } }
}),
})
awsS3Multipart = core.getPlugin('AwsS3Multipart')
Expand All @@ -72,25 +72,32 @@ describe('AwsS3Multipart', () => {
const scope = nock(
'https://bucket.s3.us-east-2.amazonaws.com',
).defaultReplyHeaders({
'access-control-allow-headers': '*',
'access-control-allow-method': 'PUT',
'access-control-allow-origin': '*',
'access-control-expose-headers': 'ETag',
'access-control-expose-headers': 'ETag, Content-MD5',
})
// 6MB file will give us 2 chunks, so there will be 2 PUT and 2 OPTIONS
// calls to the presigned URL from 1 prepareUploadParts calls
const fileSize = 5 * MB + 1 * MB

scope
.options((uri) => uri.includes('test/upload/multitest.dat'))
.reply(200, '')
.options((uri) => uri.includes('test/upload/multitest.dat?partNumber=1'))
.reply(function replyFn () {
expect(this.req.headers['access-control-request-headers']).toEqual('Content-MD5')
return [200, '']
})
scope
.options((uri) => uri.includes('test/upload/multitest.dat'))
.reply(200, '')
.options((uri) => uri.includes('test/upload/multitest.dat?partNumber=2'))
.reply(function replyFn () {
expect(this.req.headers['access-control-request-headers']).toBeUndefined()
return [200, '']
})
scope
.put((uri) => uri.includes('test/upload/multitest.dat'))
.put((uri) => uri.includes('test/upload/multitest.dat?partNumber=1'))
.reply(200, '', { ETag: 'test1' })
scope
.put((uri) => uri.includes('test/upload/multitest.dat'))
.put((uri) => uri.includes('test/upload/multitest.dat?partNumber=2'))
.reply(200, '', { ETag: 'test2' })

core.addFile({
Expand All @@ -115,6 +122,7 @@ describe('AwsS3Multipart', () => {
const scope = nock(
'https://bucket.s3.us-east-2.amazonaws.com',
).defaultReplyHeaders({
'access-control-allow-headers': '*',
'access-control-allow-method': 'PUT',
'access-control-allow-origin': '*',
'access-control-expose-headers': 'ETag',
Expand Down Expand Up @@ -145,11 +153,12 @@ describe('AwsS3Multipart', () => {

await core.upload()

function validatePartData ({ partNumbers, chunks }, expected) {
expect(partNumbers).toEqual(expected)
partNumbers.forEach(partNumber => {
expect(chunks[partNumber]).toBeDefined()
})
function validatePartData ({ parts }, expected) {
expect(parts.map((part) => part.number)).toEqual(expected)

for (const part of parts) {
expect(part.chunk).toBeDefined()
}
}

expect(awsS3Multipart.opts.prepareUploadParts.mock.calls.length).toEqual(3)
Expand Down
2 changes: 1 addition & 1 deletion packages/@uppy/aws-s3-multipart/types/index.d.ts
Expand Up @@ -22,7 +22,7 @@ export interface AwsS3MultipartOptions extends PluginOptions {
) => MaybePromise<AwsS3Part[]>
prepareUploadParts?: (
file: UppyFile,
partData: { uploadId: string; key: string; partNumbers: Array<number>; chunks: { [k: number]: Blob } }
partData: { uploadId: string; key: string; parts: Record<number, Blob> }
Murderlon marked this conversation as resolved.
Show resolved Hide resolved
) => MaybePromise<{ presignedUrls: { [k: number]: string }, headers?: { [k: string]: string } }>
abortMultipartUpload?: (
file: UppyFile,
Expand Down
3 changes: 1 addition & 2 deletions packages/@uppy/aws-s3-multipart/types/index.test-d.ts
Expand Up @@ -21,8 +21,7 @@ import type { AwsS3Part } from '..'
expectType<UppyFile>(file)
expectType<string>(partData.uploadId)
expectType<string>(partData.key)
expectType<Array<number>>(partData.partNumbers)
expectType<{ [k: number]: Blob }>(partData.chunks)
expectType<Record<number, Blob>>(partData.parts)
return { presignedUrls: {} }
},
abortMultipartUpload (file, opts) {
Expand Down
1 change: 0 additions & 1 deletion website/inject.js
Expand Up @@ -185,7 +185,6 @@ async function injectGhStars () {

async function injectMarkdown () {
const sources = {
'.github/ISSUE_TEMPLATE/integration_help.md': 'src/_template/integration_help.md',
Murderlon marked this conversation as resolved.
Show resolved Hide resolved
'.github/CONTRIBUTING.md': 'src/_template/contributing.md',
}

Expand Down
11 changes: 7 additions & 4 deletions website/src/docs/aws-s3-multipart.md
Expand Up @@ -109,13 +109,12 @@ A function that generates a batch of signed URLs for the specified part numbers.

* `uploadId` - The UploadID of this Multipart upload.
* `key` - The object key in the S3 bucket.
* `partNumbers` - An array of indices of this part in the file (`PartNumber` in S3 terminology). Note that part numbers are _not_ zero-based.
* `chunks` - A Javascript object with the part numbers as keys and the Blob data for each part as the value.
* `parts` - An array of objects with the part number and chunk (`Array<{ number: number, chunk: blob }>`). `number` can’t be zero.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has there been consideration for using partNumber in place of number? So it would read as Array<{ partNumber: number, chunk: blob }> instead.

It might just be me, but reading { number: number } kind of trips one up a little bit at first glance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I prefer making keys intention revealing with the context of the parent object. part.number already implies it's the part's number, wheres part.partNumber sounds like a pleonasm to me.


`prepareUploadParts` should return a `Promise` with an `Object` with keys:

* `presignedUrls` - A JavaScript object with the part numbers as keys and the presigned URL for each part as the value.
* `headers` - **(Optional)** Custom headers that should be sent to the S3 presigned URL.
* `headers` - **(Optional)** Custom headers to send along with every request per part (`{ 1: { 'Content-MD5': 'hash' }}`). These are indexed (1-based) by part number too so you can for instance send the MD5 hash validation for each part to S3.
Murderlon marked this conversation as resolved.
Show resolved Hide resolved

An example of what the return value should look like:

Expand All @@ -126,7 +125,11 @@ An example of what the return value should look like:
"2": "https://bucket.region.amazonaws.com/path/to/file.jpg?partNumber=2&...",
"3": "https://bucket.region.amazonaws.com/path/to/file.jpg?partNumber=3&..."
},
"headers": { "some-header": "value" }
"headers": {
"1": { "Content-MD5": "foo" },
"2": { "Content-MD5": "bar" },
"3": { "Content-MD5": "baz" }
}
}
```

Expand Down