Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@uppy/aws-s3-multipart: fix the chunk size calculation #4508

Merged
merged 8 commits into from
Jun 19, 2023

Conversation

aduh95
Copy link
Member

@aduh95 aduh95 commented Jun 16, 2023

We were using Math.ceil instead of Math.floor which ended up creating always one part too many.

Copy link
Member

@Murderlon Murderlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this correctly, a 200GB file with explicitly passing 5MB as chunk size results in 40K chunks, but then you Math.min it to 10K, only uploading 1/4 of the file?

In tus-node-server I solve it like this:

  private calcOptimalPartSize(size: number): number {
    let optimalPartSize: number

    // When upload is smaller or equal to PreferredPartSize, we upload in just one part.
    if (size <= this.preferredPartSize) {
      optimalPartSize = size
    }
    // Does the upload fit in MaxMultipartParts parts or less with PreferredPartSize.
    else if (size <= this.preferredPartSize * this.maxMultipartParts) {
      optimalPartSize = this.preferredPartSize
      // The upload is too big for the preferred size.
      // We devide the size with the max amount of parts and round it up.
    } else {
      optimalPartSize = Math.ceil(size / this.maxMultipartParts)
    }

    return optimalPartSize
  }

Murderlon

This comment was marked as resolved.

@aduh95

This comment was marked as resolved.

Copy link
Member

@Murderlon Murderlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a 3MB file, we still set the part size to 5MB (or more). In tus servers we always set it to the file size if it's under the minimum part size.

Furthermore, arraySize would be 0.6 which with Math.floor is 0.

const chunkSize = Math.max(desiredChunkSize, minChunkSize)

const arraySize = Math.ceil(fileSize / chunkSize)
if (shouldUseMultipart && fileSize > 5 * MB) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the file size is 3MB, it will use a single part because of this condition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay makes sense. Can we make 10_000 and 5 * MB global variables (or class properties). Something like maxMultipartParts and minPartSize?

Can we also add some docs above #initChunks that explains why we need to do these things?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we pull this login out into a function and make a unit test for the different file sizes and expected chunksizes?

Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
@aduh95 aduh95 added the safe to test Add this label on trustworthy PRs to spawn the e2e test suite label Jun 19, 2023
@github-actions github-actions bot removed pending end-to-end tests safe to test Add this label on trustworthy PRs to spawn the e2e test suite labels Jun 19, 2023
@aduh95 aduh95 merged commit 13da7ca into transloadit:main Jun 19, 2023
17 checks passed
@aduh95 aduh95 deleted the multipart-chunk-size branch June 19, 2023 15:13
@github-actions github-actions bot mentioned this pull request Jun 19, 2023
github-actions bot added a commit that referenced this pull request Jun 19, 2023
| Package                | Version | Package                | Version |
| ---------------------- | ------- | ---------------------- | ------- |
| @uppy/aws-s3           |   3.2.0 | @uppy/status-bar       |   3.2.0 |
| @uppy/aws-s3-multipart |   3.4.0 | @uppy/transloadit      |   3.1.6 |
| @uppy/companion        |   4.5.1 | @uppy/tus              |   3.1.1 |
| @uppy/core             |   3.2.1 | @uppy/url              |   3.3.2 |
| @uppy/dashboard        |   3.4.1 | @uppy/utils            |   5.4.0 |
| @uppy/golden-retriever |   3.0.4 | @uppy/xhr-upload       |   3.3.0 |
| @uppy/locales          |   3.2.2 | uppy                   |  3.10.0 |
| @uppy/provider-views   |   3.3.1 |                        |         |

- @uppy/aws-s3-multipart: fix the chunk size calculation (Antoine du Hamel / #4508)
- @uppy/aws-s3: add `shouldUseMultipart` option (Antoine du Hamel / #4299)
- @uppy/companion: switch from aws-sdk v2 to @aws-sdk/* (v3) (Scott Bessler / #4285)
- @uppy/companion,@uppy/core,@uppy/dashboard,@uppy/golden-retriever,@uppy/status-bar,@uppy/utils: Migrate all lodash' per-method-packages usage to lodash. (LinusMain / #4274)
- @uppy/core: Don't set late (throttled) progress event on a file that is 100% complete (Artur Paikin / #4507)
- @uppy/companion: revert randomness from file names (Mikael Finstad / #4509)
- @uppy/companion: Custom provider fixes (Mikael Finstad / #4498)
- @uppy/transloadit: ensure `fields` is not nullish when there no uploaded files (Antoine du Hamel / #4487)
- @uppy/aws-s3-multipart,@uppy/aws-s3,@uppy/tus,@uppy/utils,@uppy/xhr-upload: When file is removed (or all are canceled), controller.abort queued requests (Artur Paikin / #4504)
- @uppy/provider-views: Fix range selection not resetting and computing correctly (Terence C / #4415)
- meta: disallow use of `.only` in tests (Antoine du Hamel / #4494)
- @uppy/companion: fix 500 when file name contains non-ASCII chars (Antoine du Hamel / #4493)
- @uppy/locales: update `fr_FR.js` (Samuel De Backer / #4499)
- @uppy/aws-s3-multipart,@uppy/tus,@uppy/xhr-upload: Don't close socket while upload is still in progress (Artur Paikin / #4479)
- meta: bump `luxon` from 1.28.0 to 1.28.1 (dependabot[bot] / #4497)
- @uppy/utils: rename `EventTracker` -> `EventManager` (Stephen Wooten / #4481)
- meta: bump cookiejar from 2.1.3 to 2.1.4 (dependabot[bot] / #4496)
- meta: make `pre-commit` use `corepack yarn` instead of `npm run` (Antoine du Hamel / #4495)
- meta: bump ua-parser-js from 0.7.31 to 0.7.35 (dependabot[bot] / #4474)
- meta: bump @sideway/formula from 3.0.0 to 3.0.1 (dependabot[bot] / #4473)
- meta: bump http-cache-semantics from 4.1.0 to 4.1.1 (dependabot[bot] / #4472)
- @uppy/companion: Use filename from content-disposition instead of relying on url, with fallback (Artur Paikin / #4489)
- meta: bump `babel`, `esbuild`, and `vite` (dependabot[bot] / #4485)
- @uppy/dashboard: include the old state when setting new (Artur Paikin / #4490)
- @uppy/companion: fix companion implicitpath (Mikael Finstad / #4484)
- @uppy/companion: fix undefined protocol and example page (Mikael Finstad / #4483)
- meta: upgrade Cypress 12.9.0 -> 12.14.0 (Antoine du Hamel / #4491)
- @uppy/core: remove `state` getter from types (Antoine du Hamel / #4477)
- examples/php-xhr: Added filename sanitation and file size check before saving (neuronet77 / #4432)
- examples/php-xhr: update PHP dependencies (dependabot[bot])
- @uppy/xhr-upload: add support for arrays in metadata (Vasiliy Matyushin / #4431)
- @uppy/status-bar: Filtered ETA (stduhpf / #4458)
- @uppy/aws-s3-multipart: fix `getUploadParameters` option (Antoine du Hamel / #4465)
Murderlon added a commit that referenced this pull request Jun 20, 2023
* main: (61 commits)
  Release: uppy@3.10.0 (#4511)
  @uppy/aws-s3-multipart: fix the chunk size calculation (#4508)
  @uppy/aws-s3: add `shouldUseMultipart` option (#4299)
  @uppy/companion: switch from aws-sdk v2 to @aws-sdk/* (v3) (#4285)
  Migrate all lodash' per-method-packages usage to lodash. (#4274)
  @uppy/core: Don't set late (throttled) progress event on a file that is 100% complete (#4507)
  @uppy/companion: revert randomness from file names (#4509)
  Custom provider fixes (#4498)
  @uppy/transloadit: ensure `fields` is not nullish when there no uploaded files (#4487)
  When file is removed (or all are canceled), controller.abort queued requests (#4504)
  @uppy/provider-views: Fix range selection not resetting and computing correctly (#4415)
  meta: disallow use of `.only` in tests (#4494)
  @uppy/companion: fix 500 when file name contains non-ASCII chars (#4493)
  @uppy/locales: update `fr_FR.js` (#4499)
  Don't close socket while upload is still in progress (#4479)
  meta: bump `luxon` from 1.28.0 to 1.28.1 (#4497)
  @uppy/utils: rename `EventTracker` -> `EventManager` (#4481)
  meta: bump cookiejar from 2.1.3 to 2.1.4 (#4496)
  meta: make `pre-commit` use `corepack yarn` instead of `npm run` (#4495)
  meta: bump ua-parser-js from 0.7.31 to 0.7.35 (#4474)
  ...
Murderlon added a commit that referenced this pull request Jun 20, 2023
* main:
  Release: uppy@3.10.0 (#4511)
  @uppy/aws-s3-multipart: fix the chunk size calculation (#4508)
  @uppy/aws-s3: add `shouldUseMultipart` option (#4299)
  @uppy/companion: switch from aws-sdk v2 to @aws-sdk/* (v3) (#4285)
  Migrate all lodash' per-method-packages usage to lodash. (#4274)
  @uppy/core: Don't set late (throttled) progress event on a file that is 100% complete (#4507)
  @uppy/companion: revert randomness from file names (#4509)
  Custom provider fixes (#4498)
  @uppy/transloadit: ensure `fields` is not nullish when there no uploaded files (#4487)
  When file is removed (or all are canceled), controller.abort queued requests (#4504)
  @uppy/provider-views: Fix range selection not resetting and computing correctly (#4415)
  meta: disallow use of `.only` in tests (#4494)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants