Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Completing Multipart Uploads with Checksum That Contains Number of Parts Fails #19670

Closed
jakub300 opened this issue May 4, 2024 · 2 comments · Fixed by #19680
Closed

Comments

@jakub300
Copy link

jakub300 commented May 4, 2024

Expected Behavior

When completing multipart upload checksum in format {checksum}-{numberOfParts} should be supported.

Current Behavior

When completing multipart upload checksum in format {checksum}-{numberOfParts}, InvalidArgument: Invalid arguments provided for checksum-bug/minio-checksum-bug/2024-05-04T17:52:34.793Z.mp4: (invalid/unknown checksum sent: invalid checksum) error is returned.

Various scenarios:

  1. both s3 and minio allow completing without checksum
  2. both s3 and minio fail on invalid checkum
  3. both s3 and minio allow completing with checksum that does not contain number of parts, eg. ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=.
  4. only s3 allows completing when checkusm with number of parts as a suffix is provided, eg. ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=-2.

Possible Solution

Fix somewhere in erasure-multipart.go.

Steps to Reproduce (for bugs)

  1. Create multipart upload with ChecksumAlgorithm: "SHA256" (likely all other algorithms are also impacted)
  2. Upload parts with checksum
  3. Complete multipart upload with checksum combined with number of parts
Sample code (JS)
// tested with node@20.12.2, minio version RELEASE.2024-05-01T01-11-10Z
// save as .mjs file
// set flowing env variables before starting or here:
// process.env.S3_BUCKET_NAME = "";
// process.env.AWS_REGION = ""; // only for s3
// process.env.AWS_ACCESS_KEY_ID = "";
// process.env.AWS_SECRET_ACCESS_KEY = "";
// process.env.AWS_ENDPOINT_URL = ""; // only for minio

import {
  CompleteMultipartUploadCommand,
  CreateMultipartUploadCommand,
  ListPartsCommand,
  S3Client,
  UploadPartCommand,
} from "@aws-sdk/client-s3"; // tested with version 3.569.0
import crypto from "node:crypto";

function getChecksum(buffer) {
  const hash = crypto.createHash("sha256");
  hash.update(new Uint8Array(buffer));
  return hash.digest();
}

const video = await fetch(
  "https://github.com/minio/minio/assets/610941/2653d680-2c87-42ea-98d0-8feab199d3ef"
).then((response) => response.arrayBuffer());

const s3 = new S3Client({
  forcePathStyle: !!process.env.AWS_ENDPOINT_URL,
});

const MB_5 = 5 * 1024 * 1024;
const KEY = `minio-checksum-bug/${new Date().toISOString()}.mp4`;
const PARTS = Math.ceil(video.byteLength / MB_5);

console.log({ KEY, PARTS });

const multipartUpload = await s3.send(
  new CreateMultipartUploadCommand({
    Bucket: process.env.S3_BUCKET_NAME,
    Key: KEY,
    ChecksumAlgorithm: "SHA256",
  })
);

const uploadId = multipartUpload.UploadId;

const checksums = [];

for (let i = 0; i < PARTS; i++) {
  const start = i * MB_5;
  const end = Math.min((i + 1) * MB_5, video.byteLength);
  const body = video.slice(start, end);
  const checksum = getChecksum(body);
  const checksumBase64 = checksum.toString("base64");
  checksums.push(checksum);
  console.log({ i, checksumBase64 });

  await s3.send(
    new UploadPartCommand({
      Bucket: process.env.S3_BUCKET_NAME,
      Key: KEY,
      UploadId: uploadId,
      PartNumber: i + 1,
      Body: body,
      ChecksumSHA256: checksum.toString("base64"),
    })
  );
}

const partsList = await s3.send(
  new ListPartsCommand({
    Bucket: process.env.S3_BUCKET_NAME,
    Key: KEY,
    UploadId: uploadId,
  })
);

const checksumTotal = getChecksum(Buffer.concat(checksums));
const checksumTotalBase64 = checksumTotal.toString("base64");
console.log({ checksumTotalBase64 });

await s3.send(
  new CompleteMultipartUploadCommand({
    Bucket: process.env.S3_BUCKET_NAME,
    Key: KEY,
    UploadId: uploadId,
    MultipartUpload: {
      Parts: partsList.Parts.map((part) => ({
        PartNumber: part.PartNumber,
        ETag: part.ETag,
        ChecksumSHA256: part.ChecksumSHA256,
      })),
    },
    ChecksumSHA256: "a" + checksumTotalBase64, // GOOD does not work in both s3 and minio
    ChecksumSHA256: checksumTotalBase64, // GOOD works in both s3 and minio
    ChecksumSHA256: `${checksumTotalBase64}-${PARTS}`, // BAD works in s3, does not work in minio
  })
);
Test file storage
bbb_sunflower_1080p_30fps_normal_cut_small.mp4

Context

Not a big deal for us at this point, we can remove part numbers from the checksum, but reporting this as a it is compatibility issue.

Regression

Probably not.

Your Environment

Testing locally on Windows WSL, not really relevant for this issue.

minio version RELEASE.2024-05-01T01-11-10Z (commit-id=7926401cbd5cceaacd9509f2e50e1f7d636c2eb8)
Runtime: go1.21.9 linux/amd64
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
Copyright: 2015-2024 MinIO, Inc.
Linux KUBA-PC 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
@harshavardhana harshavardhana changed the title Completing Multipart Uploads with Checksum That Contains Number of Parts Fails feat: Completing Multipart Uploads with Checksum That Contains Number of Parts Fails May 4, 2024
@klauspost
Copy link
Contributor

Yeah. Seems like S3 added the part count later.

@klauspost
Copy link
Contributor

#19680 added.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants