Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization seems to contain wrong data #46

Open
dalu93 opened this issue May 12, 2020 · 5 comments
Open

Serialization seems to contain wrong data #46

dalu93 opened this issue May 12, 2020 · 5 comments

Comments

@dalu93
Copy link

dalu93 commented May 12, 2020

Hi,

I've been trying using this library (version 4.0.0-beta.2) together with AsyncHTTPClient to execute a multipart POST request, where I need to upload a file.

The expected request should look like the following

> POST /rest/api/content/595596971/child/attachment/609652887/data HTTP/1.1
> Host: myHost.com
> Authorization: Basic myBasic
> User-Agent: insomnia/7.1.1
> X-Atlassian-Token: nocheck
> Content-Type: multipart/form-data; boundary=X-INSOMNIA-BOUNDARY
> Accept: */*
> Content-Length: 656151

| --X-INSOMNIA-BOUNDARY
| Content-Disposition: form-data; name="file"; filename="picture.jpg"
| Content-Type: image/jpeg
It continues. I cut it because irrelevant.

This request works well through Insomnia and I tried to convert it to Swift using MultipartKit as below

import MultipartKit
import Foundation

let boundary = "MY-BOUNDARY"
let fileUrl = URL(fileURLWithPath: "/Users/myUser/Desktop/picture.jpg")
let fileData = try! Data(contentsOf: fileUrl)
var multipart = MultipartPart(body: fileData)
multipart.headers.replaceOrAdd(
    name: "Content-Disposition",
    value: "form-data; name=\"file\"; filename=\"picture.jpg\""
)
multipart.headers.replaceOrAdd(
    name: "Content-Type",
    value: "image/jpeg"
)

let body = try MultipartSerializer().serialize(
    parts: [multipart],
    boundary: boundary
)

print(Data(body.utf8).count) // 1130517

Unfortunately the data count (1130517) is quite bigger than expected (see Content-Length from Insomnia output).

I've tried my own implementation of this simple case using Foundation and I actually am able to create the correct payload

let multipartData = try MultipartHelper().multipart(
    from: fileUrl.path,
    partName: "file",
    boundary: boundary
)

print(multipartData.count) // 656135

And MultipartHelper.swift

import Foundation

struct MultipartHelper {
    enum Error: Swift.Error {
        case fileDoesNotExist
        case unsupportedFileType(String)
    }

    func multipart(from filePath: String, partName: String, boundary: String) throws -> Data {
        guard FileManager.default.fileExists(atPath: filePath) else {
            throw Error.fileDoesNotExist
        }

        let fileUrl = URL(fileURLWithPath: filePath)
        let fileName = fileUrl.lastPathComponent
        let fileExtension = fileUrl.pathExtension
        guard !fileName.isEmpty, !fileExtension.isEmpty else {
            throw Error.fileDoesNotExist
        }

        guard let contentType = ContentType(fileExt: fileExtension) else {
            throw Error.unsupportedFileType(fileName)
        }

        let fileData: Data
        do {
            fileData = try Data(contentsOf: fileUrl)
        } catch {
            throw Error.fileDoesNotExist
        }

        let headersPart = Data("""
        --\(boundary)\r\n\
        Content-Disposition: form-data; name="\(partName)"; filename="\(fileName)"\r\n\
        Content-Type: \(contentType.header)\r\n\
        \r\n
        """.utf8)
        let footerPart = Data("""
        \r\n--\(boundary)--\r\n
        """.utf8)

        var data = Data()
        data.append(headersPart)
        data.append(fileData)
        data.append(footerPart)

        return data
    }
}

enum ContentType {
    case png, jpeg, gif

    init?(fileExt: String) {
        switch fileExt.lowercased() {
        case "png":
            self = .png

        case "jpeg", "jpg":
            self = .jpeg

        case "gif":
            self = .gif

        default:
            return nil
        }
    }

    var header: String {
        switch self {
        case .gif:
            return "image/gif"

        case .jpeg:
            return "image/jpeg"

        case .png:
            return "image/png"
        }
    }
}

Am I somehow using the library incorrectly?

@tanner0101
Copy link
Member

tanner0101 commented May 18, 2020

Hi @dalu93, unfortunately this library's 4.0.0 version is no longer being maintained. It was merged into vapor/vapor a while ago: https://github.com/vapor/vapor/tree/master/Sources/Vapor/Multipart.

Eventually my goal is to split the package back out again and formally pitch this to the swift server work group's package index. Until then, you can access the MultipartParser and MultipartSerializer as part of the Vapor module.

That said, I'm not sure why the length is about 2x longer via MultipartKit here. Seeing the full output (attached as a .txt or something) could help. If this bug exists in the vapor/vapor version of this code I'm happy to fix it.

@dalu93
Copy link
Author

dalu93 commented May 20, 2020

Hi @tanner0101,

Vapor is not an option for us since we are planning to create a very small CLI tool and importing the whole Vapor product is going to be heavy and useless.

I created, however, an example repository where you can see the error: https://github.com/dalu93/multipartkit-issue
When running through Xcode, set the environment variable FILE_PATH to your own file path (the file is included in the project as well)

@dalu93
Copy link
Author

dalu93 commented Jun 4, 2020

Hi @tanner0101 have you had chance to have a look at it? Unfortunately I cannot run Vapor locally because I'm on MacOS Mojave (10.14) and Vapor only supports 10.15+.

I was anyway able to verify few things:

There is a difference in terms of bytes between the two methods, using MultipartKit the data is almost 2x the data I get from using my small helper.
So I tried to replicate as much as possible what MultipartKit is doing

  1. I created my multipartData using my small helper
let multipartData = try MultipartHelper().multipart(
    from: fileUrl.path,
    partName: "file",
    boundary: boundary
)
  1. I created a new ByteBuffer and I wrote the data there
var buffer = ByteBufferAllocator().buffer(capacity: 0)
buffer.writeBytes(multipartData)
  1. I converted the buffer to a String
let bufferString = String(decoding: buffer.readableBytesView, as: UTF8.self)
  1. I expected to have multipartData.count equal to Data(bufferString.utf8).count
print(Data(bufferString.utf8).count == multipartData.count) // got `false`

When I compare the string result of this operation (bufferString) to the string I get directly using MultipartSerializer, they are identical

let body = try MultipartSerializer().serialize(
    parts: [multipart],
    boundary: boundary
)

print(body == bufferString) // got `true`

@t-ae
Copy link
Contributor

t-ae commented Apr 9, 2021

There are two serialization methods:

public func serialize(parts: [MultipartPart], boundary: String) throws -> String {
var buffer = ByteBufferAllocator().buffer(capacity: 0)
try self.serialize(parts: parts, boundary: boundary, into: &buffer)
return String(decoding: buffer.readableBytesView, as: UTF8.self)
}

public func serialize(parts: [MultipartPart], boundary: String, into buffer: inout ByteBuffer) throws {
for part in parts {
buffer.writeString("--")
buffer.writeString(boundary)
buffer.writeString("\r\n")
for (key, val) in part.headers {
buffer.writeString(key)
buffer.writeString(": ")
buffer.writeString(val)
buffer.writeString("\r\n")
}
buffer.writeString("\r\n")
var body = part.body
buffer.writeBuffer(&body)
buffer.writeString("\r\n")
}
buffer.writeString("--")
buffer.writeString(boundary)
buffer.writeString("--\r\n")
}

The former method you are using uses String internally so it can corrupt input data.
You have to use the latter.

@0xTim
Copy link
Member

0xTim commented May 7, 2021

@dalu93 are you seeing this in the latest release? We've rewritten the parser in Swift so hopefully this issue has been resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants