feat: add upload functionality #214

Merged: 3 commits, Apr 1, 2020

Changes from 2 commits
StorageUtils.java
@@ -0,0 +1,212 @@
/*
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.google.cloud.storage;

import com.google.cloud.WriteChannel;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

/**
* Utility methods to perform various operations with the Storage such as upload.
*
* <p>Example of uploading files from a folder:

Contributor: "directory" might or might not be clearer here, depending on whether the reader is coming from Mac, Windows, or Linux. However, Java does call this a directory, so we should probably stick to that.

*
* <pre>{@code
* File folder = new File("pictures/");
* StorageUtils utils = StorageUtils.create(storage);
* for (File file: folder.listFiles()) {

Contributor: I'm not sure why you have this example at all. It doesn't handle subdirectories.

Contributor Author: This example demonstrates that:

  • a StorageUtils instance can be reused
  • upload of folders is not supported

To upload a folder with subfolders, one should care about nuances like the correct handling of symbolic links, and getting the blob name for files in subfolders requires extra code, so the example would be too complicated. I can remove this one.
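
For reference, a rough sketch of what such a recursive upload could look like, assuming the StorageUtils API from this PR; it uses Files.walk (which does not follow symbolic links by default) and Path.relativize to derive blob names for files in subfolders. The class and method names are illustrative only, not part of the change.

import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageUtils;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.stream.Stream;

class RecursiveUploadSketch {
  // Walks the tree rooted at "root", skips symbolic links, and preserves the
  // subfolder structure in the blob names by relativizing against the root.
  static void uploadTree(Storage storage, String bucket, Path root) throws IOException {
    StorageUtils utils = StorageUtils.create(storage);
    try (Stream<Path> paths = Files.walk(root)) {
      for (Path file : (Iterable<Path>) paths::iterator) {
        if (Files.isRegularFile(file, LinkOption.NOFOLLOW_LINKS)) {
          String blobName = root.relativize(file).toString().replace('\\', '/');
          BlobInfo blobInfo = BlobInfo.newBuilder(bucket, blobName).build();
          utils.upload(blobInfo, file);
        }
      }
    }
  }
}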

* if (!file.isDirectory()) {
* BlobInfo blobInfo = BlobInfo.newBuilder(BUCKET, file.getName()).build();

Contributor: two space indents in example

* try {
* utils.upload(blobInfo, file.toPath());
* } catch (IOException e) {
* System.err.println("Unable to upload " + file + ": " + e.getMessage());
* }
* }
* }
* }</pre>
*/
public final class StorageUtils {

Contributor: "Utils" classes usually don't have object state or non-static methods. Perhaps there's a better name here? Uploader or StorageUploader, perhaps?

Contributor Author: This class could be used not only for uploading; it could contain downloading or other functionality as well. Would it be okay to rename it to StorageHelper?

Contributor: That works. Another possibility: StorageTransmitter.

However, also consider whether the Storage field works better as a method argument.

Contributor Author: I think it's a very rare case that someone has more than one instance of the storage, so passing this instance to every method doesn't seem convenient, and methods with four arguments don't look as easy to understand. Having a class above the storage will certainly give some freedom to extend the API without breaking compatibility. To me, 'Transmitter' sounds like something close to the physical layer, and 'Helper' is very generic. Maybe StorageOperations or StorageFunctions fits better?

Contributor: StorageMover? StorageTransporter?

Contributor: I like StorageOperations, actually.
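
For illustration only, a minimal sketch of the two call styles being discussed; the static variant and its signature are hypothetical and not part of this PR.

import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageUtils;
import java.io.IOException;
import java.nio.file.Path;

class CallStyleSketch {
  // Style in this PR: the Storage instance is captured once and reused for every call.
  static void upload(Storage storage, BlobInfo blobInfo, Path path) throws IOException {
    StorageUtils utils = StorageUtils.create(storage);
    utils.upload(blobInfo, path);
    // Alternative raised in review (hypothetical, not in this PR): pass Storage to each
    // call, e.g. a static StorageUtils.upload(storage, blobInfo, path), which grows
    // every method signature by one argument.
  }
}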


/** The instance of the Storage the utilities are associated with. */
public final Storage storage;

Contributor: important: private

Contributor Author:

> StorageMover? StorageTransporter?

This class can be used not only for upload/download operations but for any operation with the storage. That will allow adding new functionality without extending the Storage interface and breaking the API. Maybe StorageOperations or StorageFunctions fit better?


private static final int DEFAULT_BUFFER_SIZE = 15 * 1024 * 1024;
private static final int MIN_BUFFER_SIZE = 256 * 1024;

private StorageUtils(Storage storage) {
this.storage = storage;
}

/**
* Creates a new utility object associated with the given storage.
*
* @param storage the Storage
* @return an instance which refers to {@code storage}
*/
public static StorageUtils create(Storage storage) {

Contributor: This is no better than a constructor. Remove it.

return new StorageUtils(storage);
}

/**
* Uploads the given {@code path} to the blob using {@link Storage#writer}. By default any MD5 and
* CRC32C values in the given {@code blobInfo} are ignored unless requested via the {@link
* Storage.BlobWriteOption#md5Match()} and {@link Storage.BlobWriteOption#crc32cMatch()} options.
* Folder upload is not supported.
*
* <p>Example of uploading a file:
*
* <pre>{@code
* String bucketName = "my-unique-bucket";
* String fileName = "readme.txt";
* BlobId blobId = BlobId.of(bucketName, fileName);
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("text/plain").build();
* StorageUtils.create(storage).upload(blobInfo, Paths.get(fileName));
* }</pre>
*
* @param blobInfo blob to create
* @param path file to upload
* @param options blob write options
* @throws IOException on I/O error
* @throws StorageException on failure
* @see #upload(BlobInfo, Path, int, Storage.BlobWriteOption...)
*/
public void upload(BlobInfo blobInfo, Path path, Storage.BlobWriteOption... options)
throws IOException {
upload(blobInfo, path, DEFAULT_BUFFER_SIZE, options);
}

/**
* Uploads the given {@code path} to the blob using {@link Storage#writer} and the given {@code
* bufferSize}. By default any MD5 and CRC32C values in the given {@code blobInfo} are ignored

Contributor: delete "the given"

* unless requested via the {@link Storage.BlobWriteOption#md5Match()} and {@link
* Storage.BlobWriteOption#crc32cMatch()} options. Folder upload is not supported.
*
* <p>{@link #upload(BlobInfo, Path, Storage.BlobWriteOption...)} invokes this one with a buffer

Contributor: what is "this one"?

* size of 15 MiB. Users can pass alternative values. Larger buffer sizes might improve the upload
* performance but require more memory. This can cause an OutOfMemoryError or add significant
* garbage collection overhead. Smaller buffer sizes reduce memory consumption, which is noticeable
* when uploading many objects in parallel. Buffer sizes less than 256 KiB are treated as 256 KiB.
*
* <p>Example of uploading a humongous file:
*
* <pre>{@code
* BlobId blobId = BlobId.of(bucketName, blobName);
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("video/webm").build();
*
* int largeBufferSize = 150 * 1024 * 1024;
* Path file = Paths.get("humongous.file");
* StorageUtils.create(storage).upload(blobInfo, file, largeBufferSize);
* }</pre>
*
* @param blobInfo blob to create
* @param path file to upload
* @param bufferSize size of the buffer for I/O operations
* @param options blob write options
* @throws IOException on I/O error
* @throws StorageException on failure
*/
public void upload(
BlobInfo blobInfo, Path path, int bufferSize, Storage.BlobWriteOption... options)
throws IOException {
if (Files.isDirectory(path)) {
throw new StorageException(0, path + " is a directory");
}
try (InputStream input = Files.newInputStream(path)) {
upload(blobInfo, input, bufferSize, options);
}
}
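
For illustration (not part of this change), a minimal sketch of the smaller-buffer case described in the Javadoc above: many objects uploaded in parallel with the 256 KiB minimum buffer to keep per-upload memory low. The class name and pool size are arbitrary assumptions.

import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageUtils;
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelUploadSketch {
  // Uploads many files concurrently with a 256 KiB buffer; buffer sizes below
  // 256 KiB are raised to 256 KiB by the upload methods anyway.
  static void uploadAll(Storage storage, String bucket, List<Path> files) throws InterruptedException {
    StorageUtils utils = StorageUtils.create(storage);
    ExecutorService executor = Executors.newFixedThreadPool(8);
    for (Path file : files) {
      executor.submit(
          () -> {
            BlobInfo blobInfo = BlobInfo.newBuilder(bucket, file.getFileName().toString()).build();
            try {
              utils.upload(blobInfo, file, 256 * 1024);
            } catch (IOException e) {
              System.err.println("Unable to upload " + file + ": " + e.getMessage());
            }
          });
    }
    executor.shutdown();
    executor.awaitTermination(1, TimeUnit.HOURS);
  }
}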

/**
* Uploads the given {@code content} to the blob using {@link Storage#writer}. By default any MD5

Contributor: "Uploads the given {@code content}" --> Reads bytes from an input stream and uploads those bytes to the blob

* and CRC32C values in the given {@code blobInfo} are ignored unless requested via the {@link
* Storage.BlobWriteOption#md5Match()} and {@link Storage.BlobWriteOption#crc32cMatch()} options.
*
* <p>Example of uploading data with CRC32C checksum:
*
* <pre>{@code
* BlobId blobId = BlobId.of(bucketName, blobName);
* byte[] content = "Hello, world".getBytes(UTF_8);

Contributor: StandardCharsets.UTF_8

* Hasher hasher = Hashing.crc32c().newHasher().putBytes(content);
* String crc32c = BaseEncoding.base64().encode(Ints.toByteArray(hasher.hash().asInt()));
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setCrc32c(crc32c).build();
* StorageUtils.create(storage).upload(blobInfo, new ByteArrayInputStream(content),
* Storage.BlobWriteOption.crc32cMatch());
* }</pre>
*
* @param blobInfo blob to create
* @param content content to upload
* @param options blob write options
* @throws IOException on I/O error
* @throws StorageException on failure
* @see #upload(BlobInfo, InputStream, int, Storage.BlobWriteOption...)
*/
public void upload(BlobInfo blobInfo, InputStream content, Storage.BlobWriteOption... options)
throws IOException {
upload(blobInfo, content, DEFAULT_BUFFER_SIZE, options);
}
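
For convenience, a self-contained version of the CRC32C example from the Javadoc above, spelling out the Guava imports it assumes (Hashing, BaseEncoding, Ints) and using StandardCharsets.UTF_8 as suggested in the review. The class and method names are illustrative only.

import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageUtils;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;
import com.google.common.io.BaseEncoding;
import com.google.common.primitives.Ints;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class Crc32cUploadSketch {
  // Computes the CRC32C of the payload, stores it on the BlobInfo, and asks the
  // service to verify it by passing Storage.BlobWriteOption.crc32cMatch().
  static void uploadWithChecksum(Storage storage, String bucketName, String blobName) throws IOException {
    byte[] content = "Hello, world".getBytes(StandardCharsets.UTF_8);
    Hasher hasher = Hashing.crc32c().newHasher().putBytes(content);
    String crc32c = BaseEncoding.base64().encode(Ints.toByteArray(hasher.hash().asInt()));

    BlobId blobId = BlobId.of(bucketName, blobName);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setCrc32c(crc32c).build();
    StorageUtils.create(storage)
        .upload(blobInfo, new ByteArrayInputStream(content), Storage.BlobWriteOption.crc32cMatch());
  }
}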

/**
* Uploads the given {@code content} to the blob using {@link Storage#writer} and the given {@code
* bufferSize}. By default any MD5 and CRC32C values in the given {@code blobInfo} are ignored
* unless requested via the {@link Storage.BlobWriteOption#md5Match()} and {@link
* Storage.BlobWriteOption#crc32cMatch()} options.
*
* <p>{@link #upload(BlobInfo, InputStream, Storage.BlobWriteOption...)} invokes this method
* with a buffer size of 15 MiB. Users can pass alternative values. Larger buffer sizes might
* improve the upload performance but require more memory. This can cause an OutOfMemoryError or
* add significant garbage collection overhead. Smaller buffer sizes reduce memory consumption,
* which is noticeable when uploading many objects in parallel. Buffer sizes less than 256 KiB are
* treated as 256 KiB.
*
* @param blobInfo blob to create
* @param content content to upload
* @param bufferSize size of the buffer for I/O operations
* @param options blob write options
* @throws IOException on I/O error
* @throws StorageException on failure
*/
public void upload(
BlobInfo blobInfo, InputStream content, int bufferSize, Storage.BlobWriteOption... options)
throws IOException {
try (WriteChannel writer = storage.writer(blobInfo, options)) {
upload(Channels.newChannel(content), writer, bufferSize);
}
}

/*
* Uploads the given content to the storage using specified write channel and the given buffer
* size. This method does not close any channels.
*/
private static void upload(ReadableByteChannel reader, WriteChannel writer, int bufferSize)
throws IOException {
// Enforce the 256 KiB minimum and make the write channel flush in chunks of the same size.
bufferSize = Math.max(bufferSize, MIN_BUFFER_SIZE);
ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
writer.setChunkSize(bufferSize);

// Copy until end of stream: fill the buffer, flip it for reading, write it out, then clear it for the next read.
while (reader.read(buffer) >= 0) {
buffer.flip();
writer.write(buffer);
buffer.clear();
}
}
}
StorageUtilsTest.java
@@ -0,0 +1,179 @@
/*
* Copyright 2020 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package com.google.cloud.storage;

import static org.easymock.EasyMock.anyObject;
import static org.easymock.EasyMock.createStrictMock;
import static org.easymock.EasyMock.eq;
import static org.easymock.EasyMock.expect;
import static org.easymock.EasyMock.replay;
import static org.easymock.EasyMock.verify;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertSame;
import static org.junit.Assert.fail;

import com.google.cloud.WriteChannel;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class StorageUtilsTest {
private Storage storage;
private StorageUtils storageUtils;

private static final BlobInfo BLOB_INFO = BlobInfo.newBuilder("b", "n").build();
private static final int DEFAULT_BUFFER_SIZE = 15 * 1024 * 1024;
private static final int MIN_BUFFER_SIZE = 256 * 1024;

@Before
public void setUp() {
storage = createStrictMock(Storage.class);
storageUtils = StorageUtils.create(storage);
}

@After
public void tearDown() throws Exception {
verify(storage);
}

@Test
public void testCreate() {
replay(storage);
assertSame(storage, storageUtils.storage);
}

@Test
public void testUploadFromNonExistentFile() {
replay(storage);
String fileName = "non_existing_file.txt";
try {
storageUtils.upload(BLOB_INFO, Paths.get(fileName));
storageUtils.upload(BLOB_INFO, Paths.get(fileName), -1);
fail();
} catch (IOException e) {
assertEquals(NoSuchFileException.class, e.getClass());
assertEquals(fileName, e.getMessage());
}
}

@Test
public void testUploadFromDirectory() throws IOException {
replay(storage);
Path dir = Files.createTempDirectory("unit_");
try {
storageUtils.upload(BLOB_INFO, dir);
storageUtils.upload(BLOB_INFO, dir, -2);
fail();
} catch (StorageException e) {
assertEquals(dir + " is a directory", e.getMessage());
}
}

private void prepareForUpload(BlobInfo blobInfo, byte[] bytes, Storage.BlobWriteOption... options)
throws Exception {
prepareForUpload(blobInfo, bytes, DEFAULT_BUFFER_SIZE, options);
}

private void prepareForUpload(
BlobInfo blobInfo, byte[] bytes, int bufferSize, Storage.BlobWriteOption... options)
throws Exception {
WriteChannel channel = createStrictMock(WriteChannel.class);
ByteBuffer expectedByteBuffer = ByteBuffer.wrap(bytes, 0, bytes.length);
channel.setChunkSize(bufferSize);
expect(channel.write(expectedByteBuffer)).andReturn(bytes.length);
channel.close();
replay(channel);
expect(storage.writer(blobInfo, options)).andReturn(channel);
replay(storage);
}

@Test
public void testUploadFromFile() throws Exception {
byte[] dataToSend = {1, 2, 3};
prepareForUpload(BLOB_INFO, dataToSend);
Path tempFile = Files.createTempFile("testUpload", ".tmp");
Files.write(tempFile, dataToSend);
storageUtils.upload(BLOB_INFO, tempFile);
}

@Test
public void testUploadFromStream() throws Exception {
byte[] dataToSend = {1, 2, 3, 4, 5};
Storage.BlobWriteOption[] options =
new Storage.BlobWriteOption[] {Storage.BlobWriteOption.crc32cMatch()};
prepareForUpload(BLOB_INFO, dataToSend, options);
InputStream input = new ByteArrayInputStream(dataToSend);
storageUtils.upload(BLOB_INFO, input, options);
}

@Test
public void testUploadSmallBufferSize() throws Exception {
byte[] dataToSend = new byte[100_000];
prepareForUpload(BLOB_INFO, dataToSend, MIN_BUFFER_SIZE);
InputStream input = new ByteArrayInputStream(dataToSend);
int smallBufferSize = 100;
storageUtils.upload(BLOB_INFO, input, smallBufferSize);
}

@Test
public void testUploadFromIOException() throws Exception {
IOException ioException = new IOException("message");
WriteChannel channel = createStrictMock(WriteChannel.class);
channel.setChunkSize(DEFAULT_BUFFER_SIZE);
expect(channel.write((ByteBuffer) anyObject())).andThrow(ioException);
replay(channel);
expect(storage.writer(eq(BLOB_INFO))).andReturn(channel);
replay(storage);
InputStream input = new ByteArrayInputStream(new byte[10]);
try {
storageUtils.upload(BLOB_INFO, input);
fail();
} catch (IOException e) {
assertSame(e, ioException);
}
}

@Test
public void testUploadMultiplePortions() throws Exception {
int totalSize = 400_000;
int bufferSize = 300_000;
byte[] dataToSend = new byte[totalSize];
dataToSend[0] = 42;
dataToSend[bufferSize] = 43;

WriteChannel channel = createStrictMock(WriteChannel.class);
channel.setChunkSize(bufferSize);
expect(channel.write(ByteBuffer.wrap(dataToSend, 0, bufferSize))).andReturn(1);
expect(channel.write(ByteBuffer.wrap(dataToSend, bufferSize, totalSize - bufferSize)))
.andReturn(2);
channel.close();
replay(channel);
expect(storage.writer(BLOB_INFO)).andReturn(channel);
replay(storage);

InputStream input = new ByteArrayInputStream(dataToSend);
storageUtils.upload(BLOB_INFO, input, bufferSize);
}
}