feat: add upload functionality #214
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,212 @@ | ||
/* | ||
* Copyright 2020 Google LLC | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package com.google.cloud.storage; | ||
|
||
import com.google.cloud.WriteChannel; | ||
import java.io.IOException; | ||
import java.io.InputStream; | ||
import java.nio.ByteBuffer; | ||
import java.nio.channels.Channels; | ||
import java.nio.channels.ReadableByteChannel; | ||
import java.nio.file.Files; | ||
import java.nio.file.Path; | ||
|
||
/** | ||
* Utility methods to perform various operations with the Storage such as upload. | ||
* | ||
* <p>Example of uploading files from a folder: | ||
* | ||
* <pre>{@code | ||
* File folder = new File("pictures/"); | ||
* StorageUtils utils = StorageUtils.create(storage); | ||
* for (File file: folder.listFiles()) { | ||
I'm not sure why you have this example at all. It doesn't handle subdirectories.
This example demonstrates that:
To upload a folder with subfolders one should care about nuances like correct handling of symbolic links. Getting the blob name for files in subfolders requires extra code. So the example would be too complicated. |
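For reference, the "extra code" mentioned in the reply above might look roughly like the following. This is only a sketch under stated assumptions, not part of the PR: the class `DirectoryWalkSketch` and its two methods are hypothetical names, it uses only the JDK, and it simply skips symbolic links instead of resolving them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper, not part of the PR: lists regular files under a
// directory and derives a '/'-separated blob name for each of them.
final class DirectoryWalkSketch {
  private DirectoryWalkSketch() {}

  // Builds a blob name from the path of 'file' relative to 'root',
  // joining the name elements with '/' regardless of the OS separator.
  static String blobName(Path root, Path file) {
    StringBuilder name = new StringBuilder();
    for (Path part : root.relativize(file)) {
      if (name.length() > 0) {
        name.append('/');
      }
      name.append(part.toString());
    }
    return name.toString();
  }

  // Collects regular files under 'root'. Files.walk does not follow
  // symbolic links unless asked to, and NOFOLLOW_LINKS makes the filter
  // reject entries that are symbolic links themselves.
  static List<Path> regularFiles(Path root) throws IOException {
    List<Path> files = new ArrayList<>();
    try (Stream<Path> paths = Files.walk(root)) {
      paths
          .filter(p -> Files.isRegularFile(p, LinkOption.NOFOLLOW_LINKS))
          .forEach(files::add);
    }
    return files;
  }
}
```

Each path returned by `regularFiles` could then be passed to `upload(BlobInfo, Path, ...)` from this PR together with a `BlobInfo` built from `blobName(root, path)`. Whether skipping symbolic links is the right policy is a separate decision.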
||
* if (!file.isDirectory()) { | ||
* BlobInfo blobInfo = BlobInfo.newBuilder(BUCKET, file.getName()).build(); | ||
two space indents in example |
||
* try { | ||
* utils.upload(blobInfo, file.toPath()); | ||
* } catch (IOException e) { | ||
* System.err.println("Unable to upload " + file + ": " + e.getMessage()); | ||
* } | ||
* } | ||
* } | ||
* }</pre> | ||
*/ | ||
public final class StorageUtils { | ||
"Utils" classes usually don't have object state or non-static methods. Perhaps there's a better name here? Uploader or StorageUploader perhaps?
This class could be used not only for uploading; it could contain downloading or other functionality as well. Would it be okay to rename it to StorageHelper?
That works. Another possibility: StorageTransmitter. However, also consider whether the Storage field works better as a method argument.
I think it's very rare for someone to have more than one instance of the storage, so passing this instance to every method doesn't seem convenient. Methods with 4 arguments don't look as easy to understand.
StorageMover? StorageTransporter?
I like StorageOperations, actually. |
||
|
||
/** The instance of the Storage the utilities are associated with. */ | ||
public final Storage storage; | ||
important: private
This class can be used not only for upload/download operations, but for any operation with the storage. That will allow adding new functionality without extending the Storage interface and breaking the API. Maybe StorageOperations or StorageFunctions fits better? |
||
|
||
private static final int DEFAULT_BUFFER_SIZE = 15 * 1024 * 1024; | ||
private static final int MIN_BUFFER_SIZE = 256 * 1024; | ||
|
||
private StorageUtils(Storage storage) { | ||
this.storage = storage; | ||
} | ||
|
||
/** | ||
* Creates a new utility object associated with the given storage. | ||
* | ||
* @param storage the Storage | ||
* @return an instance which refers to {@code storage} | ||
*/ | ||
public static StorageUtils create(Storage storage) { | ||
This is no better than a constructor. Remove it. |
||
return new StorageUtils(storage); | ||
} | ||
|
||
/** | ||
* Uploads the given {@code path} to the blob using {@link Storage#writer}. By default any MD5 and | ||
* CRC32C values in the given {@code blobInfo} are ignored unless requested via the {@link | ||
* Storage.BlobWriteOption#md5Match()} and {@link Storage.BlobWriteOption#crc32cMatch()} options. | ||
* Folder upload is not supported. | ||
* | ||
* <p>Example of uploading a file: | ||
* | ||
* <pre>{@code | ||
* String bucketName = "my-unique-bucket"; | ||
* String fileName = "readme.txt"; | ||
* BlobId blobId = BlobId.of(bucketName, fileName); | ||
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("text/plain").build(); | ||
* StorageUtils.create(storage).upload(blobInfo, Paths.get(fileName)); | ||
* }</pre> | ||
* | ||
* @param blobInfo blob to create | ||
* @param path file to upload | ||
* @param options blob write options | ||
* @throws IOException on I/O error | ||
* @throws StorageException on failure | ||
* @see #upload(BlobInfo, Path, int, Storage.BlobWriteOption...) | ||
*/ | ||
public void upload(BlobInfo blobInfo, Path path, Storage.BlobWriteOption... options) | ||
throws IOException { | ||
upload(blobInfo, path, DEFAULT_BUFFER_SIZE, options); | ||
} | ||
|
||
/** | ||
* Uploads the given {@code path} to the blob using {@link Storage#writer} and the given {@code | ||
* bufferSize}. By default any MD5 and CRC32C values in the given {@code blobInfo} are ignored | ||
delete "the given" |
||
* unless requested via the {@link Storage.BlobWriteOption#md5Match()} and {@link | ||
* Storage.BlobWriteOption#crc32cMatch()} options. Folder upload is not supported. | ||
* | ||
* <p>{@link #upload(BlobInfo, Path, Storage.BlobWriteOption...)} invokes this one with a buffer | ||
what is "this one"? |
||
* size of 15 MiB. Users can pass alternative values. Larger buffer sizes might improve the upload | ||
* performance but require more memory. This can cause an OutOfMemoryError or add significant | ||
* garbage collection overhead. Smaller buffer sizes reduce memory consumption, which is noticeable | ||
* when uploading many objects in parallel. Buffer sizes less than 256 KiB are treated as 256 KiB. | ||
* | ||
* <p>Example of uploading a humongous file: | ||
* | ||
* <pre>{@code | ||
* BlobId blobId = BlobId.of(bucketName, blobName); | ||
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("video/webm").build(); | ||
* | ||
* int largeBufferSize = 150 * 1024 * 1024; | ||
* Path file = Paths.get("humongous.file"); | ||
* StorageUtils.create(storage).upload(blobInfo, file, largeBufferSize); | ||
* }</pre> | ||
* | ||
* @param blobInfo blob to create | ||
* @param path file to upload | ||
* @param bufferSize size of the buffer for I/O operations | ||
* @param options blob write options | ||
* @throws IOException on I/O error | ||
* @throws StorageException on failure | ||
*/ | ||
public void upload( | ||
BlobInfo blobInfo, Path path, int bufferSize, Storage.BlobWriteOption... options) | ||
throws IOException { | ||
if (Files.isDirectory(path)) { | ||
throw new StorageException(0, path + " is a directory"); | ||
} | ||
try (InputStream input = Files.newInputStream(path)) { | ||
upload(blobInfo, input, bufferSize, options); | ||
} | ||
} | ||
|
||
/** | ||
* Uploads the given {@code content} to the blob using {@link Storage#writer}. By default any MD5 | ||
"Uploads the given {@code content}" --> Reads bytes from an input stream and uploads those bytes to the blob |
||
* and CRC32C values in the given {@code blobInfo} are ignored unless requested via the {@link | ||
* Storage.BlobWriteOption#md5Match()} and {@link Storage.BlobWriteOption#crc32cMatch()} options. | ||
* | ||
* <p>Example of uploading data with CRC32C checksum: | ||
* | ||
* <pre>{@code | ||
* BlobId blobId = BlobId.of(bucketName, blobName); | ||
* byte[] content = "Hello, world".getBytes(UTF_8); | ||
StandardCharsets.UTF_8 |
||
* Hasher hasher = Hashing.crc32c().newHasher().putBytes(content); | ||
* String crc32c = BaseEncoding.base64().encode(Ints.toByteArray(hasher.hash().asInt())); | ||
* BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setCrc32c(crc32c).build(); | ||
* StorageUtils.create(storage).upload(blobInfo, new ByteArrayInputStream(content), | ||
* Storage.BlobWriteOption.crc32cMatch()); | ||
* }</pre> | ||
* | ||
* @param blobInfo blob to create | ||
* @param content content to upload | ||
* @param options blob write options | ||
* @throws IOException on I/O error | ||
* @throws StorageException on failure | ||
* @see #upload(BlobInfo, InputStream, int, Storage.BlobWriteOption...) | ||
*/ | ||
public void upload(BlobInfo blobInfo, InputStream content, Storage.BlobWriteOption... options) | ||
throws IOException { | ||
upload(blobInfo, content, DEFAULT_BUFFER_SIZE, options); | ||
} | ||
|
||
/** | ||
* Uploads the given {@code content} to the blob using {@link Storage#writer} and the given {@code | ||
* bufferSize}. By default any MD5 and CRC32C values in the given {@code blobInfo} are ignored | ||
* unless requested via the {@link Storage.BlobWriteOption#md5Match()} and {@link | ||
* Storage.BlobWriteOption#crc32cMatch()} options. | ||
* | ||
* <p>{@link #upload(BlobInfo, InputStream, Storage.BlobWriteOption...)} invokes this method | ||
* with a buffer size of 15 MiB. Users can pass alternative values. Larger buffer sizes might | ||
* improve the upload performance but require more memory. This can cause an OutOfMemoryError or | ||
* add significant garbage collection overhead. Smaller buffer sizes reduce memory consumption, | ||
* which is noticeable when uploading many objects in parallel. Buffer sizes less than 256 KiB are | ||
* treated as 256 KiB. | ||
* | ||
* @param blobInfo blob to create | ||
* @param content content to upload | ||
* @param bufferSize size of the buffer for I/O operations | ||
* @param options blob write options | ||
* @throws IOException on I/O error | ||
* @throws StorageException on failure | ||
*/ | ||
public void upload( | ||
BlobInfo blobInfo, InputStream content, int bufferSize, Storage.BlobWriteOption... options) | ||
throws IOException { | ||
try (WriteChannel writer = storage.writer(blobInfo, options)) { | ||
upload(Channels.newChannel(content), writer, bufferSize); | ||
} | ||
} | ||
|
||
/* | ||
* Uploads the given content to the storage using specified write channel and the given buffer | ||
* size. This method does not close any channels. | ||
*/ | ||
private static void upload(ReadableByteChannel reader, WriteChannel writer, int bufferSize) | ||
throws IOException { | ||
bufferSize = Math.max(bufferSize, MIN_BUFFER_SIZE); | ||
ByteBuffer buffer = ByteBuffer.allocate(bufferSize); | ||
writer.setChunkSize(bufferSize); | ||
|
||
while (reader.read(buffer) >= 0) { | ||
buffer.flip(); | ||
writer.write(buffer); | ||
buffer.clear(); | ||
} | ||
} | ||
} |
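The CRC32C example in the Javadoc of `upload(BlobInfo, InputStream, ...)` above relies on Guava (`Hashing`, `BaseEncoding`, `Ints`). As a rough alternative, the same value can be computed with the JDK alone. This is a sketch, not code from the PR: it assumes Java 9 or newer for `java.util.zip.CRC32C`, and `Crc32cSketch` is a hypothetical name. Like `Ints.toByteArray`, it encodes the 32-bit checksum in big-endian order before applying base64.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.zip.CRC32C;

// Hypothetical helper, not part of the PR: JDK-only counterpart of the
// Guava-based checksum computation in the Javadoc example.
final class Crc32cSketch {
  private Crc32cSketch() {}

  // Returns the CRC32C of 'content' as base64 of its 4 big-endian bytes.
  static String base64Crc32c(byte[] content) {
    CRC32C crc = new CRC32C();
    crc.update(content, 0, content.length);
    // getValue() holds the checksum in the low 32 bits of a long;
    // ByteBuffer writes ints in big-endian order by default.
    byte[] bigEndian = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
    return Base64.getEncoder().encodeToString(bigEndian);
  }
}
```

The result could be passed to `BlobInfo.newBuilder(blobId).setCrc32c(...)` as in the Javadoc example, with the content bytes obtained via `"Hello, world".getBytes(StandardCharsets.UTF_8)`.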
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,179 @@ | ||
/* | ||
* Copyright 2020 Google LLC | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
package com.google.cloud.storage; | ||
|
||
import static org.easymock.EasyMock.anyObject; | ||
import static org.easymock.EasyMock.createStrictMock; | ||
import static org.easymock.EasyMock.eq; | ||
import static org.easymock.EasyMock.expect; | ||
import static org.easymock.EasyMock.replay; | ||
import static org.easymock.EasyMock.verify; | ||
import static org.junit.Assert.assertEquals; | ||
import static org.junit.Assert.assertSame; | ||
import static org.junit.Assert.fail; | ||
|
||
import com.google.cloud.WriteChannel; | ||
import java.io.ByteArrayInputStream; | ||
import java.io.IOException; | ||
import java.io.InputStream; | ||
import java.nio.ByteBuffer; | ||
import java.nio.file.Files; | ||
import java.nio.file.NoSuchFileException; | ||
import java.nio.file.Path; | ||
import java.nio.file.Paths; | ||
import org.junit.After; | ||
import org.junit.Before; | ||
import org.junit.Test; | ||
|
||
public class StorageUtilsTest { | ||
private Storage storage; | ||
private StorageUtils storageUtils; | ||
|
||
private static final BlobInfo BLOB_INFO = BlobInfo.newBuilder("b", "n").build(); | ||
private static final int DEFAULT_BUFFER_SIZE = 15 * 1024 * 1024; | ||
private static final int MIN_BUFFER_SIZE = 256 * 1024; | ||
|
||
@Before | ||
public void setUp() { | ||
storage = createStrictMock(Storage.class); | ||
storageUtils = StorageUtils.create(storage); | ||
} | ||
|
||
@After | ||
public void tearDown() throws Exception { | ||
verify(storage); | ||
} | ||
|
||
@Test | ||
public void testCreate() { | ||
replay(storage); | ||
assertSame(storage, storageUtils.storage); | ||
} | ||
|
||
@Test | ||
public void testUploadFromNonExistentFile() { | ||
replay(storage); | ||
String fileName = "non_existing_file.txt"; | ||
try { | ||
storageUtils.upload(BLOB_INFO, Paths.get(fileName)); | ||
storageUtils.upload(BLOB_INFO, Paths.get(fileName), -1); | ||
fail(); | ||
} catch (IOException e) { | ||
assertEquals(NoSuchFileException.class, e.getClass()); | ||
assertEquals(fileName, e.getMessage()); | ||
} | ||
} | ||
|
||
@Test | ||
public void testUploadFromDirectory() throws IOException { | ||
replay(storage); | ||
Path dir = Files.createTempDirectory("unit_"); | ||
try { | ||
storageUtils.upload(BLOB_INFO, dir); | ||
storageUtils.upload(BLOB_INFO, dir, -2); | ||
fail(); | ||
} catch (StorageException e) { | ||
assertEquals(dir + " is a directory", e.getMessage()); | ||
} | ||
} | ||
|
||
private void prepareForUpload(BlobInfo blobInfo, byte[] bytes, Storage.BlobWriteOption... options) | ||
throws Exception { | ||
prepareForUpload(blobInfo, bytes, DEFAULT_BUFFER_SIZE, options); | ||
} | ||
|
||
private void prepareForUpload( | ||
BlobInfo blobInfo, byte[] bytes, int bufferSize, Storage.BlobWriteOption... options) | ||
throws Exception { | ||
WriteChannel channel = createStrictMock(WriteChannel.class); | ||
ByteBuffer expectedByteBuffer = ByteBuffer.wrap(bytes, 0, bytes.length); | ||
channel.setChunkSize(bufferSize); | ||
expect(channel.write(expectedByteBuffer)).andReturn(bytes.length); | ||
channel.close(); | ||
replay(channel); | ||
expect(storage.writer(blobInfo, options)).andReturn(channel); | ||
replay(storage); | ||
} | ||
|
||
@Test | ||
public void testUploadFromFile() throws Exception { | ||
byte[] dataToSend = {1, 2, 3}; | ||
prepareForUpload(BLOB_INFO, dataToSend); | ||
Path tempFile = Files.createTempFile("testUpload", ".tmp"); | ||
Files.write(tempFile, dataToSend); | ||
storageUtils.upload(BLOB_INFO, tempFile); | ||
} | ||
|
||
@Test | ||
public void testUploadFromStream() throws Exception { | ||
byte[] dataToSend = {1, 2, 3, 4, 5}; | ||
Storage.BlobWriteOption[] options = | ||
new Storage.BlobWriteOption[] {Storage.BlobWriteOption.crc32cMatch()}; | ||
prepareForUpload(BLOB_INFO, dataToSend, options); | ||
InputStream input = new ByteArrayInputStream(dataToSend); | ||
storageUtils.upload(BLOB_INFO, input, options); | ||
} | ||
|
||
@Test | ||
public void testUploadSmallBufferSize() throws Exception { | ||
byte[] dataToSend = new byte[100_000]; | ||
prepareForUpload(BLOB_INFO, dataToSend, MIN_BUFFER_SIZE); | ||
InputStream input = new ByteArrayInputStream(dataToSend); | ||
int smallBufferSize = 100; | ||
storageUtils.upload(BLOB_INFO, input, smallBufferSize); | ||
} | ||
|
||
@Test | ||
public void testUploadFromIOException() throws Exception { | ||
IOException ioException = new IOException("message"); | ||
WriteChannel channel = createStrictMock(WriteChannel.class); | ||
channel.setChunkSize(DEFAULT_BUFFER_SIZE); | ||
expect(channel.write((ByteBuffer) anyObject())).andThrow(ioException); | ||
replay(channel); | ||
expect(storage.writer(eq(BLOB_INFO))).andReturn(channel); | ||
replay(storage); | ||
InputStream input = new ByteArrayInputStream(new byte[10]); | ||
try { | ||
storageUtils.upload(BLOB_INFO, input); | ||
fail(); | ||
} catch (IOException e) { | ||
assertSame(ioException, e); | ||
} | ||
} | ||
|
||
@Test | ||
public void testUploadMultiplePortions() throws Exception { | ||
int totalSize = 400_000; | ||
int bufferSize = 300_000; | ||
byte[] dataToSend = new byte[totalSize]; | ||
dataToSend[0] = 42; | ||
dataToSend[bufferSize] = 43; | ||
|
||
WriteChannel channel = createStrictMock(WriteChannel.class); | ||
channel.setChunkSize(bufferSize); | ||
expect(channel.write(ByteBuffer.wrap(dataToSend, 0, bufferSize))).andReturn(1); | ||
expect(channel.write(ByteBuffer.wrap(dataToSend, bufferSize, totalSize - bufferSize))) | ||
.andReturn(2); | ||
channel.close(); | ||
replay(channel); | ||
expect(storage.writer(BLOB_INFO)).andReturn(channel); | ||
replay(storage); | ||
|
||
InputStream input = new ByteArrayInputStream(dataToSend); | ||
storageUtils.upload(BLOB_INFO, input, bufferSize); | ||
} | ||
} |
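The buffer handling that `testUploadSmallBufferSize` and `testUploadMultiplePortions` exercise can be tried without mocks or a real `WriteChannel`. The sketch below is an assumption-laden stand-in, not the PR's implementation: `ChunkedCopySketch` is a hypothetical name, it writes to a plain `WritableByteChannel` (so there is no `setChunkSize` call), and it adds an inner loop for partial writes that the PR's private `upload` method does not have.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical stand-in for the PR's private upload loop, JDK types only.
final class ChunkedCopySketch {
  static final int MIN_BUFFER_SIZE = 256 * 1024;

  private ChunkedCopySketch() {}

  // Requested sizes below the minimum are raised to the minimum.
  static int effectiveBufferSize(int requested) {
    return Math.max(requested, MIN_BUFFER_SIZE);
  }

  // Copies everything from 'reader' to 'writer' and returns the number of
  // bytes written. Assumes blocking channels; closes neither channel.
  static long copy(ReadableByteChannel reader, WritableByteChannel writer, int bufferSize)
      throws IOException {
    ByteBuffer buffer = ByteBuffer.allocate(effectiveBufferSize(bufferSize));
    long total = 0;
    while (reader.read(buffer) >= 0) {
      buffer.flip();
      // Defensive addition: keep writing until the buffer is drained,
      // in case the channel accepts only part of it per call.
      while (buffer.hasRemaining()) {
        total += writer.write(buffer);
      }
      buffer.clear();
    }
    return total;
  }

  // In-memory round trip for trying the loop without a real sink.
  static byte[] roundTrip(byte[] data, int bufferSize) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      copy(Channels.newChannel(new ByteArrayInputStream(data)), Channels.newChannel(out), bufferSize);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

For example, `ChunkedCopySketch.roundTrip(new byte[400_000], 300_000)` copies the same payload size that `testUploadMultiplePortions` uses, and `effectiveBufferSize(100)` mirrors the clamping checked by `testUploadSmallBufferSize`.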
"directory" might or might not be clearer here, depending on whether the reader is coming from Mac, Windows, or Linux. However, Java does call this a directory, so we should probably stick to that.