Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java Storage NIO: FileSystemProvider::checkAccess fails on '/' with StorageException #1062

Closed
nielsbasjes opened this issue Nov 22, 2022 · 15 comments · Fixed by #1065, #1066 or #1067
Closed
Assignees
Labels
api: storage Issues related to the googleapis/java-storage-nio API. status: investigating The issue is under investigation, which is determined to be non-trivial.

Comments

@nielsbasjes
Copy link
Contributor

Using com.google.cloud:google-cloud-nio:0.124.20 the CloudStorageFileSystemProvider implementation of FileSystemProvider.checkAccess fails when asking about the root directory.

Take this code fragment and fill in a cloud storage bucket name you have access to:

public class App {

    private static final Logger LOG = LoggerFactory.getLogger(App.class);

    public static void main(String[] args) {

        String bucketName = "xxxxxxxxxxxx";

        FileSystem fileSystem = FileSystems.getFileSystem(URI.create("gs://" + bucketName));

        Path path = fileSystem.getPath("/");

        FileSystemProvider provider = fileSystem.provider();
        try {
            provider.checkAccess(path, AccessMode.READ, AccessMode.WRITE);
        }
        catch (Exception e) {
            LOG.error("{}", e);
        }
        LOG.warn("{}", path);
    }
}

Stack trace

NOTE: I have hidden the real bucket name.

com.google.cloud.storage.StorageException: Invalid field selection bucket
	at com.google.cloud.storage.StorageException.translate(StorageException.java:163)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:291)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:517)
	at com.google.cloud.storage.StorageImpl.lambda$get$6(StorageImpl.java:298)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:103)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	at com.google.cloud.storage.Retrying.run(Retrying.java:60)
	at com.google.cloud.storage.StorageImpl.run(StorageImpl.java:1476)
	at com.google.cloud.storage.StorageImpl.get(StorageImpl.java:296)
	at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.checkAccess(CloudStorageFileSystemProvider.java:732)
	at nl.basjes.bugreports.App.main(App.java:31)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
GET https://storage.googleapis.com/storage/v1/b/xxxxxxxxxxxx/o/?fields=bucket,name,id&projection=full
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "location" : "fields",
    "locationType" : "parameter",
    "message" : "Invalid field selection bucket",
    "reason" : "invalidParameter"
  } ],
  "message" : "Invalid field selection bucket"
}
	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:439)
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:525)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:466)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:576)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.get(HttpStorageRpc.java:514)
	... 9 more

External references such as API reference guides

Any additional information below

I ran into this problem when "simply" trying to attach Apache Mina SFTP to a Google Cloud Bucket.
Note that this all does work as expected on

  • existing files/directories
  • non-existing files/directories

Only on '/' it fails.

@product-auto-label product-auto-label bot added the api: storage Issues related to the googleapis/java-storage-nio API. label Nov 22, 2022
@nielsbasjes
Copy link
Contributor Author

Note:
Connecting to the bucket via this yields the exact same result:

FileSystem fileSystem = CloudStorageFileSystem.forBucket(bucketName);

@tritone tritone added the status: investigating The issue is under investigation, which is determined to be non-trivial. label Nov 23, 2022
@BenWhitehead
Copy link
Collaborator

Hi @nielsbasjes,

Thanks for filing the issue. I've been able to replicate your findings and add a new test to the test suite. Unfortunately, I don't yet have a fix I can post. I do have a workaround which might be able to tide you over until I can get a fix.

Initial Investigation Findings

The error appears to be partially happening in the path parser logic and how it gets translated to the actual call to GCS. In this case having the trailing slash -- or solo slash -- the parser thinks there is an object part to the URI, but there isn't when it comes time to query GCS which fails to receive an object id.

Workaround

In this case if you remove the trailing slash, the method is able to complete without exception.

    Path path = Paths.get(URI.create("gs://bucket-name"));
    FileSystem fs = FileSystems.getFileSystem(uri);
    FileSystemProvider provider = fs.provider();
    provider.checkAccess(path, AccessMode.READ, AccessMode.WRITE);
Background

GCS doesn't actually have directories, so all directory specific operations in this library are either emulated, or optimistically lie. So in the case of asking the files system if write access is enabled for a directory gs://bucket/object/ the code will optimistically not error since it's impossible to query if write permissions are granted without actually writing an object.

@nielsbasjes
Copy link
Contributor Author

Hi,
Please check the changes I wrote as a workaround
https://github.com/nielsbasjes/java-storage-nio/tree/FSPCheckAccess

This fixes the root issue and also has a quick fix for being able to get the attributes. This also includes a hack to provide a faked set of posix attributes because apparently many libraries assume the entire world is a posix world

Let me know what you think.

@nielsbasjes
Copy link
Contributor Author

Note that with the fixes I wrote I can now simply tell the Apache Mina SFTP server to use a gs://something as the filesystem and I'm now able to do common things via sftp like listing, changing directory and getting a file.
So this is a works on my machine and you are free to copy this into this project I you like it.

@nielsbasjes
Copy link
Contributor Author

Some additional info to help you verify what I created.
Make a basic Java project with dependency org.apache.sshd:sshd-sftp:2.9.1 and the google-cloud-nio with my changes.
Running this class below creates an SFTP server on the provided bucket name on localhost:2222 using public key authentication and your local authorized_keys file.

With an ssh agent running with my key and my authorized_keys also having my public key;
I can do this: sftp -v -P2222 nbasjes@localhost and list and get the files in the bucket I have provided.

public class CloudStorageSftp {

    private static final Logger LOG = LoggerFactory.getLogger(CloudStorageSftp.class);

    public static void main(String[] args) throws IOException, InterruptedException {

        if (args.length == 0) {
            LOG.error("Must specify the bucket to connect to.");
            return;
        }

        String bucketName = args[0];

        try(SshServer server = SshServer.setUpDefaultServer()) {
            server.setPort(2222);
            server.setKeyPairProvider(new SimpleGeneratorHostKeyProvider(new File("my.pem").toPath()));

            SftpSubsystemFactory sftpSubsystemFactory = new SftpSubsystemFactory();
            sftpSubsystemFactory.setFileSystemAccessor(SftpFileSystemAccessor.DEFAULT);
            server.setSubsystemFactories(Collections.singletonList(sftpSubsystemFactory));
            server.setFileSystemFactory(new FileSystemFactory() {
                @Override
                public Path getUserHomeDir(SessionContext session) {
                    return Path.of("/");
                }

                @Override
                public FileSystem createFileSystem(SessionContext session) {
                    return FileSystems.getFileSystem(URI.create("gs://" + bucketName));
                }
            });
            server.start();

            // Only for testing, 10000 seconds is fine.
            sleep(10000000L);
        }
    }
}

@nielsbasjes
Copy link
Contributor Author

@BenWhitehead is there something I can do to get to a fixed implementation?

@BenWhitehead
Copy link
Collaborator

Apologies for the delay.

I think the changes you've made in your branch in general make sense. Would you be willing to open a PR to this repo for inclusion?

If you are willing, I think the smoothest path to merging things would be to break into 3 PRs each with commensurate tests:

  1. The fix for the exception being thrown this issue is for
  2. Adding the implementation for CloudStorageFileSystemProvider#readAttributes limited to basic and gcs
  3. Adding support for posix presentation
    • Merging 1 and 2 above should be quite easy as they are primarily plumbing, adding posix support I think will be a bit more hairy as there are many subtle things that will need to be thought through for the permissions and user/group masquerading.

I believe 1 and 2 above could be opened and reviewed at the same time, while 3 would depend on 2.

@nielsbasjes
Copy link
Contributor Author

Yes, I'll get on it.
Two questions:

  1. Do I need to sign a contributor agreement somewhere?
  2. I have not been able to trigger the exception in the mocked environment. Mainly because the error is given by the actual API behind it. How should I create a test for this?

@BenWhitehead
Copy link
Collaborator

Great, thanks ahead of time for the contribution!

Once you open your first PR our CLA bot will reply with instructions.

If you want to write a test that hits the actual service, feel free to add to ITGcsNio. It will be picked up and run in our integration tests environment with access to the production service. To run the integration suite locally, you'll need some form of credentials (either Application Default Credentials via gcloud or a service account JSON file with it's full path exported as GOOGLE_APPLICATION_CREDENTIALS env var), then you can run it through IntelliJ directly or by running mvn -Penable-integration-tests verify.

@nielsbasjes
Copy link
Contributor Author

@BenWhitehead can you please approve me running the workflows? Thanks.

@nielsbasjes
Copy link
Contributor Author

@BenWhitehead

With the 3 patches I have submitted combined and used in the minimal SFTP server I posted earlier I now see this when I list the files in my test cloud storage instance.

sftp> ls -la
drwxrwx---   1 fakeowner fakegroup        1 Jan  1  1970 .
-rw-rw----   1 fakeowner fakegroup        0 Nov 18 14:30 dummy
-rw-rw----   1 fakeowner fakegroup     2587 Nov 18 14:14 niels_pom.xml
-rw-rw----   1 fakeowner fakegroup      924 Nov 24 20:30 sftp-server-config.yaml
drwxrwx---   1 fakeowner fakegroup        1 Jan  1  1970 test

This is for me a totally acceptable answer.

@nielsbasjes
Copy link
Contributor Author

For easier testing: https://github.com/nielsbasjes/MinimalSftpCloudStorage

gcf-merge-on-green bot pushed a commit that referenced this issue Dec 7, 2022
…1065)

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- [x] Make sure to open an issue: https://togithub.com/googleapis/java-storage-nio/issues/1062
- [x] Ensure the tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)

Fixes #1062 ☕️
@BenWhitehead
Copy link
Collaborator

A v0.125.0 was cut a couple days ago with the first two PRs. I'll cut another release early next week to include the final one.

@BenWhitehead
Copy link
Collaborator

v0.126.0 has been cut with POSIX emulation

@nielsbasjes
Copy link
Contributor Author

v0.126.0 has been cut with POSIX emulation

Awesome. Thanks for all your help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the googleapis/java-storage-nio API. status: investigating The issue is under investigation, which is determined to be non-trivial.
Projects
None yet
3 participants