Provide support for runtime creation of Docker-managed volumes #270

wphicks · 2018-01-31T15:53:13Z

It should be possible to create a Docker-managed volume (not a bind-mount) and use it in the context of a girder_worker instance running inside Docker that launches Docker tasks.

kotfic · 2018-01-31T15:54:55Z

This would resolve issues where the worker docker container needs to mount the host "/tmp" inside the container.

zachmullen · 2018-01-31T15:56:05Z

@kotfic that can be accomplished with bind mounts. @wphicks could you describe your use case in more detail so we know what sort of API we'll need to expose?

cjh1 · 2018-01-31T16:07:58Z

@kotfic Not sure I understand the issues with the worker docker containers needing to mount the host "/tmp". The user doesn't have to do this? It just gives them a temporary directory shared between the host and the container, which is a nice way to get results out of a container.

kotfic · 2018-01-31T16:08:37Z

So this is applicable in situations where docker is running alongside docker (e.g., girder worker running in a container and a task running in a container). Currently the TemporaryNamedVolume container tries to mount the worker container's tmp directory into the task container's tmp directory. Because the worker talks to the docker engine via a mounted socket file, it is actually talking to the docker engine running on the host machine. That means when the worker container asks for /tmp/whatever/foo.jpg to be mounted inside the task container, the request goes to the docker engine on the host, and that engine mounts the host's /tmp/whatever/foo.jpg inside the task container. This requires that the host /tmp directory be mounted at the worker container's /tmp directory, so that the host, the worker and the task all share the same /tmp directory.

Mounting the host /tmp directory into the worker /tmp directory must be performed at deploy time (for instance, here).
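To make the path mismatch concrete, here is a minimal sketch of the reasoning above. It is not girder_worker code: `host_source_for`, `bind_source_is_shared` and the `worker_mounts` mapping are hypothetical names used only for illustration. The model assumes the host engine interprets the bind source string literally as a host path, so the worker and the task see the same files only when the worker's path is backed by the identical path on the host.

```python
import posixpath


def host_source_for(worker_path, worker_mounts):
    """Return the host path backing worker_path, or None if nothing backs it.

    worker_mounts is a hypothetical mapping of host directory -> directory
    where it is bind-mounted inside the worker container.
    """
    worker_path = posixpath.normpath(worker_path)
    for host_dir, container_dir in worker_mounts.items():
        container_dir = posixpath.normpath(container_dir)
        prefix = container_dir.rstrip('/') + '/'
        if worker_path == container_dir or worker_path.startswith(prefix):
            rel = posixpath.relpath(worker_path, container_dir)
            return posixpath.normpath(posixpath.join(host_dir, rel))
    return None


def bind_source_is_shared(worker_path, worker_mounts):
    """True if a bind request for worker_path reaches the same files on the host.

    The worker sends its own path string to the host engine, so the request
    only works when the host path backing it is that very same string.
    """
    return host_source_for(worker_path, worker_mounts) == posixpath.normpath(worker_path)


# Host /tmp mounted at worker /tmp: host, worker and task agree on the path.
print(bind_source_is_shared('/tmp/whatever/foo.jpg', {'/tmp': '/tmp'}))
# No mount at all: the host's /tmp/whatever/foo.jpg is a different location.
print(bind_source_is_shared('/tmp/whatever/foo.jpg', {}))
```

In this model a worker whose /tmp is backed by some other host directory also fails the check, which is why the deploy-time mount has to map /tmp to /tmp.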

Alternatively, if we used dockerpy to create a docker-managed volume, that volume would exist on the host at /var/run/docker/.... and could be mounted at run time inside the worker container and the task container.
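As a rough sketch of what the managed-volume approach might look like with the Docker SDK for Python (dockerpy): the helper names, the `gw-` volume name prefix and the `/mnt/girder_worker/data` mount point below are assumptions for illustration, not part of girder_worker. Only `docker.from_env()`, `client.volumes.create()`, `client.containers.run()` and `Volume.remove()` are real SDK calls.

```python
import uuid


def managed_volume_spec(volume_name, container_path, mode='rw'):
    """Build the volumes dictionary dockerpy expects, keyed by volume name.

    Using a volume name (rather than a host path) as the key asks the engine
    to mount a docker-managed volume instead of a bind mount.
    """
    return {volume_name: {'bind': container_path, 'mode': mode}}


def run_with_managed_volume(image, command,
                            container_path='/mnt/girder_worker/data'):
    """Create a throwaway managed volume, run a task with it, then clean up.

    Hypothetical helper; needs the docker package and a reachable engine.
    """
    import docker  # imported lazily so the spec builder works without it

    client = docker.from_env()
    volume = client.volumes.create(name='gw-' + uuid.uuid4().hex)
    try:
        return client.containers.run(
            image, command,
            volumes=managed_volume_spec(volume.name, container_path),
            remove=True)
    finally:
        volume.remove()


print(managed_volume_spec('gw-example', '/mnt/girder_worker/data'))
```

Because the engine resolves the volume by name, the same dictionary works whether the caller is a host process or a worker container talking to the host engine through the mounted socket.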

zachmullen · 2018-01-31T16:09:57Z

Ah, understood, thanks for clarifying.

zachmullen · 2018-01-31T16:10:41Z

Is this something we may need to break the current volume-related API for?

cjh1 · 2018-01-31T16:19:19Z

@kotfic that makes sense now.

kotfic · 2018-01-31T16:23:59Z

@zachmullen No, I think this can probably be addressed without breaking any API surface. I think/hope it is just a different set of transforms and maybe some small internal things that need to change (e.g., how we make sure Volume Transforms are identified in container_args and girder_result_hooks and added to dockerpy's volume dictionary).

cjh1 · 2018-01-31T16:27:51Z

@kotfic Agreed, I think this can be encapsulated in a new transform or two.

zachmullen · 2018-01-31T16:38:33Z

If it's a different set of transforms, doesn't that mean that either

1. The client needs to know internal details about how the workers are deployed and choose to use the existing transforms or the new ones, or
2. The client should always choose the new transforms, essentially deprecating the old ones?

cjh1 · 2018-01-31T16:43:04Z

@zachmullen The existing ones would still be used in the case of bind mounting an existing host volume, a different use case.

zachmullen · 2018-01-31T16:45:56Z

@cjh1 not sure I understand, to motivate this with an example I'm currently using:

    outdir = VolumePath('__thumbnails_output__')
    return docker_run.delay(
        'zachmullen/3d_thumbnails:latest', container_args=[
            '--phi-samples', str(_PHI_SAMPLES),
            '--theta-samples', str(_THETA_SAMPLES),
            '--width', str(_SIZE),
            '--height', str(_SIZE),
            '--preset', preset,
            GirderFileIdToVolume(files[0]['_id']),
            outdir
        ], girder_job_title='Interactive thumbnail generation: %s' % item['name'],
        girder_result_hooks=[
            GirderUploadVolumePathToItem(outdir, item['_id'], upload_kwargs={
                'reference': json.dumps({'interactive_thumbnail': True})
            })
        ]).job

As a client, I don't want to have to know whether the worker is running on the host or as a container; I just want an ephemeral directory for IO. In my case I don't declare any specific volume, so my understanding is that it defaults to the temporary volume. Would this still work in both deployment architectures once this is fixed?

cjh1 · 2018-01-31T17:40:04Z

@zachmullen What I was trying to say is that the Volume transform would still be preserved for bind mounting, which is still a valid use case. In your use case (where the default temporary volume is being used) we could move to managed volumes without breaking things.

kotfic · 2018-01-31T19:19:56Z

Probably in the long run we should default to managed volumes, as they are docker's recommended way of managing container-external data. Managed volumes can work in either docker alongside docker or a regular process and shouldn't affect the API of your example.

We may need to provide some kind of flag or configuration whereby girder worker can inform its transforms that it is running inside a docker container, so that docker-specific transforms can handle behavior differently. That would be some kind of run-time configuration (i.e., when running girder-worker). At the end of the day, though, this is all speculative. There is no urgent need to implement or change anything; I just wanted to capture the somewhat complex behavior related to mounting from one container to another container through the host and identify that there may be better approaches to investigate.
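One way such a run-time flag could look, purely as a sketch: the `GIRDER_WORKER_IN_DOCKER` environment variable and both function names are hypothetical and do not exist in girder_worker. The fallback check for `/.dockerenv` is a common heuristic rather than a guaranteed signal, so an explicit setting takes precedence here.

```python
import os


def running_in_docker(environ=None, dockerenv_path='/.dockerenv'):
    """Guess whether this process runs inside a docker container.

    An explicit (hypothetical) GIRDER_WORKER_IN_DOCKER setting wins; otherwise
    fall back to checking for the marker file docker usually creates.
    """
    environ = os.environ if environ is None else environ
    flag = environ.get('GIRDER_WORKER_IN_DOCKER')
    if flag is not None:
        return flag.strip().lower() in ('1', 'true', 'yes')
    return os.path.exists(dockerenv_path)


def default_volume_kind(in_docker):
    """Pick which kind of temporary volume a transform might default to."""
    return 'managed' if in_docker else 'bind'


print(default_volume_kind(running_in_docker({'GIRDER_WORKER_IN_DOCKER': 'true'})))
```

Keeping the decision in one place like this would let client code such as the thumbnail example stay unchanged while the worker chooses the volume type for its deployment.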

kotfic assigned kotfic and cjh1 Jan 31, 2018

zachmullen added the enhancement label Jan 31, 2018

cjh1 mentioned this issue Feb 1, 2018

Rename Volume => BindMountVolume #273

Merged
