Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource requirements #179

Open
manthey opened this issue Jul 7, 2022 · 10 comments
Open

Resource requirements #179

manthey opened this issue Jul 7, 2022 · 10 comments

Comments

@manthey
Copy link
Member

manthey commented Jul 7, 2022

Currently, we have a somewhat hacky method to indicate a job needs GPU. It would be better if there was an optional command for docker modules like the --list_cli command that would emit a json with requirements that the docker image needs to run.

@andreped
Copy link

andreped commented Feb 22, 2023

@manthey This is likely related to the issue I mentioned in the HistomicsTK gitter. Gitter name andreped.

Basically, I would need access to a GPU AND X11, as I want to use a python module that performs OpenCL/OpenGL compute and can run inference on the GPU. Note that I only need X11 for OpenCL/OpenGL compute and not rendering stuff within the docker image itself. For DSA, I figured that you could add X11 by adding this - /tmp/.X11-unix:/tmp/.X11-unix to the volumes attribute, e.g., like this.

But I think I'm running into an issue as all Tasks that are performed using the HistomicsTK plugin (I'm making my own custom one), uses slicer_cli_web which does not seem to provide the arguments I want to docker run. An example of what I could want to do is:

docker run --rm -it --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix fastImage python3 -c "import fast"

As there is no X11, OpenCL is not available, hence, no OpenCL platforms, and importing pyFAST results in RuntimeError: clGetPlatformIDs.

But you mentioned a hacky solution to get GPU working? What is that hack specifically? Could a similar idea be used to add X11 support as well? As I believe this is somewhat related to what you are saying above, it felt natural to make a comment here. I hope that is OK.

@andreped
Copy link

Down the rabbit hole I came across this. Is that the hacky way? Could give it a shot.

Anyways, he seems to have exactly my issue:
I want to set --gpus all when create container ,but found docker-py not support this param.

Which then in DSA when Im trying to import CLI results in this error.

@andreped
Copy link

Or maybe it was this hack. Anyways, to get this working with DSA, wouldn't I need to add this fix to the slicer_cli_web or similar (like here), and then get that into my own DSA fork, build my own DSA image, and then maybe it works? Has to be a better solution. I will take a small break from this. Hopefully you have an idea.

@manthey
Copy link
Member Author

manthey commented Feb 22, 2023

You can specify some additional parameters that get passed to the docker when it is started via json specification of the CLIs. This (badly) documented on the readme of this repo:

 "Example3": {
    "type": "python",
    // docker-params is a dictionary of parameters passed to the docker API
    // when the docker container is created and run.  Not all possible tags
    // are passed through.  See the docker python module for options:
    // https://docker-py.readthedocs.io/en/stable/containers.html
    "docker-params": {
      "ipc_mode": "host"
    }
  }

I think this method would fail for passing volumes because (without trying it) it would either result in two volumes parameters to the run command or would get replaced by the volumes we normally mount. It also requires the dockerized task to know that it needs those parameters.

I think a possible solution is to merge array parameters from that docker-params dictionary (which probably has to be done in girder-worker). This wouldn't be hard -- my approach would be to add such a entry into a test CLI image and then see where girder_worker needs to be modified to merge that together and then make a PR in girder_worker.

Without rebuilding containers or modifying an existing library, the truly hacky way to do this is to monkey patch girder_worker to inject the mount, which would be overriding the command for the girder_worker container in the docker-compose (or changing it if running directly) to run your module, which would have its own main that would import girder_worker, patch the requisite run command to add your volume, then call app.worker_main as normal.

@andreped
Copy link

I arrived to the same conclusion. I'm trying to set the GPU with docker-params but I don't see what I need to set to enable GPU.

I don't believe that I need to pass volumes actually. It seems to work just by specifying the GPU.

And oh yeah, that is a hacky way indeed. What I just find strange is that HistoCloud which is a DSA plugin seem to have GPU support. However, I cannot really see how they did it. I assume you assisted them on the manner, or similar, as they would need GPU support to train models (at least thats the sane way of training them).

@andreped
Copy link

This from the girder_worker you mention, looks awfully similar to the hack described here.

Hence, I would believe there is "out-of-the-box" GPU support as long as I set something particular, no? Not sure why though.

@manthey
Copy link
Member Author

manthey commented Feb 22, 2023

girder_worker looks for a LABEL in the Dockerfile: e.g., https://github.com/DigitalSlideArchive/superpixel-classification/blob/main/Dockerfile#L5 to decide if it should pass the gpus flag.

@andreped
Copy link

girder_worker looks for a LABEL in the Dockerfile: e.g., https://github.com/DigitalSlideArchive/superpixel-classification/blob/main/Dockerfile#L5 to decide if it should pass the gpus flag.

I added the LABEL and it works! THANKS!!! I spent way too long on this...

I noticed this LABEL in the HistoCloud Dockerfile as well but didn't think much of it. Oh silly me...

But cheers! Then I believe I have a working solution. Will continue exploring FAST with DSA :]

@manthey
Copy link
Member Author

manthey commented Feb 22, 2023

This clearly means we need to document it MUCH better.

@andreped
Copy link

This clearly means we need to document it MUCH better.

Yes, sure, but I believe it would have been relevant to also document this in the HistomicsTK and/or DSA repos.

Anyways, I'm just happy I finally got it working. Thanks for the help, @manthey!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants