DeepView.Predict


A Runtime-Based Computational Performance Predictor for Deep Neural Network Training

DeepView.Predict is a tool that predicts a deep neural network's training iteration execution time on a given GPU. It currently supports PyTorch. To learn more about how DeepView.Predict works, please see our research paper.

Installation

To run DeepView.Predict, you need a machine with an NVIDIA GPU and a working CUDA installation. Predictors are currently available for the following NVIDIA GPUs:

GPU          Generation   Memory   Mem. Type   SMs
P4000        Pascal       8 GB     GDDR5       14
P100         Pascal       16 GB    HBM2        56
V100         Volta        16 GB    HBM2        80
RTX 2070     Turing       8 GB     GDDR6       36
RTX 2080Ti   Turing       11 GB    GDDR6       68
T4           Turing       16 GB    GDDR6       40
RTX 3090     Ampere       24 GB    GDDR6X      82
A100         Ampere       40 GB    HBM2        108
A40          Ampere       48 GB    GDDR6       84
RTX A4000    Ampere       16 GB    GDDR6       48
RTX 4000     Turing       8 GB     GDDR6       36

Building locally

Installing from pip

Install via pip with the following command

pip install deepview-predict
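
After installing, you can check that the package imports cleanly (the Python module is named habitat, as used in the usage example below):

python3 -c "import habitat"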

Installing from source

  1. Install CUPTI

CUPTI is a profiling interface required by DeepView.Predict. Select your version of CUDA here and follow the instructions to add NVIDIA's repository. Then, install CUPTI with:

sudo apt-get install cuda-cupti-xx-x

where xx-x matches the version of CUDA you have installed (for example, cuda-cupti-11-7 for CUDA 11.7).

Alternatively, if you do not have root access on your machine, you can use conda to install CUPTI. Select your version of CUDA here and follow the instructions. For example, if you have CUDA 11.6.0, you can install CUPTI with:

conda install -c "nvidia/label/cuda-11.6.0" cuda-cupti

After installing CUPTI, add $CONDA_HOME/extras/CUPTI/lib64/ to LD_LIBRARY_PATH so the library can be found at runtime.
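
For example (assuming $CONDA_HOME points to your conda installation, as above):

export LD_LIBRARY_PATH=$CONDA_HOME/extras/CUPTI/lib64/:$LD_LIBRARY_PATH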

  2. Install CMake 3.17+.

    • Note: CMake 3.24.0 and 3.24.1 have a bug that prevents DeepView.Predict from finding the CUPTI directory, so do not use those versions.

    • Run the following commands to download and install a precompiled version of CMake 3.24.2

      wget https://github.com/Kitware/CMake/releases/download/v3.24.2/cmake-3.24.2-linux-x86_64.sh
      chmod +x cmake-3.24.2-linux-x86_64.sh
      mkdir /opt/cmake
      sh cmake-3.24.2-linux-x86_64.sh --prefix=/opt/cmake --skip-license
      ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
    • You can verify the version of CMake you installed with the following command

      cmake --version
  3. Install Git Large File Storage (git-lfs)

  4. Clone the DeepView.Predict repository

    git clone https://github.com/CentML/DeepView.Predict
    cd DeepView.Predict
  5. Get the pre-trained models used by DeepView.Predict

    git submodule init && git submodule update
    git lfs pull
  6. Finally, build DeepView.Predict with the following command:

    ./analyzer/install-dev.sh

Building with Docker

DeepView.Predict has been tested to work on the latest version of NVIDIA NGC PyTorch containers.

  1. To build DeepView.Predict with Docker, first start the NGC container, replacing XX.XX with the container release you want to use:

docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:XX.XX-py3

  2. Inside the container, clone the repository, then build and install the DeepView.Predict Python package:

git clone --recursive https://github.com/CentML/DeepView.Predict
./DeepView.Predict/analyzer/install-dev.sh

Note: DeepView.Predict needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a CUPTI_ERROR_INSUFFICIENT_PRIVILEGES error when running DeepView.Predict, please follow the instructions here and in issue #5.
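
For example, on a bare-metal Linux host you can lift the restriction for all users via a kernel module option and then reboot (a sketch of NVIDIA's documented workaround; the file name is hypothetical and your setup may differ):

# /etc/modprobe.d/nvidia-profiling.conf
options nvidia NVreg_RestrictProfilingToAdminUsers=0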

Usage example

You can verify your DeepView.Predict installation by running the simple usage example:

# example.py
import habitat
import torch
import torchvision.models as models

# Define model and sample inputs
model = models.resnet50().cuda()
image = torch.rand(8, 3, 224, 224).cuda()

# Track a single forward pass on the local (source) GPU.
# Use the habitat.Device entry that matches the GPU you are running on.
tracker = habitat.OperationTracker(device=habitat.Device.RTX2080Ti)
with tracker.track():
    out = model(image)

trace = tracker.get_tracked_trace()
print("Run time on source:", trace.run_time_ms)

# Perform prediction to a single target device
pred = trace.to_device(habitat.Device.V100)
print("Predicted time on V100:", pred.run_time_ms)

Run the example with:

python3 example.py
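
The tracked trace can also be predicted for several targets in one run. The sketch below extends example.py; it assumes habitat.Device exposes members named after the GPUs in the table above (e.g. P100, T4, A100), mirroring habitat.Device.V100 used in the example:

# Continuing from example.py: predict the same tracked trace on several targets.
# The Device members below are assumed to follow the naming in the GPU table.
for target in [habitat.Device.P100, habitat.Device.T4, habitat.Device.A100]:
    pred = trace.to_device(target)
    print("Predicted time on", target, ":", pred.run_time_ms)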

See experiments/run_experiment.py for other examples of DeepView.Predict usage.

Release History

See Releases

License

The code in this repository is licensed under the Apache 2.0 license (see LICENSE and NOTICE), with the exception of the files mentioned below.

This software contains source code provided by NVIDIA Corporation. These files are:

  • The code under cpp/external/cupti_profilerhost_util/ (CUPTI sample code)
  • cpp/src/cuda/cuda_occupancy.h

The code mentioned above is licensed under the NVIDIA Software Development Kit End User License Agreement.

We include the implementations of several deep neural networks under experiments/ for our evaluation. These implementations are copyrighted by their original authors and carry their original licenses. Please see the corresponding README files and license files inside the subdirectories for more information.

Research Paper

DeepView.Predict began as a research project in the EcoSystem Group at the University of Toronto. The accompanying research paper appeared in the proceedings of USENIX ATC'21. If you are interested, you can read a preprint of the paper here.

If you use DeepView.Predict in your research, please consider citing our paper:

@inproceedings{habitat-yu21,
  author = {Yu, Geoffrey X. and Gao, Yubo and Golikov, Pavel and Pekhimenko,
    Gennady},
  title = {{Habitat: A Runtime-Based Computational Performance Predictor for
    Deep Neural Network Training}},
  booktitle = {{Proceedings of the 2021 USENIX Annual Technical Conference
    (USENIX ATC'21)}},
  year = {2021},
}

Contributing

Check out CONTRIBUTING.md for more information on how to contribute to DeepView.Predict.