CS/ECE/ME/EP 759 (High Performance Computing for Engineering Applications) Course Project: Cautiously Aggressive GPU Space Sharing to Improve Resource Utilization and Job Efficiency

ruipeterpan/cs759-sp21

CS/ECE/ME/EP 759 Spring 2021 Final Project

This repository contains the code base for Rui Pan's final project report, "Cautiously Aggressive GPU Space Sharing to Improve Resource Utilization and Job Efficiency".

Prerequisites for replicating the results include:

  • An NVIDIA GPU with Volta architecture
  • Python 3.8 nightly build
  • CUDA-compatible PyTorch & TorchVision
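
A quick way to sanity-check these prerequisites before running anything is sketched below. This is not part of the repo; the helper names `is_volta` and `check_prerequisites` are hypothetical. It relies on the fact that Volta GPUs report CUDA compute capability 7.0 (e.g. V100) or 7.2 (Xavier), and on PyTorch's `torch.cuda.is_available()` and `torch.cuda.get_device_capability()`.

```python
import sys

# Volta GPUs report compute capability 7.0 (e.g. V100) or 7.2 (Xavier).
VOLTA_CAPABILITIES = {(7, 0), (7, 2)}


def is_volta(capability):
    """Return True if a (major, minor) compute capability belongs to Volta."""
    return tuple(capability) in VOLTA_CAPABILITIES


def check_prerequisites():
    """Return a list of human-readable warnings; an empty list means all good."""
    problems = []

    if sys.version_info[:2] != (3, 8):
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"expected Python 3.8, found {found}")

    # Import lazily so the check itself runs on machines without PyTorch.
    try:
        import torch
        import torchvision  # noqa: F401  (only checking that it imports)
    except ImportError as exc:
        problems.append(f"PyTorch/TorchVision not importable: {exc}")
        return problems

    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA device")
    else:
        capability = torch.cuda.get_device_capability(0)
        if not is_volta(capability):
            name = torch.cuda.get_device_name(0)
            problems.append(f"GPU 0 ({name}) has capability {capability}, not Volta")

    return problems
```

Calling `check_prerequisites()` and printing each returned string gives a short report; the check only inspects GPU 0, which is an assumption of this sketch.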

This repo contains:

  • /data: Source data for running the workloads. It should be set up as follows:
  • /latex: LaTeX files for editing the report on Overleaf
  • /output: Core-specific utilizations of workloads produced using an earlier version of the profiler
  • /tables: Shell scripts for replicating the profiling results in various tables
  • /workloads: Common DL/HPC workloads used in the evaluations. Many of these are copied from Gavel.
  • plotting.ipynb: Jupyter Notebook that produces all figures in the report
  • profiler.py: Profiler parser wrapped around nvprof
  • pymps.py: Provides Python access to NVIDIA CUDA Multi-Process Service (MPS)
  • README.md: Well, of course I know him. He's me.
  • report.pdf: PDF version of the final report
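
To give a rough idea of what wrapping nvprof and driving MPS from Python can look like, here is a minimal sketch. It does not reproduce the actual interfaces of `profiler.py` or `pymps.py`; every function name below is hypothetical. It uses only standard tooling: nvprof's `--csv`, `--log-file`, and `--metrics` options, the `nvidia-cuda-mps-control` daemon, and the `CUDA_MPS_ACTIVE_THREAD_PERCENTAGE` environment variable, which caps the share of SMs available to an MPS client on Volta.

```python
import os
import subprocess


def nvprof_command(workload_cmd, log_file,
                   metrics=("achieved_occupancy", "sm_efficiency")):
    """Build an nvprof invocation that writes CSV output to log_file."""
    cmd = ["nvprof", "--csv", "--log-file", log_file]
    if metrics:
        # nvprof accepts a comma-separated list of metric names.
        cmd += ["--metrics", ",".join(metrics)]
    return cmd + list(workload_cmd)


def mps_client_env(thread_percentage, base_env=None):
    """Return an environment that limits an MPS client's share of the GPU."""
    if not 0 < thread_percentage <= 100:
        raise ValueError("thread_percentage must be in (0, 100]")
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MPS_ACTIVE_THREAD_PERCENTAGE"] = str(thread_percentage)
    return env


def start_mps_daemon():
    """Start the MPS control daemon in the background (needs an NVIDIA GPU)."""
    subprocess.run(["nvidia-cuda-mps-control", "-d"], check=True)


def stop_mps_daemon():
    """Ask the running MPS control daemon to shut down."""
    subprocess.run(["nvidia-cuda-mps-control"], input="quit\n",
                   text=True, check=True)


def launch_shared(workload_cmd, thread_percentage):
    """Launch one workload as an MPS client with a capped SM share."""
    return subprocess.Popen(list(workload_cmd),
                            env=mps_client_env(thread_percentage))
```

For example, after `start_mps_daemon()`, two jobs could be co-located with `launch_shared(cmd_a, 50)` and `launch_shared(cmd_b, 50)`, where `cmd_a` and `cmd_b` are placeholder command lists for scripts under /workloads. A single job could be profiled separately with `subprocess.run(nvprof_command(cmd_a, "profile.csv"))`. The 50/50 split is purely illustrative, not a setting taken from the report.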
