
Evaluation of Datacenter Scheduler Programming Abstractions

This repository contains the evaluation code and experiments, written in Kotlin, for the research work investigating the performance impact of various datacenter scheduler programming abstractions.


Prerequisites

  • An x86 machine with at least 16 GB of RAM.
  • Any desktop Linux distribution with build tools installed (for example, the build-essential package on Ubuntu).
  • Java 17 or greater.
  • Python 3.11 (we recommend conda, virtualenv, or a similar environment).

Reproducibility instructions

  • Run ./download-traces.sh. This will download all required traces from Zenodo and copy them to the relevant directories for experiments.
  • Set up a Python 3.11 virtual environment and install the dependencies with pip install -r plot/script/requirements.txt.
  • Use ./run-experiments.sh to run the simulations and plot the results.
  • Figure 7 in the paper corresponds to plot/output/migrations-results-packing-azure.pdf, Figure 8 to plot/output/migrations-results-totaltime-azure.pdf, and Figure 10 to plot/output/metadata-results-ibm.pdf.
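After ./run-experiments.sh finishes, a quick way to confirm that the figures listed above were produced is a small check like the sketch below. This script is not part of the repository; it only reuses the file paths given in the list above.

    from pathlib import Path

    # Figure-to-file mapping taken from the list above.
    FIGURES = {
        "Figure 7": "plot/output/migrations-results-packing-azure.pdf",
        "Figure 8": "plot/output/migrations-results-totaltime-azure.pdf",
        "Figure 10": "plot/output/metadata-results-ibm.pdf",
    }

    for figure, path in FIGURES.items():
        status = "found" if Path(path).is_file() else "MISSING"
        print(f"{figure}: {path} -> {status}")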

Getting Started

Detailed instructions for building and modifying individual experiments start here. The code for all experiments is located in the opendc-experiments/studying-apis directory of the repository. To run an experiment, place the relevant traces in the corresponding experiment folder within this directory. An example trace, bitbrains-small, is provided in the traces directory for reference.

Setup Traces

  1. Clone the repository to your local machine.
  2. Navigate to the opendc-experiments/studying-apis folder.
  3. Place the relevant traces for the experiment you want to run in the corresponding experiment folder.

Here are the steps for placing traces correctly:

  1. Download the necessary traces from the following link: https://zenodo.org/record/7996316.
  2. Store the downloaded traces in the directory structure: opendc-experiments/studying-apis/<experiment>/src/main/resources/trace.
  3. Create a folder within the trace directory with a name corresponding to the source of the trace (e.g., google, azure, bitbrains).
  4. Place the trace Parquet file inside the respective source folder.
  5. Rename the trace file to meta.parquet when copying it.

These steps ensure that the traces are properly organized and available for the evaluation experiments using the OpenDC platform.
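If you prefer to automate the placement steps above, a minimal sketch in Python could look as follows. It is not part of the repository; the experiment name, trace source, and downloaded file path in the example are placeholders, while the directory layout mirrors the steps just described.

    import shutil
    from pathlib import Path

    def place_trace(trace_file: str, experiment: str, source: str) -> Path:
        """Copy a downloaded trace into
        opendc-experiments/studying-apis/<experiment>/src/main/resources/trace/<source>/meta.parquet."""
        dest_dir = (Path("opendc-experiments/studying-apis") / experiment
                    / "src/main/resources/trace" / source)
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / "meta.parquet"  # step 5: rename to meta.parquet
        shutil.copyfile(trace_file, dest)
        return dest

    if __name__ == "__main__":
        # Placeholder arguments; substitute your own trace file, experiment, and source.
        print(place_trace("downloads/azure-trace.parquet", "migrations", "azure"))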

Compile and Run

  1. Open the terminal and navigate to the opendc project directory.
  2. Run the command ./gradlew :opendc-experiments:studying-apis:<experiment>:experiment to compile and run the desired experiment.
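To run several experiments in a row, the Gradle invocation above can be wrapped in a short Python driver like the sketch below. It is not part of the repository, and the default experiment name migrations is an assumption based on the experiment names used elsewhere in this README.

    import subprocess
    import sys

    def run_experiment(experiment: str) -> None:
        """Compile and run one experiment via the Gradle task described above."""
        task = f":opendc-experiments:studying-apis:{experiment}:experiment"
        subprocess.run(["./gradlew", task], check=True)

    if __name__ == "__main__":
        # "migrations" is an assumed experiment name; replace it with your own.
        run_experiment(sys.argv[1] if len(sys.argv) > 1 else "migrations")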

Plotting Experiment Results

To plot the experiment results, please follow these steps:

  1. Install the required Python packages by running the following command:

    pip install -r plot/script/requirements.txt
    
  2. Ensure that all the necessary files for plotting are located under the ./plot folder in the repository.

  3. Set up the trace data by copying the corresponding trace files to the ./plot/trace/<dataset> directory. Rename the trace file to meta.parquet. For example, if you are working with the Azure dataset, the path would be ./plot/trace/azure/meta.parquet.

  4. Prepare the input folder for the experiments you have run. Copy the experiment output to ./plot/input/<experiment>/<dataset>. The output of each experiment is located under opendc-experiments/studying-apis/<experiment>/output/, where each experiment and dataset combination has its own folder. For instance, if you ran the migrations experiment with the Azure dataset, the path would be ./plot/input/migrations/azure. (A helper sketch that automates steps 3 and 4 follows these instructions.)

  5. Choose one of the available scripts located in the ./plot/script directory based on the experiment type. The available scripts are:

    • migrations_plot_results.py
    • reservations_plot_results.py
    • metadata_plot_results.py
    • metadata_plot_preview.py
  6. Run the selected script with the desired dataset parameter. For example, to plot the results for the Azure dataset using the reservations experiment, run the following command:

    python3 ./plot/script/reservations_plot_results.py azure
    
  7. The generated plots and visualizations will be available in the ./plot/output/<experiment>/<dataset> directory.

Please note that the dataset parameter is mandatory when running the plotting script, as it determines which dataset the script will process.
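Steps 3 and 4 above amount to copying files into a fixed directory layout, so they can be automated with a small helper such as the sketch below. It is not part of the repository; the trace file, experiment output directory, experiment name, and dataset name are placeholders you would substitute for your own setup.

    import shutil
    from pathlib import Path

    def prepare_plot_inputs(trace_file: str, output_dir: str,
                            experiment: str, dataset: str) -> None:
        # Step 3: stage the trace as ./plot/trace/<dataset>/meta.parquet.
        trace_dir = Path("plot/trace") / dataset
        trace_dir.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(trace_file, trace_dir / "meta.parquet")

        # Step 4: copy the experiment output to ./plot/input/<experiment>/<dataset>.
        input_dir = Path("plot/input") / experiment / dataset
        shutil.copytree(output_dir, input_dir, dirs_exist_ok=True)

    if __name__ == "__main__":
        # Placeholder paths and names; the experiment output is typically found
        # under opendc-experiments/studying-apis/<experiment>/output/.
        prepare_plot_inputs("downloads/azure-trace.parquet",
                            "opendc-experiments/studying-apis/migrations/output/azure",
                            "migrations", "azure")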

Feel free to explore the generated plots and visualizations to analyze and interpret the experiment results.

That's it! You should now be able to plot your experiment results using the provided scripts and instructions.

OpenDC

The evaluation experiments are performed using OpenDC, an open-source data center simulation platform. OpenDC allows for accurate and realistic simulations of data center environments, enabling thorough evaluation and analysis of various scheduling APIs and their impact on performance.


OpenDC is a free and open-source platform for datacenter simulation aimed at both research and education.

Datacenter construction in OpenDC

Users can construct datacenters (see above) and define portfolios of scenarios (experiments) to see how these datacenters perform under different workloads and schedulers (see below).

Datacenter simulation in OpenDC

The simulator is accessible both as a ready-to-use website hosted by us at opendc.org and as source code that users can run locally on their own machine through Docker.

To learn more about OpenDC, have a look at our paper OpenDC 2.0 or read about our vision.

🛠 OpenDC is a project by the @Large Research Group.

🐟 OpenDC comes bundled with Capelin, the capacity planning tool for cloud datacenters based on portfolios of what-if scenarios. More information on how to use and extend Capelin is coming soon!

Documentation

The documentation is located in the docs/ directory and is divided as follows:

  1. Deployment Guide
  2. Architectural Overview
  3. Toolchain Setup
  4. Research with OpenDC
  5. Contributing Guide

Contributing

Questions, suggestions and contributions are welcome and appreciated! Please refer to the contributing guidelines for more details.

License

This work is distributed under the MIT license. See LICENSE.txt.