GitHub - DirkToewe/SelX: Selection Algorithms for Scala

SelX is an exploratory project that compares the performance and efficiency of different Selection Algorithms. As a baseline the algorithms are also compared to java.util.Arrays.sort. The benchmark results for random uniform boxed Double inputs can be found here. SelX implements multiple evolutions of the following Selection Algorithms:

Bubble Select: Inspired by Bubble Sort. Moves the maxima from the left side to the right until the Selection property is instated.
Heap Select: Builds a Binary Heap on the larger of the two sides and swaps with the other side until the Selection property is instated.
Median-of-Medians (MoM): Splits the input into small groups of 3 (MoM3) or 5 (MoM5) entries, computes the median of each groups. Of these medians, the median is computed by calling MoM recursively. This median is then used to split/pivotize the original input into two parts. The entire procedure is repeated on the correct one of the two parts recursively. MoM5 is guaranteed O(n).
Median-of-Medians-of-Medians (MoMoM3): Also known as repeated step algorithm. Splits the input into groups of 3 and computes their median. Those medians are again split into groups of 3, their median is computed. Of those medians, the median is computed by recursively calling MoMoM3. That one resulting median is then used to split/pivotize the original input into two parts. The entire procedure is repeated on the correct one of the two parts recursively. MoMoM3 is guaranteed O(n) and slightly faster than MoM5.
Quick Select: Chooses a random entry from the input and uses it to split/pivotize the input. This is done recursively until the Selection property is instated.
Mean Select: Like Quick Select but uses the mean value of th inputs to split/pivotize. (Requires numeric keys).

Most of the algorithms have been incrementally tweaked and optimized. The most effectful optimizations were taken from this paper:

Use the median of three random values as pivot for Quick Select
Avoid splitting/pivotizing the medians of the groups a second time
Instead of just selecting the median of the medians, select another index of the medians if it guarantees a better reduction of the problem size while splitting/pivotizing.

Running the Benchmarks

To run a quick benchmark that will automatically generate HTML plots of the results, use sbt to call:

test:runMain test:runMain selx.Select_comparison

To run the proper JMH benchmarks, which may take roughly 24 hours, run:

jmh:run

To run only the unboxed or boxed JMH benchmark, run:

jmh:run selx.Select_benchmarks_double

or:

jmh:run selx.Select_benchmarks_boxed

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
docs		docs
project		project
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build.sbt		build.sbt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs

docs

project

project

src

src

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

build.sbt

build.sbt

Repository files navigation

Running the Benchmarks

About

Releases

Packages

Languages

License

DirkToewe/SelX

Folders and files

Latest commit

History

Repository files navigation

Running the Benchmarks

About

Topics

Resources

License

Stars

Watchers

Forks

Languages