GitHub - sg0/nido: Batched Graph Clustering using Louvain Method on multiple GPUs.

sg0 / nido Public

Notifications You must be signed in to change notification settings
Fork 1
Star 2

Batched Graph Clustering using Louvain Method on multiple GPUs.

Notifications

Name		Name	Last commit message	Last commit date
Latest commit History 79 Commits
cugraph-scripts		cugraph-scripts
Makefile		Makefile
MakefileBase		MakefileBase
MakefileBaseNVHPC		MakefileBaseNVHPC
README		README
clustering.cpp		clustering.cpp
clustering.hpp		clustering.hpp
cuda_wrapper.hpp		cuda_wrapper.hpp
graph.cpp		graph.cpp
graph.hpp		graph.hpp
graph_cpu.cpp		graph_cpu.cpp
graph_cpu.hpp		graph_cpu.hpp
graph_cuda.cu		graph_cuda.cu
graph_cuda.hpp		graph_cuda.hpp
graph_gpu.cpp		graph_gpu.cpp
graph_gpu.hpp		graph_gpu.hpp
heap.cpp		heap.cpp
heap.hpp		heap.hpp
louvain_gpu.cpp		louvain_gpu.cpp
louvain_gpu.hpp		louvain_gpu.hpp
main.cpp		main.cpp
types.hpp		types.hpp
utils.hpp		utils.hpp

Repository files navigation

************************
nido (/knee/ˈdough/)
************************

*******
-------
 ABOUT
-------
*******
nido is a multi-GPU (C++, CUDA) implementation of Louvain 
method for graph community detection/clustering. 

This code requires NVIDIA CUDA (preferably 11.x, > 10.x) 
and C++14 compliant compiler (e.g., GNU GCC 9.x) for building. 

Please contact the following for any queries or support:
Sayan Ghosh, PNNL (sg0 at pnnl dot gov)

Paper: H. Chou and S. Ghosh. 2022. "Batched Graph Community 
Detection on GPUs". In 31st International Conference on Parallel 
Architectures and Compilation Techniques (PACT).

*************
-------------
 COMPILATION
-------------
*************
Please make minimal changes to the Makefile with the compiler flags 
and use a C++14 compliant compiler of your choice. 

Invoke `make clean; make` should build the binary (e.g., run_1_70). 
Execute the code with specific arguments mentioned in the next 
section. The Makefile has `NGPU` and `SM` shell variables to select 
the #GPUs and GPU architecture.

Pass a suitable value (equal to the #sockets or NUMA nodes on the 
system) to the GRAPH_FT_LOAD macro at compile-time, like 
-DGRAPH_FT_LOAD=4. This is important for `first touch' purposes.

The default values for certain variables are specified in types.hpp.

***********************
-----------------------
 EXECUTING THE PROGRAM
-----------------------
***********************

We allow users to pass any real world graph as input (or optionally 
use a random graph, which is not recommended). However, we expect 
an input graph to be in a certain binary format, which we have 
observed to be more efficient than reading ASCII format files. 
The code for binary conversion (from a variety of common graph 
formats) is packaged separately with Vite, which is an 
implementation of Louvain method in distributed memory.

Follow these three steps to convert a matrix-market file to binary:

1. Download and build Vite: <https://github.com/ECP-ExaGraph/vite> 
   (requires a C++11 compiler and MPI)
   
2. Download matrix-market format file (with .mtx extension) from the 
   SuiteSparse collection: <https://sparse.tamu.edu/>

3. Use fileConvert utility in Vite as follows:
   bin/./fileConvert -m -f com-orkut.mtx -o com-orkut.bin

Step #3 above is serial, so the time to convert will depend on the 
size of the input graph. The memory requirements are proportional
to the size of the input graph as well.

More discussions on various native format to binary file conversion: 
<https://github.com/ECP-ExaGraph/vite/blob/master/README#L130>

Once you have a binary graph, these are a few ways to run the code: 
./run_1_70 -f karate.bin
./run_2_70 -f com-orkut.bin -b 32
./run_2_70 -f com-orkut.bin -b 32 -o communities-com-orkut.txt
./run_2_70 -f com-orkut.bin -b 8 -i 100 -t 1.0E-03

Possible options (can be combined):

1. -f <bin-file>       : Specify input binary file after this argument. 
2. -p <?gpu> <?batch>  : Influence the way partitions are derived, do not modify
                         without consulting the code. 
3. -r <|V|> <EF>       : #Vertices and edge-factor (EF*|V|==#edges) for randomly 
                         generated graph.
4. -c                  : Uses Luby's algorithm for coloring.
5. -b <#batches>       : Specify #batches, default is 2 and it can affect the 
                         quality significantly, so try increasing it to 8--32.
6. -t <threshold>      : Specify threshold quantity (default: 1.0E-06) used to 
                         determine the exit criteria in an iteration.
7. -i                  : Specify maximum #iterations per phase (default: 500).                    
8. -o <file-name>      : Specify output file name for storing the communities/clusters.
9. -h                  : Prints sample execution options. 

We recommend just passing -f & -b options for most cases.