Create a GPU accelerated examples/extra-large-graphs #1252

Open: thomcom wants to merge 8 commits into main

@thomcom thomcom commented Jun 3, 2022

Hi @jacomyal! I've been working on an NVIDIA-backed sigma.js extension for a few weeks and I'm sharing it with you today. We're writing GPU-accelerated, open-source libraries for data visualization, among other things, and wanted to try integrating with sigma.js. It's led me to plenty of ideas for optimizing the sigma.js process/render pipeline, if you're interested. Hopefully you like my contribution! Below is the README.md for this contribution.

Extra Large Graphs Demo with @rapidsai/node GPU acceleration

RAPIDS.ai is an open source GPU-acceleration project. We're building new
tools with familiar APIs and extending existing tools so that more
scientists and users can take advantage of GPU performance.

This project creates a RapidsGraphologyGraph subclass of Graph,
plus a modified index.ts to make GPU-stored graph vertices and edges
available to your browser session. It talks to the @rapidsai/demo-api-server
that has been built using node-rapids.
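
As a rough illustration of the idea (not the code in this PR), a graph-like wrapper can expose graphology-style iteration over data that already lives in a packed buffer. The class name `BufferBackedGraph`, the `STRIDE` constant, and the 3-floats-per-node layout below are all assumptions made for this sketch; only the general shape of `order` and `forEachNode(callback(key, attributes))` follows graphology's interface.

```typescript
// Hypothetical sketch: these names and the buffer layout are NOT taken from the PR.
type NodeAttributes = { x: number; y: number; size: number };
type NodeCallback = (key: string, attributes: NodeAttributes) => void;

// Assumed layout: 3 floats per node, in the order x, y, size.
const STRIDE = 3;

class BufferBackedGraph {
  private nodes: Float32Array;

  constructor(nodes: Float32Array) {
    if (nodes.length % STRIDE !== 0) {
      throw new Error("node buffer length must be a multiple of STRIDE");
    }
    this.nodes = nodes;
  }

  // Mirrors graphology's `order` property (the number of nodes).
  get order(): number {
    return this.nodes.length / STRIDE;
  }

  // Mirrors the shape of graphology's forEachNode(callback(key, attributes)),
  // reading attributes straight out of the packed buffer.
  forEachNode(callback: NodeCallback): void {
    for (let i = 0; i < this.order; i++) {
      const offset = i * STRIDE;
      callback(String(i), {
        x: this.nodes[offset],
        y: this.nodes[offset + 1],
        size: this.nodes[offset + 2],
      });
    }
  }
}

// Two nodes packed as [x, y, size, x, y, size].
const graph = new BufferBackedGraph(new Float32Array([0, 0, 1, 10, 5, 2]));
const seen: string[] = [];
graph.forEachNode((key, attrs) => {
  seen.push(`${key}:${attrs.x},${attrs.y},${attrs.size}`);
});
```

A consumer that only calls `order` and `forEachNode` would not need to know whether the attributes came from in-memory objects or from a buffer filled elsewhere.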

Performance

Using a GPU for graph rendering offers substantial performance increases.
In this demo, I've included screenshots of performance comparisons between
using our GPU-backed RapidsGraphologyGraph and the original renderer
included in the large-graphs demo.

When the complexity of the graph matches the original demo, performance
is comparable:

[Screenshot: original size of graph]

When the complexity increases substantially, the GPU performance improvement
is marked:

[Screenshot: size increased by 100x]

Note that in the larger case, the "I/O" benchmark for the original
large-graphs demo includes the graph generation time, which is
substantial. However, if the graph were instead stored in a 500MB .json
file (as in the GPU case) and parsed in-browser, execution time would be
similar or longer.

What Is Happening

To run this demo, you need the @rapidsai/demo npm package,
a system with an NVIDIA GPU from around 2018 onward (Turing architecture
and up), and a previously installed CUDA Toolkit (https://developer.nvidia.com/cuda-toolkit).

The node-rapids workspace demo-api-server
is available as a backend to any HTTP client. At this time, only limited
functionality is available: parsing JSON files in the graphology
graph dataset format, plus API requests for DataFrames and
their Columns via apache-arrow.

Two endpoints, graphology/nodes and graphology/edges, return
pre-formatted arrays that can be used directly with the
sigma.js renderer.
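
A client for those two endpoints might look roughly like the sketch below. Only the endpoint names graphology/nodes and graphology/edges come from this README; the base URL, the use of plain GET requests, the helper names `endpointUrl` and `fetchGraphBuffers`, and the `FetchLike` type are assumptions for illustration, and the real server may require additional setup requests first.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM typings.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

type Endpoint = "nodes" | "edges";

// Builds the endpoint URL; the base URL is supplied by the caller (assumed, not documented here).
function endpointUrl(baseUrl: string, endpoint: Endpoint): string {
  return `${baseUrl.replace(/\/+$/, "")}/graphology/${endpoint}`;
}

// Requests both payloads in parallel and returns the raw bytes.
// Decoding (for example with apache-arrow) is left to the caller.
async function fetchGraphBuffers(baseUrl: string, fetchImpl: FetchLike) {
  const load = async (endpoint: Endpoint): Promise<ArrayBuffer> => {
    const response = await fetchImpl(endpointUrl(baseUrl, endpoint));
    if (!response.ok) {
      throw new Error(`${endpoint} request failed with status ${response.status}`);
    }
    return response.arrayBuffer();
  };
  const [nodes, edges] = await Promise.all([load("nodes"), load("edges")]);
  return { nodes, edges };
}
```

In a browser or a recent Node.js version, the global `fetch` could be passed as `fetchImpl`, with whatever host and port the server is actually listening on as `baseUrl`.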

Additional Dependencies

  • @rapidsai/demo-api-server
  • apache-arrow

To run the demo

Due to native dependency distribution complexity, pre-packaged builds of
the node-rapids modules are presently only available via our public docker images.
See USAGE.md for more details.

Run @rapidsai/demo-api-server via docker:

REPO=ghcr.io/rapidsai/node
VERSIONS="22.02.00-runtime-node18.2.0-cuda11.6.2-ubuntu20.04"

# Be sure to pass either the `--runtime=nvidia` or `--gpus` flag!
docker run --rm \
    --runtime=nvidia \
    -e "DISPLAY=$DISPLAY" \
    -v "/etc/fonts:/etc/fonts:ro" \
    -v "/tmp/.X11-unix:/tmp/.X11-unix:rw" \
    -v "/usr/share/fonts:/usr/share/fonts:ro" \
    -v "/usr/share/icons:/usr/share/icons:ro" \
    $REPO:$VERSIONS-demo \
    npx @rapidsai/demo-api-server

We expect to have full npm support soon.

Next, generate a graph of your liking, or provide your own:

cd $SIGMAJSROOT/examples/extra-large-graphs
node generate-graph.js 10000 20000 3 graphology.json
cp graphology.json $DOCKERROOT/node/modules/demo/api-server/public

Finally, run the extra-large-graphs demo in the usual way:

cd $SIGMAJSROOT/examples
npm start --example=extra-large-graphs

Dataset

Run the graph generator at https://github.com/thomcom/sigma.js/blob/add-gpu-graph-to-example/examples/extra-large-graphs/generate-graph.js
to create a very large graph using the command:

node generate-graph.js 1000000 2000000 3 graphology.json

Any dataset that matches the schema of graphology.json is supported.
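
Before copying a custom dataset to the server, it can help to sanity-check it. The sketch below assumes graphology.json follows graphology's standard serialized graph format (top-level `nodes` and `edges` arrays, with each edge naming a `source` and `target` node key); that assumption, the helper name `findDanglingEdges`, and the sample attributes are illustrative and not taken from this PR.

```typescript
// Assumed shape: graphology's serialized graph format.
type Attributes = Record<string, unknown>;
type SerializedNode = { key: string; attributes?: Attributes };
type SerializedEdge = {
  key?: string;
  source: string;
  target: string;
  attributes?: Attributes;
  undirected?: boolean;
};
type SerializedGraph = {
  attributes?: Attributes;
  options?: Attributes;
  nodes: SerializedNode[];
  edges: SerializedEdge[];
};

// Returns every edge whose source or target is not a declared node key.
function findDanglingEdges(graph: SerializedGraph): SerializedEdge[] {
  const keys = new Set(graph.nodes.map((node) => node.key));
  return graph.edges.filter(
    (edge) => !keys.has(edge.source) || !keys.has(edge.target),
  );
}

// Tiny hand-written sample; the second edge points at a node that does not exist.
const sample: SerializedGraph = {
  attributes: {},
  nodes: [
    { key: "a", attributes: { x: 0, y: 0, size: 3, color: "#e22653", label: "a" } },
    { key: "b", attributes: { x: 1, y: 1, size: 3, color: "#277da1", label: "b" } },
  ],
  edges: [
    { key: "a->b", source: "a", target: "b" },
    { source: "a", target: "missing" },
  ],
};

const dangling = findDanglingEdges(sample);
```

An empty result from `findDanglingEdges` means every edge references declared nodes; it does not confirm that the node attributes are the ones the server expects.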

This closes #1258

@thomcom
Author

thomcom commented Jun 14, 2022

Spending some time cleaning up "my end" of this at github.com/rapidsai/node before I clean this one up. I hope to be done here by end of Friday.

@thomcom thomcom marked this pull request as ready for review June 28, 2022 17:02
@stale

stale bot commented Aug 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 8, 2022
@thomcom
Author

thomcom commented Aug 9, 2022

Pinging this one, any interest?

@stale stale bot removed the wontfix label Aug 9, 2022
@sim51 sim51 requested review from Yomguithereal and jacomyal and removed request for Yomguithereal August 17, 2022 15:30
@thomcom
Author

thomcom commented Aug 30, 2022

Hey @tonyz0x0 or @lf- what do you think?

@Yomguithereal
Collaborator

Hello @thomcom, this looks interesting but I don't have any NVIDIA gpu to test your PR on unfortunately.

What I understand from this is that there are basically two things needed to make this work: 1. a GPU-backed implementation of graphology's interface (at least the subset necessary to sigma, by cleverly repurposing forEach methods etc.) and 2. programs that are the same as ours but that don't need byte array processing, since the backing buffers are already encapsulated into the make-believe graphology instance.

This is quite clever and seems to work nicely indeed. The fact is it can very well work from userland also so the question on our side is probably to decide whether we want to merge your example (which cannot work without proper hardware, as I understand?) or if we can just point to your code somewhere in our docs, no?

What do you think @jacomyal ?

@thomcom
Author

thomcom commented Oct 21, 2022

Hey @Yomguithereal, thanks for your consideration! Your analysis is essentially correct. The GPU is able to read graphology data files and pack them into the serialized backing buffers that sigma.js displays.

It is true that you can't run the demo without the proper hardware - but that raises another interesting question. The GPU in this situation is accessed through a web server. I'll talk to some people internally and see if we can put the server that this demo depends on somewhere public for you to play with or at least see the demo work!

We love sigma.js here and would love to get this connected with your work one way or another. :)

@Yomguithereal
Collaborator

Thanks for the answer @thomcom, I think we'll link to your resources then when we find time to release the next version of the lib (which should be sooner rather than later now :) ).


Successfully merging this pull request may close these issues.

Productize sigma.js/node-rapids extra-large-graphs to sigma.js quality and satisfaction.
3 participants