alan-turing-institute/setvis

setvis

Setvis is a Python library for visualising set membership and patterns of missingness in data.

It can be used both programmatically and interactively in a Jupyter notebook (powered by Bokeh widgets). It operates on data using a memory-efficient architecture and supports loading data from flat files, pandas DataFrames, and directly from a Postgres database.

Documentation

The setvis documentation is hosted on Read the Docs.

Installation (quick start)

For the complete installation instructions, consult the installation page of the documentation, which includes information on some extra installation options and setting up a suitable environment on several platforms.

We recommend installing setvis in a Python virtual environment or Conda environment.

To install setvis, most users should run:

pip install 'setvis[notebook]'

This installs everything needed to run setvis in a notebook and to run the tutorial examples that do not need a database connection.

The Bokeh plots produced by setvis require the package notebook >= 6.4 to display properly. This will be included when installing setvis using the command above.
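If plots do not display, it may help to confirm which version of the notebook package is installed in the active environment. The following is a minimal sketch, not part of setvis: the helper name `version_ge` is hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD/macOS) and that `python` refers to the environment where setvis was installed.

```shell
# version_ge A B: succeeds when version A is at least version B.
# Hypothetical helper for illustration; relies on `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Ask pip for the installed version of the notebook package, if any.
installed="$(python -m pip show notebook 2>/dev/null | awk '/^Version:/ {print $2}')"

if [ -z "$installed" ]; then
    echo "notebook is not installed in this environment"
elif version_ge "$installed" "6.4"; then
    echo "notebook $installed satisfies the >= 6.4 requirement"
else
    echo "notebook $installed is older than 6.4"
fi
```

If the check reports an older version, re-running the install command above in the same environment should bring in a suitable one.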

Tutorials

For basic examples, please see the two example notebooks.

Additionally, there is a series of tutorial notebooks, starting with Tutorial 1.

To follow these tutorials interactively after installing setvis, you will need to clone or download this repository. Then start Jupyter from within it:

python -m jupyter notebook notebooks

Notice

The setvis software is released under the Apache Licence, version 2.0. See LICENCE for details.

The data files ./examples/datasets/simpsons - Format 1.csv and ./examples/datasets/simpsons - Format 2.csv are based on a data file included in UpSet, copyright Visual Computing Group, Harvard, and are distributed here under the terms of the MIT Licence.

The other data files in ./examples/datasets/ are released under the Creative Commons Attribution 4.0 International Licence (CC-BY-4.0).

Acknowledgements

The development of the setvis software was supported by funding from the Engineering and Physical Sciences Research Council (EP/N013980/1; EP/R511717/1) and the Alan Turing Institute.