Nezar Abdennur edited this page Aug 5, 2019 · 1 revision

Running locally

HiGlass can also be run locally as a Docker container. The higlass-docker repository contains detailed instructions for setting it up and running it.

The example below stops and removes any running higlass container, pulls a known-good image version and runs it.

docker stop higlass-container; 
docker rm higlass-container;

docker pull higlass/higlass-docker:v0.6.1 # higher versions are experimental and may or may not work


docker run --detach \
           --publish 8989:80 \
           --volume ~/hg-data:/data \
           --volume ~/tmp:/tmp \
           --name higlass-container \
           higlass/higlass-docker:v0.6.1

The higlass website should now be visible at http://localhost:8989. Take a look at the documentation for adding a new track to see how to display data.

For security reasons, an instance created this way will not be accessible from hosts other than "localhost". To make it accessible to other hosts, please specify a hostname using the SITE_URL environment variable:

docker run --detach \
           --publish 8989:80 \
           --volume ~/hg-data:/data \
           --volume ~/tmp:/tmp \
           --name higlass-container \
           -e SITE_URL=my.higlass.org \
           higlass/higlass-docker:v0.6.1

To use the admin interface for managing the available datasets, a superuser needs to be created:

docker exec -it higlass-container higlass-server/manage.py createsuperuser

Once a username and password are created, the admin interface can be accessed at http://localhost:8989/admin.

Processing and importing data

Large datasets need to be converted to multiple resolutions so that they can be tiled and displayed in higlass. Due to the variety of supported data types, there are different procedures for different starting file types.
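The idea behind all of these conversions can be illustrated with a toy example: each successive zoom level merges pairs of adjacent bins from the level below. The sketch below sums pairs of values in a one-column file; the filenames and the choice of summing are purely illustrative and are not what any particular converter does internally.

```shell
# Toy illustration of building one lower zoom level from a 1D track:
# each output bin sums a pair of adjacent input bins.
printf '1\n2\n3\n4\n5\n6\n' > bins.txt
awk 'NR % 2 { prev = $1; next } { print prev + $1 }' bins.txt > bins.zoom1.txt
cat bins.zoom1.txt
# → 3
#   7
#   11
```

Repeating this step produces the pyramid of resolutions that tiling needs: each level covers the same genomic span with half as many bins.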

Cooler files

Cooler files store genome contact matrices in HDF5 files. A typical cooler file stores data at a single resolution. To support zooming, it needs to be converted to a multi-resolution cooler file. Starting with the highest resolution you would like to visualize, in a file called matrix.cool:

pip install cooler
cooler zoomify --balance matrix.cool

This command will aggregate the contact matrix in matrix.cool to produce multiple normalized zoom levels, storing the resulting contact matrices in matrix.multi.cool. This can then be loaded into higlass:

docker exec higlass-container python higlass-server/manage.py \
  ingest_tileset \
  --filename /tmp/matrix.multi.cool \
  --datatype matrix \
  --filetype cooler 

Creating cooler files from contacts

If a cooler file doesn't already exist, it can be created from a list of contacts (positions of pairs of genomic loci) and a set of chromosome sizes. Here's an example of a tab-delimited contact list or "pairs file":

chr1       124478180       -       chr1       121966441       +
chr1       124478180       -       chr1       121760032       +
...

It can be aggregated into a multi-resolution cooler using the following commands:

CHROMSIZES_FILE=hg19.chrom.sizes
BINSIZE=1000
CONTACTS_FILE=contacts.tsv

cooler cload pairs -c1 1 -p1 2 -c2 4 -p2 5 \
     $CHROMSIZES_FILE:$BINSIZE \
     $CONTACTS_FILE.sorted \
     out.cool

cooler zoomify out.cool

Note that the order of the chromosomes in the chromosome sizes file should match the coordinate system used in HiGlass.
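The `cooler cload pairs` command above reads `$CONTACTS_FILE.sorted`; if the contacts are not already sorted, a minimal sketch using Unix `sort` follows. This invocation is an assumption for illustration, not part of cooler itself; note that plain lexicographic sorting places chr10 before chr2, so if your chromosome sizes file uses a different order, the contacts must be reordered to match.

```shell
# Sort contacts by chrom1, pos1, chrom2, pos2 (columns 1, 2, 4 and 5).
# Caveat: lexicographic chromosome sorting (chr1, chr10, chr2, ...) may not
# match the order in your chromosome sizes file; reorder accordingly if needed.
CONTACTS_FILE=contacts.tsv
sort -k1,1 -k2,2n -k4,4 -k5,5n "$CONTACTS_FILE" > "$CONTACTS_FILE.sorted"
```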

BigWig Files

BigWig files need to be processed using the clodius package before they can be displayed in higlass:

pip install clodius
clodius aggregate bigwig file.bigwig

The default bigwig aggregation assumes that the chromosome sizes are from hg19. To aggregate for a different assembly, use the --assembly option, e.g. --assembly mm9. It is also possible to pass in a set of chromosome sizes with the --chromsizes-filename option. Even though chromosome sizes are stored in the bigWig file itself, the conversion script requires an ordering, as provided by --chromsizes-filename, to produce the hitile file.

This converts file.bigwig into a file that higlass can read. If no output filename is specified using the --output-file option, the original extension is replaced with .hitile. The resulting hitile file can then be loaded into higlass:

docker exec higlass-container python higlass-server/manage.py \
  ingest_tileset \
  --filename /tmp/file.hitile \
  --filetype hitile \
  --datatype vector \
  --name "Some 1D genomic data"

bedGraph files

Data can be imported from text files which have a bedGraph-like format:

chrom   start   end     eigU    eigT    eigN    GC
chr1    3000000 3020000 -0.30001076078261446    -0.28139497528740076    -0.4257141574669923     0.39005
chr1    3020000 3040000 -0.6506417814728713     -0.04220806911621135    -0.7562304803612467     0.3995
chr1    3040000 3060000 -0.5962263338769729     -0.58579839698137       -0.5406451925771123     0.38845

These files need to be aggregated and converted to hitile files using clodius:

pip install clodius
clodius aggregate bedgraph file.tsv --output-file file.hitile --assembly hg19

The columns containing the chromosome name (--chromosome-col), the starting position (--from-pos-col), the ending position (--to-pos-col) and the values (--value-col) can be specified as 1-based column indices. They default to 1, 2, 3 and 4, respectively. The genome assembly defaults to hg19 but can be changed using the --assembly parameter.
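If you prefer to feed clodius a plain four-column bedGraph rather than using the column flags, the extra value columns can be stripped beforehand. A minimal sketch with awk, assuming the desired values (eigU in the example above) sit in column 4; the file names here are illustrative:

```shell
# Keep only chrom, start, end and the eigU value (column 4);
# skip the header line so only data rows remain.
awk -F'\t' 'NR > 1 { print $1 "\t" $2 "\t" $3 "\t" $4 }' file.tsv > file.eigU.tsv
```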

Note: The entries in the bedGraph file must be sorted so that the order of the chromosomes matches the order defined in the negspy package (e.g. hg19/chromOrder.txt). For assemblies such as hg19 and mm9 this defaults to a natural ordering (e.g. chr1, chr2, chr3, ... chrX, chrY, chrM).
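For reference, a chromosome-order file of the kind negspy ships is simply one chromosome name per line. A shortened sketch of what such a file looks like (the authoritative contents for an assembly come from negspy itself):

```
chr1
chr2
...
chr22
chrX
chrY
chrM
```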

Bedpe-like files

2D annotations often have two pairs of start and end points, one for each axis:

chr10   74160000        74720000	chr10	74165000	74725000
chr12   120920000       121640000	chr12	120925000	121645000
chr15   86360000        88840000	chr15	86365000	88845000

These can be aggregated using clodius:

clodius aggregate bedpe \
	--assembly hg19 \
	--chr1-col 1 --from1-col 2 --to1-col 3 \
	--chr2-col 4 --from2-col 5 --to2-col 6 \
	--output-file domains.txt.multires \
	domains.txt

Once created, the aggregated file can be loaded into higlass using docker:

docker exec higlass-container python higlass-server/manage.py \
  ingest_tileset \
  --filename /tmp/domains.txt.multires.db \
  --filetype bed2ddb \
  --datatype 2d-rectangle-domains

Gene annotation files

Gene annotation files store information about exons, introns and gene names. They are sqlite3 db files with a schema that is compatible with higlass-server. Creating these files first requires a bed-like list of gene annotations:

chr5	176022802	176037131	GPRIN1	7	-	union_114787	114787	protein-coding	G protein regulated inducer of neurite outgrowth 1	176023808	176026835	176022802,176036999	176026878,176037131
chr8	56015016	56438710	XKR4	8	+	union_114786	114786	protein-coding	XK, Kell blood group complex subunit-related family, member 4	56015048	56436786	56015016,56270237,56435839	56015854,56270437,56438710

These can be generated from publicly available data as described in the clodius wiki. This bed-like file then needs to be aggregated for multiple resolutions and converted to an sqlite3 db file using clodius:

pip install clodius
clodius aggregate bedfile \
    --max-per-tile 20 --importance-column 5 \
    --assembly hg19 \
    --output-file gene-annotations.beddb \
    gene-annotations.bed

Once created, the gene annotations file can be loaded into higlass:

docker exec higlass-container python higlass-server/manage.py \
  ingest_tileset \
  --filename /tmp/gene-annotations.beddb \
  --filetype beddb \
  --datatype gene-annotation \
  --coordSystem hg19 \
  --name "Gene Annotations (hg19)"