Skip to content
/ wppi Public

Gene prioritisation tool using a weighted network methodology based on Gene Ontology and Human Phenotype Ontology annotations to infer closely related genes to given genes of interest.

License

Notifications You must be signed in to change notification settings

aGalhoz/wppi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

wppi

wppi is a functional genome-phenotype gene prioritisation tool according to their potential relevance in a disease, based on Gene Ontology (GO) and Human Phenotype Ontology (HPO) ontology databases.

Description

This package constructs a Protein-Protein Interaction (PPI) network in the neighborhood of user-defined genes of interest from the Omnipath webservice (https://omnipathdb.org/) through the OmnipathR R package. The built PPI is functionally weighted based on topological information (shared neighbors) and the functional similarities from GO and HPO ontology databases. The new candidate genes are scored by a Random Walk with Restart algorithm on the predefined weighted PPI.

In this repository it is provided the relevant functions to run and customize the workflow.

Installation

wppi is an available package published on Bioconductor and GitHub.

The package can be installed using the following commands:

From Bioconductor (last release)

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
    
BiocManager::install("wppi", version = '3.16')

## Development version with the lastest updates
BiocManager::install('wppi', version = 'devel')

From Github

require(devtools)
install_github('AnaGalhoz37/wppi')

Dependencies

The wppi package depends on the OmnipathR package. Since it relies on features more recent than the latest Bioconductor version, until the release of Bioconductor 3.16, it is recommended to install OmnipathR from git.

require(devtools)
install_github('saezlab/OmnipathR')

Getting started and user cases

We recomment to read our vignette and manual with examples of how to query functions using the wppi package.

The wppi pipeline can be leveraged for the following cases:

  1. The user has a list of known disease related genes, and wants to discover new genes in a PPI proximity based on all available GO and HPO ontology terms.
# example gene set
genes_interest <- c(
    'ERCC8', 'AKT3', 'NOL3', 'TTK',
    'GFI1B', 'CDC25A', 'TPX2', 'SHE'
)
scores <- score_candidate_genes_from_PPI(genes_interest)
scores
# # A tibble: 295 x 3
#    score gene_symbol uniprot
#    <dbl> <chr>       <chr>
#  1 0.247 KNL1        Q8NG31
#  2 0.247 HTRA2       O43464
#  3 0.247 KAT6A       Q92794
#  4 0.247 BABAM1      Q9NWV8
#  5 0.247 SKI         P12755
#  6 0.247 FOXA2       Q9Y261
#  7 0.247 CLK2        P49760
#  8 0.247 HNRNPA1     P09651
#  9 0.247 HK1         P19367
# 10 0.180 SH3RF1      Q7Z6J0
# # . with 285 more rows
  1. In addition to a set of genes, the user wants to customize the whole pipeline for a specific phenotype. For example, if interested in diabetes related annotations:
HPO_data <- wppi_hpo_data()
HPO_interest <- unique(dplyr::filter(HPO_data, grepl('Diabetes', Name))$Name)
scores_diabetes <- score_candidate_genes_from_PPI(genes_interest,HPO_interest)
scores
# # A tibble: 295 x 3
#    score gene_symbol uniprot
#    <dbl> <chr>       <chr>
#  1 0.247 KNL1        Q8NG31
#  2 0.247 HTRA2       O43464  
#  3 0.247 KAT6A       Q92794 
#  4 0.247 BABAM1      Q9NWV8 
#  5 0.247 SKI         P12755 
#  6 0.247 FOXA2       Q9Y261 
#  7 0.247 CLK2        P49760 
#  8 0.247 HNRNPA1     P09651 
#  9 0.247 HK1         P19367 
# 10 0.180 SH3RF1      Q7Z6J0 
# # … with 285 more rows

Untitled 7

All previous use cases can be altered to consider different subsets and aspects of GO (e.g., only Biological Processes), or different parameters used for the creation of the weighted PPI and the Random Walk with Restart algorithm.

Package feedback

To report bugs, ask questions or any feedback, please use the Github issue page.

About

Gene prioritisation tool using a weighted network methodology based on Gene Ontology and Human Phenotype Ontology annotations to infer closely related genes to given genes of interest.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages