RTX-KG2 Gateway

Enabling RTX-KG2 data access through various means.

Overview

RTX-KG2 provides a knowledge graph composed of many different data sources. The output data from the RTX-KG2 project can be analyzed more readily with specialized graph database tools. The brief overview below describes these technologies and how they are used with the RTX-KG2 data.

Graph Database Technologies

Kuzu: Kuzu is an embeddable property graph database system which provides querying capabilities through Cypher. Kuzu includes a Python package and related API which enables local queries.
- See rtx-kg2-gateway-kuzu-database-details.md for more information on the database schema and data.
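
As a rough illustration of the local querying described above, the following is a minimal sketch using the `kuzu` Python package. The helper names `build_sample_query` and `run_query` are hypothetical and are not part of this repository, and the query makes no assumptions about the RTX-KG2 schema (it simply returns a few nodes of any label). The database path is a placeholder for wherever a Kuzu database has been created.

```python
def build_sample_query(limit: int = 5) -> str:
    """Return a Cypher query which fetches up to `limit` nodes of any label.

    Hypothetical helper for illustration; not part of this repository.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"MATCH (n) RETURN n LIMIT {limit};"


def run_query(database_path: str, query: str) -> list:
    """Run a Cypher query against a local Kuzu database and collect the rows.

    Hypothetical helper for illustration; `database_path` is assumed to
    point at an existing Kuzu database.
    """
    # imported here so the helpers above can be used without kuzu installed
    import kuzu

    db = kuzu.Database(database_path)
    conn = kuzu.Connection(db)
    result = conn.execute(query)

    # iterate through the query result one row at a time
    rows = []
    while result.has_next():
        rows.append(result.get_next())
    return rows
```

Calling `run_query("path/to/kuzu_database", build_sample_query(3))` would then return up to three rows from the database. See the schema details document referenced above for the node and relationship names to use in more specific queries.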

Installation

Python

Use of the contents of this repository depends on Python being available on the system. One suggested way to install and manage Python is through pyenv (there are many other ways too!). Please reference the pyproject.toml file for the Python versions which are compatible with this project.

Poetry environment

Please use Poetry to install and run the Python environment for this project. The Poetry environment includes dependencies which help run IDE environments, manage the data, and run workflows. See the Poetry documentation for more information about installing Poetry within your environment.

# context: within the root of the repository
# after installing poetry, create the environment
poetry install

Development

Running and updating Jupyter notebooks

Please follow the installation steps above and then use a related Jupyter environment to open and explore the notebooks under the notebooks directory. These notebooks leverage Jupyter Lab extensions (such as jupytext) through the Poetry environment for this repository. Using the notebooks outside of Jupyter Lab may give a varied experience.

# context: within the root of the repository
# after creating poetry environment, run jupyter
poetry run jupyter lab

Executing sequences of Python modules as tasks

We use Poe the Poet to run tasks defined within pyproject.toml under the section [tool.poe.tasks*]. This allows multiple procedures to be defined and run in sequence as a task workflow.

For example, use the following to run the notebook_sample_data_generation task:

# context: within the root of the repository
# run the notebook_sample_data_generation task using poethepoet, as defined within `pyproject.toml`
poetry run poe notebook_sample_data_generation

Existing tasks:

- notebook_sample_data_generation: generates a sample Parquet dataset and adds it to a Kuzu database.
- notebook_full_data_generation: generates the full dataset and adds it to a Kuzu database.
- notebook_full_data_generation_with_metanames: generates the full dataset with metanames specificity and adds it to a Kuzu database in a similar fashion.

Citation and Acknowledgements

Data used by this repository includes RTX-KG2, which was published at the NCATS Biomedical Data Translator repository. Special thanks go to those mentioned in the RTX-KG2 credits. Further data acknowledgments may be found within the data sources documentation.
