CA2M: chromatin accessibility and mutations in cancer genomes

reimandlab/CA2M_v2

Predicting regional mutation burden in cancer genomes using chromatin accessibility (CA) and replication timing (RT)

This repository includes source code, tutorials, and processed datasets for the study:

Chromatin accessibility of primary human cancers ties regional mutational processes and signatures with tissues of origin.

Oliver Ocsenas and Jüri Reimand (2022) in revision.

Tutorials - Jupyter notebooks

  • 1_BigWigtoWindow.ipynb - mapping chromatin signals to megabase-scale windows
  • 2_MAFtoWindow.ipynb - mapping cancer mutations to megabase-scale windows
  • 3_CA2M_RF.ipynb - random forest models of megabase-scale mutation burden, chromatin accessibility and replication timing
  • 4_CA2M_RF_FeatureSelection_Tutorial.ipynb - selecting significant features predicting mutation rates
  • 5_CA2M_RF_SHAPscores.ipynb - computing feature importance scores (SHAP)
  • 6_CA2M_RF_EnrichedMutations_Tutorial.ipynb - detecting genomic regions with enriched mutations that are not explained by chromatin and replication timing alone
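
The window-mapping step in the tutorials above can be pictured with a minimal sketch. This is not the notebooks' code: the function name `bin_mutations`, the `(chromosome, position)` input format, and the 1-based coordinate convention are assumptions made for illustration only.

```python
from collections import Counter

# Assumed window width: 1 Mb, matching the "megabase-scale" wording above.
WINDOW_SIZE = 1_000_000


def bin_mutations(mutations, window_size=WINDOW_SIZE):
    """Count mutations per (chromosome, window_start) bin.

    `mutations` is an iterable of (chromosome, position) pairs.
    Positions are assumed to be 1-based (hypothetical input format).
    """
    counts = Counter()
    for chrom, pos in mutations:
        # Convert to 0-based, then round down to the start of the window.
        window_start = ((pos - 1) // window_size) * window_size
        counts[(chrom, window_start)] += 1
    return counts


# Toy input, not real data.
example = [
    ("chr1", 1),
    ("chr1", 999_999),
    ("chr1", 1_000_000),
    ("chr1", 1_000_001),
    ("chr2", 2_500_000),
]
counts = bin_mutations(example)
for (chrom, start), n in sorted(counts.items()):
    print(chrom, start, n)
```

Passing `window_size=100_000` would give the finer 100-kb binning under the same assumptions.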

Tutorials/data - files needed for tutorials

  • All_CA_RT_100KB_scale.csv.gz - CA and RT tracks for cancer and normal samples, 100-kbp resolution
  • All_CA_RT_1MB_scale.csv.gz - CA and RT tracks for cancer and normal samples, 1-Mbp resolution
  • NormalCA_RT_MBscale.csv.gz - CA and RT tracks for normal tissues and cell lines, 1-Mbp resolution
  • PCAWG_SNVbinned_100KB_scale.csv.gz - mutation burden in whole cancer genomes, 100-kbp resolution
  • PCAWG_SNVbinned_MBscale.csv.gz - mutation burden in whole cancer genomes, 1-Mbp resolution
  • PCAWG_breastcancer_SNV.MAF.gz - example file of somatic mutations in breast cancer for creating files above
  • SHAP_plot.pdf - example plot of feature importance scores (SHAP)
  • TCGA_BRCA_ATACSeq_chr1_2.bw - example file of chromatin accessibility in breast cancer for creating files above (chrs 1-2 only)
  • TumorCA_RT_MBscale.csv.gz - CA and RT tracks for cancer samples, 1-Mbp resolution
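
The `.csv.gz` files listed above can be read without decompressing them first. The sketch below uses only the Python standard library; the helper name `read_gzipped_csv` is hypothetical, and because the column layout of these files is not described here, the function returns the header and rows as plain strings rather than assuming any particular columns.

```python
import csv
import gzip


def read_gzipped_csv(path):
    """Return (header, rows) from a gzip-compressed CSV file.

    All values are returned as strings; no column names or types are assumed.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)       # first line is assumed to be a header row
        rows = [row for row in reader]
    return header, rows


# Example call (path assumed relative to the repository root):
# header, rows = read_gzipped_csv("Tutorials/data/TumorCA_RT_MBscale.csv.gz")
```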

All_code - entire code repository for the project; use at your own risk

Contact: oocsenas [@] oicr.on.ca ; juri.reimand [@] utoronto.ca
