zhenzuo2 · Jun 16, 2024 · f9811b1
1 parent 3477290
commit f9811b1
Showing 2 changed files with 95 additions and 4 deletions.
diff --git a/analysis/ContextSpecificNetworks.Rmd b/analysis/ContextSpecificNetworks.Rmd
@@ -0,0 +1,91 @@
+---
+title: "Context-specific networks"
+output:
+  workflowr::wflow_html:
+    toc: true
+    toc_float: true
+    theme: united
+    highlight: textmate
+editor_options:
+  chunk_output_type: console
+---
+
+# Load packages
+```{r message=FALSE, warning=FALSE}
+library(viper)
+library(aracne.networks)
+library(plyr)   # load plyr before dplyr so dplyr's verbs are not masked
+library(dplyr)
+library(stringr)
+library(Biobase)
+library(EnsDb.Hsapiens.v86)
+library(foreach)
+library(doParallel)
+dir.create("output/regulon/", showWarnings = FALSE, recursive = TRUE)
+dir.create("output/regulon/raw/", showWarnings = FALSE, recursive = TRUE)
+```
+
+# Extract context-specific networks
+ARACNe-AP was run on RNA-Seq datasets normalized using Variance-Stabilizing Transformation. The raw data was downloaded on April 15th, 2015 from the TCGA official website.
+
+## Extract ARACNe-inferred gene networks from TCGA tumor datasets
+```{r}
+items <- data(package="aracne.networks")$results[, "Item"]
+print(items)
+```
+
+```{r}
+df <- read.csv("data/omics_regulon_pairs.csv")
+```
+
+## Process network and save as adj files
+Export each network to a raw adj file.
+```{r message=FALSE, warning=FALSE}
+# Set up parallel backend
+registerDoParallel(10)
+
+# Loop through items in parallel
+foreach(item = items) %dopar% {
+    if (!file.exists(paste("output/regulon/", item, ".adj", sep = ""))) {
+        data <- get(item)
+        write.regulon(data, file = paste("output/regulon/raw/", item, ".adj", sep = ""))
+    }
+}
+
+```
+
+Convert Entrez Gene IDs to gene symbols and drop edges whose IDs could not be mapped.
+```{r}
+for (item in items) {
+    if (!file.exists(paste("output/regulon/", item, ".adj", sep = ""))) {
+        df <- read.csv(paste("output/regulon/raw/", item, ".adj", sep = ""),
+            sep = "\t")
+
+        geneID <- ensembldb::select(EnsDb.Hsapiens.v86, keys = as.character(df$Regulator),
+            keytype = "ENTREZID", columns = c("SYMBOL", "ENTREZID", "GENEID"))
+        df$Regulator <- plyr::mapvalues(df$Regulator, from = geneID$ENTREZID,
+            to = geneID$SYMBOL, warn_missing = FALSE)
+
+        geneID <- ensembldb::select(EnsDb.Hsapiens.v86, keys = as.character(df$Target),
+            keytype = "ENTREZID", columns = c("SYMBOL", "ENTREZID", "GENEID"))
+        df$Target <- plyr::mapvalues(df$Target, from = geneID$ENTREZID,
+            to = geneID$SYMBOL, warn_missing = FALSE)
+
+        can_be_integer <- function(x) {
+            suppressWarnings(!is.na(as.integer(x)))
+        }
+        f1 <- !sapply(df$Regulator, can_be_integer)
+        f2 <- !sapply(df$Target, can_be_integer)
+        df <- df[f1 & f2, ]
+
+        # Group by Regulator and join each group's Target/MoA pairs into one tab-separated string
+        df$temp <- paste(df$Target, df$MoA, sep = "\t")
+        result <- aggregate(temp ~ Regulator, data = df, FUN = function(x) paste(x, collapse = "\t"))
+        # Write one line per regulator to the final adj file
+        con <- file(paste("output/regulon/", item, ".adj",
+            sep = ""))
+        writeLines(paste(result$Regulator, result$temp, sep = "\t"), con)
+        close(con)
+    }
+}
+```
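
The conversion chunk above maps IDs to symbols, drops edges that could not be mapped, and collapses the remaining edges into one line per regulator. The sketch below reproduces that flow on a toy edge list so the resulting adj layout is easy to inspect. It is only an illustration and not part of the commit: the data, the `id2symbol` lookup, and the names `edges`, `collapsed`, and `adj_lines` are all hypothetical, and a named lookup vector stands in for the `ensembldb::select()` plus `plyr::mapvalues()` step. Unmapped IDs are detected here as `NA` rather than by the "still parses as an integer" check used in the chunk.

```r
# Toy edge list in the column layout read back from the raw adj export
# (Regulator, Target, MoA, likelihood). All values are invented.
edges <- data.frame(
  Regulator  = c(1, 1, 1, 2, 999),
  Target     = c(10, 11, 999, 10, 11),
  MoA        = c(0.9, -0.4, 0.1, 0.2, 0.7),
  likelihood = c(0.8, 0.5, 0.4, 0.6, 0.3)
)

# Hypothetical Entrez-to-symbol lookup standing in for the EnsDb query.
id2symbol <- c("1" = "TF_A", "2" = "TF_B", "10" = "GENE_X", "11" = "GENE_Y")

# Map IDs to symbols; IDs without an entry in the lookup become NA.
edges$Regulator <- unname(id2symbol[as.character(edges$Regulator)])
edges$Target    <- unname(id2symbol[as.character(edges$Target)])

# Drop edges where either end could not be mapped.
edges <- edges[!is.na(edges$Regulator) & !is.na(edges$Target), ]

# One line per regulator: the regulator, then tab-separated Target/MoA pairs.
edges$pair <- paste(edges$Target, edges$MoA, sep = "\t")
collapsed <- aggregate(pair ~ Regulator, data = edges,
                       FUN = function(x) paste(x, collapse = "\t"))
adj_lines <- paste(collapsed$Regulator, collapsed$pair, sep = "\t")

# Print the lines that would be written to the adj file.
cat(adj_lines, sep = "\n")
```

In this toy case the edge with the unmapped regulator `999` and the edge with the unmapped target `999` are both removed, which leaves two targets for `TF_A` and one for `TF_B`.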
diff --git a/analysis/index.Rmd b/analysis/index.Rmd
@@ -61,17 +61,17 @@ grid::grid.raster(img,width = 0.4, height = 1)
 ## Differential Abundance Analysis in Proteomics
 - Get proteomics data from <a href="https://proteomic.datacommons.cancer.gov/pdc/cptac-pancancer"> CPTAC </a>.  
   + Remove features with more than 20% zero/missing values. 
-- Define Differentially Proteins by two-side Wilcoxon Rank Sum and Signed Rank Tests with Benjamini & Hochberg correction (adjusted p values < 0.05)[See more](Differentially_Protein.html). 
+- Define differentially abundant proteins by two-sided Wilcoxon rank-sum and signed-rank tests with Benjamini-Hochberg correction (adjusted p-values < 0.05) [See more](Differentially_Protein.html). 
 
 ## Master Regulator Inference Algorithm (MARINa)
 *MARINa, a method to infer the activity of a given protein based on the differential expression/phosphorylation of the targets it regulates.*
 
 ### Input
 - Known kinases from KSEA of differentially phosphorylated peptides [See more](KSEA.html). 
 - Gene Expression data for normal and tumor [See more](Differentially_Gene.html).  
-- Gene level pathway network (Object of class regulon with XXX regulators, XXX targets and XXX interactions). The paper used Genome-wide cross-species interrogation of disease-specific regulatory networks from 10.1016/j.ccr.2014.03.017.
-- Phosphorylation data for normal and tumor.
-- Phosphorylation level pathway network (Object of class regulon with XXX regulators, XXX targets and XXX interactions).
+- Gene level pathway network: ARACNe-inferred gene networks from TCGA tumor datasets [See more](ContextSpecificNetworks.html). 
+- Phosphorylation data for normal and tumor [See more](Differentially_Phosphorylated_Site.html). 
+- Phosphorylation level pathway network (in-house, shared by Faye).
 
 ### Output
 - Transcription factors with differential activity (repression/activation).