Update README.md

pasmopy · Oct 3, 2021 · c3288ac
1 parent 7e095ed
commit c3288ac

Showing 3 changed files with 98 additions and 54 deletions.
diff --git a/README.md b/README.md
@@ -35,14 +35,42 @@ R:
 
 - [Integration of TCGA and CCLE data](#integration-of-tcga-and-ccle-data)
 
-- [Construction of a comprehensive model of the ErbB signaling network](#construction-of-a-comprehensive-model-of-the-ErbB-signaling-network)
+  - [Download TCGA clinical/subtype information](#download-tcga-clinicalsubtype-information)
+
+  - [Select samples in reference to clinical or subtype data](#select-samples-in-reference-to-clinical-or-subtype-data)
+
+  - [Download TCGA gene expression data (HTSeq-Counts)](#download-tcga-gene-expression-data-htseq-counts)
+
+  - [Download CCLE transcriptomic data](#download-ccle-transcriptomic-data)
+
+  - [Merge TCGA and CCLE data](#merge-tcga-and-ccle-data)
+
+  - [Normalize RNA-seq counts data](#normalize-rna-seq-counts-data)
+
+- [Construction of a comprehensive model of the ErbB signaling network](#construction-of-a-comprehensive-model-of-the-erbb-signaling-network)
+
+  - [From text into executable models](#from-text-into-executable-models)
+
+  - [Other tasks for incorporating gene expression levels](#other-tasks-for-incorporating-gene-expression-levels)
 
 - [Individualization of the mechanistic model](#individualization-of-the-mechanistic-model)
 
+  - [Parameter estimation](#parameter-estimation)
+
+  - [Patient-specific simulations](#patient-specific-simulations)
+
 - [Subtype classification based on the ErbB signaling dynamics](#subtype-classification-based-on-the-erbb-signaling-dynamics)
 
+  - [Extraction of response characteristics from patient-specific simulations](#extraction-of-response-characteristics-from-patient-specific-simulations)
+
+  - [Model-based patient stratification](#model-based-patient-stratification)
+
 - [Investigation of patient-specific pathway activities](#investigation-of-patient-specific-pathway-activities)
 
+  - [Sensitivity analysis](#sensitivity-analysis)
+
+  - [Drug response data analysis](#drug-response-data-analysis)
+
 ## Integration of TCGA and CCLE data
 
 ### Download TCGA clinical/subtype information
@@ -115,7 +143,7 @@ R:
 
     Output : `totalreadcounts.csv`
 
-### Normalization of RNA-seq counts data
+### Normalize RNA-seq counts data
 
 - Normalize the RNA-seq count data.
 - You can specify minimum and maximum values for truncating total read counts.
@@ -129,6 +157,8 @@ R:
 
 ## Construction of a comprehensive model of the ErbB signaling network
 
+### From text into executable models
+
 1. Use `pasmopy.Text2Model` to build a mechanistic model
 
    ```python
@@ -140,6 +170,19 @@ R:
    Text2Model(os.path.join("models", "erbb_network.txt")).convert()
    ```
 
+1. Rename `erbb_network/` to CCLE_name or TCGA_ID, e.g., `MCF7_BREAST` or `TCGA_3C_AALK_01A`
+
+   ```python
+   import shutil
+
+   shutil.move(
+       os.path.join("models", "erbb_network"),
+       os.path.join("models", "breast", "TCGA_3C_AALK_01A")
+   )
+   ```
+
+### Other tasks for incorporating gene expression levels
+
 1. Add weighting factors for each gene (prefix: `"w_"`) to [`name2idx/parameters.py`](models/breast/TCGA_3C_AALK_01A/name2idx/parameters.py)
 
    ```python
@@ -179,17 +222,6 @@ R:
    weighting_factors.set_search_bounds()
    ```
 
-1. Rename `erbb_network/` to CCLE_name or TCGA_ID, e.g., `MCF7_BREAST` or `TCGA_3C_AALK_01A`
-
-   ```python
-   import shutil
-
-   shutil.move(
-       os.path.join("models", "erbb_network"),
-       os.path.join("models", "breast", "TCGA_3C_AALK_01A")
-   )
-   ```
-
 1. Edit [`set_search_param.py`](models/breast/TCGA_3C_AALK_01A/set_search_param.py)
 
    ```python
@@ -261,7 +293,9 @@ R:
 
 ## Individualization of the mechanistic model
 
-### Use time-course datasets to train kinetic constants and weighting factors
+### Parameter estimation
+
+Here, we use phospho-protein time-course datasets to train kinetic constants and weighting factors.
 
 1. Build a mechanistic model to identify model parameters
 
@@ -324,7 +358,7 @@ R:
        )
    ```
 
-### Execute patient-specific models
+### Patient-specific simulations
 
 - Use `pasmopy.PatientModelSimulations`
 
@@ -353,26 +387,30 @@ R:
 
 ## Subtype classification based on the ErbB signaling dynamics
 
-1. Extract response characteristics from patient-specific simulations
+### Extraction of response characteristics from patient-specific simulations
 
-   ```python
-   simulations.subtyping(
-       fname=None,
-       dynamical_features={
-           "Phosphorylated_Akt": {"EGF": ["max"], "HRG": ["max"]},
-           "Phosphorylated_ERK": {"EGF": ["max"], "HRG": ["max"]},
-           "Phosphorylated_c-Myc": {"EGF": ["max"], "HRG": ["max"]},
-       }
-   )
-   ```
+- Execute `subtyping()`
+
+  ```python
+  simulations.subtyping(
+      fname=None,
+      dynamical_features={
+          "Phosphorylated_Akt": {"EGF": ["max"], "HRG": ["max"]},
+          "Phosphorylated_ERK": {"EGF": ["max"], "HRG": ["max"]},
+          "Phosphorylated_c-Myc": {"EGF": ["max"], "HRG": ["max"]},
+      }
+  )
+  ```
 
-1. Visualize patient classification by executing [`brca_heatmap.R`](classification/brca_heatmap.R)
+### Model-based patient stratification
 
-   ```bash
-   $ cd classification
-   # $ Rscript brca_heatmap.R [n_cluster: int] [figsize: tuple]
-   $ Rscript brca_heatmap.R 6 8,5
-   ```
+- Run [`brca_heatmap.R`](classification/brca_heatmap.R)
+
+  ```bash
+  $ cd classification
+  # $ Rscript brca_heatmap.R [n_cluster: int] [figsize: tuple]
+  $ Rscript brca_heatmap.R 6 8,5
+  ```
 
 ## Investigation of patient-specific pathway activities
 

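The `dynamical_features` argument in the `subtyping()` call above requests the `"max"` of each observable under EGF and HRG stimulation. The following is a minimal sketch of that kind of feature extraction, assuming simulated time courses are available as plain Python lists. The function name `extract_max_features`, the nested-dictionary layout, and the toy values are hypothetical and are not pasmopy's API or output.

```python
from typing import Dict, List

# Hypothetical layout: observable -> condition -> simulated time course.
TimeCourses = Dict[str, Dict[str, List[float]]]


def extract_max_features(
    simulated: TimeCourses,
    dynamical_features: Dict[str, Dict[str, List[str]]],
) -> Dict[str, float]:
    """Return one value per (observable, condition, metric) triple."""
    features: Dict[str, float] = {}
    for observable, conditions in dynamical_features.items():
        for condition, metrics in conditions.items():
            trace = simulated[observable][condition]
            for metric in metrics:
                # Only "max" is handled in this sketch.
                if metric != "max":
                    raise ValueError(f"unsupported metric in this sketch: {metric}")
                features[f"{observable}_{condition}_{metric}"] = max(trace)
    return features


# Toy time courses, invented for illustration only.
simulated = {
    "Phosphorylated_Akt": {"EGF": [0.0, 0.4, 0.2], "HRG": [0.0, 0.9, 0.7]},
    "Phosphorylated_ERK": {"EGF": [0.0, 0.8, 0.1], "HRG": [0.0, 0.6, 0.5]},
}
requested = {
    "Phosphorylated_Akt": {"EGF": ["max"], "HRG": ["max"]},
    "Phosphorylated_ERK": {"EGF": ["max"], "HRG": ["max"]},
}
features = extract_max_features(simulated, requested)
print(features)
```

Each patient then contributes one row of such features, which is the kind of table a clustering script like `brca_heatmap.R` could take as input.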
diff --git a/drug_response/README.md b/drug_response/README.md
@@ -1,6 +1,6 @@
-# CCLE drug response data analysis and visualization
+## CCLE drug response data analysis and visualization
 
-## Drug response data
+### Drug response data
 
 #### Description:
 
@@ -14,43 +14,49 @@
 
 - Barretina, J., Caponigro, G., Stransky, N. _et al_. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. _Nature_ **483**, 603–607 (2012). https://doi.org/10.1038/nature11003
 
-## Usage
-1. Prepare sample.txt and gene.txt, which contain the names of the samples and genes, respectively.
+### Usage
+
+1. Prepare [`sample.txt`](data/sample.txt) and [`gene.txt`](data/gene.txt), which contain the names of the samples and genes, respectively.
+
 1. Open R
 
    ```bash
    $ R
-   ```  
+   ```
 
-1. Load CCLE_normalization.R
+1. Load `CCLE_normalization.R`
 
    ```R
    source("calc_erbb_ratio.R")
    ```
 
-1. Read sample.txt and gene.txt  
-If you want data for a specific samples or genes, you will need to create a list of those samples as `sample.txt` or `gene.txt` and read it here. 
+1. Read [`sample.txt`](data/sample.txt) and [`gene.txt`](data/gene.txt)
+
+   If you want data for specific samples or genes, list them in [`sample.txt`](data/sample.txt) or [`gene.txt`](data/gene.txt) and read the file here.
+
    ```R
    gene <- scan("gene.txt", what="character")
    #sample <- scan("sample.txt", what="character")
    ```
 
 1. Create TPM/RLE normalized RNA-seq data matrix of selected samples and genes
+
    ```R
    CCLEnormalization(gene, sample = NULL)
    ```
-   Output: CCLE_normalized.csv (TPM/RLE normalized RNA-seq data for specific samples or genes)
 
-1. Calculate receotor ratio  
-Calculate EGFR/(ERBB2+ERBB3+ERBB4) using CCLE_normalized.csv.  
-If you want to run this coad, you need to prepare gene.txt containing the names of these 4 genes.
+   Output: [`CCLE_normalized.csv`](data/CCLE_normalized.csv) (TPM/RLE normalized RNA-seq data for specific samples or genes)
+
+1. Calculate EGFR/(ERBB2+ERBB3+ERBB4) using [`CCLE_normalized.csv`](data/CCLE_normalized.csv)
+
+   If you want to run this code, you need to prepare `gene.txt` containing the names of these 4 genes.
+
    ```R
    CCLE_normalized <- read.csv("CCLE_normalized.csv", row.names = 1)
    receptor_ratio(data = CCLE_normalized, num = 30)
    ```
-   Output: "ErbB_expression_ratio.csv"
 
-
+   Output: [`ErbB_expression_ratio.csv`](data/ErbB_expression_ratio.csv)
 
 1. Drug response analysis and visualization
 

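The receptor ratio EGFR/(ERBB2+ERBB3+ERBB4) described in the usage steps above can be illustrated with a short Python sketch. This is not the repository's R function `receptor_ratio()`. The function name `egfr_ratio`, the gene-by-sample dictionary layout, and the sample names and values are assumptions made for illustration.

```python
from typing import Dict

# Genes summed in the denominator of the ratio.
ERBB_DENOMINATOR = ("ERBB2", "ERBB3", "ERBB4")


def egfr_ratio(expression: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Compute EGFR / (ERBB2 + ERBB3 + ERBB4) per sample.

    `expression` maps gene -> sample -> normalized expression
    (assumed layout, not necessarily that of CCLE_normalized.csv).
    """
    ratios: Dict[str, float] = {}
    for sample, egfr in expression["EGFR"].items():
        denominator = sum(expression[gene][sample] for gene in ERBB_DENOMINATOR)
        # Avoid division by zero when none of ERBB2-4 is expressed.
        ratios[sample] = egfr / denominator if denominator > 0 else float("nan")
    return ratios


# Toy values, invented for illustration only.
expression = {
    "EGFR": {"SAMPLE_A": 2.0, "SAMPLE_B": 9.0},
    "ERBB2": {"SAMPLE_A": 5.0, "SAMPLE_B": 1.0},
    "ERBB3": {"SAMPLE_A": 2.0, "SAMPLE_B": 1.0},
    "ERBB4": {"SAMPLE_A": 1.0, "SAMPLE_B": 1.0},
}
ratios = egfr_ratio(expression)
print(ratios)
```

This also shows why `gene.txt` must contain all four gene names: the ratio is undefined if any of them is missing from the normalized matrix.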
diff --git a/transcriptomic_data/README.md b/transcriptomic_data/README.md
@@ -1,15 +1,13 @@
-## Transcriptomic data processing
+## Transcriptomic data integration
+
+Integrating TCGA and CCLE data for parameterization and individualization of the mechanistic model.
 
 ### Requirements
 
 | Language | Dependent packages                                                     |
 | -------- | ---------------------------------------------------------------------- |
 | R        | dplyr, edgeR, sva, tibble, data.table, stringr, TCGAbiolinks , biomaRt |
 
-## Transcriptomic data integration
-
-Integrating TCGA and CCLE data for parameterization and individualization of the mechanistic model.
-
 ### Download TCGA clinical/subtype information
 
 - Read `integration.R`
@@ -47,13 +45,15 @@ Integrating TCGA and CCLE data for parameterization and individualization of the
                    age_at_initial_pathologic_diagnosis < 80)
   ```
 
-  **type** :  
+  **Parameters** :
+
+  `type` :  
    You can choose `clinical` or `subtype`. With `clinical`, patients are selected from `<TCGA Study Abbreviation>_clinical.csv`; with `subtype`, they are selected from `<TCGA Study Abbreviation>_subtype.csv`. Run `outputClinical()` or `outputSubtype()`, respectively, before running this code.
 
-  **ID** :  
+  `ID` :  
    Name of the column that contains the patient IDs (e.g., TCGA-E2-A14U, TCGA-E9-A1RC, ...) in the .csv file selected by `type`.
 
-  **After line 3** :  
+  `*args` :  
    You can set multiple conditions for selecting samples.
 
   | Expression                | Meaning                                                |