pasmopy · himoto · Jun 25, 2021 · May 31, 2021 · Jun 2, 2021 · Jun 3, 2021
diff --git a/README.md b/README.md
@@ -1,7 +1,4 @@
-# breast_cancer
-
-[![Actions Status](https://github.com/pasmopy/breast_cancer/workflows/Tests/badge.svg)](https://github.com/pasmopy/breast_cancer/actions)
-[![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)](https://opensource.org/licenses/Apache-2.0)
+# Breast cancer [![Actions Status](https://github.com/pasmopy/breast_cancer/workflows/Tests/badge.svg)](https://github.com/pasmopy/breast_cancer/actions)
 
 Workflow for classifying breast cancer subtypes based on intracellular signaling dynamics.
 
@@ -10,18 +7,20 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
 | Language      | Dependent packages                                             |
 | ------------- | -------------------------------------------------------------- |
 | Python >= 3.7 | See [`requirements.txt`](requirements.txt)                     |
-| Julia >= 1.5  | [BioMASS.jl](https://github.com/himoto/BioMASS.jl)             |
+| Julia >= 1.5  | [BioMASS.jl](https://github.com/biomass-dev/BioMASS.jl)        |
 | R >= 4.0      | TCGAbiolinks, sva, biomaRt, edgeR, ComplexHeatmap, viridisLite |
 
 ## Table of contents
 
-- [Transcriptomic data integration](#integration-of-tcga-and-ccle-data)
+- [Integration of TCGA and CCLE data](#integration-of-tcga-and-ccle-data)
 
-- [Model construction](#construction-of-a-comprehensive-model-of-the-ErbB-signaling-network)
+- [Construction of a comprehensive model of the ErbB signaling network](#construction-of-a-comprehensive-model-of-the-ErbB-signaling-network)
 
 - [Individualization of the mechanistic model](#individualization-of-the-mechanistic-model)
 
-- [Subtype classification](#subtype-classification-based-on-the-ErbB-signaling-dynamics)
+- [Subtype classification based on the ErbB signaling dynamics](#subtype-classification-based-on-the-ErbB-signaling-dynamics)
+
+- [Investigation of patient-specific pathway activities](#investigation-of-patient-specific-pathway-activities)
 
 ## Integration of TCGA and CCLE data
 
@@ -39,9 +38,12 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
 1. Use `pasmopy.Text2Model` to build a mechanistic model
 
    ```python
+   import os
+
    from pasmopy import Text2Model
 
-   Text2Model("models/erbb_network.txt").convert()
+
+   Text2Model(os.path.join("models", "erbb_network.txt")).convert()
    ```
 
 1. Rename `erbb_network/` to CCLE_name or TCGA_ID, e.g., `MCF7_BREAST` or `TCGA_3C_AALK_01A`
@@ -60,11 +62,12 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
    from . import __path__
    from .name2idx import C, V
    from .set_model import initial_values, param_values
+
 
    incorporating_gene_expression_levels = Individualization(
        parameters=C.NAMES,
        species=V.NAMES,
-       transcriptomic_data="transcriptomic_data/TPM_RLE_postComBat.csv",
+       transcriptomic_data=os.path.join("transcriptomic_data", "TPM_RLE_postComBat.csv"),
        gene_expression={
            "ErbB1": ["EGFR"],
            "ErbB2": ["ERBB2"],
@@ -123,9 +126,12 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
 1. Build a mechanistic model to identify model parameters
 
    ```python
+   import os
+
    from pasmopy import Text2Model
 
-   Text2Model("models/erbb_network.txt", lang="julia").convert()
+
+   Text2Model(os.path.join("models", "erbb_network.txt"), lang="julia").convert()
    ```
 
 1. Add time-series data to [`experimental_data.jl`](training/erbb_network_jl/experimental_data.jl)
@@ -144,21 +150,25 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
 
    When finished, run:
 
-   ```julia
+   ```bash
    $ julia
+   ```
+
+   ```julia
+   using BioMASS
 
-   julia> using BioMASS
-   julia> param2biomass("training")
+   param2biomass("training")
    ```
 
    And you will get [`dat2npy/out/`](https://github.com/pasmopy/breast_cancer/tree/master/training/erbb_network_jl/dat2npy/out).
-   This is the optimized parameter sets that [`biomass`](https://github.com/okadalabipr/biomass) can recognize and read.
+   This is the estimated parameter sets that [`biomass`](https://github.com/biomass-dev/biomass) can recognize and read.
    Copy `out/` to each patient-specific model folder via:
 
    ```python
    import os
    import shutil
 
+
    breast_cancer_models = []
    path_to_models = os.path.join("models", "breast")
    for f in os.listdir(path_to_models):
@@ -169,7 +179,7 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
    # Set optimized parameters
    for model in breast_cancer_models:
        shutil.copytree(
-           os.path.join("training", "dat2npy", "out"),
+           os.path.join("training", "erbb_network_jl", "dat2npy", "out"),
            os.path.join(path_to_models, f"{model}", "out"),
        )
    ```
@@ -224,6 +234,62 @@ Workflow for classifying breast cancer subtypes based on intracellular signaling
    $ Rscript brca_heatmap.R 6 8,5
    ```
 
+## Investigation of patient-specific pathway activities
+
+### Sensitivity analysis
+
+- Calculate sensitivity coefficients by varying the amount of each nonzero species
+
+  ```python
+  from pasmopy import PatientModelAnalyses
+
+  import models.breast
+
+  with open ("selected_tnbc.txt" mode="r") as f:
+      TNBC_ID = f.read().splitlines()
+  analyses = PatientModelAnalyses(
+      models.breast.__package__,
+      TNBC_ID,
+      biomass_kws={
+          "metric": "maximum", "style": "heatmap", "options": {"excluded_initials": ["PIP2"]}
+      },
+  )
+  ```
+
+### Drug response data analysis
+
+1. Calculation of the ErbB receptor expression ratio
+
+   ```bash
+   $ cd drug_response
+   $ Rscript data/calc_erbb_ratio.R
+   ```
+
+1. Drug response analysis and visualization
+
+   ```python
+   import os
+
+   import pandas as pd
+   from drug.database import CancerCellLineEncyclopedia
+
+
+   ccle = CancerCellLineEncyclopedia()
+
+   erbb_expression_ratio = pd.read_csv(
+       os.path.join("data", "ErbB_expression_ratio.csv"),
+       index_col=0
+    )
+   compounds = ["Erlotinib", "Lapatinib", "AZD6244", "PD-0325901"]
+   for compound in compounds:
+       ccle.save_all(erbb_expression_ratio, compound)
+   ```
+
+## Author
+
+- Hiroaki Imoto
+- Sawa Yamashiro
+
 ## License
 
 [Apache License 2.0](LICENSE)
diff --git a/drug_response/README.md b/drug_response/README.md
@@ -25,15 +25,17 @@
 1. Drug response analysis and visualization
 
    ```python
+   import os
    import pandas as pd
    from drug.database import CancerCellLineEncyclopedia
 
    ccle = CancerCellLineEncyclopedia()
 
-   erbb_expression_ratio = pd.read_csv("./data/ErbB_expression_ratio.csv", index_col=0)
-
+   erbb_expression_ratio = pd.read_csv(
+      os.path.join("data", "ErbB_expression_ratio.csv"),
+      index_col=0
+   )
    compounds = ["Erlotinib", "Lapatinib", "AZD6244", "PD-0325901"]
-
    for compound in compounds:
        ccle.save_all(erbb_expression_ratio, compound)
    ```
diff --git a/drug_response/drug/database.py b/drug_response/drug/database.py
@@ -129,6 +129,7 @@ def save_dose_response_curve(
         erbb_expression_ratio: pd.DataFrame,
         drug: str,
         *,
+        config: Optional[dict] = None,
         show_individual: bool = False,
         error: str = "std",
         save_format: str = "pdf",
@@ -162,14 +163,17 @@ def save_dose_response_curve(
             [drug] * len(bottom30),
         )
 
-        plt.figure(figsize=(7, 5))
-        plt.rcParams['font.size'] = 24
-        plt.rcParams['font.family'] = 'Arial'
-        plt.rcParams['axes.linewidth'] = 2.4
-        plt.rcParams['xtick.major.width'] = 2.4
-        plt.rcParams['ytick.major.width'] = 2.4
-        plt.rcParams['lines.linewidth'] = 3
-        plt.rcParams["lines.markersize"] = 1
+        if config is None:
+            config = {}
+        config.setdefault('figure.figsize', (7, 5))
+        config.setdefault('font.size', 24)
+        config.setdefault('font.family', 'Arial')
+        config.setdefault('axes.linewidth', 2.4)
+        config.setdefault('xtick.major.width', 2.4)
+        config.setdefault('ytick.major.width', 2.4)
+        config.setdefault('lines.linewidth', 3)
+        config.setdefault("lines.markersize", 1)
+        plt.rcParams.update(config)
         plt.gca().spines["right"].set_visible(False)
         plt.gca().spines["top"].set_visible(False)
 
@@ -279,6 +283,7 @@ def save_activity_area(
         erbb_expression_ratio: pd.DataFrame,
         drug: str,
         *,
+        config: Optional[dict] = None,
         save_format: str = "pdf",
     ) -> None:
         self._check_args(drug, save_format)
@@ -315,15 +320,20 @@ def save_activity_area(
             ],
         )
 
-        plt.figure(figsize=(3, 5))
-        plt.rcParams['font.family'] = 'Arial'
-        plt.rcParams['mathtext.it'] = 'Arial:italic'
-        plt.rcParams['font.size'] = 24
-        plt.rcParams['axes.linewidth'] = 2.4
-        plt.rcParams['xtick.major.width'] = 2.4
-        plt.rcParams['ytick.major.width'] = 2.4
-        plt.rcParams['lines.linewidth'] = 1.8
-        plt.rcParams['lines.markersize'] = 11
+        if config is None:
+            config = {}
+        config.setdefault('figure.figsize', (3, 5))
+        config.setdefault('font.family', 'Arial')
+        config.setdefault('mathtext.it', 'Arial:italic')
+        config.setdefault('font.size', 24)
+        config.setdefault('axes.linewidth', 2.4)
+        config.setdefault('xtick.major.width', 2.4)
+        config.setdefault('ytick.major.width', 2.4)
+        config.setdefault('lines.linewidth', 1.8)
+        config.setdefault('lines.markersize', 11)
+
+        plt.rcParams.update(config)
+
         plt.gca().spines["right"].set_visible(False)
         plt.gca().spines["top"].set_visible(False)
 
@@ -333,10 +343,9 @@ def save_activity_area(
             data=(activity_area[0], activity_area[1]),
             palette=sns.color_palette(['darkmagenta', 'goldenrod']),
         )
-        #sns.stripplot(data=(activity_area[0], activity_area[1]),color=".26")
-        sns.swarmplot(data=(activity_area[0], activity_area[1]),color=".48")
-        #sns.boxenplot(data=(activity_area[0], activity_area[1]), palette=sns.color_palette(['#76106cff', '#f7a426ff']))
-        #sns.violinplot(data=(activity_area[0], activity_area[1]), palette=sns.color_palette(['mediumslateblue', 'tomato']))
+        sns.swarmplot(
+            data=(activity_area[0], activity_area[1]),color=".48"
+        )
         plt.xticks([0, 1], ["EGFR\nhigh", "EGFR\nlow"])
         plt.ylabel("Activity area", fontsize=28)
         plt.suptitle(
@@ -360,7 +369,19 @@ def save_all(
         erbb_expression_ratio: pd.DataFrame,
         drug: str,
         *,
+        config_dose_response_curve: Optional[dict] = None,
+        config_activity_area: Optional[dict] = None,
         save_format: str = "pdf",
     ) -> None:
-        self.save_dose_response_curve(erbb_expression_ratio, drug, save_format=save_format)
-        self.save_activity_area(erbb_expression_ratio, drug, save_format=save_format)
+        self.save_dose_response_curve(
+            erbb_expression_ratio,
+            drug,
+            config=config_dose_response_curve,
+            save_format=save_format,
+        )
+        self.save_activity_area(
+            erbb_expression_ratio,
+            drug,
+            config=config_activity_area,
+            save_format=save_format,
+        )
diff --git a/models/breast/TCGA_3C_AALK_01A/__init__.py b/models/breast/TCGA_3C_AALK_01A/__init__.py
@@ -1,66 +1,5 @@
-import os
-from typing import List
-
-from .fitness import objective
+from .fitness import OptimizationProblem
 from .name2idx import C, V
-from .observable import ExperimentalData, NumericalSimulation, observables
 from .reaction_network import ReactionNetwork
 from .set_model import initial_values, param_values
-from .set_search_param import SearchParam
 from .viz import Visualization
-
-
-def _check_duplicate(names: List[str], object: str) -> List[str]:
-    duplicate = [name for name in set(names) if names.count(name) > 1]
-    if not duplicate:
-        return names
-    else:
-        raise NameError(f"Duplicate {object}: {', '.join(duplicate)}")
-
-
-class BioMassModel(object):
-    def __init__(self) -> None:
-        self.path = __path__[0]
-        self.parameters = _check_duplicate(C.NAMES, "parameters")
-        self.species = _check_duplicate(V.NAMES, "species")
-        self.obs = _check_duplicate(observables, "observables")
-        self.pval = param_values
-        self.ival = initial_values
-        self.obj_func = objective
-        self.sim = NumericalSimulation()
-        self.exp = ExperimentalData()
-        self.viz = Visualization()
-        self.sp = SearchParam()
-        self.rxn = ReactionNetwork()
-
-
-def show_info() -> None:
-    model_name = os.path.basename(os.path.dirname(__file__))
-    print(
-        f"{model_name} information\n" + ("-" * len(model_name)) + "------------\n"
-        f"{len(BioMassModel().species):d} species\n"
-        f"{len(BioMassModel().parameters):d} parameters, "
-        f"of which {len(BioMassModel().sp.idx_params):d} to be estimated"
-    )
-
-
-def create() -> BioMassModel:
-    model = BioMassModel()
-    if model.sim.normalization:
-        for obs_name in model.obs:
-            if (
-                isinstance(model.sim.normalization[obs_name]["timepoint"], int)
-                and not model.sim.t[0]
-                <= model.sim.normalization[obs_name]["timepoint"]
-                <= model.sim.t[-1]
-            ):
-                raise ValueError("Normalization timepoint must lie within sim.t.")
-            if not model.sim.normalization[obs_name]["condition"]:
-                model.sim.normalization[obs_name]["condition"] = model.sim.conditions
-            else:
-                for c in model.sim.normalization[obs_name]["condition"]:
-                    if c not in model.sim.conditions:
-                        raise ValueError(
-                            f"Normalization condition '{c}' is not defined in sim.conditions."
-                        )
-    return model