formatting #210

Merged
merged 1 commit into from
Aug 5, 2024
96 changes: 66 additions & 30 deletions ExampleWorkflow.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -197,7 +197,7 @@ imp_conf <- c("InRatioCor.6", "InRatioCor.15", "InRatioCor.24", "InRatioCor.35",
"InRatioCor.58", "PmEd2") # empirical example

prebalance_stats <- assessBalance(obj = obj,
data = data, # required
data = data,
balance_thresh = balance_thresh,
imp_conf = imp_conf,
save.out = TRUE)
Expand All @@ -207,11 +207,20 @@ prebalance_stats <- assessBalance(obj = obj,
Inspect
```{r}

summary(prebalance_stats, i = 1, t = 1, save.out = TRUE)
summary(prebalance_stats,
i = 1,
t = 1,
save.out = TRUE)

plot(prebalance_stats, i = 1, t = 1, save.out = TRUE)
plot(prebalance_stats,
i = 1,
t = 1,
save.out = TRUE)

print(prebalance_stats, i = 2, t = 1, save.out = TRUE) # NOTE: "F_b" for one ti covar? covariate?
print(prebalance_stats,
i = 2,
t = 1,
save.out = TRUE) # NOTE: "F_b" for one ti covar? covariate?
```


Expand Down Expand Up @@ -246,7 +255,7 @@ We recommend users use the short formulas to create IPTW weights using all the a

formulas <- short_formulas

method <- "cbps" # this takes some time to run
method <- "cbps"

weights.cbps <- createWeights(obj = obj,
data = data,
Expand All @@ -259,9 +268,12 @@ weights.cbps <- createWeights(obj = obj,
Inspect
```{r}

print(weights.cbps, i = 2)
print(weights.cbps,
i = 2)

plot(weights.cbps, i = 1, save.out = TRUE)
plot(weights.cbps,
i = 1,
save.out = TRUE)

summary(weights.cbps[[1]])

Expand All @@ -276,9 +288,11 @@ weights.glm <- createWeights(obj = obj,
formulas = formulas,
method = method,
save.out = TRUE)
print(weights.glm, i = 2)
print(weights.glm,
i = 2)

plot(weights.glm, i = 2)
plot(weights.glm,
i = 2)

method <- "gbm"

Expand All @@ -287,20 +301,24 @@ weights.gbm <- createWeights(obj = obj,
formulas = formulas,
method = method,
save.out = TRUE)
print(weights.gbm, i = 2)
print(weights.gbm,
i = 2)

plot(weights.gbm, i = 2)
plot(weights.gbm,
i = 2)

method <- "bart"

weights.bart <- createWeights(obj = obj,
data = data,
formulas = formulas,
method = method,
save.out - TRUE)
print(weights.bart, i = 2)
save.out = TRUE)
print(weights.bart,
i = 2)

plot(weights.bart, i = 2)
plot(weights.bart,
i = 2)

method <- "super"

Expand All @@ -309,9 +327,11 @@ weights.super <- createWeights(obj = obj,
formulas = formulas,
method = method,
save.out = TRUE)
print(weights.super, i = 2)
print(weights.super,
i = 2)

plot(weights.super, i = 2)
plot(weights.super,
i = 2)

```

Expand Down Expand Up @@ -418,7 +438,8 @@ First, assess how well each of the IPTW achieve balance for all measured confoun

summary(balance_stats.cbps)

plot(balance_stats.cbps, t = 1)
plot(balance_stats.cbps,
t = 1)

print(balance_stats.cbps)

Expand Down Expand Up @@ -468,11 +489,13 @@ final_weights <- createWeights(data = data,

Inspect
```{r}
print(final_weights, i = 1)
print(final_weights,
i = 1)

summary(final_weights[[1]])

plot(final_weights, i = 1)
plot(final_weights,
i = 1)
```


Expand All @@ -497,9 +520,12 @@ trim_weights <- trimWeights(obj = obj,

Inspect
```{r}
print(trim_weights, i = 1)
print(trim_weights,
i = 1)

plot(trim_weights, i = 1, save.out = TRUE)
plot(trim_weights,
i = 1,
save.out = TRUE)

```

Expand Down Expand Up @@ -551,11 +577,15 @@ final_balance_stats <- assessBalance(data = data,
Inspect
```{r}

summary(final_balance_stats, save.out = TRUE)
summary(final_balance_stats,
save.out = TRUE)

plot(final_balance_stats, t = 1, save.out = TRUE)
plot(final_balance_stats,
t = 1,
save.out = TRUE)

print(final_balance_stats, save.out = TRUE)
print(final_balance_stats,
save.out = TRUE)


# manually list remaining imbalanced covariates that are time-invariant or time-varying at t=1 for use in Step 5
Expand Down Expand Up @@ -625,7 +655,9 @@ models <- fitModel(data = data,
Inspect
```{r}

print(models, i = 1, save.out = TRUE)
print(models,
i = 1,
save.out = TRUE)

```

Expand Down Expand Up @@ -684,7 +716,7 @@ out_lab <- "Behavior Problems" # empirical example

colors <- c("Dark2") # empirical example

model <- models # output from fitModel
model <- models

results <- compareHistories(obj = obj,
fit = model,
Expand All @@ -698,11 +730,14 @@ results <- compareHistories(obj = obj,

Inspect
```{r}
print(results, save.out = TRUE)
print(results,
save.out = TRUE)

summary(results, save.out = TRUE)
summary(results,
save.out = TRUE)

plot(results, save.out = TRUE)
plot(results,
save.out = TRUE)
```


Expand Down Expand Up @@ -739,7 +774,8 @@ results.s2 <- compareHistories(obj = obj,
We are grateful to the authors of many existing packages that *devMSMs* draws from!

```{r}
grateful::cite_packages(out.dir = home_dir, omit = c("devMSMs", "devMSMsHelpers"),
grateful::cite_packages(out.dir = home_dir,
omit = c("devMSMs", "devMSMsHelpers"),
out.format = "docx")
```

4 changes: 2 additions & 2 deletions README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ library(tinytable)

<br>

Those who study and work with humans are fundamentally interested in questions of causation. More specifically, scientists, clinicians, educators, and policymakers alike are often interested in *causal processes* involving questions about when (timing) and to what extent (dose) different factors influence human functioning and development, in order to inform our scientific understanding and improve people's lives. However, for many, conceptual, methodological, and practical barriers have prevented the use of methods for causal inference developed in other fields.
Those who study and work with humans are fundamentally interested in questions of causation. More specifically, scientists, clinicians, educators, and policymakers alike are often interested in *causal processes* involving questions about when (timing) and at what levels (dose) different factors influence human functioning and development, in order to inform our scientific understanding and improve people's lives. However, for many, conceptual, methodological, and practical barriers have prevented the use of methods for causal inference developed in other fields.
<br>

The goal of this *devMSMs* package and accompanying tutorial paper, *Investigating Causal Questions in Human Development Using Marginal Structural Models: A Tutorial Introduction to the devMSMs Package in R* (*insert preprint link here*), is to provide a set of tools for implementing marginal structural models (**MSMs**; Robins et al., 2000).
Expand All @@ -49,7 +49,7 @@ Core features of *devMSMs* include:

- an accompanying suite of <a href="https://github.com/istallworthy/devMSMsHelpers">helper functions</a> to assist users in preparing and inspecting their data prior to the implementation of *devMSMs*

- executable, step-by-step user guidance for implementing the *deveMSMs* worflow and preliminary steps in the form of vignettes geared toward users of all levels of R programming experience, along with a <a href="https://github.com/istallworthy/devMSMs/blob/main/exampleWorkflow.Rmd">R markdown template file</a>
- executable, step-by-step user guidance for implementing the *devMSMs* workflow and preliminary steps in the form of vignettes geared toward users of all levels of R programming experience, along with an <a href="https://github.com/istallworthy/devMSMs/blob/main/ExampleWorkflow.Rmd">R markdown template file</a>

- a brief conceptual introduction, example empirical application, and additional resources in the accompanying tutorial paper

6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
Those who study and work with humans are fundamentally interested in
questions of causation. More specifically, scientists, clinicians,
educators, and policymakers alike are often interested in *causal
processes* involving questions about when (timing) and to what extent
processes* involving questions about when (timing) and at what levels
(dose) different factors influence human functioning and development, in
order to inform our scientific understanding and improve people’s lives.
However, for many, conceptual, methodological, and practical barriers
Expand Down Expand Up @@ -76,10 +76,10 @@ Core features of *devMSMs* include:
functions</a> to assist users in preparing and inspecting their data
prior to the implementation of *devMSMs*

- executable, step-by-step user guidance for implementing the *deveMSMs*
- executable, step-by-step user guidance for implementing the *devMSMs*
workflow and preliminary steps in the form of vignettes geared toward
users of all levels of R programming experience, along with an
<a href="https://github.com/istallworthy/devMSMs/blob/main/exampleWorkflow.Rmd">R
<a href="https://github.com/istallworthy/devMSMs/blob/main/ExampleWorkflow.Rmd">R
markdown template file</a>

- a brief conceptual introduction, example empirical application, and
69 changes: 35 additions & 34 deletions vignettes/Data_Requirements.qmd
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ vignette: >
%\VignetteEngine{knitr::rmarkdown}
---

```{r, include = FALSE}
```{r, include = FALSE, warning = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
Expand All @@ -23,7 +23,7 @@ options(rmarkdown.html_vignette.check_title = FALSE)

<br>

This vignette is designed to aid users in preparing their data for use with *devMSMs*. Users should first view the <a href="https://istallworthy.github.io/devMSMs/articles/Terminology.html">Terminology</a> and href="https://istallworthy.github.io/devMSMs/articles/Specify_Core_Inputs.html"\>Specifying Core Inputs</a> vignettes.
This vignette is designed to aid users in preparing their data for use with *devMSMs*. Users should first view the <a href="https://istallworthy.github.io/devMSMs/articles/Terminology.html">Terminology</a> and <a href="https://istallworthy.github.io/devMSMs/articles/Specify_Core_Inputs.html">Specifying Core Inputs</a> vignettes.

The code contained in this vignette is also available, integrated with code from the other vignettes, in the <a href="https://github.com/istallworthy/devMSMs/blob/main/ExampleWorkflow.Rmd">ExampleWorkflow.Rmd file</a>.

Expand Down Expand Up @@ -98,6 +98,7 @@ library(devMSMsHelpers)
if (!require("devtools")) install.packages("devtools", quiet = TRUE)
if (!require("devMSMs")) devtools::install_github("istallworthy/devMSMs", quiet = TRUE)
```
<br>

## Exploring Data

Expand Down Expand Up @@ -290,21 +291,21 @@ factor_confounders <- c("state", "TcBlac2", "BioDadInHH2", "HomeOwnd", "PmBlac2"
integer_confounders <- c("KFASTScr", "PmEd2", "RMomAgeU", "SWghtLB", "peri_health",
"caregiv_health", "gov_assist", "B18Raw", "EARS_TJo", "MDI")

# data_long_f <- formatLongData(
# data = data_long,
# exposure = exposure,
# outcome = outcome,
# sep = "\\.",
# time_var = "WAVE",
# id_var = "ID",
# missing = NA,
# factor_confounders = factor_confounders,
# integer_confounders = integer_confounders,
# home_dir = home_dir,
# save.out = save.out
# )

# head(data_long_f, n = c(5, 10))
data_long_f <- formatLongData(
data = data_long,
exposure = exposure,
outcome = outcome,
sep = "\\.",
time_var = "WAVE",
id_var = "ID",
missing = NA,
factor_confounders = factor_confounders,
integer_confounders = integer_confounders,
home_dir = home_dir,
save.out = save.out
)

head(data_long_f, n = c(5, 10))

```

Expand All @@ -321,22 +322,22 @@ We then transform our newly formatted long data into wide format, specifying `id
```{r}
require("stats", quietly = TRUE)

# sep <- "\\."
# v <- sapply(strsplit(tv_conf[!grepl("\\:", tv_conf)], sep), head, 1)
# v <- c(v[!duplicated(v)], sapply(strsplit(exposure[1], sep), head, 1))
#
# data_wide_f <- stats::reshape(
# data = data_long_f,
# idvar = "ID",
# v.names = v,
# timevar = "WAVE",
# times = c(6, 15, 24, 35, 58),
# direction = "wide"
# )
#
# data_wide_f <- data_wide_f[, colSums(is.na(data_wide_f)) < nrow(data_wide_f)]
#
# head(data_wide_f, n = c(5, 10))
sep <- "\\."
v <- sapply(strsplit(tv_conf[!grepl("\\:", tv_conf)], sep), head, 1)
v <- c(v[!duplicated(v)], sapply(strsplit(exposure[1], sep), head, 1))

data_wide_f <- stats::reshape(
data = data_long_f,
idvar = "ID",
v.names = v,
timevar = "WAVE",
times = c(6, 15, 24, 35, 58),
direction = "wide"
)

data_wide_f <- data_wide_f[, colSums(is.na(data_wide_f)) < nrow(data_wide_f)]

head(data_wide_f, n = c(5, 10))
```
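
The long-to-wide step above can be tried on a small toy dataset. The following is a minimal sketch using only base R; the data frame `data_long_toy` and its values are invented for illustration (the column names simply echo those used in this vignette), and it is not meant to reproduce the package's own data.

```r
# Toy long-format data: two participants measured at three waves.
# All values are made up for illustration only.
data_long_toy <- data.frame(
  ID = rep(1:2, each = 3),
  WAVE = rep(c(6, 15, 24), times = 2),
  ESETA1 = c(0.1, 0.4, NA, 0.3, 0.2, 0.5),
  InRatioCor = c(1.2, 0.9, NA, 2.0, 1.8, NA), # never observed at wave 24
  state = rep(c("A", "B"), each = 3)          # time-invariant
)

# Spread the time-varying variables into one column per wave,
# e.g. ESETA1.6, ESETA1.15, ESETA1.24.
data_wide_toy <- stats::reshape(
  data = data_long_toy,
  idvar = "ID",
  v.names = c("ESETA1", "InRatioCor"),
  timevar = "WAVE",
  direction = "wide",
  sep = "."
)

# Drop columns that are entirely missing (here InRatioCor.24),
# mirroring the colSums() filter used above.
data_wide_toy <- data_wide_toy[, colSums(is.na(data_wide_toy)) < nrow(data_wide_toy)]

names(data_wide_toy)
```

Each participant ends up on a single row, and a variable that was never measured at a given wave does not survive as an empty column.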

<br> <br>
Expand Down Expand Up @@ -418,7 +419,7 @@ As shown below, users can use a helper function to impute their wide data or imp

Users have the option of using the helper `imputeData()` function to impute their correctly formatted wide data. This step can take a while to run. The user can specify how many imputed datasets to create (default m = 5). `imputeData()` draws on the `mice()` function from the *mice* package (van Buuren & Oudshoorn, 2011) to conduct multiple imputation by chained equations (mice). All other variables present in the dataset are used to impute missing data in each column.

The user can specify the imputation method through the `method` field drawing from the following list: “pmm” (predictive mean matching), “midastouch” (weighted predictive mean matching), “sample” (random sample from observed values), “rf” (random forest) or “cart” (classification and regression trees). Random forest imputation is the default given evidence for its efficiency and superior performance (Shah et al., 2014). Please review the *mice* documentation for more details.\
The user can specify the imputation method through the `method` field drawing from the following list: “pmm” (predictive mean matching), “midastouch” (weighted predictive mean matching), “sample” (random sample from observed values), “rf” (random forest) or “cart” (classification and regression trees). Random forest imputation is the default given evidence for its efficiency and superior performance (Shah et al., 2014). Please review the *mice* documentation for more details.
Additionally, users can supply an integer value to `seed` to set the seed of the random number generator used by `mice()` and make imputations reproducible.

The parameter `read_imps_from_file` will allow you to read already imputed data in from local storage (TRUE) so as not to have to re-run this imputation code multiple times (FALSE; default). Users may use this parameter to supply their own mids object of imputed data from the *mice* package (with the title ‘all_imp.rds’). Be sure to inspect the console for any warnings as well as the resulting imputed datasets. Any variables that have missing data following imputation may need to be removed due to high collinearity and/or low variability.
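
The read-from-file behavior described above can be pictured as a simple read-or-compute cache. The sketch below is an assumption-laden illustration in base R, not the `imputeData()` implementation: the helper `cached_result()`, its arguments, and the file name `all_imp_toy.rds` are hypothetical, and a random draw stands in for the (slow) imputation step.

```r
# Hypothetical helper (not part of devMSMsHelpers): return a saved result
# when asked to and one exists; otherwise compute it and save it.
cached_result <- function(path, read_from_file = FALSE, compute) {
  if (read_from_file && file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  saveRDS(result, path)
  result
}

# Use a temporary location so the sketch leaves no files behind in the project.
cache_path <- file.path(tempdir(), "all_imp_toy.rds")
if (file.exists(cache_path)) file.remove(cache_path)

# First call: nothing cached yet, so the stand-in computation runs and is saved.
set.seed(1234)
first <- cached_result(cache_path, read_from_file = FALSE,
                       compute = function() rnorm(3))

# Second call: the saved object is read back and the computation is skipped.
second <- cached_result(cache_path, read_from_file = TRUE,
                        compute = function() rnorm(3))

identical(first, second)
```

Setting a seed before the first call plays the same role as the `seed` argument: the stand-in computation gives the same result each time it is re-run from scratch.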