Merge pull request #190 from istallworthy/more-feedback
custom formulas vignette
istallworthy committed Dec 8, 2023
2 parents 87f7980 + e262622 commit a419218
Showing 9 changed files with 291 additions and 140 deletions.
Binary file modified .DS_Store
Binary file not shown.
92 changes: 57 additions & 35 deletions R/data.R
@@ -9,44 +9,62 @@
#' @format A wide data frame of 1,292 observations
#' There are 36 measured variables.
#' \itemize{
#' \item "ID" person identifier
#' \item "ID" subject id
#' \item "WAVE" age (in months) when data were collected
#' \item "ESETA1" is the continuous exposure of economic strain
#' \item "StrDif_Tot.58" is the continuous outcome of behavioral problems
#' \item "InRatioCor" is the income-to-needs ratio
#' \item "PmEd2" is the parent's education level
#' \item "state" is the family's state of residence
#' \item "TcBlac2" is the family's race (1 = Black?, 0 = White?)
#' \item "bioDadInHH2" is whether the biological father lives with the family (insert coding)
#' \item "HomeOwnd" indicator of whether family owns home (insert coding)
#' \item "KFASTScr"
#' \item "PmBlac2" primary caregiver race (insert coding)
#' \item "PmAge2"
#' \item "PmMrSt2"
#' \item "RMomAgeU"
#' \item "RHealth"
#' \item "RHasSO"
#' \item "SmokTotl"
#' \item "caregiv_health"
#' \item "peri_health"
#' \item "SWghtLB"
#' \item "SurpPreg"
#' \item "DrnkFreq"
#' \item "gov_assist"
#' \item "ALI_LE"
#' \item "B18Raw"
#' \item "CORTB"
#' \item "ESETA1" continuous exposure of economic strain
#' \item "StrDif_Tot.58" continuous outcome of behavioral problems
#' \item "InRatioCor" continuous income-to-needs ratio
#' \item "PmEd2" parent's education level (0-11 = less than high school,
#' 12 = GED, 13 = GED and additional training, 14 = high school grad,
#' 15 = high school and additional training, 16 = some college,
#' 17 = associates degree, 18 = four year college degree, 19 = some post college,
#' 20 = masters degree, 21 = professional degree, 22 = PhD)
#' \item "state" family's state of residence (NC = North Carolina, PA = Pennsylvania)
#' \item "TcBlac2" child's race (1 = Black, 0 = White)
#' \item "bioDadInHH2" whether the biological father lives with the family (1 = yes, 0 = no)
#' \item "HomeOwnd" whether family owns home (1 = owned or being bought by family,
#' 2 = owned or being bought by someone else, 3 = rented for rent,
#' 4 = occupied without payment for rent)
#' \item "KFASTScr" continuous score of caregiver reading comprehension
#' \item "PmBlac2" primary caregiver's race (1 = Black, 0 = White)
#' \item "PmAge2" primary caregiver age in years
#' \item "PmMrSt2" caregiver marital status (1 = single, 2 = married and living with spouse,
#' 3 = married and not living with spouse, 4 = divorced, 5 = separated, 6 = widowed)
#' \item "RMomAgeU" continuous age in years of biological mother when caregiver was born
#' \item "RHealth" index of general caregiver health (1 = excellent, 2 = very good,
#' 3 = good, 4 = fair, 5 = poor)
#' \item "RHasSO" whether caregiver has significant other or not (1 = yes, 0 = no)
#' \item "SmokTotl" total cigarettes biological mother smoked while pregnant (1 = 21 cigarettes or less,
#' 2 = 2 - 99 cigarettes, 3 = 100 or more cigarettes)
#' \item "caregiv_health" sum score of caregiver health problems including emotional problems,
#' ADHD, asthma, cancer, high blood pressure, limited mobility, learning disability,
#' general subjective health, mental health, overweight, seizures, depression, diabetes
#' \item "peri_health" sum score of pregnancy/birth health including excessive vomiting,
#' fetal distress, colic, had alcohol, high blood pressure, heavy bleeding, infection, congenital issues,
#' stay in pediatric intensive care, labor induction, independent breathing at birth, had surgery, in NICU,
#' smoked while pregnant, breech, excessive weight loss, incubation, water retention, had c-section
#' \item "SWghtLB" child birth weight in pounds
#' \item "SurpPreg" whether caregiver had a surprise pregnancy (1 = yes, 0 = no)
#' \item "DrnkFreq" how frequently caregiver drank while pregnant (1 = never,
#' 2 = once or twice, 3 = once a month, 4 = twice a month, 5 = couple times/week, 6 = everyday)
#' \item "gov_assist" sum score of whether family received government benefits including
#' early headstart, early intervention, food stamps, subsidized childcare, heating assistance,
#' government housing, transportation, school free lunch, WIC, and AFDC
#' \item "ALI_LE" continuous child language expression
#' \item "B18Raw" continuous caregiver total depression problems
#' \item "CORTB" continuous child salivary cortisol at rest
#' \item "EARS_TJo"
#' \item "fscore"
#' \item "HOMEETA1"
#' \item "IBRAttn"
#' \item "LESMnNeg"
#' \item "LESMnPos"
#' \item "MDI"
#' \item "RHAsSO"
#' \item "SAAmylase"
#' \item "WndNbrhood"
#' \item "fscore" continuous executive function factor score
#' \item "HOMEETA1" continuous sociocognitive resources factor score
#' \item "IBRAttn" continuous child total joint attention
#' \item "LESMnNeg" continuous family negative life events
#' \item "LESMnPos" continuous family positive life events
#' \item "MDI" continuous child Bayley mental development index
#' \item "RHAsSO" whether caregiver has significant other at a given time (1 = yes, 0 = no)
#' \item "SAAmylase" continuous child salivary alpha amylase at rest
#' \item "WndNbrhood" continuous neighborhood safety
#' }
#'
#' @references
#' DeJoseph, M. L., Sifre, R. D., Raver, C. C., Blair, C. B., & Berry, D. (2021).
#' Capturing Environmental Dimensions of Adversity and Resources in the Context of Poverty Across Infancy Through Early Adolescence:
@@ -60,6 +78,10 @@
#' Garrett-Peiers, P., Conger, R. D., & Bauer, P. J. (2013). The Family Life Project: An Epidemiological and
#' Developmental Study of Young Children Living in Poor Rural Communities.
#' Monographs of the Society for Research in Child Development, 78(5), i–150.
#'
#' Willoughby, M. T., Blair, C. B., Wirth, R. J., & Greenberg, M. (2010). The measurement
#' of executive function at age 3 years: psychometric properties and criterion validity of a
#' new battery of tasks. Psychological Assessment, 22(2), 306.
#'

#'@keywords datasets
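As an illustration of how the codings documented above might be applied, the sketch below labels `RHealth` values with the documented categories. The input vector is hypothetical and is not taken from the package data.

```r
# Hypothetical RHealth codes (not real data); labels follow the documented
# coding: 1 = excellent, 2 = very good, 3 = good, 4 = fair, 5 = poor.
rhealth_codes <- c(1, 3, 5, 2)

rhealth <- factor(rhealth_codes,
                  levels = 1:5,
                  labels = c("excellent", "very good", "good", "fair", "poor"))
```

Declaring all five levels keeps unused categories available for later tabulation.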
70 changes: 19 additions & 51 deletions examplePipelineRevised.Rmd
@@ -7,11 +7,16 @@ output: html_document

Please see this manuscript for a full conceptual and practical introduction to MSMs in the context of developmental data. Please see the vignettes on the *devMSMs* website for step-by-step guidance on the use of this code: https://istallworthy.github.io/devMSMs/index.html.

Headings denote accompanying website sections and steps.

Each code chunk lists all possible inputs to each function (required and optional) so that users can draw on the full range of package functionality. Example values are shown for each optional input, including a NULL/NA option for users who do not wish to specify it. Users should select one value for each optional input or, alternatively, remove the optional argument(s) from the function call entirely.

Please see the website vignettes and/or type `?functionName` into the console for more guidance on the arguments for each function. These two sources should match but let me know if you see discrepancies.


# *Installation*
https://istallworthy.github.io/devMSMs/index.html

## Getting started
Until *devMSMs* is available on CRAN, you will need to install it directly from Github (https://github.com/istallworthy/devMSMs), as shown below.

@@ -27,14 +32,12 @@ library(devMSMs)
install_github("istallworthy/devMSMsHelpers", quiet = TRUE)
library(devMSMsHelpers)
#note: if I update Github to fix something, you may need to first uninstall the package(s) by running the following code:
# remove.packages("devMSMs") or
# remove.packages("devMSMsHelpers")
#prior to re-installing using the code above. Sorry, this is annoying! There may also be a short lag between when I update something on Github and when it becomes available for install.
```


# *Specify Core Inputs Vignette*
https://istallworthy.github.io/devMSMs/articles/Specify_Core_Inputs.html

## Specifying Required Package Core Inputs
The user should change all fields in this code chunk to match their home directory and wide data.

@@ -84,6 +87,11 @@ ti_confounders <- c("state", "BioDadInHH2", "PmAge2", "PmBlac2", "TcBlac2", "PmM
```



# *Preliminary Steps Vignette*
https://istallworthy.github.io/devMSMs/articles/Preliminary_Steps.html


## STEP P: Preliminary Steps for Reading in, Formatting, & Inspecting Data
We advise users to implement the appropriate preliminary steps, with the goal of assigning to 'data' one of the following wide data formats (see Figure 1) for use in the package:

@@ -93,11 +101,9 @@ We advise users implement the appropriate preliminary steps, with the goal of as

* a list of data imputed in wide format as data frames.

<br>
<br>

Users have several options for reading in data. They can begin this workflow with the following options:
<br>

* long data with missingness can be formatted and converted to wide data (P1a) for imputation (P2)

* wide data with missingness can be formatted (P1b) before imputing (P2)
@@ -106,8 +112,6 @@ Users have several options for reading in data. They can begin this workflow wit

* data with no missingness (P3)

<br>


### P1. Single Long Data Frame
Users beginning with a single data frame in long format (with or without missingness).
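
As a rough illustration of the long-to-wide conversion, here is a minimal sketch using base R's `stats::reshape()` on toy data. The values are hypothetical, and the column names (`ID`, `WAVE`, `ESETA1`, `state`) are borrowed from the example dataset purely for illustration.

```r
# Toy long data (hypothetical values): two subjects, each measured at 6 and 15 months.
data_long <- data.frame(
  ID     = c(1, 1, 2, 2),
  WAVE   = c(6, 15, 6, 15),
  ESETA1 = c(0.2, 0.5, -0.1, 0.3),     # time-varying exposure
  state  = c("NC", "NC", "PA", "PA")   # time-invariant confounder
)

# One row per ID; time-varying columns receive a ".time" suffix (e.g., ESETA1.6).
data_wide <- stats::reshape(
  data      = data_long,
  idvar     = "ID",
  v.names   = "ESETA1",
  timevar   = "WAVE",
  direction = "wide"
)

# Drop any columns that are entirely missing after reshaping.
data_wide <- data_wide[, colSums(is.na(data_wide)) < nrow(data_wide)]
```

Time-invariant variables such as `state` are carried over once per subject, while each time-varying variable is spread into one column per time point.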
@@ -149,9 +153,6 @@ data_wide_f <- stats::reshape(data = data_long_f,
times = c(6, 15, 24, 35, 58), # list all time points in your dataset
direction = "wide")
data_wide_f <- data_wide_f[, colSums(is.na(data_wide_f)) < nrow(data_wide_f)]
# data_wide2=data_wide
# data_wide2=data_wide2[order(names(data_wide2))]
```


@@ -179,7 +180,6 @@ data_wide_f <- formatWideData(data = data_wide, exposure = exposure, exposure_ti
factor_confounders = factor_confounders,
integer_confounders = integer_confounders,
home_dir = home_dir, save.out = TRUE)
```


@@ -219,39 +219,6 @@ data <- imputed_data
summary(mice::complete(data, 1))
```

```{r}
# row.names(data_wide) <-NULL
# row.names(data_wide2) <-NULL
#
# data_wide <- data_wide[order(names(data_wide))]
# data_wide2 <- data_wide2[order(names(data_wide2))]
#
# identical(data_wide, data_wide2)
#
# sapply(data_wide2, class)
#
# integer_covars <- c("ID", "KFASTScr", "PmEd2", "RMomAgeU", "SWghtLB", "peri_health", "caregiv_health" , "gov_assist", "B18Raw.15", "B18Raw.24", "B18Raw.58", "B18Raw.6", "EARS_TJo.24", "EARS_TJo.35", "MDI.15", "MDI.6")
# data_wide2[, integer_covars] <- as.data.frame(lapply(data_wide2[, integer_covars], as.integer))
#
# library(dplyr)
# all_equal(data_wide, data_wide2)
#
# all.equal(data_wide, data_wide2)
#
# #testing
# seed = 1234
# data_to_impute <- tibble::tibble(data_wide)
# # data_to_impute <- tibble::tibble(data_wide2) #from long
# imputed_datasets <- mice::mice(data_to_impute,
# m = m,
# method = method,
# maxit = maxit,
# print = F,
# seed = seed)
# summary(mice::complete(imputed_datasets, 1))
```

Alternatively, users could read in a saved mids object for use with *devMSMs*.
```{r}
imputed_data <- readRDS("/Users/isabella/Library/CloudStorage/Box-Box/BSL General/MSMs/testing/testing data/continuous outcome/continuous exposure/FLP_wide_imputed_mids.rds") # final imputations for empirical example; place your .rds file in your home directory and change the name of file here
@@ -296,11 +263,8 @@ factor_covars <- c("state", "TcBlac2","BioDadInHH2","HomeOwnd", "PmBlac2",
data <- lapply(data, function(x) {
x[, factor_covars] <- as.data.frame(lapply(x[, factor_covars], as.factor))
x })
```



Having read in their data, users should now have assigned to `data` wide, complete data as one of the following: a data frame, a mice object, or a list of imputed data frames for use with the *devMSMs* functions.
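
One way to sanity-check the format of `data` before continuing is sketched below. The helper `is_supported_data()` is hypothetical (it is not part of *devMSMs*) and only checks the three formats named above.

```r
# Hypothetical helper, not a devMSMs function: returns TRUE if `data` is
# a data frame, a mice "mids" object, or a non-empty list of data frames.
is_supported_data <- function(data) {
  inherits(data, "mids") ||
    is.data.frame(data) ||
    (is.list(data) && length(data) > 0 &&
       all(vapply(data, is.data.frame, logical(1))))
}

# Example usage with toy inputs
is_supported_data(data.frame(ID = 1:3))                        # single data frame
is_supported_data(list(data.frame(ID = 1), data.frame(ID = 2))) # list of data frames
```

The class check uses `inherits()` so that it works without attaching *mice*.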

### P4. Optional: Identify Exposure Epochs
@@ -357,6 +321,11 @@ inspectData(data = data, exposure = exposure, exposure_time_pts = exposure_time_
```



# *Workflow: Continuous Exposure Vignette*
https://istallworthy.github.io/devMSMs/articles/Workflow_Continuous_Exposure.html


## PHASE 1: Confounder Adjustment
The first phase of the MSM process is focused on eliminating confounding of the relation between exposure and outcome.

@@ -421,7 +390,6 @@ The next step is to specify shorter, simplified balancing formula for the purpos
#### 2a. Create simplified balancing formulas
First, create shorter, simplified balancing formulas at each exposure time point.
```{r}
#optional list of concurrent confounders
concur_conf <- "B18Raw.15"
concur_conf <- NULL #empirical example
91 changes: 56 additions & 35 deletions man/sim_data_wide.rda.Rd

Some generated files are not rendered by default.