Merge pull request #190 from istallworthy/more-feedback

custom formulas vignette
istallworthy · Dec 8, 2023 · a419218
2 parents 87f7980 + e262622
commit a419218

Showing 9 changed files with 291 additions and 140 deletions.
diff --git a/.DS_Store b/.DS_Store
diff --git a/R/data.R b/R/data.R
@@ -9,44 +9,62 @@
 #' @format A wide data frame of 1,292 observations
 #' There are 36 measured variables. 
 #' \itemize{
-#' \item "ID" person identifier 
+#' \item "ID" subject id
 #' \item "WAVE" age (in months) when data were collected
-#' \item "ESETA1" is the continuous exposure of economic strain
-#' \item "StrDif_Tot.58" is the continuous outcome of behavioral problems
-#' \item "InRatioCor" is the income-to-needs ratio
-#' \item "PmEd2" is the parent's education level
-#' \item "state" is the family's state of residence
-#' \item "TcBlac2" is the family's race (1 = Black?, 0 = White?)
-#' \item "bioDadInHH2" is whether the biological father lives with the family (insert coding)
-#' \item "HomeOwnd" indicator of whether family owns home (insert coding)
-#' \item "KFASTScr"
-#' \item "PmBlac2" primary careigver race (insert coding)
-#' \item "PmAge2"
-#' \item "PmMrSt2"
-#' \item "RMomAgeU"
-#' \item "RHealth"
-#' \item "RHasSO"
-#' \item "SmokTotl"
-#' \item "caregiv_health"
-#' \item "peri_health"
-#' \item "SWghtLB"
-#' \item "SurpPreg"
-#' \item "DrnkFreq"
-#' \item "gov_assist"
-#' \item "ALI_LE"
-#' \item "B18Raw"
-#' \item "CORTB"
+#' \item "ESETA1" continuous exposure of economic strain
+#' \item "StrDif_Tot.58" continuous outcome of behavioral problems
+#' \item "InRatioCor" continuous income-to-needs ratio
+#' \item "PmEd2" parent's education level (0-11 = less than high school, 
+#' 12 = GED, 13 = GED and additional training, 14 = high school grad,
+#' 15 = high school and additional training, 16 = some college,
+#' 17 = associates degree, 18 = four year college degree, 19 = some post college,
+#' 20 = masters degree, 21 = professional degree, 22 = PhD)
+#' \item "state" family's state of residence (NC = North Carolina, PA = Pennsylvania)
+#' \item "TcBlac2" child's race (1 = Black, 0 = White)
+#' \item "bioDadInHH2" whether the biological father lives with the family (1 = yes, 0 = no)
+#' \item "HomeOwnd" whether family owns home (1 = owned or being bought by family, 
+#' 2 = owned or being bought by someone else, 3 = rented for rent, 
+#' 4 = occupied without payment for rent)
+#' \item "KFASTScr" continuous score of caregiver reading comprehension
+#' \item "PmBlac2" primary caregiver's race (1 = Black, 0 = White)
+#' \item "PmAge2" primary caregiver age in years
+#' \item "PmMrSt2" caregiver marital status (1 = single, 2 = married and living with spouse,
+#' 3 = married and not living with spouse, 4 = divorced, 5 = separated, 6 = widowed)
+#' \item "RMomAgeU" continuous age in years of biological mother when caregiver was born
+#' \item "RHealth" index of general caregiver health (1 = excellent, 2 = very good,
+#' 3 = good, 4 = fair, 5 = poor)
+#' \item "RHasSO" whether caregiver has significant other or not (1 = yes, 0 = no)
+#' \item "SmokTotl" total cigarettes biological mother smoked while pregnant (1 = 21 cigarettes or less, 
+#' 2 = 2 - 99 cigarettes, 3 = 100 or more cigarettes)
+#' \item "caregiv_health" sum score of caregiver health problems including emotional problems,
+#' ADHD, asthma, cancer, high blood pressure, limited mobility, learning disability, 
+#' general subjective health, mental health, overweight, seizures, depression, diabetes
+#' \item "peri_health" sum score of pregnancy/birth health including excessive vomiting, 
+#' fetal distress, colic, had alcohol, high blood pressure, heavy bleeding, infection, congenital issues, 
+#' stay in pediatric intensive care, labor induction, independent breathing at birth, had surgery, in NICU, 
+#' smoked while pregnant, breech, excessive weight loss, incubation, water retention, had c-section
+#' \item "SWghtLB" child birth weight in pounds
+#' \item "SurpPreg" whether caregiver had a surprise pregnancy (1 = yes, 0 = no)
+#' \item "DrnkFreq" how frequently caregiver drank while pregnant (1 = never, 
+#' 2 = once or twice, 3 = once a month, 4 = twice a month, 5 = couple times/week, 6 = everyday)
+#' \item "gov_assist" sum score of whether family received government benefits including
+#' early headstart, early intervention, food stamps, subsidized childcare, heating assistance, 
+#' government housing, transportation, school free lunch, WIC, and AFDC
+#' \item "ALI_LE" continuous child language expression
+#' \item "B18Raw" continuous caregiver total depression problems
+#' \item "CORTB" continuous child salivary cortisol at rest
 #' \item "EARS_TJo"
-#' \item "fscore"
-#' \item "HOMEETA1"
-#' \item "IBRAttn"
-#' \item "LESMnNeg"
-#' \item "LESMnPos"
-#' \item "MDI"
-#' \item "RHAsSO"
-#' \item "SAAmylase"
-#' \item "WndNbrhood"
+#' \item "fscore" continuous executive function factor score
+#' \item "HOMEETA1" continuous sociocognitive resources factor score
+#' \item "IBRAttn" continuous child total joint attention 
+#' \item "LESMnNeg" continuous family negative life events
+#' \item "LESMnPos" continuous family positive life events
+#' \item "MDI" continuous child Bayley mental development index
+#' \item "RHAsSO" whether caregiver has significant other at a given time (1 = yes, 0 = no)
+#' \item "SAAmylase" continuous child salivary alpha amylase at rest
+#' \item "WndNbrhood" continuous neighborhood safety 
 #' }
+#' 
 #' @references
 #' DeJoseph, M. L., Sifre, R. D., Raver, C. C., Blair, C. B., & Berry, D. (2021). 
 #' Capturing Environmental Dimensions of Adversity and Resources in the Context of Poverty Across Infancy Through Early Adolescence: 
@@ -60,6 +78,10 @@
 #' Garrett-Peters, P., Conger, R. D., & Bauer, P. J. (2013). The Family Life Project: An Epidemiological and 
 #' Developmental Study of Young Children Living in Poor Rural Communities.
 #' Monographs of the Society for Research in Child Development, 78(5), i–150.
+#' 
+#' Willoughby, M. T., Blair, C. B., Wirth, R. J., & Greenberg, M. (2010). The measurement 
+#' of executive function at age 3 years: psychometric properties and criterion validity of a 
+#' new battery of tasks. Psychological Assessment, 22(2), 306.
 #'  
 
 #'@keywords datasets

diff --git a/examplePipelineRevised.Rmd b/examplePipelineRevised.Rmd
@@ -7,11 +7,16 @@ output: html_document
 
 Please see this manuscript for a full conceptual and practical introduction to MSMs in the context of developmental data. Please see the vignettes on the *devMSMs* website for step-by-step guidance on the use of this code: https://istallworthy.github.io/devMSMs/index.html. 
 
+Headings denote accompanying website sections and steps.  
+
 The code in each code chunk is set up identifying all possible inputs to each function (required and optional) to aid the user's use of the full range of package functionality. Example possible values for the optional input are shown for each function, including a NULL/NA option if the user does not wish to specify an optional input. The user should select one of each optional input values. Alternatively, the user could modify the call to the function and remove the optional input argument(s) entirely. 
 
 Please see the website vignettes and/or type `?functionName` into the console for more guidance on the arguments for each function. These two sources should match but let me know if you see discrepancies. 
 
 
+# *Installation*
+https://istallworthy.github.io/devMSMs/index.html 
+
 ## Getting started
 Until *devMSMs* is available on CRAN, you will need to install it directly from GitHub (https://github.com/istallworthy/devMSMs), as shown below.  
 
@@ -27,14 +32,12 @@ library(devMSMs)
 install_github("istallworthy/devMSMsHelpers", quiet = TRUE)
 library(devMSMsHelpers)
 
-#note: if I update Github to fix something, you may need to first uninstall the package(s) by running the following code:
-# remove.packages("devMSMs") or 
-# remove.packages("devMSMsHelpers")
-#prior to re-installing using the code above. Sorry, this is annoying! There may also be a short lag between when I update something on Github and when it becomes available for install. 
-
 ```
 
 
+# *Specify Core Inputs Vignette*
+https://istallworthy.github.io/devMSMs/articles/Specify_Core_Inputs.html
+
 ## Specifying Required Package Core Inputs
 The user should change all fields in this code chunk to match their home directory and wide data.
 
@@ -84,6 +87,11 @@ ti_confounders <- c("state", "BioDadInHH2", "PmAge2", "PmBlac2", "TcBlac2", "PmM
 ```
 
 
+
+# *Preliminary Steps Vignette*
+https://istallworthy.github.io/devMSMs/articles/Preliminary_Steps.html
+
+
 ## STEP P: Preliminary Steps for Reading in, Formatting, & Inspecting Data
 We advise users implement the appropriate preliminary steps, with the goal of assigning to 'data' one of the following wide data formats (see Figure 1) for use in the package:  
 
@@ -93,11 +101,9 @@ We advise users implement the appropriate preliminary steps, with the goal of as
 
 * a list of data imputed in wide format as data frames.  
 
-<br>
-<br>
 
 Users have several options for reading in data. They can begin this workflow with the following options:  
-<br>
+
 * long data with missingness can be formatted and converted to wide data (P1a) for imputation (P2)
 
 * wide data with missingness can be formatted (P1b) before imputing (P2)
@@ -106,8 +112,6 @@ Users have several options for reading in data. They can begin this workflow wit
 
 * data with no missingness (P3)
 
-<br>
-
 
 ### P1. Single Long Data Frame
 Users beginning with a single data frame in long format (with or without missingness).
@@ -149,9 +153,6 @@ data_wide_f <- stats::reshape(data = data_long_f,
                             times = c(6, 15, 24, 35, 58), # list all time points in your dataset
                             direction = "wide")
 data_wide_f <- data_wide_f[, colSums(is.na(data_wide_f)) < nrow(data_wide_f)]
-
-# data_wide2=data_wide
-# data_wide2=data_wide2[order(names(data_wide2))]
 ```
 
 
@@ -179,7 +180,6 @@ data_wide_f <- formatWideData(data = data_wide, exposure = exposure, exposure_ti
                               factor_confounders = factor_confounders,
                               integer_confounders = integer_confounders,
                               home_dir = home_dir, save.out = TRUE) 
-
 ```
 
 
@@ -219,39 +219,6 @@ data <- imputed_data
 summary(mice::complete(data, 1))
 ```
 
-```{r}
-# row.names(data_wide) <-NULL
-# row.names(data_wide2) <-NULL
-# 
-# data_wide <- data_wide[order(names(data_wide))]
-# data_wide2 <- data_wide2[order(names(data_wide2))]
-# 
-# identical(data_wide, data_wide2)
-# 
-# sapply(data_wide2, class) 
-# 
-# integer_covars <- c("ID", "KFASTScr", "PmEd2", "RMomAgeU", "SWghtLB", "peri_health", "caregiv_health" , "gov_assist", "B18Raw.15", "B18Raw.24", "B18Raw.58",      "B18Raw.6", "EARS_TJo.24", "EARS_TJo.35", "MDI.15", "MDI.6")
-# data_wide2[, integer_covars] <- as.data.frame(lapply(data_wide2[, integer_covars], as.integer))
-# 
-# library(dplyr)
-# all_equal(data_wide, data_wide2)
-# 
-# all.equal(data_wide, data_wide2)
-# 
-# #testing
-# seed = 1234
-# data_to_impute <- tibble::tibble(data_wide)
-# # data_to_impute <- tibble::tibble(data_wide2) #from long
-# imputed_datasets <- mice::mice(data_to_impute, 
-#                                m = m, 
-#                                method = method, 
-#                                maxit = maxit,
-#                                print = F,
-#                                seed = seed)
-# summary(mice::complete(imputed_datasets, 1))
-
-```
-
 Alternatively, users could read in a saved mids object for use with *devMSMs*.    
 ```{r}
 imputed_data <- readRDS("/Users/isabella/Library/CloudStorage/Box-Box/BSL General/MSMs/testing/testing data/continuous outcome/continuous exposure/FLP_wide_imputed_mids.rds") # final imputations for empirical example; place your .rds file in your home directory and change the name of file here
@@ -296,11 +263,8 @@ factor_covars <- c("state", "TcBlac2","BioDadInHH2","HomeOwnd", "PmBlac2",
 data <- lapply(data, function(x) {
   x[, factor_covars] <- as.data.frame(lapply(x[, factor_covars], as.factor))
   x })
-
 ```
 
-
-
 Having read in their data, users should now have assigned to `data` wide, complete data as: a data frame, mice object, or list of imputed data frames for use with the *devMSMs* functions.  
 
 ### P4. Optional: Identify Exposure Epochs
@@ -357,6 +321,11 @@ inspectData(data = data, exposure = exposure, exposure_time_pts = exposure_time_
 ```
 
 
+
+# *Workflow: Continuous Exposure Vignette*
+https://istallworthy.github.io/devMSMs/articles/Workflow_Continuous_Exposure.html
+
+
 ## PHASE 1: Confounder Adjustment
 The first phase of the MSM process is focused on eliminating confounding of the relation between exposure and outcome.  
 
@@ -421,7 +390,6 @@ The next step is to specify shorter, simplified balancing formula for the purpos
 #### 2a. Create simplified balancing formulas
 First, create shorter, simplified balancing formulas at each exposure time point.  
 ```{r}
-
 #optional list of concurrent confounder
 concur_conf <- "B18Raw.15"
 concur_conf <-  NULL #empirical example 

diff --git a/man/sim_data_wide.rda.Rd b/man/sim_data_wide.rda.Rd