Title: | Ordinary Least Squares Trajectory Analysis |
---|---|
Description: | The 'OLStrajr' package provides comprehensive functions for ordinary least squares (OLS) trajectory analysis and case-by-case OLS regression as outlined in Carrig, Wirth, and Curran (2004) <doi:10.1207/S15328007SEM1101_9> and Rogosa and Saner (1995) <doi:10.3102/10769986020002149>. It encompasses two primary functions, OLStraj() and cbc_lm(). The OLStraj() function simplifies the estimation of individual growth curves over time via OLS regression, with options for visualizing both group-level and individual-level growth trajectories and support for linear and quadratic models. The cbc_lm() function facilitates case-by-case OLS estimates and provides unbiased mean population intercept and slope estimators by averaging OLS intercepts and slopes across cases. It further offers standard error calculations across bootstrap replicates and computation of 95% confidence intervals based on empirical distributions from the resampling processes. |
Authors: | Mackson Ncube [aut, cre], mightymetrika, LLC [cph, fnd] |
Maintainer: | Mackson Ncube <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0.9000 |
Built: | 2024-11-15 04:06:58 UTC |
Source: | https://github.com/mightymetrika/olstrajr |
Implements the case-by-case ordinary least squares (OLS) regression method detailed in Rogosa & Saner (1995). The cbc_lm function provides unbiased estimators of the mean population intercept and slope by averaging the OLS intercepts and slopes across cases (Carrig et al., 2004). The reported standard errors are the standard deviations across bootstrap replicates. Additionally, 95% confidence intervals are calculated from the empirical distributions produced by the resampling.
cbc_lm( data, formula, .case, n_bootstrap = 4000, lm_options = list(), boot_options = list(), boot.ci_options = list(), na.rm = FALSE, stop_zeroSD = TRUE )
data | A data frame containing the variables in the model. |
formula | An object of class formula (or a string that can be converted to a formula object) detailing the model's specifications. |
.case | A quoted variable name used to subset data into cases. |
n_bootstrap | The number of bootstrap replicates for standard errors and confidence intervals of mean coefficients. Default is 4000, as in Rogosa & Saner (1995). |
lm_options | Pass additional arguments to the lm function. |
boot_options | Pass additional arguments to the boot function. |
boot.ci_options | Pass additional arguments to the boot.ci function. |
na.rm | Pass na.rm to: the mean function used to obtain mean_coef and bm_coef; the sd function used to obtain se_coef; the mean function used in the statistic parameter of boot. |
stop_zeroSD | A logical. If TRUE, the function halts execution when encountering a case where an independent variable has zero standard deviation, issuing an error message. If FALSE, the function issues a warning when encountering zero standard deviation and skips fitting a model for that case, returning NULL in place of a model object for such cases in the output. Defaults to TRUE. |
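The zero standard deviation condition described for stop_zeroSD can be checked before calling cbc_lm. The following is a minimal base R sketch, not the package's internal code: the data frame is hypothetical, and the lapply() loop only imitates the documented stop_zeroSD = FALSE behaviour of leaving NULL in place of a model.

```r
# Hypothetical data: case 3 has a constant predictor, so its slope is not estimable.
df <- data.frame(
  ids  = rep(1:3, each = 4),
  vals = c(1, 2, 3, 4,  2, 4, 6, 8,  5, 5, 5, 5),
  outs = c(2, 4, 5, 8,  1, 3, 2, 6,  7, 9, 8, 10)
)

# Standard deviation of the predictor within each case.
sd_by_case <- tapply(df$vals, df$ids, stats::sd)

# Cases that would be affected by the stop_zeroSD setting.
zero_sd_cases <- names(sd_by_case)[sd_by_case == 0]
print(zero_sd_cases)

# Imitate stop_zeroSD = FALSE: keep NULL where no model can be fitted.
fits <- lapply(split(df, df$ids), function(d) {
  if (stats::sd(d$vals) == 0) return(NULL)
  stats::lm(outs ~ vals, data = d)
})
```

Running a check like this first makes it easier to decide whether to drop the affected cases or to call cbc_lm with stop_zeroSD = FALSE.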
An object of class cbc_lm, which contains the results of the case-by-case OLS regression, including the mean, standard error, and confidence intervals for each coefficient.
Carrig, M. M., Wirth, R. J., & Curran, P. J. (2004). A SAS Macro for Estimating and Visualizing Individual Growth Curves. Structural Equation Modeling: A Multidisciplinary Journal, 11(1), 132-149. doi:10.1207/S15328007SEM1101_9
Rogosa, D., & Saner, H. (1995). Longitudinal Data Analysis Examples with Random Coefficient Models. Journal of Educational and Behavioral Statistics, 20(2), 149-170. doi:10.3102/10769986020002149
df <- data.frame(ids = rep(1:5, 5), vals = stats::rnorm(25), outs = stats::rnorm(25, 10, 25))
cbc_lm(data = df, formula = outs ~ vals, .case = "ids")
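To make the estimation steps concrete, here is a minimal base R sketch of the case-by-case idea: fit one model per case, average the coefficients, then bootstrap the means. It is an illustration under stated assumptions rather than the package's implementation. In particular, it resamples with sample() instead of calling boot, uses a plain percentile interval, and the object names case_coefs, boot_means and ci_coef are hypothetical.

```r
set.seed(1)  # reproducible simulated data and resampling

# Same layout as the example above: 5 cases with 5 observations each.
df <- data.frame(ids  = rep(1:5, 5),
                 vals = stats::rnorm(25),
                 outs = stats::rnorm(25, 10, 25))

# Step 1: fit one OLS model per case and collect the coefficients
# (one row per case, one column per coefficient).
case_coefs <- t(sapply(split(df, df$ids), function(d) {
  stats::coef(stats::lm(outs ~ vals, data = d))
}))

# Step 2: the point estimates are the means of the per-case coefficients.
mean_coef <- colMeans(case_coefs)

# Step 3: resample cases with replacement and recompute the means.
n_bootstrap <- 1000  # smaller than the documented default of 4000, for speed
boot_means <- t(replicate(n_bootstrap, {
  idx <- sample(nrow(case_coefs), replace = TRUE)
  colMeans(case_coefs[idx, , drop = FALSE])
}))

# Step 4: standard errors are the SDs across replicates; the interval
# here is a simple percentile interval (an assumption of this sketch).
se_coef <- apply(boot_means, 2, stats::sd)
ci_coef <- apply(boot_means, 2, stats::quantile, probs = c(0.025, 0.975))

print(mean_coef)
print(se_coef)
print(ci_coef)
```

Because the resampling unit is the case rather than the individual observation, the bootstrap reflects between-case variability in the intercepts and slopes.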
Implements the OLS trajectory analysis method detailed in Carrig et al. (2004). The method uses case-by-case ordinary least squares (OLS) regression to estimate individual growth curves over time. The function provides options for group-level and individual-level plots and accommodates linear and quadratic models.
OLStraj( data, idvarname = "id", predvarname = "time", outvarname = "score", varlist = c("anti1", "anti2", "anti3", "anti4"), timepts = c(0, 1, 2, 3), inclmiss = FALSE, level = "both", regtype = "lin", numplot = NULL, hist = TRUE, int_bins = 30, lin_bins = 30, quad_bins = 30, box = TRUE, outds = TRUE, ... )
data | A data frame. |
idvarname | A quoted variable name identifying the column in data that serves as the case identifier. |
predvarname | A quoted predictor variable label. |
outvarname | A quoted outcome variable label. |
varlist | A vector of quoted variable names found in data. |
timepts | A vector specifying how time points should be coded. |
inclmiss | A logical specifying whether to include cases with missing values. Set inclmiss to FALSE to filter data down to complete cases. |
level | Controls which OLS trajectory plots to show: "grp" shows only group-level plots, "ind" shows only individual-level plots, and "both" shows both group-level and individual-level plots. |
regtype | Set regtype to "quad" to include a quadratic term in the cbc_lm call, or to "lin" to exclude it. Use regtype = "both" to include the quadratic term in the cbc_lm call and to show both linear and quadratic terms on the individual OLS-estimated trajectory plots. |
numplot | An integer used to subset the number of cases used in OLStraj. |
hist | Set hist to TRUE to include histograms or FALSE to exclude them. |
int_bins | The number of bins for the intercept term's histogram. |
lin_bins | The number of bins for the linear term's histogram. |
quad_bins | The number of bins for the quadratic term's histogram. |
box | Set box to TRUE to include boxplots or FALSE to exclude them. |
outds | Set outds to TRUE to include the output as a data frame. The output contains the original data used in the OLStraj algorithm together with the parameter estimates obtained from cbc_lm. |
... | Pass additional arguments to cbc_lm. |
A list containing an output data frame (if outds is set to TRUE), the selected plots, and the case-by-case regression model object.
Carrig, M. M., Wirth, R. J., & Curran, P. J. (2004). A SAS Macro for Estimating and Visualizing Individual Growth Curves. Structural Equation Modeling: A Multidisciplinary Journal, 11(1), 132-149. doi:10.1207/S15328007SEM1101_9
df <- data.frame(id = c(1,2,3,4,5), var1 = c(3,7,4,5,8), var2 = c(7,3,9,4,7), var3 = c(8,5,3,9,7), var4 = c(1,5,3,9,30))
olstraj_out <- OLStraj(data = df, varlist = c("var1", "var2", "var3", "var4"), regtype = "quad", int_bins = 5, lin_bins = 5, quad_bins = 5)
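The roles of varlist, timepts, and regtype can be illustrated in base R. The sketch below assumes that the wide-format columns named in varlist are stacked into long format with timepts as the time coding, and that a quadratic trajectory is then fitted per case. It is not the package's internal code, and the names long and quad_coefs are hypothetical.

```r
# Same wide-format data as the example above.
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 var1 = c(3, 7, 4, 5, 8),
                 var2 = c(7, 3, 9, 4, 7),
                 var3 = c(8, 5, 3, 9, 7),
                 var4 = c(1, 5, 3, 9, 30))

varlist <- c("var1", "var2", "var3", "var4")
timepts <- c(0, 1, 2, 3)

# Stack the repeated measures: one row per case and time point.
long <- stats::reshape(df,
                       direction = "long",
                       idvar     = "id",
                       varying   = varlist,
                       v.names   = "score",
                       timevar   = "time",
                       times     = timepts)

# Fit a quadratic growth curve to each case (comparable to regtype = "quad").
quad_coefs <- t(sapply(split(long, long$id), function(d) {
  stats::coef(stats::lm(score ~ time + I(time^2), data = d))
}))
colnames(quad_coefs) <- c("intercept", "linear", "quadratic")

# Per-case estimates and their means across cases.
print(quad_coefs)
print(colMeans(quad_coefs))
```

With four time points, a quadratic model leaves one residual degree of freedom per case, so each individual curve is estimable but fits the observed scores closely.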
This function generates diagnostic plots for each linear model included in a 'cbc_lm' object. By default it plots all models; use the 'n_models' parameter to plot fewer. When multiple plots are generated in an interactive session, the function can prompt before displaying the next plot.
## S3 method for class 'cbc_lm'
plot(x, n_models = length(x$models), ask = interactive() && n_models > 1, ...)
x | A 'cbc_lm' object. |
n_models | The number of models to plot. Defaults to the total number of models in 'x'. If 'n_models' is greater than the number of models available, a warning will be issued and all models will be plotted. |
ask | Logical. If TRUE (and the session is interactive), the function will prompt the user before displaying the next plot. Defaults to TRUE when the session is interactive and there is more than one model to be plotted. |
... | Additional graphical parameters to pass to the plot function. |
The function is used for its side effect of generating diagnostic plots. It invisibly returns the 'cbc_lm' object.
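The n_models behaviour described above can be sketched with a plain list of lm fits. This is an illustration only: plot_models is a hypothetical helper, the list models stands in for the models stored in a 'cbc_lm' object, the data are made up, and the ask prompt is omitted.

```r
# Hypothetical data: three cases with six observations each.
df <- data.frame(
  ids  = rep(1:3, each = 6),
  vals = rep(1:6, times = 3),
  outs = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2,
           5.0, 4.1, 3.3, 1.8, 1.2, 0.1,
           1.0, 1.4, 0.7, 1.2, 0.9, 1.3)
)

# Stand-in for the per-case models of a 'cbc_lm' object.
models <- lapply(split(df, df$ids), function(d) stats::lm(outs ~ vals, data = d))

# Hypothetical helper: plot the first n_models fits and return the list invisibly.
plot_models <- function(models, n_models = length(models), ...) {
  if (n_models > length(models)) {
    warning("n_models exceeds the number of models; plotting all models")
    n_models <- length(models)
  }
  for (i in seq_len(n_models)) {
    # which = 1 draws only the residuals-versus-fitted panel of plot.lm.
    plot(models[[i]], which = 1, ...)
  }
  invisible(models)
}

grDevices::pdf(NULL)  # null device so the sketch also runs non-interactively
plot_models(models, n_models = 2)
invisible(grDevices::dev.off())
```

Returning the input invisibly mirrors the documented return value and lets the call sit at the end of a pipeline without printing.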
Print method for 'cbc_lm' objects. Shows the call used to create the model, the mean coefficients, (optionally) the bootstrap mean coefficients, and the coefficients for each model.
## S3 method for class 'cbc_lm'
print(x, digits = max(3L, getOption("digits") - 3L), boot = FALSE, ...)
x | A 'cbc_lm' object. |
digits | The number of significant digits to use when printing. |
boot | Logical indicating whether or not to print the bootstrap mean coefficients. |
... | Further arguments passed to or from other methods. |
An invisible 'cbc_lm' object.
Print method for 'summary.cbc_lm' objects. Prints the call used to create the models, the mean coefficients, (optionally) the bootstrap mean coefficients, bootstrap standard errors, bootstrap confidence intervals, and the tidy and glance summaries for each model.
## S3 method for class 'summary.cbc_lm'
print(x, digits = max(3L, getOption("digits") - 3L), ...)
x | A 'summary.cbc_lm' object. |
digits | The number of significant digits to use when printing. |
... | Further arguments passed to or from other methods. |
An invisible 'summary.cbc_lm' object.
Data obtained from Rogosa & Saner (1995) which describes the data as: "Example 1. The rat weight data are taken from the HLM manual (Bryk et al., 1989). The rat data consist of 10 individuals, with weight measurements (Y) at five occasions (Weeks 0, 1,2, 3, 4) and a background measure, the mother's weight (Z)."
rats
A data frame with 10 observations and 7 variables:
Rat identifier
Week of weight measure
Mother's weight
Bryk, A. S., Raudenbush, S. W., Seltzer, M., & Congdon, R. T. (1989). An introduction to HLM: Computer program and user's guide. Chicago: University of Chicago. doi:10.1201/9780429246593
Rogosa, D., & Saner, H. (1995). Longitudinal Data Analysis Examples with Random Coefficient Models. Journal of Educational and Behavioral Statistics, 20(2), 149-170. doi:10.3102/10769986020002149
Data from Table 1 of "Birds: incomplete counts—five-minute bird counts Version 1.0"
robins
A data frame with 2 observations and 6 variables:
Site name
ratio of male to female robins
Summary method for 'cbc_lm' objects. Returns the mean coefficients, bootstrap mean coefficients, standard errors, and confidence intervals, as well as a summary of the models.
## S3 method for class 'cbc_lm'
summary(object, digits = max(3L, getOption("digits") - 3L), boot = FALSE, n_models = length(object$models), ...)
object | A 'cbc_lm' object. |
digits | The number of significant digits to use when printing. |
boot | Logical indicating whether or not to include the bootstrap mean coefficients in the summary. |
n_models | The number of models to include in the summary. Defaults to all models. |
... | Further arguments passed to or from other methods. |
An object of class 'summary.cbc_lm', which includes the call, the mean coefficients, (optionally) the bootstrap mean coefficients, standard errors, confidence intervals, and a summary of the models.