---
title: "cbc_lm"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{cbc_lm}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(OLStrajr)
```

## Introduction

The OLStrajr package provides the cbc_lm function, which fits a separate linear model to each case and returns the case-by-case estimates. This vignette demonstrates its use with the Robins dataset, which records the yearly male-to-female ratio for robins, as documented in [Birds: incomplete counts—five-minute bird counts Version 1.0](https://www.doc.govt.nz/documents/science-and-technical/inventory-monitoring/im-toolbox-birds-incomplete-five-min-counts.pdf).

```{r}
data(robins)
```

### Preparing Data

The OLStraj R function works with wide-form data to stay compatible with the original OLStraj SAS macro. The cbc_lm function instead follows common R modeling practice and expects long-form data. In this example, we use the tidyr package to reshape the Robins dataset into long format. We then recode the resulting Year variable as a numeric variable ranging from 0 to 4 in three steps: conversion to a factor, coercion to numeric, and subtraction of 1.

```{r}
# Convert to long form
library(tidyr)

robinsL <- robins |>
  pivot_longer(cols = starts_with("aug"),
               names_to = "Year",
               values_to = "Ratio")

# Recode Year as 0, 1, ..., 4
robinsL$Year <- as.numeric(as.factor(robinsL$Year)) - 1
```

### Running the case-by-case regression

```{r}
robins_mod <- cbc_lm(robinsL, Ratio ~ Year, .case = "site")

# Show class
class(robins_mod)
```

The cbc_lm class includes print, summary, and plot methods. In the following section, we take a closer look at the summary method.

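To make the idea of case-by-case regression concrete, the sketch below fits one OLS model per case using only base R. It runs on a small made-up data frame (`toy`, whose values are hypothetical and not taken from the Robins data) with the same column names used above. It illustrates the general idea only and is not a description of how cbc_lm is implemented internally.

```{r}
# Toy long-form data: three cases observed at Year 0..4
# (made-up values for illustration, not the robins data)
toy <- data.frame(
  site  = rep(c("a", "b", "c"), each = 5),
  Year  = rep(0:4, times = 3),
  Ratio = c(1.0, 1.2, 1.4, 1.6, 1.8,
            2.0, 1.9, 1.8, 1.7, 1.6,
            0.5, 0.5, 0.5, 0.5, 0.5)
)

# Split the data by case and fit a separate OLS regression to each piece
fits <- lapply(split(toy, toy$site), function(d) lm(Ratio ~ Year, data = d))

# Collect the intercept and slope of each case, one row per case
case_coefs <- t(sapply(fits, coef))
case_coefs
```

Each row of `case_coefs` holds the intercept and slope for one case, which is the kind of per-case summary that the methods of the cbc_lm class report.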
### Model Summary

According to [Carrig, Wirth, and Curran (2004)](https://www.tandfonline.com/doi/abs/10.1207/S15328007SEM1101_9), the means of the case-by-case OLS intercepts and slopes can serve as unbiased estimators of the population mean intercept and slope. In their implementation of case-by-case regression, [Rogosa & Saner (1995)](https://journals.sagepub.com/doi/10.3102/10769986020002149) note that the standard errors are the standard deviations across 4,000 bootstrap replications, and that the endpoints of the 90% confidence intervals correspond to the 5% and 95% values of the empirical distributions obtained from the resampling. They further suggest that more sophisticated and accurate confidence intervals can be developed using the methods in [Efron and Tibshirani (1993)](https://www.taylorfrancis.com/books/mono/10.1201/9780429246593/introduction-bootstrap-bradley-efron-tibshirani).

By default, the summary method for the cbc_lm class first displays the mean coefficients, bootstrap standard errors (over 4,000 replications), and bootstrap 95% confidence intervals. It then shows the broom::tidy and broom::glance results for each case.

```{r}
summary(robins_mod)
```
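
The bootstrap standard error and percentile interval described above can be approximated by hand. The following is a minimal sketch that assumes the resampling unit is the case and that the interval is a simple percentile interval; the slopes are made-up values, the seed is arbitrary, and the code is not claimed to match what the summary method does internally.

```{r}
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical case-level slopes (made-up values, not from the robins data)
slopes <- c(0.20, -0.10, 0.00, 0.15, 0.05, -0.05)

# Resample the cases with replacement and recompute the mean slope each time
n_boot <- 4000
boot_means <- replicate(n_boot, mean(sample(slopes, replace = TRUE)))

# Bootstrap standard error: standard deviation of the resampled means
boot_se <- sd(boot_means)

# Percentile 95% interval: 2.5% and 97.5% quantiles of the resampled means
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

boot_se
boot_ci
```

Changing `probs` to `c(0.05, 0.95)` gives the 90% interval endpoints used by Rogosa & Saner (1995).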