--- title: "Comparing HC Estimators" vignette: > %\VignetteIndexEntry{Comparing HC Estimators} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(hcinfer.use_emoji = TRUE) ``` This vignette shows how to compare HC estimators with `hcinfer`. The focus is on the objects returned by the package: coefficient tables, confidence intervals, covariance matrices, robust weights, and leverage diagnostics. ## List the available estimators Use `hc_methods()` to see the estimator names accepted by `hcinfer()` and `vcov_hc()`. ```{r} library(hcinfer) hc_methods() ``` ## Fit one model Comparisons should start from one fitted model. Here we use the same OLS fit for all methods. ```{r} schools <- PublicSchools |> dplyr::mutate( income_scaled = income / 10000, income_scaled_sq = income_scaled^2 ) fit <- lm(expenditure ~ income_scaled + income_scaled_sq, data = schools) ``` ## Run several methods The `type` argument selects the HC estimator. Store each result in a named list so that later extraction is straightforward. ```{r} methods <- c("hc0", "hc3", "hc4", "hc4m", "hcbeta") results <- purrr::map(methods, \(method) { hcinfer(fit, type = method) }) names(results) <- methods ``` ## Compare one coefficient Most applied comparisons focus on one or a few coefficients. The helper below uses `tests()` and `confint()` to extract one coefficient and adds the method name. ```{r} extract_term <- function(result, method, term = "income_scaled_sq") { row <- tests(result, parm = term) ci <- confint(result, parm = term) tibble::tibble( method = method, estimate = row$estimate, std_error = row$std_error, p_value = row$p_value, conf_low = ci$conf_low, conf_high = ci$conf_high, reject = row$reject ) } comparison <- purrr::imap(results, extract_term) comparison <- dplyr::bind_rows(comparison) comparison ``` Add interval widths when the goal is to compare how conservative the estimators are for this coefficient. ```{r} comparison <- comparison |> dplyr::mutate(interval_width = conf_high - conf_low) comparison ``` ## Plot the robust standard errors A simple plot can help show how the estimators differ for the selected coefficient. ```{r, fig.width = 7, fig.height = 4, fig.alt = "Bar chart comparing robust standard errors for the quadratic income term across HC estimators."} ggplot2::ggplot(comparison, ggplot2::aes(x = method, y = std_error)) + ggplot2::geom_col(fill = "#305c8a") + ggplot2::labs( x = "Estimator", y = "Robust standard error for the quadratic income term" ) + ggplot2::theme_minimal(base_size = 12) ``` ## Compare confidence intervals directly `confint()` returns a tibble for one result. Use the stored comparison table when you want intervals from several methods side by side. ```{r} comparison |> dplyr::select(method, conf_low, conf_high, interval_width) ``` ## Compare covariance objects Use `vcov_hc()` when you only need the covariance matrix and diagnostics, not coefficient tests. ```{r} cov_hc3 <- vcov_hc(fit, type = "hc3") cov_hc3 vcov(cov_hc3) ``` Plot the adjustment factors against leverage values for a covariance object. ```{r, fig.width = 7, fig.height = 4, fig.alt = "Scatterplot of HC3 adjustment factors against leverage values for the public-schools model."} plot(cov_hc3) ``` The covariance object also has a summary method. ```{r} summary(cov_hc3) ``` ## Compare diagnostics across methods All covariance and inference objects store robust weights and leverage values. The leverage values depend only on the fitted model, while the weights depend on the HC estimator. ```{r} covariances <- purrr::map(methods, \(method) { vcov_hc(fit, type = method) }) names(covariances) <- methods diagnostic_comparison <- purrr::imap(covariances, \(cov, method) { tibble::tibble( method = method, max_leverage = max(cov$leverage), max_weight = max(cov$weights), median_weight = stats::median(cov$weights) ) }) diagnostic_comparison <- dplyr::bind_rows(diagnostic_comparison) diagnostic_comparison ``` The next figure mirrors the empirical display in the HCbeta paper: adjustment factors are plotted against leverages for HC3, HC4, HC4m, and HCbeta. ```{r, fig.width = 8, fig.height = 6, fig.alt = "Faceted scatterplot of HC adjustment factors against leverage values for HC3, HC4, HC4m, and HCbeta."} figure_methods <- c("hc3", "hc4", "hc4m", "hcbeta") figure_covariances <- purrr::map(figure_methods, \(method) { vcov_hc(fit, type = method) }) names(figure_covariances) <- figure_methods weight_comparison <- purrr::imap(figure_covariances, \(cov, method) { tibble::tibble( method = cov$label, leverage = cov$leverage, weight = cov$weights, high_leverage = cov$leverage > 3 * cov$p / cov$n ) }) |> dplyr::bind_rows() |> dplyr::mutate( method = factor(method, levels = c("HC3", "HC4", "HC4m", "HCbeta")) ) ggplot2::ggplot(weight_comparison, ggplot2::aes(x = leverage, y = weight)) + ggplot2::geom_point( ggplot2::aes(color = high_leverage), size = 1.8, alpha = 0.85 ) + ggplot2::facet_wrap(~method, scales = "free_y", ncol = 2) + ggplot2::scale_color_manual( values = c(`FALSE` = "#2c5f8a", `TRUE` = "#c0392b"), guide = "none" ) + ggplot2::labs( x = expression(h[t]), y = expression(g[t]) ) + ggplot2::theme_minimal(base_size = 12) + ggplot2::theme( strip.text = ggplot2::element_text(face = "bold"), panel.grid.minor = ggplot2::element_blank() ) ``` ## What to report A compact reporting table usually needs the estimator, estimate, robust standard error, p-value, and interval endpoints. ```{r} comparison |> dplyr::select(method, estimate, std_error, p_value, conf_low, conf_high) ``` Use this comparison to document how sensitive your conclusion is to the choice of HC estimator. Use `vignette("hcinfer-hcbeta")` for a closer look at the default HCbeta estimator.