Generating and Exporting Nested Regression Tables

This tutorial takes on two super important tasks. First, how can we generate tables that present side-by-side results from regression analyses? Nested tables help us to better understand social processes by examining change in coefficients across models. Second, how can we export those tables to a format that is useful for interpretation and presentation? Conducting analyses means running a plethora of models, so having a quick and easy way to export these results is indispensable. The R package “huxtable” helps us to accomplish both goals.

The first thing to do is to load the necessary packages.

library(tidyverse) # always load the tidy tools
library(haven) # to read in Stata .dta files
library(huxtable) # package for estimating margins

Let’s read in some educational data for our example.

mlmdata <- read_dta("https://stats.idre.ucla.edu/stat/examples/imm/imm10.dta")
glimpse(mlmdata)
## Rows: 260
## Columns: 19
## $ schid    <dbl> 7472, 7472, 7472, 7472, 7472, 7472, 7472, 7472, 7472, 7472, 7~
## $ stuid    <dbl> 3, 8, 13, 17, 27, 28, 30, 36, 37, 42, 52, 53, 61, 64, 72, 83,~
## $ ses      <dbl> -0.13, -0.39, -0.80, -0.72, -0.74, -0.58, -0.83, -0.51, -0.56~
## $ meanses  <dbl> -0.4826087, -0.4826087, -0.4826087, -0.4826087, -0.4826087, -~
## $ homework <dbl> 1, 0, 0, 1, 2, 1, 5, 1, 1, 2, 1, 1, 1, 2, 1, 4, 1, 2, 1, 1, 1~
## $ white    <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~
## $ parented <dbl> 2, 2, 2, 2, 2, 2, 2, 3, 2, 3, 2, 3, 2, 1, 2, 3, 3, 1, 3, 3, 3~
## $ public   <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~
## $ ratio    <dbl> 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 1~
## $ percmin  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0~
## $ math     <dbl> 48, 48, 53, 42, 43, 57, 33, 64, 36, 56, 48, 48, 44, 35, 50, 3~
## $ sex      <dbl> 2, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2~
## $ race     <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4~
## $ sctype   <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~
## $ cstr     <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2~
## $ scsize   <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3~
## $ urban    <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2~
## $ region   <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2~
## $ schnum   <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1~

Generating a formatted regression table

Create a dummy variable for female. Estimate a model predicting mathematics scores.

mlmdata$female <- ifelse(mlmdata$sex == 2, 1, 0)
ols <- lm(math ~ homework + ses + female + white, mlmdata)
summary(ols)
## 
## Call:
## lm(formula = math ~ homework + ses + female + white, data = mlmdata)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.9356  -5.8326   0.5589   5.2318  23.8706 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  46.7198     1.4598  32.005  < 2e-16 ***
## homework      2.1654     0.3648   5.936 9.51e-09 ***
## ses           5.5351     0.6394   8.656 5.61e-16 ***
## female       -0.2930     1.0254  -0.286    0.775    
## white         1.0309     1.2741   0.809    0.419    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.244 on 255 degrees of freedom
## Multiple R-squared:  0.4604, Adjusted R-squared:  0.4519 
## F-statistic: 54.39 on 4 and 255 DF,  p-value: < 2.2e-16

Homework and socio-economic status are positively and significantly associated with math scores. Gender and race differences in math scores are not statistically significant.

Let’s use huxreg to generate a table display these findings.

huxreg(ols)
(1)
(Intercept)46.720 ***
(1.460)   
homework2.165 ***
(0.365)   
ses5.535 ***
(0.639)   
female-0.293    
(1.025)   
white1.031    
(1.274)   
N260        
R20.460    
logLik-914.862    
AIC1841.724    
*** p < 0.001; ** p < 0.01; * p < 0.05.

The table presents the information in a format that is common to see in social science journals. Before we get to exporting, let’s take a look at some of the options.

huxtable options

One can adjust various aspects of the table. First, it is possible to decide on the summary statistics. In this case, we are asking R for only the number of observations and the R-squared value, with the original quotations referring to the label for the statistic (e.g., ‘N’) and the name of the object (e.g., “nobs”) in the summary list.

Second, we can control the note at the bottom of the table. In this case, the script asks for a note about the stars for significance and adds text describing the data.

Third, we can adjust the coefficient labels. Again, the first quoted text describes what we want the label to be (e.g., “Homework (hours)”), while the second calls on the variable name (“homework”).

huxreg(ols,
       statistics = c('N' = "nobs", 'R-squared' = "r.squared"),
       note = "{stars}
        Source: UCLA example education data",
       coefs = c("Homework (hours)" = "homework",
                 "Socio-economic status (SES)" = "ses",
                 "Female" = "female",
                 "White" = "white"))
(1)
Homework (hours)2.165 ***
(0.365)   
Socio-economic status (SES)5.535 ***
(0.639)   
Female-0.293    
(1.025)   
White1.031    
(1.274)   
N260        
R-squared0.460    
*** p < 0.001; ** p < 0.01; * p < 0.05
Source: UCLA example education data

Looks nice, right!

Exporting huxtables

There are several options for exporting these tables, but probably the most useful option is exporting to a Word document.

The script below is identical to the previous script, with one important exception. After generating the table, I added a piping option (%>%) in order to export the table to a Word doc via the quick_docx command. Note that you will likely need to set a working directory in order to save this file to wherever you want it to go.

huxreg(ols,
       statistics = c('N' = "nobs", 'R-squared' = "r.squared"),
       note = "{stars}
        Source: UCLA example education data",
       coefs = c("Homework (hours)" = "homework",
                 "Socio-economic status (SES)" = "ses",
                 "Female" = "female",
                 "White" = "white")) %>%
  quick_docx(file = "example_huxreg.docx")

Exporting a nested table

The last thing I want to show is how to generate and export a second model to appear alongside the first. Below I save the results from a model that is the same as the initial ols model, except that it includes an interaction between SES and white.

Once these results are saved, we need to generate a list including the name of the two models. Then we can feed that list into the huxreg command to generate the nested table. Below I display what it looks like before adding the export option.

ols2 <- lm(math ~ homework + ses + female + white + ses:white, mlmdata)

models <- list("Model 1" = ols, "Model 2" = ols2)

huxreg(models,
       statistics = c('N' = "nobs", 'R-squared' = "r.squared"),
       note = "{stars}
        Source: UCLA example education data",
       coefs = c("Homework (hours)" = "homework",
                 "Socio-economic status (SES)" = "ses",
                 "Female" = "female",
                 "White" = "white",
                 "SES*white" = "ses:white"))
Model 1Model 2
Homework (hours)2.165 ***2.102 ***
(0.365)   (0.359)   
Socio-economic status (SES)5.535 ***2.411 *  
(0.639)   (1.171)   
Female-0.293    -0.699    
(1.025)   (1.016)   
White1.031    3.129 *  
(1.274)   (1.417)   
SES*white        4.217 ** 
        (1.333)   
N260        260        
R-squared0.460    0.481    
*** p < 0.001; ** p < 0.01; * p < 0.05
Source: UCLA example education data

And now let’s run the same command with the export option to create a new Word document.

huxreg(models,
       statistics = c('N' = "nobs", 'R-squared' = "r.squared"),
       note = "{stars}
        Source: UCLA example education data",
       coefs = c("Homework (hours)" = "homework",
                 "Socio-economic status (SES)" = "ses",
                 "Female" = "female",
                 "White" = "white",
                 "SES*white" = "ses:white")) %>%
  quick_docx(file = "example_huxreg2.docx")

With the output exported to Word, one can make any number formatting changes to the table.

For more on this package, I recommend the following bookdown chapter. Note that you can use huxtable to generate descriptive statistics tables as well: https://bookdown.org/sarahwerth2024/RegressionLabsBook/lab-5-r.html