Chapters 9 & 11 Fail to Knit #10

@alex-gable

Problem

Chapters 9 and 11 are failing to knit due to changes in dplyr (1.0) or broom (0.7).

I consulted this StackOverflow post for guidance (and provided my own solution) to resolve the error, which occurs in the locations below.

Details

Example Solution

Looking at the documentation for do(), it appears to have been superseded, with a recommendation to use nest_by() instead. Conveniently, the documentation examples cover almost this exact use case (see details).

Details
# do() with named arguments becomes nest_by() + mutate() & list()
models <- by_cyl %>% do(mod = lm(mpg ~ disp, data = .))
# ->
models <- mtcars %>%
  nest_by(cyl) %>%
  mutate(mod = list(lm(mpg ~ disp, data = data)))
models %>% summarise(rsq = summary(mod)$r.squared)

# use broom to turn models into data
models %>% do(data.frame(
  var = names(coef(.$mod)),
  coef(summary(.$mod)))
)
# ->
if (requireNamespace("broom")) {
  models %>% summarise(broom::tidy(mod))
}

For the chunk containing errors on lines 400-414:

regressions <- smallchart.long %>% 
  nest_by(schoolid) %>% 
  mutate(fit = list(lm(MathAvgScore ~ year08, data=data)))

sd_filter <- smallchart.long %>%
  group_by(schoolid) %>%
  summarise(sds = sd(MathAvgScore)) 

regressions <- regressions %>%
  right_join(sd_filter, by="schoolid") %>%
  filter(!is.na(sds))

lm_info1 <- regressions %>%
  summarise(tidy(fit)) %>%
  ungroup() %>%
  select(schoolid, term, estimate) %>%
  spread(key = term, value = estimate) %>%
  rename(rate = year08, int = `(Intercept)`)

lm_info2 <- regressions %>%
  summarise(tidy(fit)) %>%
  ungroup() %>%
  select(schoolid, term, std.error) %>%
  spread(key = term, value = std.error) %>%
  rename(se_rate = year08, se_int = `(Intercept)`)

lm_info <- regressions %>%
  summarise(glance(fit)) %>%
  ungroup() %>%
  select(schoolid, r.squared, df.residual) %>%
  inner_join(lm_info1, by = "schoolid") %>%
  inner_join(lm_info2, by = "schoolid") %>%
  mutate(tstar = qt(.975, df.residual), 
         intlb = int - tstar * se_int, 
         intub = int + tstar * se_int,
         ratelb = rate - tstar * se_rate, 
         rateub = rate + tstar * se_rate)
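
For anyone who wants to check the pattern without the book's data, here is a minimal sketch of the same nest_by() + tidy() + spread() pipeline on the built-in mtcars dataset. The dataset, the grouping variable (cyl), and the object names are stand-ins chosen for illustration, not anything from the chapter. Note that newer dplyr releases (1.1 and later) may warn that summarise() returned several rows per group and suggest reframe() instead; the result is the same.

```r
library(dplyr)
library(tidyr)
library(broom)

# Illustrative stand-in for smallchart.long: one lm() per cylinder group
fits <- mtcars %>%
  nest_by(cyl) %>%
  mutate(fit = list(lm(mpg ~ disp, data = data)))

# One row per group, with intercept and slope in their own columns
coef_wide <- fits %>%
  summarise(tidy(fit)) %>%
  ungroup() %>%
  select(cyl, term, estimate) %>%
  spread(key = term, value = estimate) %>%
  rename(rate = disp, int = `(Intercept)`)

coef_wide
```

The list() wrapper inside mutate() is what lets each row hold a whole model object, which is why tidy(fit) and glance(fit) can then be applied row by row.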

This solution can be copied nearly line for line to fix the errors occurring on lines 461-475.

Chapter 9 also has an issue here: knitting fails because the model does not converge. Raising the iteration limit to 500 (msMaxIter=500) seemed to do the trick:

hcs.lme=lme(MathAvgScore ~ year08 * charter, chart.long, 
  random =  ~ 1 | schoolid, na.action=na.exclude,
  correlation=corCompSymm(form = ~ 1 |schoolid), 
  weights=varIdent(form = ~1|year08), control = lmeControl(msMaxIter=500))

summary(hcs.lme)
# Linear mixed-effects model fit by REML
#   Data: chart.long 
#       AIC     BIC  logLik
#   10299.2 10348.3 -5140.6
# 
# Random effects:
#  Formula: ~1 | schoolid
#         (Intercept) Residual
# StdDev: 0.002264717 6.534915
# 
# Correlation Structure: Compound symmetry
#  Formula: ~1 | schoolid 
#  Parameter estimate(s):
#      Rho 
# 0.8209145 
# Variance function:
#  Structure: Different standard deviations per stratum
#  Formula: ~1 | year08 
#  Parameter estimates:
#        0        1        2 
# 1.000000 1.127902 1.079423 
# Fixed effects:  MathAvgScore ~ year08 * charter 
#                   Value Std.Error   DF   t-value p-value
# (Intercept)    652.3347 0.2828597 1113 2306.2126  0.0000
# year08           1.1831 0.0907869 1113   13.0320  0.0000
# charter         -5.9106 0.8611940  616   -6.8633  0.0000
# year08:charter   0.8316 0.3032040 1113    2.7426  0.0062
#  Correlation: 
#                (Intr) year08 chartr
# year08         -0.208              
# charter        -0.328  0.068       
# year08:charter  0.062 -0.299 -0.308
# 
# Standardized Within-Group Residuals:
#        Min         Q1        Med         Q3        Max 
# -4.9760770 -0.4490767  0.0865079  0.5669240  3.0970658 
# 
# Number of Observations: 1733
# Number of Groups: 618 

hcs.lme$modelStruct
# reStruct  parameters:
#  schoolid 
# -7.967465 
# corStruct  parameters:
# [1] 1.998216
# varStruct  parameters:
# [1] 0.1203593 0.0764270

anova(hcs.lme,cs.lme)   # hcs not converging here
#         Model df      AIC      BIC    logLik   Test  L.Ratio p-value
# hcs.lme     1  9 10299.20 10348.30 -5140.600                        
# cs.lme      2  7 10315.94 10354.13 -5150.973 1 vs 2 20.74528  <.0001
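
The control argument used in the lme() call above can be tried in isolation. This is a small sketch on the Orthodont dataset that ships with nlme; the model formula and object names are illustrative only and have nothing to do with the chapter's model. The only piece carried over is lmeControl(msMaxIter = 500).

```r
library(nlme)

# Raise the optimizer's iteration cap (the nlme default is 50)
ctrl <- lmeControl(msMaxIter = 500)

# Illustrative random-intercept model on a dataset bundled with nlme
fit <- lme(distance ~ age, data = Orthodont,
           random = ~ 1 | Subject, control = ctrl)

ctrl$msMaxIter
```

Building the control object once and passing it by name makes it easy to reuse the same setting across several lme() calls in a chapter.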

Finally, in Chapter 11, there's a missing library(broom) and a handful of unqualified select() calls that need the dplyr:: prefix.

Details
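
As a rough illustration of the select() fix, here is what a namespace-qualified call looks like. The example uses the built-in mtcars data rather than anything from Chapter 11, and the assumption that another attached package is masking select() (MASS is one package that exports a function of that name) is mine, not something confirmed from the chapter source.

```r
library(dplyr)

# Qualifying the call with dplyr:: means it still works even if a
# package attached later also exports a function called select()
cars_small <- mtcars %>%
  dplyr::select(mpg, cyl, disp)

names(cars_small)
```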

Hope this unsolicited help is, well, helpful!
