FEGLM Adding one of the fixed effects as dummy gives different results than TWFE. #523

YavuzMehmet2 · 2024-08-20T20:02:53Z

Hi all,

First of all, thank you for a fantastic package; I will make sure to cite it explicitly in my papers!

I'm estimating a two-way fixed effects (TWFE) logit model in R with the fixest package. The model interacts several binary event indicators with a categorical variable called group. To report a coefficient for each level of one of the fixed effects, I entered that fixed effect as a regular control variable instead of absorbing it.

Concretely, when wave enters as a set of dummies (via i(wave, ref = "Wave5")) rather than as a fixed effect after the |, the coefficients stay the same, but the standard errors and p-values differ from those of the standard TWFE model.

library(fixest)

# Simulate data
set.seed(12345)
n <- 1000  
data <- data.frame(
  outcome = rbinom(n, 1, 0.5),
  group = factor(sample(c("Group1", "Group2", "Group3"), n, replace = TRUE)),
  event1 = rbinom(n, 1, 0.3),
  event2 = rbinom(n, 1, 0.2),
  event3 = rbinom(n, 1, 0.4),
  event4 = rbinom(n, 1, 0.25),
  age_group = factor(sample(2:4, n, replace = TRUE)),
  gender = factor(sample(c("Male", "Female"), n, replace = TRUE)),
  education = factor(sample(c("Low", "Medium", "High"), n, replace = TRUE)),
  ethnicity = factor(sample(c("Ethnic1", "Ethnic2"), n, replace = TRUE)),
  income_level = factor(sample(1:4, n, replace = TRUE)),
  perception = factor(sample(1:3, n, replace = TRUE)),
  previous_vote = rbinom(n, 1, 0.4),
  struggle = factor(sample(1:3, n, replace = TRUE)),
  news_type = factor(sample(c("News1", "News2", "News3", "NoNews"), n, replace = TRUE)),
  location_type = factor(sample(c("Urban", "Rural"), n, replace = TRUE)),
  city = factor(sample(1:10, n, replace = TRUE)),
  wave = factor(sample(paste0("Wave", 1:5), n, replace = TRUE)),
  weight = runif(n, 0.5, 2)
)

# Display results in a table, running the models directly inside etable
etable(
  feglm(
    outcome ~ i(group, ref = "Group1") + 
      i(group, event1, ref = "Group1", ref2 = "0") + 
      i(group, event2, ref = "Group1", ref2 = "0") + 
      i(group, event3, ref = "Group1", ref2 = "0") + 
      i(group, event4, ref = "Group1", ref2 = "0") + 
      i(age_group, ref = "2") + gender + education + 
      i(ethnicity, ref = "Ethnic1") + income_level + 
      i(perception, ref = "2") + as.factor(previous_vote) + 
      i(struggle, ref = "2") + as.factor(news_type) + location_type | city + wave,
    data = data, 
    family = binomial("logit"), 
    weights = data$weight,
    cluster = ~city + wave,
    ssc = ssc(adj = FALSE, cluster.adj = FALSE)
  ),
  feglm(
    outcome ~ i(group, ref = "Group1") + i(wave, ref = "Wave5") + 
      i(group, event1, ref = "Group1", ref2 = "0") + 
      i(group, event2, ref = "Group1", ref2 = "0") + 
      i(group, event3, ref = "Group1", ref2 = "0") + 
      i(group, event4, ref = "Group1", ref2 = "0") + 
      i(age_group, ref = "2") + gender + education + 
      i(ethnicity, ref = "Ethnic1") + income_level + 
      i(perception, ref = "2") + as.factor(previous_vote) + 
      i(struggle, ref = "2") + as.factor(news_type) + location_type | city,
    data = data, 
    family = binomial("logit"), 
    weights = data$weight,
    cluster = ~city + wave,
    ssc = ssc(adj = FALSE, cluster.adj = FALSE)
  )
)

Variance contained negative values in the diagonal and was 'fixed' (a la Cameron, Gelbach & Miller 2011).
                                      feglm(outcome ~.. feglm(outcome ~...1
Dependent Var.:                                 outcome             outcome
                                                                           
group = Group2                         -0.2409 (0.2054)    -0.2409 (0.2411)
group = Group3                          0.1775 (0.2043)     0.1775 (0.2193)
event1 x group = Group2                -0.0144 (0.2281)    -0.0144 (0.2793)
event1 x group = Group3               -0.5420* (0.2496)   -0.5420. (0.3037)
event2 x group = Group2                 0.4310 (0.3427)     0.4310 (0.3645)
event2 x group = Group3                -0.3085 (0.2689)    -0.3085 (0.3193)
event3 x group = Group2                 0.2083 (0.2124)     0.2083 (0.2618)
event3 x group = Group3                -0.0349 (0.2878)    -0.0349 (0.3135)
event4 x group = Group2                -0.2356 (0.2701)    -0.2356 (0.3173)
event4 x group = Group3                -0.0930 (0.2075)    -0.0930 (0.2846)
age_group = 3                          -0.0434 (0.2539)    -0.0434 (0.2769)
age_group = 4                           0.1120 (0.2049)     0.1120 (0.2359)
genderMale                              0.0899 (0.1181)     0.0899 (0.1707)
educationLow                           -0.0650 (0.1390)    -0.0650 (0.1930)
educationMedium                        -0.0057 (0.1607)    -0.0057 (0.2067)
i(factor_var=ethnicity,ref="Ethnic1") -0.1301. (0.0769)    -0.1301 (0.1324)
income_level2                          -0.1411 (0.1032)    -0.1411 (0.1836)
income_level3                          -0.0132 (0.1947)    -0.0132 (0.2517)
income_level4                          -0.2512 (0.2213)    -0.2512 (0.2510)
perception = 1                        -0.1894* (0.0793)    -0.1894 (0.1572)
perception = 3                         -0.0761 (0.1769)    -0.0761 (0.2251)
as.factor(previous_vote)1               0.1698 (0.1279)     0.1698 (0.1768)
struggle = 1                           -0.1089 (0.1668)    -0.1089 (0.2044)
struggle = 3                           -0.0745 (0.2287)    -0.0745 (0.2482)
as.factor(news_type)News2               0.0170 (0.1297)     0.0170 (0.1911)
as.factor(news_type)News3              -0.0712 (0.2071)    -0.0712 (0.2337)
as.factor(news_type)NoNews              0.0709 (0.2078)     0.0709 (0.2248)
location_typeUrban                     -0.0313 (0.1415)    -0.0313 (0.1784)
wave = Wave1                                                0.2194 (0.1759)
wave = Wave2                                               -0.0798 (0.1669)
wave = Wave3                                                0.0673 (0.1890)
wave = Wave4                                                0.2126 (0.3022)
Fixed-Effects:                        -----------------   -----------------
city                                                Yes                 Yes
wave                                                Yes                  No
_____________________________________ _________________   _________________
S.E.: Clustered                         by: city & wave     by: city & wave
Observations                                      1,000               1,000
Squared Cor.                                    0.03632             0.03632
Pseudo R2                                      -0.02464            -0.02464
BIC                                             2,047.4             2,047.4
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Questions:

1. Is the difference in standard errors and p-values between the two models due to a degrees-of-freedom adjustment, even though I set ssc(adj = FALSE, cluster.adj = FALSE)?
2. Is this a statistical problem (e.g. the incidental parameters problem) or purely a software/package issue? If it is just a software issue, how can I fix it?
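
A smaller diagnostic along the following lines might help separate the two explanations. It is only a sketch under stated assumptions: the data frame `d`, the single regressor `x`, and the helper `compare_se()` are made up for illustration and are not part of the model above. It fits the same logit once with wave absorbed and once with wave as dummies, then recomputes the standard error of `x` under three variance choices, each with the small-sample corrections switched off, so it becomes visible at which step the two specifications start to diverge.

```r
library(fixest)

# Hypothetical toy data: one regressor instead of the full covariate set
set.seed(12345)
n <- 1000
d <- data.frame(
  outcome = rbinom(n, 1, 0.5),
  x       = rnorm(n),
  city    = factor(sample(1:10, n, replace = TRUE)),
  wave    = factor(sample(paste0("Wave", 1:5), n, replace = TRUE))
)

# Same model: wave absorbed as a fixed effect vs. wave entered as dummies
m_fe    <- feglm(outcome ~ x | city + wave, data = d,
                 family = binomial("logit"))
m_dummy <- feglm(outcome ~ x + i(wave, ref = "Wave5") | city, data = d,
                 family = binomial("logit"))

# Small-sample corrections off, as in the models above
no_ssc <- ssc(adj = FALSE, cluster.adj = FALSE)

# Hypothetical helper: SE of x in both fits for a given variance choice
compare_se <- function(...) {
  c(fe    = unname(se(m_fe, ...)["x"]),
    dummy = unname(se(m_dummy, ...)["x"]))
}

se_table <- rbind(
  iid    = compare_se(vcov = "iid", ssc = no_ssc),
  city   = compare_se(cluster = ~city, ssc = no_ssc),
  twoway = compare_se(cluster = ~city + wave, ssc = no_ssc)
)
print(se_table)
```

If the two columns agree in the iid and one-way rows but not in the two-way row, that would point toward the two-way clustered variance (and possibly the "fixed" negative diagonal reported in the warning) rather than a degrees-of-freedom adjustment. That reading is a guess, not something confirmed by the package documentation.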
