-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathGCA_Factor_Analyses.Rmd
114 lines (95 loc) · 3.9 KB
/
GCA_Factor_Analyses.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
---
title: "Factor Analysis of GCA"
author: "Derek Briggs"
date: "May 18, 2017"
output:
pdf_document: default
html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Data
I'm going to break this data into two sets, one for EFA, one for CFA
```{r}
library(psych)
d<-read.csv("1516GCA.csv")
x<-seq(from=1, to=nrow(d), by=2)
y<-seq(from=2, to=nrow(d), by=2)
d1<-d[x,]
d2<-d[y,]
df<-as.data.frame(d)
```
##Item Descriptives and Reliability
Let's have a look at item descriptives and alpha
```{r}
alpha(df,na.rm=T)
```
One thing to notice here is that item 19 stands out as having by far the lowest correlations with total score based on all other GCA items.
##Dimensionality Analyses
```{r}
cm<-cor(d1,use="complete.obs")
ev <- eigen(cm)
library(corrplot)
cor.plot(cm)
```
Let's run a Parallel Analysis Scree Plot
```{r}
fa.parallel(cm, n.obs = nrow(d1), fm="minres", fa="both",
main = "Parallel Analysis Scree Plots",
n.iter=100, error.bars=FALSE, SMC=FALSE,
ylabel=NULL, show.legend=TRUE)
```
So based on principal components could make a case for about 2; based on factors could make a case for up to 4 or 5. I'm going to run EFA with 2, 3 and 4 factors.
Starting with 2 Factors
```{r}
mod1<-fa(d1,nfactors=2,rotate="Promax",fm="pa",cor="tet")
print(mod1$loadings,cut=.3)
mod1$rms
```
Not bad. We have a Heywood case due to item 10, and a cross-loading for items 24. But root mean residual is just .04. This solution explains about 32% of item covariance. Notice also that three items don't have loadings > .3 on either factor (items 2, 3 and 25). Let's try 3 factors.
```{r}
mod2<-fa(d1,nfactors=3,rotate="Promax",fm="pa",cor="tet")
print(mod2$loadings,cut=.3)
mod2$rms
```
The 3 factor solution explains a little more covariance, reduces root mean residual from .043 to .038. Still getting that Heywood case. Also notice that items 2, 3 and 8 don't have loadings > .3. With three factors we get a lot more cross-loadings that show up--tricky to interpret. Item 19 now shows up as a problem. Lastly, let's try 4 factors.
```{r}
mod3<-fa(d1,nfactors=4,rotate="Promax",fm="pa",cor="tet")
print(mod3$loadings,cut=.3)
mod3$rms
```
OK, so this is interesting. With the 4 factor solution we no longer have the Heywood case, but now the cumulative proportion of variance explained is lower than it was in the 3 factor solution! Root mean residual is down to .033. But notice that only item 19 loads on the 4th factor. We also now lose item 22 on top of 2, 3 and 8. Here's a visualization
```{r}
cor.plot(mod3)
```
I'm going to try specifying a CFA model based on results from the two factor EFA. I'm gonna drop item 19 and not only keep strongest factor loading for each item.
## CFA
```{r }
library("lavaan")
fac3_mod <-
'factor1 =~ GCA1 + GCA4 + GCA7 + GCA8 + GCA9
+ GCA13 + GCA17 + GCA21 + GCA24
factor2 =~ GCA6 + GCA10 + GCA11 + GCA12 + GCA14
+ GCA15 + GCA16 + GCA18 + GCA20 + GCA22 + GCA23
+ GCA24'
fac3mod <- cfa(model = fac3_mod, data = d2, ordered = names(d2) ,estimator = "DWLS", se = "robust", test = "scaled.shifted", std.lv = TRUE)
summary(fac3mod,fit.measures = TRUE)
mi <- modindices(fac3mod)
mi[mi$op == "=~",]
```
The fit of this model is pretty good. Of course, a problem is that I've played around with other possibilities that also fit equally well! Next step is to do a CFA based on mapping of items to intended learning objectives (1-8). Need at least two items per factor, so dropping GCA25 for now.
```{r }
fac7_mod <-
'factor1 =~ GCA3 + GCA6 + GCA10 + GCA14 + GCA16 + GCA18
factor2 =~ GCA1 + GCA8 + GCA17
factor3 =~ GCA2 + GCA7 + GCA20 + GCA22 + GCA23
factor4 =~ GCA13 + GCA24
factor5 =~ GCA5 + GCA9 +GCA19
factor6 =~ GCA4 + GCA11 + GCA12
factor7 =~ GCA15 + GCA21
'
fac7mod <- cfa(model = fac7_mod, data = d2, ordered = names(d2) ,estimator = "DWLS", se = "robust", test = "scaled.shifted", std.lv = TRUE)
summary(fac7mod,fit.measures = TRUE)
inspect(fac7mod,"cov.lv")
```