- sampling distribution of b1
- could the value we get be generated from a DGP in which the value of b1 = 0?
- scale: the units in which b1 is measured
- larger standard error = wider sampling distribution and wider confidence interval
- if the confidence interval includes zero, we can't reject the null hypothesis
The null model can always be rejected if the sample size is large enough (unless the true effect is exactly zero).
- as the sample grows, the confidence interval keeps getting smaller; at some point the sampling distribution gets so narrow that it excludes zero and you must reject the null hypothesis
- you decide what amount of difference is significant
- power: probability that you'll reject the null hypothesis when the null hypothesis is not true
- must assume a value of β
- power + type II error rate = 100%
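A minimal simulation sketch of the power idea above, in base R. The true difference (beta1), the residual SD (sigma), the alpha level and the group sizes are all assumed values chosen for illustration; none of them come from the notes.

```r
set.seed(42)

# Assumed values for illustration only (not from the notes)
beta1 <- 0.5   # true group difference in the DGP
sigma <- 2     # residual standard deviation
alpha <- 0.05  # rejection threshold

# Estimate power: the share of simulated studies that reject the null
power_sim <- function(n_per_group, reps = 500) {
  rejections <- replicate(reps, {
    group <- rep(c(0, 1), each = n_per_group)
    y <- 10 + beta1 * group + rnorm(2 * n_per_group, sd = sigma)
    p <- summary(lm(y ~ group))$coefficients["group", "Pr(>|t|)"]
    p < alpha
  })
  mean(rejections)
}

power_small <- power_sim(20)    # small study
power_large <- power_sim(500)   # large study, same true effect
type2_large <- 1 - power_large  # power + type II error rate = 1

c(power_small = power_small, power_large = power_large, type2_large = type2_large)
```

With the same nonzero beta1, the larger sample rejects the null far more often, which is the sense in which a big enough sample will eventually reject the null model.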
Mueller & Dweck (1998)
- interested in effect of feedback on willingness/skill to solve puzzles
- Raven's progressive matrices
- what fits in the missing space to complete this pattern?
- nonverbal measure of general cognitive ability/intelligence
- randomly assigned 5th graders into 3 groups
- control condition: nothing else
- effort condition: "you must have worked hard at these problems"
- intelligence condition: "you must be smart at these problems"
- easy set -> difficult set -> new easy set
What outcomes might we be interested in?
head(sample(praisestudy,10))
- why might PSDIFF be a better outcome variable than PS3?
- equalizes differences between different groups
Look at effect of FEEDBACK on PS1 graphically.
tally(~FEEDBACK, data=praisestudy)
- 46 people in control, 38 in effort group, 39 in intelligence group
gf_histogram(~PS1, data=praisestudy, bins=9) %>%
gf_facet_grid(FEEDBACK~.)
favstats(PS1~FEEDBACK, data=praisestudy)
Write the model for PS1 explained by FEEDBACK using GLM notation.
- Yi = b0 + b1X1i + b2X2i + ei
What does X2i represent in the three-group model?
- whether or not a particular subject is in group 3
- b2 is the increment between group 1 and group 3
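A small sketch of how R builds X1i and X2i for a three-level factor. The five-row factor here is a made-up stand-in, and it assumes the levels are ordered control, effort, intelligence (the order shown by tally), so that control is the reference group.

```r
# Hypothetical five-student example, not the praisestudy data
feedback <- factor(c("control", "effort", "intelligence", "effort", "control"))

# lm() uses this same coding behind the scenes
X <- model.matrix(~ feedback)
X

# Column 2 plays the role of X1i (1 if in the effort group, else 0)
# Column 3 plays the role of X2i (1 if in the intelligence group, else 0)
x2 <- X[, "feedbackintelligence"]
```

Only the third student has X2i = 1, so b2 is added to b0 only for students in the third group.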
Fit the model of FEEDBACK on PS1.
lm(PS1~FEEDBACK, data=praisestudy)
Why is the Effort group solving fewer problems correctly (on average) than the other two groups?
- could have been generated by the empty model
- use tools to evaluate this model to see if it could have been generated randomly
What are we hoping for in this model comparison?
- hoping to rule out the null model
- can we rule out that b1 and b2 are both equal to 0?
How well does the feedback model fit the data?
supernova(lm(PS1~FEEDBACK, data=praisestudy))
- F = 0.568
- less than 1
- if there is no difference in the DGP, F should be about 1 on average (the model explains no more variation per parameter than chance would)
- PRE = 0.0094
- very low; less than 1% of the variation in PS1 is accounted for by our FEEDBACK variable
- it should be low
- p = .568
- if the true group difference is 0, then there is a .568 chance of getting an F as large as (or larger than) the one we got
- we cannot rule out the null hypothesis
- will go with empty model
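A quick arithmetic check that the reported PRE, F and p fit together, using the standard relationship F = (PRE / df_model) / ((1 - PRE) / df_error). It assumes all 123 students from the tally are in the model, so the degrees of freedom are 2 and 120.

```r
# Values reported in the notes
pre <- 0.0094
n <- 46 + 38 + 39   # group sizes from tally; assumes no missing PS1 values

df_model <- 2       # b1 and b2
df_error <- n - 3   # three parameters estimated: b0, b1, b2

# F rebuilt from PRE, then the upper-tail probability from the F distribution
f_from_pre <- (pre / df_model) / ((1 - pre) / df_error)
p_value <- pf(f_from_pre, df_model, df_error, lower.tail = FALSE)

round(f_from_pre, 2)
round(p_value, 2)
```

Both values round to about 0.57, in line with the supernova output (small gaps come from PRE being rounded). F and p happening to be nearly equal here is a coincidence of these degrees of freedom, not a general rule.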
Sampling distribution of F vs. PRE
fVal(PS1~FEEDBACK, data=praisestudy)
- F = .568
- this is the real F
fVal(PS1~shuffle(FEEDBACK), data=praisestudy)
- gives one F for the sampling distribution
- is different each time it is run
- generating sampling distribution of F assuming there is no relationship between FEEDBACK and the outcome variable
SDoB1 <- do(1000)*fVal(PS1~shuffle(FEEDBACK), data=praisestudy)
gf_histogram(~fVal, data=SDoB1)
- do this 1000 times and save results in a sampling distribution
- look at distribution of Fs inside new data frame
- not normal, but looks like F distribution
- probability of observing an F out in the tail
- tells us how likely an F this large is if all the groups are equal in the DGP (not the probability that the groups are equal)
- random distribution of Fs if there is absolutely no relationship between FEEDBACK and PS1
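The shuffle approach can be sketched in base R without the course packages. The data below are simulated as a stand-in for praisestudy (only the group sizes come from the notes), and f_stat is a hypothetical helper written for this sketch, not the fVal function itself.

```r
set.seed(1)

# Simulated stand-in: group sizes from the notes, outcome unrelated to group
feedback <- factor(rep(c("control", "effort", "intelligence"),
                       times = c(46, 38, 39)))
ps1 <- rnorm(length(feedback), mean = 5, sd = 1.5)

# Hypothetical helper: F for the group model vs. the empty model
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]

observed_f <- f_stat(ps1, feedback)

# Shuffling the labels breaks any link between group and outcome
shuffled_f <- replicate(1000, f_stat(ps1, sample(feedback)))

# p-value: share of shuffled Fs at least as large as the observed F
p_shuffle <- mean(shuffled_f >= observed_f)

# Same tail area from the mathematical F distribution, for comparison
p_theory <- pf(observed_f, 2, length(ps1) - 3, lower.tail = FALSE)

c(observed_f = observed_f, p_shuffle = p_shuffle, p_theory = p_theory)
```

The two p-values should land close to each other, since the shuffled Fs approximate the F distribution with 2 and 120 degrees of freedom.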