Lecture 19 (Week 10, Wednesday)

Sampling Distributions

  • sampling distribution of b1
  • could the value we get be generated from a DGP in which the value of b1 = 0?
  • scale: the units in which b1 is measured
  • larger standard error = wider sampling distribution (and wider confidence interval)
  • if the confidence interval includes zero, we can't reject the null hypothesis
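
The check described above can be sketched in base R. This is only an illustration on simulated data: the variable names `group` and `score`, the sample size, and the DGP are assumptions made for the example, not part of the lecture's data set.

```r
# Hypothetical two-group data; `group` and `score` are illustrative names only.
set.seed(19)
n <- 40
group <- rep(c(0, 1), each = n / 2)         # X: 0 = group 1, 1 = group 2
score <- 5 + 0 * group + rnorm(n, sd = 2)   # a DGP in which the true b1 is 0

fit <- lm(score ~ group)
b1 <- coef(fit)["group"]                    # estimated b1
ci <- confint(fit, "group", level = 0.95)   # 95% confidence interval for b1

# If the interval includes zero, b1 = 0 cannot be rejected at the .05 level.
includes_zero <- ci[1] <= 0 && ci[2] >= 0
print(b1)
print(ci)
print(includes_zero)
```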

The null model can always be rejected if the sample size is large enough (as long as the true difference is not exactly zero).

  • as the sample size grows, the confidence interval keeps getting smaller; at some point the sampling distribution gets so narrow that even a tiny difference leads you to reject the null hypothesis
  • so you have to decide what size of difference is large enough to matter
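
A small simulation can make the narrowing concrete. This is a sketch under assumed values: the helper `ci_width`, the true difference of 0.2, and the standard deviation of 2 are all hypothetical choices for illustration.

```r
set.seed(19)

# Hypothetical helper: simulate two groups of n/2 each and return the width
# of the 95% confidence interval for b1.
ci_width <- function(n, true_b1 = 0.2) {
  group <- rep(c(0, 1), each = n / 2)
  score <- 5 + true_b1 * group + rnorm(n, sd = 2)
  ci <- confint(lm(score ~ group), "group")
  ci[2] - ci[1]
}

sizes <- c(20, 200, 2000, 20000)
widths <- sapply(sizes, ci_width)

# The interval narrows as n grows, so a small true difference eventually
# falls outside an interval that excludes zero.
print(data.frame(n = sizes, ci_width = widths))
```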

Power and Type II Error

  • power: probability that you'll reject the null hypothesis if the null hypothesis is not true
  • to calculate power, you must assume a value for the true β
  • power + probability of a Type II error = 100%
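
Power can be estimated by simulation once a value of β is assumed. The sketch below uses hypothetical values (β = 1.5, standard deviation 2, 20 students per group) and a hypothetical helper `simulate_p`; none of these come from the lecture.

```r
set.seed(19)

# Hypothetical helper: simulate one two-group study with an assumed true beta
# and return the p-value for b1.
simulate_p <- function(n, beta, sd = 2) {
  group <- rep(c(0, 1), each = n / 2)
  score <- 5 + beta * group + rnorm(n, sd = sd)
  summary(lm(score ~ group))$coefficients["group", "Pr(>|t|)"]
}

p_values <- replicate(2000, simulate_p(n = 40, beta = 1.5))

power <- mean(p_values < 0.05)    # proportion of studies that reject the null
type2 <- mean(p_values >= 0.05)   # proportion that fail to reject (Type II error)
print(c(power = power, type2_error = type2))
```

Because every simulated study either rejects the null or fails to, the two proportions add to 1.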

Replication

New Data Set

Mueller & Dweck (1998)

  • interested in effect of feedback on willingness/skill to solve puzzles
  • Raven's progressive matrices
    • what fits in the missing space to complete this pattern?
    • nonverbal measure of general cognitive ability/intelligence
    • randomly assigned 5th graders into 3 groups
    • control condition: nothing else
    • effort condition: "you must have worked hard at these problems"
    • intelligence condition: "you must be smart at these problems"
  • order of problem sets: easy set, then difficult set, then new easy set

What outcomes might we be interested in?

head(sample(praisestudy,10))
  • why might PSDIFF be a better outcome variable than PS3?
  • equalizes differences between different groups

Look at effect of FEEDBACK on PS1 graphically.

tally(~FEEDBACK, data=praisestudy)
  • 46 people in control, 38 in effort group, 39 in intelligence group
gf_histogram(~PS1, data=praisestudy, bins=9) %>%
gf_facet_grid(FEEDBACK~.)
favstats(PS1~FEEDBACK, data=praisestudy)

Write the model for PS1 explained by FEEDBACK using GLM notation.

  • Yi = b0 + b1X1i + b2X2i + ei

What does X2i represent in the three-group model?

  • whether or not a particular subject is in group 3
  • b2 is the increment between group 1 and group 3
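
The dummy coding behind X1i and X2i can be inspected with `model.matrix`. This sketch uses a tiny made-up factor, and it assumes the level order control, effort, intelligence; the actual level order in praisestudy may differ.

```r
# Hypothetical five-student example; level order is an assumption.
feedback <- factor(c("control", "effort", "intelligence", "effort", "control"),
                   levels = c("control", "effort", "intelligence"))

# Columns: intercept (b0), X1 (1 if in group 2), X2 (1 if in group 3).
X <- model.matrix(~ feedback)
print(X)

# X2i is 1 only for the student in the third group.
x2 <- unname(X[, "feedbackintelligence"])
print(x2)
```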

Fit the model of PS1 explained by FEEDBACK.

lm(PS1~FEEDBACK, data=praisestudy)

Why is the Effort group solving fewer problems correctly (on average) than the other two groups?

  • the difference could have been generated by the empty model
  • use tools to evaluate this model to see if it could have been generated randomly

What are we hoping for in this model comparison?

  • hoping to rule out the null model
  • can we rule out that b1 and b2 are both equal to 0?

How well does the feedback model fit the data?

supernova(lm(PS1~FEEDBACK, data=praisestudy))
  • F = 0.568
    • less than 1
    • if there is no difference between groups, F is expected to be about 1 (the model explains no more variation than chance alone would)
  • PRE = 0.0094
    • very low; less than 1% of the variation in PS1 is accounted for by our FEEDBACK variable
    • it should be low
  • p = .568
    • if the true value of the group difference is 0, then there is a .568 chance of getting an F at least as large as the one we got
    • we cannot rule out the null hypothesis
    • so we will go with the empty model
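
The three numbers in the supernova output are linked, which can be checked with a few lines of base R. This sketch assumes all 123 students from the tally have a PS1 score, so the error degrees of freedom are 120; if any scores are missing, the degrees of freedom would differ.

```r
n <- 46 + 38 + 39       # group sizes from the tally
df_model <- 2           # two added parameters: b1 and b2
df_error <- n - 3       # n minus the three estimated parameters (assumes no missing PS1)
PRE <- 0.0094

# F is the variation explained per model parameter divided by the
# unexplained variation per error degree of freedom.
F_from_PRE <- (PRE / df_model) / ((1 - PRE) / df_error)

# p is the area of the F distribution's upper tail beyond the reported F.
p_from_F <- pf(0.568, df_model, df_error, lower.tail = FALSE)

print(round(F_from_PRE, 3))
print(round(p_from_F, 3))
```

Under that assumption both values come out close to the reported ones; the small gap in F comes from using the rounded PRE. That F and p are nearly the same number here is a coincidence of these degrees of freedom, not a general rule.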

Sampling distribution of F vs. PRE

fVal(PS1~FEEDBACK, data=praisestudy)
  • F = .568
  • this is the real F
fVal(PS1~shuffle(FEEDBACK), data=praisestudy)
  • gives one F for the sampling distribution
  • the output is different each time it is run
  • generating sampling distribution of F assuming there is no relationship between FEEDBACK and the outcome variable
SDoB1 <- do(1000)*fVal(PS1~shuffle(FEEDBACK), data=praisestudy)
gf_histogram(~fVal, data=SDoB1)
  • do this 1000 times and save results in a sampling distribution
  • look at distribution of Fs inside new data frame
  • not normal, but looks like F distribution
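  • p-value: probability of observing an F as far out in the tail as ours (or farther)
  • tells us how likely an F this large is if all the groups are equal, not the probability that the groups are equal
  • this is the random distribution of Fs if there is absolutely no relationship between FEEDBACK and PS1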
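
The same shuffling idea can be written in base R without `do()`, `shuffle()`, or `fVal()`. This is a rough sketch on simulated stand-in data: the vectors `feedback` and `ps1` and the helper `f_stat` are hypothetical, and only the group sizes are taken from the tally above.

```r
set.seed(19)

# Hypothetical stand-in for praisestudy: same group sizes, no real group effect.
feedback <- factor(rep(c("control", "effort", "intelligence"),
                       times = c(46, 38, 39)))
ps1 <- rnorm(length(feedback), mean = 5, sd = 2)

# Hypothetical helper: F statistic for the model y explained by g.
f_stat <- function(y, g) anova(lm(y ~ g))[1, "F value"]

observed_F <- f_stat(ps1, feedback)

# Shuffling the group labels breaks any link between group and outcome,
# so these 1000 Fs form a sampling distribution under the empty model.
shuffled_F <- replicate(1000, f_stat(ps1, sample(feedback)))

# p-value: proportion of shuffled Fs at least as large as the observed F.
p_value <- mean(shuffled_F >= observed_F)
print(observed_F)
print(summary(shuffled_F))
print(p_value)
```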