Q1
Refer to the article "Exposure to Scientific Theories Affects Women's Math Performance"
by Ilan Dar-Nimrod and Steven J. Heine. You can find the article in the .pdf file [link].
The file contains -- courtesy of the first author -- the
pre- and post-manipulation mathematics scores on which the article is based, along
with some supplementary material (and analyses and notes from JH). If you have trouble
extracting the data from the pdf file, they, and some SAS code, can also be found
in this text file [link].
The analyses done by JH at the end of this .pdf file used the data from all 4 groups.
For this assignment, restrict attention to two groups, i.e., the 'ND vs. S' comparison,
and redo the requested analyses 'from scratch'. Note that for some portions below,
rather than work with the raw math1 scores, it may be easier to work with 'centered'
math1 scores (JH called the variable math1c) whose average across the combined
ND and S groups is 0. He got these by first obtaining the overall math1 mean in the
two groups combined, and subtracting this mean from each individual's math1 score.
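The centering step just described can be sketched as follows. This is a minimal illustration only: the scores are made-up placeholders, NOT the study data, and the dictionary layout is an assumption; only the names math1 and math1c come from the text above.

```python
from statistics import mean

# Hypothetical pre-manipulation scores (placeholders, not the study data).
math1 = {"ND": [10.0, 12.0, 14.0], "S": [9.0, 11.0, 10.0]}

# Overall mean of math1 in the two groups combined.
combined = [x for scores in math1.values() for x in scores]
overall_mean = mean(combined)

# Centered scores: subtract the combined mean from each individual's score.
math1c = {g: [x - overall_mean for x in scores] for g, scores in math1.items()}

print(overall_mean)
print(math1c)
```

By construction the centered scores average 0 across the combined groups, but not necessarily within each group.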
a. Check 'how well the randomization worked' by computing the mean pre-manipulation
math score in each of these two groups, and the difference of the two means. On this
basis, which of the two groups has a 'math advantage' even before the manipulation?
b. Compute the mean and SD of the post-manipulation math scores in each of
these two groups, and the (crude) difference of the two means. By hand, compute the
t-statistic (common variance version). Verify your calculation by running the t-test
in your favourite statistical package {it is called TTEST in SAS and ttest (or
the immediate form ttesti) in Stata}. Comment on the p-value [ or the CI for
the difference in means ].
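As a check on the hand calculation in (b), the common-variance t-statistic can be sketched in a few lines. The function name pooled_t and the scores are illustrative assumptions (placeholders, not the study data); the p-value and CI are left to your statistical package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t-statistic, common (pooled) variance version.

    Returns (difference in means, t-statistic, degrees of freedom).
    """
    n1, n2 = len(a), len(b)
    diff = mean(a) - mean(b)
    df = n1 + n2 - 2
    # Pooled variance: within-group sums of squares divided by total df.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return diff, diff / se, df

# Hypothetical post-manipulation scores (placeholders, not the study data).
nd_post = [10.0, 12.0, 14.0]
s_post = [7.0, 9.0, 11.0]
diff, t, df = pooled_t(nd_post, s_post)
print(diff, t, df)
```

The t-statistic is then referred to a t distribution with n1 + n2 - 2 degrees of freedom.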
c. As suggested by some, 'level the playing field' by working with the post-minus-pre
difference in math scores rather than the post-manipulation scores you used in (b),
i.e. repeat step (b) but using the change scores. Comment on the p-value.
d. Even within the ND group (or within the S group), there isn't a perfect
100% correlation between the pre- and post scores. For each group, plot the post-
vs. pre- scores. Obtain the (within-group) correlations of pre- and post- scores,
and the (again, within-group) regression equations of the post-scores on the pre-scores
(for the regressions: if in SAS, you can for example use PROC REG; if in Stata
you can use 'regress' ).
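For readers who want to see what the package computes in (d), here is a minimal sketch of a within-group correlation and least-squares line from sums of squares and cross-products. The function name corr_and_line and the scores are assumptions for illustration (placeholders, not the study data); run it once per group.

```python
from math import sqrt
from statistics import mean

def corr_and_line(pre, post):
    """Return (r, intercept, slope) for the regression of post on pre."""
    xbar, ybar = mean(pre), mean(post)
    sxx = sum((x - xbar) ** 2 for x in pre)
    syy = sum((y - ybar) ** 2 for y in post)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r = sxy / sqrt(sxx * syy)
    return r, intercept, slope

# Hypothetical scores for one group (placeholders, not the study data).
pre = [8.0, 10.0, 12.0, 14.0]
post = [9.0, 9.0, 12.0, 12.0]
r, intercept, slope = corr_and_line(pre, post)
print(r, intercept, slope)
```

The slope is the 'exchange rate' used in parts (e) to (g): the predicted difference in post-scores per 1-point difference in pre-scores.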
e. For two groups of ND subjects, based on the regression equation fitted
to the scores in the ND group, how far apart would you predict their averages to
be on the post-manipulation scores if on average they were (i) 1 point apart on the
pre-manipulation math exam? (ii) 0.9 points apart pre- ?
Make the same type of calculation for two groups of S individuals 1 point apart pre-manipulation.
f. Given that the mean pre-scores of the S and ND groups were in fact just
about 0.9 points apart, how far apart would you expect the mean post-scores to be
IF the manipulation had NO effect? Do the calculation twice, the first time using
an 'exchange rate' {slope} for the value of 1 extra point pre-manipulation based
on what you saw in the ND group, and the second time using the slope from the S group.
g. Given the crude difference you did see in the post-scores in (b), and the
advantage calculated in (f), what differences in the mean post-scores would you arrive
at if you have leveled the playing field using these two different correction factors?
{you can think of using the pre-scores as giving each person a different 'handicap'
in the second competition -- just as if the contest between ND and S groups involved
golf rather than math!}
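The arithmetic in (e) to (g) can be sketched as follows. All numbers are made-up placeholders (not the study results), the function name adjusted_difference is hypothetical, and differences are taken as S minus ND throughout; any consistent sign convention works.

```python
def adjusted_difference(crude_post_diff, pre_diff, slope):
    """Crude post-score difference minus the gap expected from the pre-score gap alone."""
    # Gap in mean post-scores expected if the manipulation had NO effect.
    expected_gap = slope * pre_diff
    return crude_post_diff - expected_gap

# Hypothetical inputs (placeholders, not the study results).
crude_post_diff = -2.0   # mean post-score, S minus ND
pre_diff = -1.0          # mean pre-score, S minus ND

for label, slope in [("ND-based slope", 0.5), ("S-based slope", 0.75)]:
    print(label, adjusted_difference(crude_post_diff, pre_diff, slope))
```

Using two different slopes gives two different 'leveled' differences, which is what motivates the single common slope in (h).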
h. Repeat step (c) but using an intermediate (common) exchange rate to obtain
an adjusted post-score for each subject, i.e.,
adjusted-post-score
= post-score - 0.58 x (pre-score - average pre-score in combined groups)
where 0.58 is the (assumed common) regression coefficient (slope) obtained in (i)
below.
{this approach uses parallel regression lines for the two groups. Next term,
you will learn how to test whether this assumption of parallel lines, i.e. a common
slope -- an assumption in the 'analysis of covariance' that the authors referred
to 6 lines from bottom of second column of article -- is justified by the data}.
Comment on the between-group difference in the means
of the adjusted values, and its associated p-value.
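A minimal sketch of the adjustment in (h), using the slope 0.58 quoted above. The scores, the dictionary layout and the function name adjust are assumptions for illustration (placeholders, not the study data).

```python
from statistics import mean

SLOPE = 0.58  # assumed common slope, as quoted in the assignment text

def adjust(pre, post, pre_mean_combined, slope=SLOPE):
    """adjusted-post-score = post-score - slope x (pre-score - combined pre mean)."""
    return [y - slope * (x - pre_mean_combined) for x, y in zip(pre, post)]

# Hypothetical scores (placeholders, not the study data).
pre = {"ND": [10.0, 12.0, 14.0], "S": [9.0, 11.0, 10.0]}
post = {"ND": [11.0, 12.0, 13.0], "S": [8.0, 10.0, 9.0]}

pre_mean = mean(x for scores in pre.values() for x in scores)
adjusted = {g: adjust(pre[g], post[g], pre_mean) for g in pre}

# Between-group difference in adjusted means (S minus ND).
adj_diff = mean(adjusted["S"]) - mean(adjusted["ND"])
print(adj_diff)
```

The difference in adjusted means equals the crude post-score difference minus 0.58 times the pre-score difference. The adjusted scores can then go through the same t-test as in (b); note that such a t-test treats the slope as known, so its degrees of freedom differ slightly from those of the regression in (i).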
i. For each subject, use an indicator variable S, with S=1 if in group S and S=0
if in group ND. Then run the following (multiple) regression equation:
(average) post-score = B0 + B1*pre-score + B2*S.
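In SAS: PROC REG; MODEL Post-score = pre-score S;
In Stata: regress Post-score pre-score S
(Post-score and pre-score stand for your actual variable names; hyphens are not
allowed in SAS or Stata variable names.)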
How close is the B2 coefficient for S to the difference
in (adjusted) means shown in Figure 1 (Left) in the article?
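To see where B1 and B2 in (i) come from, here is a sketch that uses a standard least-squares identity rather than a regression routine: with one covariate and one group indicator, B1 is the pooled within-group slope and B2 is the crude post-score difference minus B1 times the pre-score difference. The scores and helper names are assumptions for illustration (placeholders, not the study data); your package's output remains the reference.

```python
from statistics import mean

def within_group_sums(pre, post):
    """Return (Sxx, Sxy) about the group's own means."""
    xbar, ybar = mean(pre), mean(post)
    sxx = sum((x - xbar) ** 2 for x in pre)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(pre, post))
    return sxx, sxy

# Hypothetical scores (placeholders, not the study data).
pre = {"ND": [10.0, 12.0, 14.0], "S": [9.0, 11.0, 10.0]}
post = {"ND": [11.0, 12.0, 13.0], "S": [8.0, 10.0, 9.0]}

sxx_nd, sxy_nd = within_group_sums(pre["ND"], post["ND"])
sxx_s, sxy_s = within_group_sums(pre["S"], post["S"])

# B1: common slope, pooling the within-group information.
b1 = (sxy_nd + sxy_s) / (sxx_nd + sxx_s)
# B2: adjusted difference, i.e. vertical distance between the parallel lines.
b2 = (mean(post["S"]) - mean(post["ND"])) - b1 * (mean(pre["S"]) - mean(pre["ND"]))
# B0: intercept of the ND line (S=0).
b0 = mean(post["ND"]) - b1 * mean(pre["ND"])
print(b0, b1, b2)
```

Setting S=0 and S=1 in post = B0 + B1*pre + B2*S gives the two parallel fitted lines; B2 is the gap between them at any pre-score.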
j. Draw the pair of fitted parallel lines (obtained by setting S=0 and S=1
respectively in the fitted equation in (i)) in a diagram similar to that in the 'confounding:
reducing it by regression' notes found at the end of the .pdf file.
Interpret the 'crude' and 'adjusted' differences in the light of this diagram, or
in light of the 'anatomy of the adjustment' section of the same notes.