In a factorial ANOVA, when we take subjects as the experimental unit – then each subject is taken to provide one piece of data, i.e. for one condition.

 

– BUT in the studies we do, e.g. in a rating or production measure study, it’s more likely to have each subject provide data for multiple conditions – i.e. to give multiple ratings or produce multiple items – maybe not for every condition, but at least for more than one.  This is almost always the case for speech production and acoustic phonetic studies. It is rare to make only one measurement for a single subject, unless it is some global measure, or a very small study.

 

Max & Onghena (1999) (Some issues in the statistical analysis of completely randomized and repeated measures designs for speech, language, and hearing research, JSLHR 42: 261-270) offer a critique of our field’s practice in using ANOVA to analyze such studies: that they use repeated measures designs and should be analyzed with repeated measures, not factorial, ANOVA.

 

Repeated measures = multiple measures per subject -- factors are varied “within” rather than “betweensubjects

(from the StatView manual, p. 82-3: "the measurements taken on each experimental unit are essentially the same but measured under different times or experimental conditions" – called "within-subject" variables)

 

Compare:

     non-repeated measures ("between-subjects"): each condition in the experiment is provided by a different group of subjects (e.g. imagine that VOT data from an alveolar-before-/i/ condition is provided by one group of speakers, while VOT data from an alveolar-before-/a/ condition is provided by another group!)

     some repeated measures: e.g. a French dataset is provided by French speakers, and an English dataset is provided by English speakers, but each French and English speaker provides a set of VOT measures for all the various conditions

     all repeated measures ("within-subjects"): each subject provides all the conditions of the experiment (including use of bilingual speakers to compare two languages)

 

A repeated measures design increases the sensitivity of the test as subjects serve as their own controls, and thus across-subject variation is not a problem.  In this sense it’s easier to get a significant difference with RM ANOVA.  However, because subject is the experimental unit and the observations within a subject are not independent, the df in the RM analysis are lower than in a design with lots of subjects, thus making it harder to get a signficant difference.

 

The key point is that the experimental unit is the subject, not the token or repetition.  This point leads to another criticism of our practice by Max & Onghena: that single-subject analyses, with token as the experimental unit, are not valid.  Because the observations are not independent, the df used is too high and the significance is overestimated.  It’s true that we see analyses of individual subject data all the time in journals, but they say this is simply an error, no matter how often it is done.

 

Datafiles from a repeated measures study are generally set up with each subject on a single, separate, row.  StatView uses a compact variable, which has this organization.  Each column then contains the data from one condition of the experiment.  Because repeated measures designs allow you to use fewer subjects, the typical datafile has relatively few rows (subjects) and relatively many columns (conditions).   Compare the "right" and "wrong" data entry tables given by Max and Onghena, reproduced here: each row is a subject in the correct format.

 

 

Look at the file compactdata.xls, sheets 1-4, for data from various designs with within-subject factors, and rows always subjects.


One reason students sometimes avoid Repeated Measures analyses is that there is no automatic option for post-hoc tests.  See Hays section 13.25 (p. 579-583) about using Scheffe and Tukey HSD procedures or Bonferroni t-tests for post-hoc testing of within-subject factors; see Winer p. 529 about using  a factorial 1-way ANOVA for testing simple effects
(
a comparison of levels of one factor to a single level of another factor).

 

Note: here are some good links to real statistics webpages about Repeated Measures ANOVA:

 

http://www.utexas.edu/cc/docs/stat38.html

 

http://www.biols.susx.ac.uk/Home/Zoltan_Dienes/SPSS%202-way%20rm.html

 

http://ericae.net/ft/tamu/Rm.HTM

 


last updated Dec. 2006 by P. Keating

 

back to the UCLA Phonetics Lab statistics page

Back to the UCLA Phonetics Lab overall facilities page