Example 15: Olympic skating with DIF-type bias and multidimensionality
The Pairs Skating competition at the 2002 Winter Olympics in Salt Lake City was contentious. It resulted in the awarding of Gold Medals to both a Russian and a Canadian pair, after the French judge admitted to awarding biased scores. Multidimensionality, differential item functioning, and item bias are all manifestations of disparate subdimensions within the data. In judged competitions, judge behavior can introduce unwanted subdimensions.
The data comprise 4 facets: skaters + program + skill + judges → rating
For a four-facet Rasch analysis of this model and these data, see www.winsteps.com/facetman/olympics.htm
For this analysis, each pair is allowed to have a different skill level, i.e., different measure, on each skill of each performance. The judges are modeled to maintain their leniencies across all performances.
This is set up as a judge-focused, rectangular 2-facet analysis: (skaters + program + skill = rows) + (judges = columns) → rating
The rating scale is very long, 0-60. Alternative methods of analysis are shown in SFUNCTION=.
The control file and data are in exam15.txt.
Title = "Pairs Skating: Winter Olympics, SLC 2002"
Item = Judge
Person = Pair
NI = 9 ; the judges
Item1 = 14 ; the leading blank of the first rating
Xwide = 3 ; Observations are 3 CHARACTERS WIDE for convenience
NAME1 = 1 ; start of person identification
NAMELENGTH = 13 ; 13-character identifiers
; CODES NEXT LINE HAS ALL OBSERVED RATING SCORES
CODES= " 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44+
+ 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60"
STEPKEEP=YES ; maintain missing intermediate scores in the hierarchy
@order = 1-2 ; pair order number at finish of competition in person label columns 1-2
@program = 11 ; program in person label column 11: Short or Free
@skil = 13 ; skill in person label column 13: Technical or Artistic
PSUBTOT = @order
DIF = @order ; judge "DIF" across skating pairs
tfile=*
30 ; produce Table 30 for judge-by-pairs "DIF"
28 ; produce Table 28 for skater-pair summary statistics
*
&END
1 Rus ;Mrs. Marina SANAIA : RUSSIA
2 Chn ;Mr. Jiasheng YANG : CHINA
3 USA ;Mrs. Lucy BRENNAN : USA
4 Fra ;Miss Marie Reine LE GOUGNE : FRANCE
5 Pol ;Mrs. Anna SIEROCKA : POLAND
6 Can ;Mr. Benoit LAVOIE : CANADA
7 Ukr ;Mr. Vladislav PETUKHOV : UKRAINE
8 Ger ;Mrs. Sissy KRICK : GERMANY
9 Jap ;Mr. Hideo SUGITA : JAPAN
; Description of Person Identifiers
; Cols. Description
; 1-2 Order immediately after competition (@order)
; 4-5 Skaters' initials
; 7-9 Nationality
; 11 Program: S=Short F=Free
; 13 Skill: T=Technical Merit, A=Artistic Impression
END LABELS
 1 BS-Rus S T 58 58 57 58 58 58 58 58 57 ; 1 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus S A 58 58 58 58 59 58 58 58 58 ; 2 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus F T 58 58 57 58 57 57 58 58 57 ; 3 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus F A 59 59 59 59 59 58 59 58 59 ; 4 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 2 SP-Can S T 57 57 56 57 58 58 57 58 56 ; 5 SALE Jamie / PELLETIER David : CAN
 2 SP-Can S A 58 59 58 58 58 59 58 59 58 ; 6 SALE Jamie / PELLETIER David : CAN
 2 SP-Can F T 58 59 58 58 58 59 58 59 58 ; 7 SALE Jamie / PELLETIER David : CAN
 2 SP-Can F A 58 58 59 58 58 59 58 59 59 ; 8 SALE Jamie / PELLETIER David : CAN
 3 SZ-Chn S T 57 58 56 57 57 57 56 57 56 ; 9 SHEN Xue / ZHAO Hongbo : CHN
.....
From this data file, estimate judge severity. In my run this took 738 iterations, because the data are so thin, and the rating scale is so long.
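The fixed-column layout declared in the control file (NAME1=1, NAMELENGTH=13, Item1=14, Xwide=3, NI=9) can be checked outside Winsteps with a short script. The following Python sketch is only an illustration, not part of Winsteps: the function name parse_record and the dictionary keys are hypothetical, and it assumes the order number is right-justified in columns 1-2, so that single-digit orders carry a leading blank.

```python
# Column layout taken from the control file (1-based columns).
NAME1, NAMELENGTH = 1, 13      # person label occupies columns 1-13
ITEM1, XWIDE, NI = 14, 3, 9    # nine 3-character ratings start at column 14


def parse_record(line):
    """Split one fixed-width data line into label fields and judge ratings.

    Hypothetical helper for illustration; Winsteps does this internally.
    """
    label = line[NAME1 - 1:NAME1 - 1 + NAMELENGTH]
    start = ITEM1 - 1
    ratings = [
        int(line[start + i * XWIDE:start + (i + 1) * XWIDE])
        for i in range(NI)
    ]
    return {
        "order": int(label[0:2]),   # @order: label columns 1-2
        "initials": label[3:5],     # label columns 4-5
        "nation": label[6:9],       # label columns 7-9
        "program": label[10],       # @program: label column 11 (S or F)
        "skill": label[12],         # @skil: label column 13 (T or A)
        "ratings": ratings,         # one rating per judge, in judge order
    }


# First data line of the example (assumed leading blank pads the order to 2 columns).
record = parse_record(" 1 BS-Rus S T 58 58 57 58 58 58 58 58 57")
print(record["order"], record["program"], record["skill"], record["ratings"])
```

Each parsed record is one row of the rectangular analysis (pair + program + skill), and the nine ratings are its columns (judges).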
Here is some of the output of Table 30, for Judge DIF, i.e., Judge Bias by skater pair order number, @order = $S1W2.
+-------------------------------------------------------------------------------+
|  Pair    DIF   DIF   Pair    DIF   DIF       DIF  JOINT          Judge        |
| CLASS  ADDED  S.E.  CLASS  ADDED  S.E.  CONTRAST   S.E.      t  Number  Name  |
|-------------------------------------------------------------------------------|
|    13   -.93   .40     18   1.50   .39     -2.43    .56  -4.35       9  9 Jap |
|    14  -1.08   .36     18   1.50   .39     -2.58    .53  -4.83       9  9 Jap |
+-------------------------------------------------------------------------------+
The most significant statistical bias is by the Japanese judge on skater pairs 13 and 14 vs. 18. These pairs are low in the final order, and so of little interest.
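The columns of the Table 30 excerpt are related by simple arithmetic. The sketch below assumes the usual formulas: the DIF contrast is the difference between the two DIF ADDED values, the joint S.E. is the square root of the sum of the squared standard errors, and t is the contrast divided by the joint S.E. The function name dif_contrast is hypothetical, and the inputs are the rounded values printed in the table.

```python
import math


def dif_contrast(added_1, se_1, added_2, se_2):
    """Return (contrast, joint S.E., t) for two DIF estimates.

    Assumed formulas: contrast = added_1 - added_2,
    joint S.E. = sqrt(se_1^2 + se_2^2), t = contrast / joint S.E.
    """
    contrast = added_1 - added_2
    joint_se = math.sqrt(se_1 ** 2 + se_2 ** 2)
    t = contrast / joint_se
    return contrast, joint_se, t


# Japanese judge (9): pair 13 vs. pair 18, values copied from the Table 30 excerpt.
contrast, joint_se, t = dif_contrast(-0.93, 0.40, 1.50, 0.39)
print(f"{contrast:.2f} {joint_se:.2f} {t:.2f}")  # prints: -2.43 0.56 -4.35
```

For the second row (pair 14 vs. pair 18) the same calculation gives a t of about -4.86, slightly different from the reported -4.83, presumably because the values displayed in the table are rounded to two decimal places.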
Table 23, the principal components/contrast analysis of Judge residuals, is more interesting. Note that Judge 4, the French judge, is at the top with the largest contrast loading. The actual distortion in the measurement framework is small, but crucial to the awarding of the Gold Medal!
STANDARDIZED RESIDUAL CONTRAST PLOT: Contrast 1 loading (vertical axis) by Judge MEASURE (horizontal axis, -1 to 1)
Judge    Contrast 1 loading (approx.)   Judge MEASURE
4 Fra      .6                           below 0
5 Pol      .5                           above 0
7 Ukr      .5                           above 0
1 Rus      .4                           below 0
2 Chn     -.05                          below 0
9 Jap     -.35                          above 0
8 Ger     -.4                           above 0
6 Can     -.45                          above 0
3 USA     -.5                           above 0
The Table 23 variance table also shows that a very high proportion of the variance, 97%, is explained by the measures, and the estimates require many iterations to converge. Why is this?
The Olympic ice-skating data are problematic. This is because the judges' ratings are edited prior to display to the public. The ISU (International Skating Union) feels that disagreement among the judges about a skater's performance would look bad. So the head judge instructs disagreeing judges to redo their ratings. All this goes on while we are waiting for the judges' ratings to display. Sometimes there is quite a long wait! The result is that the data are too Guttman-like. Hence the large explained variance :-(
In fact, the large explained variance indicates that we may have lost measurement accuracy and precision. Rasch uses the randomness in the data to construct the variable. There is little randomness, particularly among high-performing skaters, so the variable definition is weak.
Another flaw in the ratings is that smart judges can game the system. They know that the data will be almost Guttman, so if they make very small tweaks to bias the ratings, the head judge won't catch them, but their effects can be profound - even altering the Gold Medal winners.
Analysts have pointed out these and other flaws to the ISU. Their response has been to make the rating process even more obscure :-(
Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre