The Pairs Skating competition at the 2002 Winter Olympics in Salt Lake City was contentious. It resulted in the awarding of Gold Medals to both a Russian and a Canadian pair, after the French judge admitted to awarding biased scores. Multidimensionality, differential item functioning, and item bias are all manifestations of disparate subdimensions within the data. In judged competitions, judge behavior can introduce unwanted subdimensions.

The data comprise 4 facets: skaters + program + skill + judges → rating

For a four-facet Rasch analysis of this model and these data, see www.winsteps.com/facetman/olympics.htm

For this analysis, each pair is allowed to have a different skill level, i.e., different measure, on each skill of each performance. The judges are modeled to maintain their leniencies across all performances.

This example is a judge-focused rectangular 2-facet analysis: (skaters + program + skill = rows) + (judges = columns) → rating
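As a rough illustration of this row/column layout, the following Python sketch (not part of Winsteps) pivots long-format records into the rectangular form. The record layout and the function name `to_rectangular` are assumptions made for illustration; the ratings are those of judges 1-3 in the first two data rows of EXAM15.TXT.

```python
from collections import defaultdict

# Hypothetical long-format records: (pair, program, skill, judge, rating).
# Ratings are taken from the first two data rows of EXAM15.TXT, judges 1-3.
records = [
    ("BS-Rus", "S", "T", 1, 58), ("BS-Rus", "S", "T", 2, 58), ("BS-Rus", "S", "T", 3, 57),
    ("BS-Rus", "S", "A", 1, 58), ("BS-Rus", "S", "A", 2, 58), ("BS-Rus", "S", "A", 3, 58),
]


def to_rectangular(records):
    """Collapse pair + program + skill into one row key; judges become columns."""
    rows = defaultdict(dict)
    for pair, program, skill, judge, rating in records:
        rows[(pair, program, skill)][judge] = rating
    return dict(rows)


table = to_rectangular(records)
for key, ratings in table.items():
    # One output line per row of the rectangular data set.
    print(key, [ratings[judge] for judge in sorted(ratings)])
```

Each row key plays the role of a "person" in the 2-facet analysis, and each judge plays the role of an "item".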

The rating scale is very long, 0-60. Alternative methods of analysis are shown in SFUNCTION=.

The control file and data are in exam15.txt.

; This file is EXAM15.TXT
Title = "Pairs Skating: Winter Olympics, SLC 2002"
Item = Judge
Person = Pair
NI = 9 ' the judges
Item1 = 14 ' the leading blank of the first rating
Xwide = 3 ' Observations are 3 CHARACTERS WIDE for convenience
NAME1 = 1 ' start of person identification
NAMELENGTH = 13 ' 13 characters identifiers
; CODES NEXT LINE HAS ALL OBSERVED RATING SCORES
CODES= " 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44+
+ 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60"
STEPKEEP=YES ; maintain missing intermediate scores in the hierarchy
@order = 1-2 ; pair order number at finish of competition in person label columns 1-2
@program = 11 ; program in person label column 11: Short or Free
@skil = 13 ; skill in person label column 13: Technical or Artistic
PSUBTOT = @order
DIF = @order ; judge "DIF" across skating pairs
tfile=*
30 ; produce Table 30 for judge-by-pairs "DIF"
28 ; produce Table 28 for skater-pair summary statistics
*
&END
1 Rus ;Mrs. Marina SANAIA : RUSSIA
2 Chn ;Mr. Jiasheng YANG : CHINA
3 USA ;Mrs. Lucy BRENNAN : USA
4 Fra ;Miss Marie Reine LE GOUGNE : FRANCE
5 Pol ;Mrs. Anna SIEROCKA : POLAND
6 Can ;Mr. Benoit LAVOIE : CANADA
7 Ukr ;Mr. Vladislav PETUKHOV : UKRAINE
8 Ger ;Mrs. Sissy KRICK : GERMANY
9 Jap ;Mr. Hideo SUGITA : JAPAN
; Description of Person Identifiers
; Cols. Description
; 1-2 Order immediately after competition (@order)
; 4-5 Skaters' initials
; 7-9 Nationality
; 11 Program: S=Short F=Free
; 13 Skill: T=Technical Merit, A=Artistic Impression
END LABELS
 1 BS-Rus S T 58 58 57 58 58 58 58 58 57 ; 1 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus S A 58 58 58 58 59 58 58 58 58 ; 2 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus F T 58 58 57 58 57 57 58 58 57 ; 3 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 1 BS-Rus F A 59 59 59 59 59 58 59 58 59 ; 4 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS
 2 SP-Can S T 57 57 56 57 58 58 57 58 56 ; 5 SALE Jamie / PELLETIER David : CAN
 2 SP-Can S A 58 59 58 58 58 59 58 59 58 ; 6 SALE Jamie / PELLETIER David : CAN
 2 SP-Can F T 58 59 58 58 58 59 58 59 58 ; 7 SALE Jamie / PELLETIER David : CAN
 2 SP-Can F A 58 58 59 58 58 59 58 59 59 ; 8 SALE Jamie / PELLETIER David : CAN
 3 SZ-Chn S T 57 58 56 57 57 57 56 57 56 ; 9 SHEN Xue / ZHAO Hongbo : CHN
.....
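The control variables NAME1=, NAMELENGTH=, ITEM1=, XWIDE= and NI= together describe how each fixed-width data line is split into a person label and nine ratings. The Python sketch below mimics that column arithmetic; it is an illustration, not Winsteps code, and the function name `parse_line` is hypothetical. It assumes the order number is right-justified in columns 1-2, so that a single-digit order is preceded by a blank, as the column description in the control file implies.

```python
# Column settings copied from the control file (1-based column numbers).
NAME1, NAMELENGTH = 1, 13   # person label occupies columns 1-13
ITEM1, XWIDE, NI = 14, 3, 9  # nine ratings, each 3 columns wide, from column 14


def parse_line(line):
    """Split one fixed-width data line into label fields and judge ratings."""
    label = line[NAME1 - 1 : NAME1 - 1 + NAMELENGTH]
    start = ITEM1 - 1
    ratings = [
        int(line[start + i * XWIDE : start + (i + 1) * XWIDE])
        for i in range(NI)
    ]
    return {
        "order": int(label[0:2]),   # @order   = label columns 1-2
        "program": label[10],       # @program = label column 11
        "skill": label[12],         # @skil    = label column 13
        "ratings": ratings,
    }


# First data row; the leading blank is an assumption (see the text above).
line = " 1 BS-Rus S T 58 58 57 58 58 58 58 58 57"
parsed = parse_line(line)
print(parsed)
```

Because each observation is 3 columns wide, the blank before each two-digit rating is part of the observation, which is why ITEM1= points at the leading blank of the first rating.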

From this data file, estimate judge severity. In my run, estimation took 738 iterations because the data are so thin and the rating scale is so long.

Here is some of the output of Table 30, for Judge DIF, i.e., Judge Bias by skater pair order number, @order = $S1W2.

+-------------------------------------------------------------------------+
| Pair    DIF  DIF  Pair    DIF  DIF       DIF JOINT          Judge       |
| CLASS ADDED S.E.  CLASS ADDED S.E.  CONTRAST  S.E.      t  Number Name  |
|-------------------------------------------------------------------------|
| 13     -.93  .40  18     1.50  .39     -2.43   .56  -4.35       9 9 Jap |
| 14    -1.08  .36  18     1.50  .39     -2.58   .53  -4.83       9 9 Jap |
+-------------------------------------------------------------------------+

The most statistically significant bias is that of the Japanese judge for skater pairs 13 and 14 vs. pair 18. These pairs finished low in the final order, so this bias is of little interest.
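The columns of the Table 30 excerpt are related by simple arithmetic. The Python sketch below assumes the usual relationships: the DIF contrast is the difference between the two DIF measures, the joint S.E. is the square root of the sum of the squared standard errors, and t is their ratio. The function name `dif_contrast` is hypothetical, and the inputs are the rounded values displayed for pair 13 vs. pair 18, so results can differ slightly from the values Winsteps prints from unrounded estimates.

```python
import math


def dif_contrast(dif1, se1, dif2, se2):
    """Return (contrast, joint S.E., t) for two DIF measures and their S.E.s."""
    contrast = dif1 - dif2
    joint_se = math.sqrt(se1 ** 2 + se2 ** 2)
    return contrast, joint_se, contrast / joint_se


# Japanese judge, pair 13 vs. pair 18 (rounded values as displayed in Table 30).
contrast, joint_se, t = dif_contrast(-0.93, 0.40, 1.50, 0.39)
print(f"contrast={contrast:.2f}  joint S.E.={joint_se:.2f}  t={t:.2f}")
```

The same call with the pair 14 values (-1.08, .36) gives a t close to, but not exactly, the displayed -4.83, which is consistent with rounding in the displayed inputs.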

Table 23, the principal components/contrast analysis of Judge residuals, is more interesting. Note that Judge 4, the French judge, is at the top with the largest contrast loading. The actual distortion in the measurement framework is small, but crucial to the awarding of the Gold Medal!

STANDARDIZED RESIDUAL CONTRAST PLOT
Vertical axis: CONTRAST 1 LOADING. Horizontal axis: Judge MEASURE, from -1 to 1.
Approximate positions read from the plot:

Judge   Contrast 1 loading   Judge MEASURE
4       about  .6            below 0
5       about  .5            above 0
7       about  .5            above 0
1       about  .4            below 0
2       about -.05           below 0
9       about -.35           above 0
8       about -.4            above 0
6       about -.45           above 0
3       about -.5            above 0

The Table 23 variance table also shows that the measures explain a very high proportion of the variance, 97%, and the estimates require many iterations to converge. Why is this?

The Olympic ice-skating data are problematic. This is because the judges' ratings are edited prior to display to the public. The ISU (International Skating Union) feels that disagreement among the judges about a skater's performance would look bad. So the head judge instructs disagreeing judges to redo their ratings. All this goes on while we are waiting for the judges' ratings to display. Sometimes there is quite a long wait! The result is that the data are too Guttman-like. Hence the large explained variance :-(

In fact, the large explained variance indicates that we may have lost measurement accuracy and precision. Rasch uses the randomness in the data to construct the variable. There is little randomness, particularly among high-performing skaters, so the variable definition is weak.
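The nine data rows listed earlier give a feel for how little disagreement survives in the published ratings. The Python sketch below is only a descriptive check, not the Rasch variance decomposition reported in Table 23: it computes the range (highest minus lowest rating) across the nine judges for each listed row. The row labels used as dictionary keys are illustrative.

```python
# Ratings by judges 1-9 for the nine data rows shown in EXAM15.TXT.
rows = {
    "BS-Rus S T": [58, 58, 57, 58, 58, 58, 58, 58, 57],
    "BS-Rus S A": [58, 58, 58, 58, 59, 58, 58, 58, 58],
    "BS-Rus F T": [58, 58, 57, 58, 57, 57, 58, 58, 57],
    "BS-Rus F A": [59, 59, 59, 59, 59, 58, 59, 58, 59],
    "SP-Can S T": [57, 57, 56, 57, 58, 58, 57, 58, 56],
    "SP-Can S A": [58, 59, 58, 58, 58, 59, 58, 59, 58],
    "SP-Can F T": [58, 59, 58, 58, 58, 59, 58, 59, 58],
    "SP-Can F A": [58, 58, 59, 58, 58, 59, 58, 59, 59],
    "SZ-Chn S T": [57, 58, 56, 57, 57, 57, 56, 57, 56],
}

# Range of ratings across judges for each performance-by-skill row.
spreads = {label: max(r) - min(r) for label, r in rows.items()}
for label, spread in spreads.items():
    print(f"{label}: range = {spread}")
print("largest range:", max(spreads.values()))
```

On a scale that nominally runs from 0 to 60, the judges in these rows never differ by more than 2 points, which illustrates how little randomness is left for the Rasch model to work with.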

Another flaw in the ratings is that smart judges can game the system. They know that the data will be almost Guttman, so if they make very small tweaks to bias the ratings, the head judge won't catch them, but their effects can be profound - even altering the Gold Medal winners.

Analysts have pointed out these and other flaws to the ISU. Their response has been to make the rating process even more obscure :-(

Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre

Example 15: Olympic skating with DIF-type bias and multidimensionality
