Example 15: Olympic skating with DIF-type bias and multidimensionality

The Pairs Skating competition at the 2002 Winter Olympics in Salt Lake City was contentious. It resulted in the awarding of Gold Medals to both a Russian and a Canadian pair, after the French judge admitted to awarding biased scores. Multidimensionality, differential item functioning, and item bias are all manifestations of disparate subdimensions within the data. In judged competitions, judge behavior can introduce unwanted subdimensions.

 

The data comprise 4 facets: skaters + program + skill + judges → rating

 

For a four-facet Rasch analysis of this model and these data, see www.winsteps.com/facetman/olympics.htm

 

For this analysis, each pair is allowed to have a different skill level, i.e., different measure, on each skill of each performance. The judges are modeled to maintain their leniencies across all performances.

 

This is a judge-focused, rectangular 2-facet analysis: (skaters + program + skill = rows) + (judges = columns) → rating
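
As a hedged sketch, the rectangular analysis can be written in the usual Andrich rating-scale form. The symbols are labels chosen here for illustration and do not come from the control file: B_n is the measure of row n (one pair on one skill of one program), D_i is the severity of judge i, and F_k is the threshold between ratings k-1 and k on the shared rating scale.

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
```

Here P_{nik} is the probability that judge i awards rating k to row n. Because each row carries its own B_n, a pair can have a different measure on each skill of each program, while each judge keeps a single severity D_i throughout.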

 

The rating scale is very long, 0-60. Alternative methods of analysis are shown in SFUNCTION=.

 

The control file and data are in exam15.txt.

 

; This file is EXAM15.TXT

Title  = "Pairs Skating: Winter Olympics, SLC 2002"

Item   = Judge

Person = Pair

NI     = 9       ; the judges

Item1  = 14      ; the leading blank of the first rating

Xwide  = 3       ; Observations are 3 CHARACTERS WIDE for convenience

NAME1  = 1       ; start of person identification

NAMELENGTH = 13  ; 13-character identifiers

 

; CODES NEXT LINE HAS ALL OBSERVED RATING SCORES

CODES= " 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44+

       + 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60"

 

STEPKEEP=YES     ; maintain missing intermediate scores in the hierarchy

 

@order = 1-2  ; pair order number at finish of competition in person label columns 1-2

@program = 11 ; program in person label column 11: Short or Free

@skill = 13   ; skill in person label column 13: Technical or Artistic

PSUBTOT = @order

DIF = @order  ; judge "DIF" across skating pairs

tfile=*

30   ; produce Table 30 for judge-by-pairs "DIF"

28   ; produce Table 28 for skater-pair summary statistics

*

&END

1 Rus ;Mrs. Marina SANAIA : RUSSIA

2 Chn ;Mr. Jiasheng YANG : CHINA

3 USA ;Mrs. Lucy BRENNAN : USA

4 Fra ;Miss Marie Reine LE GOUGNE : FRANCE

5 Pol ;Mrs. Anna SIEROCKA : POLAND

6 Can ;Mr. Benoit LAVOIE : CANADA

7 Ukr ;Mr. Vladislav PETUKHOV : UKRAINE

8 Ger ;Mrs. Sissy KRICK : GERMANY

9 Jap ;Mr. Hideo SUGITA : JAPAN

 

; Description of Person Identifiers

; Cols.  Description

; 1-2  Order immediately after competition (@order)

; 4-5  Skaters' initials

; 7-9  Nationality

; 11   Program: S=Short  F=Free

; 13   Skill: T=Technical Merit, A=Artistic Impression

END LABELS

 1 BS-Rus S T 58 58 57 58 58 58 58 58 57 ;  1 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS

 1 BS-Rus S A 58 58 58 58 59 58 58 58 58 ;  2 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS

 1 BS-Rus F T 58 58 57 58 57 57 58 58 57 ;  3 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS

 1 BS-Rus F A 59 59 59 59 59 58 59 58 59 ;  4 BEREZHNAYA Elena / SIKHARULIDZE Anton : RUS

 2 SP-Can S T 57 57 56 57 58 58 57 58 56 ;  5 SALE Jamie / PELLETIER David : CAN

 2 SP-Can S A 58 59 58 58 58 59 58 59 58 ;  6 SALE Jamie / PELLETIER David : CAN

 2 SP-Can F T 58 59 58 58 58 59 58 59 58 ;  7 SALE Jamie / PELLETIER David : CAN

 2 SP-Can F A 58 58 59 58 58 59 58 59 59 ;  8 SALE Jamie / PELLETIER David : CAN

 3 SZ-Chn S T 57 58 56 57 57 57 56 57 56 ;  9 SHEN Xue / ZHAO Hongbo : CHN

 .....
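
The fixed-column layout declared in the control file (NAME1=1, NAMELENGTH=13, Item1=14, Xwide=3, NI=9) can be checked against a data line by hand. The following Python is only an illustrative sketch of that layout; the function name `parse_record` and the dictionary keys are hypothetical, and Winsteps itself does not need or use this code.

```python
# Column settings copied from the control file (1-based, as Winsteps counts them).
NAME1, NAMELENGTH, ITEM1, XWIDE, NI = 1, 13, 14, 3, 9

def parse_record(line):
    """Split one data line into its person-label fields and its NI ratings."""
    label = line[NAME1 - 1 : NAME1 - 1 + NAMELENGTH]
    ratings = []
    for i in range(NI):
        start = ITEM1 - 1 + i * XWIDE          # each rating is XWIDE characters wide
        ratings.append(int(line[start : start + XWIDE]))
    return {
        "order": int(label[0:2]),              # cols 1-2: order after competition
        "initials": label[3:5],                # cols 4-5: skaters' initials
        "nation": label[6:9],                  # cols 7-9: nationality
        "program": label[10],                  # col 11: S=Short, F=Free
        "skill": label[12],                    # col 13: T=Technical, A=Artistic
        "ratings": ratings,                    # one rating per judge, in judge order
    }

# First data line of the listing; text after the ratings is ignored.
record = parse_record(" 1 BS-Rus S T 58 58 57 58 58 58 58 58 57 ;  1 BEREZHNAYA")
print(record)
```

The leading blank of each 3-character rating is why Item1 points at column 14 rather than column 15.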

 

From this data file, estimate judge severity. In my run this took 738 iterations, because the data are so thin, and the rating scale is so long.

 

Here is some of the output of Table 30, for Judge DIF, i.e., Judge Bias by skater pair order number, @order = $S1W2.

 

+-------------------------------------------------------------------------+

| Pair    DIF   DIF  Pair   DIF   DIF     DIF    JOINT       Judge        |

| CLASS  ADDED  S.E. CLASS  ADDED  S.E. CONTRAST  S.E.   t   Number  Name |

|-------------------------------------------------------------------------|

| 13      -.93   .40 18      1.50   .39    -2.43   .56 -4.35      9 9 Jap |

| 14     -1.08   .36 18      1.50   .39    -2.58   .53 -4.83      9 9 Jap |

+-------------------------------------------------------------------------+

 

The most statistically significant bias is by the Japanese judge on skater pairs 13 and 14 vs. 18. These pairs are low in the final order, and so of little interest.
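
The columns of the Table 30 extract are related by simple arithmetic: the DIF contrast is the difference between the two DIF sizes, the joint S.E. combines the two standard errors, and t is their ratio. The Python below is a minimal sketch of that arithmetic using the rounded values printed in the table; the function name `dif_contrast` is hypothetical.

```python
import math

def dif_contrast(dif_a, se_a, dif_b, se_b):
    """Return (contrast, joint S.E., t) for two DIF estimates of the same judge."""
    contrast = dif_a - dif_b
    joint_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return contrast, joint_se, contrast / joint_se

# Rounded values as printed for Judge 9 (Jap): pair classes 13 and 14, each vs. 18.
rows = {
    "13 vs 18": dif_contrast(-0.93, 0.40, 1.50, 0.39),
    "14 vs 18": dif_contrast(-1.08, 0.36, 1.50, 0.39),
}
for label, (contrast, joint_se, t) in rows.items():
    print(f"{label}: contrast={contrast:.2f}  joint S.E.={joint_se:.2f}  t={t:.2f}")
```

Because the inputs are the two-decimal values shown in the table, the recomputed t for classes 14 vs. 18 differs slightly from the printed -4.83. That small gap is presumably rounding, since the table would be computed from unrounded estimates.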

 

Table 23, the principal components/contrast analysis of Judge residuals, is more interesting. Note that Judge 4, the French judge, is at the top with the largest contrast loading. The actual distortion in the measurement framework is small, but crucial to the awarding of the Gold Medal!

 

STANDARDIZED RESIDUAL CONTRAST PLOT

      -1                                0                                1

      ++--------------------------------+--------------------------------++

   .6 +                       4         |                                 +

      |                                 |                                 |

   .5 +                                 | 5      7                        +

C     |                                 |                                 |

O  .4 +                      1          |                                 +

N     |                                 |                                 |

T  .3 +                                 |                                 +

R     |                                 |                                 |

A  .2 +                                 |                                 +

S     |                                 |                                 |

T  .1 +                                 |                                 +

      |                                 |                                 |

1  .0 +---------------------------------|---------------------------------+

      |                  2              |                                 |

L -.1 +                                 |                                 +

O     |                                 |                                 |

A -.2 +                                 |                                 +

D     |                                 |                                 |

I -.3 +                                 |                                 +

N     |                                 |  9                              |

G -.4 +                                 |         8                       +

      |                                 |        6                        |

  -.5 +                                 |  3                              +

      |                                 |                                 |

      ++--------------------------------+--------------------------------++

      -1                                0                                1

Judge MEASURE

 

The variance table in Table 23 also shows that the measures explain a very high proportion of the variance, 97%, and the estimates require many iterations to converge. Why is this?

 

The Olympic ice-skating data are problematic because the judges' ratings are edited before they are displayed to the public. The ISU (International Skating Union) feels that disagreement among the judges about a skater's performance would look bad, so the head judge instructs disagreeing judges to redo their ratings. All this goes on while we are waiting for the judges' ratings to display. Sometimes there is quite a long wait! The result is that the data are too Guttman-like. Hence the large explained variance :-(

 

In fact, the large explained variance indicates that we may have lost measurement accuracy and precision. Rasch uses the randomness in the data to construct the variable. There is little randomness, particularly among high-performing skaters, so the variable definition is weak.
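
One rough way to see how little disagreement survives in the published ratings is to look at the spread of the nine judges' ratings within each row. The Python sketch below uses only the nine data rows shown in the listing above; it is a descriptive check, not the way Winsteps computes explained variance, and the names `ROWS` and `spreads` are hypothetical.

```python
# The nine data rows shown in the control-file listing, keyed by person label.
ROWS = {
    "BS-Rus S T": [58, 58, 57, 58, 58, 58, 58, 58, 57],
    "BS-Rus S A": [58, 58, 58, 58, 59, 58, 58, 58, 58],
    "BS-Rus F T": [58, 58, 57, 58, 57, 57, 58, 58, 57],
    "BS-Rus F A": [59, 59, 59, 59, 59, 58, 59, 58, 59],
    "SP-Can S T": [57, 57, 56, 57, 58, 58, 57, 58, 56],
    "SP-Can S A": [58, 59, 58, 58, 58, 59, 58, 59, 58],
    "SP-Can F T": [58, 59, 58, 58, 58, 59, 58, 59, 58],
    "SP-Can F A": [58, 58, 59, 58, 58, 59, 58, 59, 59],
    "SZ-Chn S T": [57, 58, 56, 57, 57, 57, 56, 57, 56],
}

# Spread = highest rating minus lowest rating across the nine judges.
spreads = {label: max(r) - min(r) for label, r in ROWS.items()}

for label, spread in spreads.items():
    print(f"{label}: judges differ by at most {spread} point(s)")
print("largest spread in these rows:", max(spreads.values()))
```

In these rows the judges never differ by more than two rating points, which is consistent with the point that there is little randomness among the high-performing pairs.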

 

Another flaw in the ratings is that smart judges can game the system. They know that the data will be almost Guttman, so if they make very small tweaks to bias the ratings, the head judge won't catch them, but their effects can be profound - even altering the Gold Medal winners.

 

Analysts have pointed out these and other flaws to the ISU. Their response has been to make the rating process even more obscure :-(


Help for Winsteps Rasch Measurement and Rasch Analysis Software: www.winsteps.com. Author: John Michael Linacre
