Reliability - separation

(Separation) Reliability and Strata

These statistics report whether the elements of a facet are "reliably different". They are the opposite of inter-rater reliability statistics, which are intended to report whether raters are "reliably the same."

The reported "Separation" Reliability is the Rasch equivalent of the KR-20 or Cronbach Alpha "test reliability" statistic, i.e., the ratio of "True variance" to "Observed variance" for the elements of the facet. It shows how reproducible the ordering of the measures is. This may or may not indicate how "good" the test is in other respects. High (near 1.0) person and item reliabilities are preferred. Because "separation" reliability is roughly the opposite of inter-rater reliability, low (near 0.0) judge and rater separation reliabilities are preferred.

Since the "true" variance of a sample can never be known, but only approximated, the "true" reliability can also only be approximated. All reported reliabilities, such as KR-20, Cronbach Alpha, and the Separation Reliability, are only approximations. These approximations are all attempts to compute:

"Separation" Reliability = True Variance / Observed Variance
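The ratio above can be illustrated with a minimal sketch. It assumes the usual Rasch decomposition, in which the true variance is approximated as the observed variance of the measures minus the mean squared standard error. The function name, the use of the population variance, and the example numbers are illustrative assumptions, not Facets output or Facets code.

```python
from statistics import pvariance


def separation_reliability(measures, standard_errors):
    """Approximate True Variance / Observed Variance for one facet.

    Assumption: true variance = observed variance - mean squared S.E.
    Assumption: population (not sample) variance of the measures.
    """
    observed_var = pvariance(measures)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # The true variance cannot be negative, so floor it at zero.
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var if observed_var > 0 else 0.0


# Hypothetical element measures (logits) and their standard errors.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
standard_errors = [0.4, 0.35, 0.35, 0.35, 0.4]

reliability = separation_reliability(measures, standard_errors)
print(f"Separation reliability: {reliability:.4f}")
```

With these made-up values the observed variance is 1.0 and the mean squared error is 0.1375, so the approximated reliability is about 0.86.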

Facets computes upper and lower boundary values for the region in which the true reliability lies. When SE=Model, the upper boundary, the "Model" reliability, is computed on the basis that all unexpectedness in the data is Rasch-predicted randomness.

When SE=Real, the lower boundary, the "Real" reliability, is computed on the basis that all unexpectedness in the data contradicts the Rasch model. The unknowable True reliability generally lies somewhere between these two. As contradictory sources of noise are removed from the data, the reported Model and Real reliabilities become closer, and the True Reliability approaches the Model Reliability.

The "model" reliability is based on the model standard errors, which are computed on the basis that all superfluous unexpectedness in the data is the randomness predicted by the Rasch model.

The "real" reliability is based on the hypothesis that superfluous randomness in the data contradicts the Rasch model:

Real S.E. = Model S.E. * sqrt(Max(INFIT MnSq, 1))
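As a rough illustration of how the two boundaries differ, the sketch below inflates each model standard error by the formula above and then computes a reliability from each set of standard errors. The reliability helper, the use of the population variance, and all numbers are assumptions made for this example; they are not taken from Facets.

```python
import math
from statistics import pvariance


def real_se(model_se, infit_mnsq):
    """Real S.E. = Model S.E. * sqrt(Max(INFIT MnSq, 1))."""
    return model_se * math.sqrt(max(infit_mnsq, 1.0))


def reliability(measures, standard_errors):
    """Assumed approximation: (observed variance - mean squared S.E.) / observed variance."""
    observed_var = pvariance(measures)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    return max(observed_var - error_var, 0.0) / observed_var


# Hypothetical measures, model standard errors, and infit mean-squares.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
model_ses = [0.4, 0.35, 0.35, 0.35, 0.4]
infit_mnsqs = [0.8, 1.0, 1.44, 1.0, 1.21]

real_ses = [real_se(se, mnsq) for se, mnsq in zip(model_ses, infit_mnsqs)]

model_reliability = reliability(measures, model_ses)  # upper boundary
real_reliability = reliability(measures, real_ses)    # lower boundary

print(f"Model reliability: {model_reliability:.4f}")
print(f"Real reliability:  {real_reliability:.4f}")
```

Because infit mean-squares below 1 leave the standard error unchanged and values above 1 enlarge it, the Real reliability computed this way can never exceed the Model reliability.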

Conventionally, only a Person Reliability is reported and called the "test reliability". Facets reports separation reliabilities for all facets. Separation reliability is estimated on the premise that the elements are locally independent, specifically, that raters are acting as "independent experts", not as "scoring machines". When raters do act as "scoring machines", Facets overestimates reliability. It would be the same as running MCQ bubble sheets twice through an optical scanner, so doubling the number of "items" per person, and then claiming that we had increased test reliability! To assist in identifying this situation, Facets reports the extent to which the raters are acting as "independent experts", an aspect of inter-rater reliability. See Table 7 Agreement Statistics.

Separation = True S.D. / Average measurement error

This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as due to measurement error. www.rasch.org/rmt/rmt94n.htm
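The separation formula can be sketched in the same way. The example below assumes that the "average measurement error" is the root-mean-square of the standard errors and that the true S.D. is the square root of the observed variance minus the mean squared error; both are assumptions for illustration. Under those assumptions, separation and reliability are linked by Reliability = Separation^2 / (1 + Separation^2).

```python
import math
from statistics import pvariance


def separation_index(measures, standard_errors):
    """Separation = True S.D. / average measurement error.

    Assumption: the average error is the root-mean-square S.E.
    Assumption: all standard errors are greater than zero.
    """
    observed_var = pvariance(measures)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    true_sd = math.sqrt(max(observed_var - error_var, 0.0))
    rmse = math.sqrt(error_var)
    return true_sd / rmse


def reliability_from_separation(separation):
    """Reliability implied by a separation value under the assumptions above."""
    return separation ** 2 / (1.0 + separation ** 2)


# Hypothetical element measures (logits) and their standard errors.
measures = [-1.5, -0.5, 0.0, 0.5, 1.5]
standard_errors = [0.4, 0.35, 0.35, 0.35, 0.4]

separation = separation_index(measures, standard_errors)
implied_reliability = reliability_from_separation(separation)

print(f"Separation: {separation:.2f}")
print(f"Implied reliability: {implied_reliability:.4f}")
```

For these made-up values the separation is about 2.5, which corresponds to a reliability of about 0.86.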

Strata = (4*Separation + 1)/3

This estimates the number of statistically distinguishable levels of performance in a normally distributed sample with the same "true S.D." as the empirical sample, when the tails of the normal distribution are modeled as extreme "true" levels of performance. www.rasch.org/rmt/rmt163f.htm

So, if the sample separation is 2, then the strata are (4*2+1)/3 = 3.

Separation = 2: The test is able to statistically distinguish between high and low performers.

Strata = 3: The test is able to statistically distinguish between very high, middle and very low performers.
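The conversion from separation to strata is simple arithmetic, and a short sketch makes it easy to check other values. The function name is an illustrative choice and is not part of Facets.

```python
def strata_from_separation(separation):
    """Strata = (4*Separation + 1)/3."""
    return (4.0 * separation + 1.0) / 3.0


# Reproduce the worked example: separation 2 gives 3 strata.
example_strata = strata_from_separation(2)
print(example_strata)  # 3.0

# A few other separation values for comparison.
for sep in (1.0, 2.0, 3.0, 5.0):
    print(f"Separation {sep:.1f} -> Strata {strata_from_separation(sep):.2f}")
```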

Strata vs. Separation: which to use depends on the nature of the measure distribution.

Statistically:

If the distribution is hypothesized to be normal, then use separation.

If the distribution is hypothesized to be heavy-tailed, then use strata.

Substantively:

If very high and very low scores are probably due to accidental circumstances, then use separation.

If very high and very low scores are probably due to very high and very low abilities, then use strata.

If in doubt, assume that outliers are accidental, and use separation.

Example: I have 3 criteria in my analysis. Facets reports 32 Strata.

Explanation: "Strata" is a conceptual number, based on a hypothetical normal distribution of the criteria, with the same mean and S.D. as the observed criteria. Each of the infinity of criteria in the hypothetical distribution has the same precision (S.E.) as the average S.E. of the observed criteria. The result is that there are 32 statistically different levels of difficulty in the hypothetical distribution. The large number is because the S.E. of an observed criterion is small due to the large number of observations of each criterion.
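To see how small the standard errors must be for 32 strata, the strata formula can be inverted. The sketch below recovers the separation from the reported strata and then the implied reliability, assuming the relation Reliability = Separation^2 / (1 + Separation^2). The function names are illustrative and are not part of Facets.

```python
def separation_from_strata(strata):
    """Invert Strata = (4*Separation + 1)/3."""
    return (3.0 * strata - 1.0) / 4.0


def reliability_from_separation(separation):
    """Assumed relation: Reliability = Separation^2 / (1 + Separation^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


reported_strata = 32
separation = separation_from_strata(reported_strata)
reliability = reliability_from_separation(separation)

print(f"Separation: {separation}")            # 23.75
print(f"Implied reliability: {reliability:.4f}")
```

A separation of 23.75 means that the true S.D. of the criteria is almost 24 times their average measurement error, which is consistent with each criterion being observed a very large number of times.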

Help for Facets Rasch Measurement and Rasch Analysis Software: www.winsteps.com Author: John Michael Linacre.

