Testing the equivalence of translations of widely used response choice labels: results from the IQOLA Project. International Quality of Life Assessment
Authors: Keller, Susan D.
Ware, John E. Jr.
Aaronson, Neil K.
Bjorner, Jakob B.
Brazier, John E.
Sanson-Fisher, Robert W.
UMass Chan Affiliations: Department of Quantitative Health Sciences
Document Type: Journal Article
Keywords: Analysis of Variance
*Health Status Indicators
*Quality of Life
Health Services Research
Abstract: The similarity in meaning assigned to response choice labels from the SF-36 Health Survey (SF-36) was evaluated across countries. Convenience samples of judges (range, 10 to 117; median = 48) from 13 countries rated translations of response choice labels, using a variation of the Thurstone method of equal appearing intervals. Judges marked a point on a 10-cm line representing the magnitude of a response choice label (e.g., "good" relative to the anchors of "poor" and "excellent"). Ratings were evaluated to determine the ordinal consistency of response choice labels within a response scale; the degree to which differences between adjacent response choice labels were equal interval; and the amount of variance due to response choice label, country, judge, and interaction between response choice label and country. Results confirmed the hypothesized ordering of response choice labels; the percentage of ordinal pairs ranged from 88.7% to 100% (median = 98.2%) across countries and response scales. Examination of the average magnitudes of response choice labels supported the "quasi-interval" nature of the scales. Analysis of variance (ANOVA) results supported the generalizability of response choice magnitudes across countries; labels explained 64% to 77% of the variance in ratings, and country explained 1% to 3%. These results support the equivalence of SF-36 response choice labels across countries. Departures from the assumption of equal intervals, when observed, were similar across countries and were greatest for the two response scales that are recalibrated under standard SF-36 scoring. Results provide justification for scoring translations of individual items using standard SF-36 scoring; whether these items form the same scales in other countries as they do in the United States is evaluated with tests of scaling assumptions.
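As a rough illustration of two quantities named in the abstract, the sketch below computes a percentage of ordinally consistent label pairs and the share of rating variance attributable to the label. It is not the authors' procedure. It assumes that every pair of labels within a judge's ratings is compared with a strict inequality, and it replaces the study's multi-factor ANOVA (label, country, judge, label-by-country interaction) with a simple one-way between-label sum of squares. The function names and the ratings are hypothetical.

```python
from itertools import combinations
from statistics import mean


def percent_ordinal_pairs(ratings):
    """Percentage of label pairs rated in the hypothesized order.

    ratings: one list per judge, holding positions on the 10-cm line,
    listed in the hypothesized label order (lowest label first).
    Assumption: all pairs are compared, and ties count as inconsistent.
    """
    consistent = 0
    total = 0
    for judge in ratings:
        # combinations() keeps positional order, so `low` precedes `high`.
        for low, high in combinations(judge, 2):
            total += 1
            if low < high:
                consistent += 1
    return 100.0 * consistent / total


def label_variance_share(ratings):
    """Between-label sum of squares divided by total sum of squares.

    A one-way simplification; country and judge effects are ignored.
    """
    values = [v for judge in ratings for v in judge]
    grand = mean(values)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_label = 0.0
    for k in range(len(ratings[0])):
        column = [judge[k] for judge in ratings]
        ss_label += len(column) * (mean(column) - grand) ** 2
    return ss_label / ss_total


# Hypothetical ratings from three judges for a five-label scale.
example = [
    [0.0, 2.0, 5.0, 8.0, 10.0],
    [0.0, 3.0, 6.0, 7.0, 10.0],
    [0.0, 5.0, 4.0, 8.0, 10.0],  # second and third labels are inverted
]
print(percent_ordinal_pairs(example))
print(label_variance_share(example))
```

In this made-up example only one of the 30 pairs is out of order, and most of the rating variance comes from differences between labels, which is the pattern the abstract reports in aggregate.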
Source: J Clin Epidemiol. 1998 Nov;51(11):933-44.
Permanent Link to this Item: http://hdl.handle.net/20.500.14038/47412