Regression methods in ordinal data with many levels - Selecting the best statistical methods for patient-reported outcome measures

Background: Patient-reported outcomes (PROs) have long been used in clinical practice to ensure that the outcomes most meaningful to patients are considered. Patient-reported outcome measures (PROMs), that is, the instruments (often questionnaires) used to assess PROs, are therefore increasingly used as primary or secondary endpoints in clinical research. These include generic instruments such as the EQ-5D or SF-36, which measure overall quality of life or health status, as well as more specific instruments assessing aspects such as sleep, depression, or pain. While visual analog scales most often range from 0 to 10, other PROMs can have any number of levels.

When a PRO is included in a clinical research project, appropriate statistical methods must be selected. If no adjustment for potential confounders is desired, the choice of method is relatively straightforward (Forrest and Andersen 1986), even if the outcome is ordinal. If, on the other hand, adjustment for confounders or stratification variables is required, regression models are necessary. One approach is to use ordinal regression models, for example ordinal logistic regression, which are available in standard software packages such as R and Stata. These models are commonly used when outcomes have only a few levels (i.e., 3, 4, or 5). But what if the PRO has 10, 20, or more levels? In these cases, researchers often resort to linear regression or quantile (median) regression rather than the typical ordinal logistic regression (Selman et al. 2024). The statistical literature has proposed several models for the analysis of ordinal outcomes (McCullagh 1980; Parsons 2013). Nonetheless, it remains unclear when an ordinal outcome has “too many levels” for ordinal models and could (or should) be analyzed using other methods. We aim to fill this gap in the literature and to provide practical recommendations to researchers working with PROMs.

Objectives:
  • To determine, through a neutral comparison simulation study (Boulesteix et al. 2013; Morris et al. 2019), which regression approaches are most appropriate to analyze ordinal outcomes in terms of power and availability of clinically meaningful effect estimates.
  • To formulate recommendations on selection of regression methods when study endpoints are ordinal.
The conclusions of this project are highly relevant to clinical and epidemiological researchers who include patient-reported outcomes in their clinical trials and observational research projects. Recommendations based on the simulation study will make it easier to determine power and sample size and to choose an appropriate data analysis. Because the comparison relies on standard statistical methodology, the recommendations can be applied directly in practice.
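
To make the idea of such a simulation study concrete, the following is a minimal sketch of one cell of a possible design, not the project's actual protocol. It assumes a data-generating mechanism that the proposal does not specify: an ordinal outcome with many levels obtained by cutting a latent logistic variable (a proportional-odds mechanism), one binary treatment, and one continuous covariate. It then estimates the rejection rate of covariate-adjusted linear regression using a normal approximation for the test. All function names, parameter values, and the choice of Python (rather than R or Stata) are illustrative assumptions; other candidate methods would be plugged into the same loop.

```python
import math
import random
from statistics import NormalDist


def simulate_trial(n, n_levels, beta, gamma, rng):
    """Draw one data set as (treatment, covariate, ordinal outcome) rows.

    Assumed mechanism: latent = beta*treat + gamma*x + logistic noise,
    cut at logistic quantiles so levels are equally likely when latent is pure noise.
    """
    cuts = [math.log(k / (n_levels - k)) for k in range(1, n_levels)]
    rows = []
    for i in range(n):
        treat = float(i % 2)                      # alternating 1:1 allocation
        x = rng.gauss(0.0, 1.0)                   # continuous adjustment covariate
        u = min(max(rng.random(), 1e-12), 1 - 1e-12)
        latent = beta * treat + gamma * x + math.log(u / (1 - u))
        y = sum(latent > c for c in cuts)         # level in 0 .. n_levels - 1
        rows.append((treat, x, y))
    return rows


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


def ols_treatment_z(rows):
    """Adjusted linear regression of the outcome score; returns estimate/SE for treatment."""
    X = [[1.0, t, x] for t, x, _ in rows]
    y = [float(v) for _, _, v in rows]
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(3)]
    inv = invert(xtx)
    b = [sum(inv[i][j] * xty[j] for j in range(3)) for i in range(3)]
    rss = sum((yi - sum(bj * rj for bj, rj in zip(b, r))) ** 2
              for r, yi in zip(X, y))
    s2 = rss / (len(y) - 3)                       # residual variance
    return b[1] / math.sqrt(s2 * inv[1][1])


def rejection_rate(n, n_levels, beta, gamma=0.5, n_sim=200, alpha=0.05, seed=1):
    """Share of simulated data sets in which the treatment effect is declared significant."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)    # normal approximation to the t-test
    hits = sum(
        abs(ols_treatment_z(simulate_trial(n, n_levels, beta, gamma, rng))) > crit
        for _ in range(n_sim)
    )
    return hits / n_sim


if __name__ == "__main__":
    # beta = 0 estimates the type I error; beta != 0 estimates power.
    print("type I error:", rejection_rate(n=200, n_levels=11, beta=0.0))
    print("power:       ", rejection_rate(n=200, n_levels=11, beta=1.0))
```

In a full study, the number of levels, sample size, effect size, and the distribution of the outcome over its levels would be varied systematically, and each candidate method would be evaluated on the same simulated data sets.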

Contact and direct supervisor: Sarah Haile sarah.haile@uzh.ch
Internal supervisor: Torsten Hothorn torsten.hothorn@uzh.ch

References:
Boulesteix A-L, Lauer S, Eugster MJA (2013). A plea for neutral comparison studies in computational sciences. PLOS One 8:e61562. https://doi.org/10.1371/journal.pone.0061562
Forrest M, Andersen B (1986). Ordinal scale and statistics in medical research. Br Med J 292:537–8. https://doi.org/10.1136/bmj.292.6519.537
McCullagh P (1980). Regression models for ordinal data. J R Statist Soc B 42:109–42.
Morris TP, White IR, Crowther MJ (2019). Using simulation studies to evaluate statistical methods. Statist Med 38:2074–2102. https://doi.org/10.1002/sim.8086
Parsons NR (2013). Proportional-odds models for repeated composite and long ordinal outcome scales. Statist Med 32:3181–91. https://doi.org/10.1002/sim.5756
Selman CJ, Lee KJ, Ferguson KN, et al (2024). Statistical analyses of ordinal outcomes in randomised controlled trials: A scoping review. Trials 25:241. https://doi.org/10.1186/s13063-024-08072-2