A neutral comparison of e-values and traditional methods for sequential analysis in clinical trials
Description: The new statistical inference framework "Safe Anytime-Valid Inference" (SAVI), which is based on so-called e-values, promises more efficient statistical inference. In contrast to traditional p-values, e-values allow accumulating data to be analyzed repeatedly without adjusting for multiplicity while still controlling the Type I error rate. Although the theoretical properties of SAVI methods are to some extent understood thanks to the many research articles published in mathematical statistics in recent years, it is unclear whether these properties translate into actual benefits for clinical researchers. The aim of this thesis is to perform a neutral comparison of e-values with traditional methods for sequential analysis in clinical trials, such as group sequential designs and alpha-spending functions. To this end, traditional and SAVI methods will be summarized and applied to case studies from clinical trials. Based on these case studies, a neutral simulation study will be designed to better understand how the properties of SAVI methods compare to those of traditional methods under realistic conditions.
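As a rough illustration of the anytime-validity property described above, the following Python sketch monitors simulated trials after every observation and compares two stopping rules: rejecting when a likelihood-ratio e-process reaches 1/alpha, and rejecting when an unadjusted two-sided z-test is significant at any look. Everything here is an assumption made for illustration and not part of the project description: normally distributed outcomes with known unit variance, a point null of mean zero, a hypothetical design effect `delta`, and the function names `monitor_trial` and `rejection_rates`. The packages safestats and rpact are not used.

```python
import math
import random
from statistics import NormalDist


def monitor_trial(rng, n_max, mu, delta, alpha):
    """Simulate one trial, looking at the data after every observation.

    Returns (e_reject, naive_reject): whether the e-process ever reached
    1/alpha, and whether an unadjusted two-sided z-test was ever significant.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    log_threshold = math.log(1 / alpha)
    log_e = 0.0   # log of the running e-value (starts at e = 1)
    total = 0.0   # running sum of observations for the z-statistic
    e_reject = False
    naive_reject = False
    for n in range(1, n_max + 1):
        x = rng.gauss(mu, 1.0)
        total += x
        # Likelihood ratio of N(delta, 1) against N(0, 1) for one observation
        log_e += delta * x - 0.5 * delta ** 2
        if log_e >= log_threshold:
            e_reject = True
        if abs(total / math.sqrt(n)) > z_crit:
            naive_reject = True
    return e_reject, naive_reject


def rejection_rates(n_sim=2000, n_max=100, mu=0.0, delta=0.5, alpha=0.05, seed=1):
    """Estimate how often each rule rejects at least once within n_max looks."""
    rng = random.Random(seed)
    e_count = 0
    naive_count = 0
    for _ in range(n_sim):
        e_rej, naive_rej = monitor_trial(rng, n_max, mu, delta, alpha)
        e_count += e_rej
        naive_count += naive_rej
    return e_count / n_sim, naive_count / n_sim


if __name__ == "__main__":
    e_rate, naive_rate = rejection_rates()
    print(f"e-process: {e_rate:.3f}, unadjusted z-test: {naive_rate:.3f}")
```

With `mu=0.0` the data are generated under the null, so both rates are estimates of the Type I error rate. By Ville's inequality the e-process rate should stay at or below alpha up to Monte Carlo error, whereas the unadjusted rule is expected to reject far more often. Setting `mu` to a nonzero value gives a crude look at power under the same assumptions.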
Contact: samuel.pawel@uzh.ch
References:
Grünwald, P., de Heide, R., Koolen, W. (2024). Safe testing, Journal of the Royal Statistical Society Series B: Statistical Methodology, 86(5):1091-1128.
https://doi.org/10.1093/jrsssb/qkae011
Ly, A., Boehm, U., Grünwald, P., Ramdas, A., van Ravenzwaaij, D. (2024). Safe Anytime-Valid Inference: Practical Maximally Flexible Sampling Designs for Experiments Based on e-Values.
https://doi.org/10.31234/osf.io/h5vae
Ramdas, A., Grünwald, P., Vovk, V., Shafer, G. (2023). Game-Theoretic Statistics and Safe Anytime-Valid Inference. Statistical Science. 38(4):576-601.
https://doi.org/10.1214/23-STS894
Turner, R., Ly, A. (2022). safestats: Safe Anytime-Valid Inference.
https://CRAN.R-project.org/package=safestats
Lakens, D., Pahlke, F., Wassmer, G. (2021). Group Sequential Designs: A Tutorial.
https://doi.org/10.31234/osf.io/x4azm
Pallmann, P., Bedding, A.W., Choodari-Oskooei, B. et al. (2018). Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Medicine. 16:29.
https://doi.org/10.1186/s12916-018-1017-7
Wassmer, G., Brannath, W. (2016). Group sequential and confirmatory adaptive designs in clinical trials.
https://doi.org/10.1007/978-3-319-32562-0
Wassmer, G., Pahlke, F. (2024). rpact: Confirmatory Adaptive Clinical Trial Design and Analysis.
https://CRAN.R-project.org/package=rpact
Morris, T.P., White, I.R., Crowther, M.J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine. 38(11):2074-2102.
https://doi.org/10.1002/sim.8086
Siepe, B.S., Bartoš, F., Morris, T.P., Boulesteix, A.-L., Heck, D.W., Pawel, S. (2024). Simulation Studies for Methodological Research in Psychology: A Standardized Template for Planning, Preregistration, and Reporting. Psychological Methods.
https://doi.org/10.1037/met0000695
https://doi.org/10.31234/osf.io/ufgy6
https://www.overleaf.com/latex/templates/ademp-prereg-simulation-study-template/dkhtxjtmpbfj
Predictive power and sample size calculations for ANCOVA
Description: Analysis of covariance (ANCOVA) is a widely used method in clinical trials that incorporate both baseline and follow-up measurements. Traditional approaches to power and sample size calculations for ANCOVA rely on specifying a fixed value for the correlation between baseline and follow-up outcomes (Borm et al., 2007; Kieser, 2020). In practice, however, this correlation is often uncertain and difficult for clinicians to quantify reliably. Bayesian predictive power offers a framework to address this limitation by explicitly incorporating parameter uncertainty. Rather than conditioning on a single assumed value, predictive power averages over a prior distribution for the parameter of interest (Spiegelhalter et al., 1986; Grieve, 2020). The primary objective of this thesis is to extend the concept of predictive power to the ANCOVA setting and to investigate whether modified sample size and power formulas can be derived that account for uncertainty in the baseline-follow-up correlation. A secondary aim is to elicit an appropriate prior distribution for this correlation parameter using empirical estimates reported in clinical trials (Walters et al., 2019). The developed methods and prior distribution could inform the design of future studies employing ANCOVA.
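One possible reading of the idea above can be sketched in Python: compute ANCOVA power for a fixed correlation and then average it over a prior on that correlation. The sketch assumes a normal approximation in which the variance of the treatment effect estimate is reduced by the factor 1 - rho^2 relative to an unadjusted two-group comparison, and a Beta(a, b) prior on rho restricted to (0, 1). The prior family, its parameters, and the function names `ancova_power`, `beta_pdf`, and `expected_power` are hypothetical choices for illustration, not the methods the thesis is meant to derive.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def ancova_power(rho, n_per_group, delta, sigma, alpha=0.05):
    """Approximate two-sided power for a fixed baseline-follow-up correlation.

    Uses a normal approximation and ignores the negligible probability of
    rejecting in the wrong direction.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Standard error of the adjusted treatment effect (assumed design factor)
    se = math.sqrt(2 * sigma ** 2 * (1 - rho ** 2) / n_per_group)
    return STD_NORMAL.cdf(delta / se - z_crit)


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def expected_power(n_per_group, delta, sigma, a, b, alpha=0.05, grid=2000):
    """Average the power over a Beta(a, b) prior on rho (midpoint rule)."""
    total = 0.0
    for i in range(grid):
        rho = (i + 0.5) / grid  # midpoints keep rho strictly inside (0, 1)
        power = ancova_power(rho, n_per_group, delta, sigma, alpha)
        total += power * beta_pdf(rho, a, b)
    return total / grid


if __name__ == "__main__":
    fixed = ancova_power(rho=0.5, n_per_group=64, delta=0.5, sigma=1.0)
    averaged = expected_power(n_per_group=64, delta=0.5, sigma=1.0, a=5, b=5)
    print(f"power at rho = 0.5: {fixed:.3f}, prior-averaged power: {averaged:.3f}")
```

Because power is a nonlinear function of rho, the prior-averaged value generally differs from the power evaluated at the prior mean, which is one reason a single assumed correlation can be misleading.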
Contact: samuel.pawel@uzh.ch
References:
Borm, G., Fransen, J., Lemmens, W. (2007). A simple sample size formula for analysis of covariance in randomized clinical trials. Journal of Clinical Epidemiology.
DOI: 10.1016/j.jclinepi.2007.02.006.
Kieser, M. (2020). Methods and Applications of Sample Size Calculation and Recalculation in Clinical Trials. Chapter 3.4.
DOI: 10.1007/978-3-030-49528-2.
Spiegelhalter, D. J., Freedman, L. S., Blackburn, P. R. (1986). Monitoring clinical trials: Conditional power or predictive power? Controlled Clinical Trials.
DOI: 10.1016/0197-2456(86)90003-6.
Grieve, A. (2020). Hybrid Frequentist/Bayesian Power and Bayesian Power in Planning Clinical Trials.
DOI: 10.1201/9781003218531.
Walters, S., Jacques, R., dos Anjos Henriques-Cadby, I., Candlish, J., Totton, N., Xian, M. (2019). Sample size estimation for randomised controlled trials with repeated assessment of patient-reported outcomes: what correlation between baseline and follow-up outcomes should we assume? Trials.
DOI: 10.1186/s13063-019-3671-2.
Standardized and reproducible simulation studies with Omnibenchmark
Description: Simulation studies are widely used to evaluate the performance of statistical methods and provide guidance on their appropriate use. In such studies, researchers generate synthetic data sets under known data-generating mechanisms, apply the methods of interest, and compare the results to the known truth to assess performance (Morris et al., 2019; Siepe et al., 2024). Despite their importance, published simulation studies are often difficult to compare due to a lack of standardization in design, implementation, and reporting. As a result, different simulation studies may yield conflicting recommendations, limiting their practical usefulness. To address this issue, a "living" simulation benchmarking framework has recently been proposed, in which benchmarks are continuously updated with new data-generating mechanisms and methods (Bartoš et al., 2025). A key requirement for such an approach is strict computational reproducibility across all stages of the simulation study, including data generation, method implementation, and performance evaluation. This thesis aims to investigate whether Omnibenchmark (Mallona et al., 2024a,b), a recently developed framework for standardized benchmarking in bioinformatics, can serve as a viable solution for simulation studies in statistical research, which are usually implemented in the statistical programming language R. To this end, published simulation studies will be reimplemented within the Omnibenchmark framework to assess its feasibility, flexibility, and reproducibility. This project will involve (or require willingness to develop) practical skills in R, Python, containerization technologies, Linux environments, and command-line tools.
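The three stages named above (data generation, method application, performance evaluation) can be sketched as a toy simulation study in Python. The example is deliberately minimal and entirely hypothetical: the data-generating mechanism (normal data with a known mean), the two methods (sample mean and sample median), the performance measures, and all function names are assumptions chosen for illustration. It does not use or imitate the Omnibenchmark interface.

```python
import random
import statistics


def generate_data(rng, n, mu, sigma):
    """Data-generating mechanism: n draws from N(mu, sigma^2)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Methods under comparison: two estimators of the location parameter mu
METHODS = {"mean": statistics.fmean, "median": statistics.median}


def run_simulation(n_sim=2000, n=30, mu=1.0, sigma=1.0, seed=2024):
    """Run all stages with a fixed seed and return performance per method."""
    rng = random.Random(seed)
    estimates = {name: [] for name in METHODS}
    for _ in range(n_sim):
        data = generate_data(rng, n, mu, sigma)
        # Every method sees the same simulated data set
        for name, method in METHODS.items():
            estimates[name].append(method(data))
    results = {}
    for name, values in estimates.items():
        empirical_se = statistics.stdev(values)
        results[name] = {
            "bias": statistics.fmean(values) - mu,        # compare to known truth
            "bias_mcse": empirical_se / n_sim ** 0.5,     # Monte Carlo SE of bias
            "empirical_se": empirical_se,
        }
    return results


if __name__ == "__main__":
    for name, measures in run_simulation().items():
        print(name, {key: round(value, 4) for key, value in measures.items()})
```

Fixing the seed makes repeated runs return identical results on the same Python version, which is a small-scale version of the reproducibility requirement; a benchmarking framework would additionally need to pin software environments and record each stage's inputs and outputs.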
Contact: samuel.pawel@uzh.ch
References:
Bartoš, F., Pawel, S., Siepe, B. S. (2025). Living Synthetic Benchmarks: A Neutral and Cumulative Framework for Simulation Studies. arXiv.
DOI: 10.48550/ARXIV.2510.19489.
Mallona, I., Luetge, A., Carrillo, B., Incicau, D., Gerber, R., Meara, A., Sonrel, A., Soneson, C., Robinson, M. D. (2024a). Omnibenchmark: transparent, reproducible, extensible and standardized orchestration of solo and collaborative benchmarks. arXiv.
DOI: 10.48550/ARXIV.2409.17038.
Mallona, I., Soneson, C., Carrillo, B., Luetge, A., Incicau, D., Gerber, R., Sonrel, A., & Robinson, M. D. (2024b). Building a continuous benchmarking ecosystem in bioinformatics. arXiv.
DOI: 10.48550/ARXIV.2409.15472.
Morris, T. P., White, I. R., Crowther, M. J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074–2102.
DOI: 10.1002/sim.8086.
Siepe, B. S., Bartoš, F., Morris, T. P., Boulesteix, A.-L., Heck, D. W., & Pawel, S. (2024). Simulation studies for methodological research in psychology: A standardized template for planning, preregistration, and reporting. Psychological Methods.
DOI: 10.1037/met0000695.