Combining p-values with adjustment for selective reporting
Description: Selective reporting of study results occurs in almost all fields of applied science. In preclinical research, sometimes only one out of a number of "representative" experiments is reported, often the most convincing and significant one, i.e., the one with the smallest p-value. Other experiments in the same paper that also support the scientific hypothesis may be subject to selective reporting as well. Computing the combined evidence from different experiments with standard p-value combination methods is then no longer valid, because the reported p-values are no longer uniformly distributed under the null hypothesis of no effect. This Master thesis will develop and apply p-value combination methods with adjustments for selective reporting. A particular focus will be on high-impact papers in biomedical and preclinical lab research.
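To illustrate why selection breaks uniformity, the following minimal sketch simulates the smallest of k independent null p-values and applies the textbook correction P(min <= x) = 1 - (1 - x)^k. It assumes that k is known and that the experiments are independent, which will rarely hold exactly in practice. The function names are hypothetical, and this is not the adjustment method the thesis will develop.

```python
import random


def selection_adjusted_p(p_min: float, k: int) -> float:
    """Adjust the smallest of k independent p-values for selection.

    Under the null each p-value is Uniform(0, 1), so
    P(min <= x) = 1 - (1 - x)**k. Assumes k is known (illustrative only).
    """
    if not 0.0 <= p_min <= 1.0 or k < 1:
        raise ValueError("need 0 <= p_min <= 1 and k >= 1")
    return 1.0 - (1.0 - p_min) ** k


def simulate_reported_p(k: int, n_sim: int, seed: int = 1) -> list[float]:
    """Simulate the reported (smallest) p-value of k null experiments."""
    rng = random.Random(seed)
    return [min(rng.random() for _ in range(k)) for _ in range(n_sim)]


def rejection_rate(pvals: list[float], alpha: float = 0.05) -> float:
    """Share of p-values at or below alpha."""
    return sum(p <= alpha for p in pvals) / len(pvals)


# Under the null, the naive rate exceeds alpha; the adjusted rate is close to it.
reported = simulate_reported_p(k=3, n_sim=20_000)
adjusted = [selection_adjusted_p(p, k=3) for p in reported]
print("naive rejection rate:   ", rejection_rate(reported))
print("adjusted rejection rate:", rejection_rate(adjusted))
```

With k = 3 the naive rejection rate at the 5% level should be close to 1 - 0.95^3, roughly 14%, whereas the adjusted p-values reject at about the nominal 5%.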
Contact: leonhard.held@uzh.ch
Combining evidentiary value and sample size of dual-criterion designs
Description: The dual-criterion design combines statistical significance and effect size relevance and has recently become popular in clinical trials (Roychoudhury et al., 2018). Dual-criterion designs can be combined with Bayesian group-sequential methods so that a trial can be stopped at an interim analysis for success or failure (Gsponer et al., 2014). Such adaptive methods often come with substantial sample size reductions, but this may be at the cost of increased error rates. The recently proposed experimental unit information index (EUII, Held et al., 2025) is a novel way to balance evidentiary value and sample size of adaptive designs. This Master thesis will investigate the applicability of the EUII in dual-criterion designs.
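As a rough illustration of the dual-criterion idea, the sketch below checks both conditions for a normally distributed effect estimate: a one-sided significance test against a null value, and a comparison of the estimate with a clinically relevant decision value. This is a simplified frequentist reading; the cited papers may define the criteria differently (for example via posterior probabilities), and the function name, parameter names and default values are assumptions made for this example. The EUII is not computed here.

```python
from statistics import NormalDist


def dual_criterion(estimate: float, se: float, decision_value: float,
                   null_value: float = 0.0, alpha: float = 0.025) -> dict:
    """Evaluate both criteria for a normally distributed effect estimate.

    Assumptions (illustrative): larger effects are better, the test is
    one-sided, and relevance means estimate >= decision_value.
    """
    z = (estimate - null_value) / se
    p_value = 1.0 - NormalDist().cdf(z)       # one-sided p-value
    significant = p_value <= alpha            # criterion 1: significance
    relevant = estimate >= decision_value     # criterion 2: relevance
    return {
        "p_value": p_value,
        "significant": significant,
        "relevant": relevant,
        "go": significant and relevant,
    }


# A precise but small effect is significant without being relevant.
print(dual_criterion(estimate=0.3, se=0.1, decision_value=0.4))
```

The example call shows the case that motivates the second criterion: the estimate is three standard errors away from zero, yet it falls short of the decision value, so the overall decision is not "go".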
Contact: leonhard.held@uzh.ch
References:
Held et al. (2025). The Experimental Unit Information Index: Balancing Evidentiary Value and Sample Size of Adaptive Designs. https://arxiv.org/abs/2511.17292
Gsponer et al. (2014). A practical guide to Bayesian group sequential designs. Pharmaceutical Statistics, 13(1):71–80. doi:10.1002/pst.1593
Roychoudhury et al. (2018). Beyond p-values: A phase II dual-criterion design with statistical significance and clinical relevance. Clinical Trials, 15(5):452–461. doi:10.1177/1740774518770661
Point and Interval Estimates for Drug Regulation
Description: The two-trials rule requires "at least two pivotal studies, each convincing on its own" for the demonstration of drug efficacy and subsequent marketing approval (FDA, 1998, 2019). This is typically implemented by requiring the p-values from the two trials to be significant at the standard (one-sided) alpha = 0.025 level. However, this procedure alone provides neither an effect estimate nor a confidence interval, and its conclusions may be incompatible with the meta-analytically pooled estimate of the two studies, which is often suggested as an alternative in the clinical trials literature. In addition, more than two trials are often conducted, and it is then unclear how the two-trials rule should be extended (Rosenkranz, 2023). Alternative methods have therefore recently been proposed, such as Edgington's method (Held, 2024). The Master thesis will derive point estimates and confidence intervals that match the two-trials rule and Edgington's method, respectively. The work will be based on the concept of the combined p-value function (Fraser, 2019; Held et al., 2024), which provides p-values for every point null hypothesis, confidence intervals and the median point estimate, which is by construction always compatible with the p-values and confidence intervals.
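The ideas above can be sketched in a few lines of Python. The example compares the two-trials rule with Edgington's method (the sum of the p-values, referred to its null distribution, the Irwin-Hall distribution) and then inverts an Edgington-based combined p-value function to obtain a median estimate and a 95% confidence interval. It assumes normally distributed effect estimates with known standard errors and one-sided p-values; the threshold alpha^2 for the combined p-value is chosen here so that the overall type I error rate matches that of the two-trials rule. All function names are hypothetical, and this is only an illustration, not the derivation the thesis will produce.

```python
from math import comb, factorial, floor
from statistics import NormalDist


def edgington(pvals):
    """Edgington's combined p-value: Irwin-Hall CDF at the sum of p-values."""
    n, s = len(pvals), sum(pvals)
    total = sum((-1) ** k * comb(n, k) * (s - k) ** n
                for k in range(floor(s) + 1))
    return min(max(total / factorial(n), 0.0), 1.0)


def two_trials_rule(pvals, alpha=0.025):
    """Success if every one-sided p-value is at or below alpha."""
    return all(p <= alpha for p in pvals)


def one_sided_p(estimate, se, mu=0.0):
    """One-sided p-value for H0: theta = mu against theta > mu."""
    return 1.0 - NormalDist().cdf((estimate - mu) / se)


def combined_p_function(mu, estimates, ses):
    """Edgington-combined p-value for the point null hypothesis theta = mu."""
    return edgington([one_sided_p(e, s, mu) for e, s in zip(estimates, ses)])


def solve_mu(target, estimates, ses, tol=1e-10):
    """Bisection for the mu at which the combined p-value equals target.

    Relies on the combined p-value function being increasing in mu.
    """
    lo = min(estimates) - 10 * max(ses)
    hi = max(estimates) + 10 * max(ses)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if combined_p_function(mid, estimates, ses) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Two trials where the two methods disagree (illustrative p-values).
alpha = 0.025
pvals = [0.03, 0.001]
print("two-trials rule:", two_trials_rule(pvals, alpha))
print("Edgington:      ", edgington(pvals) <= alpha ** 2)

# Median estimate and 95% CI from the combined p-value function.
estimates, ses = [0.2, 0.4], [0.1, 0.1]
median = solve_mu(0.5, estimates, ses)
ci = (solve_mu(0.025, estimates, ses), solve_mu(0.975, estimates, ses))
print("median estimate:", median, "95% CI:", ci)
```

In the first example the two-trials rule fails because one p-value is just above 0.025, while Edgington's method, which allows strong evidence in one trial to compensate for weaker evidence in the other, declares success. In the second example the two standard errors are equal, so the median estimate coincides with the average of the two effect estimates.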
Contact: leonhard.held@uzh.ch
References:
FDA (1998). Providing clinical evidence of effectiveness for human drug and biological products. www.fda.gov/regulatory-information/search-fda-guidance-documents/providing-clinical-evidence-effectiveness-human-drug-and-biological-products
FDA (2019). Substantial evidence of effectiveness for human drug and biological products. https://www.fda.gov/drugs/guidance-compliance-regulatory-information/guidances-drugs
Fraser, D. A. S. (2019). The p-value Function and Statistical Inference. The American Statistician, 73(sup1), 135–147. doi:10.1080/00031305.2018.1556735
Held, L. (2024). Beyond the two-trials rule. Statistics in Medicine, 43(26), 5023–5042. doi:10.1002/sim.10055
Held et al. (2024). http://arxiv.org/abs/2408.08135
Rosenkranz, G. K. (2023). A Generalization of the Two Trials Paradigm. Ther Innov Regul Sci, 57, 316–320. doi:10.1007/s43441-022-00471-4
Power calculations for clinically relevant effect sizes
Description: Traditional power calculations are based on standard point null hypothesis testing. However, in practice it is often more important to quantify the evidence for a clinically relevant effect rather than for a non-zero effect. The project will investigate the literature on this topic with special attention to the design and analysis of preclinical replication studies.
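One simple way to make the contrast concrete is to shift the null hypothesis from "no effect" to "effect at most delta", where delta is the smallest clinically relevant effect. The sketch below computes power and sample size for a one-sided z-test comparing two equally sized groups with known standard deviation. The setting, the function names and the example values are assumptions for illustration; the literature to be reviewed in the project may use different formulations.

```python
from math import ceil, sqrt
from statistics import NormalDist

_norm = NormalDist()


def power_shifted_null(n_per_group, effect, delta, sigma=1.0, alpha=0.025):
    """Power of a one-sided z-test of H0: theta <= delta (two equal groups).

    effect is the assumed true mean difference, delta the relevance threshold.
    """
    se = sigma * sqrt(2.0 / n_per_group)
    return _norm.cdf((effect - delta) / se - _norm.inv_cdf(1.0 - alpha))


def sample_size_shifted_null(effect, delta, sigma=1.0, alpha=0.025, power=0.8):
    """Smallest n per group reaching the target power against H0: theta <= delta."""
    if effect <= delta:
        raise ValueError("assumed effect must exceed the relevance threshold")
    z_sum = _norm.inv_cdf(1.0 - alpha) + _norm.inv_cdf(power)
    return ceil(2.0 * (sigma * z_sum / (effect - delta)) ** 2)


# Same assumed effect, tested against zero and against a relevance threshold.
print("n per group, delta = 0.0:", sample_size_shifted_null(0.5, 0.0))
print("n per group, delta = 0.2:", sample_size_shifted_null(0.5, 0.2))
```

Because only the distance between the assumed effect and the threshold matters, demanding evidence for a relevant effect rather than a non-zero one can increase the required sample size considerably.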