
Master Program in Biostatistics

Selected Projects


Thomas Fischer: Tetralogy of Fallot: Impact of prosthetic pulmonary valve replacement on ventricular function and outcomes (Statistical Consulting Fall 2020)

This retrospective cohort study investigates the long-term dynamics of pulmonary valve replacements with respect to ventricular function and dilation, exercise performance, heart insufficiency, and the risk of adverse cardiac events. Using generalized linear mixed models, we found no evidence for an association between PVR and right- or left-ventricular ejection fraction, RVEDVi, or log-ProBNP. With respect to LVEDVi, we found strong evidence for a negative association between age and PVR. We observed that exercise capacity decreased more slowly in patients with prior PVR; however, this effect was not statistically significant. Finally, we used Cox proportional hazards models with time-varying covariates to assess whether PVR was associated with the hazard of adverse cardiac events, and found strong evidence for a positive association between PVR and the hazard of atrial arrhythmia and endocarditis.
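A time-varying covariate such as PVR status is typically handled in a Cox model by splitting each patient's follow-up into (start, stop] intervals at the moment of valve replacement (counting-process format). A minimal sketch of that split, with invented data and a hypothetical function name, not the study's actual data handling:

```python
# Split one patient's follow-up into counting-process rows so a
# time-varying covariate (here: PVR status) can enter a Cox model.
# Illustrative sketch only; data and function name are hypothetical.

def split_follow_up(entry, exit_, event, pvr_time):
    """Return (start, stop, event, pvr) rows for one patient.

    pvr_time is the time of valve replacement, or None if no replacement
    occurred during follow-up. Before PVR the covariate is 0, afterwards 1.
    """
    if pvr_time is None or pvr_time >= exit_:
        return [(entry, exit_, event, 0)]
    return [
        (entry, pvr_time, 0, 0),      # pre-PVR interval, no event yet
        (pvr_time, exit_, event, 1),  # post-PVR interval carries the event flag
    ]

rows = split_follow_up(entry=0.0, exit_=12.5, event=1, pvr_time=4.0)
print(rows)  # [(0.0, 4.0, 0, 0), (4.0, 12.5, 1, 1)]
```

Each row then enters the partial likelihood only over its own interval, so the hazard comparison at any event time uses the covariate value currently in force.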


Thanh Elsener: Role of Multiparametric MRI for staging and grading in diffuse liver disease (Statistical Consulting Spring 2020)

Quantitative multiparametric magnetic resonance imaging (MRI) techniques are increasingly used in clinical workflows. The goal of this study was to determine whether there is a relationship between any combination of T2 mapping, proton density fat fraction (PDFF), and magnetic resonance elastography (MRE) on the one hand and activity grading or fibrosis staging on the other. The study also examined whether these relations are the same in the subgroups of steatosis and non-steatosis patients. The applied methods were regression models for ordinal data, including the proportional odds model (POLR), and logistic regression for binary data. After checking the model assumptions and the goodness of fit, the binary logistic model was proposed to fit the data. The analysis showed a weak association between PDFF and the activity grading. For fibrosis staging, there was a strong association with MRE and a weak relationship with T2. The subgroup analysis suggested two different models for grading and one model for staging, containing only MRE as predictor. This difference between the two subgroups could result from the small sample size or from the weak associations between the predictors and the outcome. A bigger sample size is required in further studies to improve the generalizability of the results.
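The proportional odds model mentioned above ties one shared slope to all cumulative category probabilities via P(Y ≤ j | x) = logistic(θ_j − βx). A small plain-Python sketch of the mechanics, with entirely made-up thresholds and slope (not fitted coefficients from this study):

```python
import math

def polr_probs(thresholds, beta, x):
    """Category probabilities under a proportional odds (POLR) model:
    P(Y <= j | x) = logistic(theta_j - beta * x), with one shared slope.
    Thresholds and slope below are invented for illustration."""
    logistic = lambda z: 1.0 / (1.0 + math.exp(-z))
    cum = [logistic(t - beta * x) for t in thresholds] + [1.0]
    # difference adjacent cumulative probabilities to get category probabilities
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# e.g. four ordered fibrosis stages, hypothetical thresholds and slope
p = polr_probs(thresholds=[-1.0, 0.5, 2.0], beta=0.8, x=1.0)
print([round(v, 3) for v in p], round(sum(p), 3))  # probabilities sum to 1
```

Because β is shared across all cut-points, increasing x shifts probability mass toward higher categories uniformly, which is the assumption one checks before preferring POLR over separate binary logistic models.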


Oliver John: Assessing Today for a Better Tomorrow (Statistical Consulting Spring 2020)

The neonatal period, defined as the first 28 days of a newborn's life, is a child's most vulnerable period. A variety of complications can compromise the newborn's health, ultimately resulting in death. Developing countries have especially high neonatal mortality rates, which is why it is important to gather pre-, peri-, and post-natal data on newborns and systematically analyze which variables are associated with mortality. This will allow for accurate future interventions to improve a child's survival. In a cohort study performed in Conakry, Guinea, a poor developing country in West Africa, a total of 168 newborn patients were admitted to the l'Institut de Nutrition et de Sante de l'Enfant (INSE) between February 15th and March 16th, 2019. To explore factors associated with neonatal mortality, 190 medical and non-medical variables concerning characteristics and measures of the newborn and its mother regarding pregnancy, birth, and hospitalization were collected. This study's analysis was largely performed using descriptive methods. Based on these descriptive results, we suggest that the following variables were particularly associated with neonatal mortality: the variable General Condition and variables concerning a newborn's weight, prematurity, temperature, and respiratory disease.


Christoph Blapp: Self-Reported Quality of Life Among Palliative Cancer Patients in a Feasibility Study (Statistical Consulting Spring 2020)

This report explores a small sample of outpatient palliative care patients whose self-reported quality of life, among other characteristics, has been tracked over time through the EORTC-QLQ-C30 questionnaire. The aim was both to check whether any patterns emerge when comparing a small selection of patient characteristics to each patient's development of quality of life within the data and to evaluate the internal consistency of certain subscales within the questionnaire. None of the patient characteristics showed a connection to general trends of quality of life. However, both signs of depression and an above-average amount of social problems showed an association with high-magnitude trends in either direction, and signs of depression showed an association with a high variance. All the evaluated subscales showed an acceptable, but not excellent, internal consistency.
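Internal consistency of questionnaire subscales is commonly summarized with Cronbach's alpha (the abstract does not name the exact statistic, so this choice is an assumption). A plain-Python sketch with invented item scores:

```python
def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.
    `items` is a list of item-score lists (one list per item, same
    respondents in the same order). The toy data below is invented."""
    k = len(items)
    n = len(items[0])

    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = sum(var(it) for it in items)
    totals = [sum(it[i] for it in items) for i in range(n)]
    return k / (k - 1) * (1 - item_vars / var(totals))

# three hypothetical questionnaire items, five respondents
alpha = cronbach_alpha([[3, 4, 2, 5, 4], [3, 5, 1, 5, 3], [4, 4, 2, 5, 4]])
print(round(alpha, 2))  # → 0.94
```

Values around 0.7-0.9 are conventionally read as "acceptable" to "good", matching the report's acceptable-but-not-excellent verdict.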


Xijin Chen: What are predictors of long-term work participation in people with cystic fibrosis undergoing lung transplantation? (Statistical Consulting Spring 2019)

Cystic Fibrosis is a life-limiting genetic disease that mainly affects the lungs, but also other organ systems. Respiratory failure remains the main cause of death in people with cystic fibrosis. Lung transplantation is the ultimate therapy option in carefully selected patients with end-stage lung disease. Return to work is recommended in people with cystic fibrosis following lung transplantation. This study aimed to determine predictors of work participation in a cohort of patients who had undergone lung transplantation at the University Hospital Zurich.

This retrospective study included 99 patients who received a lung transplant at the University Hospital Zurich between January 1996 and December 2016. Pre-transplantation factors (i.e., age, sex, body mass index, six-minute walk test distance, education, relationship status, living status and pre-employment status) and post-transplantation factors (e.g., chronic allograft dysfunction, diabetes, cancer, kidney disease) were retrieved from the patients' charts. Follow-up time post lung transplantation was categorized into five periods: within one year, one to three years, three to five years, five to ten years and more than ten years. To evaluate work participation after lung transplantation, we investigated both work status (i.e., working for income) and the average work percentage in each time period.

Research questions:

1. What are the predictors for work status in people with cystic fibrosis undergoing lung transplantation?

2. What are the predictors for work percentage in people with cystic fibrosis undergoing lung transplantation?


Uriah Daugaard: Sample Size in Studies with Survival Endpoint: Review and Best Practice (Statistical Consulting Spring 2019)

Over the last few decades, sample size calculation has become an integral part of clinical study design, as its importance for the informative value and the ethical aspects of research has been highlighted. However, the results of a recent systematic review suggest that, in general, the reporting of sample size calculations in randomized clinical trials needs to improve.

The research question for this project can be divided into two parts:

1. Sample size for clinical survival studies in R: what methods for sample size calculation for survival analysis in clinical studies are implemented in R, and in which packages and functions? This includes implementing in R the methods described by Schoenfeld et al. (1983), Therneau and Grambsch (2000), and Collett (2015), if they are not already implemented.

2. Evaluation of common practice: we want to explore how recently published clinical studies with survival endpoint calculated their sample size and how they reported it. This includes the question of whether they reported all the information that is needed for the sample size calculation and whether we can recompute the reported sample sizes with the provided information up to a margin of error.
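Schoenfeld's approach, one of the methods named above, gives the required number of events for a two-arm log-rank/Cox comparison as d = (z_{1−α/2} + z_{1−β})² / (p(1−p)(log HR)²), where p is the allocation proportion. A plain-Python sketch of this standard formula (mirroring what the R implementations compute, not reproducing any particular package):

```python
import math
from statistics import NormalDist

def schoenfeld_events(hr, alpha=0.05, power=0.8, p=0.5):
    """Required number of events under Schoenfeld's formula for a
    two-arm survival comparison. hr is the hazard ratio to detect,
    p the proportion allocated to one arm."""
    z = NormalDist().inv_cdf  # standard normal quantile
    num = (z(1 - alpha / 2) + z(power)) ** 2
    return num / (p * (1 - p) * math.log(hr) ** 2)

# e.g. detect HR = 0.67 with 80% power at two-sided alpha = 0.05
print(math.ceil(schoenfeld_events(hr=0.67)))  # → 196 events
```

Note the formula yields the number of *events*, not patients; the total sample size follows only after assuming accrual, follow-up, and censoring patterns, which is exactly the information the review part of the project checks for in published reports.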


Natalia Popova: Identification of risk factors and mapping of excess weight in young Swiss males across the five largest cities (Statistical Consulting Spring 2019)

Obesity has a negative effect on health, and its prevalence is rising every year. The purpose of this study was to investigate the spatial variation of excess weight, and possible ecological factors for this variation, in the five largest Swiss cities (Bern, Zurich, Basel, Lausanne, and Geneva) by analyzing data from Swiss conscripts.

Logistic regression was used to investigate the association between body mass index (BMI), professional status, area-based socio-economic position, occupational status, and food sales data across different postcode areas.

This analysis of the BMI of young Swiss conscripts helps to identify neighborhoods with high risk factors.


Lucas Kook: Anesthesia and aviation: Does added carbon dioxide in normobaric hypoxia have the same effect on cerebral oxygenation as in hypobaric hypoxia? (Statistical Consulting Spring 2019)

Complications during anesthesia may lead to severe apnoea and thus hypoxia, which may result in cerebral damage.

Recent findings conclude that inhaling increased carbon dioxide concentrations under hypobaric hypoxia substantially improves psycho-motor function and vigilance in air force pilots. It was hypothesized that this effect can be translated to normobaric conditions, i.e., patients undergoing surgery. When undergoing a major surgical procedure, most patients receive general anaesthesia. During anaesthesia, airways may become blocked after applying muscle-relaxing medication, and patients may lose the ability to breathe spontaneously. To avoid hypoxia and possible long-term impairments, patients are pre-oxygenated for at least 3 minutes before intubation.

In this cross-over trial the aim is to provide a proof-of-concept that inhaling 95% O2 + 5% CO2 prior to apnoea improves cerebral tissue oxygenation and delays the onset of hypoxia compared to 95% O2 + air.


Samuel Pawel: Histology of Hepatitis E Virus Infection in Liver Specimens (Statistical Consulting Fall 2018)

Globally, viral hepatitis infections are a major health challenge and the cause of millions of deaths each year. The hepatitis E virus (HEV) is one of the five types of viral hepatitis infections. In the past it was assumed that HEV occurs mainly epidemically in developing countries, due to lower sanitation standards. However, over the last decade, evidence has accumulated that the prevalence of HEV in European countries is much higher than initially thought.

Based on 33 histological criteria, three pathologists formulated five qualitatively different histological patterns of HEV infections, which they thought could be useful as tools for non-expert pathologists in characterizing HEV infections and identifying patients at risk for acute or chronic HEV.

This study aimed to answer the following questions:

1. Can the formulated histologic patterns be confirmed by a statistical analysis?

2. Is there an association between the patients’ immune status and assigned histologic pattern?

3. How good is the agreement between the three pathologists regarding the assessment of the specimens?


Tea Isler: Small sample considerations for anthelmintic resistance tests (Master Thesis Fall 2017/Spring 2018)

The most frequently used method to determine the efficacy of anthelmintic treatments is faecal examination. The faecal egg count reduction test (FECRT) is the most common technique to assess anthelmintic resistance of parasites in horses. There are some limitations to this test, and in past years new statistical models have been proposed to address them. However, all these models require a minimum of 10-15 animals per stock, which might often not be reached in practice.

In this thesis, the focus lies on the problem of very small sample sizes, using the Bayesian hierarchical model from Wang et al. (2017). More specifically, some of the options available in the eggCounts package (Wang et al., 2017), such as specifying the analytical sensitivity and the prior distributions, were used to reduce the inevitable uncertainty that arises when analyzing very small datasets.
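The classical FECRT summary that these models refine is the percentage reduction in mean egg counts before and after treatment. A toy sketch with invented counts (the Bayesian hierarchical model in eggCounts replaces this single point estimate with a full posterior, which is what makes very small herds tractable):

```python
def fecr(pre_counts, post_counts):
    """Faecal egg count reduction (%) from group means:
    FECR = 100 * (1 - mean(post) / mean(pre)).
    Counts below are invented for illustration."""
    mean = lambda xs: sum(xs) / len(xs)
    return 100.0 * (1.0 - mean(post_counts) / mean(pre_counts))

# hypothetical eggs-per-gram counts for four horses, pre and post treatment
print(round(fecr([500, 800, 350, 650], [10, 40, 0, 30]), 1))  # → 96.5
```

With only a handful of animals this point estimate is extremely noisy, which is precisely the motivation for the hierarchical Bayesian treatment studied in the thesis.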


Senait Tekle: Retinopathy of prematurity (Statistical Consulting Spring 2018)

Retinopathy of prematurity (ROP) is a potentially sight-threatening disease affecting prematurely born babies.

After mid-2015, an increased need for ROP treatment in children born at the University Hospital Zurich was noticed. This demonstrates the need to investigate the parameters and known risk factors for ROP in this period compared with a previous time period. In this retrospective study, the two following research questions were investigated:

  1. Is there a change over time in the probability of death for infants born and treated at the University Hospital Zurich between 7/2013 and 6/2017 with a gestational age of less than 29 weeks?
  2. What are the risk factors for ROP adjusted for the time trend?


Sascha Stutz: Validation of a triage system to forecast trauma team activation for severely injured patients and the role of the patients’ age (Statistical Consulting Spring 2018)

If patients suffer from severe injuries, the trauma team is to be activated. The decision whether to activate the trauma team is called triage and is made in the field by the team in the ambulance. Since pre-clinical evaluation of injury severity can be very difficult, at this point only limited information is available to make this decision.

The purpose of a triage system is to minimize the under-triage rate and the over-triage rate.


Sandra Siegfried: Effect of mental stress on glucose control of patients with type 1 and type 2 diabetes mellitus (Statistical Consulting Spring 2018)

Chronic stress has been associated with higher glycemic levels in diabetic patients (Lloyd et al., 1999), but it is still unclear whether exposure to short-term stress can also cause glucose level variations. A better understanding could inform treatment adjustments in stressful situations in the daily life of diabetes patients.

In this report, data collected in four recent studies in the field of diabetic care are reanalyzed from a different perspective, considering the individual differences of glucose concentrations in patients under stress testing and at baseline.


Charlotte Micheloud: Diagnosing E. alveolaris versus E. granulosus with IHC as a new gold standard and identification of six morphological markers as powerful diagnostic tests (Statistical Consulting Spring 2018)

E. alveolaris and E. granulosus are two types of Echinococcus, a parasitic tapeworm that can infect humans and cause serious harm. This study focuses on liver infection only.

The two types of tapeworm require two different treatments, and a false diagnosis has serious consequences, specifically under-treatment or over-treatment of the patient. This demonstrates the need for a powerful diagnostic test.

  1. Is the new diagnostic test (IHC) as powerful as the gold standard (PCR)?
  2. Which of the morphological markers can be considered as powerful diagnostic tools?
  3. Does chemotherapy prior to liver resection affect (i) the size of the lesion, (ii) the amount of calcification, (iii) the amount of inflammation, and (iv) the presence of necrosis?


Eleftheria Michalopoulou: MRI of the bony labyrinth of the inner ear for patients with Ménière's disease (Statistical Consulting Spring 2018)

Ménière’s disease (MD) is a chronic disorder of the inner ear labyrinth with heterogeneous clinical conditions, which makes it difficult to diagnose. In this exploratory retrospective study, two pathologies of the endolymphatic sac in patients with MD are considered: the degenerative and the hypoplastic pathology.

The aim of this study is to investigate whether there is a difference in the hearing loss course of the MD affected ears between the degenerative and the hypoplastic pathology types over time.


Peter Meili: Potential Risk Factors for Increased Numbers in Wildlife-Vehicle Collisions (Statistical Consulting Spring 2018)

There are many prevention methods for vehicle collisions with animals, with two main goals: either changing the driver behavior or changing the wildlife behavior.

However, a primary study by Benten et al. (2017) found that wildlife-warning reflectors are not an effective measure against collisions. The data from this study are used in this report for a secondary analysis. The aim is to find potential risk factors explaining increased numbers (compared to the mean) of wildlife-vehicle collisions.


Sandar Lim: Long-term effect of serum bicarbonate levels in Swiss kidney transplant patients (Statistical Consulting Fall 2017)

Chronic kidney disease (CKD) is a condition characterized by a gradual loss of kidney function over time, which deteriorates the body's ability to regulate waste. For patients with end-stage CKD, kidney failure necessitates dialysis or kidney transplant. Kidney transplant is the treatment of choice for end-stage CKD and has been proven superior to dialysis due to better survival, quality of life, and cost-effectiveness. Despite advances, there are still a number of patients with declining transplant function and graft loss after surgery, and the reasons are not completely understood.

Metabolic acidosis (MA), indicated by low serum bicarbonate levels, is a known risk factor for mortality and progressive renal dysfunction in CKD. A recent study by Park et al. (2017) showed that MA after transplantation was associated with increased risk of graft loss and patient death. This motivated us to test for any evidence of a relationship between serum bicarbonate, graft function, and mortality in Swiss kidney transplant recipients.

The objectives of this study were:
1. to determine the prevalence of MA in kidney transplant recipients at University Hospital Zürich from 2000 to 2015;
2. to determine the degree of association (if any) between serum bicarbonate, graft function, and patient survival.


Xinglu Liu: The relations between cholesterol and mortality in Switzerland (Statistical Consulting Fall 2017)

Cholesterol has become a commonly cited medical term over the last few decades. It is a waxy essential substance which is made in the body by the liver and is also found in some foods. It plays a vital role in how every cell works and is also needed to make vitamin D, some hormones, and bile for digestion. However, high blood cholesterol can increase the risk of heart and circulatory diseases. An estimated 4.4 million deaths worldwide are attributed to non-optimal cholesterol levels. High cholesterol has been linked to other related risk factors such as age, socioeconomic status, living habits, and culture. Cholesterol has also long been considered a risk factor for coronary heart disease (CHD), stroke, cancer, and cardiovascular disease (CVD).

This project was inspired by Menotti et al. (2008) and Verschuren et al. (1995), which present linear associations between cholesterol level and CHD mortality in several countries. The aims of this report were to:
1. explore the influence of cholesterol on mortality across the three major language regions (German, French and Italian) of Switzerland;
2. analyze the relationship between cholesterol level and cardiovascular (CVD) mortality;
3. analyze the relationship between cholesterol level and cancer mortality;
4. use these results to compare Switzerland to seven other countries.


Eleftherios Papagiannoulis: General vs Regional Anesthesia: A retrospective analysis (Statistical Consulting Fall 2017)

The use of anesthesia is required in many medical procedures. Forms of anesthesia have been used by humanity for thousands of years, but it is during the last centuries that this field has grown considerably and evolved into what we call modern anesthesia. When a patient undergoes an operation, depending on the clinical nature of the procedure, general or regional anesthesia may be applied. Consequently, it is of interest whether a specific method of anesthesia results in fewer complications and a better post-operative condition for the patient.

In this study, we compared patients under general and regional anesthesia, where regional cases corresponded to either Epidural, Spinal or both methods of anesthesia combined. The aim of this analysis was to investigate possible differences in complications and postoperative status across the two anesthesia types.


Silvano Sele: Agreement between different methods of ocular torsion measurements (Statistical Consulting Fall 2017)

Ocular alignment is achieved through vertical, horizontal or torsional rotation of the eyes (Lengwiler et al., 2017). Eye motility can be affected by trochlear nerve palsy, leading to vision impairments (e.g. double vision). Foveo-papillary angle (FPA) measurements from scanning laser ophthalmoscopy (SLO) images are compared to the commonly used standard (fundus images) in patients with trochlear nerve palsy and in healthy controls. SLO is readily available, quickly done, and does not require pupil dilation. The measurements may be even more accurate than the fundus measurements.

The SLO images can be acquired using a combination of camerafixation/lightfixation and eyetracker/notracker, resulting in: camerafixation-eyetracker (Camera-yT), camerafixation-notracker (Camera-nT), lightfixation-eyetracker (Light-yT) and lightfixation-notracker (Light-nT). Measurements acquired from the four combinations are compared to each other. The Camera-yT combination is expected to be the most objective of these four combinations.

This study aimed to answer the following questions:
1. How good is the agreement between FPA measurements from fundus images and FPA measurements from the four SLO techniques?
2. How good is the agreement between FPA measurements from the four SLO techniques? How reliable are measurements from fundus images? (Reproducibility coefficient)
3. How reliable are measurements in each of the four SLO techniques? (Repeatability and reproducibility coefficients)
4. How do the objective tests correlate with the subjective tests of torsion (Double Maddox rod and Harms tangent screen)?
5. Is there a FPA cut-off that allows for discrimination between patients and healthy controls?


Angelo Duo: Treatment of opiate withdrawal in neonates with morphine, phenobarbital or chlorpromazine (Statistical Consulting Spring 2017)

Neonatal Abstinence Syndrome (NAS) has increased in the last 20 years (Allegaert and van den Anker, 2016). It causes a significant amount of suffering to infants and families and also contributes to the use of neonatal beds and other resources (Zimmermann-Baer and Bucher, 2017). Several substances have been used to treat NAS, but until now a randomized, prospective, double-blind study has been missing (Zimmermann-Baer and Bucher, 2017). Here, the results of a multicenter, double-blind, parallel-group study with three arms in six hospitals in Switzerland are analyzed. Included were late preterm and term infants (34 gestational weeks or more) with withdrawal symptoms born to mothers who had taken opiates or methadone during their pregnancy. The exclusion criteria were diseases of the infant that would probably require an extended hospital stay, and a negative drug screening test in two meconium samples.

The effect of the three drugs on the following outcomes was investigated:

1. Duration of treatment
2. Duration of hospitalization
3. Suffering during withdrawal
4. Need for a second drug
5. Intensity of care
6. Association of drugs on the treatment or hospitalization time


Tea Isler: Effect of school based physical activity programme on physical fitness in primary schoolchildren: cluster randomised trial (Statistical Consulting Spring 2017)

The study is a multi-component physical activity intervention over one school year, which aimed to increase physical fitness, physical activity, and quality of life while decreasing body fat. The program attracted great interest among schools: 190 classes were eligible and consented to participate. Only 28 classes were then randomly selected, ending up with 540 participants from the 1st and 5th grades. Although the cluster was the class, randomisation to control and intervention groups was done at the school level to avoid contamination. Children from both the control and the intervention group had three hours of physical education, which is compulsory by law. The intervention arm had two additional physical education lessons, three to five short activity breaks during academic lessons, and 10 minutes of daily physical activity homework. To assess the difference between the two groups, the study analysed the following four outcomes:

• body fat (measured as sum of four skinfolds)
• physical fitness
• physical activity
• quality of life
Physical fitness is a global term describing the mastery of several attributes of fitness, including strength, endurance, power, and speed. To assess physical fitness, the original study performed seven tests, such as high jump and sit-ups. In the statistical analysis published in 2010, only the data from one of the seven tests was used.
In this project all the physical fitness tests are analysed in detail, checking which of them show a significant difference between the control and the intervention group at the end of the intervention program. Furthermore, new data from three years after the intervention are now available. My second research question is therefore whether a significant difference between control and intervention at the end of the intervention program is maintained after three years.
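In a cluster-randomised design like this one, correlation within classes reduces the information per child: the design effect 1 + (m − 1)·ICC inflates variances, so the effective sample size is the nominal one divided by that factor. A sketch using the trial's headline numbers but a purely hypothetical ICC:

```python
def effective_sample_size(n_total, cluster_size, icc):
    """Effective sample size under cluster randomisation:
    n / DE, with design effect DE = 1 + (m - 1) * ICC.
    The ICC below is hypothetical, not estimated from the trial."""
    de = 1 + (cluster_size - 1) * icc
    return n_total / de

# 540 children in 28 classes (~19 per class), assumed ICC of 0.05
print(round(effective_sample_size(540, 19, 0.05)))  # → 284
```

This is why the analysis must account for clustering (e.g. via mixed models): ignoring it would treat the 540 children as independent and overstate the precision of the group comparison.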


Sonja Hartnack: Mid back pain - prognostic factors and trajectories (Statistical Consulting Spring 2017)

Mid back pain (MBP) can affect individuals' quality of life. According to Leboeuf-Yde et al. (2009), MBP can be defined as pain localised between the 1st and the 12th thoracic vertebrae and the corresponding posterior aspect of the trunk. Thoracic spine pain is used as a synonym for MBP. Whereas the cervical and the lumbar spine regions are flexible, allowing for movements of the corresponding parts of the body, the thoracic region is less flexible. The thoracic vertebrae are linked to the ribs or costae, and together they build the thoracic cage, a stable environment that protects the vital organs.

MBP can occur after a trauma; for example, traffic collisions may result in so-called whiplash disorders encompassing a number of symptoms. MBP can be one of these symptoms (Johansson et al., 2015), possibly resulting in a long recovery process which may also be influenced by a range of biopsychosocial factors. Although the specific underlying cause of MBP often remains unknown, a number of aetiologies are known to cause MBP. Next to inflammatory disorders (e.g., spondyloarthropathies), structural changes (e.g., fractures, osteoporosis, scoliosis, Scheuermann's disease, hyperkyphosis, diffuse idiopathic skeletal hyperostosis), infections (e.g., spondylodiscitis) as well as cancer, cardiovascular diseases, and non-musculoskeletal pain have been described as causing MBP.

So far, studies on trajectories and risk factors for mid back pain are lacking (Johansson et al., 2017). Therefore, clinicians are still challenged when seeking information about prognosis of MBP. In contrast, studies exploring low back pain trajectories have identified distinct subgroups with (highly) varying profiles (Deyo et al., 2015; Dunn et al., 2006; Kongsted et al., 2015; Tamcan et al., 2010).

Balgrist University Hospital, Zurich, is a teaching hospital providing education to young professionals. The Policlinic for Chiropractic Medicine at Balgrist recently started to build up a patient data bank for different types of back pain. The aim of this consultancy project was to analyse the first available MBP patient data, providing answers to the following research questions:
1. To study which clinical factors at baseline could be predictors for an unfavourable outcome
2. To develop a model which can be used by the clinician to become aware of which patients are at risk for an unfavourable outcome
3. To describe patterns of recovery of MBP during the first three months of chiropractic treatment


Kelly Reeve: The effect of physician gender on prehospital analgesia in a physician staffed helicopter emergency medical service (Statistical Consulting Fall 2016)

As required by Swiss law, Rega, a non-profit helicopter emergency medical services (HEMS) organization, keeps a database of all its helicopter movements and links it to information on patient and mission characteristics. An internal quality control study from April to October 2011 supplemented these data with numerical rating scale (NRS) pain scores and thus enabled the study “Prehospital analgesia in a physician staffed helicopter emergency medical service” by Oberholzer et al. (2016). Among other objectives, this study aimed to identify patient, physician, and mission factors associated with moderate to high pain levels upon hospital admission and to identify reasons for persistent pain in patients. An interesting finding of this study was that patients experiencing sufficient pain management were significantly more likely to be treated by a male physician than by a female physician (75.9% compared to 61.7%, p < 0.0001).

The objective of this secondary analysis is to explore the relationship between physician gender and sufficient analgesia status found in Oberholzer et al. (2016). Three possible channels for this finding are studied:
1. Systematic differences in the way male and female physicians treat patients;
2. Systematic differences in the way patients respond to male and female physicians;
3. Systematic differences in cases seen by male and female physicians.
Exploring each of these pathways in the analysis will allow for a better understanding of the observed physician gender effect.
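The headline comparison (75.9% vs. 61.7%, p < 0.0001) is the kind of result a two-proportion z-test produces. A sketch with invented counts chosen to roughly match those percentages (the study's actual group sizes are not given here):

```python
import math
from statistics import NormalDist

def two_prop_ztest(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with pooled variance.
    Counts below are illustrative, not the study's data."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                     # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

z, pval = two_prop_ztest(380, 500, 308, 500)      # 76.0% vs 61.6%
print(round(z, 2), pval < 0.001)  # → 4.91 True
```

Of course, such a marginal test cannot distinguish the three pathways listed above; that requires adjusting for case mix and physician and patient characteristics, which is the point of the secondary analysis.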


Roman Flury: MApping of Locomotor Training (MALT) (Statistical Consulting Fall 2016)

This is an interim analysis of a clinical trial. What constitutes conventional therapy for patients with an incomplete spinal cord injury (SCI) in the four ability groups 100% Support, Wheelchair, Moderate Walker, and Good Walker? More specifically, how much therapy do patients receive during their rehabilitation program, and what exactly is done during the therapy sessions? The main focus is to provide a graphical interpretation of the clinical trial data.

Proportions of the different interventions were calculated and compared between patients. To take the different durations of the interventions into account, the proportions were weighted accordingly. Patients were further stratified by rehabilitation clinic and by the interventions they had received. The results are provided for the entire rehabilitation process of each patient, as well as for the first and last week only. To provide a graphical interpretation of the clinical trial data and to ease its accessibility, the analysis is available to the investigators in the trial as a Shiny web application. Shiny is a web application framework for R, which makes it possible to access the analysis in a web browser. The user can interactively change implemented parameters, and the output responds accordingly. The Shiny application not only provides a graphical interpretation of the matters discussed in this analysis but also gives an overview of other recorded variables related to the patients' therapy sessions. In order to protect the true identities of the patients, the names of the respective clinics are encrypted and no other information that would allow identifying a patient is available.


Muriel Buri: Parametric Bootstrap Inference for Transformation Models (Master Thesis Spring 2016)

The purpose of this master thesis is to use the parametric bootstrap resampling method for statistical inference on transformation models. Based on previous research completed by Hothorn et al. in 2015 (Hothorn, T., Möst, L., and Bühlmann, P. (2015). Most Likely Transformations. arXiv:1508.06749. Technical report, v2. URL arxiv.org/abs/1508.06749), this project utilizes the implementation of maximum-likelihood-based estimation for transformation models. The framework of conditional transformation models as well as the bootstrap resampling method is explained in depth within this thesis.

To illustrate these approaches in practice, the publicly available German Breast Cancer Study Group-2 (GBSG2) data set in R was used. A conditional transformation model estimates the conditional distribution of the response variable Y defined from the GBSG2 data set. The parametric bootstrap resampling method can then be applied to draw B new response variables Y1⋆, . . . , YB⋆ from the estimated conditional distribution function. This procedure yields B new conditional transformation models, which were subsequently used for the parametric bootstrap inference. We used log-likelihood ratio statistics as a likelihood-based measure for comparing each bootstrap-generated model to the original transformation model. Statistical inference for the bootstrap-generated transformation models was carried out in two ways: first, on the model parameters and their distribution; and second, on data-specific prediction functions, e.g. the density function, the empirical cumulative distribution function, the survivor function, etc. Furthermore, this research has shown that the degrees of freedom of the Chi-squared distributed log-likelihood ratio statistics are not as expected. This thesis does not definitively resolve the unexpected distribution of the log-likelihood ratio statistics; however, simulations are included that support the conjecture that a correction of the degrees of freedom is essential in instances of multiply occurring model coefficients. In conclusion, the results of this thesis advance the understanding of graphical inference for the model parameters of a conditional transformation model as well as inference for the conditional transformation model itself.
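The resampling scheme described above can be sketched in a few lines. The following Python example is illustrative only (the thesis itself works with transformation models in R): it replaces the transformation model by a simple Gaussian model, draws B bootstrap samples from the fitted distribution, refits, and checks that the log-likelihood ratio statistics behave like a Chi-squared variable with as many degrees of freedom as free parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

def loglik(x, mu, sigma):
    # Gaussian log-likelihood
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2))

x = rng.normal(2.0, 1.5, size=200)      # observed data
mu_hat, sd_hat = x.mean(), x.std()      # MLE of the fitted (original) model

B = 500
lr = np.empty(B)
for b in range(B):
    xb = rng.normal(mu_hat, sd_hat, size=x.size)   # draw Y*_b from the fitted model
    mu_b, sd_b = xb.mean(), xb.std()               # refit on the bootstrap sample
    # log-likelihood ratio of the refitted vs. the original parameters on Y*_b
    lr[b] = 2 * (loglik(xb, mu_b, sd_b) - loglik(xb, mu_hat, sd_hat))

# with 2 free parameters the LR statistic should be ~ chi^2 with 2 df (mean 2)
print(round(lr.mean(), 1))
```

With multiply occurring coefficients, as discussed above, the effective degrees of freedom deviate from this naive count, which is the phenomenon the thesis investigates.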


Verena Steffen: In-hospital blood glucose monitoring — a retrospective analysis of the year 2014 (Statistical Consulting Spring 2016)

Glucose is a sugar that circulates in the blood of humans and animals as blood glucose. In healthy organisms it functions as the main source of energy through aerobic respiration. It is obtained from external sources, mostly through the breakdown of carbohydrates from food. Its regulation is mainly ensured by the insulin response. Insulin is a hormone that allows the body's cells to absorb and use glucose. In diabetes mellitus the body is unable to regulate glucose levels due to a lack of insulin; therefore, high blood glucose levels are expected in diabetic patients. However, secondary factors such as pancreatitis, thyroid dysfunction, renal failure or liver disease also lead to increased blood glucose (hyperglycemia). Hypoglycemia, i.e. low blood glucose, is less frequent and usually results from insulin treatment or hypopituitarism.

Optimal blood glucose control is a challenge in hospitalized patients. One reason is that it is generally considered of secondary importance compared with the reason for hospitalization. Hyper- and hypoglycemia are associated with adverse clinical outcomes and higher mortality (Clement et al., 2004). In addition to high and low glucose values, high variability of glucose values is a major concern. Hirsch and Brownlee (2005) discussed whether monitoring blood glucose variability should become the gold standard in glycemic control. Moghissi et al. (2009) also report that high glucose variability leads to increased mortality in patients undergoing transplantation and in surgical patients.

In order to review glycemic control at the University Hospital of Bern, all blood glucose measurements from the year 2014 were analyzed retrospectively. Blood glucose was measured with two different methods: the GLUC3 clinical chemistry (ClinChem) test (Roche Diagnostics GmbH, Mannheim, 2014) and point-of-care (POC) testing. POC measurements are considered inaccurate compared to ClinChem results (Hoedemaekers et al., 2008). The main goal of this project is to gain more insight into the data, so descriptive analysis makes up most of the statistical analysis. Another goal is to find out whether there are differences in performance between the different wards of the hospital. Performance is evaluated by intra-patient glucose variability, measured as the coefficient of variation in %.
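The performance measure used here is simple to state. The following Python sketch (hypothetical values; the project's analysis was done in R on the hospital data) computes the intra-patient coefficient of variation in %:

```python
import numpy as np

# hypothetical glucose measurements (mmol/l) per patient
glucose = {
    "patient_A": [5.1, 6.0, 5.4, 7.2],
    "patient_B": [4.8, 9.5, 3.9, 11.0],   # more variable trajectory
}

def cv_percent(values):
    """Intra-patient coefficient of variation in %: sd / mean * 100."""
    v = np.asarray(values, dtype=float)
    return 100 * v.std(ddof=1) / v.mean()

for pid, vals in glucose.items():
    print(pid, round(cv_percent(vals), 1))
```

Per-ward performance would then be summarized by aggregating these per-patient values within each ward.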


Thimo Schuster: Preoperative Predictors of Shunt-dependent Hydrocephalus in Patients with Fetal Spina Bifida Repair

Research question: Identification and quantification of predictors for the probability to require a shunt operation within the first year of life in children who underwent prenatal spina bifida repair.

Data: The data consists of observations from 14 children who received a prenatal repair of spina bifida and who had to be at least 1 year old in order to be included in the study. For this analysis, only prenatally and preoperatively measured data was considered (see Table 1). All measurements were obtained using MRI techniques.

Challenges for the analysis:

  1. Five covariates were investigated in detail. For an unambiguous and straightforward analysis, the number of observations needs to be much larger than n = 14.

  2. With such a small n, the uncertainty in the estimators is often large, and reaching a conclusion can be difficult.

  3. The outcome variable Shunt is binary. For such an outcome, logistic regression has to be used, which requires a larger sample size than ordinary linear regression methods for a continuous outcome.

  4. Ventr3 exhibits complete separation with respect to the outcome. Ordinary logistic regression is not able to handle complete separation; therefore, this method cannot be used.

  5. Ties are present in some covariates.

Some of the conclusions:

  • Despite these problems, the results of this analysis show that Ventr3 is a good prenatal, preoperative predictor for requiring a shunt operation in children who underwent prenatal spina bifida repair; e.g., an ELRM model for Ventr3 results in an odds ratio estimate of 15.11, i.e. an increase of 1 mm in Ventr3 results in 15.11 times higher odds of requiring a shunt operation.

  • FOHR and Ventr are shown to be reasonable predictors for requiring a shunt. The results obtained show an ambiguous influence on the probability of requiring a shunt, but are pointing in the right direction.

  • CBD is a poor predictor for requiring a shunt. According to its ROC curve, it is only slightly better than a random predictor. All test results for this variable were far from showing an influence on the probability of requiring a shunt.

  • A pairs plot shows that all covariates correlate with the age of the fetus, which has to be considered when doing analysis.

  • A shortcoming of the study design is the lack of a control group. No data from children who were not operated was provided. Therefore, interpretation can only be made conditional on the fact that the fetus received fetal spina bifida repair.
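The complete-separation problem noted in challenge 4 can be demonstrated directly: for separated data, the logistic log-likelihood keeps increasing as the slope grows, so no finite maximum likelihood estimate exists. A minimal Python sketch with toy data (not the study data):

```python
import numpy as np

# toy data where the covariate perfectly separates the binary outcome
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def loglik(beta):
    # Bernoulli log-likelihood of a logistic model with slope beta (no intercept)
    p = 1 / (1 + np.exp(-beta * x))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# the log-likelihood increases monotonically in beta and approaches 0:
# the ordinary MLE diverges, which is why exact/penalized methods are needed
for b in (1, 5, 10):
    print(b, loglik(b))
```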


Burak Günhan: Network meta-analysis with integrated nested Laplace approximations (Master Thesis Fall 2015)

This thesis investigates how to perform inference with different approaches in meta-analysis models as well as in regression-type meta-analysis models called meta-regression. Chapter 1 contains an introduction to meta-analysis as well as different statistical models and estimation techniques for meta-analysis. In addition, a recent Bayesian inference method called integrated nested Laplace approximations (INLA) is used for estimation in meta-analysis. Chapter 2 motivates a broader type of meta-analysis called network meta-analysis (NMA). This chapter introduces two models for NMA, namely the Lumley and the Lu-Ades models, and shows how INLA applies to them. Chapter 3 starts with a discussion of the distinction between different types of inconsistency in the network, namely cycle inconsistency and design inconsistency. Then the design-by-treatment interaction model with random inconsistency parameters, the Jackson model, is introduced, and it is shown how INLA can be used as an inference method for this model. Chapter 3 also shows, in an application, that the Lu-Ades models depend on the treatment ordering while the Jackson model does not. All analyses were performed in the R programming language (R Core Team, 2015). Three different applications were used to demonstrate the use of INLA and other methods. The appendix includes the R code used to obtain the results in the main text and the BUGS/JAGS code to fit the consistency and Jackson models with MCMC. An R function, meta.inla, developed to implement the models introduced in Chapter 1 with INLA, is also given.
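INLA itself is an R package, so as a language-neutral illustration of the random-effects meta-analysis model underlying the thesis, the following Python sketch implements the classical DerSimonian-Laird (method-of-moments) estimator on hypothetical study effects:

```python
import numpy as np

# hypothetical study effect estimates (e.g. log odds ratios) and standard errors
yi = np.array([0.30, 0.10, 0.45, 0.22, 0.05])
se = np.array([0.12, 0.20, 0.15, 0.10, 0.25])

def dersimonian_laird(yi, se):
    """Random-effects meta-analysis: DL estimate of tau^2 and pooled effect."""
    w = 1 / se**2                          # inverse-variance (fixed-effect) weights
    mu_fe = np.sum(w * yi) / np.sum(w)     # fixed-effect pooled estimate
    q = np.sum(w * (yi - mu_fe)**2)        # Cochran's Q heterogeneity statistic
    k = len(yi)
    tau2 = max(0.0, (q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1 / (se**2 + tau2)              # random-effects weights
    mu_re = np.sum(w_re * yi) / np.sum(w_re)
    return mu_re, tau2

mu, tau2 = dersimonian_laird(yi, se)
print(round(mu, 3), round(tau2, 4))
```

A Bayesian treatment (via MCMC or INLA) replaces this point estimate of the between-study variance with a full posterior.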


Christos Polysopoulos: Analysis of Alvarado Scores in Acute Appendicitis and Construction of a New Alvarado Score Taking into Account Sonographic Criteria (Statistical Consulting Fall 2015)

Appendicitis is an inflammation of the appendix, which is a worm-shaped pouch attached to the cecum, the beginning of the large intestine. The appendix has no known function in the body, but it can become diseased. Appendicitis is a medical emergency, and if it is left untreated the appendix may rupture and cause a potentially fatal infection.

The causes of appendicitis are not well understood, but it is believed to occur as a result of one or more of these factors: an obstruction within the appendix, the development of an ulceration (an abnormal change in tissue accompanied by the death of cells) within the appendix, and the invasion of bacteria.

The Alvarado score has been developed as a clinical score to assist doctors in the diagnosis of appendicitis. The score consists of 6 clinical items and 2 laboratory measurements with a total of 10 points.

In practice, an additional ultrasound examination is performed to confirm the initial suspicion of acute appendicitis. The primary objective of this analysis is to determine whether the Alvarado score already in use can be upgraded by taking into account additional information from the sonographic results. The upgraded Alvarado score will be a tool assisting doctors in classifying patients with abdominal pain as cases or controls.


Zhongxing Zhang: Statistical analysis of the PSA bounce phenomenon in patients treated with permanent seed brachytherapy (Statistical Consulting Fall 2015)

Permanent seed brachytherapy (BT) (also known as low-dose-rate brachytherapy) is an established curative treatment option for low-grade prostate cancer. Permanent seed implants involve injecting approximately 100 radioactive seeds into the prostate gland. They give off their radiation at a low dose rate over several weeks or months, and the seeds then remain in the prostate gland permanently. Although first used in the 1970s, BT is still commonly used nowadays because it is convenient and has perceived lower levels of toxicity. The treatment outcomes of BT are similar to those of surgical interventions or other radiotherapeutic modalities such as external beam radiotherapy (EBRT).

Prostate-specific antigen (PSA) is an important biomarker in the diagnosis and follow-up of patients with prostate cancer after curative treatment. In contrast to radical prostatectomy, where PSA levels fall to undetectable levels shortly after surgery, in BT as in EBRT the PSA level tends to fall slowly over the course of two to five years. This is thought to be due to the slower process of tumor cell killing with radiotherapy. The PSA level may even fluctuate and increase temporarily without a clear reason, both in BT and EBRT, albeit more often in BT: in 30-40% of BT patients (Caloglu and Ciezki, 2009) and about 20% of EBRT patients (Pinkawa et al., 2010). This phenomenon is called PSA bounce, and it usually suggests a high risk of clinical recurrence of prostate cancer and biochemical failure (BF). Most studies choose an arbitrary increment in PSA level, such as ≥ 0.2 ng/ml, to define PSA bounce (Caloglu and Ciezki, 2009; Gaztanaga and Crook, 2013). However, the measurement error of PSA can be as large as 0.1 ng/ml (Kennedy et al., 2010), which may make the arbitrary threshold less reliable in some patients. Nevertheless, PSA bounce is currently still one of the most important references for clinicians to evaluate treatment efficacy. An increment in PSA level ≥ 2 ng/ml is considered the current standard definition (the Phoenix definition) of BF (Thompson et al., 2010).

The aim of this study is to model the PSA changes after BT therapy and try to provide a statistical approach to identify PSA bounces that may indicate BF or recurrence of cancer.
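As a purely illustrative sketch of the threshold-based bounce definition discussed above (a naive rule of thumb applied to hypothetical values, not the statistical approach developed in this study):

```python
def flag_bounce(psa, threshold=0.2):
    """Flag a PSA bounce: a rise of >= threshold ng/ml above the running
    minimum that is followed by a subsequent decline."""
    bounces = []
    running_min = psa[0]
    for i in range(1, len(psa) - 1):
        running_min = min(running_min, psa[i - 1])
        rise = psa[i] - running_min
        if rise >= threshold and psa[i + 1] < psa[i]:
            bounces.append((i, round(rise, 2)))
    return bounces

# hypothetical post-BT PSA trajectory (ng/ml): a transient bounce at visit 3
print(flag_bounce([4.0, 2.1, 1.0, 1.4, 0.8, 0.5]))   # → [(3, 0.4)]
```

The study's point is precisely that such fixed thresholds sit close to the measurement error of PSA, motivating a model-based definition instead.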


Muriel Buri: Health risks and social significance of BMI ≥ 35 kg/m2 in Switzerland (Statistical Consulting Spring 2015)

In the year 2011, the requirements for a gastric bypass surgery to be covered by the standard benefits of a basic health insurance in Switzerland were changed. The threshold of a body mass index (BMI) ≥ 40 kg/m2 was reduced to a new threshold of BMI ≥ 35 kg/m2. Additionally, the attending doctor no longer needs to prove comorbidity, i.e. an additional disease due to the excess weight.

According to Fäh (2011), the gastric bypass is a bariatric surgery that helps obese people to lose weight. In more detail, the stomach volume is irreversibly reduced to a so-called small pouch, which is then reconnected to the small intestine. In this way, the duodenum is no longer in direct contact with food, as the food passes through the small stomach pouch directly into the small intestine. Specialized intestinal cells are activated and release metabolically active hormones, which lead to an improvement of the metabolism. Furthermore, they cause a much earlier and stronger satiation, which supports the patient in losing weight. The gastric bypass leads to better results in terms of yo-yo dieting than the adjustable gastric band. Many doctors now favour the gastric bypass because no foreign matter remains in the body of the patient, so the follow-up and recovery period is easier to handle. The guidelines introduced in 2011 for obtaining such a surgery as an aid to losing weight have been widely criticized: the set threshold of BMI ≥ 35 kg/m2 is contentious and opinions about it differ considerably.

The aim of this study is to examine the general health risks and social significance of a BMI ≥ 35 kg/m2. An advanced secondary study - not part of this report - aims to find a data-based BMI threshold. Such a threshold would no longer be arbitrary and therefore easier to comprehend and defend.
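The threshold itself is simple arithmetic; a minimal sketch with hypothetical values:

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m**2

# e.g. a hypothetical person: 110 kg at 1.75 m
value = bmi(110, 1.75)
print(round(value, 1), value >= 35)   # checks the >= 35 kg/m^2 threshold
```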


Natacha Bodenhausen: Subgroup analysis of the TOBY trial (Statistical Consulting Spring 2015)

Lack of oxygen shortly before or during birth may cause major brain dysfunction, called encephalopathy, which is manifested by lethargy, stupor or coma. About one quarter of babies suffering from perinatal asphyxial encephalopathy die and 40% have severe disability, incurring major costs for families and society. Several clinical studies have investigated hypothermia, i.e. cooling of the body to 33.5 degrees Celsius, as a treatment to improve neurological outcomes.

In this study, we analyse cranial ultrasound scans obtained for a subset of the patients from one of these randomized clinical trials: the TOBY trial (Total Body Hypothermia for Neonatal Encephalopathy Trial), which included 325 patients. Although the primary outcome of that trial, death and severe disability combined, was not improved by the cooling treatment, several secondary outcomes, including scores on the Mental Developmental Index and Psychomotor Developmental Index, were improved [2]. 246 patients were included in this substudy; 123 were assigned to undergo intensive care plus cooling and 123 to intensive care without cooling. Research questions: Are the cranial ultrasound scans predictive for the primary outcome? Is there a difference between the cooled and non-cooled groups?


Yuchen Ling: Does Erythropoietin Improve Secondary Outcomes in Very Preterm Infants? (Statistical Consulting Spring 2015)

Erythropoietin, a cytokine widely used for the treatment of anemia of prematurity, has been shown in many studies to have neuroprotective effects on the brain. Fauchère et al. (2008) showed that no adverse effects were observed with early administration of high-dose recombinant human erythropoietin (rhEpo) in very preterm infants. This enabled later studies to further investigate the treatment effect of rhEpo on other outcomes of interest. For this project, a prospective, multi-center, randomized, triple-blinded and placebo-controlled study was conducted in Switzerland.

Very preterm infants were recruited in Zurich, Aarau, Basel, Geneva and Chur from 01/2006 to 03/2012. Each patient received either recombinant human erythropoietin (rhEpo, 3000 U/kg) or an equivalent volume of placebo (NaCl 0.9%) at 3 hours, 12 to 18 hours, and 36 to 42 hours after birth. The primary outcome of this project was the mental development index (MDI); the treatment effect on MDI was already analyzed in a previous study. The aim of this project is to investigate the treatment effects on the secondary outcomes of very preterm infants, comparing the Epo-treated group with the placebo-treated group. The first study endpoint was 07/2014. Hence, data for this project were recorded at birth and at 24 months after birth. Moreover, two ways of analysis, intention-to-treat and per-protocol, are considered in this project.

The secondary outcomes of interest are
1. Growth at 24 months
2. Physical development index (PDI) at 24 months
3. Survival without disability at 24 months

The research question we are most interested in is whether the recombinant human erythropoietin treatment improves secondary outcomes in very preterm infants.


Monika Hebeisen: Estimation from Interval-censored Time-to-event Data: Method Comparison by Simulation based on GALLIUM Study for Follicular Lymphoma (Master Thesis Fall 2014)

The currently accepted approach to collecting time-to-event data, e.g. PFS, in oncology studies is to have patients show up regularly for assessments and, in case of no event, to censor them at their last assessment date without event. However, especially in indolent diseases, the patient only has assessments at quite large intervals, e.g. every 6 months. If a patient progresses asymptomatically, this might happen anywhere in these 6 months, i.e. potentially much earlier than the date of the next assessment, to which the event is then assigned. So in fact we are not looking at right-censored data but at interval-censored data: we only know an interval in which the event happened. The high-level goals of this project are to
• describe methods to estimate survival functions and hazard ratios from interval-censored data,
• summarize what R packages are available or to implement methods which are not available,
• run a simulation study mimicking the schedule and actual visit pattern of a typical lymphoma study with induction and maintenance,
• apply the different methods to estimate survival function and hazard ratio for the PRIMA and/or GALLIUM study, to get an idea about the differences for the different methods.
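The bias that motivates the project can be demonstrated with a small simulation: if events detected at 6-monthly visits are assigned to the next assessment date, event times are systematically overestimated. A Python sketch under assumed exponential event times (illustrative only; the project's simulation mimics a real lymphoma visit schedule):

```python
import numpy as np

rng = np.random.default_rng(7)

true_t = rng.exponential(scale=24.0, size=10_000)   # true event times (months)
visit_gap = 6.0

# the event is only detected at the next scheduled assessment, so analysing
# the data as right-censored assigns it to the interval's right endpoint
recorded_t = np.ceil(true_t / visit_gap) * visit_gap

bias = recorded_t.mean() - true_t.mean()
print(round(bias, 2))   # systematic overestimate of roughly half a visit gap
```

Interval-censored methods (e.g. the Turnbull estimator) avoid this by working with the interval itself rather than its right endpoint.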


Jipcy Amador: Disease Progression and Survival in ALS Patients (Master Thesis Spring 2014)

Objectives: To evaluate the stability of the prediction model for Amyotrophic Lateral Sclerosis (ALS) disease progression proposed by researchers Hothorn T. and Jung H; to identify relevant factors for the prediction of disease course. Additionally, to determine the effect of the predictor factors on the survival time of ALS patients.

Methods: Both prediction models for ALS disease progression and survival time were obtained with a conditional random forest algorithm. Multivariate Cox proportional hazards models were used to determine the prognostic value of predictor factors. The results were based on 4838 patients from the PRO-ACT (Pooled Resource Open-Access ALS Clinical Trials) database, characterized by variables measured at baseline (demographics, family and medical history) and variables measured at follow-up examinations (the ALS functional rating scale, ALSFRS, and its modified version ALSFRS-R; physiological parameters; and biomarkers). The target predictor variable, and some prognostic factors based on the ALSFRS, were computed using mixed-effects models.

Results: The risk of death increases when the disease progression indicators show more severity in the ALS symptoms, and the disease course is faster during the first three months after trial onset. Bulbar rather than limb onset is associated with faster disease progression and a higher risk of death. The risk of death increases when a patient suffers from cramps and speech deterioration as symptoms of previously diagnosed ALS. A decline in lung function is also associated with faster disease progression and worsening survival time. The biomarkers bicarbonate, phosphorus and calcium are significant prognostic factors for the survival time of a patient. In line with previous findings, the biomarker creatinine has an impact on disease course and survival. The risk of death increases with increases in pulse rate, respiratory rate, phosphorus, bicarbonate, creatinine, and potassium, and with decreases in weight, calcium, creatine kinase, and protein.

Conclusions: The stability of the proposed prediction model for disease progression is a step forward in establishing a model with high performance. This aids in determining whether a patient will experience slow or fast disease progression, especially in the early stages of the disease. The following patient characteristics have a relevant effect on both disease progression and survival time: the ALSFRS slope up to 92 days (the most important variable); the ALSFRS-item-specific slopes, especially loss of useful speech for prediction and difficulties climbing stairs for survival; the ALSFRS range (a variability indicator); age; bulbar rather than limb onset; the physiological parameter vital capacity; and the biomarkers bicarbonate, phosphorus, and creatinine.


Katarina Matthes: A comparison of count-based and assembly-based methods for differential splice detection (Master Thesis Spring 2014)

Detection of differential isoform usage between experimental conditions, such as control versus treatment or disease versus healthy, is of significant biomedical relevance, especially given that splicing patterns are aberrant in many diseases. In particular, knowledge of pathological alternative splicing may allow the development of new treatments. To date, several methods to detect differential splicing from RNA-seq data have been published. These methods differ in the way they count or assemble RNA-seq reads and in their test statistics. To the best of our knowledge, there is no comprehensive comparison of these methods in terms of their detection performance. Using simulations as a constructed "truth", the differential splicing detection performance of the best-known methods was assessed at the gene level. DEXSeq, edgeR and voomex are compared using both exon- and event-based counts, from DEXSeq and MISO, respectively. In addition, Cuffdiff2 was selected as an assembly-based method. It is shown that methods using event counts performed best in terms of their true positive rate, meaning they could detect the most truly differentially spliced genes. In addition, methods using event counts controlled the false discovery rate (FDR) better than exon-based methods. Moreover, voomex controls the FDR best on average, while DEXSeq seems to have the most difficulties controlling the FDR. Furthermore, the simulation performance was split into several groups by expression level and number of transcripts. Not surprisingly, all methods exhibit better performance for more highly expressed genes. Moreover, Cuffdiff2 appears to be somewhat conservative and could not detect differentially spliced genes containing a larger number of transcripts as well as the other methods could. All in all, it is demonstrated that event-based counts yield notably better results than exon counts and Cuffdiff2.
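Since the methods are compared on how well they control the false discovery rate, it may help to recall the standard Benjamini-Hochberg step-up procedure commonly used for FDR control in genomics. A Python sketch with toy p-values (not data from the thesis):

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    m = len(p)
    thresh = alpha * np.arange(1, m + 1) / m     # alpha * i / m for sorted p
    below = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()           # largest i with p_(i) <= alpha*i/m
        reject[order[: k + 1]] = True            # reject all hypotheses up to rank k
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(benjamini_hochberg(pvals, alpha=0.05))
```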


Lukas Bütikofer: Influence of Ungulate Browsing on Tree Regeneration (Master Thesis Spring 2014)

Ungulate browsing and its effect on tree regeneration was analyzed based on long-term data collected in the north-western part of Switzerland. Fixed areas installed in forest stands with a large variety of characteristics have been monitored for over ten years for both terminal shoot browsing and seedling numbers. The data is unique in terms of consistency of the sampling areas, and regularity and length of monitoring. We used a two-step modeling procedure to address the variables influencing ungulate browsing and parameters of tree regeneration such as seedling density, height composition and its temporal change, and species composition. Parameters were first estimated in a stratum-specific manner based on generalized linear mixed models (GLMM) that make use of the correlation structure within the data, and then entered random forest models together with a large number of covariates and – in case of the regeneration parameters – browsing probability. The modeling procedure was remarkably consistent, reliable and stable in reaction to various modifications. A very convincing correlation between browsing probability and roe deer density was discovered, suggesting that roe deer density may be the main factor that determines the extent of browsing. This effect was not modified by other variables such as forest stand characteristics, seedling height or snow depth.

Increasing browsing probability was associated with decreasing seedling density for all analyzed tree species in a non-linear manner. Already at low browsing levels, seedling density dropped dramatically, to almost half for some species, but did not change much thereafter. The severity of the effect was clearly species-dependent and might also be height-dependent. In particular, Abies alba and Fagus sylvatica were susceptible, whereas Picea abies, Acer pseudoplatanus, Fraxinus excelsior and Sorbus aucuparia were much more resilient. Taller seedlings were relatively more affected, indicating that a direct impact of browsing on mortality cannot be the sole process responsible for seedling loss. Indirect effects via competition between seedlings and a reduction in growth rate are likely.

No other variables modifying the effect of browsing on seedling density were found; in particular, forest density and forest type did not influence the browsing-associated loss in seedling density. Little evidence for a browsing-dependent change in seedling height and species composition was detected. However, ungulate browsing was associated with a decreased proportion of the tallest compared to the smallest seedlings and might contribute to a relative reduction of Abies alba with increasing seedling height. The time-dependent change in the proportion of taller seedlings – an approximated species growth rate – was also found to decrease in association with browsing at already low levels, very similarly to seedling density. Overall, terminal shoot browsing exerted a variety of effects on tree regeneration, most of which were already observed at low browsing levels. Very high compared to moderate levels of ungulate browsing did not significantly alter the situation.


Fiona Huang: Evaluation of CD4 and CD8 as progression markers for untreated and treated HIV-1 infection (Master Thesis Fall 2013)

This thesis discusses the influence of lagged CD8 lymphocyte values on CD4 lymphocytes during HIV disease progression. The thesis is part of a Swiss HIV Cohort Study (SHCS) project which investigates the role of CD8 lymphocytes in HIV-infected persons. CD8 lymphocytes have been suggested as a possible marker for HIV disease progression, which is reflected by CD4 cell counts. The data collected by the SHCS offers the possibility to analyse such a hypothesis, as the SHCS systematically collects CD8 lymphocyte values along with CD4 lymphocytes and RNA viral load measurements.

Using linear mixed models, the influence of a combination of lagged CD4 and CD8 lymphocyte subtypes, of the time span between two observations, and of additional confounders on CD4 lymphocyte counts is investigated. Moreover, lagged RNA viral load measurements are considered.

This thesis analyses the SHCS data in a novel way, addressing the question of the influence of CD8 lymphocytes on CD4 lymphocytes during untreated and HAART-treated HIV disease. New insights regarding the dynamics of lymphocyte subtypes in HIV-infected individuals are provided by applying the standard method of linear mixed models in a longitudinal data context.


Isaac Gravestock: Bayesian Tree Models Priors and Posterior Approximations (Master Thesis Fall 2013)

A tree model is a recursive partitioning scheme that can represent a flow chart or decision-making process. An example is a clinical decision process regarding a patient's risk of disease. A tree is referred to as a classification tree if the values on the leaves are categories, and a regression tree if the values are numerical. As a prediction model, observations are assigned a value by starting at the root of the tree and progressing downwards along the branches. They descend either to the left or right depending on the decisions of the splitting rules. When the observation reaches a leaf, it is assigned the value of that leaf. For example, a patient aged 41 with no family history of a certain disease would be classified as low risk using the respective tree.

The most important advance in tree models came with Breiman et al.’s 1984 book which proposed the Classification and Regression Trees (CART) algorithm. This recursive partitioning algorithm grows a tree by iteratively searching for the optimal local splitting rule and adding it to the tree. Once a large tree is constructed the tree is “pruned”, removing terminal splitting nodes that do not improve the fit of the model above a given threshold.

Tree model algorithms such as CART have been found to be quite unstable: small changes to the training data can change the fitted model considerably. One way to handle this is to use ensemble methods such as random forests, bagging and boosting. These methods fit many trees and combine the predictions in various ways. By doing so their predictive performance is increased, but they lose a key benefit of the tree model: its simplicity.
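One step of the CART-style greedy search described above, the exhaustive hunt for a locally optimal splitting rule, can be sketched compactly. The following Python example uses toy data and Gini impurity as the classification split criterion (an illustrative sketch, not the thesis's Bayesian machinery):

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(x, y):
    """Exhaustive search for the threshold minimising the weighted Gini
    impurity of the two child nodes (one greedy CART step)."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# toy data: outcome 1 (high risk) only for older patients
age = np.array([25, 30, 38, 41, 55, 62, 70])
risk = np.array([0, 0, 0, 0, 1, 1, 1])
print(best_split(age, risk))   # splits cleanly between 41 and 55
```

A Bayesian tree model instead places a prior over such trees and explores the posterior, rather than committing to the single greedy split at each node.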


Sarah Thommen: Analysis of Thyroid Cancer Data (Statistical Consulting Spring 2013)

To treat thyroid cancer by surgery, thyroidectomy is the common procedure. Additionally, the lymph nodes located near the thyroid gland can be removed as well. These additional dissections are called central neck dissection (CND) or lateral neck dissection (ND). There is no standard treatment that is generally performed; the choice of surgery depends on the surgeon and is controversially discussed. The aim of the study is to find out whether performing CND as a standard treatment would be reasonable. The research questions are:

1. Does central neck dissection have an effect on the overall survival (OS) of patients diagnosed with thyroid cancer?

2. Does central neck dissection have an effect on the disease-free survival (DFS) of patients diagnosed with thyroid cancer?

To test for statistically significant differences between two groups, Fisher's exact test for count data, the Wilcoxon rank sum test, the Kruskal-Wallis test or Student's t-test was performed, as appropriate. Survival was visualized with Kaplan-Meier plots.
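A Kaplan-Meier plot is a step function of the estimated survival probability: at each observed event time, the current estimate is multiplied by (1 - deaths/at-risk). A minimal pure-Python sketch of the estimator follows; the data are invented for illustration:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the survival curve.
    times:  follow-up time for each patient
    events: 1 if the event (death/relapse) was observed, 0 if censored
    Returns a list of (event_time, survival_probability) steps."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv, curve, i = 1.0, [], 0
    while i < len(order):
        t, deaths, n_at_t = times[order[i]], 0, 0
        # count deaths and removals at this time point
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            n_at_t += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= n_at_t
    return curve

# Toy data: events at t=2 and t=5, censoring at t=3 and t=6
print(kaplan_meier([2, 3, 5, 6], [1, 0, 1, 0]))  # -> [(2, 0.75), (5, 0.375)]
```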

The cause of death was very often not the cancer (only 3 of the 20 patients who died, died due to cancer). It is known that attributing deaths (entirely) to cancer or not to cancer is very problematic. Therefore, relative survival analysis was also performed for overall survival. With this method, the observed survival of the patients included in the dataset is compared to the survival that would have been expected for those patients if they had no cancer. The expected survival is computed from a healthy reference population. The aim is to measure the difference in survival for the study population due to the cancer. For the relative survival analysis, the data was matched on sex, year of death and age at diagnosis; the reference population mortality rates were taken from the population of Switzerland.

According to the analysis above, on the one hand, central neck dissection does not have a statistically significant effect on the overall survival of patients suffering from thyroid cancer. On the other hand, there is some evidence that patients receiving central neck dissection and/or lateral neck dissection in addition to the thyroidectomy are at higher risk of relapse or death.

I would conclude that performing CND as a standard treatment is not recommendable: there is not enough evidence for an effect on survival, and if there is an effect, it is rather negative than positive. Moreover, the risk and harm caused by an additional surgery are probably worse for the patient. The estimates we obtained are probably biased due to selection bias, allocation bias and assessment bias, since the data was obtained from different hospitals, in different departments, with different selection procedures and surgeons involved. In addition, the whole study was not blinded.


Fiona Huang: Analysis of Heart Insufficiency Data (Statistical Consulting Spring 2013)

For patients under long-term therapy for heart insufficiency, physicians observed that blood pressure decreased over time. They want to know whether there is an association between

1. Blood pressure (BP) level and patient mortality

2. Reduction rate of blood pressure and patient mortality

Cox proportional hazards models with time-dependent covariates were fitted to the data, and the resulting estimates lead to the following conclusions. It seems to be quite important to keep blood pressure relatively stable: patients with a stable blood pressure level had a low death rate, while those with a very variable blood pressure level were more likely to die, regardless of whether the blood pressure increased or decreased. Here, we considered patients with a change rate of SBP in (-0.0276, 0.0277] and a change rate of DBP in (-0.0201, 0.0189] as patients with a stable blood pressure level. Blood pressure is related to the heart's ability to pump: if ∆BP decreased, the heart's ability to pump decreased, and a low ∆BP level was associated with a high mortality rate. Further analysis suggests that it is better to raise patients' ∆BP very gently, so that their ∆BP reaches a higher level while their blood pressure remains stable.
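For intuition, a Cox model is fitted by maximizing the partial likelihood: each observed event contributes the event subject's risk score relative to everyone still at risk at that time. Below is a minimal sketch for a single fixed covariate (with time-dependent covariates, x[j] would be replaced by the covariate value at the event time); the data and names are purely illustrative:

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Cox partial log-likelihood for one covariate, assuming no tied event times."""
    ll = 0.0
    for i, t in enumerate(times):
        if events[i]:                        # only events contribute terms
            # risk set: everyone still under observation at time t
            denom = sum(math.exp(beta * x[j])
                        for j in range(len(times)) if times[j] >= t)
            ll += beta * x[i] - math.log(denom)
    return ll

# At beta = 0 all subjects are equally at risk, so each event contributes
# minus the log of the size of its risk set.
print(cox_partial_loglik(0.0, [1, 2, 3], [1, 1, 1], [0.5, 1.0, 1.5]))
```

Maximizing this function over beta (e.g. by Newton's method) yields the log hazard ratio estimate.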


Isaac Gravestock: Analysis of Hand Surgery Data (Statistical Consulting Spring 2013)

In this retrospective study of the rehabilitation of patients who have undergone tendon repair surgery at the University Hospital Zurich between 2006 and 2011, the main objective is to compare two rehabilitation protocols: Kleinert and Controlled Active Motion (CAM). Kleinert was the predominant protocol in use up to 2011, with others being used as appropriate. Rehabilitation using CAM was introduced in 2011. The endpoints used to compare the treatments are the rate of complications during rehabilitation and active range of motion. The complications considered are infection, rupture and adhesion of the tendon. The range of motion measure used is total active motion (TAM), the sum of the angles through which the patient can actively flex the three finger joints.

A naive analysis of TAM measurements under the two rehabilitation protocols was conducted using a t-test with unequal variances. This analysis assumes that all fingers are independent; however, since some patients had surgery on multiple fingers, this assumption is violated and the results are biased. To remove this assumption, generalised estimating equations (GEE) with an exchangeable correlation structure were used to fit linear models for TAM at each time point. Possible confounders were identified using F-tests of univariate GEE linear models, and further GEE linear models were fitted to adjust for them.
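The "t-test with unequal variances" is Welch's test; its statistic and the Welch-Satterthwaite degrees of freedom can be computed directly. A self-contained sketch with made-up data (looking the statistic up in a t distribution to obtain a p-value is omitted):

```python
import math

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    v1 = sum((xi - m1) ** 2 for xi in x) / (n1 - 1)   # sample variances
    v2 = sum((yi - m2) ** 2 for yi in y) / (n2 - 1)
    se2 = v1 / n1 + v2 / n2                           # squared standard error
    t = (m1 - m2) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

print(welch_t([1, 2, 3], [2, 4, 6]))
```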


Sih-Jing Liao: Fractional Polynomials with Test-based Bayes Factors (Master Thesis Fall 2012/Spring 2013)

In a regression model, the relationship between explanatory and response variables is often nonlinear. Conventional polynomials are frequently used to describe such curved relationships, yet the choice of the polynomial order is an important issue, as it dominates the quality of the fit: a low order restricts the possible model shapes, whereas a high order may lead to overfitting. To alleviate the effect of power selection, fractional polynomials were proposed for regression, as they provide a general family of parametric models derived from a set of predefined powers. A framework for multivariable analysis based on fractional polynomials for describing continuous variables, known as multivariable fractional polynomials, was devised in the classical setting and has also been implemented in a Bayesian paradigm. With Bayesian model selection, however, the choice of the prior model and the calculation of the marginal likelihood are recurring issues, and these have limited the combination of the approach with multivariable fractional polynomials. To avoid the complex computation of the marginal likelihood and reduce the subjectivity associated with prior selection, an objective Bayesian model selection procedure is studied in this work, in which a novel method for calculating the Bayes factor is introduced, along with four different ways to determine the prior parameter.
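For illustration, the conventional fractional-polynomial power set and the associated transformations, including the standard conventions that power 0 means log x and that a repeated power gains an extra log-x factor, can be written as:

```python
import math

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)   # the conventional FP power set

def fp_transform(x, p):
    """Single fractional-polynomial term x^(p), with the convention x^(0) = log x.
    Requires x > 0 (in practice x is shifted/scaled to be positive)."""
    return math.log(x) if p == 0 else x ** p

def fp2_basis(x, p1, p2):
    """Two-term (FP2) basis; a repeated power uses the x^p * log x convention."""
    if p1 == p2:
        return (fp_transform(x, p1), fp_transform(x, p1) * math.log(x))
    return (fp_transform(x, p1), fp_transform(x, p2))
```

An FP2 model then regresses the response on the two basis terms, and model selection searches over the 8 powers (FP1) or the 36 power pairs (FP2).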


Florian Gerber: Outline of a Tiling Correction for a SIMS Experiment (Statistical Consulting Spring 2012)

The development of an organism requires perfectly controlled and organized tissue differentiation. Florin Marty and Erich Brunner investigate this in the wing development of the fruit fly Drosophila melanogaster. In this process, the emergence of compartments regulates many key processes such as growth and metamorphosis. Although much is known about the genes involved, there is still a lack of knowledge about the executing molecules. Using SIMS (Secondary Ion Mass Spectrometry), biologists are trying to identify such key molecules. Since SIMS is a relatively recent technology, there are still unaddressed issues, one of which is examined in this work. If SIMS experiments are developed further, they may also be useful for clinical diagnostics, e.g. of tumors.

In our case, the SIMS experiment is carried out at the AMOLF institute in Amsterdam. The resulting data has the following structure: for each location (x, y), the number of particles n with a certain m/z-ratio (mass-to-charge ratio) p is measured. The amount of raw data produced for one tissue measurement is about 1 GB. For (biological) interpretation, the data is processed further, including a final PCA step. The present project is centered around the following research questions:

  1. Is it possible to correct or adjust SIMS data (raw, level one or level two) such that the principal components do not exhibit a tiling pattern?
  2. How does the tiling mechanism vary over an individual chip and over different m/z-ratios?
  3. Can we propose statistical approaches to correct for the tiling structure?
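As a sketch of the final PCA step on such location-by-m/z count data, PCA can be computed via the singular value decomposition of the column-centred matrix. The matrix below is randomly generated stand-in data, not real SIMS output:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for processed SIMS data: rows are pixel locations
# (x, y), columns are particle counts for selected m/z-ratios.
counts = rng.poisson(lam=5.0, size=(100, 6)).astype(float)

# PCA via SVD of the column-centred data matrix
centred = counts - counts.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
scores = U * s                       # principal component scores per location
explained = s**2 / np.sum(s**2)      # share of variance per component

print(scores.shape, explained.round(3))
```

A tiling artefact would show up here as a spatially periodic pattern when a column of `scores` is plotted against the (x, y) locations.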