MATH3001 Project in Mathematics
Project: Survival Analysis
Summary
Survival analysis is concerned with data on how long it takes for a certain event to occur, for example: death of a patient after onset of a disease; breakdown of a home appliance after purchase; making a claim after taking out a car insurance policy; finding a job after graduation from university. Appropriate statistical methods for analysing and modelling survival data are of critical importance in medical studies and actuarial work, as well as in other settings. The most characteristic feature of survival data, which makes the statistical analysis particularly challenging and interesting, is that some observations may be incomplete (censored): for instance, it may only be known that an individual was alive up to a certain time, because of loss to follow-up or simply because the study ended.
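As a small illustration of censoring, right-censored observations are commonly recorded as a time paired with an event indicator. The sketch below uses made-up follow-up records and hypothetical variable names (nothing here comes from a real data set); it only shows the data layout and why censored times cannot simply be treated as event times.

```python
# Hypothetical follow-up records: (time in months, event_observed).
# event_observed = True  -> the event (e.g. death) was seen at that time
# event_observed = False -> right-censored: still event-free at that time
records = [(5, True), (8, False), (12, True), (12, False), (20, False)]

n_events = sum(1 for _, observed in records if observed)
n_censored = len(records) - n_events

# Averaging all recorded times as if they were event times ignores the
# fact that censored individuals survived beyond their recorded time,
# so this naive figure is only a lower bound on the mean event time.
naive_mean = sum(t for t, _ in records) / len(records)

print(n_events, n_censored, naive_mean)
```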
Pre-requisites
MATH2715 Statistical Methods or MATH2735 Statistical Modelling or equivalent. This project is not available to students who have taken MATH2775 Survival Analysis.
Objectives
The aim of this project is to study in some depth statistical methods for the analysis of time-to-event data, with applications to medical, actuarial and industrial practice. Students will also acquire skills in analysing survival data sets using standard routines in the statistical software package R.
In its first half, the project will cover the basic principles and techniques of statistical analysis of survival data, including a selection from the following topics:
● Nonparametric estimation: the Kaplan–Meier and Nelson–Aalen estimators.
● Comparison of survival curves; the log-rank tests.
● Estimation of the variance; Greenwood’s formula.
● Parametric survival models.
● Maximum likelihood estimation in the presence of censoring.
● The Cox proportional hazards model.
● The accelerated failure time (AFT) model.
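To make the first and third topics concrete, here is a minimal sketch of the Kaplan–Meier (product-limit) estimate with a standard error from Greenwood's formula. At each distinct event time the survival estimate is multiplied by (1 - d/n), where d is the number of events and n the number still at risk, and Greenwood's sum accumulates d / (n (n - d)). The function name, argument names and the toy data are assumptions made for illustration, and individuals censored at an event time are counted as still at risk at that time (the usual convention).

```python
from math import sqrt

def kaplan_meier(times, events):
    """Return [(t, S(t), se)] at each distinct event time.

    times  -- observed follow-up times
    events -- True if the event occurred at that time, False if right-censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)      # number still under observation
    surv = 1.0               # running product-limit estimate
    greenwood_sum = 0.0      # running sum of d / (n * (n - d))
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all observations tied at time t (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            # Greenwood: Var[S(t)] is approximately S(t)^2 * greenwood_sum
            se = surv * sqrt(greenwood_sum)
            out.append((t, surv, se))
        at_risk -= removed
    return out

# Toy data (made up): five subjects, two observed events.
result = kaplan_meier([5, 8, 12, 12, 20], [True, False, True, False, False])
for t, s, se in result:
    print(f"t={t}: S={s:.4f}, se={se:.4f}")
```

In practice the project uses R, where the `survfit` function in the `survival` package computes the same estimate; the hand-rolled version above is only meant to show the arithmetic.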
Further topics, to be included in the second half of the project, depend on each student’s individual choices and preferences, and may be one or more of the following:
1. Critical review of the Kaplan–Meier estimator of the survival function.
2. Point processes approach to survival modelling.
3. Interval censoring and justification of actuarial (life-table) estimators.
4. Model building and selection of significant explanatory variables.
5. Justification of the log-likelihood ratio tests.
6. Critical review of the Cox regression model; partial likelihood.
7. Model checking in parametric models using residuals.
8. Model checking in the Cox regression model; Cox–Snell residuals.
9. Testing the assumption of proportional hazards.
10. Proportional odds models.
11. Time-dependent variables.
12. Frailty models.
13. Non-proportional hazards and institutional comparisons.
14. Competing risks.
15. Multiple events and event history modelling.
16. Dependent (informative) censoring.
Bibliography
Books
1. D. Collett, Modelling Survival Data in Medical Research, Texts in Statistical Science, Chapman & Hall / CRC, Boca Raton – London, 1994 (1st ed.), 2003 (2nd ed.), 2015 (3rd ed.).
2. D. R. Cox and D. Oakes, Analysis of Survival Data, Chapman & Hall, London, 1984.
3. R. C. Elandt-Johnson and N. L. Johnson, Survival Models and Data Analysis, John Wiley & Sons, New York, 1980.
4. J. D. Kalbfleisch and R. L. Prentice, The Statistical Analysis of Failure Time Data, Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons, New York, 1980 (1st ed.), 2002 (2nd ed.).
5. G. Broström, Event History Analysis with R, Chapman & Hall/CRC The R Series, CRC Press, Boca Raton, FL, 2012.
6. M. Mills, Introducing Survival and Event History Analysis, SAGE Publ., London, 2011.
Selected Papers
7. B. Altshuler, Theory for the measurement of competing risks in animal experiments, Mathematical Biosciences 6 (1970), 1–11. (https://doi.org/10.1016/0025-5564(70)90052-0)
8. J. P. Costella, A simple alternative to Kaplan–Meier for survival curves, Preprint (2010), available at https://johncostella.com/physics/survival.pdf.
9. D. R. Cox, Regression models and life-tables, Journal of the Royal Statistical Society, Series B 34 (1972), 187–220. (https://doi.org/10.2307/2985181)
10. P. Hougaard, Fundamentals of survival data, Biometrics 55 (1999), 13–22. (https://doi.org/10.1111/j.0006-341X.1999.00013.x)
11. E. L. Kaplan and P. Meier, Nonparametric estimation from incomplete observations, Journal of the American Statistical Association 53 (1958), 457–481. (https://doi.org/10.1080/01621459.1958.10501452)
12. R. Peto and J. Peto, Asymptotically efficient rank invariant test procedures, Journal of the Royal Statistical Society, Series A 135 (1972), 185–207. (https://doi.org/10.2307/2344317)