MAS3916 : Discrete Stochastic Modelling & Survival Analysis (Inactive)
- Inactive for Year: 2024/25
- Module Leader(s): Dr Phil Ansell
- Lecturer: Professor Robin Henderson
- Owning School: Mathematics, Statistics and Physics
- Teaching Location: Newcastle City Campus
Semesters
Your programme is made up of credits; the total differs from programme to programme.
Credit type | Value |
---|---|
Semester 1 Credit Value | 10 |
Semester 2 Credit Value | 10 |
ECTS Credits (European Credit Transfer System) | 10.0 |
Aims
To gain an understanding of some of the areas of stochastic modelling in discrete time that underpin quantitative descriptions of population growth, epidemics and the analysis of DNA sequences.
To provide an appreciation of the need for, and an understanding of, the principal statistical methods required in the analysis of survival data.
Module summary
Many random processes can be thought of as evolving through a sequence of successive generations. For example, population growth depends on which individuals successfully produce offspring in the next generation; the transmission of a disease through a population depends on how individuals interact from day to day. Models which incorporate random variation allow prediction of important quantities such as the size of a population or duration of an epidemic, as well as variability in these estimates. This module presents techniques for modelling such processes with applications drawn in particular from the biological sciences. Branching processes will be introduced as a means of modelling population growth. Stochastic models which describe epidemic growth by considering the size of the infected and immune subpopulations will then be studied. These have been important in recent years for the analysis of influenza and other epidemics.
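As a rough illustration of the branching-process idea, the sketch below computes an extinction probability as the smallest non-negative solution of s = G(s), where G is the probability generating function of the number of offspring per individual. The function names and the offspring distribution are hypothetical, chosen only for this example; the module itself develops its algorithms in R.

```python
def offspring_pgf(probs, s):
    """Evaluate G(s) = sum_k probs[k] * s**k for an offspring distribution."""
    return sum(p * s**k for k, p in enumerate(probs))


def extinction_probability(probs, tol=1e-12, max_iter=100_000):
    """Approximate the smallest non-negative root of s = G(s).

    Iterates q <- G(q) starting from q = 0; q after n steps is the
    probability that the population has died out by generation n.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = offspring_pgf(probs, q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


# Hypothetical offspring distribution: P(0)=0.25, P(1)=0.25, P(2)=0.5 (mean 1.25)
print(extinction_probability([0.25, 0.25, 0.5]))
```

For this made-up distribution the equation s = G(s) has solutions 0.5 and 1, so the iteration should settle near 0.5. When the mean number of offspring is below 1, the same iteration approaches 1, meaning extinction is certain.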
DNA sequences can be considered as strings of letters from the four-letter alphabet {A,C,G,T}. Markov chains provide a useful stochastic model to describe the probability of the letter at the next site in the sequence given the letter at the current site. However, the presence of genes and other functional elements within a sequence suggests that more sophisticated models are required, in particular models which allow these transition probabilities to vary along the length of the sequence. Originally developed for automatic speech recognition, hidden Markov models have proved to be a remarkably flexible and powerful model for automatic segmentation and gene-finding in DNA sequences. As their name suggests, hidden Markov models are based on an unobserved Markov chain. Methods for estimating this "hidden" Markov chain will be considered. Computational algorithms will be developed in R.
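To make the simple (non-hidden) Markov chain model concrete, the sketch below estimates the probability of each letter at the next site given the letter at the current site, by counting adjacent pairs in a sequence and normalising each row. The function name and the toy sequence are assumptions made for illustration, and Python is used here although the module's own work is in R.

```python
from collections import Counter

ALPHABET = "ACGT"


def estimate_transition_matrix(sequence):
    """Estimate P(next letter | current letter) from adjacent-pair counts."""
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = {}
    for a in ALPHABET:
        row_total = sum(counts[(a, b)] for b in ALPHABET)
        # A letter that never occurs as a "current" site gets a row of zeros.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in ALPHABET
        }
    return matrix


# Made-up toy sequence, for illustration only
P = estimate_transition_matrix("ACGTACGGTACA")
for a in ALPHABET:
    print(a, [round(P[a][b], 2) for b in ALPHABET])
```

Each printed row gives the estimated distribution of the next letter after A, C, G and T respectively. A hidden Markov model would instead let these rows depend on an unobserved state that changes along the sequence.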
There are many areas where interest focuses on data which measure the time to some event. In recent decades the principal application for such data has been the study of how long patients survive before some event occurs. The event may be death, the recurrence of a disease which had been in remission, or some other event. Applications are not solely medical: how long it takes a battery to run down and how long a component in a machine lasts before it fails are just two industrial examples. Such data are known as survival data, or sometimes lifetime data, and their analysis is called survival analysis. The main complication with survival data is that many observations will be ‘censored’, i.e. only partially observed. For example, when a trial of a new treatment for cancer is terminated, many of the patients will still be alive. The survival times of those who died will therefore be known exactly, whereas for those still alive at the end of the trial the survival time is known only to exceed the time for which they have been observed. Methods for dealing with this form of data will be considered.
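One standard way of handling censored observations is the Kaplan-Meier estimate listed in the syllabus. The sketch below is a minimal version under simple assumptions: right censoring only, and subjects censored at a given time are counted as still at risk at that time. The function name and the small data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  : observed follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Made-up data: five subjects, two of them censored (event flag 0)
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

In this toy data set the subject censored at time 3 still contributes to the risk set at time 3 but not afterwards, which is how partial information from censored subjects enters the estimate.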
Outline Of Syllabus
Review of Markov chains. Probability generating functions, random sums of discrete random variables. Branching processes and extinction probability. Stochastic models of epidemics: the SIS, Greenwood and Reed-Frost models. Comparison with deterministic models. Duration and size of epidemics.
Markov chain models; model choice. Hidden Markov models; simulation; inference via maximum likelihood; forward-backward algorithm; local and global decoding; Baum-Welch algorithm. Application to DNA sequence analysis.
Time-to-event data, censoring patterns. Non-parametric survival analysis: calculation of Kaplan-Meier estimates; use of log-rank statistics. Parametric survival analysis: exponential, Weibull and log-logistic distributions; likelihood analysis of effect of covariates. Proportional hazards model: partial likelihood; diagnostics; time-varying effects. Frailty. Prediction and explained variation.
Teaching Methods
Teaching Activities
Category | Activity | Number | Length | Student Hours | Comment |
---|---|---|---|---|---|
Structured Guided Learning | Lecture materials | 36 | 1:00 | 36:00 | Non-Synchronous Activities |
Scheduled Learning And Teaching Activities | Lecture | 9 | 1:00 | 9:00 | Synchronous On-Line Material |
Guided Independent Study | Assessment preparation and completion | 30 | 1:00 | 30:00 | Completion of in course assessments |
Scheduled Learning And Teaching Activities | Lecture | 9 | 1:00 | 9:00 | Present in Person |
Structured Guided Learning | Structured non-synchronous discussion | 18 | 1:00 | 18:00 | Non Synchronous Discussion of Lecture Material |
Scheduled Learning And Teaching Activities | Drop-in/surgery | 4 | 1:00 | 4:00 | Office Hour or Discussion Board Activity |
Guided Independent Study | Independent study | 94 | 1:00 | 94:00 | Lecture preparation, background reading, course review |
Total | | | | 200:00 | |
Teaching Rationale And Relationship
Non-synchronous online materials are used to deliver the theory and explain the methods, illustrated with examples, and to give general feedback on assessed work. Present-in-person and synchronous online sessions are used to develop students’ ability to apply the theory to solve problems, to identify and resolve specific queries raised by students, and to allow students to receive individual feedback on marked work. In addition, office hours and discussion board activity provide an opportunity for more direct contact between individual students and the lecturer: a typical student might spend a total of one or two hours in such contact over the course of the module, either individually or as part of a group.
Assessment Methods
The format of resits will be determined by the Board of Examiners.
Exams
Description | Length (minutes) | Semester | When Set | Percentage | Comment |
---|---|---|---|---|---|
Written Examination | 120 | 2 | A | 80 | N/A |
Other Assessment
Description | Semester | When Set | Percentage | Comment |
---|---|---|---|---|
Written exercise | 1 | M | 8 | written exercises |
Written exercise | 2 | M | 12 | written exercises |
Assessment Rationale And Relationship
A substantial formal examination is appropriate for the assessment of the material in this module. The course assessments will allow the students to develop their problem-solving techniques, to practise the methods learnt in the module, to assess their progress and to receive feedback; these assessments have a secondary formative purpose as well as their primary summative purpose.
Timetable
- Timetable Website: www.ncl.ac.uk/timetable/