Staff Profile
Bayesian hierarchical models are useful for applications in which the latent variables are of primary interest. One example from my current research is stochastic block models for networks, where the goal is to uncover the underlying groupings of the nodes. These models are essentially mixture models with the network itself being the data. Another example is econometric models for house prices, where the goal is to infer the house price index from the prices of houses (which are naturally heterogeneous goods) observed at different and irregular time points. As my work focuses on both building a useful model and carrying out the inference, efficient and scalable computational algorithms are equally important. While I use Markov chain Monte Carlo (MCMC) routinely, I am currently exploring other methods.
I am also interested in studying the degrees of networks, especially the (mis-)use of the discrete power law and the incorporation of extreme value theory. While the power law captures the essence of most of the data, the right-hand tail is usually ill-fitted. This is where extreme value theory comes to the rescue as, by definition, it deals with outliers and extreme observations.
I have worked with the charity Freegle (https://www.ilovefreegle.org/) through the Statisticians for Society scheme run by the Royal Statistical Society. Our case study is here.
As an advocate of reproducibility and, more generally, open research, I am currently the local network co-lead of the UK Reproducibility Network at Newcastle University. I have also participated in producing open-source training materials on literate programming with Elixir.
My profiles on Google Scholar, ResearchGate and ORCID are as linked. My personal webpage is https://clement-lee.github.io/.
Current research interests:
- Bayesian hierarchical models
- Computationally intensive statistics (e.g. Markov chain Monte Carlo)
- Network modelling (e.g. preferential attachment models, stochastic block models)
- Extreme value theory for discrete data (e.g. degree distribution)
R packages:
PhD supervision:
- Chris Drowley: https://geospatialcdt.ac.uk/team/christopher-drowley/
- Thomas Boughen
2023/24:
- MAS8403 Statistical Foundations of Data Science
- MAS8383 Statistical Learning Methodology
- MAS8384 Bayesian Methodology
2022/23:
- MAS8951 Modern Bayesian Inference
Articles
- Lee C, Eastoe E, Farrell A. Degree distributions in networks: Beyond the power law. Statistica Neerlandica 2024, epub ahead of print.
- Svalova A, Walshaw D, Lee C, Demyanov V, Parker NG, Povey MJ, Abbott GD. Estimating the asphaltene critical nanoaggregation concentration region using ultrasonic measurements and Bayesian inference. Scientific Reports 2021, 11(1), 6698.
- Lee C, Wilkinson DJ. A hierarchical model of non-homogeneous Poisson processes for Twitter retweets. Journal of the American Statistical Association 2020, 115(529), 1-15.
- Foster E, Lee C, Imamura F, Hollidge SE, Westgate KL, Venables MC, Poliakov I, Rowland MK, Osadchiy T, Bradley JC, Simpson EL, Adamson AJ, Olivier P, Wareham N, Forouhi NG, Brage S. Validity and reliability of an online self-report 24-hour dietary recall method (Intake24): A doubly-labelled water study and repeated measures analysis. Journal of Nutritional Science 2019, 8, e29.
- Lee C, Wilkinson DJ. A Review of Stochastic Block Models and Extensions for Graph Clustering. Applied Network Science 2019, 4, 122.
- Lee C, Garbett A, Wilkinson DJ. A network epidemic model for online community commissioning data. Statistics and Computing 2018, 28(4), 891-904.
Conference Proceedings (inc. Abstract)
- Garbett A, Chatting D, Wilkinson G, Lee C, Kharrufa A. ThinkActive: Designing for pseudonymous activity tracking in the classroom. In: CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. 2018, Montreal, Canada: Association for Computing Machinery.