Research Group

Data science

Data science unifies statistics, data analysis and their related methods in order to understand and analyse phenomena in applications ranging from healthcare to finance.

It is an interdisciplinary research area that employs techniques and theories drawn from many fields. In our department, our main areas of research interest are statistics and actuarial science, and operational research. Professor George Alfred Barnard was one of the first professors of the department (1966–1975). He served as President of the Operational Research Society (1962–1964), the Institute of Mathematics and its Applications (1970–1971) and the Royal Statistical Society (1971–1972).

Many of our academics are also members of the university's Institute for Analytics and Data Science, and work closely with the Institute for Social and Economic Research, the Essex Business School, the School of Computer Science and Electronic Engineering, and the School of Life Sciences.

The data science research group has the following four research themes:

  • Actuarial Mathematics theme (Tolulope Fadina, Junlei Hu, Peng Liu, Spyridon Vrontos, and Jackie Wong) - Theme members conduct multidisciplinary research in the broad areas of actuarial science and finance, including predictability, asset-liability management, risk management and risk theory, mathematical finance, financial data science and applied probability in actuarial science and queueing systems.
  • Data Science and Statistical Learning theme (Joe Bailey, Mario Gutierrez-Roig, Stella Hadjiantoni, Andrew Harrison, Berthold Lausen, and Osama Mahmoud) - Theme members work on a range of data science methodologies covering artificial intelligence, statistical learning, computational statistics, epidemiology, bioinformatics and environmental statistics.
  • Operational Research (OR) theme (Georgios Amanatidis, Fanlin Meng, Abdel Salhi, and Xinan Yang) - Theme members conduct multidisciplinary research in the broad areas of OR and mathematical modelling, including linear and nonlinear programming, combinatorial optimisation, deterministic and stochastic dynamic programming, design and analysis of algorithms and heuristics (including the novel Plant Propagation Algorithm, PPA, developed by Salhi), implementation of algorithms, data analytics, and applications in portfolio selection, labour scheduling, green distribution and predictive modelling.
  • Statistical Methodology theme (Yanchun Bao, Hongsheng Dai, and Yassir Rabhi) - Theme members work on a broad range of statistics and applied probability topics, including Bayesian statistics, longitudinal and survival analysis, causal inference, exact Monte Carlo simulation such as Monte Carlo (or Bayesian) Fusion methods, nonparametric estimation for bivariate survival functions, and semiparametric and nonparametric methods for length-biased and censored data.

Essex Data Science Seminar Series

Our group runs a regular research seminar series throughout the academic year. Along with hosting talks from our academics and research students, we also invite experts from other institutions to present their latest work.

Our seminars are open to anyone at the University of Essex who may be interested in the topic being discussed.

Upcoming research seminars

Summer term 2021

20th May 2021 - "Analysing distributed personal health data in a privacy-preserving manner" - Chang Sun, Maastricht University.

13th May 2021 - "Properties of the bridge sampler with a focus on splitting the MCMC sample" - Jackie Wong Siaw Tze, University of Essex.

6th May 2021 - "What Knowledge Management Strategy for a data driven audit approach?" - Jihane Britel, VINCI energies FRANCE.

29th April 2021 - "Estimating mode effects from a sequential mixed-modes experiment" - Yanchun Bao, University of Essex.

Spring term 2021

25th March 2021 - "Bayesian Analysis of chromosomal interactions in Hi-C data using the hidden Markov random field model" - Godwin Osuntoki, University of Essex.

18th March 2021 - "Singular Learning Theory and Information Criteria" - Sumio Watanabe, Tokyo Institute of Technology.

11th March 2021 - "Approximating images by optimally arranging polygons: a heuristic study into computational art" - Daan van den Berg, University of Amsterdam.

4th March 2021 - "Sampling-Assisted Inference of Intractable Models" - Bo Zhang, University of Essex.

25th February 2021 - "Symmetric measures of variability induced by risk measures" - Dr Tolulope Fadina, University of Essex.

18th February 2021 - "Using A.I. and street-view images for estimating socio-economic indicators" - Dr Mario Gutiérrez-Roig, University of Essex.

28th January 2021 - "Selection bias, missing data and causal inference" - Professor Kate Tilling, University of Bristol.

21st January 2021 - "Linear Algebra and Neural Approaches for Representation Learning" - Dr Tingting Mu, University of Manchester.

Highlights of Autumn 2020 seminars

EpiViz: an implementation of Circos plots for epidemiologists

Matt Lee, a PhD student from the University of Bristol, delivered a talk on the use of Circos plots in epidemiology.

Biological pathways involve numerous processes, but epidemiology studies predominantly focus on single exposure and single outcome associations. This is primarily because identifying meaningful intermediate associations that can be taken forward for further analysis is complex.

In his talk, Matt discussed how tools like EpiViz can be used to produce simple and efficient Circos plots, even by those new to programming and data visualisation. A tool that makes visualisations easier to produce can give epidemiologists a better understanding of the results of complex epidemiological studies. Greater insight into the results can help increase the impact of such studies.

A Statistician’s Botanical Garden - The Ideas behind Trees, Model-Based Trees and Random Forests

Classification and regression trees, model-based trees and random forests are powerful statistical methods from the field of machine learning. However, while individual trees are easy to interpret, random forests are "black box" prediction methods. Despite this, they provide variable importance measures that are used to judge the relevance of individual predictor variables.

In this seminar, Professor Carolin Strobl introduced the rationale behind trees, model-based trees and random forests, and illustrated their potential for high-dimensional data exploration, while also pointing out limitations and potential pitfalls in their practical application.

Detecting the hierarchical structure of the cell nucleus

Chromatin consists of DNA wrapped around histones and forms complex three-dimensional structures within the cell nucleus with various degrees of compaction.

Genes have been shown to be repressed by their proximity to the nuclear periphery or activated by being in contact with special regulatory regions called enhancers. Thus, the relative positioning of genes and their interactions with other regions are very important in determining whether they are expressed.

In this talk, Iona Olan from the University of Cambridge discussed her work on cellular senescence, a phenotype associated with dramatic changes in the chromatin interaction network relative to normal cells. Senescence corresponds to permanent cell cycle arrest and has been shown to act as a protective barrier against tumourigenesis.

Our academics

Dr Georgios Amanatidis

Lecturer in Mathematics

Department of Mathematical Sciences, University of Essex

Dr Joseph Bailey

Lecturer in Environmetrics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Yanchun Bao

Lecturer in Data Science and Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Hongsheng Dai

Reader in Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Tolulope Fadina

Lecturer in Actuarial Science and Finance

Department of Mathematical Sciences, University of Essex

Research area: Actuarial science

Dr Mario Gutierrez-Roig

Lecturer in Data Science and Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Stella Hadjiantoni

Lecturer in Data Science and Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Andrew Harrison

Senior Lecturer in Data Science

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Junlei Hu

Lecturer in Actuarial Science

Department of Mathematical Sciences, University of Essex

Research area: Actuarial science

Professor Berthold Lausen

Professor of Data Science

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Peng Liu

Lecturer in Actuarial Science and Finance

Department of Mathematical Sciences, University of Essex

Research area: Actuarial science

Dr Osama Mahmoud

Lecturer in Data Science and Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Fanlin Meng

Lecturer in Data Science

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Dr Yassir Rabhi

Lecturer in Data Science and Statistics

Department of Mathematical Sciences, University of Essex

Research area: Statistics

Professor Abdellah Salhi

Professor of Operational Research

Department of Mathematical Sciences, University of Essex

Research area: Operational research

Dr Spyridon Vrontos

Senior Lecturer in Actuarial Science

Department of Mathematical Sciences, University of Essex

Research area: Actuarial science

Dr Jackie Wong Siaw Tze

Lecturer in Actuarial Science

Department of Mathematical Sciences, University of Essex

Research area: Statistics, actuarial science

Dr Xinan Yang

Senior Lecturer in Operational Research

Department of Mathematical Sciences, University of Essex

Research area: Operational research