On model-based clustering and directional outlier detection

  • Thu 16 Jan 20

    14:00 - 16:00

  • Colchester Campus

    STEM Centre 3.1

  • Event speaker

    Professor Cristina Tortora

  • Event type

    Lectures, talks and seminars
    Mathematical Sciences Departmental Seminar

  • Event organiser

    Mathematical Sciences, Department of

  • Contact details

    Andrew Harrison

Mathematical Sciences Departmental Seminar

These Departmental Seminars are for everyone interested in Maths. We encourage anyone interested in the subject in general, or in the particular subject of the seminar, to come along. It's a great opportunity to meet people in the Maths Department and join in with our community. 

Refreshments are shared in the Department (STEM 5.1) after every seminar.

On model-based clustering and directional outlier detection

Professor Cristina Tortora

Model-based clustering assumes that the data were generated from a convex combination of densities. The choice of the density function is crucial; the multivariate contaminated normal distribution (MCN) was proposed to model datasets characterized by the presence of outliers.

The MCN is a two-component Gaussian mixture; one of the components, with a large prior probability, represents the good observations, and the other, with a small prior probability, the same mean, and an inflated covariance matrix, represents the outliers. Mixtures of MCN distributions can detect outliers and perform cluster analysis at the same time, improving clustering performance compared to normal mixtures and offering an alternative to t mixtures.
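As a rough illustration of the two-component structure described above, the following Python sketch evaluates an MCN density and the posterior probability that a point belongs to the "good" component. The symbol names `alpha` (prior probability of a good observation) and `eta` (covariance inflation factor, greater than 1), the function names, and the numerical values are assumptions chosen for this example; they are not taken from the talk.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at point x."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mean.shape[0]
    diff = x - mean
    # Squared Mahalanobis distance, computed without forming the inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    log_norm = -0.5 * (d * np.log(2.0 * np.pi) + np.linalg.slogdet(cov)[1])
    return float(np.exp(log_norm - 0.5 * maha))


def mcn_pdf(x, mean, cov, alpha, eta):
    """MCN density: alpha * N(mean, cov) + (1 - alpha) * N(mean, eta * cov).

    alpha and eta are assumed names: alpha is the (large) prior probability
    of a good observation, eta > 1 inflates the covariance for outliers.
    """
    cov = np.asarray(cov, dtype=float)
    good = gaussian_pdf(x, mean, cov)
    bad = gaussian_pdf(x, mean, eta * cov)
    return alpha * good + (1.0 - alpha) * bad


def prob_good(x, mean, cov, alpha, eta):
    """Posterior probability that x came from the good (non-inflated) component."""
    cov = np.asarray(cov, dtype=float)
    good = alpha * gaussian_pdf(x, mean, cov)
    bad = (1.0 - alpha) * gaussian_pdf(x, mean, eta * cov)
    return good / (good + bad)


# Illustrative values only (assumed for this example).
mean = np.zeros(2)
cov = np.eye(2)
alpha, eta = 0.9, 10.0

p_center = prob_good([0.0, 0.0], mean, cov, alpha, eta)  # point at the mean
p_far = prob_good([5.0, 5.0], mean, cov, alpha, eta)     # point far from the mean
print(p_center > 0.5, p_far < 0.5)
```

One common convention, assumed here, is to flag an observation as an outlier when its posterior probability of being good falls below 0.5. Both components share the same mean, so the decision depends only on how far the point lies from that mean relative to the covariance.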

However, the mixture of MCN distributions has two drawbacks: it assumes symmetry around the means, and it uses univariate (scalar) parameters for the proportion of outliers and the degree of inflation, i.e., these are the same for all the variables. This is a limitation because clusters can be skewed and the outliers may behave differently in each dimension.

In this talk, we will address these issues by presenting a paradigm for parameterizing contamination and skewness within variants of the mixtures of shifted asymmetric Laplace (SAL) distributions. These models can both provide group labels for like observations and detect whether an observation is an outlying point, unifying the fields of model-based learning and outlier detection.

Of particular interest are the multiple scaled variants of the mixtures of SAL distributions, which allow for directional contamination and skewness and result in contours that do not have the traditional elliptical shapes.


Professor Cristina Tortora is an Assistant Professor in the Department of Mathematics and Statistics, San Jose State University.

