Event

Symbolic Data Analysis: Parametric multivariate analysis of interval data

Professor Paula Brito, Universidade do Porto

Thu 25 Apr 19, 11:00 - 12:30

Colchester Campus, EBS.2.66

Event speaker: Professor Paula Brito

Event type: Lectures, talks and seminars (Mathematical Sciences Departmental Seminar)

Event organiser: School of Mathematics, Statistics and Actuarial Science

Contact details: Andrew Harrison, harry@essex.ac.uk

Symbolic Data Analysis is concerned with analysing data that have intrinsic variability, which must be taken into account. In Data Mining, Multivariate Data Analysis and classical Statistics, the elements under analysis are generally individual entities for which a single value is recorded for each variable, e.g., individuals described by age, salary, education level, etc.

However, when the elements of interest are classes or groups of some kind - the citizens living in given towns; car models, rather than specific vehicles - then there is variability inherent to the data.

Symbolic data go beyond the usual data representation model, considering variables whose observed values for each element are no longer necessarily single real values or categories, but may take the form of sets, intervals, or, more generally, distributions. In this talk we focus on the analysis of interval data, i.e., data in which the variables’ values are intervals of ℝ.

Parametric probabilistic models for interval-valued variables have been proposed and studied by Brito & Duarte Silva (2012). These models are based on the representation of each observed interval by its MidPoint and LogRange, and Multivariate Normal and Skew-Normal distributions are assumed for the whole set of 2p MidPoints and LogRanges of the original p interval-valued variables.
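The MidPoint and LogRange representation can be sketched in a few lines. This is an illustration only, not code from the talk or from MAINT.Data: it assumes the LogRange is the natural logarithm of the interval's width, and the function names and the small data set are hypothetical.

```python
import math


def to_midpoint_logrange(lower, upper):
    """Map an interval [lower, upper] to its (MidPoint, LogRange) pair.

    Assumption: LogRange = natural log of (upper - lower), so degenerate
    intervals (zero width) cannot be represented.
    """
    if not upper > lower:
        raise ValueError("interval must have strictly positive range")
    midpoint = (lower + upper) / 2.0
    log_range = math.log(upper - lower)
    return midpoint, log_range


def to_interval(midpoint, log_range):
    """Inverse mapping: recover [lower, upper] from (MidPoint, LogRange)."""
    half_range = math.exp(log_range) / 2.0
    return midpoint - half_range, midpoint + half_range


def encode(unit):
    """Encode p intervals as a vector of 2p reals: p MidPoints, then p LogRanges."""
    pairs = [to_midpoint_logrange(lo, hi) for lo, hi in unit]
    return [m for m, _ in pairs] + [r for _, r in pairs]


# Hypothetical data: two units (e.g. towns), each described by
# p = 2 interval-valued variables (e.g. age range, salary range).
units = [
    [(18.0, 30.0), (1200.0, 2500.0)],
    [(25.0, 60.0), (1500.0, 4000.0)],
]

encoded = [encode(u) for u in units]
```

Each unit becomes a vector of 2p real numbers, which is the form on which a Multivariate Normal or Skew-Normal model would then be placed. Working with the logarithm of the range keeps the second coordinate unconstrained on the real line, since any real LogRange maps back to a positive width.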

The intrinsic nature of the interval-valued variables leads to different structures of the variance-covariance matrix, represented by different possible configurations. For all cases, maximum likelihood estimators of the corresponding parameters have been derived.
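As a rough illustration of what a restricted covariance configuration could look like, the sketch below computes Gaussian maximum likelihood estimates for MidPoint/LogRange vectors, first with an unrestricted covariance matrix and then under one assumed restriction in which MidPoints are uncorrelated with LogRanges. The abstract does not list the configurations, so this particular restriction, the function names and the data are assumptions; the Skew-Normal case is not covered, and none of this reproduces the MAINT.Data interface.

```python
def gaussian_mle(rows):
    """Unrestricted Gaussian MLE: sample mean and covariance with divisor n."""
    n = len(rows)
    d = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [
        [sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / n
         for b in range(d)]
        for a in range(d)
    ]
    return mean, cov


def restrict_midpoints_vs_logranges(cov, p):
    """Zero every covariance between the MidPoint block (first p coordinates)
    and the LogRange block (last p coordinates).

    For a Gaussian model with block-diagonal covariance the likelihood
    factorises over the blocks, so the constrained MLE is the unrestricted
    estimate with the cross-block entries set to zero.
    """
    d = 2 * p
    return [
        [cov[a][b] if (a < p) == (b < p) else 0.0 for b in range(d)]
        for a in range(d)
    ]


# Hypothetical encoded data with p = 1: each row is [MidPoint, LogRange].
rows = [
    [1.0, 0.0],
    [2.0, 1.0],
    [3.0, 1.0],
    [4.0, 2.0],
]

mean, cov_unrestricted = gaussian_mle(rows)
cov_restricted = restrict_midpoints_vs_logranges(cov_unrestricted, p=1)
```

A restricted configuration has fewer free parameters than the unrestricted one, which matters when the number of units is small relative to 2p.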

This framework may be applied to different multivariate statistical methodologies, thereby allowing for inference approaches for symbolic data; in particular, (M)ANOVA, discriminant analysis, model-based clustering, robust estimation and outlier detection are addressed. The modelling and methods described are implemented in the R package MAINT.Data, available on CRAN.