Research Project

Development of methods for automated text analyses using modern Deep Learning techniques

Principal Investigators
Dr Shoaib Jameel
Dr Jon Chamberlain
Mahsa Abazari Kia

This project is a collaboration between BT (formerly British Telecom) and the University of Essex. It explores novel methods for automatic multi-document text summarisation.

Automatic text summarisation involves training a machine learning model to recognise key words and sentences within a large piece of text (such as a multi-page report) and produce an accurate summary of the content. This can save time and money by helping individuals identify the most important texts to view first. For example, a social worker with ten reports to read could be guided by the model towards the reports concerning the people most at risk, so that those reports can be prioritised.

Multi-document text summarisation, by contrast, involves feeding the model a collection of texts from different sources (such as a series of academic papers on a topic). In this case, the machine learning model needs to create a single summary that covers all the texts, not just individual ones.
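As a rough illustration of the idea, the sketch below pools sentences from several documents and keeps the ones whose words are most frequent across the whole collection. It is a simple frequency-based extractive baseline, not the deep learning approach developed in this project; the function names, the stop-word list, and the scoring rule are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {
    "the", "a", "an", "of", "to", "and", "in", "is", "are", "on",
    "for", "with", "that", "this", "it", "by", "as", "at", "be",
}


def split_sentences(text):
    """Split text into sentences on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lower-case the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def summarise(documents, max_sentences=2):
    """Return up to max_sentences sentences drawn from all documents.

    Hypothetical helper: each sentence is scored by the average
    collection-wide frequency of its words, so content repeated across
    several sources is favoured.
    """
    # Pool sentences from every document, dropping exact duplicates.
    sentences = list(dict.fromkeys(
        s for doc in documents for s in split_sentences(doc)
    ))
    # Word frequencies are counted over the whole collection, not per document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Keep the top-scoring sentences, restored to their original order.
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


if __name__ == "__main__":
    reports = [
        "Flooding closed the bridge. The cafe sold cakes.",
        "Flooding damaged the bridge badly. Birds sang.",
    ]
    for sentence in summarise(reports, max_sentences=2):
        print(sentence)
```

Because word counts are taken over the whole collection, a sentence about a topic that several sources mention tends to outrank one that appears in a single document, which is the basic intuition behind summarising across texts rather than one text at a time.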

The main challenge lies in automatically summarising noisy, short, and domain-specific texts using various signals such as temporal patterns and named entities.

Partners

This project is run in collaboration with telecoms company BT.

Funding

This project is funded by the BT and University of Essex PhD scholarship programme.