Essex Analytics, Data Science and Decision Making Summer School


Monday 25 July - Friday 5 August 2022

The Institute for Analytics and Data Science presents its annual Summer School, bringing you cutting-edge courses across the fields of data science, analytics and decision making. 

Our Summer School provides a unique opportunity to attend state-of-the-art courses delivered by world-leading academics and industry partners from around the globe and network with peers in an invaluable forum for knowledge exchange.

Applications for our Analytics, Data Science and Decision Making Summer School will open in 2022.

For enquiries, please contact: iadssum@essex.ac.uk 



Overview

Data-driven policymaking is central to ensuring a stable future. At Essex, research in data science and artificial intelligence is part of our DNA.

In 1967, Tony Brooker became the founding Chair of Computer Science at Essex, having followed in Alan Turing’s footsteps at the Computing Machine Laboratory in Manchester in the 1950s.

In 2016, the first UNESCO Chair in Analytics and Data Science was conferred on Essex academic Professor Maria Fasli, with the key objective of highlighting the critical role that data plays in promoting equality and sustainable development, and how it can enhance people’s lives.

Now, whether you're interested in the curation and management of big data, advanced techniques and methods including artificial intelligence and statistical methods, or applications of big data in fields ranging from business and finance to bioinformatics, you'll find a set of relevant courses for you.

Teaching programme

Whether you’re new to data science or an advanced practitioner, we have a range of introductory and academic courses to suit your needs.

Some courses may have prerequisites, or preparation may be required before attending. All this information is detailed in the course outline, which you will be able to download from spring 2021.

Four parallel sessions run each day, most of them six hours long. Delegates can create a tailored programme of study by selecting from a range of different focuses, with the option of attending for one week or two weeks.

Course topics

Introduction to Python

This Introduction to Python course is for beginners. We aim to introduce fundamental programming concepts using Google Colab. We will introduce variables, data types, casting, strings, Booleans, operators, lists, tuples, loops, conditions, functions, and a bit of NumPy. This course is designed for those who are coming from a non-technical background.
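To give a flavour of the concepts listed above, here is a minimal sketch using only the Python standard library. It is an illustration rather than actual course material: the variable names and values are invented for the example, and NumPy is left out so that the snippet runs without any extra installation.

```python
# Variables and basic data types (all values here are illustrative)
course = "Introduction to Python"   # str
hours_per_day = 6.0                 # float
is_beginner_friendly = True         # bool

# Casting: convert a string of digits to an integer
delegates = int("25")

# Lists can be changed after creation; tuples cannot
topics = ["variables", "loops", "functions"]
topics.append("NumPy")
attendance_options = (1, 2)         # tuple: one week or two weeks


def describe(topic_list):
    """Return one numbered line per topic, using a loop and a condition."""
    lines = []
    for index, topic in enumerate(topic_list, start=1):
        if topic == "NumPy":
            lines.append(f"{index}. {topic} (a brief taste)")
        else:
            lines.append(f"{index}. {topic}")
    return lines


summary = describe(topics)
for line in summary:
    print(line)
```

The same code can be pasted into a Google Colab cell and run as it is.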

Data Protection, Security, Ethics and Liability in the Age of Big Data

This session aims to introduce the current EU and UK data protection regime and the changes brought in by the General Data Protection Regulation, applicable since May 2018 in spite of Brexit. Furthermore, the session will present and allow for discussion of the specific challenges big data analytics bring, especially in light of the reports on big data published by various data protection regulators at both UK and EU levels. Special attention will be given to security requirements in data protection law.

The last two hours (with both speakers) will introduce the ethical issues arising from big data and present the related legal issues that may arise in light of data protection legislation and criminal law. Torts and contracts will not be covered.

Introduction to R

R is an interactive computing environment and programming language designed for statistical analysis and graphics. Extensions to the basic capabilities of R are straightforward to produce and share with others. It is widely and increasingly used in many Big Data fields of research including bioinformatics.

Because of its power and flexibility, R is more demanding to learn than traditional statistical packages but rewards some initial effort. This course is based on tested material that we have been using for several years to help research students, postdocs and faculty get started in their own data analysis, and is refined each time based on feedback. It is aimed at people who may have little or no programming experience.

Introduction to Machine Learning

The aim of this course is to provide an introduction to Machine Learning and a discussion of the types of problems it is suitable for. The course will then introduce Kernel Machines and show how they can provide robust but flexible classifiers when the number of training points is limited.

Deep Learning for Images and Text 1

The Day 1 tutorial will focus on convolutional neural networks, also known as convnets, a type of deep-learning model almost universally used in computer vision applications. You’ll learn to apply convnets to image-classification problems, in particular those involving small training datasets, which are the most common use case if you aren’t a large tech company.

The Day 2 tutorial will focus on deep-learning models that can process text (understood as sequences of words or sequences of characters), time series, and sequence data in general. The two fundamental deep-learning algorithms for sequence processing are recurrent neural networks and 1D convnets. Applications of these algorithms include document classification, time-series classification, sequence-to-sequence learning and sentiment analysis.

Best Practice Analytics 

This course will look at methods and tools that can help us create high-quality analytics and reproducible results. We will also look at how to move from a single-analyst, spreadsheet-driven approach to collaborative analytics that follows a best practice governance model. Adopting practices from test-driven software development, we will look at how to establish an analytics process based on documentation, versioning, testing, peer review, collaboration and risk evaluation. We will use examples in R, Shiny, Python and Jupyter Notebooks to illustrate the ideas taught in the course.

The aim is to give you an understanding of the challenges you will face when running your own real-world data analytics project and introduce you to a number of principles you can follow to achieve high-quality reproducible results.

Tree-based Models for Machine Learning in Data Analytics

In this course, participants will learn how to work with tree-based models to solve data science problems in Python. Everything from using a single tree for regression or classification to more advanced ensemble methods will be covered. Participants start by learning about basic CARTs (classification and regression trees), then move on to implementing bagged trees, Random Forests, and boosted trees using the Gradient Boosting Machine, or GBM. The course will include dedicated practical sessions for these techniques and allow participants to create high-performance tree-based models for a real-world dataset.

Data Science Management

As you progress in your career, you will most likely be asked to manage other colleagues. This requires complementing technical Data Science skills with strategic, interpersonal and other skills. This course will present elements and challenges of managing a Data Science team. Understanding those will allow you to practice the required skills early on – preparing you for any upcoming opportunities.

Interpreting Classifiers

AI systems are very good at making predictions in a variety of settings. In many cases, however, this comes at the expense of interpretability of the models used. In this course, we will see how to interpret the decisions made by these systems such that they are accessible to humans. We will provide an overview of different methods for interpreting classifiers. We will look at machine learning models which are interpretable by nature as well as model-agnostic methods for interpreting classifiers. This course is both theoretical and practical.

Learning with Small Data Sets 

During this course we will explore Bayesian learning and how knowledge-based priors can help obtain good results from datasets with only a few samples. In particular, we will start the day with a theoretical introduction to Bayesian learning, followed by a practical session during which we will see hands-on how to apply Bayesian inference methods.
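As a rough illustration of how a knowledge-based prior can steady an estimate from very few samples, the sketch below uses the standard Beta-Binomial conjugate update. This is a hypothetical example chosen for simplicity, not taken from the course materials; the prior parameters and the data are made up.

```python
def beta_posterior(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Tiny made-up dataset: 3 successes in 4 trials
successes, failures = 3, 1

# Estimate from the data alone (maximum likelihood)
data_only = successes / (successes + failures)

# Flat prior Beta(1, 1): no prior knowledge assumed
flat = beta_mean(*beta_posterior(1, 1, successes, failures))

# Informative prior Beta(2, 8): assumed prior belief that the rate is around 0.2
informed = beta_mean(*beta_posterior(2, 8, successes, failures))

print(f"data only:         {data_only:.3f}")
print(f"flat prior:        {flat:.3f}")
print(f"informative prior: {informed:.3f}")
```

With only four observations the informative prior pulls the estimate strongly towards the prior belief; as more data arrive, the data come to dominate the posterior.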

Introduction to Data Visualisation Using R

In the era of misinformation and fake news, producing data visualisations that are clear and interpretable to an audience is essential in engaging people with data. Whilst there are many software packages available to produce data graphics, many offer limited customisation of graphics, or are not easily reproducible. This course will explore tools to produce high-quality graphics using the R programming language, focussing on the “ggplot2” package. The “ggplot2” package allows almost endless customisation of data visualisations, has a number of excellent extension packages that add further flexibility, and, being in R, is entirely script-based and therefore highly reproducible. This course will equip attendees with the skills to produce high-quality data visualisations using the “ggplot2” package and extensions, and would be beneficial to people working in any field where data visualisation is important. The course will be suitable for those with little to intermediate prior programming experience.

Introduction to Natural Language Processing

In this tutorial we will introduce the basic concepts of NLP, starting with simple text pre-processing techniques such as tokenisation and part-of-speech tagging, and moving on to more complex tasks such as term extraction, entity recognition and information extraction. The techniques will be demonstrated using GATE, one of the most widely used toolkits for performing all kinds of NLP tasks, which is freely available and open source. GATE includes not only its own text processing components but also a number of popular third-party NLP components, all of which participants will be able to experiment with during the tutorial's hands-on exercises.

Practical Text Analytics and Sentiment Analysis from Social Media

This tutorial will introduce the concepts of social media and sentiment analysis from unstructured text.  It will first introduce the concept of social media analysis, showing how this form of noisy text requires different solutions from traditional text analysis, with practical examples and exercises showing how this can be achieved. This leads into the more specialised task of sentiment analysis: the problem of extracting opinions automatically from text. It will cover both rule-based and machine learning techniques, provide some information on the key underlying NLP and text analysis processes required, and look in detail at some of the major problems and solutions, such as detection of sarcasm, use of informal language, spam opinion detection, trustworthiness of opinion holders, and so on. The techniques will be demonstrated with real applications developed in GATE, an open-source language processing toolkit. Hands-on exercises and relevant materials will be provided for participants to try out the applications, and to experiment with building their own simple tools.

Bandits, Learning, and Search

We will provide a basic overview of multi-armed bandit problems and algorithms for solving them. We will illustrate the application of such algorithms on a real problem in the scope of the advertising industry. Then we will continue with the relation of multi-armed bandits to reinforcement learning, and further on with the relation of reinforcement learning to Monte Carlo tree search. We will describe the application of such algorithms for game playing in the scope of the General Video Game AI competition.
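For readers new to the topic, the sketch below shows epsilon-greedy, one common and simple bandit algorithm, applied to a toy advertising scenario. It is an illustrative assumption rather than the method taught in the course: the three "ad variants" and their click-through rates are invented, and rewards are simulated with a seeded random number generator.

```python
import random


def epsilon_greedy(true_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms (e.g. hypothetical ad variants).

    true_rates: assumed click-through rate of each arm, unknown to the learner.
    Returns per-arm pull counts, estimated rates, and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms      # how often each arm was played
    values = [0.0] * n_arms    # running mean reward of each arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean for the chosen arm
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


# Three made-up ad variants with click-through rates of 2%, 5% and 10%
counts, values, total_reward = epsilon_greedy([0.02, 0.05, 0.10])
print("pulls per arm:  ", counts)
print("estimated rates:", [round(v, 3) for v in values])
print("total clicks:   ", total_reward)
```

The epsilon parameter sets the trade-off between exploring arms at random and exploiting the arm that currently looks best. The same exploration-exploitation trade-off appears in reinforcement learning and Monte Carlo tree search.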

Introduction to TensorFlow and Deep Learning

The course introduces TensorFlow programming from scratch and shows how to use it to build simple neural networks and perform backpropagation. Students are encouraged to program along with the tutor. The basic underlying workings of TensorFlow and neural networks are taught without resorting to higher-level black box packages, so that students can gain a fundamental understanding of how deep learning works. The course also gives an introductory overview of popular deep learning models, including convolutional neural networks and recurrent neural networks.

Recurrent Neural Networks with Keras

This course teaches a deep understanding of how recurrent neural networks work, what they are used for, and how to implement them efficiently using Keras and TensorFlow. The day culminates with unique advanced recurrent neural network examples applied to control problems. Note that natural-language processing examples will not be covered.

GIS Systems in R

Geographic information systems (GIS) software is a powerful tool for analysing many types of spatial data, from understanding political trends in different areas and mapping the spread of infectious diseases to understanding the impacts of climate change across the globe. This course will focus on the “sf” package, and will explore the merits and functionality of working with “simple features” based objects in geographical analyses. This one-day course will familiarise users with the array of GIS packages available in R, and enable users to carry out basic GIS operations on a variety of different geographical data formats.

Synergy of Optimisation and Machine Learning

In the first part of the course, we will discuss modern optimisation approaches that do not require significant investment of expertise and time in algorithm development, but still make it possible to tackle real-world problems. We will cover the following topics: the meaning of optimisation and its relevance to decision support and decision making, off-the-shelf solvers, algorithm complexity, simple exact algorithms, simple heuristics, metaheuristics, and algorithm configuration and tuning. The second part of the course will provide foundations for exploiting the strong connections between optimisation and data science. In a series of exercises, you will see how the techniques studied in the first part of the course are used in machine learning. The aim is to enhance understanding, and so the usage, of optimisation within machine learning. Conversely, it is increasingly recognised that the control of optimisation algorithms would itself benefit from the application of data science techniques. We will present data science methods, with exercises, that are being developed to improve the performance of existing optimisation methods on many real-world problems.

Overall, the course presents the close interactions between data science and optimisation. You will gain a deeper understanding of optimisation within machine learning and decision support systems, and so of how to make more effective use of them.

Learning Under Different Training and Testing Distributions

Systems based on machine learning methods often face a major challenge when applied to real-world datasets: the conditions under which the system was developed will differ from those in which we use it. Examples include sophisticated systems that took a few years to develop, such as email spam filtering, stock prediction, health diagnostics, and brain-computer interface (BCI) systems.

Will such a system be usable, or will it need to be adapted because the distribution has changed since the system was first built? Almost any form of real-world data analysis is affected by such problems, which arise for reasons ranging from sample selection bias to operating in non-stationary environments.

This tutorial will focus on the issues of dataset shift (e.g. covariate shift, prior-probability shift, and concept shift) and will cover transfer learning as a way to learn a satisfactory model under such shifts.

Traditional and Deep Learning Methods

This course will introduce machine learning and deep learning techniques and allow participants to gain practical knowledge of implementing them. The morning session will focus on regression and classification and on practical machine learning topics such as regularisation and bias/variance theory. The afternoon session will cover neural networks and deep learning, as well as CNNs and sequence models.

Bayesian Analysis in R

Bayesian statistics are increasingly popular in many scientific disciplines. In this course, you will learn the theoretical underpinnings of Bayesian approaches and the differences between Bayesian and frequentist statistics. You will also learn how to implement, plot, and interpret Bayesian models in R. Finally, you will learn more about some of the advanced options for statistical modelling in this framework, including multi-level modelling and generalised linear approaches.

Introduction to Network Science

This one-day introductory course on network science will give a broad overview of the different concepts and methods commonly applied in social network analysis. We will first consider different kinds of network data and their representation and discuss the basics of network visualisation, including a hands-on example using the free software visone. We will also discuss different kinds of applications and usage scenarios of network science in business and social contexts. The second part of the course will introduce exploratory and descriptive methods for the analysis of networks at three levels of granularity: the node level, the subgroup level, and the network level. The third part of the course will introduce inferential or statistical network analysis, including the basic ideas behind a range of models like the exponential random graph model and its various extensions, latent space models, the quadratic assignment procedure, and related techniques. We will cover the implementation of these methods in a very cursory way using R, but the focus is on the methods, not their implementation. Overall, this course is an introductory-level teaser for interested academics, practitioners, and data scientists who would like to explore what they can do with their relational data in the way of exploration and prediction.

Transfer Learning with Transformers in NLP

This course will go through the basics of transfer learning in Natural Language Processing (NLP). We will discuss cutting-edge architectures like BERT and GPT. Our aim will be to cover the basics of transformer-based models and how to fine-tune pretrained models on local datasets in Python.

Machine Learning for Causal Inference from Observational Data

This course will introduce the basic principles of causal modelling (potential outcomes, graphs, causal effects) while emphasising the key role of design and assumptions in obtaining robust estimates. It will also cover the basic principles of machine learning and the use of machine learning methods to do causal inference (e.g. methods stemming from domain adaptation and propensity scores). Lastly, it will show how to implement these techniques for causal analysis and interpret the results in illustrative examples. By the end of this course participants should: understand the distinction between causal effects and associations and appreciate the key role of design and possibly untestable assumptions in the estimation of causal effects; understand the role of training and testing models on data and the use of regularisation to avoid overfitting; and be able to position machine learning within the causal tool chain.

Understanding our Research and Innovation Infrastructure

All this technology can make thinking about infrastructure feel somewhat geeky and distant from the reality of doing research. This short introductory course is designed to strip away as much of the technobabble as possible and provide a framework of understanding that should be useful to you. It won't answer all your questions, but it might provide insight into potential research opportunities and, who knows, even answer "how do I get the computer to say yes, so I can get my research done?"

The learning outcome will be to provide you with an introduction to research infrastructures so that you can better advocate and cost for the supporting digital infrastructure you need for your research.

The course is structured around:

  • What makes "data" research data?
  • Defining what we mean by research and innovation infrastructure.
  • Understanding the scale needed to transform research systems, cultures, and decision-making.
  • What analytical capacity is needed for large scale research?
  • How do we safely share data?
  • How do we understand infrastructure costs, including public cloud?

Real practical industry experience

Our delegates are also given real scenarios from our partner businesses, and tasked with solving these in groups. This provides you with a unique opportunity to apply your data analytics skills within real-life situations, while developing your communication and presentation skills.
 
Proposals will be formed in groups and presented to businesses on the final day of the conference.
 
In the past our course providers have come from a range of leading industry partners and academic institutions worldwide including BT, Fraunhofer, Microsoft, MSXInternational, Celtra, Micro Focus and Micro Focus Vertica.

"The Big Data Summer School was a great experience! It was the most varied timetable I've seen that covers data science topics and I'm really glad I attended."
John Tipple, Summer School delegate, 2018
          

Eligibility

There are no specific eligibility requirements to book a place on the Summer School. We welcome attendees (students, researchers, academics and professionals) in the emerging fields of big data, data science and analytics who would like to find out more or be updated on the current trends and developments in this exciting and fast-developing field.

We accept applicants of all levels of experience from around the world and offer a range of introductory and academic courses to suit your needs.

Whilst we don't restrict access to courses, when selecting the courses that make up your programme please pay attention to our recommended prerequisites. These prerequisites are stipulated to ensure that all attendees make the best out of their time with us.

If you are unsure whether a certain published course is right for you, please do not hesitate to contact us at iadssum@essex.ac.uk and a member of the IADS team will be happy to advise you on whether a course is suitable for your requirements and skillsets.

Please note, to take part in the summer school virtually you will need access to a laptop or computer and a reliable internet connection.

Fees

What is included in the fees?

  • A tailored programme of courses and access to preparatory and bonus materials from our providers using a secure drive.
  • Expert providers and business industry leaders to provide you with high-level content, hands-on experience and professional insight.
  • Support from the IADS team via Slack, Zoom and virtual channels. 
  • A delegate pack full of resources, information, and memorabilia that will be sent individually to you. 
  • Hardcopy certification upon completing your course, endorsed by the University of Essex and the Institute for Analytics and Data Science.

Tuition fee discounts are available for students at partner universities - please get in touch for details.

Our fees have been reduced for the 2021 year to accommodate the online learning environment. These are indicated below: 

Early registration (until 07/06/2021)

Audience                              Cost (1 week)   Cost (2 weeks)
Commercial participants               £550            £900
University of Essex students          £355            £570
University of Essex staff or alumni   £420            £680
Non-University of Essex students      £420            £680
Non-University of Essex academics     £455            £735
Public sector partners                £455            £735

Standard registration (08/06/2021 - 05/07/2021)

Audience                              Cost (1 week)   Cost (2 weeks)
Commercial participants               £600            £1000
University of Essex students          £390            £640
University of Essex staff or alumni   £460            £760
Non-University of Essex students      £460            £760
Non-University of Essex academics     £495            £820
Public sector partners                £495            £820

Late registration (06/07/2021 - 16/07/2021)

Audience                              Cost (1 week)   Cost (2 weeks)
Commercial participants               £650            £1100
University of Essex students          £425            £710
University of Essex staff or alumni   £500            £840
Non-University of Essex students      £500            £840
Non-University of Essex academics     £540            £905
Public sector partners                £540            £905

How to apply

Registration for our 2021 Online Summer School is now closed. 

To apply for a place please fill out our online application form. Upon completion, you will then be notified of the outcome, and if successful, directed to the payment portals below. 

If you have any further queries, please contact iadssum@essex.ac.uk

Self funding

If you are self-funding your attendance at the Online Summer School, please book and pay for your place.

Paying by Proficio

If you are paying for your course fee using University of Essex Proficio funds, you will need to use the University of Essex Proficio platform. 


Paying by invoice

If your institution requires an invoice to process payment, please email iadssum@essex.ac.uk stating the amount the invoice should be for and the name and postal address of the person making the payment.

After payment

Once you've paid, we will send you our enrolment guidance pack with instructions on how to finalise your place and access the preparatory materials for your training with us.

"The Summer School was well organised, the courses I attended were very informative and the keynotes were excellent."
Analytics and Data Science Summer School Delegate 2016

Bringing together great minds

Over 32 essential training courses
Over 60 providers and keynotes from around the world
5 years of running successful summer training
Over 490 satisfied delegates in the past five years