Skip to content

I'm looking for...

Courses Research People Something else

Unable to find any suggestions for your query...

Prefer to see our subject areas?

Browse courses by subject

Unable to find any suggestions for your query...

Looking for funded postgraduate opportunities?

View doctoral training partnerships Browse postgraduate research opportunities

Unable to find any suggestions for your query...

We are different and we are the same. #WeAreEssex

Unable to find any suggestions for your query...

Looking for student or staff information?

Student Directory Staff Directory

Courses

Unable to find any suggestions for your query...

Prefer to see our subject areas?

Browse courses by subject

Research

Unable to find any suggestions for your query...

Looking for funded postgraduate opportunities?

View doctoral training partnerships Browse postgraduate research opportunities

People

Unable to find any suggestions for your query...

We are different and we are the same. #WeAreEssex

Something else

Unable to find any suggestions for your query...

Looking for student or staff information?

Student Directory Staff Directory

Event

Bandit PCA

Artificial Intelligence & Machine Learning Seminar

Fri 26 Apr 19

16:00 - 17:00
Colchester Campus

1N1.4.1
Event speaker

Dr Gregely Neu
Event type

Lectures, talks and seminars
Event organiser

Computer Science and Electronic Engineering, School of
Contact details

Dr Dimitri Ognibene dimitri.ognibene@essex.ac.uk

On 26 April we have a special talk from Dr Gergely Neu from the Pompeu Fabra University, Barcelona, about his new formal results on the PCA methods.

Gergely is a very creative and productive Machine Learning theoretician and is one of the few European recipients of the last Google Faculty Research Award. He has produced several interesting results on different areas of Machine Learning, such as Inverse reinforcement learning and the exploration-exploitation dilemma.

Bandit PCA Abstract:

We consider a partial-feedback variant of the well-studied online PCA problem where a learner attempts to predict a sequence of d-dimensional vectors in terms of a quadratic loss, while only having limited feedback about the environment's choices. We focus on a natural notion of bandit feedback where the learner only observes the loss associated with its own prediction. Based on the classical observation that this decision-making problem can be lifted to the space of density matrices, we propose an algorithm that is shown to achieve a regret of O(d^3/2√T) after T rounds in the worst case. We also prove data-dependent bounds that improve on the basic result when the loss matrices of the environment have bounded rank or the loss of the best action is bounded. One version of our algorithm runs in O(d) time per trial which massively improves over every previously known online PCA method. We complement these results by a lower bound of Ω(d√T).