PhD Candidate in Computer Science

Neurobotics Lab, Department of Computer Science

University of Freiburg

Yuan Zhang received the B.Eng. degree in Electronic Engineering from Tsinghua University in 2017 and the M.Sc. degree in Machine Learning from University College London in 2018. After graduation, he returned to China and worked at a startup called Laiye, applying reinforcement learning to natural language processing tasks, including dialogue policy learning and weak supervision learning. His research interests lie in reinforcement learning, especially its applications in real-world scenarios (e.g. dialogue systems, games, robotics).

Project description

Deep learning has brought significant progress to a variety of machine learning applications in recent years. As powerful non-linear function approximators, deep neural networks are very appealing for learning-based control applications. They benefit from large amounts of data and offer a scalable solution, e.g. for learning hard-to-model plant dynamics from data. Currently, the most widely used methods for training these networks are maximum likelihood approaches, which give only a point estimate of the parameters that maximize the likelihood of the training data and do not quantify how certain the model is about its predictions. The uncertainty of the model is, however, a crucial factor in robust and risk-averse control applications. This is especially important when the learned dynamics model is used to predict over a longer horizon, where the errors of an inaccurate model compound from step to step. Bayesian deep learning approaches offer a promising alternative that allows model uncertainty to be quantified explicitly, but many current approaches are difficult to scale, incur high computational overhead, and yield poorly calibrated uncertainties. The objective for the early-stage researcher (ESR) in this project will be to develop new Bayesian deep learning approaches, including recurrent architectures, that address these issues and are well suited to embedded control applications, with their challenging constraints on computational complexity, memory, and real-time operation.