Deep learning, and in particular Neural Networks (NNs), has achieved state-of-the-art results in many applications over the last decade. However, how these models operate is still not fully understood, and in practice they are often treated as black boxes when deployed.
Unfortunately, this raises several concerns about their suitability for sensitive applications, especially when humans are involved. NNs trained with standard approaches based on stochastic gradient descent have in fact been shown to mimic and reinforce prediction biases tied to sensitive features such as gender, race, age, and nationality.
Recent work has demonstrated how, thanks to their principled handling of uncertainty, Bayesian Neural Networks (BNNs) tend to exhibit fairer behaviour than their deterministic counterparts. However, methods to correct unfair behaviours, or to quantitatively increase their fairness, do not yet exist.
This project will focus on exploring and developing fair training techniques for BNNs in sensitive applications, building on the adversarial learning literature for deterministic NNs and translating it to the setting of fair Bayesian training.
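To give a flavour of this direction, the sketch below shows one way adversarial fairness training might be coupled with an approximately Bayesian predictor. It is a minimal illustration only, not the project's method: PyTorch is assumed, MC dropout stands in for a full BNN posterior (e.g. variational inference), and the data, network sizes, and the trade-off parameter lam are all synthetic or hypothetical.

```python
# Minimal sketch: adversarial debiasing with an approximately Bayesian predictor.
# MC dropout is used here as a crude stand-in for a BNN posterior; all names,
# sizes, and the synthetic data are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: features x, binary label y, binary sensitive attribute s.
n, d = 1000, 10
x = torch.randn(n, d)
s = torch.bernoulli(torch.full((n, 1), 0.5))
y = torch.bernoulli(torch.sigmoid(x[:, :1] + 0.8 * s))  # label correlated with s

# Predictor: keeping dropout active at prediction time approximates posterior sampling.
predictor = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Dropout(0.2),
                          nn.Linear(32, 1))
# Adversary: tries to recover the sensitive attribute from the predictor's output.
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

bce = nn.BCEWithLogitsLoss()
opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
lam = 1.0  # fairness/accuracy trade-off (hypothetical value)

for step in range(2000):
    logits = predictor(x)

    # 1) Adversary step: learn to predict s from the (detached) predictions.
    opt_adv.zero_grad()
    adv_loss = bce(adversary(logits.detach()), s)
    adv_loss.backward()
    opt_adv.step()

    # 2) Predictor step: fit y while making s hard to recover from the output.
    opt_pred.zero_grad()
    pred_loss = bce(logits, y) - lam * bce(adversary(logits), s)
    pred_loss.backward()
    opt_pred.step()
```

In the project, the dropout-based predictor would be replaced by a proper BNN and the adversarial penalty connected to quantitative fairness criteria, but the overall minimax structure would remain similar.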
Familiarity with neural networks and with Python is expected of the student.