The sudden rise of adversarial examples (i.e., input points intentionally crafted to trick a model into misprediction) has shown that even state-of-the-art deep learning models can be extremely vulnerable to intelligent attacks. Unfortunately, the fragility of such models makes their deployment in safety-critical real-world applications (e.g., self-driving cars or eHealth) difficult to justify, highlighting the need to better understand and defend against such attacks.
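To make the idea concrete, the sketch below crafts an adversarial example with the Fast Gradient Sign Method (one standard input-space attack, used here only as an illustration and not as the project's prescribed method). It assumes PyTorch; the toy classifier, random "image", label, and perturbation budget epsilon are all illustrative placeholders, and on an untrained model the prediction will not necessarily flip.

```python
# Minimal FGSM sketch, assuming PyTorch. Model, data and epsilon are
# illustrative placeholders only.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy classifier standing in for the model under attack.
model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
loss_fn = nn.CrossEntropyLoss()

def fgsm_attack(x, y, epsilon=0.1):
    """Take one signed gradient step that increases the loss on (x, y),
    then clip the perturbed input back to the valid range."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        x_adv = x_adv.clamp(0.0, 1.0)  # keep "pixels" in [0, 1]
    return x_adv.detach()

# Random input and label purely for demonstration.
x = torch.rand(1, 784)
y = torch.tensor([3])
x_adv = fgsm_attack(x, y)

print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```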
Vulnerabilities, however, are not confined to the input space: the weights, the activation functions, and even the finite precision of a neural network's implementation can be exploited by attackers. This project will investigate the vulnerability of neural networks under various attacker assumptions and develop empirical evidence on their robustness.
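As a rough illustration of attack surfaces beyond the input space, the sketch below perturbs a model's weights and simulates a coarse low-precision deployment, then counts how many predictions change. It again assumes PyTorch; the toy model, the noise scale, and the rounding step are hypothetical choices for demonstration, not the project's methodology.

```python
# Sketch of probing robustness in weight space and under reduced precision.
# All magnitudes (0.05) are arbitrary placeholders.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.rand(8, 784)                      # a small batch of toy inputs
clean_pred = model(x).argmax(dim=1)

# 1) Weight-space attack surface: add small random noise to every parameter
#    of a copy of the model.
noisy_model = copy.deepcopy(model)
with torch.no_grad():
    for p in noisy_model.parameters():
        p.add_(0.05 * torch.randn_like(p))
noisy_pred = noisy_model(x).argmax(dim=1)

# 2) Finite-precision attack surface: simulate low-precision weights by
#    rounding each parameter to a fixed step size.
quant_model = copy.deepcopy(model)
with torch.no_grad():
    for p in quant_model.parameters():
        p.copy_(torch.round(p / 0.05) * 0.05)
quant_pred = quant_model(x).argmax(dim=1)

print("predictions changed by weight noise:", (clean_pred != noisy_pred).sum().item())
print("predictions changed by quantisation:", (clean_pred != quant_pred).sum().item())
```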
Familiarity with Python and with neural networks is required for this project.