Reconstructing Redacted Text Using A Transformer LLM

Currently, the main method for enhancing the privacy of text is redaction, i.e. masking out selected key words such as names, emails, etc. While this is effective for removing obvious personally identifiable information, redacted text may still leak textual patterns that can reveal authorship, sensitive topics (e.g. that the text relates to a medical …
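
By way of illustration, here is a minimal sketch of the underlying idea, assuming the Hugging Face transformers library and a generic pretrained masked language model (bert-base-uncased is just an illustrative choice, not prescribed by the project): each redacted word is replaced by the model's mask token, and the model is asked for likely fills.

# Minimal sketch: ask a masked language model to guess a redacted word.
# The model name and the example sentence are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Treat the redacted word as a [MASK] token and request candidate fills.
redacted = "Dear [MASK], your appointment at the clinic is on Monday."

for candidate in fill_mask(redacted)[:3]:
    # Each candidate comes with the model's confidence score.
    print(f"{candidate['token_str']:>12}  score={candidate['score']:.3f}")

A real reconstruction attack would of course condition on much more context than a single sentence, but the same fill-in-the-blank mechanism is the starting point.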

How Best To Optimise Machine Learning Hyperparameters?

When designing and training a neural network model, the hyperparameters include the SGD step size, mini-batch size, gradient decay policy, choice of regularisation, etc. Selecting values for these hyperparameters is a key step in obtaining a useful model. While selection is commonly based on heuristics and trial and error, there is also much interest in …
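
As a flavour of the simplest automated alternative to hand-tuning, here is a minimal sketch of random search over a few hyperparameters. The parameter ranges and the toy train_and_evaluate() objective are placeholders for a real training run, not part of the project brief.

# Minimal random-search sketch. Replace train_and_evaluate() with an actual
# training run that returns a validation loss for the given configuration.
import math
import random

def sample_config():
    return {
        "step_size": 10 ** random.uniform(-4, -1),      # log-uniform SGD step size
        "batch_size": random.choice([32, 64, 128, 256]),
        "weight_decay": 10 ** random.uniform(-6, -2),
    }

def train_and_evaluate(config):
    # Toy stand-in objective: prefers a step size near 1e-2, plus noise.
    return (math.log10(config["step_size"]) + 2) ** 2 + random.gauss(0, 0.1)

def random_search(n_trials=50):
    best_config, best_loss = None, math.inf
    for _ in range(n_trials):
        config = sample_config()
        loss = train_and_evaluate(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

best, loss = random_search(n_trials=20)
print("best config:", best, "val loss:", round(loss, 3))

More sophisticated approaches (e.g. Bayesian optimisation) replace the random sampling with a model of the loss surface, but they slot into the same evaluate-and-compare loop.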

Training A Transformer LLM To Play Tic Tac Toe

Transformer neural nets have transformed natural language processing, but they can also be applied to sequences other than words. In particular, they can be applied to game playing, where a game consists of a sequence of moves and the task is to predict a good next move. In this project you will investigate …
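
To give a flavour of the setup, the sketch below (PyTorch is an assumption, not a requirement of the project) encodes a game as a sequence of board-cell tokens 0-8 and uses a small transformer encoder with a causal mask to score the next move.

# Minimal sketch: a tiny transformer that maps the moves played so far to
# next-move logits. Model sizes and the example game are illustrative only.
import torch
import torch.nn as nn

N_CELLS = 9  # board cells 0-8; each move is the index of the cell played

class MovePredictor(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.cell_embed = nn.Embedding(N_CELLS, d_model)
        self.pos_embed = nn.Embedding(N_CELLS, d_model)    # a game has at most 9 moves
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, N_CELLS)            # a score for each possible next cell

    def forward(self, moves):
        # moves: (batch, seq_len) tensor of the cells played so far, in order
        seq_len = moves.size(1)
        pos = torch.arange(seq_len, device=moves.device)
        x = self.cell_embed(moves) + self.pos_embed(pos)
        # causal mask: each position may only attend to moves made before it
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                     device=moves.device), diagonal=1)
        h = self.encoder(x, mask=mask)
        return self.head(h)                                # (batch, seq_len, 9) next-move logits

model = MovePredictor()
game = torch.tensor([[4, 0, 8]])           # one partial game: centre, then two corners
logits = model(game)
print(logits[0, -1].argmax().item())       # the (untrained) model's guess for the next cell

Training would then minimise a cross-entropy loss between these logits and the move actually played at each position in a dataset of recorded games.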

How Private Are Android Apps Really?

In this project we’ll look at the data shared by a set of similar apps on Android phones with a view to assessing their privacy. To date, there have been no measurement studies of the actual data shared by apps “in the wild”. Since network traffic is encrypted, the project will involve some “white hat” hacking …
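
One possible approach, sketched below purely as an illustration, is to route the phone’s traffic through an intercepting proxy and log where each app’s requests go; the choice of mitmproxy, the addon script and its output format are assumptions, not part of the project brief.

# Minimal mitmproxy addon sketch: log the destination host, path and request
# size of each intercepted request for later privacy analysis.
# Run with:  mitmdump -s privacy_logger.py
from mitmproxy import http


class PrivacyLogger:
    def request(self, flow: http.HTTPFlow) -> None:
        # Record which third parties the app talks to and how much it sends.
        req = flow.request
        size = len(req.content or b"")
        print(f"{req.host}  {req.path.split('?')[0]}  {size} bytes")


addons = [PrivacyLogger()]

Getting the phone to trust the proxy’s certificate (and handling apps that use certificate pinning) is where the “white hat” hacking comes in.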