Research Repository

Advancements in privacy enhancing technologies for machine learning

Authors

Adam Hall



Abstract

The field of privacy preserving machine learning is still in its infancy and has been growing in popularity since 2019. Privacy enhancing technologies in the context of machine learning comprise a set of core techniques: cryptography, distributed computation (or federated learning), differential privacy, and methods for managing distributed identity. In addition, the notion of contextual integrity provides a means of quantifying the appropriate flow of information.
The aim of this work is to advance a vision of privacy compatible infrastructure, in which web 3.0 exists as a decentralised infrastructure that enshrines the user’s right to privacy and to consent over information concerning them on the Internet.
This thesis contains a mix of experiments relating to privacy enhancing technologies in the context of machine learning. A number of privacy enhancing methods are advanced through these experiments, and a novel privacy preserving flow is created. This includes the establishment of an open-source framework for vertically distributed federated learning and a novel privacy preserving machine learning framework which accommodates a core set of privacy enhancing technologies. The work also introduces a novel means of describing privacy preserving information flows which extends the definition of contextual integrity.
This thesis establishes a range of contributions to the advancement of privacy enhancing technologies for privacy preserving machine learning. A case study is evaluated, and a novel heterogeneous stack classifier is built which predicts the presence of insider threat, demonstrating the efficacy of machine learning in this domain given access to real data; conclusions are also drawn about the applicability of federated learning to this use case. A novel framework is introduced that facilitates vertically distributed machine learning on data relating to the same subjects held by different hosts, which researchers can use to achieve vertically federated learning in practice. Weaknesses in the security of the Split Neural Networks (SplitNN) technique are discussed, and appropriate defences that harden SplitNN against inversion attacks are explored in detail. A novel distributed trust framework is established which facilitates peer-to-peer access control without the need for a third party, putting forward a solution for fully privacy preserving access control while interacting with privacy preserving machine learning infrastructure. Finally, a novel framework for the implementation of structured transparency is given, providing a cohesive way to manage information flows in the privacy preserving machine learning and analytics space, and offering a well-stocked toolkit for the implementation of structured transparency which utilises the aforementioned technologies. This framework also exhibits homomorphically encrypted inference, which fully hardens the SplitNN methodology against model inversion attacks.
The most significant finding in this work is the production of an information flow which combines split neural networks, homomorphic encryption, zero-knowledge access control, and elements of differential privacy. This flow facilitates homomorphic inference through split neural networks, advancing the state of the art in privacy preserving machine learning.
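To make the split neural network structure underlying this flow concrete, the sketch below shows a minimal SplitNN forward pass in plain NumPy. This is an illustrative assumption, not code from the thesis: the layer sizes, weights, and function names are hypothetical, and the encryption, access-control, and differential-privacy layers of the actual flow are omitted.

```python
# Minimal SplitNN forward-pass sketch (hypothetical; not from the thesis).
# The model is cut into a client segment and a server segment; only the
# intermediate activations ("smashed data") cross the trust boundary.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Client-side segment: the raw records never leave the data holder.
W_client = rng.normal(size=(4, 3))

def client_forward(x):
    return relu(x @ W_client)

# Server-side segment: completes inference without seeing raw inputs.
# In the flow described above, the activations it receives would be
# homomorphically encrypted before transmission.
W_server = rng.normal(size=(3, 2))

def server_forward(smashed):
    logits = smashed @ W_server
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)  # softmax over classes

x = rng.normal(size=(5, 4))                  # five example records
probs = server_forward(client_forward(x))    # shape (5, 2)
assert probs.shape == (5, 2)
assert np.allclose(probs.sum(axis=1), 1.0)
```

The cut point between `client_forward` and `server_forward` is the interface that the thesis's defences and homomorphic inference are designed to protect, since the smashed data at that boundary is what inversion attacks target.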

Citation

Hall, A. Advancements in privacy enhancing technologies for machine learning. (Thesis). Edinburgh Napier University

Thesis Type Thesis
Deposit Date Aug 23, 2024
Publicly Available Date Aug 23, 2024
DOI https://doi.org/10.17869/enu.2024.3789831
Award Date Jul 5, 2024

Files

Advancements in privacy enhancing technologies for machine learning (5.3 MB)
PDF




