Privacy-Preserving Machine Learning in TensorFlow with TF Encrypted
Advanced Spark and TensorFlow Meetup, May 2019
O’Reilly AI Conference New York, April 2019 (slides)
We focus on recent applications of privacy-preserving techniques to machine learning, in particular deep learning, and review how tools such as homomorphic encryption and multi-party computation can benefit the machine learning process. We then go more in-depth with concrete examples of how the TF Encrypted open source library can be used to explore and experiment directly in TensorFlow, showing how predictions can be made without exposing the prediction input, and how a model can be fitted without ever exposing the training data.
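To give a feel for the underlying mechanics, here is a minimal sketch (plain Python, not TF Encrypted's actual API) of the additive secret sharing that makes such private predictions possible: the client splits its input into random shares, each server works on one share, and only the combination of results reveals the prediction.

```python
# Toy additive secret sharing over a 64-bit ring, illustrating how a
# prediction input can be processed without either server seeing it.
# Didactic sketch only, not TF Encrypted's API.
import random

Q = 2**64  # ring size

def share(x):
    """Split x into two random shares that sum to x modulo Q."""
    x0 = random.randrange(Q)
    x1 = (x - x0) % Q
    return x0, x1

def reconstruct(x0, x1):
    return (x0 + x1) % Q

# the client secret-shares its prediction input between two servers
x = 42
x0, x1 = share(x)

# a public model weight; linear operations can be applied to each
# share independently, so neither server ever sees x
w = 3
y0, y1 = (w * x0) % Q, (w * x1) % Q

# only the combination of both servers' results reveals the prediction
assert reconstruct(y0, y1) == w * x
```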
Cryptography for Privacy-Preserving Machine Learning
AI & Trust at Applied Machine Learning Days, January 2019 (slides)
Zero Knowledge TLV Meetup, January 2019 (slides, video)
Hacking Deep Learning, January 2019 (slides, video)
IFIP Summer School on Privacy and Identity Management, August 2018 (slides)
(Cyber)Security for Software Engineers, June 2018 (slides)
Paris Machine Learning Meetup (ParisML), December 2017 (slides)
In this talk we focus on recent applications of advanced cryptographic methods to machine learning, in particular deep learning. After illustrating how tools such as homomorphic encryption and multi-party computation can benefit the machine learning process in terms of privacy and trust, we proceed to give a high-level overview of their underlying principles in order to understand differences, weaknesses, and strengths. As an example we show how a model can be trained on data that remain encrypted throughout the whole process. We do so using TF Encrypted, a library on top of TensorFlow for working with encrypted data.
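As a hint of what computing on encrypted training data involves, the sketch below shows the classic Beaver-triple trick for multiplying two secret-shared values, a building block such training protocols rely on. It reuses the toy ring and sharing helpers from the sketch above and assumes a trusted dealer for the triple; real protocols generate triples cryptographically.

```python
# Multiplication of secret-shared values via a Beaver triple (a, b, c)
# with c = a*b, here produced by a trusted dealer for simplicity.
# Didactic sketch only.
import random

Q = 2**64

def share(x):
    x0 = random.randrange(Q)
    return x0, (x - x0) % Q

def reconstruct(x0, x1):
    return (x0 + x1) % Q

def beaver_mul(x_sh, y_sh, triple):
    (a0, a1), (b0, b1), (c0, c1) = triple
    # both parties jointly open e = x - a and f = y - b; these reveal
    # nothing about x and y since a and b are uniformly random masks
    e = reconstruct((x_sh[0] - a0) % Q, (x_sh[1] - a1) % Q)
    f = reconstruct((y_sh[0] - b0) % Q, (y_sh[1] - b1) % Q)
    # each party computes its share of x*y = c + e*b + f*a + e*f locally
    z0 = (c0 + e * b0 + f * a0 + e * f) % Q
    z1 = (c1 + e * b1 + f * a1) % Q
    return z0, z1

a, b = random.randrange(Q), random.randrange(Q)
triple = (share(a), share(b), share(a * b % Q))
x_sh, y_sh = share(6), share(7)
assert reconstruct(*beaver_mul(x_sh, y_sh, triple)) == 42
```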
What Privacy Has Meant For Snips
Privacy in Statistical Analysis (PSA), May 2017 (slides)
Privacy has been a guiding component at Snips from its very beginning, partly motivated by a core belief but also by a business rationale. In this talk we will outline some of the arguments that have led to our decisions and illustrate some of the challenges we have faced. We will further discuss a few concrete techniques by which we have aimed to overcome these, tailored to the fact that Snips is primarily focused on mobile and IoT.
Private Machine Learning in TensorFlow using Secure Computation
with Jason Mancuso, Yann Dupis, et al.
NeurIPS workshop on Privacy Preserving Machine Learning, December 2018 (paper, slides)
We present a framework for experimenting with secure multi-party computation directly in TensorFlow. By doing so we benefit from several properties valuable to both researchers and practitioners, including tight integration with ordinary machine learning processes, existing optimizations for distributed computation in TensorFlow, high-level abstractions for expressing complex algorithms and protocols, and an expanded set of familiar tooling. We give an open source implementation of a state-of-the-art protocol and report on concrete benchmarks using typical models from private machine learning.
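The following toy sketch, which is not TF Encrypted's actual API, illustrates why TensorFlow is a natural substrate: shares are simply integer tensors, so each party's local step is an ordinary TensorFlow op and automatically benefits from the runtime's graph optimizations and distribution machinery.

```python
# Shares as int64 tensors: each party's local computation is a plain
# TensorFlow op. Didactic sketch of the idea, not TF Encrypted's API.
import tensorflow as tf

Q = 2**40  # modulus chosen small enough that int64 matmuls cannot overflow

def share(x):
    """Split an int64 tensor into two additive shares modulo Q."""
    x0 = tf.random.uniform(x.shape, maxval=Q, dtype=tf.int64)
    return x0, tf.math.floormod(x - x0, Q)

def reconstruct(x0, x1):
    return tf.math.floormod(x0 + x1, Q)

x = tf.constant([[1, 2], [3, 4]], dtype=tf.int64)  # private input
w = tf.constant([[5], [6]], dtype=tf.int64)        # public weights

x0, x1 = share(x)
# each server applies the public weights to its own share locally;
# with tf.device these two matmuls could run on different machines
y0 = tf.math.floormod(tf.matmul(x0, w), Q)
y1 = tf.math.floormod(tf.matmul(x1, w), Q)

tf.debugging.assert_equal(reconstruct(y0, y1), tf.matmul(x, w))
```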
A Generic Framework for Privacy Preserving Deep Learning
with Theo Ryffel, Andrew Trask, et al.
NeurIPS workshop on Privacy Preserving Machine Learning, December 2018 (paper)
We detail a new framework for privacy preserving deep learning and discuss its strengths. The framework puts a premium on ownership and secure processing of data and introduces a valuable representation based on chains of commands and tensors. This abstraction allows one to implement complex privacy preserving constructs such as Federated Learning, Secure Multiparty Computation, and Differential Privacy while still exposing a familiar deep learning API to the end-user.
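As a hypothetical miniature of the chain idea (not the framework's actual implementation), a local pointer object can forward commands to wherever the data actually lives while exposing familiar operators to the end-user:

```python
# Hypothetical miniature of the chain-of-commands-and-tensors idea:
# a local wrapper forwards operations to the worker holding the data,
# so the user keeps a tensor-like API without ever seeing raw values.

class Worker:
    """Holds raw data on behalf of its owner; executes forwarded commands."""
    def __init__(self, name):
        self.name, self.store = name, {}

    def execute(self, op, *ids):
        result = op(*(self.store[i] for i in ids))
        self.store[id(result)] = result
        return id(result)

class PointerTensor:
    """The user-side link in the chain: knows where, not what, the data is."""
    def __init__(self, worker, remote_id):
        self.worker, self.remote_id = worker, remote_id

    def __add__(self, other):
        # the addition runs remotely; only a new pointer comes back
        new_id = self.worker.execute(lambda a, b: a + b,
                                     self.remote_id, other.remote_id)
        return PointerTensor(self.worker, new_id)

alice = Worker('alice')
alice.store['x'], alice.store['y'] = 3, 4
z = PointerTensor(alice, 'x') + PointerTensor(alice, 'y')
print(alice.store[z.remote_id])  # 7, computed on alice's worker
```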
Private Data Aggregation on a Budget
with Valerio Pastro and Mathieu Poumeyrol
Homomorphic Encryption Applications and Technology (HEAT), November 2017 (slides)
Theory and Practice of Multi-Party Computation (TPMPC), April 2017 (slides)
NIPS workshop on Private Multi-Party Machine Learning (PMPML), December 2016 (full paper)
We provide a practical solution to performing cross-user machine learning through aggregation on a sensitive dataset distributed among privacy-concerned users. We focus on a scenario in which a single company wishes to obtain the distribution of aggregate features, while ensuring a high level of privacy for the users. We are interested in the case where users own devices that are not necessarily powerful or online at all times, like smartphones or web browsers. This premise makes general solutions, such as general multiparty computation (MPC), less applicable. We design an efficient special-purpose MPC protocol that outputs aggregate features to the company, while keeping online presence and computational complexity on the users’ side at a minimum. This basic protocol is secure against a majority of corrupt users, as long as they do not collude with the company. If they do, we still guarantee security, as long as the fraction of corrupt users is lower than a certain tunable threshold. We propose different enhancements of this solution: one guaranteeing some degree of active security, and one that additionally ensures differential privacy. Finally, we report on the performance of our implementation on several real-world use cases across different devices.
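The overall flow can be illustrated with a didactic sketch: each user additively shares its value among a few helper parties ("clerks" here, a simplification of the paper's packed Shamir-based protocol), goes offline, and the company only ever sees sums.

```python
# Didactic sketch of private aggregation: each user splits its value
# among a few clerks, who each sum what they receive; the company only
# learns the total. The paper's protocol is more involved (packed Shamir
# sharing, efficiency tricks), but the flow is similar.
import random

Q = 2**32
NUM_CLERKS = 3

def share(x, n=NUM_CLERKS):
    """n random shares summing to x modulo Q; any n-1 reveal nothing."""
    shares = [random.randrange(Q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % Q)
    return shares

user_values = [3, 1, 4, 1, 5]  # e.g. one sensitive feature per user

# each user sends one share to each clerk and can then go offline
clerk_inboxes = zip(*(share(v) for v in user_values))

# each clerk independently sums its inbox and reports one value
clerk_totals = [sum(inbox) % Q for inbox in clerk_inboxes]

# the company combines the clerk totals into the aggregate
assert sum(clerk_totals) % Q == sum(user_values) % Q
```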
Universally Composable Symbolic Analysis for Two-Party Protocols based on Homomorphic Encryption
with Ivan Damgård
EUROCRYPT, May 2014 (full paper)
We consider a class of two-party function evaluation protocols in which the parties are allowed to use ideal functionalities as well as a set of powerful primitives, namely commitments, homomorphic encryption, and certain zero-knowledge proofs. We illustrate that with these it is possible to capture protocols for oblivious transfer, coin-flipping, and generation of multiplication triples.
We show how any protocol in our class can be compiled to a symbolic representation expressed as a process in an abstract process calculus, and prove a general computational soundness theorem implying that if the protocol realises a given ideal functionality in the symbolic setting, then the original version also realises the ideal functionality in the standard computational UC setting. In other words, the theorem allows us to transfer a proof in the abstract symbolic setting to a proof in the standard UC model. Finally, we show that the symbolic interpretation is simple enough in a number of cases for the symbolic proof to be partly automated using the ProVerif tool.
On Secure Two-Party Integer Division
with Chao Ning and Tomas Toft
Financial Cryptography and Data Security (FC), February 2012 (full paper)
We consider the problem of secure integer division: given two Paillier encryptions of ℓ-bit values n and d, determine an encryption of ⌊n/d⌋ without leaking any information about n or d. We propose two new protocols solving this problem.
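For intuition, the toy Paillier sketch below (insecure parameters; Python 3.8+ for modular inverses via pow) shows the additive homomorphism such a division protocol builds on: multiplying two ciphertexts yields an encryption of the sum of the plaintexts.

```python
# Toy Paillier encryption with insecure parameters, showing the additive
# homomorphism: multiplying ciphertexts adds plaintexts. Didactic only.
import math
import random

p, q = 293, 433            # toy primes; real keys use ~1024-bit primes
n, n2 = p * q, (p * q)**2
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # pow(x, -1, m) needs Python 3.8+

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:       # r must be a unit modulo n
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# homomorphic addition: the product of encryptions decrypts to the sum
a, b = 37, 5
assert decrypt((encrypt(a) * encrypt(b)) % n2) == (a + b) % n
```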
Type-Based Automated Verification of Authenticity in Asymmetric Cryptographic Protocols
with Naoki Kobayashi, Yunde Sun, and Hans Hüttel
Automated Technology for Verification and Analysis (ATVA), 2011
Gordon and Jeffrey developed a type system for verification of asymmetric and symmetric cryptographic protocols. We propose a modified version of Gordon and Jeffrey’s type system and develop a type inference algorithm for it, so that protocols can be verified automatically as they are, without any type annotations or explicit type casts. We have implemented a protocol verifier, SPICA2, based on the algorithm, and confirmed its effectiveness.
Formal Analysis of Privacy for Anonymous Location Based Services
with Stéphanie Delaune and Graham Steel
Theory of Security and Applications (TOSCA), March 2011 (paper)
We propose a framework for formal analysis of privacy in location based services such as anonymous electronic toll collection. We give a formal definition of privacy, and apply it to the VPriv scheme for vehicular services. We analyse the resulting model using the ProVerif tool, concluding that our privacy property holds only if certain conditions are met by the implementation. Our analysis includes some novel features such as the formal modelling of privacy for a protocol that relies on interactive zero-knowledge proofs of knowledge and list permutations.
Formal Analysis of Privacy for Vehicular Mix-Zones
with Stéphanie Delaune and Graham Steel
European Symposium on Research in Computer Security (ESORICS), 2010
Embedded Security in Cars (ESCAR), 2010
Formal Methods and Cryptography (CryptoForma), 2010
Foundations of Security and Privacy (FCS-PrivMod), July 2010
Safety-critical applications for recently proposed vehicle-to-vehicle ad-hoc networks (VANETs) rely on a beacon signal, which poses a threat to privacy since it could allow a vehicle to be tracked. Mix-zones, where vehicles encrypt their transmissions and then change their identifiers, have been proposed as a solution to this problem. In this work, we describe a formal analysis of mix-zones. We model a mix-zone and propose a formal definition of privacy for such a zone. We give a set of necessary conditions for any mix-zone protocol to preserve privacy. We analyse, using the tool ProVerif, a particular proposal for key distribution in mix-zones, the CMIX protocol. We show that in many scenarios it does not preserve privacy, and we propose a fix.
The state of privacy-preserving machine learning
on The Data Exchange with Ben Lorica
In this episode of the Data Exchange I speak with Morten Dahl, research scientist at Dropout Labs, a startup building a platform and tools for privacy-preserving machine learning. He is also behind TF Encrypted, an open source framework for encrypted machine learning in TensorFlow. The rise of privacy regulations like CCPA and GDPR combined with the growing importance of ML has led to a strong interest in tools and techniques for privacy-preserving machine learning among researchers and practitioners. Morten brings the unique perspective of someone who understands both security and machine learning: he is a longtime security researcher who also happens to have worked as a data scientist in industry.
Experimenting with TF Encrypted
with Jason Mancuso (blog post)
Private Analytics with SDA
with Mario Cornejo and Mathieu Poumeyrol (blog post, code)
Benchmarking Paillier Encryption
with Mario Cornejo and Mathieu Poumeyrol (blog post, code, benchmarks)
Prime Number Generation in Rust
with Mario Cornejo (blog post)
Optimizing Threshold Secret Sharing
with Mathieu Poumeyrol (blog post)
High-Volume Secret Sharing
with Mathieu Poumeyrol (blog post, code)
How Practical is Somewhat Homomorphic Encryption Today?
with Maeva Benoit (blog post)
Differential Privacy for the Rest of Us
with Joseph Dureau (blog post)
Symbolic Analysis of Cryptographic Protocols
Ph.D. thesis, Aarhus University, 2013 (thesis)
Proving Security using Type Inference
M.Sc. thesis, Aalborg University, 2008 (thesis)
A Modal Logic for Mobile Resources
B.Sc. thesis, Aalborg University, 2006 (thesis)