Google recently released a new module for TensorFlow Privacy, its privacy-preserving toolkit, that can evaluate the privacy properties of various machine learning classifiers. Google said the module is intended to become the basis of a privacy test suite that AI developers can use regardless of their skill level.
The merits of the various AI privacy techniques are still debated in the community, and there is no standard guide to building a private model. A growing body of research shows that AI models can leak sensitive information from their training data sets, which creates a privacy risk. The mitigation that TensorFlow Privacy uses is differential privacy, which adds noise during training to hide individual examples in the training data. This noise is calibrated for academic worst-case scenarios, however, and can significantly reduce model accuracy.
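The article does not say how the noise is applied. A common recipe for differentially private training is to clip each example's gradient and add Gaussian noise to the sum before averaging. The sketch below illustrates that idea in plain Python; the function names and the `max_norm` and `noise_multiplier` parameters are assumptions made for illustration, not TensorFlow Privacy's API.

```python
import math
import random


def clip_to_norm(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Clipping bounds how much any single example can move the sum; the noise
    (standard deviation noise_multiplier * max_norm) then masks that bounded
    contribution.
    """
    clipped = [clip_to_norm(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    stddev = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, stddev) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical gradients for a batch of two examples.
batch = [[3.0, 4.0], [0.0, 0.0]]
print(noisy_mean_gradient(batch, max_norm=1.0, noise_multiplier=1.1))
```

A larger `noise_multiplier` hides individual examples more thoroughly but makes each update less informative, which is the accuracy cost the article refers to.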
This prompted Google researchers to look for an alternative. The new TensorFlow Privacy module supports membership inference attacks, which build a classifier that infers whether a specific sample was present in the training data set. The more accurate that classifier is, the more the model has memorized, and so the less privacy-preserving the model is. An attacker who can make such predictions with high accuracy will succeed in working out which data was used in the training set.
The tests provided by the new module are black-box, meaning that they use only the model's outputs, not its internals or the input samples. They produce a vulnerability score that indicates whether the model leaks information from the training set, and they require no retraining, which makes them relatively easy to run.
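As a rough illustration of such a black-box test, the sketch below scores a model using only its per-example losses. It assumes the simplest possible attack, a loss threshold ("low loss means the example was probably in the training set"), and reports two scores: the AUC of that attack and the best accuracy over all thresholds. The function names and the loss values are hypothetical; this is not the module's actual API.

```python
def attack_auc(member_losses, nonmember_losses):
    """Chance that a random training example has lower loss than a random
    held-out example (ties count half). 0.5 means the attack does no better
    than guessing; values near 1.0 suggest memorization."""
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


def best_threshold_accuracy(member_losses, nonmember_losses):
    """Best accuracy of the rule 'loss <= t means member' over all thresholds t."""
    total = len(member_losses) + len(nonmember_losses)
    # Start from the threshold below every loss: everything is "non-member".
    best = len(nonmember_losses) / total
    for t in sorted(set(member_losses) | set(nonmember_losses)):
        correct = sum(1 for m in member_losses if m <= t)
        correct += sum(1 for n in nonmember_losses if n > t)
        best = max(best, correct / total)
    return best


# Hypothetical losses: training examples tend to have lower loss.
train_losses = [0.1, 0.2, 0.3, 0.9]
test_losses = [0.4, 0.8, 1.0, 1.2]
print(attack_auc(train_losses, test_losses))               # 0.875
print(best_threshold_accuracy(train_losses, test_losses))  # 0.875
```

Because the score depends only on losses the model already produces, it can be computed after training without touching the model's weights or retraining it.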
“After using membership inference tests internally, we're sharing them with developers to help them build more private models, explore better architecture choices, use regularization techniques such as early stopping, dropout, weight decay, and input augmentation, or collect more data,” Google Brain's Shuang Song and Google software engineer David Marn wrote on the TensorFlow blog.
Google also said that it will explore the feasibility of extending membership inference attacks beyond classifiers and will develop new tests. It plans to bring the new tests into the TensorFlow ecosystem by integrating them with TensorFlow Extended (TFX), an end-to-end platform for deploying production machine learning pipelines.
Separately, Google added Go and Java support to the foundational differential privacy library it open-sourced last summer, and also released Privacy on Beam. This end-to-end differential privacy solution is built on Apache Beam (a programming model and a set of language-specific SDKs). It relies on the low-level building blocks of the differential privacy library and combines them into an “out of the box” solution that takes care of the steps differential privacy requires.
Google has also launched a new Privacy Loss Distribution tool for tracking privacy budgets. It lets developers estimate the total cost to user privacy of a collection of differentially private queries, and better evaluate the overall impact of their pipelines.
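To make the idea of a privacy budget concrete, the sketch below tracks the cost of a series of differentially private queries using the simplest accounting rule, basic sequential composition, in which the epsilon values of the queries add up. This is a deliberate simplification rather than the method Google's tool uses, and the `PrivacyBudget` class and query names are hypothetical.

```python
class PrivacyBudget:
    """Tracks total epsilon spent under basic sequential composition."""

    def __init__(self, epsilon_limit):
        self.epsilon_limit = epsilon_limit
        self.charges = []  # list of (query_name, epsilon) pairs

    @property
    def total_epsilon(self):
        """Total privacy cost so far: the sum of every query's epsilon."""
        return sum(eps for _, eps in self.charges)

    def charge(self, query_name, epsilon):
        """Record a query's cost, refusing it if the budget would be exceeded."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.total_epsilon + epsilon > self.epsilon_limit:
            raise RuntimeError(f"query {query_name!r} would exceed the privacy budget")
        self.charges.append((query_name, epsilon))


budget = PrivacyBudget(epsilon_limit=1.0)
budget.charge("daily_active_users", 0.5)
budget.charge("average_session_length", 0.25)
print(budget.total_epsilon)  # 0.75
```

Keeping a running total like this is what lets a pipeline owner see the combined privacy impact of many queries rather than judging each one in isolation.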