mirror of
https://github.com/IBM/ai-privacy-toolkit.git
synced 2026-06-08 15:05:13 +02:00
Updated FAQ (markdown)
parent a7454aef41
commit 6b21d5af59
1 changed file with 1 addition and 1 deletion
2
FAQ.md
@@ -1,5 +1,5 @@
### 1. Why do ML models need privacy protection?
-Recent studies show that a malicious third party with access to a trained ML model, even without access to the training data itself, can still reveal sensitive, personal information about the people whose data was used to train the model. For example, it may be possible to reveal whether or not a person’s data is part of the model’s training set (membership inference), or even infer sensitive attributes about them, such as their salary (attribute inference). For more information see: https://github.com/IBM/ai-privacy-toolkit/wiki/Relevant-papers#membership-inference-attacks
+Recent studies show that a malicious third party with access to a trained ML model, even without access to the training data itself, can still reveal sensitive, personal information about the people whose data was used to train the model. For example, it may be possible to reveal whether or not a person’s data is part of the model’s training set (membership inference), or even infer sensitive attributes about them, such as their salary (attribute inference). For more information see: https://github.com/IBM/ai-privacy-toolkit/wiki/Relevant-papers#privacy-attacks-on-ml-models

### 2. What do you mean when you say anonymization?
The ML-guided anonymization method implemented in the anonymization module of this toolkit is based on a long-known construct called k-anonymity, which was proposed by L. Sweeney in 2002 to address the problem of releasing personal data while preserving individual privacy. This is a method to reduce the likelihood of any single person being identified when the dataset is linked with other, external data sources. The approach is based on generalizing attributes and possibly deleting records until each record becomes indistinguishable from at least k − 1 other records. This generalization is applied only to those attributes that can be linked with other data sources containing identifiers, called quasi-identifiers (QI).
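The k-anonymity property described above can be illustrated with a minimal sketch. This is not the toolkit's ML-guided method or its API: the helper names `is_k_anonymous` and `generalize_age`, the sample records, and the fixed 10-year age buckets are all assumptions made for illustration, using only the Python standard library.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    is shared by at least k records (hypothetical helper)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalize_age(record, width=10):
    """Replace an exact age with a range of the given width,
    e.g. 34 -> '30-39' (hypothetical, fixed-width generalization)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Made-up sample data: 'age' and 'zip' are treated as quasi-identifiers,
# 'salary' is the sensitive attribute and is left untouched.
records = [
    {"age": 34, "zip": "12345", "salary": 50000},
    {"age": 37, "zip": "12345", "salary": 62000},
    {"age": 52, "zip": "67890", "salary": 71000},
    {"age": 58, "zip": "67890", "salary": 48000},
]
qi = ["age", "zip"]

# Before generalization every (age, zip) pair is unique.
raw_ok = is_k_anonymous(records, qi, k=2)

# After generalizing age, each record shares its QI values with one other.
generalized = [generalize_age(r) for r in records]
generalized_ok = is_k_anonymous(generalized, qi, k=2)

print(raw_ok, generalized_ok)
```

In this sketch only the quasi-identifiers are generalized, which mirrors the description above; a real anonymizer would also choose how far to generalize each attribute rather than use a fixed bucket width.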