Add dataset privacy risk assessment example notebook. (#73)

* Add dataset assessment notebook and reference to module from project README

Signed-off-by: Maya Anderson <mayaa@il.ibm.com>
This commit is contained in:
andersonm-ibm 2023-05-04 12:21:42 +03:00 committed by GitHub
parent dbb958f791
commit 782edabd58
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
4 changed files with 410 additions and 1 deletions

View file

@ -16,6 +16,9 @@ minimization principle in GDPR for ML models. It enables to reduce the amount of
personal data needed to perform predictions with a machine learning model, while still enabling the model
to make accurate predictions. This is done by by removing or generalizing some of the input features.
The [**dataset assessment**](apt/risk/data_assessment/README.md) module implements a tool for privacy assessment of
synthetic datasets that are to be used in AI model training.
Official ai-privacy-toolkit documentation: https://ai-privacy-toolkit.readthedocs.io/en/latest/
Installation: pip install ai-privacy-toolkit

View file

@ -87,7 +87,6 @@ class DatasetAttackMembership(DatasetAttack):
labels = np.concatenate((np.zeros((len(non_member_probabilities),)), np.ones((len(member_probabilities),))))
results = np.concatenate((non_member_probabilities, member_probabilities))
svc_disp = RocCurveDisplay.from_predictions(labels, results)
svc_disp.plot()
plt.plot([0, 1], [0, 1], color="navy", linewidth=2, linestyle="--", label='No skills')
plt.title('ROC curve')
plt.savefig(f'{filename_prefix}{dataset_name}_roc_curve.png')

File diff suppressed because one or more lines are too long

View file

@ -13,3 +13,8 @@ tensorflow==2.8.3
xgboost==1.7.2
Pillow==9.3.0
sortedcontainers==2.4.0
#notebooks
notebook
jupyter
ipywidgets