mirror of
https://github.com/IBM/ai-privacy-toolkit.git
synced 2026-04-25 04:46:21 +02:00
* keras wrapper + blackbox classifier wrapper (fix #7) * fix error in NCP calculation * Update notebooks * Fix #25 (incorrect attack_feature indexes for social feature in notebook) * Consistent naming of internal parameters
1197 lines
37 KiB
Text
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Using ML anonymization to defend against attribute inference attacks"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"In this tutorial we will show how to anonymize models using the ML anonymization module. \n",
|
||
"\n",
|
||
"We will demonstrate running inference attacks both on a vanilla model, and then on different anonymized versions of the model. We will run both black-box and white-box attribute inference attacks using ART's inference module (https://github.com/Trusted-AI/adversarial-robustness-toolbox/tree/main/art/attacks/inference). \n",
|
||
"\n",
|
||
"This will be demonstarted using the Nursery dataset (original dataset can be found here: https://archive.ics.uci.edu/ml/datasets/nursery). \n",
|
||
"\n",
|
||
"The sensitive feature we are trying to infer is the 'social' feature, after turning it into a binary feature (the original value 'problematic' receives the new value 1 and the rest 0). We also preprocess the data such that all categorical features are one-hot encoded."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Load data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 121,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>parents</th>\n",
|
||
" <th>has_nurs</th>\n",
|
||
" <th>form</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>housing</th>\n",
|
||
" <th>finance</th>\n",
|
||
" <th>social</th>\n",
|
||
" <th>health</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>8450</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>very_crit</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12147</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>very_crit</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>inconv</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2780</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11924</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>59</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>proper</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5193</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>inconv</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1375</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>incomplete</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>priority</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10318</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>priority</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6396</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>improper</td>\n",
|
||
" <td>completed</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>485</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>proper</td>\n",
|
||
" <td>incomplete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>inconv</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>10366 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" parents has_nurs form children housing finance \\\n",
|
||
"8450 pretentious very_crit foster 1 less_conv convenient \n",
|
||
"12147 great_pret very_crit complete 1 critical inconv \n",
|
||
"2780 usual critical complete 4 less_conv convenient \n",
|
||
"11924 great_pret critical foster 1 critical convenient \n",
|
||
"59 usual proper complete 2 convenient convenient \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"5193 pretentious less_proper complete 1 convenient inconv \n",
|
||
"1375 usual less_proper incomplete 2 less_conv convenient \n",
|
||
"10318 great_pret less_proper foster 4 convenient convenient \n",
|
||
"6396 pretentious improper completed 3 less_conv convenient \n",
|
||
"485 usual proper incomplete 1 critical inconv \n",
|
||
"\n",
|
||
" social health \n",
|
||
"8450 1 not_recom \n",
|
||
"12147 1 recommended \n",
|
||
"2780 1 not_recom \n",
|
||
"11924 1 not_recom \n",
|
||
"59 0 not_recom \n",
|
||
"... ... ... \n",
|
||
"5193 0 recommended \n",
|
||
"1375 1 priority \n",
|
||
"10318 0 priority \n",
|
||
"6396 1 recommended \n",
|
||
"485 1 not_recom \n",
|
||
"\n",
|
||
"[10366 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 121,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"import os\n",
|
||
"import sys\n",
|
||
"sys.path.insert(0, os.path.abspath('..'))\n",
|
||
"\n",
|
||
"from apt.utils.dataset_utils import get_nursery_dataset_pd\n",
|
||
"\n",
|
||
"(x_train, y_train), (x_test, y_test) = get_nursery_dataset_pd(transform_social=True)\n",
|
||
"\n",
|
||
"x_train"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Train decision tree model"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 122,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Base model accuracy: 0.9969135802469136\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"from sklearn.tree import DecisionTreeClassifier\n",
|
||
"from art.estimators.classification.scikitlearn import ScikitlearnDecisionTreeClassifier\n",
|
||
"from sklearn.preprocessing import OneHotEncoder\n",
|
||
"from sklearn.compose import ColumnTransformer\n",
|
||
"from sklearn.impute import SimpleImputer\n",
|
||
"from sklearn.pipeline import Pipeline\n",
|
||
"\n",
|
||
"numeric_features = ['social']\n",
|
||
"categorical_features = ['children', 'parents', 'has_nurs', 'form', 'housing', 'finance', 'health']\n",
|
||
"numeric_transformer = Pipeline(\n",
|
||
" steps=[('imputer', SimpleImputer(strategy='constant', fill_value=0))]\n",
|
||
")\n",
|
||
"categorical_transformer = OneHotEncoder(handle_unknown=\"ignore\", sparse=False)\n",
|
||
"preprocessor = ColumnTransformer(\n",
|
||
" transformers=[\n",
|
||
" (\"num\", numeric_transformer, numeric_features),\n",
|
||
" (\"cat\", categorical_transformer, categorical_features),\n",
|
||
" ]\n",
|
||
")\n",
|
||
"\n",
|
||
"train_encoded = preprocessor.fit_transform(x_train)\n",
|
||
"test_encoded = preprocessor.transform(x_test)\n",
|
||
" \n",
|
||
"model = DecisionTreeClassifier()\n",
|
||
"model.fit(train_encoded, y_train)\n",
|
||
"\n",
|
||
"art_classifier = ScikitlearnDecisionTreeClassifier(model)\n",
|
||
"\n",
|
||
"print('Base model accuracy: ', model.score(test_encoded, y_test))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Attack\n",
|
||
"### Black-box attack\n",
|
||
"The black-box attack basically trains an additional classifier (called the attack model) to predict the attacked feature's value from the remaining n-1 features as well as the original (attacked) model's predictions.\n",
|
||
"#### Train attack model"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 123,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"from art.attacks.inference.attribute_inference import AttributeInferenceBlackBox\n",
|
||
"\n",
|
||
"# social feature after preprocessing\n",
|
||
"attack_feature = 0\n",
|
||
"\n",
|
||
"# training data without attacked feature\n",
|
||
"x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
|
||
"# only attacked feature\n",
|
||
"x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
|
||
"\n",
|
||
"bb_attack = AttributeInferenceBlackBox(art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get original model's predictions\n",
|
||
"x_train_predictions = np.array([np.argmax(arr) for arr in art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
|
||
"\n",
|
||
"# use half of training set for training the attack\n",
|
||
"attack_train_ratio = 0.5\n",
|
||
"attack_train_size = int(len(train_encoded) * attack_train_ratio)\n",
|
||
"\n",
|
||
"# train attack model\n",
|
||
"bb_attack.fit(train_encoded[:attack_train_size])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Infer sensitive feature and check accuracy"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 124,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.6000385876905268\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# get inferred values\n",
|
||
"values=[0, 1]\n",
|
||
"\n",
|
||
"inferred_train_bb = bb_attack.infer(x_train_for_attack[attack_train_size:], pred=x_train_predictions[attack_train_size:], values=values)\n",
|
||
"# check accuracy\n",
|
||
"train_acc = np.sum(inferred_train_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_bb)\n",
|
||
"print(train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"This means that for 60% of the training set, the attacked feature is inferred correctly using this attack."
|
||
]
|
||
},
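{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reading aid (not part of the original analysis), the attack accuracy printed above can be written as a formula. Here $m$ is the number of records the attack is evaluated on, $a_j$ is the true value of the attacked feature for record $j$, and $\\hat{a}_j$ is the value inferred by the attack; these symbols are introduced only for this illustration.\n",
"\n",
"$$\\text{accuracy} = \\frac{1}{m} \\sum_{j=1}^{m} \\mathbb{1}[\\hat{a}_j = a_j]$$\n",
"\n",
"For comparison, always guessing the value 0 would give an accuracy of roughly $6925 / 10366 \\approx 0.67$ on the full training set, assuming the priors passed to the white-box attack in this notebook reflect the true counts of each value."
]
},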
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Whitebox attack\n",
|
||
"This attack does not train any additional model, it simply uses additional information coded within the attacked decision tree model to compute the probability of each value of the attacked feature and outputs the value with the highest probability."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 125,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.6980513216284006\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"from art.attacks.inference.attribute_inference import AttributeInferenceWhiteBoxDecisionTree\n",
|
||
"\n",
|
||
"priors = [6925 / 10366, 3441 / 10366]\n",
|
||
"\n",
|
||
"wb2_attack = AttributeInferenceWhiteBoxDecisionTree(art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get inferred values\n",
|
||
"inferred_train_wb2 = wb2_attack.infer(x_train_for_attack, x_train_predictions, values=values, priors=priors)\n",
|
||
"\n",
|
||
"# check accuracy\n",
|
||
"train_acc = np.sum(inferred_train_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_wb2)\n",
|
||
"print(train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The white-box attack is able to correctly infer the attacked feature value in 69% of the training set. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Anonymized data\n",
|
||
"## k=100\n",
|
||
"\n",
|
||
"Now we will apply the same attacks on an anonymized version of the same dataset (k=100). The data is anonymized on the quasi-identifiers: finance, social, health.\n",
|
||
"\n",
|
||
"k=100 means that each record in the anonymized dataset is identical to 99 others on the quasi-identifier values (i.e., when looking only at those 3 feature, the records are indistinguishable)."
|
||
]
|
||
},
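{
"cell_type": "markdown",
"metadata": {},
"source": [
"The k-anonymity property described above can be stated a little more formally. This is a sketch of the standard definition rather than a description of the toolkit's internals; the symbols $D'$ (the anonymized dataset), $Q$ (the set of quasi-identifiers) and $r[Q]$ (the values of record $r$ on $Q$) are introduced here for illustration only.\n",
"\n",
"$$\\forall r \\in D': \\quad \\left| \\lbrace r' \\in D' : r'[Q] = r[Q] \\rbrace \\right| \\geq k$$\n",
"\n",
"With $k = 100$ and the quasi-identifiers finance, social and health, every combination of these three values that appears in the anonymized data would be shared by at least 100 records."
]
},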
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>parents</th>\n",
|
||
" <th>has_nurs</th>\n",
|
||
" <th>form</th>\n",
|
||
" <th>children</th>\n",
|
||
" <th>housing</th>\n",
|
||
" <th>finance</th>\n",
|
||
" <th>social</th>\n",
|
||
" <th>health</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>very_crit</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>very_crit</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>inconv</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>proper</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10361</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>complete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10362</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>incomplete</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>inconv</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>priority</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10363</th>\n",
|
||
" <td>great_pret</td>\n",
|
||
" <td>less_proper</td>\n",
|
||
" <td>foster</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>priority</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10364</th>\n",
|
||
" <td>pretentious</td>\n",
|
||
" <td>improper</td>\n",
|
||
" <td>completed</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>less_conv</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>recommended</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10365</th>\n",
|
||
" <td>usual</td>\n",
|
||
" <td>proper</td>\n",
|
||
" <td>incomplete</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>critical</td>\n",
|
||
" <td>convenient</td>\n",
|
||
" <td>0</td>\n",
|
||
" <td>not_recom</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>10366 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" parents has_nurs form children housing finance \\\n",
|
||
"0 pretentious very_crit foster 1 less_conv convenient \n",
|
||
"1 great_pret very_crit complete 1 critical inconv \n",
|
||
"2 usual critical complete 4 less_conv convenient \n",
|
||
"3 great_pret critical foster 1 critical convenient \n",
|
||
"4 usual proper complete 2 convenient convenient \n",
|
||
"... ... ... ... ... ... ... \n",
|
||
"10361 pretentious less_proper complete 1 convenient convenient \n",
|
||
"10362 usual less_proper incomplete 2 less_conv inconv \n",
|
||
"10363 great_pret less_proper foster 4 convenient convenient \n",
|
||
"10364 pretentious improper completed 3 less_conv convenient \n",
|
||
"10365 usual proper incomplete 1 critical convenient \n",
|
||
"\n",
|
||
" social health \n",
|
||
"0 0 not_recom \n",
|
||
"1 1 recommended \n",
|
||
"2 0 not_recom \n",
|
||
"3 0 not_recom \n",
|
||
"4 0 not_recom \n",
|
||
"... ... ... \n",
|
||
"10361 0 recommended \n",
|
||
"10362 0 priority \n",
|
||
"10363 0 priority \n",
|
||
"10364 0 recommended \n",
|
||
"10365 0 not_recom \n",
|
||
"\n",
|
||
"[10366 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 126,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"from apt.utils.datasets import ArrayDataset\n",
|
||
"from apt.anonymization import Anonymize\n",
|
||
"\n",
|
||
"features = x_train.columns\n",
|
||
"QI = [\"finance\", \"social\", \"health\"]\n",
|
||
"\n",
|
||
"anonymizer = Anonymize(100, QI, categorical_features=categorical_features)\n",
|
||
"anon = anonymizer.anonymize(ArrayDataset(x_train, x_train_predictions))\n",
|
||
"anon"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 127,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"7585"
|
||
]
|
||
},
|
||
"execution_count": 127,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# number of distinct rows in original data\n",
|
||
"len(x_train.drop_duplicates())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 128,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"3001"
|
||
]
|
||
},
|
||
"execution_count": 128,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# number of distinct rows in anonymized data\n",
|
||
"len(anon.drop_duplicates())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Train decision tree model"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 129,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anonymized model accuracy: 0.9054783950617284\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"anon_encoded = preprocessor.fit_transform(anon)\n",
|
||
"test_encoded = preprocessor.transform(x_test)\n",
|
||
"\n",
|
||
"anon_model = DecisionTreeClassifier()\n",
|
||
"anon_model.fit(anon_encoded, y_train)\n",
|
||
"\n",
|
||
"anon_art_classifier = ScikitlearnDecisionTreeClassifier(anon_model)\n",
|
||
"\n",
|
||
"print('Anonymized model accuracy: ', anon_model.score(test_encoded, y_test))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Attack\n",
|
||
"### Black-box attack"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 130,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.5813235577850666\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# training data without attacked feature\n",
|
||
"x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
|
||
"# only attacked feature\n",
|
||
"x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
|
||
"\n",
|
||
"anon_bb_attack = AttributeInferenceBlackBox(anon_art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get original model's predictions\n",
|
||
"anon_x_train_predictions = np.array([np.argmax(arr) for arr in anon_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
|
||
"\n",
|
||
"# train attack model\n",
|
||
"anon_bb_attack.fit(train_encoded[:attack_train_size])\n",
|
||
"\n",
|
||
"# get inferred values\n",
|
||
"inferred_train_anon_bb = anon_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon_x_train_predictions[attack_train_size:], values=values)\n",
|
||
"# check accuracy\n",
|
||
"train_acc = np.sum(inferred_train_anon_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon_bb)\n",
|
||
"print(train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### White box attack"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 131,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.6857032606598495\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"anon_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon_art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get inferred values\n",
|
||
"inferred_train_anon_wb2 = anon_wb2_attack.infer(x_train_for_attack, anon_x_train_predictions, values=values, priors=priors)\n",
|
||
"\n",
|
||
"# check accuracy\n",
|
||
"anon_train_acc = np.sum(inferred_train_anon_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
|
||
"print(anon_train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The accuracy of the attacks remains more or less the same. Let's check the precision and recall for each case:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 132,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"(0.3353658536585366, 0.22540983606557377)\n",
|
||
"(0.3354908306364617, 0.18208430913348947)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"def calc_precision_recall(predicted, actual, positive_value=1):\n",
|
||
" score = 0 # both predicted and actual are positive\n",
|
||
" num_positive_predicted = 0 # predicted positive\n",
|
||
" num_positive_actual = 0 # actual positive\n",
|
||
" for i in range(len(predicted)):\n",
|
||
" if predicted[i] == positive_value:\n",
|
||
" num_positive_predicted += 1\n",
|
||
" if actual[i] == positive_value:\n",
|
||
" num_positive_actual += 1\n",
|
||
" if predicted[i] == actual[i]:\n",
|
||
" if predicted[i] == positive_value:\n",
|
||
" score += 1\n",
|
||
" \n",
|
||
" if num_positive_predicted == 0:\n",
|
||
" precision = 1\n",
|
||
" else:\n",
|
||
" precision = score / num_positive_predicted # the fraction of predicted “Yes” responses that are correct\n",
|
||
" if num_positive_actual == 0:\n",
|
||
" recall = 1\n",
|
||
" else:\n",
|
||
" recall = score / num_positive_actual # the fraction of “Yes” responses that are predicted correctly\n",
|
||
"\n",
|
||
" return precision, recall\n",
|
||
" \n",
|
||
"# black-box regular\n",
|
||
"print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
|
||
"# black-box anonymized\n",
|
||
"print(calc_precision_recall(inferred_train_anon_bb, x_train_feature))"
|
||
]
|
||
},
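{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the quantities computed by `calc_precision_recall` above correspond to the usual definitions for the positive value (1). The symbols $TP$, $FP$ and $FN$ (true positives, false positives, false negatives) are shorthand introduced for this note and do not appear in the code: `score` plays the role of $TP$, `num_positive_predicted` of $TP + FP$, and `num_positive_actual` of $TP + FN$.\n",
"\n",
"$$\\text{precision} = \\frac{TP}{TP + FP}, \\qquad \\text{recall} = \\frac{TP}{TP + FN}$$\n",
"\n",
"When a denominator is zero, the function returns 1 for that quantity instead of dividing, so a precision of exactly 1 can simply mean that the attack never predicted the positive value."
]
},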
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 133,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"(0.6457357075913777, 0.2002324905550712)\n",
|
||
"(0.6384266263237519, 0.12263876780005813)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# white-box regular\n",
|
||
"print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
|
||
"# white-box anonymized\n",
|
||
"print(calc_precision_recall(inferred_train_anon_wb2, x_train_feature))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Precision and recall remain almost the same, sometimes even slightly increasing.\n",
|
||
"\n",
|
||
"Now let's see what happens when we increase k to 1000."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## k=1000\n",
|
||
"\n",
|
||
"Now we apply the attacks on an anonymized version of the same dataset (k=1000). The data has been anonymized on the quasi-identifiers: finance, social, health."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 134,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"anonymizer2 = Anonymize(1000, QI, categorical_features=categorical_features)\n",
|
||
"anon2 = anonymizer2.anonymize(ArrayDataset(x_train, x_train_predictions))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1727"
|
||
]
|
||
},
|
||
"execution_count": 135,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# number of distinct rows in anonymized data\n",
|
||
"len(anon2.drop_duplicates())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Train decision tree model"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 136,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Anonymized model accuracy: 0.8981481481481481\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"anon2_encoded = preprocessor.fit_transform(anon2)\n",
|
||
"test_encoded = preprocessor.transform(x_test)\n",
|
||
"\n",
|
||
"anon2_model = DecisionTreeClassifier()\n",
|
||
"anon2_model.fit(anon2_encoded, y_train)\n",
|
||
"\n",
|
||
"anon2_art_classifier = ScikitlearnDecisionTreeClassifier(anon2_model)\n",
|
||
"\n",
|
||
"print('Anonymized model accuracy: ', anon2_model.score(test_encoded, y_test))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Attack\n",
|
||
"### Black-box attack"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 137,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.546015820953116\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# training data without attacked feature\n",
|
||
"x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
|
||
"# only attacked feature\n",
|
||
"x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
|
||
"\n",
|
||
"anon2_bb_attack = AttributeInferenceBlackBox(anon2_art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get original model's predictions\n",
|
||
"anon2_x_train_predictions = np.array([np.argmax(arr) for arr in anon2_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
|
||
"\n",
|
||
"# train attack model\n",
|
||
"anon2_bb_attack.fit(train_encoded[:attack_train_size])\n",
|
||
"\n",
|
||
"# get inferred values\n",
|
||
"inferred_train_anon2_bb = anon2_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon2_x_train_predictions[attack_train_size:], values=values)\n",
|
||
"# check accuracy\n",
|
||
"train_acc = np.sum(inferred_train_anon2_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon2_bb)\n",
|
||
"print(train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### White box attack"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 138,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0.6680493922438742\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"anon2_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon2_art_classifier, attack_feature=attack_feature)\n",
|
||
"\n",
|
||
"# get inferred values\n",
|
||
"inferred_train_anon2_wb2 = anon2_wb2_attack.infer(x_train_for_attack, anon2_x_train_predictions, values=values, priors=priors)\n",
|
||
"\n",
|
||
"# check accuracy\n",
|
||
"train_acc = np.sum(inferred_train_anon2_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
|
||
"print(train_acc)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 139,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"(0.3353658536585366, 0.22540983606557377)\n",
|
||
"(0.32242990654205606, 0.16159250585480095)\n",
|
||
"(0.6457357075913777, 0.2002324905550712)\n",
|
||
"(1, 0.0)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# black-box regular\n",
|
||
"print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
|
||
"# black-box anonymized\n",
|
||
"print(calc_precision_recall(inferred_train_anon2_bb, x_train_feature))\n",
|
||
"\n",
|
||
"# white-box regular\n",
|
||
"print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
|
||
"# white-box anonymized\n",
|
||
"print(calc_precision_recall(inferred_train_anon2_wb2, x_train_feature))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"The accuracy of the black-box attack is slightly reduced, as well as the precision and recall in both attacks."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## k=100, all QI\n",
|
||
"Now let's see what happens if we define all 8 features in the Nursery dataset as quasi-identifiers."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 140,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"QI2 = [\"parents\", \"has_nurs\", \"form\", \"children\", \"housing\", \"finance\", \"social\", \"health\"]\n",
|
||
"anonymizer3 = Anonymize(100, QI2, categorical_features=categorical_features)\n",
|
||
"anon3 = anonymizer3.anonymize(ArrayDataset(x_train, x_train_predictions))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 141,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"39"
|
||
]
|
||
},
|
||
"execution_count": 141,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# number of distinct rows in anonymized data\n",
"len(anon3.drop_duplicates())"
]
},
{
"cell_type": "code",
"execution_count": 142,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Anonymized model accuracy: 0.7600308641975309\n",
"BB attack accuracy: 0.5330889446266641\n",
"WB attack accuracy: 0.6680493922438742\n"
]
}
],
"source": [
"anon3_encoded = preprocessor.fit_transform(anon3)\n",
"test_encoded = preprocessor.transform(x_test)\n",
"\n",
"anon3_model = DecisionTreeClassifier()\n",
"anon3_model.fit(anon3_encoded, y_train)\n",
"\n",
"anon3_art_classifier = ScikitlearnDecisionTreeClassifier(anon3_model)\n",
"\n",
"print('Anonymized model accuracy: ', anon3_model.score(test_encoded, y_test))\n",
"\n",
"# training data without attacked feature\n",
"x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
"# only attacked feature\n",
"x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
"\n",
"anon3_bb_attack = AttributeInferenceBlackBox(anon3_art_classifier, attack_feature=attack_feature)\n",
"\n",
"# get original model's predictions\n",
"anon3_x_train_predictions = np.array([np.argmax(arr) for arr in anon3_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
"\n",
"# train attack model\n",
"anon3_bb_attack.fit(train_encoded[:attack_train_size])\n",
"\n",
"# get inferred values\n",
"inferred_train_anon3_bb = anon3_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon3_x_train_predictions[attack_train_size:], values=values)\n",
"# check accuracy\n",
"train_acc = np.sum(inferred_train_anon3_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon2_bb)\n",
"print('BB attack accuracy: ', train_acc)\n",
"\n",
"anon3_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon3_art_classifier, attack_feature=attack_feature)\n",
"\n",
"# get inferred values\n",
"inferred_train_anon3_wb2 = anon3_wb2_attack.infer(x_train_for_attack, anon3_x_train_predictions, values=values, priors=priors)\n",
"\n",
"# check accuracy\n",
"train_acc = np.sum(inferred_train_anon3_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
"print('WB attack accuracy: ', train_acc)"
]
},
{
"cell_type": "code",
"execution_count": 143,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(0.3353658536585366, 0.22540983606557377)\n",
"(0.344644750795334, 0.19028103044496486)\n",
"(0.6457357075913777, 0.2002324905550712)\n",
"(1, 0.0)\n"
]
}
],
"source": [
"# black-box regular\n",
"print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
"# black-box anonymized\n",
"print(calc_precision_recall(inferred_train_anon3_bb, x_train_feature))\n",
"\n",
"# white-box regular\n",
"print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
"# white-box anonymized\n",
"print(calc_precision_recall(inferred_train_anon3_wb2, x_train_feature))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Accuracy of both attacks has decreased. Precision and recall remain roughly the same in the black-box case. \n",
"\n",
"*In the anonymized version of the white-box attack, no records were predicted with the positive value for the attacked feature."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}