ai-privacy-toolkit/notebooks/attribute_inference_anonymization_nursery.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using ML anonymization to defend against attribute inference attacks"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial we will show how to anonymize models using the ML anonymization module. \n",
    "\n",
    "We will demonstrate running inference attacks both on a vanilla model, and then on different anonymized versions of the model. We will run both black-box and white-box attribute inference attacks using ART's inference module (https://github.com/Trusted-AI/adversarial-robustness-toolbox/tree/main/art/attacks/inference). \n",
    "\n",
    "This will be demonstarted using the Nursery dataset (original dataset can be found here: https://archive.ics.uci.edu/ml/datasets/nursery). \n",
    "\n",
    "The sensitive feature we are trying to infer is the 'social' feature, after turning it into a binary feature (the original value 'problematic' receives the new value 1 and the rest 0). We also preprocess the data such that all categorical features are one-hot encoded."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 121,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parents</th>\n",
       "      <th>has_nurs</th>\n",
       "      <th>form</th>\n",
       "      <th>children</th>\n",
       "      <th>housing</th>\n",
       "      <th>finance</th>\n",
       "      <th>social</th>\n",
       "      <th>health</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>8450</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>very_crit</td>\n",
       "      <td>foster</td>\n",
       "      <td>1</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>1</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12147</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>very_crit</td>\n",
       "      <td>complete</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>inconv</td>\n",
       "      <td>1</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2780</th>\n",
       "      <td>usual</td>\n",
       "      <td>critical</td>\n",
       "      <td>complete</td>\n",
       "      <td>4</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>1</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11924</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>critical</td>\n",
       "      <td>foster</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>convenient</td>\n",
       "      <td>1</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>59</th>\n",
       "      <td>usual</td>\n",
       "      <td>proper</td>\n",
       "      <td>complete</td>\n",
       "      <td>2</td>\n",
       "      <td>convenient</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5193</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>complete</td>\n",
       "      <td>1</td>\n",
       "      <td>convenient</td>\n",
       "      <td>inconv</td>\n",
       "      <td>0</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1375</th>\n",
       "      <td>usual</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>incomplete</td>\n",
       "      <td>2</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>1</td>\n",
       "      <td>priority</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10318</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>foster</td>\n",
       "      <td>4</td>\n",
       "      <td>convenient</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>priority</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6396</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>improper</td>\n",
       "      <td>completed</td>\n",
       "      <td>3</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>1</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>485</th>\n",
       "      <td>usual</td>\n",
       "      <td>proper</td>\n",
       "      <td>incomplete</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>inconv</td>\n",
       "      <td>1</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10366 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           parents     has_nurs        form children     housing     finance  \\\n",
       "8450   pretentious    very_crit      foster        1   less_conv  convenient   \n",
       "12147   great_pret    very_crit    complete        1    critical      inconv   \n",
       "2780         usual     critical    complete        4   less_conv  convenient   \n",
       "11924   great_pret     critical      foster        1    critical  convenient   \n",
       "59           usual       proper    complete        2  convenient  convenient   \n",
       "...            ...          ...         ...      ...         ...         ...   \n",
       "5193   pretentious  less_proper    complete        1  convenient      inconv   \n",
       "1375         usual  less_proper  incomplete        2   less_conv  convenient   \n",
       "10318   great_pret  less_proper      foster        4  convenient  convenient   \n",
       "6396   pretentious     improper   completed        3   less_conv  convenient   \n",
       "485          usual       proper  incomplete        1    critical      inconv   \n",
       "\n",
       "       social       health  \n",
       "8450        1    not_recom  \n",
       "12147       1  recommended  \n",
       "2780        1    not_recom  \n",
       "11924       1    not_recom  \n",
       "59          0    not_recom  \n",
       "...       ...          ...  \n",
       "5193        0  recommended  \n",
       "1375        1     priority  \n",
       "10318       0     priority  \n",
       "6396        1  recommended  \n",
       "485         1    not_recom  \n",
       "\n",
       "[10366 rows x 8 columns]"
      ]
     },
     "execution_count": 121,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import os\n",
    "import sys\n",
    "sys.path.insert(0, os.path.abspath('..'))\n",
    "\n",
    "from apt.utils.dataset_utils import get_nursery_dataset_pd\n",
    "\n",
    "(x_train, y_train), (x_test, y_test) = get_nursery_dataset_pd(transform_social=True)\n",
    "\n",
    "x_train"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train decision tree model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Base model accuracy:  0.9969135802469136\n"
     ]
    }
   ],
   "source": [
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from art.estimators.classification.scikitlearn import ScikitlearnDecisionTreeClassifier\n",
    "from sklearn.preprocessing import OneHotEncoder\n",
    "from sklearn.compose import ColumnTransformer\n",
    "from sklearn.impute import SimpleImputer\n",
    "from sklearn.pipeline import Pipeline\n",
    "\n",
    "numeric_features = ['social']\n",
    "categorical_features = ['children', 'parents', 'has_nurs', 'form', 'housing', 'finance', 'health']\n",
    "numeric_transformer = Pipeline(\n",
    "    steps=[('imputer', SimpleImputer(strategy='constant', fill_value=0))]\n",
    ")\n",
    "categorical_transformer = OneHotEncoder(handle_unknown=\"ignore\", sparse=False)\n",
    "preprocessor = ColumnTransformer(\n",
    "    transformers=[\n",
    "        (\"num\", numeric_transformer, numeric_features),\n",
    "        (\"cat\", categorical_transformer, categorical_features),\n",
    "    ]\n",
    ")\n",
    "\n",
    "train_encoded = preprocessor.fit_transform(x_train)\n",
    "test_encoded = preprocessor.transform(x_test)\n",
    "    \n",
    "model = DecisionTreeClassifier()\n",
    "model.fit(train_encoded, y_train)\n",
    "\n",
    "art_classifier = ScikitlearnDecisionTreeClassifier(model)\n",
    "\n",
    "print('Base model accuracy: ', model.score(test_encoded, y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Attack\n",
    "### Black-box attack\n",
    "The black-box attack basically trains an additional classifier (called the attack model) to predict the attacked feature's value from the remaining n-1 features as well as the original (attacked) model's predictions.\n",
    "#### Train attack model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from art.attacks.inference.attribute_inference import AttributeInferenceBlackBox\n",
    "\n",
    "# social feature after preprocessing\n",
    "attack_feature = 0\n",
    "\n",
    "# training data without attacked feature\n",
    "x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
    "# only attacked feature\n",
    "x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
    "\n",
    "bb_attack = AttributeInferenceBlackBox(art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get original model's predictions\n",
    "x_train_predictions = np.array([np.argmax(arr) for arr in art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
    "\n",
    "# use half of training set for training the attack\n",
    "attack_train_ratio = 0.5\n",
    "attack_train_size = int(len(train_encoded) * attack_train_ratio)\n",
    "\n",
    "# train attack model\n",
    "bb_attack.fit(train_encoded[:attack_train_size])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Infer sensitive feature and check accuracy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 124,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.6000385876905268\n"
     ]
    }
   ],
   "source": [
    "# get inferred values\n",
    "values=[0, 1]\n",
    "\n",
    "inferred_train_bb = bb_attack.infer(x_train_for_attack[attack_train_size:], pred=x_train_predictions[attack_train_size:], values=values)\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_bb)\n",
    "print(train_acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This means that for 60% of the training set, the attacked feature is inferred correctly using this attack."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Whitebox attack\n",
    "This attack does not train any additional model, it simply uses additional information coded within the attacked decision tree model to compute the probability of each value of the attacked feature and outputs the value with the highest probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 125,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.6980513216284006\n"
     ]
    }
   ],
   "source": [
    "from art.attacks.inference.attribute_inference import AttributeInferenceWhiteBoxDecisionTree\n",
    "\n",
    "priors = [6925 / 10366, 3441 / 10366]\n",
    "\n",
    "wb2_attack = AttributeInferenceWhiteBoxDecisionTree(art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_wb2 = wb2_attack.infer(x_train_for_attack, x_train_predictions, values=values, priors=priors)\n",
    "\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_wb2)\n",
    "print(train_acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The white-box attack is able to correctly infer the attacked feature value in 69% of the training set. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Anonymized data\n",
    "## k=100\n",
    "\n",
    "Now we will apply the same attacks on an anonymized version of the same dataset (k=100). The data is anonymized on the quasi-identifiers: finance, social, health.\n",
    "\n",
    "k=100 means that each record in the anonymized dataset is identical to 99 others on the quasi-identifier values (i.e., when looking only at those 3 feature, the records are indistinguishable)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>parents</th>\n",
       "      <th>has_nurs</th>\n",
       "      <th>form</th>\n",
       "      <th>children</th>\n",
       "      <th>housing</th>\n",
       "      <th>finance</th>\n",
       "      <th>social</th>\n",
       "      <th>health</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>very_crit</td>\n",
       "      <td>foster</td>\n",
       "      <td>1</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>very_crit</td>\n",
       "      <td>complete</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>inconv</td>\n",
       "      <td>1</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>usual</td>\n",
       "      <td>critical</td>\n",
       "      <td>complete</td>\n",
       "      <td>4</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>critical</td>\n",
       "      <td>foster</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>usual</td>\n",
       "      <td>proper</td>\n",
       "      <td>complete</td>\n",
       "      <td>2</td>\n",
       "      <td>convenient</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10361</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>complete</td>\n",
       "      <td>1</td>\n",
       "      <td>convenient</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10362</th>\n",
       "      <td>usual</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>incomplete</td>\n",
       "      <td>2</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>inconv</td>\n",
       "      <td>0</td>\n",
       "      <td>priority</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10363</th>\n",
       "      <td>great_pret</td>\n",
       "      <td>less_proper</td>\n",
       "      <td>foster</td>\n",
       "      <td>4</td>\n",
       "      <td>convenient</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>priority</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10364</th>\n",
       "      <td>pretentious</td>\n",
       "      <td>improper</td>\n",
       "      <td>completed</td>\n",
       "      <td>3</td>\n",
       "      <td>less_conv</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>recommended</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10365</th>\n",
       "      <td>usual</td>\n",
       "      <td>proper</td>\n",
       "      <td>incomplete</td>\n",
       "      <td>1</td>\n",
       "      <td>critical</td>\n",
       "      <td>convenient</td>\n",
       "      <td>0</td>\n",
       "      <td>not_recom</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>10366 rows × 8 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "           parents     has_nurs        form children     housing     finance  \\\n",
       "0      pretentious    very_crit      foster        1   less_conv  convenient   \n",
       "1       great_pret    very_crit    complete        1    critical      inconv   \n",
       "2            usual     critical    complete        4   less_conv  convenient   \n",
       "3       great_pret     critical      foster        1    critical  convenient   \n",
       "4            usual       proper    complete        2  convenient  convenient   \n",
       "...            ...          ...         ...      ...         ...         ...   \n",
       "10361  pretentious  less_proper    complete        1  convenient  convenient   \n",
       "10362        usual  less_proper  incomplete        2   less_conv      inconv   \n",
       "10363   great_pret  less_proper      foster        4  convenient  convenient   \n",
       "10364  pretentious     improper   completed        3   less_conv  convenient   \n",
       "10365        usual       proper  incomplete        1    critical  convenient   \n",
       "\n",
       "      social       health  \n",
       "0          0    not_recom  \n",
       "1          1  recommended  \n",
       "2          0    not_recom  \n",
       "3          0    not_recom  \n",
       "4          0    not_recom  \n",
       "...      ...          ...  \n",
       "10361      0  recommended  \n",
       "10362      0     priority  \n",
       "10363      0     priority  \n",
       "10364      0  recommended  \n",
       "10365      0    not_recom  \n",
       "\n",
       "[10366 rows x 8 columns]"
      ]
     },
     "execution_count": 126,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from apt.utils.datasets import ArrayDataset\n",
    "from apt.anonymization import Anonymize\n",
    "\n",
    "features = x_train.columns\n",
    "QI = [\"finance\", \"social\", \"health\"]\n",
    "\n",
    "anonymizer = Anonymize(100, QI, categorical_features=categorical_features)\n",
    "anon = anonymizer.anonymize(ArrayDataset(x_train, x_train_predictions))\n",
    "anon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7585"
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# number of distinct rows in original data\n",
    "len(x_train.drop_duplicates())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 128,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3001"
      ]
     },
     "execution_count": 128,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# number of distinct rows in anonymized data\n",
    "len(anon.drop_duplicates())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train decision tree model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Anonymized model accuracy:  0.9054783950617284\n"
     ]
    }
   ],
   "source": [
    "anon_encoded = preprocessor.fit_transform(anon)\n",
    "test_encoded = preprocessor.transform(x_test)\n",
    "\n",
    "anon_model = DecisionTreeClassifier()\n",
    "anon_model.fit(anon_encoded, y_train)\n",
    "\n",
    "anon_art_classifier = ScikitlearnDecisionTreeClassifier(anon_model)\n",
    "\n",
    "print('Anonymized model accuracy: ', anon_model.score(test_encoded, y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Attack\n",
    "### Black-box attack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5813235577850666\n"
     ]
    }
   ],
   "source": [
    "# training data without attacked feature\n",
    "x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
    "# only attacked feature\n",
    "x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
    "\n",
    "anon_bb_attack = AttributeInferenceBlackBox(anon_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get original model's predictions\n",
    "anon_x_train_predictions = np.array([np.argmax(arr) for arr in anon_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
    "\n",
    "# train attack model\n",
    "anon_bb_attack.fit(train_encoded[:attack_train_size])\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon_bb = anon_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon_x_train_predictions[attack_train_size:], values=values)\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_anon_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon_bb)\n",
    "print(train_acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### White box attack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 131,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.6857032606598495\n"
     ]
    }
   ],
   "source": [
    "anon_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon_wb2 = anon_wb2_attack.infer(x_train_for_attack, anon_x_train_predictions, values=values, priors=priors)\n",
    "\n",
    "# check accuracy\n",
    "anon_train_acc = np.sum(inferred_train_anon_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
    "print(anon_train_acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy of the attacks remains more or less the same. Let's check the precision and recall for each case:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 132,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0.3353658536585366, 0.22540983606557377)\n",
      "(0.3354908306364617, 0.18208430913348947)\n"
     ]
    }
   ],
   "source": [
    "def calc_precision_recall(predicted, actual, positive_value=1):\n",
    "    score = 0  # both predicted and actual are positive\n",
    "    num_positive_predicted = 0  # predicted positive\n",
    "    num_positive_actual = 0  # actual positive\n",
    "    for i in range(len(predicted)):\n",
    "        if predicted[i] == positive_value:\n",
    "            num_positive_predicted += 1\n",
    "        if actual[i] == positive_value:\n",
    "            num_positive_actual += 1\n",
    "        if predicted[i] == actual[i]:\n",
    "            if predicted[i] == positive_value:\n",
    "                score += 1\n",
    "    \n",
    "    if num_positive_predicted == 0:\n",
    "        precision = 1\n",
    "    else:\n",
    "        precision = score / num_positive_predicted  # the fraction of predicted “Yes” responses that are correct\n",
    "    if num_positive_actual == 0:\n",
    "        recall = 1\n",
    "    else:\n",
    "        recall = score / num_positive_actual  # the fraction of “Yes” responses that are predicted correctly\n",
    "\n",
    "    return precision, recall\n",
    "    \n",
    "# black-box regular\n",
    "print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
    "# black-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon_bb, x_train_feature))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0.6457357075913777, 0.2002324905550712)\n",
      "(0.6384266263237519, 0.12263876780005813)\n"
     ]
    }
   ],
   "source": [
    "# white-box regular\n",
    "print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
    "# white-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon_wb2, x_train_feature))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Precision and recall remain almost the same, sometimes even slightly increasing.\n",
    "\n",
    "Now let's see what happens when we increase k to 1000."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## k=1000\n",
    "\n",
    "Now we apply the attacks on an anonymized version of the same dataset (k=1000). The data has been anonymized on the quasi-identifiers: finance, social, health."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 134,
   "metadata": {},
   "outputs": [],
   "source": [
    "anonymizer2 = Anonymize(1000, QI, categorical_features=categorical_features)\n",
    "anon2 = anonymizer2.anonymize(ArrayDataset(x_train, x_train_predictions))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1727"
      ]
     },
     "execution_count": 135,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# number of distinct rows in anonymized data\n",
    "len(anon2.drop_duplicates())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train decision tree model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 136,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Anonymized model accuracy:  0.8981481481481481\n"
     ]
    }
   ],
   "source": [
    "anon2_encoded = preprocessor.fit_transform(anon2)\n",
    "test_encoded = preprocessor.transform(x_test)\n",
    "\n",
    "anon2_model = DecisionTreeClassifier()\n",
    "anon2_model.fit(anon2_encoded, y_train)\n",
    "\n",
    "anon2_art_classifier = ScikitlearnDecisionTreeClassifier(anon2_model)\n",
    "\n",
    "print('Anonymized model accuracy: ', anon2_model.score(test_encoded, y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Attack\n",
    "### Black-box attack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 137,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.546015820953116\n"
     ]
    }
   ],
   "source": [
    "# training data without attacked feature\n",
    "x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
    "# only attacked feature\n",
    "x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
    "\n",
    "anon2_bb_attack = AttributeInferenceBlackBox(anon2_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get original model's predictions\n",
    "anon2_x_train_predictions = np.array([np.argmax(arr) for arr in anon2_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
    "\n",
    "# train attack model\n",
    "anon2_bb_attack.fit(train_encoded[:attack_train_size])\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon2_bb = anon2_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon2_x_train_predictions[attack_train_size:], values=values)\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_anon2_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon2_bb)\n",
    "print(train_acc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### White box attack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 138,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.6680493922438742\n"
     ]
    }
   ],
   "source": [
    "anon2_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon2_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon2_wb2 = anon2_wb2_attack.infer(x_train_for_attack, anon2_x_train_predictions, values=values, priors=priors)\n",
    "\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_anon2_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
    "print(train_acc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 139,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0.3353658536585366, 0.22540983606557377)\n",
      "(0.32242990654205606, 0.16159250585480095)\n",
      "(0.6457357075913777, 0.2002324905550712)\n",
      "(1, 0.0)\n"
     ]
    }
   ],
   "source": [
    "# black-box regular\n",
    "print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
    "# black-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon2_bb, x_train_feature))\n",
    "\n",
    "# white-box regular\n",
    "print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
    "# white-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon2_wb2, x_train_feature))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy of the black-box attack is slightly reduced, as well as the precision and recall in both attacks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## k=100, all QI\n",
    "Now let's see what happens if we define all 8 features in the Nursery dataset as quasi-identifiers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 140,
   "metadata": {},
   "outputs": [],
   "source": [
    "QI2 = [\"parents\", \"has_nurs\", \"form\", \"children\", \"housing\", \"finance\", \"social\", \"health\"]\n",
    "anonymizer3 = Anonymize(100, QI2, categorical_features=categorical_features)\n",
    "anon3 = anonymizer3.anonymize(ArrayDataset(x_train, x_train_predictions))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 141,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "39"
      ]
     },
     "execution_count": 141,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# number of distinct rows in anonymized data\n",
    "len(anon3.drop_duplicates())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 142,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Anonymized model accuracy:  0.7600308641975309\n",
      "BB attack accuracy:  0.5330889446266641\n",
      "WB attack accuracy:  0.6680493922438742\n"
     ]
    }
   ],
   "source": [
    "anon3_encoded = preprocessor.fit_transform(anon3)\n",
    "test_encoded = preprocessor.transform(x_test)\n",
    "\n",
    "anon3_model = DecisionTreeClassifier()\n",
    "anon3_model.fit(anon3_encoded, y_train)\n",
    "\n",
    "anon3_art_classifier = ScikitlearnDecisionTreeClassifier(anon3_model)\n",
    "\n",
    "print('Anonymized model accuracy: ', anon3_model.score(test_encoded, y_test))\n",
    "\n",
    "# training data without attacked feature\n",
    "x_train_for_attack = np.delete(train_encoded, attack_feature, 1)\n",
    "# only attacked feature\n",
    "x_train_feature = train_encoded[:, attack_feature].copy().reshape(-1, 1)\n",
    "\n",
    "anon3_bb_attack = AttributeInferenceBlackBox(anon3_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get original model's predictions\n",
    "anon3_x_train_predictions = np.array([np.argmax(arr) for arr in anon3_art_classifier.predict(train_encoded)]).reshape(-1,1)\n",
    "\n",
    "# train attack model\n",
    "anon3_bb_attack.fit(train_encoded[:attack_train_size])\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon3_bb = anon3_bb_attack.infer(x_train_for_attack[attack_train_size:], pred=anon3_x_train_predictions[attack_train_size:], values=values)\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_anon3_bb == np.around(x_train_feature[attack_train_size:], decimals=8).reshape(1,-1)) / len(inferred_train_anon2_bb)\n",
    "print('BB attack accuracy: ', train_acc)\n",
    "\n",
    "anon3_wb2_attack = AttributeInferenceWhiteBoxDecisionTree(anon3_art_classifier, attack_feature=attack_feature)\n",
    "\n",
    "# get inferred values\n",
    "inferred_train_anon3_wb2 = anon3_wb2_attack.infer(x_train_for_attack, anon3_x_train_predictions, values=values, priors=priors)\n",
    "\n",
    "# check accuracy\n",
    "train_acc = np.sum(inferred_train_anon3_wb2 == np.around(x_train_feature, decimals=8).reshape(1,-1)) / len(inferred_train_anon_wb2)\n",
    "print('WB attack accuracy: ', train_acc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 143,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(0.3353658536585366, 0.22540983606557377)\n",
      "(0.344644750795334, 0.19028103044496486)\n",
      "(0.6457357075913777, 0.2002324905550712)\n",
      "(1, 0.0)\n"
     ]
    }
   ],
   "source": [
    "# black-box regular\n",
    "print(calc_precision_recall(inferred_train_bb, x_train_feature))\n",
    "# black-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon3_bb, x_train_feature))\n",
    "\n",
    "# white-box regular\n",
    "print(calc_precision_recall(inferred_train_wb2, x_train_feature))\n",
    "# white-box anonymized\n",
    "print(calc_precision_recall(inferred_train_anon3_wb2, x_train_feature))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Accuracy of both attacks has decreased. Precision and recall remain roughly the same in the black-box case. \n",
    "\n",
    "*In the anonymized version of the white-box attack, no records were predicted with the positive value for the attacked feature."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}