{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Using ML anonymization to defend against attribute inference attacks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial we will show how to anonymize models using the ML anonymization module. \n", "\n", "We will demonstrate running inference attacks first on a vanilla model and then on different anonymized versions of the model. We will run both black-box and white-box attribute inference attacks using ART's inference module (https://github.com/Trusted-AI/adversarial-robustness-toolbox/tree/main/art/attacks/inference). \n", "\n", "This will be demonstrated using the Nursery dataset (the original dataset can be found here: https://archive.ics.uci.edu/ml/datasets/nursery). \n", "\n", "The sensitive feature we are trying to infer is the 'social' feature, after turning it into a binary feature (the original value 'problematic' receives the new value 1 and the rest 0). We also preprocess the data such that all categorical features are one-hot encoded." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data" ] }, { "cell_type": "code", "execution_count": 121, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | parents | has_nurs | form | children | housing | finance | social | health |
|---|---|---|---|---|---|---|---|---|
| 8450 | pretentious | very_crit | foster | 1 | less_conv | convenient | 1 | not_recom |
| 12147 | great_pret | very_crit | complete | 1 | critical | inconv | 1 | recommended |
| 2780 | usual | critical | complete | 4 | less_conv | convenient | 1 | not_recom |
| 11924 | great_pret | critical | foster | 1 | critical | convenient | 1 | not_recom |
| 59 | usual | proper | complete | 2 | convenient | convenient | 0 | not_recom |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 5193 | pretentious | less_proper | complete | 1 | convenient | inconv | 0 | recommended |
| 1375 | usual | less_proper | incomplete | 2 | less_conv | convenient | 1 | priority |
| 10318 | great_pret | less_proper | foster | 4 | convenient | convenient | 0 | priority |
| 6396 | pretentious | improper | completed | 3 | less_conv | convenient | 1 | recommended |
| 485 | usual | proper | incomplete | 1 | critical | inconv | 1 | not_recom |

10366 rows × 8 columns

| | parents | has_nurs | form | children | housing | finance | social | health |
|---|---|---|---|---|---|---|---|---|
| 0 | pretentious | very_crit | foster | 1 | less_conv | convenient | 0 | not_recom |
| 1 | great_pret | very_crit | complete | 1 | critical | inconv | 1 | recommended |
| 2 | usual | critical | complete | 4 | less_conv | convenient | 0 | not_recom |
| 3 | great_pret | critical | foster | 1 | critical | convenient | 0 | not_recom |
| 4 | usual | proper | complete | 2 | convenient | convenient | 0 | not_recom |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 10361 | pretentious | less_proper | complete | 1 | convenient | convenient | 0 | recommended |
| 10362 | usual | less_proper | incomplete | 2 | less_conv | inconv | 0 | priority |
| 10363 | great_pret | less_proper | foster | 4 | convenient | convenient | 0 | priority |
| 10364 | pretentious | improper | completed | 3 | less_conv | convenient | 0 | recommended |
| 10365 | usual | proper | incomplete | 1 | critical | convenient | 0 | not_recom |

10366 rows × 8 columns
\n", "