Update READMEs with paper citations (#21)

2026-07-23 17:01:03 +02:00 · 2022-02-01 12:27:22 +02:00 · 2022-02-01 12:27:22 +02:00 · 9de078f937
commit 9de078f937
parent 3feebe8973
2 changed files with 23 additions and 12 deletions
--- a/apt/anonymization/README.md
+++ b/apt/anonymization/README.md
@@ -19,4 +19,11 @@ The following figure depicts the overall process:
 </p>
 <br />

+Citation
+--------
+Goldsteen A., Ezov G., Shmelkin R., Moffie M., Farkash A. (2022) Anonymizing Machine Learning Models. In: Garcia-Alfaro 
+J., Muñoz-Tapia J.L., Navarro-Arribas G., Soriano M. (eds) Data Privacy Management, Cryptocurrencies and Blockchain 
+Technology. DPM 2021, CBT 2021. Lecture Notes in Computer Science, vol 13140. Springer, Cham. 
+https://doi.org/10.1007/978-3-030-93944-1_8
+

--- a/apt/minimization/README.md
+++ b/apt/minimization/README.md
@@ -37,8 +37,7 @@ The current implementation supports numeric features and categorical features.
 Start by training your machine learning model. In this example, we will use a ``DecisionTreeClassifier``, but any 
 scikit-learn model can be used. We will use the iris dataset in our example.

-.. code:: python
-
+```
  from sklearn import datasets
  from sklearn.model_selection import train_test_split
  from sklearn.tree import DecisionTreeClassifier
@@ -48,36 +47,37 @@ scikit-learn model can be used. We will use the iris dataset in our example.

  base_est = DecisionTreeClassifier()
  base_est.fit(X_train, y_train)
+```

 Now create the ``GeneralizeToRepresentative`` transformer and train it. Supply it with the original model and the 
 desired target accuracy. The training process may receive the original labeled training data or the model's predictions 
 on the data.

-.. code:: python
-
+```
  predictions = base_est.predict(X_train)
  gen = GeneralizeToRepresentative(base_est, target_accuracy=0.9)
  gen.fit(X_train, predictions)
+```

 Now use the transformer to transform new data, for example the test data.

-.. code:: python
-
+```
  transformed = gen.transform(X_test)
+```

 The transformed data has the same columns and formats as the original data, so it can be used directly to derive 
 predictions from the original model.

-.. code:: python
-
+```
  new_predictions = base_est.predict(transformed)
-  
+```
+
 To export the resulting generalizations, retrieve the ``Transformer``'s ``_generalize`` parameter.

-.. code:: python
-
+```
  generalizations = gen._generalize
-  
+```
+
 The returned object has the following structure::

  {
@@ -103,6 +103,10 @@ Where each value inside the range list represents a cutoff point. For example, f
 this example are: ``<21.5, 21.5-39.0, 39.0-51.0, 51.0-70.5, >70.5``. The ``untouched`` list represents features that 
 were not generalized, i.e., their values should remain unchanged.

+Citation
+--------
+Goldsteen, A., Ezov, G., Shmelkin, R. et al. Data minimization for GDPR compliance in machine learning models. AI Ethics 
+(2021). https://doi.org/10.1007/s43681-021-00095-8
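
The minimization README describes each value in a range list as a cutoff point, giving the ranges ``<21.5, 21.5-39.0, 39.0-51.0, 51.0-70.5, >70.5`` as an example. The following is a minimal sketch of how such cutoffs could be mapped to range labels. It is not part of the toolkit: the helper ``range_label`` is hypothetical, and placing a value that equals a cutoff in the upper range is an assumption the README does not confirm.

```python
from bisect import bisect_right

# Cutoff points copied from the README's example ranges.
cutoffs = [21.5, 39.0, 51.0, 70.5]


def range_label(value, cutoffs):
    """Return a label for the range that `value` falls into.

    Hypothetical helper for illustration only. A value equal to a
    cutoff is assigned to the range above it (an assumption).
    """
    # Number of cutoffs that are <= value selects the range index.
    i = bisect_right(cutoffs, value)
    if i == 0:
        return f"<{cutoffs[0]}"
    if i == len(cutoffs):
        return f">{cutoffs[-1]}"
    return f"{cutoffs[i - 1]}-{cutoffs[i]}"


for value in (18, 30, 45, 60, 80):
    print(value, range_label(value, cutoffs))
```

Four cutoffs yield five ranges, which matches the five ranges listed in the README example.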