ai-privacy-toolkit/apt/minimization/__init__.py
abigailgold f2e1364b43
Add data minimization functionality to the ai-privacy-toolkit (#3)
* Fix directory issue when running tests for first time

* Initial version of data minimization

* Update version and documentation

* Fix documentation
2021-07-12 15:56:42 +03:00


"""
Module providing data minimization for ML.

This module implements a first-of-a-kind method that helps reduce the amount of personal data needed to perform
predictions with a machine learning model by removing or generalizing some of the input features. For more information
about the method, see: http://export.arxiv.org/pdf/2008.04113

The main class, ``GeneralizeToRepresentative``, is a scikit-learn compatible ``Transformer`` that receives an existing
estimator and labeled training data, and learns generalizations that can be applied to any newly collected data before
it is analyzed by the original model. The ``fit()`` method learns the generalizations and the ``transform()`` method
applies them to new data. The learned generalizations can also be exported as feature ranges.

The current implementation supports only numeric features, so any categorical features must be converted to a numeric
representation before using this class.
"""
from apt.minimization.minimizer import GeneralizeToRepresentative
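
# The docstring above describes a fit/transform workflow in which numeric features are generalized to ranges.
# The sketch below is a toy illustration of that idea only. It is NOT the toolkit's algorithm and does not use
# ``GeneralizeToRepresentative``: the class ``ToyRangeGeneralizer``, its ``boundaries`` parameter, and the choice of
# the median as the representative value are all hypothetical and assumed for illustration. The real class learns
# its generalizations from an estimator and labeled data, whereas this sketch takes the ranges as given.

```python
from bisect import bisect_right
from statistics import median


class ToyRangeGeneralizer:
    """Hypothetical illustration of range-based generalization for one numeric feature."""

    def __init__(self, boundaries):
        # Sorted cut points that split the feature into ranges (assumed to be given).
        self.boundaries = sorted(boundaries)

    def fit(self, values):
        # Group the training values by the range they fall into.
        groups = {}
        for v in values:
            groups.setdefault(bisect_right(self.boundaries, v), []).append(v)
        # Pick one representative per range (here: the median training value).
        self.representatives_ = {k: median(g) for k, g in groups.items()}
        return self

    def transform(self, values):
        # Replace each value with the representative of its range.
        # Raises KeyError for a range that had no training values.
        return [self.representatives_[bisect_right(self.boundaries, v)] for v in values]


ages = [21, 25, 29, 34, 38, 45, 52, 61]
gen = ToyRangeGeneralizer(boundaries=[30, 50]).fit(ages)
generalized = gen.transform([23, 37, 70])
print(generalized)
```

# After ``transform()``, downstream code sees only one representative value per range rather than the exact input,
# which is the sense in which less detailed personal data is needed at prediction time.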