mirror of
https://github.com/IBM/ai-privacy-toolkit.git
synced 2026-05-27 14:25:14 +02:00
Add data minimization functionality to the ai-privacy-toolkit (#3)
* Fix directory issue when running tests for first time
* Initial version of data minimization
* Update version and documentation
* Fix documentation
parent bcc3d67ba4
commit f2e1364b43
14 changed files with 920 additions and 34 deletions
19
apt/minimization/__init__.py
Normal file
@@ -0,0 +1,19 @@
"""
Module providing data minimization for ML.

This module implements a first-of-a-kind method to help reduce the amount of personal data needed to perform
predictions with a machine learning model, by removing or generalizing some of the input features. For more information
about the method see: http://export.arxiv.org/pdf/2008.04113

The main class, ``GeneralizeToRepresentative``, is a scikit-learn compatible ``Transformer``, that receives an existing
estimator and labeled training data, and learns the generalizations that can be applied to any newly collected data for
analysis by the original model. The ``fit()`` method learns the generalizations and the ``transform()`` method applies
them to new data.

It is also possible to export the generalizations as feature ranges.

The current implementation supports only numeric features, so any categorical features must be transformed to a numeric
representation before using this class.

"""
from apt.minimization.minimizer import GeneralizeToRepresentative
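
The docstring above describes a fit/transform workflow: learn generalizations from training data, apply them to new data, and optionally export them as feature ranges. The following is a minimal, dependency-free sketch of that workflow only. `ToyRangeGeneralizer`, its `n_ranges` parameter, and its `ranges()` method are hypothetical names invented for illustration; this is not the toolkit's `GeneralizeToRepresentative`, and it does not implement the method from the linked paper. In particular, it ignores the estimator and the labels and simply splits each numeric feature into equal-frequency ranges, replacing every value with the median of its range.

```python
from bisect import bisect_right
from statistics import median


class ToyRangeGeneralizer:
    """Hypothetical stand-in that mimics the fit()/transform() convention.

    NOT the toolkit's GeneralizeToRepresentative: it uses no estimator and
    no labels, only per-feature equal-frequency ranges.
    """

    def __init__(self, n_ranges=2):
        self.n_ranges = n_ranges

    def fit(self, X, y=None):
        # y is accepted to mirror the usual fit(X, y) signature; unused here.
        self.cuts_ = []
        self.representatives_ = []
        for j in range(len(X[0])):
            column = sorted(row[j] for row in X)
            # Equal-frequency cut points taken from the sorted training values.
            cuts = sorted({column[(k * len(column)) // self.n_ranges]
                           for k in range(1, self.n_ranges)})
            buckets = [[] for _ in range(len(cuts) + 1)]
            for value in column:
                buckets[bisect_right(cuts, value)].append(value)
            # Only the lowest bucket can be empty (heavy ties); in that case
            # fall back to the first cut point as its representative.
            reps = [median(b) if b else cuts[0] for b in buckets]
            self.cuts_.append(cuts)
            self.representatives_.append(reps)
        return self

    def transform(self, X):
        # Replace each value with the representative of the range it falls in.
        return [
            [self.representatives_[j][bisect_right(self.cuts_[j], value)]
             for j, value in enumerate(row)]
            for row in X
        ]

    def ranges(self):
        # Export learned generalizations as (low, high) pairs per feature;
        # None marks an unbounded side. Ranges are low-inclusive.
        exported = []
        for cuts in self.cuts_:
            bounds = [None] + cuts + [None]
            exported.append(list(zip(bounds[:-1], bounds[1:])))
        return exported


# Two numeric features (say, age and income) for six training records.
train = [[23, 1200], [25, 3100], [31, 2500], [38, 4000], [47, 5200], [52, 6100]]
generalizer = ToyRangeGeneralizer(n_ranges=2).fit(train)

print(generalizer.transform([[30, 3900], [40, 4000]]))
print(generalizer.ranges())
```

Because the sketch takes plain lists of numbers, it reflects the docstring's restriction to numeric features: any categorical column would have to be encoded numerically before being passed to `fit()`.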