Commit graph

4 commits

Author SHA1 Message Date
andersonm-ibm
a40484e0c9
Add column distribution comparison, and a third method for dataset assessment by membership classification (#84)
* Add column distribution comparison, and a third method for dataset assessment by membership classification

* Address review comments: add further distribution comparison tests and make them, like alpha, externally configurable.

Signed-off-by: Maya Anderson <mayaa@il.ibm.com>
2023-09-21 16:43:19 +03:00
abigailgold
13a0567183
Make data minimization more consistent and performant (#83)
* Update requirements

* Update incompatible scipy version

* Reduce runtime of dataset assessment tests

* ncp is now a class that holds three values: fit_score, transform_score and generalizations_score. All calculated ncp scores are stored, regardless of the order in which the different methods are called.
Generalizations can now be applied either from tree cells or from the global generalizations struct, depending on the value of generalize_using_transform. Representative values can also be computed from the global generalizations.
Removing a feature from the generalization can also be done in either mode.

* Compute generalizations with test data when possible (for computing better representatives).

* Externalize common test code to methods.
2023-08-21 18:39:15 +03:00
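
The ncp change described in commit 13a0567183 can be pictured with a minimal sketch. This is not the repository's actual implementation: the class name `NCPScores` and the use of a dataclass are assumptions made for illustration, and only the three field names (fit_score, transform_score, generalizations_score) come from the commit message. The point it shows is that each score has its own slot, so the order in which the methods run does not overwrite earlier results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NCPScores:
    """Hypothetical container for the three ncp values named in the commit.

    Each score starts as None and is filled in by whichever step computes it.
    """
    fit_score: Optional[float] = None
    transform_score: Optional[float] = None
    generalizations_score: Optional[float] = None


# Illustrative values only: the scores are set in an arbitrary order,
# and setting one never discards another.
scores = NCPScores()
scores.transform_score = 0.42
scores.fit_score = 0.37
```

Because every score is stored separately, a score that has not been computed yet simply stays `None` rather than holding a stale value from a different step.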
andersonm-ibm
3885ab9d9d
Change flake8 warnings back to errors and fix the tests so that they pass it. (#76)
Signed-off-by: Maya Anderson <mayaa@il.ibm.com>
2023-05-11 11:33:50 +03:00
Maya Anderson
dbb958f791
Merge pull request #71 from IBM/dataset_assessment
Add AI privacy Dataset assessment module with two attack implementations.

Signed-off-by: Maya Anderson <mayaa@il.ibm.com>
2023-03-20 14:21:29 +02:00