ai-privacy-toolkit

mirror of https://github.com/IBM/ai-privacy-toolkit.git synced 2026-06-08 15:05:13 +02:00

Author	SHA1	Message	Date
abigailgold	57e38ea4fa	Support for many new model output types (#93) * General model wrappers and methods supporting multi-label classifiers * Support for pytorch multi-label binary classifier * New model output types + single implementation of score method that supports multiple output types. * Anonymization with pytorch multi-output binary model * Support for multi-label binary models in minimizer. * Support for multi-label logits/probabilities --------- Signed-off-by: abigailt <abigailt@il.ibm.com>	2024-07-03 09:04:59 -04:00
abigailgold	e00535d120	Fix error with pandas dataframes (#92) * Fix error with pandas dataframes in _columns_different_distributions + add appropriate test * Update documentation of classes to reflect that all data should be encoded and scaled. --------- Signed-off-by: abigailt <abigailt@il.ibm.com>	2024-02-13 08:56:12 -05:00
abigailgold	a8f5326572	Fix issue with computed ranges for one-hot encoded features (#90) Signed-off-by: abigailt <abigailt@il.ibm.com>	2024-01-17 12:45:22 -05:00
abigailgold	6d81cd8ed4	Support for one-hot encoded features in minimization (#87) * Initial version with first working test * Make sure representative values in generalizations for 1-hot encoded features are consistent. * Updated notebooks for one-hot encoded data * Review comments Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-12-24 18:18:18 -05:00
abigailgold	5dce961092	Support 1-hot encoded features in anonymization + fixes related to encoding in minimization (#86) * Support 1-hot encoded features in anonymization (#72) * Fix anonymization adult notebook + new notebook to demonstrate anonymization on 1-hot encoded data * Minimizer: No default encoder, if none provided data is supplied to the model as is. Fix data type of representative values. Fix and add more tests. Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-10-19 11:48:15 +03:00
abigailgold	26addd192f	Support pytorch models in data minimization (#85) * Support pytorch models in data minimization Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-09-21 17:48:15 +03:00
andersonm-ibm	a40484e0c9	Add column distribution comparison, and a third method for dataset assessment by membership classification (#84) * Add column distribution comparison, and a third method for dataset assessment by membership classification * Address review comments, add additional distribution comparison tests and make them externally configurable too, in addition to the alpha becoming configurable. Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2023-09-21 16:43:19 +03:00
abigailgold	13a0567183	Make data minimization more consistent and performant (#83) * Update requirements * Update incompatible scipy version * Reduce runtime of dataset assessment tests * ncp is now a class that contains 3 values: fit_score, transform_score and generalizations_score so that it doesn't matter in what order the different methods are called, all calculated ncp scores are stored. Generalizations can now be applied either from tree cells or from global generalizations struct depending on the value of generalize_using_transform. Representative values can also be computed from global generalizations. Removing a feature from the generalization can also be applied in either mode. * Compute generalizations with test data when possible (for computing better representatives). * Externalize common test code to methods.	2023-08-21 18:39:15 +03:00
andersonm-ibm	e9a225501f	Limit scikit-learn version because of API changes (#81) * Limit scikit-learn versions between 0.22.2 and 1.1.3, remove deprecated load_boston(). * Set pytest configuration option to show test progress in detail. * Change np.int to int according to DeprecationWarning Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2023-05-14 08:52:06 +03:00
andersonm-ibm	3885ab9d9d	Change back flake8 warnings to errors. Fix tests not to fail it. (#76) Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2023-05-11 11:33:50 +03:00
abigailgold	8a9ef80146	Increase version to 0.2.0 (#74) * Remove tensorflow dependency if not using keras model * Remove xgboost dependency if not using xgboost model * Documentation updates Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-05-08 12:50:55 +03:00
Maya Anderson	dbb958f791	Merge pull request #71 from IBM/dataset_assessment Add AI privacy Dataset assessment module with two attack implementations. Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2023-03-20 14:21:29 +02:00
abigailgold	d52fcd0041	Formatting (#68) Fix most flake/lint errors and ignore a few others Signed-off-by: abigailt <abigailt@il.ibm.com>	2022-12-25 15:13:57 +02:00
Maya Anderson	89bdcfc00e	Prepare project for CI: cleanup dependencies, fix test data location, cleanup assert. Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2022-12-20 16:00:36 +02:00
abigailt	64038f76f9	Merge with main	2022-08-01 18:12:03 +03:00
abigailt	dc5cc793ee	Merge with main	2022-08-01 18:11:34 +03:00
abigailt	a9e2a35e18	Add support for xgboost XGBClassifier (#53)	2022-07-28 17:21:24 +03:00
olasaadi	74ce92acc4	fix	2022-07-26 18:37:44 +03:00
abigailt	a13415ad67	Externalize BlackboxClassifier dataset (x and predictions)	2022-07-25 16:31:45 +03:00
abigailt	fb534f7a0f	BlackboxClassifier based on predictions to work with DatasetWithPredictions	2022-07-25 16:31:45 +03:00
abigailt	77a6e08c8e	Keras regression support	2022-07-24 18:45:50 +03:00
Ron Shmelkin	c77e34e373	update pytorch wrapper to use torch loaders fix tests and dataset style	2022-07-24 14:31:47 +03:00
olasaadi	6f69f5557b	fix bug	2022-07-20 18:29:48 +03:00
olasaadi	3bf26b67d2	fix	2022-07-20 17:36:00 +03:00
abigailt	a7d156660e	Wrap predict method in BlackBoxClassifierPredictMethod to avoid exception in ART when supplied method returns scalars	2022-07-20 13:33:19 +03:00
abigailt	1cc73b3da1	Check for mismatch between model output type and actual output	2022-07-20 13:33:19 +03:00
abigailt	bc7ab0cc7f	Add model type to blackbox classifier (#49)	2022-07-20 13:33:19 +03:00
olasaadi	4973fbebc6	fix	2022-07-19 21:16:39 +03:00
abigailgold	00f9c16863	Support additional use cases for data (#46) * Make ART black box classifier not apply preprocessing to data * Add option to store predictions (in addition to x,y) in Dataset and Data classes	2022-07-11 14:28:09 +03:00
Shlomit Shachor	e25e58b253	enhance calculation of nb classes + tests (#45) * update get_nb_classes method to handle 1-hot and scalar input	2022-07-05 11:32:17 +03:00
abigailgold	c6eb553a9f	Blackbox predict method (#43) * Support output probabilities * Support black box classifier with predict method * Update requirements (security alert #1)	2022-06-30 18:23:53 +03:00
Shlomit Shachor	1c4b963add	Wrappers no train (#40) 1) Handle train None in Data 2) Update BB Classifier to handle None either for train or test (x or y)	2022-06-26 14:43:22 +03:00
olasaadi	21cba95a28	fix	2022-06-06 14:32:34 +03:00
olasaadi	c954f53ad7	fix	2022-06-06 14:02:40 +03:00
olasaadi	302d0c4b8c	update	2022-06-02 15:25:07 +03:00
olasaadi	a3fb68fb56	update	2022-05-30 12:52:32 +03:00
olasaadi	023f8764da	update	2022-05-30 11:51:22 +03:00
olasaadi	59d8b16bb4	fix	2022-05-23 12:49:38 +03:00
abigailgold	dfa684da6b	Consistent one-hot-encoding (#38) * Reuse code between generalize and transform methods * Option to get encoder from user * Consistent encoding for decision tree and generalizations (separate from target model encoding)	2022-05-22 18:02:33 +03:00
abigailt	7055d5ecf6	Fix bug in pruning loop + fix test	2022-05-19 18:07:03 +03:00
abigailt	186f11eaaf	Fix misclassification of categorical features with no generalizations (now appear under the 'untouched' category)	2022-05-19 16:42:31 +03:00
abigailgold	fe676fa426	New model wrappers (#32) * keras wrapper + blackbox classifier wrapper (fix #7) * fix error in NCP calculation * Update notebooks * Fix #25 (incorrect attack_feature indexes for social feature in notebook) * Consistent naming of internal parameters	2022-05-12 15:44:29 +03:00
abigailgold	2b2dab6bef	Data and Model wrappers (#26) * Squashed commit of wrappers: Wrapper minimizer * apply dataset wrapper on minimizer * apply changes on minimization notebook * add black_box_access and unlimited_queries params Dataset wrapper anonymizer Add features_names to ArrayDataset and allow providing features names in QI and Cat features not just indexes update notebooks categorical features and QI passed by indexes dataset include feature names and is_pandas param add pytorch Dataset Remove redundant code. Use data wrappers in model wrapper APIs. add generic dataset components Create initial version of wrappers for models * Fix handling of categorical features	2022-04-27 12:33:27 +03:00
abigailt	a37ff06df8	Squashed commit of the following: commit `d53818644e` Author: olasaadi <92303887+olasaadi@users.noreply.github.com> Date: Mon Mar 7 20:12:55 2022 +0200 Build the dt on all features anon (#23) * add param to build the DT on all features and not just on QI * one-hot encoding only for categorical features commit `c47819a031` Author: abigailt <abigailt@il.ibm.com> Date: Wed Feb 23 19:40:11 2022 +0200 Update docs commit `7e2ce7fe96` Merge: `7fbd1e4` `752871d` Author: abigailt <abigailt@il.ibm.com> Date: Wed Feb 23 19:26:44 2022 +0200 Merge remote-tracking branch 'origin/main' into main commit `7fbd1e4b90` Author: abigailt <abigailt@il.ibm.com> Date: Wed Feb 23 19:22:54 2022 +0200 Update version and docs commit `752871dd0c` Author: olasaadi <92303887+olasaadi@users.noreply.github.com> Date: Wed Feb 23 14:57:12 2022 +0200 add minimization notebook (#22) * add german credit notebook to showcase new features (minimize only some features and categorical features) * add notebook to show minimization data on a regression problem	2022-04-25 17:39:30 +03:00
Ola Saadi	ac5d82aab6	Wrapper minimizer (#20) * apply dataset wrapper on minimizer * apply changes on minimization notebook * add black_box_access and unlimited_queries params	2022-04-18 13:14:49 +03:00
ABIGAIL GOLDSTEEN	6b04fd5564	Remove failing assert Regression scores do not necessarily have to be between 0 and 1 (as opposed to classification scores).	2022-04-05 14:51:02 +03:00
Ola Saadi	5f6a258f8f	Merge branch 'wrappers' into dataset_wrapper_anonimizer	2022-03-28 17:11:41 +03:00
olasaadi	b54f0a2382	fix tests	2022-03-24 19:35:26 +02:00
olasaadi	66c86dc595	fix notebook and add features_names to ArrayDataset and allow providing features names in QI and Cat features not just indexes	2022-03-24 19:32:24 +02:00
olasaadi	312469212e	fix docstring and fix assert in test	2022-03-22 13:59:28 +02:00

67 commits