ai-privacy-toolkit

mirror of https://github.com/IBM/ai-privacy-toolkit.git synced 2026-04-25 21:06:21 +02:00

Author	SHA1	Message	Date
abigailgold	6d81cd8ed4	Support for one-hot encoded features in minimization (#87) * Initial version with first working test * Make sure representative values in generalizations for 1-hot encoded features are consistent. * Updated notebooks for one-hot encoded data * Review comments Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-12-24 18:18:18 -05:00
abigailgold	5dce961092	Support 1-hot encoded features in anonymization + fixes related to encoding in minimization (#86) * Support 1-hot encoded features in anonymization (#72) * Fix anonymization adult notebook + new notebook to demonstrate anonymization on 1-hot encoded data * Minimizer: No default encoder, if none provided data is supplied to the model as is. Fix data type of representative values. Fix and add more tests. Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-10-19 11:48:15 +03:00
abigailgold	26addd192f	Support pytorch models in data minimization (#85) * Support pytorch models in data minimization Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-09-21 17:48:15 +03:00
abigailgold	13a0567183	Make data minimization more consistent and performant (#83) * Update requirements * Update incompatible scipy version * Reduce runtime of dataset assessment tests * ncp is now a class that contains 3 values: fit_score, transform_score and generalizations_score so that it doesn't matter in what order the different methods are called, all calculated ncp scores are stored. Generalizations can now be applied either from tree cells or from global generalizations struct depending on the value of generalize_using_transform. Representative values can also be computed from global generalizations. Removing a feature from the generalization can also be applied in either mode. * Compute generalizations with test data when possible (for computing better representatives). * Externalize common test code to methods.	2023-08-21 18:39:15 +03:00
andersonm-ibm	e9a225501f	Limit scikit-learn version because of API changes (#81) * Limit scikit-learn versions between 0.22.2 and 1.1.3, remove deprecated load_boston(). * Set pytest configuration option to show test progress in detail. * Change np.int to int according to DeprecationWarning Signed-off-by: Maya Anderson <mayaa@il.ibm.com>	2023-05-14 08:52:06 +03:00
abigailgold	8a9ef80146	Increase version to 0.2.0 (#74) * Remove tensorflow dependency if not using keras model * Remove xgboost dependency if not using xgboost model * Documentation updates Signed-off-by: abigailt <abigailt@il.ibm.com>	2023-05-08 12:50:55 +03:00
abigailgold	d52fcd0041	Formatting (#68) Fix most flake/lint errors and ignore a few others Signed-off-by: abigailt <abigailt@il.ibm.com>	2022-12-25 15:13:57 +02:00
abigailgold	00f9c16863	Support additional use cases for data (#46) * Make ART black box classifier not apply preprocessing to data * Add option to store predictions (in addition to x,y) in Dataset and Data classes	2022-07-11 14:28:09 +03:00
abigailgold	c6eb553a9f	Blackbox predict method (#43) * Support output probabilities * Support black box classifier with predict method * Update requirements (security alert #1)	2022-06-30 18:23:53 +03:00
abigailgold	dfa684da6b	Consistent one-hot-encoding (#38) * Reuse code between generalize and transform methods * Option to get encoder from user * Consistent encoding for decision tree and generalizations (separate from target model encoding)	2022-05-22 18:02:33 +03:00
abigailt	7055d5ecf6	Fix bug in pruning loop + fix test	2022-05-19 18:07:03 +03:00
abigailt	186f11eaaf	Fix misclassification of categorical features with no generalizations (now appear under the 'untouched' category)	2022-05-19 16:42:31 +03:00
abigailgold	fe676fa426	New model wrappers (#32) * keras wrapper + blackbox classifier wrapper (fix #7) * fix error in NCP calculation * Update notebooks * Fix #25 (incorrect attack_feature indexes for social feature in notebook) * Consistent naming of internal parameters	2022-05-12 15:44:29 +03:00
abigailgold	2b2dab6bef	Data and Model wrappers (#26) * Squashed commit of wrappers: Wrapper minimizer * apply dataset wrapper on minimizer * apply changes on minimization notebook * add black_box_access and unlimited_queries params Dataset wrapper anonymizer Add features_names to ArrayDataset and allow providing features names in QI and Cat features not just indexes update notebooks categorical features and QI passed by indexes dataset include feature names and is_pandas param add pytorch Dataset Remove redundant code. Use data wrappers in model wrapper APIs. add generic dataset components Create initial version of wrappers for models * Fix handling of categorical features	2022-04-27 12:33:27 +03:00
olasaadi	752871dd0c	add minimization notebook (#22) * add german credit notebook to showcase new features (minimize only some features and categorical features) * add notebook to show minimization data on a regression problem	2022-02-23 14:57:12 +02:00
olasaadi	3feebe8973	Regression minimization (#20) * support regression in minimization and add test * fix #10	2022-01-27 15:57:55 +02:00
olasaadi	a9a93c8a3a	Train just on qi (#15) * QI updates * update code to support training ML on QI features * fix code so features that are not from QI should not be part of generalizations and add description * merging two branches, training on QI and on all data * adding tests and asserts	2022-01-12 17:01:27 +02:00
olasaadi	2eb626c00c	Sup cat features (#14) * support categorical features * update the documentation and readme added a test for the case where cells are supplied as a param. * add big tests (adult test and iris) and fixed bugs * update transform to return numpy if original data is numpy * added nursery test * break loop if there is an illegal level * Stop pruning one step before passing accuracy threshold * adding asserts and fix DecisionTreeClassifier init * Fix tests Co-authored-by: abigailt <abigailt@il.ibm.com>	2022-01-11 09:51:04 +02:00
abigailgold	f2e1364b43	Add data minimization functionality to the ai-privacy-toolkit (#3) * Fix directory issue when running tests for first time * Initial version of data minimization * Update version and documentation * Fix documentation	2021-07-12 15:56:42 +03:00

19 commits