AIA
- class synthyverse.evaluation.privacy.AIA(quasi_identifiers=None, sensitive_features=None, discrete_features=None, model_name='xgboost', model_params=None, random_state=0)
Bases: object
Registry name: aia
Attribute Inference Attack (AIA) privacy metric.
Trains a supervised ML model on synthetic data to infer each sensitive feature from quasi-identifiers, then scores the model's predictions against the true sensitive feature values in the real data. Higher attack performance indicates higher attribute disclosure risk.
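The attack described above can be sketched independently of this class. The example below is a minimal illustration, not the class's implementation: it assumes a scikit-learn decision tree as the attacker, and the toy columns (age_band, region, diagnosis) and the make_data helper are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)


def make_data(n):
    """Build a hypothetical toy table whose sensitive column depends on age_band."""
    age_band = rng.integers(0, 4, size=n)
    region = rng.integers(0, 3, size=n)
    base = (age_band >= 2).astype(int)
    flip = rng.random(n) < 0.1  # 10% label noise
    diagnosis = np.where(flip, 1 - base, base)
    return pd.DataFrame(
        {"age_band": age_band, "region": region, "diagnosis": diagnosis}
    )


X_train = make_data(200)  # stands in for the real training data
X_syn = make_data(200)    # stands in for the synthetic data

quasi_identifiers = ["age_band", "region"]
sensitive = "diagnosis"

# The attacker only ever sees synthetic rows during fitting.
attacker = DecisionTreeClassifier(random_state=0)
attacker.fit(X_syn[quasi_identifiers], X_syn[sensitive])

# The attack is scored on real rows: higher accuracy means higher disclosure risk.
predicted = attacker.predict(X_train[quasi_identifiers])
attack_accuracy = float((predicted == X_train[sensitive].to_numpy()).mean())
print(f"attack accuracy on real data: {attack_accuracy:.2f}")
```

Accuracy is used here only because the toy target is binary; a continuous sensitive feature would call for a regressor and a regression score instead.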
- Parameters:
quasi_identifiers (list) – Feature names used by the attacker. If None, all non-sensitive features are used. If sensitive_features is also None, all other features are used for each target feature.
sensitive_features (list) – Sensitive feature names to infer. If None, all features are evaluated as sensitive features.
discrete_features (list) – List of discrete/categorical feature names. Used as the authoritative source for classification vs. regression targets and quasi-identifier preprocessing.
model_name (str) – Model family. Supported values include “xgboost”, “randomforest”, “decisiontree”, “linearregression”, and “svm”, including some common aliases. Every model except for XGBoost is a scikit-learn model. Default: “xgboost”.
model_params (dict) – Model parameters passed to the selected estimator.
random_state (int) – Random seed for reproducibility. Default: 0.
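The defaulting rules for quasi_identifiers and sensitive_features can be read as the following sketch. The function resolve_attack_plan is hypothetical and only restates the parameter descriptions above. Dropping the target from an explicitly supplied quasi-identifier list is an assumption, since the documentation does not state how that case is handled.

```python
def resolve_attack_plan(columns, quasi_identifiers=None, sensitive_features=None):
    """Map each target feature to the input features an attacker would use.

    Hypothetical helper that mirrors the documented defaults; it is not part
    of the library.
    """
    # No sensitive features given: every feature is treated as a target.
    targets = list(columns) if sensitive_features is None else list(sensitive_features)

    plan = {}
    for target in targets:
        if quasi_identifiers is not None:
            # Assumption: the target is never used to predict itself.
            inputs = [c for c in quasi_identifiers if c != target]
        elif sensitive_features is not None:
            # Default: all non-sensitive features.
            inputs = [c for c in columns if c not in sensitive_features]
        else:
            # Both None: all other features for each target.
            inputs = [c for c in columns if c != target]
        plan[target] = inputs
    return plan


columns = ["age", "zip", "income"]
print(resolve_attack_plan(columns))
print(resolve_attack_plan(columns, sensitive_features=["income"]))
```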
- evaluate(X_train, X_syn)
Evaluate AIA on real training data using models trained on synthetic data.
- Parameters:
X_train (DataFrame) – Real training data as a pandas DataFrame.
X_syn (DataFrame) – Synthetic data used to train the attribute inference models.
- Returns:
- Dictionary with per-sensitive-feature attack scores. Keys have the form “aia.<sensitive_feature>.<score>”.
- Return type:
dict
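Because the returned keys are flat strings, it can be convenient to regroup them by sensitive feature. The helper below is a sketch that assumes only the documented “aia.<sensitive_feature>.<score>” key form. The function name group_aia_scores, the score names (accuracy, r2) and the numeric values are placeholders for illustration, not actual library output.

```python
def group_aia_scores(results):
    """Regroup flat 'aia.<sensitive_feature>.<score>' keys into nested dicts."""
    grouped = {}
    for key, value in results.items():
        if not key.startswith("aia.") or key.count(".") < 2:
            continue  # skip keys that do not follow the documented form
        rest = key[len("aia."):]
        # Split on the last dot so feature names containing dots stay intact.
        feature, score = rest.rsplit(".", 1)
        grouped.setdefault(feature, {})[score] = value
    return grouped


# Placeholder dictionary shaped like the documented return value.
example_results = {
    "aia.diagnosis.accuracy": 0.5,
    "aia.income.r2": 0.25,
}
by_feature = group_aia_scores(example_results)
print(by_feature)
```

Splitting on the last dot assumes that score names themselves contain no dots.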