DOMIAS
- class synthyverse.evaluation.privacy.DOMIAS(ref_prop=0.5, member_prop=1.0, n_components=0.99, random_state=0, discrete_features=None, subsample=False, repeats=1)
Bases: MIA
Registry name: mia.domias
DOMIAS membership inference attack metric.
Paper: “Membership inference attacks against synthetic data through overfitting detection” by van Breugel et al. (2023).
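DOMIAS compares the multivariate density of each attack record under the synthetic distribution with its density under the reference distribution. A record that is markedly more likely under the synthetic distribution than under the reference distribution is scored as a probable member. Densities are estimated with a Gaussian KDE on PCA-transformed data.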
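The density comparison described above can be sketched as follows. This is a minimal illustration and not the synthyverse implementation: the toy arrays, the choice to fit the PCA on the stacked synthetic and reference data, the `eps` guard, and the helper `to_kde_input` are all assumptions made for the example. It relies only on `scipy.stats.gaussian_kde`, `sklearn.decomposition.PCA`, and `sklearn.metrics.roc_auc_score`.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.decomposition import PCA
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Toy data (hypothetical): all real sets come from the same population.
# The "synthetic" set is deliberately overfitted: members plus small noise.
members = rng.normal(size=(200, 5))
non_members = rng.normal(size=(200, 5))
reference = rng.normal(size=(200, 5))
synthetic = members + rng.normal(scale=0.05, size=members.shape)

# Assumption: PCA is fitted on synthetic and reference data together.
pca = PCA(n_components=0.99).fit(np.vstack([synthetic, reference]))


def to_kde_input(x):
    """Project with PCA and transpose to the (n_dims, n_points) layout gaussian_kde expects."""
    return pca.transform(x).T


# One Gaussian KDE per distribution.
kde_syn = gaussian_kde(to_kde_input(synthetic))
kde_ref = gaussian_kde(to_kde_input(reference))

# Attack records: members (label 1) followed by non-members (label 0).
attack = np.vstack([members, non_members])
labels = np.concatenate([np.ones(len(members)), np.zeros(len(non_members))])

# Score = density under synthetic / density under reference.
eps = 1e-12  # guard against division by zero (an assumption of this sketch)
attack_proj = to_kde_input(attack)
scores = kde_syn(attack_proj) / (kde_ref(attack_proj) + eps)

auc = roc_auc_score(labels, scores)
print(f"attack AUC: {auc:.3f}")
```

An AUC near 0.5 means the score cannot separate members from non-members. Values well above 0.5 suggest that the synthetic data sits unusually close to the training records.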
- Parameters:
discrete_features (list) – List of discrete/categorical feature names. Default: None (no discrete features).
ref_prop (float) – Proportion of test set to use as reference for density estimation. Default: 0.5.
member_prop (float) – Proportion of train set to use as members. Default: 1.0.
n_components (int or float) – Number of PCA components. Float in (0,1] = variance target, int = exact components. Default: 0.99.
subsample (bool) – Whether to subsample synthetic and member sets to match reference and evaluation non-member sizes. Default: False.
repeats (int) – Number of repeated evaluations when subsampling records. Default: 1.
random_state (int) – Random seed for reproducibility. Default: 0.
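The two modes of n_components can be illustrated with scikit-learn's `PCA`, which accepts the same float-or-int convention. Whether DOMIAS forwards the value to `PCA` unchanged is an assumption here, and the toy matrix `X` is made up for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Toy data (hypothetical): 10 features driven by 3 latent directions plus tiny noise.
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

# Float in (0, 1): keep as many components as needed to reach that variance share.
pca_var = PCA(n_components=0.99).fit(X)

# Int: keep exactly that many components.
pca_int = PCA(n_components=2).fit(X)

kept_variance = pca_var.explained_variance_ratio_.sum()
print("variance target:", pca_var.n_components_, "components,", round(kept_variance, 4), "variance kept")
print("exact count:", pca_int.n_components_, "components")
```

A variance target adapts the dimensionality to the data, which matters because KDE degrades quickly as the number of dimensions grows. An exact count keeps the dimensionality fixed across datasets.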
Example
>>> import pandas as pd
>>> from synthyverse.evaluation import DOMIAS
>>>
>>> # Prepare data
>>> X_train = pd.DataFrame(...)
>>> X_test = pd.DataFrame(...)
>>> X_syn = pd.DataFrame(...)
>>> discrete_features = ["category_col"]
>>>
>>> # Create metric
>>> metric = DOMIAS(
...     discrete_features=discrete_features,
...     ref_prop=0.5,
...     n_components=0.95,
...     random_state=42
... )
>>>
>>> # Evaluate
>>> results = metric.evaluate(X_train, X_test, X_syn)
- evaluate(X_train, X_test, X_syn)
Evaluate membership inference risk.
- Parameters:
X_train (DataFrame) – Real training data whose rows are treated as members.
X_test (DataFrame) – Independent real test data split into reference records and evaluation non-members.
X_syn (DataFrame) – Synthetic data available to the attacker.
- Returns:
Dictionary with attack AUC and lift-at-k scores. Keys have the form “<attack_name>.<score>”.
- Return type:
dict
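Because the returned keys are flat strings of the form “<attack_name>.<score>”, it can be convenient to regroup them by attack. The sketch below uses a hand-written dictionary; the key names and values are hypothetical and do not come from the library, and the helper `group_scores` is introduced only for illustration. It splits on the last dot, assuming the score name itself contains no dot.

```python
from collections import defaultdict

# Hypothetical results: key names and values are made up for illustration.
results = {
    "domias.auc": 0.61,
    "domias.lift_at_10": 1.4,
}


def group_scores(flat):
    """Turn {"attack.score": value} into {"attack": {"score": value}}."""
    grouped = defaultdict(dict)
    for key, value in flat.items():
        # Split on the last dot so attack names that contain dots stay intact.
        attack_name, _, score_name = key.rpartition(".")
        grouped[attack_name][score_name] = value
    return dict(grouped)


by_attack = group_scores(results)
for attack_name, scores in by_attack.items():
    print(attack_name, scores)
```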