PRDC
- class synthyverse.evaluation.fidelity.PRDC(discrete_features=None, k=5, n_jobs=-1)
Bases: object
Registry name: prdc
Precision, Recall, Density, and Coverage for tabular synthetic data.
Paper: “Reliable fidelity and diversity metrics for generative models” by Naeem et al. (2020).
- Parameters:
discrete_features (list) – List of discrete/categorical feature names. Default: None (no discrete features).
k (int) – Number of nearest neighbours used to estimate each sample’s manifold radius. Default: 5.
n_jobs (int) – Number of parallel jobs for sklearn pairwise distances; -1 uses all available cores. Default: -1.
Example
>>> import pandas as pd
>>> from synthyverse.evaluation import PRDC
>>>
>>> metric = PRDC(discrete_features=["category_col"], k=5)
>>> results = metric.evaluate(X_train, X_syn)
- evaluate(X_train, X_syn)
Evaluate synthetic data using PRDC.
- Parameters:
X_train (DataFrame) – Real training data as a pandas DataFrame.
X_syn (DataFrame) – Synthetic data as a pandas DataFrame.
- Returns:
- Dictionary with keys:
"prdc.precision": Fraction of synthetic samples that fall inside the real manifold
"prdc.recall": Fraction of real samples that fall inside the synthetic manifold
"prdc.density": Average number of real-sample neighbourhoods containing a synthetic sample (normalized by k in Naeem et al.)
"prdc.coverage": Fraction of real samples whose nearest synthetic sample lies within their k-nearest-neighbour radius
- Return type:
dict
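To make the four definitions above concrete, here is a minimal pure-Python sketch that follows the formulas in Naeem et al. (2020). It is not the synthyverse implementation: the function names `pairwise`, `kth_nn_radii`, and `prdc_sketch` are hypothetical, it assumes all features are numeric and uses plain Euclidean distance, and it ignores `discrete_features` and `n_jobs` entirely. The returned keys simply mirror the ones documented above.

```python
import math


def pairwise(a, b):
    """Euclidean distance matrix between two lists of numeric points."""
    return [[math.dist(p, q) for q in b] for p in a]


def kth_nn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour, excluding itself."""
    distances = pairwise(points, points)
    # After sorting, index 0 is the zero self-distance, so index k is the k-th neighbour.
    return [sorted(row)[k] for row in distances]


def prdc_sketch(real, syn, k=5):
    """Illustrative PRDC on numeric data (hypothetical helper, not the library code)."""
    real_r = kth_nn_radii(real, k)  # manifold radius of each real sample
    syn_r = kth_nn_radii(syn, k)    # manifold radius of each synthetic sample
    d = pairwise(real, syn)         # d[i][j]: distance from real i to synthetic j
    n_real, n_syn = len(real), len(syn)

    # Precision: synthetic samples lying inside at least one real k-NN ball.
    precision = sum(
        any(d[i][j] <= real_r[i] for i in range(n_real)) for j in range(n_syn)
    ) / n_syn

    # Recall: real samples lying inside at least one synthetic k-NN ball.
    recall = sum(
        any(d[i][j] <= syn_r[j] for j in range(n_syn)) for i in range(n_real)
    ) / n_real

    # Density: number of real k-NN balls containing each synthetic sample,
    # averaged over synthetic samples and normalized by k.
    density = sum(
        d[i][j] <= real_r[i] for i in range(n_real) for j in range(n_syn)
    ) / (k * n_syn)

    # Coverage: real samples whose nearest synthetic sample is within their own radius.
    coverage = sum(min(d[i]) <= real_r[i] for i in range(n_real)) / n_real

    return {
        "prdc.precision": precision,
        "prdc.recall": recall,
        "prdc.density": density,
        "prdc.coverage": coverage,
    }


if __name__ == "__main__":
    real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    syn = [(0.1, 0.0), (0.9, 0.1), (0.0, 0.9), (5.0, 5.0)]
    print(prdc_sketch(real, syn, k=1))
```

The sketch builds full distance matrices, so it is only suitable for small inputs. Note that density, unlike the other three values, is not bounded above by 1: a synthetic sample sitting in a dense real region can fall inside more than k real neighbourhoods.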