ShapeTrend

class synthyverse.evaluation.fidelity.ShapeTrend(discrete_features=[], numerical_correlation='pearson', n_bins_numerical=20)

Bases: object

Registry name: shapetrend

Low-level implementation of the Column Shapes and Column Pair Trends scores from the SDMetrics library (https://docs.sdv.dev/sdmetrics/).
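As a rough illustration of what a column shape score measures, SDMetrics scores numerical columns with the complement of the two-sample Kolmogorov-Smirnov statistic and categorical columns with the complement of the total variation distance. The sketch below follows those definitions using only NumPy and pandas. The helper names ks_complement and tv_complement are hypothetical and are not part of synthyverse; the actual implementation may differ in details such as missing-value handling.

```python
import numpy as np
import pandas as pd


def ks_complement(real: pd.Series, syn: pd.Series) -> float:
    """Hypothetical helper: 1 - two-sample KS statistic for a numerical column."""
    r = np.sort(real.dropna().to_numpy(dtype=float))
    s = np.sort(syn.dropna().to_numpy(dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([r, s])
    cdf_r = np.searchsorted(r, grid, side="right") / len(r)
    cdf_s = np.searchsorted(s, grid, side="right") / len(s)
    # The KS statistic is the largest gap between the two CDFs.
    return 1.0 - float(np.max(np.abs(cdf_r - cdf_s)))


def tv_complement(real: pd.Series, syn: pd.Series) -> float:
    """Hypothetical helper: 1 - total variation distance for a discrete column."""
    p = real.value_counts(normalize=True)
    q = syn.value_counts(normalize=True)
    # Align on the union of categories; a missing category has frequency 0.
    diff = p.sub(q, fill_value=0.0)
    return 1.0 - 0.5 * float(diff.abs().sum())
```

Both helpers return a value in [0, 1], where 1 means the real and synthetic marginal distributions coincide. A per-table shape score would then be an average of these per-column values.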

Parameters:
  • discrete_features (list) – List of discrete/categorical feature names. Default: [].

  • numerical_correlation (str) – Correlation method for numerical-numerical pairs. One of “spearman” or “pearson”. Default: “pearson”.

  • n_bins_numerical (int) – Number of bins used to discretize numerical features for mixed-pair trends. Must be >= 2. Default: 20.
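To make the roles of numerical_correlation and n_bins_numerical concrete, the sketch below shows how pair trend scores are defined in SDMetrics: numerical-numerical pairs compare correlation coefficients, while pairs involving a discrete column compare joint frequency tables after the numerical column is binned. The function names are hypothetical, and the binning strategy (equal-width edges taken from the real data, with open outer bins) is an assumption; synthyverse may choose edges differently.

```python
import numpy as np
import pandas as pd


def correlation_similarity(real, syn, col_a, col_b, method="pearson"):
    """Hypothetical helper: 1 - |corr_real - corr_syn| / 2 for two numerical columns."""
    r = real[[col_a, col_b]].corr(method=method).iloc[0, 1]
    s = syn[[col_a, col_b]].corr(method=method).iloc[0, 1]
    return 1.0 - abs(r - s) / 2.0


def contingency_similarity(real, syn, num_col, cat_col, n_bins=20):
    """Hypothetical helper: 1 - TV distance between joint (bin, category) tables."""
    # Assumption: equal-width bin edges come from the real data only.
    edges = np.histogram_bin_edges(real[num_col].dropna().to_numpy(dtype=float), bins=n_bins)
    # Open the outer bins so synthetic values outside the real range still get a bin.
    edges[0], edges[-1] = -np.inf, np.inf

    def joint(df):
        bins = np.digitize(df[num_col].to_numpy(dtype=float), edges)
        table = pd.DataFrame({"bin": bins, "cat": df[cat_col].to_numpy()})
        return table.value_counts(normalize=True)

    # Align the two joint tables; a missing cell has frequency 0.
    diff = joint(real).sub(joint(syn), fill_value=0.0)
    return 1.0 - 0.5 * float(diff.abs().sum())
```

With more bins the contingency table becomes finer, so the score is more sensitive to small distributional differences but also noisier on small samples.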

Example

>>> import pandas as pd
>>> from synthyverse.evaluation.fidelity import ShapeTrend
>>>
>>> # Prepare data
>>> X_real = pd.DataFrame(...)
>>> X_syn = pd.DataFrame(...)
>>> discrete_features = ["category_col"]
>>>
>>> # Create metric
>>> metric = ShapeTrend(discrete_features=discrete_features)
>>>
>>> # Evaluate
>>> results = metric.evaluate(X_real, X_syn)

evaluate(X_train, X_syn)

Evaluate synthetic data using SDMetrics shape and trend scores.

Parameters:
  • X_train (DataFrame) – Real training data as a pandas DataFrame.

  • X_syn (DataFrame) – Synthetic data as a pandas DataFrame.

Returns:

Dictionary with keys:
  • “shapetrend.shape”: Column shapes score

  • “shapetrend.trend”: Column pair trends score

Return type:

dict
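The returned dictionary can be consumed like any other mapping. As a small sketch, the hypothetical helper below (not part of synthyverse) checks that both documented keys are present and averages them into a single number, mirroring how the SDMetrics quality report combines its two properties. The input values in the usage lines are placeholders chosen for illustration, not real results.

```python
from typing import Mapping

SHAPE_KEY = "shapetrend.shape"
TREND_KEY = "shapetrend.trend"


def overall_quality(results: Mapping[str, float]) -> float:
    """Hypothetical helper: mean of the shape and trend scores."""
    missing = [key for key in (SHAPE_KEY, TREND_KEY) if key not in results]
    if missing:
        raise KeyError(f"missing expected keys: {missing}")
    return (float(results[SHAPE_KEY]) + float(results[TREND_KEY])) / 2.0


# Placeholder values standing in for the output of metric.evaluate(...).
example_results = {SHAPE_KEY: 0.8, TREND_KEY: 0.6}
overall = overall_quality(example_results)
```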