FeatureWisePlots

class synthyverse.evaluation.fidelity.FeatureWisePlots(img_save_path, discrete_features=None, bins=20, max_categories=20, file_format='png', dpi=150)

Bases: object

Registry name: featurewiseplots

Save one multi-panel plot comparing real training and synthetic data.

Numerical features are plotted as overlaid density histograms. Discrete features are plotted as side-by-side normalized frequency bars.
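
Overlaid histograms are easiest to compare when both datasets are binned on the same edges. The sketch below is not taken from the library; it assumes equal-width bins spanning the combined range of both samples, and the helper names shared_edges and density_histogram are hypothetical. It illustrates how a bins setting can yield per-dataset densities that each integrate to 1.

```python
def shared_edges(real, synthetic, bins=20):
    """Equal-width bin edges covering both samples (assumed binning rule)."""
    lo = min(min(real), min(synthetic))
    hi = max(max(real), max(synthetic))
    if hi == lo:
        # Degenerate case: all values identical, so widen the range artificially.
        hi = lo + 1.0
    step = (hi - lo) / bins
    return [lo + i * step for i in range(bins + 1)]


def density_histogram(values, edges):
    """Per-bin densities such that sum(density * bin_width) == 1.

    Assumes every value lies within [edges[0], edges[-1]].
    """
    if not values:
        raise ValueError("values must be non-empty")
    n_bins = len(edges) - 1
    lo = edges[0]
    width = (edges[-1] - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Values on the right-most edge are placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    return [c / (total * width) for c in counts]


real = [0, 1, 2, 3]
synthetic = [1, 2, 3, 4]
edges = shared_edges(real, synthetic, bins=4)
real_density = density_histogram(real, edges)
syn_density = density_histogram(synthetic, edges)
```

If NumPy is available, numpy.histogram(values, bins=edges, density=True) computes the same densities.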

Parameters:
  • img_save_path (str) – Directory where the feature plot file will be saved.

  • discrete_features (list) – List of discrete/categorical feature names. Default: None, treated as an empty list.

  • bins (int) – Number of bins for numerical histograms. Default: 20.

  • max_categories (int) – Maximum number of categorical levels to show before grouping the remainder into “__other__”. Default: 20.

  • file_format (str) – Image file format passed to matplotlib. Default: “png”.

  • dpi (int) – Saved figure resolution. Default: 150.
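
The effect of max_categories can be illustrated with a small sketch. This is an assumption about the behaviour, not the library's code: it keeps the most frequent levels and sums the rest into “__other__”, whereas the library may select levels differently (for example from the real data only). The function name grouped_frequencies is hypothetical.

```python
from collections import Counter

OTHER = "__other__"


def grouped_frequencies(values, max_categories=20):
    """Normalized frequencies, with levels beyond max_categories pooled."""
    counts = Counter(values)
    total = sum(counts.values())
    # Keep the max_categories most frequent levels (assumed selection rule).
    kept = dict(counts.most_common(max_categories))
    freqs = {level: n / total for level, n in kept.items()}
    remainder = total - sum(kept.values())
    if remainder > 0:
        # Everything that was not kept is shown as a single pooled bar.
        freqs[OTHER] = remainder / total
    return freqs


values = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
freqs = grouped_frequencies(values, max_categories=2)
```

Computing these frequencies separately for the real and synthetic columns gives the two bar heights drawn side by side for each level.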

Example

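>>> import pandas as pd
>>> from synthyverse.evaluation import FeatureWisePlots
>>>
>>> # Toy data for illustration only.
>>> X_train = pd.DataFrame({"age": [23, 35, 41], "category_col": ["a", "b", "a"]})
>>> X_syn = pd.DataFrame({"age": [25, 33, 44], "category_col": ["a", "a", "b"]})
>>>
>>> metric = FeatureWisePlots(
...     img_save_path="results/featurewise_plots",
...     discrete_features=["category_col"],
... )
>>> results = metric.evaluate(X_train, X_syn)

evaluate(X_train, X_syn)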

Save one feature-wise real-vs-synthetic comparison plot.

Parameters:
  • X_train (DataFrame) – Real training data as a pandas DataFrame.

  • X_syn (DataFrame) – Synthetic data as a pandas DataFrame.

Returns:

Dictionary with keys “featurewiseplots.n_plots”, “featurewiseplots.save_dir”, and “featurewiseplots.files”.

Return type:

dict
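
A caller may want to confirm that the reported files were actually written. The sketch below assumes, without confirmation from the source, that “featurewiseplots.files” is a list of file names or paths and that “featurewiseplots.save_dir” is a directory path; the helper missing_plot_files is hypothetical.

```python
from pathlib import Path


def missing_plot_files(results, prefix="featurewiseplots"):
    """Return the reported files that cannot be found on disk."""
    save_dir = Path(results[f"{prefix}.save_dir"])
    missing = []
    for name in results[f"{prefix}.files"]:
        path = Path(name)
        # Accept either a path that resolves as given or a bare file name
        # located inside save_dir (the exact format is an assumption).
        if not (path.exists() or (save_dir / path.name).exists()):
            missing.append(name)
    return missing
```

An empty list from missing_plot_files(results) indicates that every reported plot file was found.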