CTABGAN

class synthyverse.generators.ctabgan_generator.CTABGANGenerator(target_column, class_dim=(256, 256, 256, 256), random_dim=100, num_channels=64, l2scale=1e-05, batch_size=500, epochs=150, sides=[4, 8, 16, 24, 32, 64], random_state=0, **kwargs)

Bases: TabularBaseGenerator

Conditional Tabular GAN (CTABGAN).

This implements CTABGAN+ as described in the original paper. It improves on earlier conditional GANs through convolutional layers and more elaborate preprocessing schemes. Unlike the original implementation, this generator automatically detects feature-type categories (e.g., Gaussian-like columns) as part of preprocessing.

Paper: “CTAB-GAN+: Enhancing Tabular Data Synthesis” by Zhao et al. (2024).

Parameters:

target_column (str) – Name of the target column.
class_dim (tuple) – Tuple of dimensions for class-specific layers. Default: (256, 256, 256, 256).
random_dim (int) – Dimension of random noise vector. Default: 100.
num_channels (int) – Number of channels in generator. Default: 64.
l2scale (float) – L2 regularization scale. Default: 1e-5.
batch_size (int) – Batch size for training. Default: 500.
epochs (int) – Number of training epochs. Default: 150.
sides (list) – List of side dimensions for generator. Default: [4, 8, 16, 24, 32, 64].
random_state (int) – Random seed for reproducibility. Default: 0.
**kwargs – Additional arguments passed to TabularBaseGenerator.
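
The role of sides can be illustrated with a small sketch. It assumes this generator follows the original CTAB-GAN code, where each encoded row is reshaped into a square "image" for the convolutional layers and the smallest candidate side whose square can hold all encoded values is chosen. The helper pick_side below is hypothetical and not part of synthyverse; check the source for the actual behaviour.

```python
def pick_side(n_encoded_values, sides=(4, 8, 16, 24, 32, 64)):
    """Return the smallest candidate side whose square fits the encoded row.

    Hypothetical helper for illustration only; assumes the selection rule
    used in the original CTAB-GAN code.
    """
    for side in sides:
        # A side x side grid holds side * side values; unused cells are padded.
        if side * side >= n_encoded_values:
            return side
    raise ValueError(
        f"{n_encoded_values} encoded values do not fit in the largest "
        f"{max(sides)} x {max(sides)} grid; add a larger side."
    )


# A row that encodes to 100 values does not fit in 8 x 8 = 64 cells,
# so the next candidate, 16 x 16 = 256 cells, is selected.
print(pick_side(100))
```

Under this assumption, a wide table (many columns, or categorical columns with many levels) may need a larger entry in sides than the default maximum of 64.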

Example

>>> import pandas as pd
>>> from synthyverse.generators import CTABGANGenerator
>>>
>>> # Load data
>>> X = pd.read_csv("data.csv")
>>> discrete_features = ["category_col"]
>>>
>>> # Create generator (requires target column)
>>> generator = CTABGANGenerator(
...     target_column="target",
...     epochs=150,
...     batch_size=500,
...     random_state=42
... )
>>>
>>> # Fit and generate
>>> generator.fit(X, discrete_features)
>>> X_syn = generator.generate(1000)