TabARGN

class synthyverse.generators.tabargn_generator.TabARGNGenerator(workspace=None, max_epochs=100, random_state=0)

Bases: BaseGenerator

Registry name: tabargn

Tabular AutoRegressive Generative Network (TabARGN).

Uses the implementation from the MostlyAI engine.

Paper: “TabularARGN: A Flexible and Efficient Auto-Regressive Framework for Generating High-Fidelity Synthetic Data” by Tiwald et al. (2025).

Parameters:
  • workspace (str, optional) – Directory for storing intermediate files. If omitted, an internal temporary workspace is created.

  • max_epochs (int) – Maximum number of training epochs. Default: 100.

  • random_state (int) – Random seed for reproducibility. Default: 0.

Example

>>> import pandas as pd
>>> from synthyverse.generators import TabARGNGenerator
>>>
>>> # Load data
>>> X = pd.read_csv("data.csv")
>>> discrete_features = ["category_col"]
>>>
>>> # Create generator
>>> generator = TabARGNGenerator(
...     max_epochs=100,
...     random_state=42
... )
>>>
>>> # Fit and generate
>>> generator.fit(X, discrete_features)
>>> X_syn = generator.generate(1000)

fit(X, discrete_features, X_val=None)

Fit the generator to tabular data.

Parameters:
  • X (DataFrame) – Training data in the generator’s input space.

  • discrete_features (list) – Names of categorical/discrete columns in X.

  • X_val (Optional[DataFrame]) – Optional validation data in the same schema as X.

Returns:

The fitted generator.
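
The X_val argument can be fed from a simple holdout split. The sketch below is one way to do that, not part of synthyverse: fit_with_holdout, val_frac, and seed are hypothetical names, and the helper only assumes that the object passed in exposes the fit(X, discrete_features, X_val=None) signature documented above.

```python
import pandas as pd


def fit_with_holdout(generator, X, discrete_features, val_frac=0.2, seed=0):
    """Split X into train/validation parts and pass both to generator.fit.

    Hypothetical helper: `generator` is any object with the documented
    fit(X, discrete_features, X_val=None) method.
    """
    if not 0.0 < val_frac < 1.0:
        raise ValueError("val_frac must be strictly between 0 and 1")

    # Every discrete feature name must be a column of X.
    missing = [c for c in discrete_features if c not in X.columns]
    if missing:
        raise KeyError(f"discrete_features not found in X: {missing}")

    # Use a fresh unique index so dropping by label removes exactly the
    # sampled validation rows.
    X = X.reset_index(drop=True)
    X_val = X.sample(frac=val_frac, random_state=seed)
    X_train = X.drop(index=X_val.index)

    # Both frames keep the same columns, so X_val shares X's schema.
    return generator.fit(X_train, discrete_features, X_val=X_val)
```

With the generator from the class example, the call would be fit_with_holdout(generator, X, ["category_col"]). A random row-wise split is an assumption here; data with grouped or time-ordered rows may need a different split.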

generate(n)

Generate synthetic tabular data.

Parameters:
  • n (int) – Number of synthetic rows to generate.

Returns:

Synthetic data in the generator’s model space.
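
A quick sanity check after generation is to compare the synthetic frame against the training frame. The sketch below assumes generate returns a pandas DataFrame, which this page does not state explicitly. compare_schema is a hypothetical helper, not a synthyverse function, and the small frames are stand-ins for real and generated data.

```python
import pandas as pd


def compare_schema(X, X_syn, n_expected):
    """Report basic differences between a real and a synthetic frame."""
    real_cols = list(X.columns)
    syn_cols = list(X_syn.columns)
    return {
        # Did we get the number of rows that was requested?
        "row_count_ok": len(X_syn) == n_expected,
        # Columns present in the real data but absent from the output.
        "missing_columns": [c for c in real_cols if c not in syn_cols],
        # Columns present in the output but absent from the real data.
        "extra_columns": [c for c in syn_cols if c not in real_cols],
    }


# Stand-in frames; in practice X_syn would come from generator.generate(n).
X = pd.DataFrame({"age": [31, 45], "category_col": ["a", "b"]})
X_syn = pd.DataFrame({"age": [38, 40, 29], "category_col": ["b", "a", "a"]})
report = compare_schema(X, X_syn, n_expected=3)
```

The helper reports differences instead of raising, since the output is described as being in the generator's model space and its columns may legitimately differ from those of X.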

classmethod load(path)

Load a generator persisted with the default pickle layout.

save(path)

Persist the generator state with the default pickle layout.
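
Because load is a classmethod and save is an instance method, a save-then-reload round trip can be sketched as below. save_and_reload is a hypothetical helper that relies only on the two signatures documented above. Whether path should name a file or a directory is not stated here, so treat the path in the usage note as an assumption.

```python
def save_and_reload(generator, path):
    """Persist `generator` to `path`, then load a new instance from it."""
    # save(path) is an instance method on the fitted generator.
    generator.save(path)
    # load(path) is a classmethod, so it is called on the generator's class.
    return type(generator).load(path)
```

For example, restored = save_and_reload(generator, "tabargn_generator.pkl") would then allow restored.generate(1000) without refitting. Pickle-based files should only be loaded from trusted sources, since unpickling can execute arbitrary code.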