Configurable causal DAG simulator for synthetic mixed-type data and CI test benchmarks.
- `CausalDataGenerator` class for configurable simulation
- Support for `custom` and `random` DAGs
- Mixed continuous/binary/categorical nodes (configurable categorical cardinality)
- Structural forms: `linear`, `polynomial`, `interaction`, `sigmoid`, `cos`, `sin`, `stratum_means`
- Optional element-wise `post_transform` (`tanh`, `sin`, `cos`, `exp_neg_abs`, `sqrt_abs`, `relu`, `sign`)
- Cross-type mechanisms:
  - continuous -> categorical (`categorical_model.name = "threshold"`)
  - categorical -> continuous (`functional_form.name = "stratum_means"`, including mixed-parent cases with `metric_weights`)
- Noise models:
  - additive (`gaussian`, `student_t`, `gamma`, `exponential`, `laplace`, `cauchy`, `uniform`)
  - multiplicative (`gaussian`, `student_t`, `gamma`, `exponential`)
  - heteroskedastic (`abs_first_parent`, `abs_parent_plus_const`, `mean_abs_plus_const`)
- Random weight sampling controls (including exclusion band around zero)
- `force_uniform_marginals` for balanced exogenous binary / categorical draws
- Template helpers (`chain_config`, `fork_config`, `collider_config`, `independence_config`)
- Reproducibility via `seed_structure` and `seed_data` (or single `seed`)
- Optional d-separation CI oracle output (`store_ci_oracle=true`)

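
To make the "exclusion band around zero" idea concrete, here is a minimal standard-library sketch of sampling an edge weight whose magnitude is kept away from zero, so that no edge is accidentally negligible. The function name `sample_weight` and the default bounds are hypothetical and chosen only for illustration; this is not dagsampler's implementation or its configuration interface.

```python
import random


def sample_weight(rng, low=0.5, high=2.0):
    """Draw a weight uniformly from [-high, -low] U [low, high].

    `low` is the half-width of the excluded band around zero.
    Both bounds are illustrative assumptions, not library defaults.
    """
    if not 0 <= low < high:
        raise ValueError("expected 0 <= low < high")
    magnitude = rng.uniform(low, high)  # magnitude stays outside the band
    sign = rng.choice((-1.0, 1.0))      # pick a side of zero at random
    return sign * magnitude


rng = random.Random(0)  # seeded so repeated runs give the same weights
weights = [sample_weight(rng) for _ in range(5)]
```

Every sampled weight satisfies `low <= abs(w) <= high`, which is the property an exclusion band is meant to guarantee.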
From PyPI:
pip install dagsampler
Or with uv:
uv venv
source .venv/bin/activate
uv pip install dagsampler
From GitHub (latest main):
uv pip install "dagsampler @ git+https://github.com/averinpa/dagsampler.git"

from dagsampler import CausalDataGenerator
config = {
    "simulation_params": {"n_samples": 200, "seed": 42},
    "graph_params": {
        "type": "custom",
        "nodes": ["X", "Y", "Z1"],
        "edges": [["X", "Z1"], ["Y", "Z1"]],
    },
}
result = CausalDataGenerator(config).simulate()
data = result["data"]
dag = result["dag"]
params = result["parametrization"]

The package exposes the `dagsampler-generate` command:
dagsampler-generate \
--config config.json \
--output dataset.csv \
--params-out params.json \
  --edges-out edges.json

`config.json` must contain the same structure used by `CausalDataGenerator`.
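
Because the CLI reads the same structure as the Python dictionary, one way to produce `config.json` is to serialize that dictionary with the standard `json` module. The sketch below reuses the example configuration from above; the helper name `write_config` is hypothetical and not part of dagsampler.

```python
import json
from pathlib import Path

# Same example configuration as in the Python usage snippet.
config = {
    "simulation_params": {"n_samples": 200, "seed": 42},
    "graph_params": {
        "type": "custom",
        "nodes": ["X", "Y", "Z1"],
        "edges": [["X", "Z1"], ["Y", "Z1"]],
    },
}


def write_config(config, path):
    """Serialize a config dictionary to a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


# Example call (writes into the current directory):
# write_config(config, "config.json")
```

Keeping the configuration in one Python dictionary and dumping it to JSON avoids maintaining two copies that could drift apart.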
- Documentation: full reference for every config option, mechanism, and noise model.
- Configuration examples: JSON snippets for each feature.
- Model formulations: mathematical definitions.
- Template configurations: `chain_config`, `fork_config`, `collider_config` helpers.
- `examples/`: runnable notebooks.

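
For orientation, the three graph shapes that the template helpers are named after can also be written by hand in the custom-graph format shown earlier. This sketch does not call or reproduce the helpers themselves. The builder `custom_graph_config`, the node names, and the reading of each edge as `[parent, child]` are assumptions made for illustration.

```python
def custom_graph_config(nodes, edges, n_samples=200, seed=42):
    """Build a config dictionary in the custom-graph format shown above.

    Hypothetical convenience function; each edge is assumed to be
    a (parent, child) pair.
    """
    return {
        "simulation_params": {"n_samples": n_samples, "seed": seed},
        "graph_params": {
            "type": "custom",
            "nodes": list(nodes),
            "edges": [list(edge) for edge in edges],
        },
    }


nodes = ["X", "Z", "Y"]

# Chain: X -> Z -> Y (X and Y are d-separated given Z).
chain = custom_graph_config(nodes, [("X", "Z"), ("Z", "Y")])

# Fork: X <- Z -> Y (Z is a common cause of X and Y).
fork = custom_graph_config(nodes, [("Z", "X"), ("Z", "Y")])

# Collider: X -> Z <- Y (X and Y are d-separated only while Z is unobserved).
collider = custom_graph_config(nodes, [("X", "Z"), ("Y", "Z")])
```

These three structures are the standard building blocks for conditional-independence benchmarks, since each one has a different (in)dependence pattern between X and Y with and without conditioning on Z.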
uv pip install -e ".[dev]"
pytest -q