The network_ops.generate_data module generates synthetic voting datasets for training and testing multi-winner voting rules. It creates preference profiles using various probability models and computes optimal/worst committees according to specified axioms.
Run the script with command-line arguments in `key=value` format:

```shell
python -m network_ops.generate_data n_profiles=1000 prefs_per_profile=50 m=5 num_winners=3 pref_model="IC" axioms="all" out_folder="data"
```

Required arguments:

- `n_profiles` (int): Number of preference profiles to generate
- `prefs_per_profile` (int): Number of voters per profile (mean if `varied_voters=True`)
- `m` (int): Number of candidates in each profile
- `num_winners` (int): Size of committees to elect
- `pref_model` (str): Preference generation model (see options below)
- `axioms` (str): Which axiom set to evaluate (`"all"`, `"root"`, `"both"`)
- `out_folder` (str): Directory to save generated datasets
Optional arguments:

- `varied_voters` (bool): If True, vary the number of voters per profile. Default: False
- `voters_std_dev` (int): Standard deviation for voter count when `varied_voters=True`. Default: 0
- `rwd_folder` (str): Path to real-world data folder (required for `pref_model="real_world"`)
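All scripts in this suite take arguments in `key=value` form rather than `--flag` style. A minimal sketch of how such arguments can be parsed (illustrative only; the actual parser in the scripts may differ):

```python
import ast


def parse_kwargs(argv):
    """Parse command-line arguments given as key=value pairs into a dict.

    Values are interpreted with ast.literal_eval where possible (ints,
    bools, lists); anything unparseable is kept as a plain string.
    """
    kwargs = {}
    for arg in argv:
        key, _, value = arg.partition("=")
        try:
            kwargs[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            kwargs[key] = value  # e.g. pref_model=IC stays the string "IC"
    return kwargs
```

For example, `parse_kwargs(["n_profiles=1000", "pref_model=IC", "varied_voters=True"])` yields `{"n_profiles": 1000, "pref_model": "IC", "varied_voters": True}`.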
Available preference models include:
- `"IC"` - Impartial Culture (uniform random)
- `"IAC"` - Impartial Anonymous Culture
- `"URN-R"` - Urn model
- `"identity"` - All voters have identical preferences
- `"MALLOWS-RELPHI-R"` - Mallows model
- `"euclidean__args__dimensions=3_-_space=gaussian_ball"`
- `"euclidean__args__dimensions=10_-_space=gaussian_ball"`
- `"euclidean__args__dimensions=3_-_space=uniform_ball"`
- `"euclidean__args__dimensions=10_-_space=uniform_ball"`
- `"euclidean__args__dimensions=3_-_space=gaussian_cube"`
- `"euclidean__args__dimensions=10_-_space=gaussian_cube"`
- `"euclidean__args__dimensions=3_-_space=uniform_cube"`
- `"euclidean__args__dimensions=10_-_space=uniform_cube"`
- `"single_peaked_conitzer"` - Single-peaked preferences (Conitzer)
- `"single_peaked_walsh"` - Single-peaked preferences (Walsh)
- `"stratification__args__weight=0.5"` - Stratified model
- `"mixed"` - Mixed distribution
- `"real_world"` - Generated from real-world voting data (requires `rwd_folder`)
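The simplest of these models is Impartial Culture, where every voter's ranking is an independent uniformly random permutation of the candidates. A sketch of such a generator (not the module's actual implementation, which relies on its own profile representation):

```python
import random


def generate_ic_profile(n_voters, m, rng=None):
    """Impartial Culture: each voter's ranking is an independent,
    uniformly random permutation of the m candidates (labeled 0..m-1)."""
    rng = rng or random.Random()
    return [tuple(rng.sample(range(m), m)) for _ in range(n_voters)]
```

Each returned tuple lists candidates from most to least preferred for one voter.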
The script generates CSV files with the following naming convention:
```
n_profiles={n}-num_voters={v}-varied_voters={vv}-voters_std_dev={std}-m={m}-committee_size={c}-pref_dist={model}-axioms={axioms}-{TYPE}.csv
```
Where {TYPE} is either TRAIN or TEST.
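The convention above can be reproduced with a small helper, useful when locating a dataset programmatically (a sketch; the script builds these names internally):

```python
def dataset_filename(n, v, vv, std, m, c, model, axioms, split):
    """Build a dataset filename following the convention above.

    `split` is either "TRAIN" or "TEST".
    """
    return (
        f"n_profiles={n}-num_voters={v}-varied_voters={vv}-"
        f"voters_std_dev={std}-m={m}-committee_size={c}-"
        f"pref_dist={model}-axioms={axioms}-{split}.csv"
    )
```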
Each row represents one preference profile with columns including:
- `Profile`: The preference profile as rankings
- `n_voters`: Number of voters in this profile
- `min_violations-committee`: Committee that minimizes axiom violations
- `min_violations`: Number of violations for the optimal committee
- `max_violations-committee`: Committee that maximizes axiom violations
- `max_violations`: Number of violations for the worst committee
- Feature columns: Candidate pair matrices, binary pairs, rank matrices (normalized)
- Rule winner columns (TEST data only): Winners for each voting rule
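One plausible construction for the candidate-pair features mentioned above records, for each ordered pair (a, b), the fraction of voters who rank a above b. This is a sketch of that idea, not necessarily the exact encoding the module uses:

```python
def candidate_pair_matrix(profile, m):
    """Return an m x m matrix whose entry [a][b] is the fraction of
    voters ranking candidate a above candidate b (diagonal stays 0)."""
    n = len(profile)
    matrix = [[0.0] * m for _ in range(m)]
    for ranking in profile:
        # position[c] = index of candidate c in this voter's ranking
        position = {cand: pos for pos, cand in enumerate(ranking)}
        for a in range(m):
            for b in range(m):
                if a != b and position[a] < position[b]:
                    matrix[a][b] += 1.0 / n
    return matrix
```

Rows and columns are indexed by candidate, so the matrix flattens naturally into a fixed-length feature vector.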
Generate IC data with 1000 profiles:
```shell
python -m network_ops.generate_data n_profiles=1000 prefs_per_profile=50 m=5 num_winners=3 pref_model="IC" axioms="all" out_folder="results"
```

Generate varied voter data:

```shell
python -m network_ops.generate_data n_profiles=500 prefs_per_profile=50 varied_voters=True voters_std_dev=10 m=6 num_winners=2 pref_model="URN-R" axioms="both" out_folder="data"
```

Generate real-world based data:

```shell
python -m network_ops.generate_data n_profiles=100 prefs_per_profile=20 m=5 num_winners=2 pref_model="real_world" rwd_folder="data/real_world_data" axioms="all" out_folder="rwd_results"
```

- The script automatically generates both TRAIN and TEST datasets
- Large datasets are saved incrementally to prevent data loss
- Real-world preference generation requires existing real-world voting data in the specified folder
- Preference models with `__args__` use the specified parameters (e.g., dimensions for euclidean models)
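A model string such as `euclidean__args__dimensions=3_-_space=gaussian_ball` can be split into a base name and its parameters. This sketch assumes the separators visible in the list above (`__args__` between name and arguments, `_-_` between arguments); the module's own parsing may differ:

```python
def parse_pref_model(model):
    """Split a preference-model string into (base_name, params dict)."""
    name, sep, arg_str = model.partition("__args__")
    params = {}
    if sep:  # only models carrying __args__ have parameters
        for part in arg_str.split("_-_"):
            key, _, value = part.partition("=")
            params[key] = value
    return name, params
```

Parameter values are returned as strings; callers can convert them (e.g., `int(params["dimensions"])`) as needed.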
The network_ops.train_networks module trains neural networks to learn multi-winner voting rules from generated data. It trains multiple networks using different feature sets and loss functions for robust comparison.
```shell
python -m network_ops.train_networks n_profiles=10000 n_voters=50 m=5 num_winners=3 data_path="data" varied_voters=True out_folder="trained_networks"
```

Required arguments:

- `n_profiles` (int): Number of training profiles to use
- `n_voters` (int): Number of voters per profile (must match training data)
- `m` (int): Number of candidates (must match training data)
- `num_winners` (int): Committee size (must match training data)
- `data_path` (str): Directory containing training data (generated by `generate_data`)
- `varied_voters` (bool): Whether training data has varied voters (must match training data)
- `out_folder` (str): Directory to save trained networks
Optional arguments:

- `axioms` (str): Axiom set used in training data (`"all"`, `"root"`, `"both"`). Default: `"all"`
- `pref_dist` (str): Specific preference distribution to train on. If not provided, trains on all available distributions
- `voters_std_dev` (int): Standard deviation for varied voters (must match training data if applicable)
The script automatically:
- Loads training data from the specified data_path
- Trains multiple networks per parameter set using different:
  - Feature sets (candidate pairs, binary pairs, rank matrices)
  - Loss functions (MSE, etc.)
- Saves trained models with descriptive filenames
- Records training losses for analysis
Default network configuration:

- Hidden layers: 5
- Hidden nodes: 256 per layer
- Epochs: 50
- Early stopping: Enabled with patience
- Optimizer: Adam
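The early stopping listed above can be sketched as a small tracker that halts training once validation loss stops improving. The patience value here is an illustrative assumption, not the training script's actual setting:

```python
class EarlyStopping:
    """Stop training when validation loss hasn't improved for
    `patience` consecutive epochs."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop this is called once per epoch: `if stopper.step(val_loss): break`.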
Trained networks are saved with naming convention:
```
NN-num_voters={n}-m={m}-num_winners={k}-pref_dist={dist}-axioms={axioms}-features={features}-loss={loss}-idx={idx}-.pt
```
Train networks for IC distribution:
```shell
python -m network_ops.train_networks n_profiles=10000 n_voters=50 m=5 num_winners=3 data_path="data" varied_voters=False pref_dist="IC" out_folder="networks"
```

Train on all distributions with varied voters:

```shell
python -m network_ops.train_networks n_profiles=5000 n_voters=50 m=6 num_winners=2 data_path="data" varied_voters=True voters_std_dev=10 out_folder="trained_networks"
```

The network_ops.evaluate_networks module evaluates trained neural networks against test data and compares their performance with existing multi-winner voting rules.
```shell
python -m network_ops.evaluate_networks n_profiles=1000 n_voters=50 varied_voters=True voters_std_dev=10 m=5 num_winners=3 data_path="data" network_path="trained_networks" out_folder="evaluation_results"
```

Required arguments:

- `n_profiles` (int): Number of test profiles to evaluate on
- `n_voters` (int): Number of voters per profile (must match test data)
- `varied_voters` (bool): Whether test data has varied voters (must match test data)
- `voters_std_dev` (int): Standard deviation for varied voters (must match test data)
- `m` (int): Number of candidates (must match test data)
- `num_winners` (int): Committee size (must match test data)
- `data_path` (str): Directory containing test data
- `network_path` (str): Directory containing trained networks
- `out_folder` (str): Directory to save evaluation results
Optional arguments:

- `axioms` (str): Axiom set to evaluate (`"all"`, `"root"`, `"both"`). Default: `"all"`
- `pref_dist` (str): Specific preference distribution to evaluate. If not provided, evaluates all available distributions
The script:
- Loads test data and trained networks
- Makes predictions using neural networks
- Compares against existing rules:
  - Borda, STV, PAV, Monroe, etc.
  - Random choice baseline
  - Min/Max violation committees
- Computes distances between all rule pairs
- Counts axiom violations for each rule
- Saves comprehensive results
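The distance between two rules' outputs on a profile can be measured, for example, as the Hamming distance between committee membership vectors. A sketch of that metric (the evaluation script's exact distance may differ):

```python
def committee_distance(committee_a, committee_b, m):
    """Hamming distance between two committees over m candidates,
    treating each committee as a 0/1 membership vector of length m."""
    a, b = set(committee_a), set(committee_b)
    return sum((c in a) != (c in b) for c in range(m))
```

Two identical committees have distance 0; two disjoint committees of size k have distance 2k. Averaging over all test profiles gives one entry of the rule-distance matrix.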
Generated files include:
- Axiom violation results: `axiom_violation_results-{params}.csv`
- Rule distance matrices: `rule_distances-{params}.csv`
- Performance comparisons: Accuracy and violation statistics
Evaluate networks on IC test data:
```shell
python -m network_ops.evaluate_networks n_profiles=1000 n_voters=50 varied_voters=False voters_std_dev=0 m=5 num_winners=3 data_path="data" network_path="trained_networks" pref_dist="IC" out_folder="results"
```

Comprehensive evaluation across all distributions:

```shell
python -m network_ops.evaluate_networks n_profiles=5000 n_voters=50 varied_voters=True voters_std_dev=10 m=6 num_winners=2 data_path="data" network_path="trained_networks" out_folder="evaluation_results"
```

- Test data parameters must exactly match training data parameters
- Networks must exist for the specified parameters
- Evaluation compares neural networks against 15+ established voting rules
- Results enable analysis of when neural networks perform well vs. poorly
The optimize_interpretable_rules.py script uses simulated annealing to optimize positional scoring rules, searching for scoring vectors that minimize violations of the specified axioms on sampled preference profiles.
```shell
python optimize_interpretable_rules.py num_winners=[1,2,3] axioms_to_optimize="reduced" num_profiles_to_sample=5000 n_annealing_steps=10000
```

- `num_winners` (list): List of committee sizes to optimize for (e.g., `[1,2,3]`)
- `axioms_to_optimize` (str or list): Which axioms to optimize for
  - `"reduced"` - Reduced axiom set (default)
  - `"all"` - All available axioms
  - Custom list (e.g., `["majority_winner", "condorcet_winner"]`)
- `num_profiles_to_sample` (int): Number of preference profiles to sample for optimization (default: 5000)
- `n_annealing_steps` (int): Number of simulated annealing steps (default: 0 = no annealing)
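A positional scoring rule gives each candidate the score of its position in every voter's ranking and elects the k highest-scoring candidates; Borda, for instance, uses the vector (m-1, m-2, ..., 0). A sketch of applying such a rule (tie-breaking by candidate index is an assumption here):

```python
def elect_committee(profile, scoring_vector, num_winners):
    """Elect the num_winners candidates with the highest total
    positional score; ties are broken by lower candidate index."""
    m = len(scoring_vector)
    totals = [0.0] * m
    for ranking in profile:
        for pos, cand in enumerate(ranking):
            totals[cand] += scoring_vector[pos]
    order = sorted(range(m), key=lambda c: (-totals[c], c))
    return sorted(order[:num_winners])
```

This is what makes the optimized rules "interpretable": the entire rule is summarized by one scoring vector.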
- Sampling: Generates random preference profiles from mixed distributions
- Optimization: Uses simulated annealing to find scoring vectors that minimize axiom violations
- Evaluation: Tests optimized rules against sampled profiles for each axiom
- Results: Outputs optimized scoring vectors and violation rates
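The annealing step can be sketched as follows. Here `count_violations` stands in for the real axiom-violation objective, and the perturbation move and linear cooling schedule are illustrative assumptions, not the script's actual implementation:

```python
import math
import random


def anneal_scoring_vector(count_violations, m, n_steps, seed=0):
    """Simulated annealing over normalized, non-increasing scoring
    vectors of length m. `count_violations(vector)` is the objective
    to minimize (a stand-in for counting axiom violations)."""
    rng = random.Random(seed)
    current = [(m - 1 - i) / (m - 1) for i in range(m)]  # start from Borda
    cost = count_violations(current)
    best, best_cost = list(current), cost
    for step in range(n_steps):
        temp = max(1e-3, 1.0 - step / n_steps)  # linear cooling
        candidate = list(current)
        i = rng.randrange(1, m - 1)  # keep endpoints fixed at 1 and 0
        lo, hi = candidate[i + 1], candidate[i - 1]
        candidate[i] = rng.uniform(lo, hi)  # preserves non-increasing order
        new_cost = count_violations(candidate)
        # accept improvements always; accept worse moves with a
        # temperature-dependent probability
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            current, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
    return best, best_cost
```

With `n_annealing_steps=0` the loop never runs and the starting vector is returned unchanged, matching the script's default of no annealing.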
Optimize for single winner with annealing:
```shell
python optimize_interpretable_rules.py num_winners=[1] axioms_to_optimize="reduced" n_annealing_steps=5000
```

Optimize for multiple committee sizes:

```shell
python optimize_interpretable_rules.py num_winners=[1,2,3,4] axioms_to_optimize="all" num_profiles_to_sample=10000
```

The experiment_both_axiom_sets/arXiv/ directory contains scripts to generate comprehensive appendix materials including tables, plots, and LaTeX documentation.
Creates pairwise distance matrices between voting rules:
```shell
cd experiment_both_axiom_sets/arXiv/
python make_distance_tables.py
```

- Output: LaTeX tables in `distance_tex_tables/`
- Process: Computes Hamming distances between rule outputs across all experimental conditions
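Emitting a distance matrix as a LaTeX `tabular` can be sketched roughly as below; the actual layout of the tables in distance_tex_tables/ may well differ:

```python
def matrix_to_latex(labels, matrix):
    """Render a square distance matrix as a LaTeX tabular string,
    with one header row and one label column."""
    cols = "l" + "c" * len(labels)
    lines = [f"\\begin{{tabular}}{{{cols}}}"]
    lines.append(" & " + " & ".join(labels) + r" \\ \hline")
    for label, row in zip(labels, matrix):
        cells = " & ".join(f"{x:.2f}" for x in row)
        lines.append(f"{label} & {cells} \\\\")
    lines.append(r"\end{tabular}")
    return "\n".join(lines)
```

The resulting string can be written to a `.tex` file and `\input` into the appendix document.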
Creates performance summary tables for all rules and distributions:
```shell
python make_summary_table.py
```

- Output: Formatted tables in `summary_tables/`
- Content: Axiom violation rates, rule rankings, statistical summaries
Creates visualization plots for experimental results:
```shell
python plot_experiment_data.py
```

- Output: PNG plots in `plots/` subdirectories
- Types: Axiom violation heatmaps, distribution comparisons, rule performance charts
Combines all materials into a single LaTeX appendix:
```shell
python make_appendix.py
```

- Output: `combined_appendix.tex`
- Content: Integrated tables, plots, and structured documentation
```
experiment_both_axiom_sets/arXiv/
├── distance_heatmaps/     # Distance visualization plots
├── distance_tex_tables/   # LaTeX distance tables
├── summary_tables/        # Performance summary tables
├── plots/                 # Experimental result plots
└── combined_appendix.tex  # Final appendix document
```
- `make_distance_tables.py`: Generates rule similarity matrices
- `make_summary_table.py`: Creates performance comparison tables
- `plot_experiment_data.py`: Generates all experimental plots
- `make_appendix.py`: Compiles everything into LaTeX format
- Scripts expect experimental results to be available in the appropriate data directories
- LaTeX compilation requires the generated `.tex` files and associated plot images
- The workflow processes data for m=5, 6, and 7 candidates across all preference distributions