Auptimize optimizes spatial audio layouts for improved perceptual sound localization. It adjusts sound source positions based on empirical human perception data to compensate for localization errors such as front-back confusion and elevation misjudgment.
Given a set of visual element positions on a sphere around a listener, Auptimize uses integer programming (Gurobi) to find optimal sound source positions that maximize the likelihood of correct auditory localization. The system uses frequency distributions derived from collected participant data to model how humans perceive sound directions.
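The selection idea can be illustrated with a toy sketch. This is not the Gurobi model in `optimization.py`: a brute-force search stands in for the integer program, the position labels and frequencies are made up, and the rule that each sound source is used at most once is an assumption made for illustration.

```python
from itertools import permutations
from math import prod

# Hypothetical perception model (made-up numbers):
# MODEL[source][perceived] = relative frequency with which a sound played
# at `source` was heard at `perceived`.
MODEL = {
    "front": {"front": 0.6, "back": 0.4},
    "back": {"front": 0.1, "back": 0.9},
    "front_up": {"front": 0.8, "back": 0.2},
}


def best_sources(targets, model):
    """Pick one distinct source per visual target, maximizing the joint
    likelihood that every sound is perceived at its target (brute force)."""
    best, best_score = None, -1.0
    for combo in permutations(model, len(targets)):
        score = prod(model[s].get(t, 0.0) for s, t in zip(combo, targets))
        if score > best_score:
            best, best_score = combo, score
    return dict(zip(targets, best)), best_score


layout, score = best_sources(["front", "back"], MODEL)
print(layout, round(score, 2))
```

In this toy model, a sound meant to be heard in front is better played from the hypothetical `front_up` position, because playing it from `front` is often confused with `back`.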
- Python 3.9
- numpy
- pandas
- scipy
- gurobipy (Gurobi optimizer)
`conda create -n auptimize python=3.9`

`python -m pip install numpy pandas scipy gurobipy`

Note: Using Gurobi requires a valid license, which is free for academics (https://www.gurobi.com/solutions/licensing).
If you want to test Auptimize using the outputs from our user study, the pre-computed results are already included in simulation_data_userstudy/. You can convert them to Unity-compatible coordinate files and export them directly to a Unity project.
- Run `generate_unity_layouts.py` to convert the user study data into Unity coordinate files:

  `python generate_unity_layouts.py`

  This reads the CSV files from `simulation_data_userstudy/` and writes `.txt` layout files to `../unity/Assets/Resources/TrialCoordinates/`. Each file contains pairs of visual and sound coordinates in radians, ready for use in Unity.
If you want to run the optimization yourself on new layout configurations, you will need the participant data to build the perception model.
- Download participant data. If the `data/` directory is not included in this repository, download it from this link. Place the downloaded participant folders into `./data/` so the directory structure looks like:

      python/
        data/
          p1/
            headposition.csv
            main.csv
          p2_.../
            headposition.csv
            main.csv
          ...

- Run Auptimize optimization. This generates layouts, optimizes sound positions using integer programming, and saves the results to CSV:

  `python simulation.py --layout_type random --num_elements 2 --num_layouts 1000`

  Options:

  - `--layout_type`: `random`, `side_by_side`, or `cone_of_confusion`
  - `--num_elements`: Number of audio-visual elements (only for `random` and `side_by_side` layouts)
  - `--num_layouts`: Number of layouts to generate (only for `random` layouts; all possible layouts are generated for `side_by_side` and `cone_of_confusion` layouts)

  Results are saved to `auptimize_simulation/`.

- Export to Unity (optional). Convert your new simulation results to Unity layout files by updating the `in_dir` path in `generate_unity_layouts.py` to point to your output directory, then run:

  `python generate_unity_layouts.py`
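Before running the optimization, it may help to confirm that the participant data is laid out as described in the download step. The sketch below is not part of the repository. The helper name `find_incomplete_participants` is hypothetical, and it assumes every subfolder of `./data/` is a participant folder that should contain `headposition.csv` and `main.csv`.

```python
from pathlib import Path

# File names taken from the directory structure described above.
REQUIRED_FILES = ("headposition.csv", "main.csv")


def find_incomplete_participants(data_dir):
    """Return {folder_name: [missing file names]} for every participant
    folder under data_dir that lacks one of the required CSV files."""
    missing = {}
    for folder in sorted(Path(data_dir).iterdir()):
        if not folder.is_dir():
            continue  # ignore stray files next to the participant folders
        absent = [name for name in REQUIRED_FILES if not (folder / name).is_file()]
        if absent:
            missing[folder.name] = absent
    return missing
```

Calling `find_incomplete_participants("./data")` from the `python/` directory returns an empty dictionary when every participant folder is complete.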
| File | Description |
|---|---|
| `optimization.py` | Integer programming optimization using Gurobi. Computes frequency distributions from participant data and solves for optimal sound positions |
| `simulation.py` | Main CLI entry point. Runs optimization, predicts perceived locations, and computes evaluation metrics |
| `utils.py` | Data loading (`setup()`) and coordinate conversion between rectangular and spherical coordinate systems |
| `evaluation_metrics.py` | Distance and error functions: circular difference, Haversine angular distance, geodesic distance, L1/L2 norms, cone-of-confusion distance, and adjusted error metrics |
| `generate_layouts.py` | Generates layout configurations: random, side-by-side, cone-of-confusion, and exhaustive grid combinations |
| `generate_unity_layouts.py` | Converts simulation result CSVs to Unity-compatible coordinate files (radians) |
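Two of the metrics listed for `evaluation_metrics.py` can be sketched with their textbook definitions. The function names and signatures here are illustrative and may differ from the repository's, and treating theta as longitude and phi as latitude is an assumption.

```python
import math


def circular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def haversine_angle(theta1, phi1, theta2, phi2):
    """Great-circle angle in degrees between two directions, each given as
    azimuth theta and elevation phi in degrees (Haversine formula)."""
    t1, p1, t2, p2 = map(math.radians, (theta1, phi1, theta2, phi2))
    h = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((t2 - t1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def geodesic_distance(theta1, phi1, theta2, phi2, radius=1.5):
    """Arc length along the sphere surface for the same pair of directions."""
    return radius * math.radians(haversine_angle(theta1, phi1, theta2, phi2))
```

For example, `circular_difference(350, 10)` is 20 degrees rather than 340, which is why a plain subtraction is not suitable for azimuth errors.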
- Theta (azimuthal): 0-360 degrees (front=0, left=90, back=180, right=270)
- Phi (elevation): -60 to +60 degrees (down=-60, horizontal=0, up=+60)
- Radius: Fixed at 1.5 (listener-centric sphere)
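The angle conventions above can be turned into Cartesian positions as in the sketch below. The axis convention (x to the right, y up, z forward) is an assumption borrowed from Unity's left-handed layout and may not match the conversion in `utils.py`. The function name is illustrative.

```python
import math

RADIUS = 1.5  # fixed listener-centric radius from the list above


def spherical_to_cartesian(theta_deg, phi_deg, radius=RADIUS):
    """Convert azimuth theta and elevation phi (degrees) to (x, y, z).

    Assumed axes: x = right, y = up, z = forward. Theta grows toward the
    listener's left, so left (theta = 90) maps to negative x.
    """
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    horizontal = radius * math.cos(phi)  # length of the projection onto the ground plane
    x = -horizontal * math.sin(theta)
    y = radius * math.sin(phi)
    z = horizontal * math.cos(theta)
    return x, y, z
```

Under these assumptions, front (theta = 0, phi = 0) lands on the positive z axis at distance 1.5, and back (theta = 180) lands on the negative z axis.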