PoseSceneEveryWhere
A Framework for Generating Multi-View Data for Action Recognition Training from Monocular Videos
All tests were conducted in the following environment.
Other environments may require checking version compatibility.
CPU: Intel(R) Core(TM) i9-13900KF
GPU: Nvidia GeForce RTX 4090, CUDA 12.1
OS: Ubuntu 24.04 LTS
Conda: 25.5.1
To run the modules provided in this repository, you'll need to set up a Conda-based environment.
If Conda isn't already installed on your system, please follow the link below to install it before proceeding with the next steps.
Download Anaconda or Download Miniconda
Step 1. Clone Repository
git clone https://github.com/PLASS-Lab/PSEW
cd PSEW
Step 2. Creating and Activating a Conda Virtual Environment
conda env create -f env.yaml
conda activate psew
Step 3. Model Weight Download
Place the weight files downloaded from the links below into the {root}/model directory.
- SMPL-X: Download Link: SMPL eXpressive
- MultiHMR: Download Link: multiHMR_896_L.pt
.
├── checker.py
├── loader.py
├── multiHMR
│   └── multiHMR_896_L.pt
├── smpl_mean_params.npz
└── smplx
    ├── SMPLX_FEMALE.npz
    ├── SMPLX_FEMALE.pkl
    ├── SMPLX_MALE.npz
    ├── SMPLX_MALE.pkl
    ├── SMPLX_NEUTRAL.npz
    ├── SMPLX_NEUTRAL.pkl
    └── version.txt
Place the source video for generating multi-viewpoint videos in the {root}/data/input directory.
Note: The PSEW framework supports video files with the following extensions: .mp4, .avi, .mov, .mkv.
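Both placement steps above can be checked before a run. The following is a minimal sketch, not part of the repository: the function names `missing_weights` and `find_input_videos` are hypothetical, the list of checked weight files is only a subset of the tree shown above, and matching extensions case-insensitively is an assumption about how the framework treats them.

```python
from pathlib import Path

# Extensions listed in the note above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}

# Weight files taken from the directory tree above (relative to {root}/model).
# Only a subset is checked; which SMPL-X files are actually required is an assumption.
EXPECTED_WEIGHTS = [
    "multiHMR/multiHMR_896_L.pt",
    "smpl_mean_params.npz",
    "smplx/SMPLX_NEUTRAL.npz",
]


def missing_weights(root):
    """Return the expected weight files that are absent under root/model."""
    model_dir = Path(root) / "model"
    return [rel for rel in EXPECTED_WEIGHTS if not (model_dir / rel).is_file()]


def find_input_videos(root):
    """Return supported video files in root/data/input, sorted by path."""
    input_dir = Path(root) / "data" / "input"
    if not input_dir.is_dir():
        return []
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    root = Path(".")  # assumed to be the repository root
    print("Missing weights:", missing_weights(root))
    print("Input videos:", [p.name for p in find_input_videos(root)])
```

Run from the repository root, an empty "Missing weights" list and a non-empty "Input videos" list suggest the files are where the framework expects them.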
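Next, define the virtual camera settings in data/camera_config.json. The # annotations in the listing are explanatory only; standard JSON does not permit comments, so they likely need to be omitted from the actual file.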
# camera_config.json
{
  "size": {                  # Output video resolution
    "width": 1920,           # Output video horizontal pixels
    "height": 1080           # Output video vertical pixels
  },
  "fps": 30,                 # Output video frames per second (FPS)
  "brightness": 0.5,         # Virtual space brightness
  "distance": 3,             # Distance from the origin of the space to the virtual camera
  "fov": 130,                # Virtual camera FOV value
  "rotation": {              # Virtual camera rotation value
    "horizontal": {          # Virtual camera horizontal rotation value
      "angle_start": 0,
      "angle_end": 359,
      "angle_step": 30
    },
    "vertical": {            # Virtual camera vertical rotation value
      "angle_start": 0,
      "angle_end": 60,
      "angle_step": 30
    }
  }
}
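To get a feel for how many viewpoints a rotation block describes, the start/end/step triples can be expanded into explicit angle pairs. This is a sketch under stated assumptions rather than the framework's own logic: the helper names `angle_values` and `enumerate_views` are hypothetical, angles are assumed to be integers, and `angle_end` is assumed to be inclusive.

```python
import json
from itertools import product


def angle_values(axis):
    """Expand one rotation axis into a list of angles.

    Assumption: angle_end is inclusive. The framework may treat it differently.
    """
    start, end, step = axis["angle_start"], axis["angle_end"], axis["angle_step"]
    if step <= 0:
        raise ValueError("angle_step must be positive")
    return list(range(start, end + 1, step))


def enumerate_views(config):
    """Return (vertical, horizontal) angle pairs described by the rotation block."""
    rotation = config["rotation"]
    return list(product(angle_values(rotation["vertical"]),
                        angle_values(rotation["horizontal"])))


# The values from camera_config.json above, without the # annotations.
CONFIG_TEXT = """
{
  "size": {"width": 1920, "height": 1080},
  "fps": 30,
  "brightness": 0.5,
  "distance": 3,
  "fov": 130,
  "rotation": {
    "horizontal": {"angle_start": 0, "angle_end": 359, "angle_step": 30},
    "vertical": {"angle_start": 0, "angle_end": 60, "angle_step": 30}
  }
}
"""

config = json.loads(CONFIG_TEXT)
views = enumerate_views(config)
print(len(views))   # 36 under the inclusive-end assumption (3 vertical x 12 horizontal)
print(views[:3])    # [(0, 0), (0, 30), (0, 60)]
```

Under these assumptions the sample configuration yields 12 horizontal angles (0 to 330) and 3 vertical angles (0, 30, 60), so a smaller `angle_step` on either axis multiplies the number of rendered views.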
To run this framework, execute the following command.
python run.py
The execution arguments are as follows:
| Argument | Type | Default | Choices | Description |
|---|---|---|---|---|
| `--model` | str | `multiHMR_896_L` | - | Model to use for pose estimation |
| `--input_dir` | str | `data/input` | - | Directory containing input videos |
| `--working_dir` | str | `data/working` | - | Working directory for temporary files |
| `--output_dir` | str | `data/output` | - | Output directory for generated videos |
| `--camera_config` | str | `data/camera_config.json` | - | Path to camera configuration file |
| `--save_overlay` | int | `1` | `[0,1]` | Whether to save overlay visualization |
| `--extra_views` | int | `0` | `[0,1]` | Whether to generate extra viewpoints |
| `--det_thresh` | float | `0.3` | - | Detection threshold for pose estimation |
| `--nms_kernel_size` | float | `3` | - | Non-maximum suppression kernel size |
| `--fov` | float | `60` | - | Field of view for virtual camera |
| `--distance` | int | `0` | `[0,1]` | Distance calculation mode |
| `--unique_color` | int | `1` | `[0,1]` | Whether to use unique colors for different people |
| `--use_checkpoint` | int | `1` | `[0,1]` | Whether to use checkpointing for resuming |
| `--draw_outline` | int | `1` | `[0,1]` | Whether to draw mesh outlines |
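The table above maps naturally onto an `argparse` declaration. The sketch below is a hypothetical reconstruction built only from the documented names, types, defaults, and choices; the actual parser in run.py may be written differently.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the argument table; run.py may differ."""
    p = argparse.ArgumentParser(description="PSEW argument sketch")
    p.add_argument("--model", type=str, default="multiHMR_896_L")
    p.add_argument("--input_dir", type=str, default="data/input")
    p.add_argument("--working_dir", type=str, default="data/working")
    p.add_argument("--output_dir", type=str, default="data/output")
    p.add_argument("--camera_config", type=str, default="data/camera_config.json")
    p.add_argument("--det_thresh", type=float, default=0.3)
    p.add_argument("--nms_kernel_size", type=float, default=3)
    p.add_argument("--fov", type=float, default=60)
    # Flags documented as int with choices [0, 1], listed with their defaults.
    for name, default in [("save_overlay", 1), ("extra_views", 0), ("distance", 0),
                          ("unique_color", 1), ("use_checkpoint", 1), ("draw_outline", 1)]:
        p.add_argument(f"--{name}", type=int, default=default, choices=[0, 1])
    return p


# Equivalent to: python run.py --extra_views 1 --det_thresh 0.5
args = build_parser().parse_args(["--extra_views", "1", "--det_thresh", "0.5"])
print(args.extra_views, args.det_thresh, args.output_dir)  # 1 0.5 data/output
```

Because the on/off options are integers restricted to 0 and 1, they are passed with an explicit value (for example `--save_overlay 0`) rather than as bare switches.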
When you run the framework, a working directory will be created inside the {root}/data directory. The structure of the working directory is as follows:
| Element | Description | Filename Rules |
|---|---|---|
| `frames` | Frames extracted from the video | `{frame_index}.jpg` |
| `checkpoint` | Step-by-step processing results | - |
| `capture` | Frames rendered from the various camera angles | `{v_rotation_angle}_{h_rotation_angle}_{frame_index}.jpg` |
| `mesh` | NumPy array of 3D human mesh vertices | `{frame_index}.npy` |
| `face` | NumPy array of 3D human mesh faces | `{frame_index}.npy` |
| `scene` | Virtual space in which the 3D human mesh is placed | `{frame_index}.glb` |
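The `capture` filename rule encodes the viewpoint and the frame index, so captured frames can be regrouped per camera angle. The following sketch assumes all three fields are non-negative integers; the names `parse_capture_name` and `group_by_view` are hypothetical and not part of the repository.

```python
import re
from collections import defaultdict

# Matches {v_rotation_angle}_{h_rotation_angle}_{frame_index}.jpg.
# Assumption: all three fields are non-negative integers.
CAPTURE_PATTERN = re.compile(r"^(\d+)_(\d+)_(\d+)\.jpg$")


def parse_capture_name(filename):
    """Return (v_angle, h_angle, frame_index), or None if the name does not match."""
    match = CAPTURE_PATTERN.match(filename)
    if match is None:
        return None
    v_angle, h_angle, frame_index = map(int, match.groups())
    return v_angle, h_angle, frame_index


def group_by_view(filenames):
    """Map (v_angle, h_angle) to the sorted frame indices found for that view."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_capture_name(name)
        if parsed is not None:
            v_angle, h_angle, frame_index = parsed
            groups[(v_angle, h_angle)].append(frame_index)
    return {view: sorted(frames) for view, frames in groups.items()}


names = ["0_30_2.jpg", "0_30_1.jpg", "30_0_1.jpg", "notes.txt"]
print(group_by_view(names))  # {(0, 30): [1, 2], (30, 0): [1]}
```

Grouping this way makes it easy to spot a view with missing frames before the per-view videos are assembled.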
When the framework's task is complete, the results are saved to data/output.
The videos are stored in a directory named after the source video's title and follow the filename rule {v_rotation_angle}_{h_rotation_angle}_{frame_index}.mp4.