    ███████▙╗  ▟███████╗  ████████╗  ██╗ ██╗ ██╗
    ██╔═══██║  ██╔═════╝  ██╔═════╝  ██║ ██║ ██║
    ███████▛║  ▜██████▙╗  ████████╗  ██║ ██║ ██║
    ██╔═════╝        ██║  ██╔═════╝  ██║ ██║ ██║
    ██║        ███████▛║  ████████╗  ▜███▛▜███▛║
    ╚═╝        ╚═══════╝  ╚═══════╝  ╚════╩════╝
      P o s e  S c e n e  E v e r y  W h e r e

PSEW (Pose Scene EveryWhere)

A Framework for Generating Multi-View Data for Action Recognition Training from Monocular Videos

Note

All tests were conducted in the following environment.

Version compatibility may need to be verified in some environments.

CPU: Intel(R) Core(TM) i9-13900KF
GPU: Nvidia GeForce RTX 4090, CUDA 12.1
OS: Ubuntu 24.04 LTS
Conda: 25.5.1

Installation

To run the modules provided in this repository, you'll need to set up a Conda-based environment.

If Conda isn't already installed on your system, please follow the link below to install it before proceeding with the next steps.

🔗 Download Anaconda or 🔗 Download Miniconda

Step 1. Clone Repository

git clone https://github.com/PLASS-Lab/PSEW
cd PSEW

Step 2. Creating and Activating a Conda Virtual Environment

conda env create -f env.yaml
conda activate psew

Step 3. Model Weight Download

Place the weight files you downloaded from the link below into the {root}/model directory. The directory should then have the following structure:

.
├── checker.py
├── loader.py
├── multiHMR
│   └── multiHMR_896_L.pt
├── smpl_mean_params.npz
└── smplx
    ├── SMPLX_FEMALE.npz
    ├── SMPLX_FEMALE.pkl
    ├── SMPLX_MALE.npz
    ├── SMPLX_MALE.pkl
    ├── SMPLX_NEUTRAL.npz
    ├── SMPLX_NEUTRAL.pkl
    └── version.txt
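Before running the framework, it can help to confirm that the downloaded files ended up in the right place. The following is a minimal sketch, not part of the repository: `EXPECTED_FILES` and `missing_model_files` are hypothetical names, the file list is copied from the tree above (non-Python files only), and the script assumes it is run from {root}.

```python
from pathlib import Path

# Relative paths copied from the directory tree above (non-Python files only).
EXPECTED_FILES = [
    "multiHMR/multiHMR_896_L.pt",
    "smpl_mean_params.npz",
    "smplx/SMPLX_FEMALE.npz",
    "smplx/SMPLX_FEMALE.pkl",
    "smplx/SMPLX_MALE.npz",
    "smplx/SMPLX_MALE.pkl",
    "smplx/SMPLX_NEUTRAL.npz",
    "smplx/SMPLX_NEUTRAL.pkl",
    "smplx/version.txt",
]


def missing_model_files(model_dir):
    """Return the expected files that do not exist under model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is {root}.
    missing = missing_model_files("model")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  model/" + rel)
    else:
        print("All expected model files are present.")
```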

Data Preparation

Place the source video for generating multi-viewpoint videos in the {root}/data/input directory.

Note: The PSEW framework supports video files with the following extensions: .mp4, .avi, .mov, .mkv.
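To see which files in the input directory would qualify, a short sketch such as the one below may be useful. It is an illustration rather than the framework's own loader: `find_input_videos` is a hypothetical helper, and matching extensions case-insensitively is an assumption that the source does not confirm.

```python
from pathlib import Path

# Extensions listed in the note above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv"}


def find_input_videos(input_dir="data/input"):
    """Return the supported video files in input_dir, sorted by path."""
    folder = Path(input_dir)
    if not folder.is_dir():
        return []
    return sorted(
        p for p in folder.iterdir()
        # Assumption: ".MP4" is treated the same as ".mp4".
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


if __name__ == "__main__":
    for video in find_input_videos():
        print(video.name)
```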

Setting Virtual Camera

Define the settings for the virtual camera in data/camera_config.json.

# camera_config.json
{
    "size": { #················ Output Video Resolution
        "width": 1920, #······· Output video horizontal pixels
        "height": 1080 #······· Output video vertical pixels
    },
    "fps": 30, #··············· Output video frames per second (FPS)
    "brightness": 0.5, #······· Virtual space brightness
    "distance": 3, #··········· Distance from the origin of the space to the virtual camera
    "fov": 130, #·············· Virtual camera FOV value
    "rotation": { #············ Virtual camera rotation value
        "horizontal": { #······ Virtual camera horizontal rotation value
            "angle_start": 0,
            "angle_end": 359,
            "angle_step": 30
        },
        "vertical": { #········ Virtual camera vertical rotation value
            "angle_start": 0,
            "angle_end": 60,
            "angle_step": 30
        }
    }
}
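The rotation settings determine how many viewpoints are rendered. The sketch below expands the example values above into explicit angle lists. It assumes that angles are integers and that `angle_end` is inclusive; how PSEW itself interprets the range is not confirmed here, and `angle_values` is a hypothetical helper.

```python
from itertools import product


def angle_values(spec):
    """Expand a start/end/step spec into a list of angles.

    Assumption: angle_end is inclusive.
    """
    return list(range(spec["angle_start"], spec["angle_end"] + 1, spec["angle_step"]))


# Rotation values copied from the example configuration above.
rotation = {
    "horizontal": {"angle_start": 0, "angle_end": 359, "angle_step": 30},
    "vertical": {"angle_start": 0, "angle_end": 60, "angle_step": 30},
}

horizontal = angle_values(rotation["horizontal"])  # 0, 30, ..., 330
vertical = angle_values(rotation["vertical"])      # 0, 30, 60
views = list(product(vertical, horizontal))        # (vertical, horizontal) pairs

print(len(horizontal), len(vertical), len(views))  # prints: 12 3 36
```

Under these assumptions, the example configuration yields 12 horizontal and 3 vertical angles, or 36 viewpoints per source video.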

Run

To run this framework, execute the following command.

python run.py

The execution arguments are as follows:

| Argument | Type | Default | Choices | Description |
|---|---|---|---|---|
| --model | str | multiHMR_896_L | - | Model to use for pose estimation |
| --input_dir | str | data/input | - | Directory containing input videos |
| --working_dir | str | data/working | - | Working directory for temporary files |
| --output_dir | str | data/output | - | Output directory for generated videos |
| --camera_config | str | data/camera_config.json | - | Path to camera configuration file |
| --save_overlay | int | 1 | [0,1] | Whether to save overlay visualization |
| --extra_views | int | 0 | [0,1] | Whether to generate extra viewpoints |
| --det_thresh | float | 0.3 | - | Detection threshold for pose estimation |
| --nms_kernel_size | float | 3 | - | Non-maximum suppression kernel size |
| --fov | float | 60 | - | Field of view for virtual camera |
| --distance | int | 0 | [0,1] | Distance calculation mode |
| --unique_color | int | 1 | [0,1] | Whether to use unique colors for different people |
| --use_checkpoint | int | 1 | [0,1] | Whether to use checkpointing for resuming |
| --draw_outline | int | 1 | [0,1] | Whether to draw mesh outlines |
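Several options are on/off switches expressed as the integers 0 and 1. The sketch below shows one way such arguments could be declared and parsed with `argparse`. It is an illustration only and not the parser used in run.py; it covers a subset of the arguments, with names, types, and defaults taken from the table above, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for an illustrative subset of the arguments."""
    parser = argparse.ArgumentParser(description="Illustrative subset of the run.py arguments")
    parser.add_argument("--input_dir", type=str, default="data/input",
                        help="Directory containing input videos")
    parser.add_argument("--output_dir", type=str, default="data/output",
                        help="Output directory for generated videos")
    # 0/1 switches are plain integers restricted to two choices.
    parser.add_argument("--save_overlay", type=int, default=1, choices=[0, 1],
                        help="Whether to save overlay visualization")
    parser.add_argument("--det_thresh", type=float, default=0.3,
                        help="Detection threshold for pose estimation")
    parser.add_argument("--fov", type=float, default=60,
                        help="Field of view for virtual camera")
    return parser


# Equivalent to passing "--save_overlay 0 --fov 90" on the command line.
args = build_parser().parse_args(["--save_overlay", "0", "--fov", "90"])
print(args.save_overlay, args.fov, args.input_dir)  # prints: 0 90.0 data/input
```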

When you run the framework, a working directory will be created inside the {root}/data directory. The structure of the working directory is as follows:

| Element | Description | Filename Rules |
|---|---|---|
| frames | Frames extracted from the video | {frame_index}.jpg |
| checkpoint | Step-by-step processing results | - |
| capture | Frames from various camera angles | {v_rotation_angle}_{h_rotation_angle}_{frame_index}.jpg |
| mesh | 3D human mesh vertex information (NumPy array) | {frame_index}.npy |
| face | 3D human mesh face information (NumPy array) | {frame_index}.npy |
| scene | Virtual space where a 3D human mesh is placed | {frame_index}.glb |
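When post-processing the capture frames, the filename rule can be split back into its three fields. The following is a minimal sketch: `CaptureName` and `parse_capture_name` are hypothetical names, the example filename is made up, and the code assumes that all three fields are integers separated by underscores.

```python
from pathlib import Path
from typing import NamedTuple


class CaptureName(NamedTuple):
    v_rotation_angle: int
    h_rotation_angle: int
    frame_index: int


def parse_capture_name(filename):
    """Split '{v_rotation_angle}_{h_rotation_angle}_{frame_index}.jpg' into its fields."""
    parts = Path(filename).stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"Unexpected capture filename: {filename}")
    # Assumption: every field is an integer.
    v_angle, h_angle, frame_index = (int(part) for part in parts)
    return CaptureName(v_angle, h_angle, frame_index)


print(parse_capture_name("30_120_15.jpg"))
# prints: CaptureName(v_rotation_angle=30, h_rotation_angle=120, frame_index=15)
```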

When the framework's task is complete, the results are saved to data/output.

The videos are placed in a directory named after the source video's title and follow the filename rule {v_rotation_angle}_{h_rotation_angle}_{frame_index}.mp4.