We take an algorithmic approach to calculate a score for a single repetition (rep) of a given exercise. Given a time series of 3D keypoints, we calculate the angles between joints, the joint velocities, and the distances between joints. To calculate the score, we compare these values to a predefined range of 'perfect' values for the specific exercise.

We use the 17 keypoints of Human3.6M:

| Joint Number | Joint Name |
|---|---|
| 0 | Root |
| 1 | Right Hip |
| 2 | Right Knee |
| 3 | Right Ankle |
| 4 | Left Hip |
| 5 | Left Knee |
| 6 | Left Ankle |
| 7 | Belly |
| 8 | Neck |
| 9 | Nose |
| 10 | Head |
| 11 | Left Shoulder |
| 12 | Left Elbow |
| 13 | Left Wrist |
| 14 | Right Shoulder |
| 15 | Right Elbow |
| 16 | Right Wrist |
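
As a rough illustration of the scoring idea described at the top of this section, the sketch below computes one joint angle from a single pose and maps it to a score. The joint indices come from the table above. The names `joint_angle` and `range_score`, the linear falloff outside the range, and the `tolerance` parameter are assumptions made for illustration, not this project's actual scoring rule.

```python
import math

# Human3.6M joint indices, taken from the table above.
RIGHT_HIP, RIGHT_KNEE, RIGHT_ANKLE = 1, 2, 3


def joint_angle(pose, a, b, c):
    """Angle in degrees at joint b, between the segments b->a and b->c.

    `pose` is a sequence of 17 (x, y, z) tuples.
    """
    v1 = [pose[a][k] - pose[b][k] for k in range(3)]
    v2 = [pose[c][k] - pose[b][k] for k in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("degenerate segment: two joints coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def range_score(value, low, high, tolerance):
    """Hypothetical rule: 1.0 inside [low, high], falling linearly
    to 0.0 once the value is `tolerance` outside the range."""
    if low <= value <= high:
        return 1.0
    overshoot = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - overshoot / tolerance)
```

For example, `joint_angle(pose, RIGHT_HIP, RIGHT_KNEE, RIGHT_ANKLE)` gives the right knee angle for one frame, and `range_score(angle, 80.0, 100.0, 20.0)` would score it against a made-up 'perfect' range of 80 to 100 degrees.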

We split a video of multiple repetitions of the same exercise into individual clips, each containing a single repetition.
These clips are used later for analysis and for tracking form across a whole set of reps.

The algorithm we use is derived from AIFit. We first obtain an estimate using a fixed clip length, then optimize to find the start and stop frames of each individual clip.
To obtain an initial estimate of the segmentation, we follow these steps:

- Assume a fixed-period pose signal: $T_{init} = T(t_{start}, \tau) = \{T_i \mid t_i = t_{start} + (i - 1)\tau\}$
  - $\tau$: period
  - $t_{start}$: starting point of repetitions
- Define affinity between two 3D poses: $A(p_m, p_n)$ = negative mean per joint position error (MPJPE) between $p_m$ and $p_n$
- Determine initial period estimate $\tau^*$ using auto-correlation: $R_{PP}(\tau, s) = \frac{1}{N - 2s - \tau} \sum_{t=s}^{N-s-\tau} A(p_t, p_{t+\tau})$
  - $s$: signal shrinkage at both ends to account for noise
  - $N$: total number of poses
- Iterate over $s$ and $\tau$ to find $\tau^*$:
  - Select smallest $\tau$ where $R_{PP}(\tau, s)$ reaches a local maximum
  - Corresponding $s$ becomes $s^*$
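
The period estimate can be sketched in plain Python as follows. This is a minimal reading of the steps above, not the AIFit implementation: the function names, the `s_candidates` and `min_tau` parameters, the 0-based frame indexing, and the tie-breaking between shrinkage values are all assumptions. A pose is taken to be a sequence of (x, y, z) tuples.

```python
import math


def affinity(p, q):
    """A(p, q): negative MPJPE between two poses."""
    return -sum(math.dist(a, b) for a, b in zip(p, q)) / len(p)


def autocorrelation(poses, tau, s):
    """R_PP(tau, s), normalised by N - 2s - tau.

    Assumption: with 0-based frames, t runs over range(s, N - s - tau),
    which gives exactly N - 2s - tau terms.
    """
    n = len(poses)
    total = sum(affinity(poses[t], poses[t + tau])
                for t in range(s, n - s - tau))
    return total / (n - 2 * s - tau)


def first_local_max(poses, s, min_tau=2):
    """Smallest tau at which R_PP(tau, s) is a local maximum, else None.

    min_tau=2 so that every candidate has a left neighbour (tau - 1 >= 1).
    """
    n = len(poses)
    max_tau = n - 2 * s - 1  # keeps at least one term in the sum
    r = {tau: autocorrelation(poses, tau, s)
         for tau in range(min_tau - 1, max_tau + 1)}
    for tau in range(min_tau, max_tau):
        if r[tau] > r[tau - 1] and r[tau] >= r[tau + 1]:
            return tau
    return None


def estimate_period(poses, s_candidates=(0,)):
    """Return (tau_star, s_star): the smallest local-maximum tau over all
    shrinkage candidates. Ties keep the earlier candidate (an assumption)."""
    best = None
    for s in s_candidates:
        tau = first_local_max(poses, s)
        if tau is not None and (best is None or tau < best[0]):
            best = (tau, s)
    return best
```

On real pose data the autocorrelation is noisy, so smoothing it before looking for local maxima may be needed; that step is omitted here.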
After estimating the period $\tau^*$, we search for the beginning of the first repetition $t_{start}$ by maximizing the average affinity $A_{avg}$ of $T(t_{start}, \tau^*)$:

$A_{avg}(T) = \frac{2}{K(K-1)} \sum_{i<j} A_{seq}(T_i, T_j)$

where $K$ is the number of intervals in $T$ and:

$A_{seq}(T_i, T_j) = \frac{1}{\tau^*} \sum_{l=1}^{\tau^*} A(p_{t_i+l}, p_{t_j+l})$

- $A_{seq}(T_i, T_j)$ computes the similarity between two repetitions of equal period $\tau^*$ (intervals $T_i$ and $T_j$).
- $A_{avg}(T)$ averages similarities between all possible pairs of intervals, representing a global affinity of the repetition segmentation $T$.
- At this stage, $T$ is parameterized only by $t_{start}$, since $\tau^*$ was found in the previous step.

We select $t^*_{start}$ as the smallest value of $t_{start}$ for which $A_{avg}(T(t_{start}, \tau^*))$ has a local maximum. This approach:
- Provides the highest similarity between repetitions.
- Uses the smallest such maximum to prevent solutions like the beginning of the 2nd/3rd/etc. interval, which are also local maxima.
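
The start search can be sketched the same way. Again this is only an illustration under assumptions: the function names are hypothetical, intervals are treated as the 0-based frames `t_i .. t_i + tau - 1`, only complete intervals are compared, and the first and last candidates count as local maxima if they beat their single neighbour.

```python
import math
from itertools import combinations


def affinity(p, q):
    """A(p, q): negative MPJPE between two poses."""
    return -sum(math.dist(a, b) for a, b in zip(p, q)) / len(p)


def seq_affinity(poses, t_i, t_j, tau):
    """A_seq: mean frame-wise affinity between two intervals of length tau."""
    return sum(affinity(poses[t_i + l], poses[t_j + l])
               for l in range(tau)) / tau


def avg_affinity(poses, t_start, tau):
    """A_avg: mean A_seq over all pairs of complete intervals."""
    starts = range(t_start, len(poses) - tau + 1, tau)
    pairs = list(combinations(starts, 2))
    if not pairs:
        raise ValueError("need at least two complete intervals")
    return sum(seq_affinity(poses, i, j, tau) for i, j in pairs) / len(pairs)


def estimate_start(poses, tau):
    """Smallest t_start at which A_avg(T(t_start, tau)) is a local maximum."""
    last = len(poses) - 2 * tau  # latest start that still leaves two intervals
    if last < 0:
        raise ValueError("sequence is shorter than two periods")
    scores = [avg_affinity(poses, t, tau) for t in range(last + 1)]
    for t, score in enumerate(scores):
        left_ok = t == 0 or score > scores[t - 1]
        right_ok = t == len(scores) - 1 or score >= scores[t + 1]
        if left_ok and right_ok:
            return t
    return None  # not reached: the first global maximum always qualifies
```

Scanning from the smallest candidate upward and returning at the first hit is what implements the "smallest such maximum" rule above.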

| Metric | Formula |
|---|---|
| Angle | |
| Velocity | $\mathbf{v}_{P_{F_2/F_1}} = \frac{d\mathbf{r}_2}{dt} = \frac{d(\mathbf{r}_1 + \mathbf{r}_{1P})}{dt} = \frac{d\mathbf{r}_1}{dt} + \frac{d\mathbf{r}_{1P}}{dt}$ |
| Distance | |
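
For discrete keypoints, the velocity row above can be approximated with finite differences. The sketch below is one possible reading, not the project's implementation: it assumes a known frame rate `fps`, uses forward differences, and takes the Euclidean distance between two joints as the distance metric. All function names are hypothetical.

```python
import math

ROOT, RIGHT_WRIST = 0, 16  # Human3.6M indices from the joint table


def joint_distance(pose, a, b):
    """Euclidean distance between joints a and b in one pose."""
    return math.dist(pose[a], pose[b])


def joint_velocity(poses, joint, fps):
    """Forward-difference velocity of one joint.

    Returns one (vx, vy, vz) tuple per consecutive frame pair,
    computed as (r[t + 1] - r[t]) * fps.
    """
    return [
        tuple((nxt[joint][k] - cur[joint][k]) * fps for k in range(3))
        for cur, nxt in zip(poses, poses[1:])
    ]


def relative_velocity(poses, joint, reference, fps):
    """Velocity of `joint` relative to a moving `reference` joint,
    i.e. the d(r_1P)/dt term of the velocity formula."""
    rel = [
        [tuple(p[joint][k] - p[reference][k] for k in range(3))]
        for p in poses
    ]
    return joint_velocity(rel, 0, fps)
```

With these definitions, the velocity of a joint equals the velocity of the reference joint plus the relative velocity, up to floating-point rounding, which mirrors the decomposition $\frac{d\mathbf{r}_1}{dt} + \frac{d\mathbf{r}_{1P}}{dt}$ in the table.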
