Computes statistical features on the GPU using PyTorch.
The results are also compared with the same features computed on the CPU using NumPy.
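A minimal sketch of such a GPU/CPU comparison is given here. It is not the toolbox's own code: the function name `compare_features`, the three example features, and the decision to time both code paths together are assumptions made for illustration. PyTorch is imported optionally so the sketch still runs where it is not installed, in which case the GPU column is NaN.

```python
import time

import numpy as np
import pandas as pd

try:
    import torch
except ImportError:  # keep the sketch runnable without PyTorch
    torch = None

# Hypothetical feature pairs: the same statistic written for NumPy and PyTorch.
CPU_FEATURES = {
    "mean_abs": lambda x: np.mean(np.abs(x)),
    "std_abs": lambda x: np.std(np.abs(x)),  # population std (ddof=0)
    "range_abs": lambda x: np.max(np.abs(x)) - np.min(np.abs(x)),
}
GPU_FEATURES = {
    "mean_abs": lambda x: x.abs().mean(),
    "std_abs": lambda x: x.abs().std(unbiased=False),  # match NumPy's ddof=0
    "range_abs": lambda x: x.abs().max() - x.abs().min(),
}


def compare_features(signal):
    """Return one row per feature with the PyTorch value, NumPy value and timing."""
    x_cpu = np.asarray(signal, dtype=np.float64)
    x_gpu = None
    if torch is not None:
        # Falls back to the CPU device when no CUDA GPU is present.
        device = "cuda" if torch.cuda.is_available() else "cpu"
        x_gpu = torch.as_tensor(x_cpu, device=device)

    rows = []
    for name, cpu_func in CPU_FEATURES.items():
        start = time.perf_counter()
        cpu_value = float(cpu_func(x_cpu))
        gpu_value = float("nan")
        if x_gpu is not None:
            # .item() copies the scalar to the host, which waits for the GPU.
            gpu_value = GPU_FEATURES[name](x_gpu).item()
        elapsed = time.perf_counter() - start
        rows.append((name, gpu_value, cpu_value, elapsed))

    columns = ["feature name", "feature value gpu",
               "feature value cpu", "time consumption"]
    return pd.DataFrame(rows, columns=columns)
```

Calling `compare_features(np.random.randn(1_000_000))` gives one row per feature. Timing each path separately would need two timers and, on CUDA, an explicit `torch.cuda.synchronize()` before reading the clock.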
The result is a pandas DataFrame with the columns 'feature name', 'feature value gpu', 'feature value cpu', and 'time consumption'.
`*`: No reference
`**`: More than one reference and one is questionable
`~`: Further research required on feature
| Number | Feature | Description | Info |
|---|---|---|---|
| 1 | calculate_harmonic_mean_abs(X) | Calculates the harmonic mean of the absolute values of X | * |
| 2 | calculate_trimmed_mean_abs(X) | Calculates the trimmed mean of absolute values of X | * |
| 3 | calculate_std_abs(X) | Calculates the standard deviation of the absolute values of X | * |
| 4 | calculate_skewness_abs(X) | Calculates the skewness of the absolute values of X | * |
| 5 | calculate_kurtosis_abs(X) | Calculates the kurtosis of the absolute values of X | * |
| 6 | calculate_median_abs(X) | Calculates the median of the absolute values of X | * |
| 7 | calculate_min_abs(X) | Calculates the minimum value of the absolute values of X | * |
| 8 | calculate_range_abs(X) | Calculates the range of the absolute values of X | * |
| 9 | calculate_variance_abs(X) | Calculates the variance of the absolute values of X | * |
| 10 | calculate_mean_absolute_deviation(X) | Calculates the mean of the absolute deviation of X | ~ |
| 11 | calculate_signal_magnitude_area(X) | Calculates the signal magnitude area of X, i.e. the sum of the absolute values of X | ~ |
| 12 | calculate_cardinality(X) | | ~ |
| 13 | calculate_rms_to_mean_abs(X) | Computes the ratio of the RMS value to mean absolute value of X | * |
| 14 | calculate_area_under_squared_curve(X) | Computes the area under the curve of X squared | * |
| 15 | calculate_exponential_moving_average(X, param) | Calculates the exponential moving average of X | * |
| 16 | calculate_fisher_information(X) | Computes the Fisher information of X | ~ |
| 17 | calculate_local_maxima_and_minima(X) | Calculates the local maxima and minima of X | * |
| 18 | calculate_log_return(X) | Returns the logarithm of the ratio between the last and first values of X, which is a measure of the percentage change in X | ~ |
| 19 | calculate_lower_complete_moment(X) | | * |
| 20 | calculate_mean_second_derivative_central(X) | Returns the mean of the second derivative of X | |
| 21 | calculate_median_second_derivative_central(X) | Calculates the median of the second derivative of X | * |
| 23 | calculate_ratio_of_fluctuations(X) | Computes the ratio of positive and negative fluctuations in X | * |
| 24 | calculate_ratio_value_number_to_sequence_length(X) | Returns the ratio of length of a set of X to the length X | * |
| 25 | calculate_second_order_difference(X) | Returns the second differential of X | ** |
| 26 | calculate_signal_resultant(X) | | * |
| 27 | calculate_sum_of_negative_values(X) | Calculates the sum of negative values in X | * |
| 28 | calculate_sum_of_positive_values(X) | Returns the sum of positive values in X | * |
| 29 | calculate_variance_of_absolute_differences(X) | Returns variance of the absolute of the first order difference of X | |
| 30 | calculate_weighted_moving_average(X) | Returns the weighted moving average of X | * |
| 31 | calculate_covariance | | ~ |
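Several of the descriptions in the table above translate almost directly into NumPy. The following is a sketch written from those descriptions, not the toolbox's implementation; the function names drop the `calculate_` prefix to mark them as illustrative, and the handling of zeros and non-positive values is an assumption.

```python
import numpy as np


def harmonic_mean_abs(x):
    """Harmonic mean of |x|; assumes no element of x is zero."""
    a = np.abs(np.asarray(x, dtype=float))
    return a.size / np.sum(1.0 / a)


def rms_to_mean_abs(x):
    """Ratio of the RMS value to the mean absolute value."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(x ** 2)) / np.mean(np.abs(x))


def signal_magnitude_area(x):
    """Sum of the absolute values, as described in the table."""
    return float(np.sum(np.abs(np.asarray(x, dtype=float))))


def log_return(x):
    """log(last / first); assumes both values are strictly positive."""
    x = np.asarray(x, dtype=float)
    return np.log(x[-1] / x[0])


def ratio_value_number_to_sequence_length(x):
    """Number of distinct values divided by the number of samples."""
    x = np.asarray(x)
    return np.unique(x).size / x.size
```

For example, `ratio_value_number_to_sequence_length([1, 1, 2, 3])` evaluates to 0.75 because three of the four samples are distinct.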
| Number | Feature | Reference |
|---|---|---|
| 1 | calculate_mean_to_variance | |
| Number | Feature | Reference |
|---|---|---|
| 1 | extract_wavelet_features(params) | |
| 2 | extract_spectrogram_features(params) | |
| 3 | extract_stft_features(params) | |
| 4 | teager_kaiser_energy_operator(X) | |
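For the last row, the discrete Teager-Kaiser energy operator is commonly defined as ψ[n] = x[n]² − x[n−1]·x[n+1]. The sketch below implements that textbook definition; whether the toolbox returns the full operator output or a summary statistic of it is not stated here, so the return value (the full array, two samples shorter than the input) is an assumption.

```python
import numpy as np


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # The operator needs one neighbour on each side, so the first and
    # last samples have no output value.
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

For a pure sinusoid A·cos(ωn + φ) the operator is constant and equal to A²·sin²(ω), which makes a convenient sanity check.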
| Number | Feature | Reference |
|---|---|---|
| 1 | calculate_spectral_subdominant_valley | * |
| 2 |
- Median frequency
- Spectral bandwidth
- Spectral absolute deviation
- Spectral slope linear
- Spectral slope logarithmic
- Spectral flatness
- Peak frequencies
- Spectral edge frequency
- Band power
- Spectral entropy
- Spectral contrast
- Spectral coefficient variation
- Spectral flux
- Spectral rolloff
- Harmonic ratio
- Fundamental frequency
- Spectral crest factor
- Spectral decrease
- Spectral irregularity
- Mean frequency
- Frequency winsorized mean
- Total harmonic distortion
- Inharmonicity
- Tristimulus
- Spectral rollon
- Spectral hole count
- Spectral autocorrelation
- Spectral variability
- Spectral spread ratio
- Spectral skewness ratio
- Spectral kurtosis ratio
- Spectral tonal power ratio
- Spectral noise to harmonics ratio
- Spectral even to odd harmonic energy ratio
- Spectral strongest frequency phase
- Spectral frequency below peak
- Spectral frequency above peak
- Spectral cumulative frequency
- Spectral cumulative frequency
- Spectral cumulative frequency above
- Spectral spread shift
- Spectral entropy shift
- Spectral change vector magnitude
- Spectral low frequency content
- Spectral mid frequency content
- Spectral peak-to-valley ratio
- Spectral valley depth mean
- Spectral valley depth std
- Spectral valley depth variance
- Spectral valley width mode
- Spectral valley width standard deviation
- Spectral subdominant valley
- Spectral valley count
- Spectral peak broadness
- Spectral valley broadness
- Frequency variance
- Frequency standard deviation
- Frequency range
- Frequency trimmed mean
- Harmonic product spectrum
- Smoothness
- Roughness
Statistical features from wavelets, spectrogram and short-time Fourier transform
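A few of the spectral features named above (centroid-based bandwidth, rolloff, flatness) can be sketched on top of a one-sided FFT magnitude spectrum. The definitions below are common textbook forms and may differ from the toolbox in weighting (magnitude versus power) or normalisation; the function names, the 85 % rolloff fraction and the `eps` guard are assumptions.

```python
import numpy as np


def magnitude_spectrum(signal, fs):
    """One-sided frequency axis and FFT magnitudes of a real signal."""
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    mags = np.abs(np.fft.rfft(signal))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency."""
    return np.sum(freqs * mags) / np.sum(mags)


def spectral_bandwidth(freqs, mags, order=2):
    """Magnitude-weighted spread around the centroid (order 2 is a std)."""
    centroid = spectral_centroid(freqs, mags)
    weighted = np.sum(mags * np.abs(freqs - centroid) ** order) / np.sum(mags)
    return weighted ** (1.0 / order)


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest frequency below which `fraction` of the spectral energy lies."""
    cumulative = np.cumsum(mags ** 2)
    index = np.searchsorted(cumulative, fraction * cumulative[-1])
    return freqs[index]


def spectral_flatness(mags, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum."""
    power = mags ** 2 + eps  # eps avoids log(0) on empty bins
    return np.exp(np.mean(np.log(power))) / np.mean(power)
```

A pure tone concentrates its energy in one bin, so its centroid and rolloff sit at the tone frequency and its flatness is close to zero, while white noise pushes the flatness towards one.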
- Hurst exponent from detrended fluctuation analysis
- Winsorized mean
- Weighted moving average
- Sum of positive values
- Sum of negative values
- Stochastic oscillator value
- Smoothing by binomial filter
- Signal-to-noise ratio
- Signal resultant
- Second order difference
- Ratio value number to sequence length
- Ratio beyond r signal
- Petrosian fractal dimension
- Percentage of positive values
- Percentage of negative values
- Pearson correlation coefficient
- Peak-to-peak distance
- Number of inflection points
- Moving average
- Mode
- Median second derivative central
- Mean relative change
- Mean crossings
- Lower complete moment
- Log return
- Katz fractal dimension
- Histogram bin frequencies
- Fisher information
- First quartile
- First order difference
- Exponential moving average
- Energy ratio by chunks
- Differential entropy
- Cumulative sum
- Covariance
- Count
- Area under curve
- Area under squared curve
- Renyi entropy
- Tsallis entropy
- Root mean squared to mean absolute
- Cardinality
- Hjorth mobility and complexity
- Singular value decomposition (SVD) entropy
- Higuchi fractal dimensions
- Slope sign change
- Average amplitude change
- Signal magnitude area*
- Median absolute deviation
- Coefficient of variation
- Higher order moments
- Mean autocorrelation
- Impulse factor
- Shape factor
- Clearance factor
- Crest factor
- Zero crossings
- Entropy
- Log energy
- Mean absolute deviation
- Interquartile range
- Variance absolute
- Maximum absolute
- Minimum absolute
- Range absolute
- Range
- Median absolute
- Kurtosis absolute
- Skewness absolute
- Standard deviation absolute
- Trimmed mean absolute
- Trimmed mean
- Harmonic mean
- Harmonic mean absolute
- Geometric mean
- Geometric mean absolute
- Mean absolute
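Some of the listed time-domain features have short closed forms. The sketch below uses widely used definitions of the crest, impulse, shape and clearance factors, the Hjorth parameters, and zero crossings; the toolbox may use different conventions (for example for the variance estimator or for how exact zeros are counted), so treat the names and details as assumptions.

```python
import numpy as np


def _rms(x):
    return np.sqrt(np.mean(np.square(x)))


def crest_factor(x):
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / _rms(x)


def impulse_factor(x):
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / np.mean(np.abs(x))


def shape_factor(x):
    x = np.asarray(x, dtype=float)
    return _rms(x) / np.mean(np.abs(x))


def clearance_factor(x):
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / np.mean(np.sqrt(np.abs(x))) ** 2


def hjorth_mobility_complexity(x):
    """Hjorth mobility and complexity from first and second differences."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    ddx = np.diff(dx)
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return mobility, complexity


def zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    sign = np.signbit(np.asarray(x, dtype=float))
    return int(np.count_nonzero(sign[1:] != sign[:-1]))
```

For a sampled sinusoid with angular step ω the mobility is approximately 2·sin(ω/2) and the complexity is close to 1, which is a quick way to check an implementation.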
| Number | Feature | Description |
|---|---|---|
| 1 | Augmented Dickey-Fuller test | Performs the Augmented Dickey-Fuller (ADF) test to check for stationarity in a given time series signal. |
| 2 | Hurst exponent | Calculates the Hurst exponent of a given time series using Detrended Fluctuation Analysis (DFA). |
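To make the second row concrete, here is a compact first-order DFA sketch: integrate the mean-removed signal, detrend it linearly in windows of increasing size, and fit the slope of log fluctuation against log window size. The function name `hurst_dfa`, the choice of window sizes and the linear detrending order are assumptions, not the toolbox's settings.

```python
import numpy as np


def hurst_dfa(signal, scales=None):
    """Scaling exponent from first-order detrended fluctuation analysis."""
    x = np.asarray(signal, dtype=float)
    profile = np.cumsum(x - x.mean())  # integrated, mean-removed signal
    n = profile.size
    if scales is None:
        # Roughly log-spaced window sizes between 8 and n/4 samples.
        scales = np.unique(
            np.logspace(np.log10(8), np.log10(n // 4), 12).astype(int)
        )

    fluctuations = []
    for s in scales:
        n_windows = n // s
        # One column per non-overlapping window of length s.
        windows = profile[: n_windows * s].reshape(n_windows, s).T
        t = np.arange(s)
        slope, intercept = np.polyfit(t, windows, 1)  # fit all windows at once
        trend = np.outer(t, slope) + intercept
        fluctuations.append(np.sqrt(np.mean((windows - trend) ** 2)))

    # The exponent is the slope of log F(s) versus log s.
    exponent, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return exponent
```

Uncorrelated noise should give an exponent near 0.5 and its cumulative sum (a random walk) a value near 1.5; estimates on short signals are noisy, so the result is best read as approximate.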
| Number | Feature | Reason |
|---|---|---|
| 1 | calculate_roll_mean | Same implementation as calculate_moving_average |
| 2 | calculate_absolute_energy | Same implementation as signal energy |
| 3 | calculate_cumulative_energy | Produces same result as the absolute energy and signal energy. These three will always be the same for a given signal. |
| 4 | calculate_intercept_of_linear_fit | This feature is returned again in the calculate_linear_trend_with_full_linear_regression_results function |
| 5 | calculate_pearson_correlation_coefficient | This function calculates the Pearson correlation coefficient between the signal and its one-step lagged version, so it is fundamentally calculating the autocorrelation of the signal. The autocorrelation is already present (calculate_mean_auto_correlation), so having both is redundant. |
| 6 | calculate_slope_of_linear_fit | This is already calculated in calculate_linear_trend_with_full_linear_regression_results |
| 7 | calculate_frequency_std | Same implementation as calculate_spectral_bandwidth with order set to 2 |
| 8 | calculate_frequency_variance | Same implementation as calculate_spectral_variance |
| 9 | calculate_mean_frequency(freqs, magnitudes) | Same as calculate_spectral_centroid with order set to 1 |
| 10 | calculate_first_quartile | calculate_percentile(signal, percentiles=[25, 50, 75]) returns the first, second, and third quartiles |
| 11 | calculate_third_quartile | calculate_percentile(signal, percentiles=[25, 50, 75]) returns the first, second, and third quartiles |
| 12 | calculate_spectral_entropy_shift | Same implementation as calculate_spectral_entropy but with spectrum_magnitudes as argument and not psd |
| 13 | calculate_spectral_spread_shift | Same as spectral standard deviation |
| 14 | calculate_spectral_autocorrelatiion | Autocorrelation of magnitudes is backed by literature |
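Some of the redundancy arguments in this table are easy to check numerically. The sketch below assumes that absolute energy is the sum of squares, that cumulative energy is the final value of the running sum of squares, and that the lag-1 autocorrelation is normalised by the total variance; these definitions and the function names are assumptions for illustration.

```python
import numpy as np


def absolute_energy(x):
    return np.sum(np.square(np.asarray(x, dtype=float)))


def cumulative_energy(x):
    # Final value of the running sum of squares.
    return np.cumsum(np.square(np.asarray(x, dtype=float)))[-1]


def quartiles(x):
    """First, second and third quartile from a single percentile call."""
    return np.percentile(np.asarray(x, dtype=float), [25, 50, 75])


def lag1_pearson(x):
    """Pearson correlation between the signal and its one-step lag."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation normalised by the full-signal variance."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(x[:-1] * x[1:]) / np.sum(x * x)
```

The two energies agree up to floating-point rounding, while the lag-1 Pearson coefficient and the lag-1 autocorrelation differ only by edge effects that shrink as the signal gets longer.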
| Number | Feature | Type | Reason |
|---|---|---|---|
| 1 | calculate_histogram_bins | statistical | |
| 2 | calculate_signal_magnitude_area | statistical | |
| 3 | calculate_spectral_hole_count | spectral | Spectral holes are typically of use in radio signals. Although the aim is to make this a very comprehensive toolbox, this feature is a little bit out of scope. |
| Number | Feature | Description | Added yet? |
|---|---|---|---|
| 1 | absolute sum of changes | | ✔️ |
| 2 | ar_coefficient(x, param) | This feature calculator fits the unconditional maximum likelihood of an autoregressive AR(k) process | |
| 3 | benford correlation | | ✔️ |
| 4 | c3 | uses c3 statistics to measure non-linearity in the time series | |
| 5 | count_above(x, t) | Returns the percentage of values in x that are higher than t | ✔️ |
| 6 | count_below(x, t) | Returns the percentage of values in x that are lower than t | ✔️ |
| 7 | cid_ce(x, normalize) | This function calculator is an estimate for a time series complexity [1] (A more complex time series has more peaks, valleys etc.). | ✔️ |
| 8 | friedrich_coefficients(x, param) | Coefficients of polynomial h(x), which has been fitted to the deterministic dynamics of Langevin model | |
| 9 | has_duplicate(x) | Checks if any value in x occurs more than once | ✔️ |
| 10 | has_duplicate_max(x) | Checks if the maximum value of x is observed more than once | ✔️ |
| 11 | has_duplicate_min(x) | Checks if the minimal value of x is observed more than once | ✔️ |
| 12 | index_mass_quantile(x, param) | Calculates the relative index i of time series x where q% of the mass of x lies left of i. | |
| 13 | mean_n_absolute_max(x, number_of_maxima) | Calculates the arithmetic mean of the n absolute maximum values of the time series. | |
| 14 | large_standard_deviation(x, r) | Does time series have large standard deviation | ✔️ |
| 15 | lempel_ziv_complexity(x, bins) | Calculate a complexity estimate based on the Lempel-Ziv compression algorithm. | ✔️ |
| 16 | matrix_profile(x, param) | Calculates the 1-D Matrix Profile[1] and returns Tukey's Five Number Set plus the mean of that Matrix Profile. | |
| 17 | max_langevin_fixed_point(x, r, m) | Largest fixed point of dynamics `argmax_x {h(x)=0}` estimated from polynomial h(x), which has been fitted to the deterministic dynamics of Langevin model | |
| 18 | binned entropy | | |
| 19 | symmetry looking | Boolean variable denoting if the distribution of x looks symmetric. | |
| 20 | change_quantiles | First fixes a corridor given by the quantiles ql and qh of the distribution of x. | |
| 21 | fft_coefficient | Calculates the Fourier coefficients of the one-dimensional discrete Fourier transform for real input using the fast Fourier transform algorithm | |
| 22 | matrix_profile | Calculates the 1-D Matrix Profile[1] and returns Tukey's Five Number Set plus the mean of that Matrix Profile. | |
| 23 | mean_n_absolute_max | Calculates the arithmetic mean of the n absolute maximum values of the time series. | |
| 24 | number_crossing_m | Calculates the number of crossings of x on m. | |
| 25 | number_cwt_peaks | Number of different peaks in x. | |
| 26 | number_peaks | Calculates the number of peaks of at least support n in the time series x. | |
| 27 | partial_autocorrelation | Calculates the value of the partial autocorrelation function at the given lag. | |
| 28 | query_similarity_count | This feature calculator accepts an input query subsequence parameter, compares the query (under z-normalized Euclidean distance) to all subsequences within the time series, and returns a count of the number of times the query was found in the time series (within some predefined maximum distance threshold). | |
| 29 | ratio_value_number_to_time_series_length | Returns a factor which is 1 if all values in the time series occur only once, and below one if this is not the case. | |
| 30 | value_count | Count occurrences of value in time series x. | ✔️ |
| 31 | variance_larger_than_standard_deviation | Is variance higher than the standard deviation? | ✔️ |
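Several of the entries already marked as added are one-liners in NumPy. The sketch below follows the descriptions in the table rather than any particular library's source; in particular, the strict inequality in `count_above` and the zero-variance handling in `cid_ce` are assumptions.

```python
import numpy as np


def count_above(x, t):
    """Fraction of values in x that are strictly higher than t."""
    return float(np.mean(np.asarray(x, dtype=float) > t))


def has_duplicate_max(x):
    """True if the maximum value occurs more than once."""
    x = np.asarray(x)
    return bool(np.count_nonzero(x == x.max()) > 1)


def large_standard_deviation(x, r):
    """True if the standard deviation exceeds r times the value range."""
    x = np.asarray(x, dtype=float)
    return bool(np.std(x) > r * (x.max() - x.min()))


def variance_larger_than_standard_deviation(x):
    x = np.asarray(x, dtype=float)
    return bool(np.var(x) > np.std(x))


def cid_ce(x, normalize):
    """Complexity estimate: root of the summed squared first differences."""
    x = np.asarray(x, dtype=float)
    if normalize:
        std = np.std(x)
        if std == 0:
            return 0.0  # a constant series has no complexity
        x = (x - np.mean(x)) / std
    d = np.diff(x)
    return float(np.sqrt(np.dot(d, d)))
```

For instance, `cid_ce([1, 2, 4], normalize=False)` is the square root of 1² + 2², and a series with more peaks and valleys of the same amplitude yields a larger value.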
- calculate_higher_order_moments does not always produce the same result as mean, variance, skew and kurtosis when moment order is set to [1,2,3,4]
- calculate_rms_to_mean_abs has no direct reference yet
- calculate_exponential_moving_average returns the last value in the array. Is there a reason?
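Two of these open points can be probed with a small experiment. The sketch below is not the toolbox's code: `ema` returns the whole smoothed series so that the last value can be compared with what the feature returns, and `moment` shows three moment conventions, since a mismatch with mean, variance, skewness and kurtosis could come from mixing them (that explanation is a guess, not a confirmed cause).

```python
import numpy as np


def ema(x, alpha):
    """Exponential moving average: s[0] = x[0], s[t] = a*x[t] + (1-a)*s[t-1]."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    out[0] = x[0]
    for t in range(1, x.size):
        out[t] = alpha * x[t] + (1.0 - alpha) * out[t - 1]
    return out


def moment(x, order, kind="raw"):
    """Moment of the given order.

    kind="raw":          mean(x**k)               (order 1 is the mean)
    kind="central":      mean((x - mean)**k)      (order 2 is the variance)
    kind="standardized": central / std**k         (order 3 skewness,
                                                   order 4 non-excess kurtosis)
    """
    x = np.asarray(x, dtype=float)
    if kind == "raw":
        return np.mean(x ** order)
    centred = x - x.mean()
    central = np.mean(centred ** order)
    if kind == "central":
        return central
    return central / np.std(x) ** order
```

With these conventions, orders `[1, 2, 3, 4]` only reproduce mean, variance, skewness and kurtosis if order 1 is raw, order 2 is central and orders 3 and 4 are standardized; libraries that report excess kurtosis additionally subtract 3.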
Corrections
- calculate_katz_fractal_dimensions
- calculate_sum_of_reoccurring_values
- calculate_sum_of_reoccurring_data_points
- calculate_petrosian_fractal_dimension
- calculate_sample_entropy
- calculate_approximate_entropy