1 change: 1 addition & 0 deletions .gitignore
@@ -70,3 +70,4 @@ MLPR.py
docs/Makefile
sifibridge-*
*.pyc
*.model
135 changes: 135 additions & 0 deletions docs/source/documentation/prediction/discrete_classification_doc.md
@@ -0,0 +1,135 @@
# Discrete Classifiers

Unlike continuous classifiers that output a prediction for every window of EMG data, discrete classifiers are designed for recognizing transient, isolated gestures. These classifiers operate on variable-length templates (sequences of windows) and are well-suited for detecting distinct movements like finger snaps, taps, or quick hand gestures.

Discrete classifiers expect input data in a different format than continuous classifiers:
- **Continuous classifiers**: Operate on individual windows of shape `(n_windows, n_features)`.
- **Discrete classifiers**: Operate on templates (sequences of windows) where each template has shape `(n_frames, n_features)` and can vary in length.
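As an illustration (synthetic arrays, not real EMG), the discrete format is simply a Python list of 2D arrays whose first dimension may differ per gesture instance:

```Python
import numpy as np

rng = np.random.default_rng(0)

# Continuous input: one fixed-shape array of windows.
continuous = rng.standard_normal((100, 32))  # (n_windows, n_features)

# Discrete input: a list of templates, each (n_frames, n_features),
# where n_frames varies from one gesture instance to the next.
templates = [rng.standard_normal((n, 32)) for n in (12, 18, 9)]

print(continuous.shape)              # (100, 32)
print([t.shape for t in templates])  # [(12, 32), (18, 32), (9, 32)]
```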

To prepare data for discrete classifiers, use the `discrete=True` parameter when calling `parse_windows()` on your `OfflineDataHandler`:

```Python
from libemg.data_handler import OfflineDataHandler

odh = OfflineDataHandler()
odh.get_data('./data/', regex_filters)
windows, metadata = odh.parse_windows(window_size=50, window_increment=10, discrete=True)
# windows is now a list of templates, one per file/rep
```

For feature extraction with discrete data, use the `discrete=True` parameter:

```Python
from libemg.feature_extractor import FeatureExtractor

fe = FeatureExtractor()
features = fe.extract_features(['MAV', 'ZC', 'SSC', 'WL'], windows, discrete=True, array=True)
# features is a list of arrays, one per template
```

## Majority Vote LDA (MVLDA)

A classifier that applies Linear Discriminant Analysis (LDA) to each frame within a template and uses majority voting to determine the final prediction. This approach is simple yet effective for discrete gesture recognition.

```Python
from libemg._discrete_models import MVLDA

model = MVLDA()
model.fit(train_features, train_labels)
predictions = model.predict(test_features)
probabilities = model.predict_proba(test_features)
```

## Dynamic Time Warping Classifier (DTWClassifier)

A template-matching classifier that uses Dynamic Time Warping (DTW) distance to compare test samples against stored training templates. DTW is particularly useful when gestures may vary in speed or duration, as it can align sequences with different temporal characteristics.

```Python
from libemg._discrete_models import DTWClassifier

model = DTWClassifier(n_neighbors=3)
model.fit(train_features, train_labels)
predictions = model.predict(test_features)
probabilities = model.predict_proba(test_features)
```

The `n_neighbors` parameter controls how many nearest templates are used for voting (k-nearest neighbors with DTW distance).
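To make the voting step concrete, here is a toy sketch of distance-weighted k-nearest-neighbor voting, with made-up distances standing in for real DTW distances (the `knn_vote` helper is illustrative, not part of the LibEMG API):

```Python
import numpy as np

def knn_vote(dists, labels, classes, n_neighbors=3, eps=1e-12):
    """Distance-weighted kNN vote: closer templates get larger weights."""
    nn_idx = np.argsort(dists)[:n_neighbors]
    nn_dists, nn_labels = dists[nn_idx], labels[nn_idx]
    gamma = 1.0 / max(np.median(nn_dists), eps)  # bandwidth from median distance
    weights = np.exp(-gamma * nn_dists)
    scores = np.array([weights[nn_labels == c].sum() for c in classes])
    return scores / max(scores.sum(), eps)

dists = np.array([0.2, 1.5, 0.3, 2.0, 0.25])  # pretend DTW distances
labels = np.array([0, 1, 0, 1, 0])
proba = knn_vote(dists, labels, classes=np.array([0, 1]))
print(proba.argmax())  # class 0 wins: all three nearest templates are class 0
```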

## Pretrained Myo Cross-User Model (MyoCrossUserPretrained)

A pretrained deep learning model for cross-user discrete gesture recognition using the Myo armband. This model uses a convolutional-recurrent architecture and recognizes 6 gestures: Nothing, Close, Flexion, Extension, Open, and Pinch.

```Python
from libemg._discrete_models import MyoCrossUserPretrained

model = MyoCrossUserPretrained()
# Model is automatically downloaded on first use

# The model provides recommended parameters for OnlineDiscreteClassifier
print(model.args)
# {'window_size': 10, 'window_increment': 5, 'null_label': 0, ...}

predictions = model.predict(test_data)
probabilities = model.predict_proba(test_data)
```

This model expects raw windowed EMG data (not extracted features) with shape `(batch_size, seq_len, n_channels, n_samples)`.
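For example, an input tensor of that shape can be assembled as follows (synthetic data; the 8 channels match the Myo armband, and the 10-sample windows follow the `window_size` shown above, but the batch and sequence lengths here are arbitrary assumptions):

```Python
import numpy as np

batch_size, seq_len = 4, 30    # 30 consecutive windows per template (illustrative)
n_channels, n_samples = 8, 10  # Myo armband: 8 EMG channels; window_size = 10

rng = np.random.default_rng(42)
test_data = rng.standard_normal((batch_size, seq_len, n_channels, n_samples))
print(test_data.shape)  # (4, 30, 8, 10)
```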

## Online Discrete Classification

For real-time discrete gesture recognition, use the `OnlineDiscreteClassifier`:

```Python
from libemg.emg_predictor import OnlineDiscreteClassifier
from libemg._discrete_models import MyoCrossUserPretrained

# Load pretrained model
model = MyoCrossUserPretrained()

# Create online classifier
classifier = OnlineDiscreteClassifier(
odh=online_data_handler,
model=model,
window_size=model.args['window_size'],
window_increment=model.args['window_increment'],
null_label=model.args['null_label'],
feature_list=model.args['feature_list'], # None for raw data
template_size=model.args['template_size'],
min_template_size=model.args['min_template_size'],
gesture_mapping=model.args['gesture_mapping'],
buffer_size=model.args['buffer_size'],
rejection_threshold=0.5,
debug=True
)

# Start recognition loop
classifier.run()
```

## Creating Custom Discrete Classifiers

Any custom discrete classifier should implement the following methods to work with LibEMG:

- `fit(x, y)`: Train the model where `x` is a list of templates and `y` is the corresponding labels.
- `predict(x)`: Return predicted class labels for a list of templates.
- `predict_proba(x)`: Return predicted class probabilities for a list of templates.

```Python
import numpy as np

class CustomDiscreteClassifier:
    """Minimal runnable example: a toy nearest-centroid classifier
    over per-template mean features, shown only to illustrate the
    required interface."""

    def __init__(self):
        self.classes_ = None
        self.centroids_ = None

    def fit(self, x, y):
        # x: list of templates (each template is an array of frames)
        # y: labels for each template
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One mean feature vector per template, then one centroid per class
        means = np.vstack([t.mean(axis=0) for t in x])
        self.centroids_ = np.vstack([means[y == c].mean(axis=0) for c in self.classes_])

    def predict(self, x):
        # Return array of predictions
        return self.classes_[np.argmax(self.predict_proba(x), axis=1)]

    def predict_proba(self, x):
        # Return array of shape (n_samples, n_classes)
        means = np.vstack([t.mean(axis=0) for t in x])
        dists = np.linalg.norm(means[:, None, :] - self.centroids_[None, :, :], axis=2)
        weights = np.exp(-dists)  # closer centroid -> higher probability
        return weights / weights.sum(axis=1, keepdims=True)
```
3 changes: 3 additions & 0 deletions docs/source/documentation/prediction/prediction.rst
@@ -6,6 +6,9 @@ EMG Prediction
.. include:: classification_doc.md
:parser: myst_parser.sphinx_

.. include:: discrete_classification_doc.md
:parser: myst_parser.sphinx_

.. include:: regression_doc.md
:parser: myst_parser.sphinx_

2 changes: 1 addition & 1 deletion docs/source/documentation/prediction/predictors.md
@@ -2,7 +2,7 @@

After a window of EMG data has been recorded, processed, and reduced to features, it is passed to a machine learning algorithm for prediction. These control systems evolved in the prosthetics community to continuously predict muscular contractions and enable prosthesis control. They are therefore primarily limited to recognizing static contractions (e.g., hand open/close and wrist flexion/extension), as they have no temporal awareness. Currently, this is the form of recognition supported by LibEMG and is an initial step to explore EMG as an interaction opportunity for general-purpose use. This section highlights the machine-learning strategies that are part of `LibEMG`'s pipeline.

There are two types of models supported in `LibEMG`: classifiers and regressors. Classifiers output a discrete motion class for each window, whereas regressors output a continuous prediction along a degree of freedom. For both classifiers and regressors, `LibEMG` supports statistical models as well as deep learning models. Additionally, a number of post-processing methods (i.e., techniques to improve performance after prediction) are supported for all models.
There are three types of models supported in `LibEMG`: classifiers, regressors, and discrete classifiers. Classifiers output a motion class for each window of EMG data, whereas regressors output a continuous prediction along a degree of freedom. Discrete classifiers are designed for recognizing transient, isolated gestures and operate on variable-length templates rather than individual windows. For classifiers and regressors, `LibEMG` supports statistical models as well as deep learning models. Additionally, a number of post-processing methods (i.e., techniques to improve performance after prediction) are supported for all models.

## Statistical Models

1 change: 1 addition & 0 deletions libemg/__init__.py
@@ -11,3 +11,4 @@
from libemg import gui
from libemg import shared_memory_manager
from libemg import environments
from libemg import _discrete_models
126 changes: 126 additions & 0 deletions libemg/_discrete_models/DTW.py
@@ -0,0 +1,126 @@
from tslearn.metrics import dtw_path
Copilot AI (Jan 20, 2026): The DTWClassifier imports `tslearn.metrics`, but tslearn is not listed in the project's requirements.txt. This will cause an ImportError when users try to use this classifier. Add tslearn to requirements.txt or document it as an optional dependency.
import numpy as np


class DTWClassifier:
"""Dynamic Time Warping k-Nearest Neighbors classifier.

A classifier that uses Dynamic Time Warping (DTW) distance for template
matching with k-nearest neighbors. Suitable for discrete gesture recognition
where temporal alignment between samples varies.

Parameters
----------
n_neighbors: int, default=1
Number of neighbors to use for k-nearest neighbors voting.

Attributes
----------
templates: list of ndarray
The training templates stored after fitting.
labels: ndarray
The labels corresponding to each template.
classes_: ndarray
The unique class labels known to the classifier.
"""

def __init__(self, n_neighbors=1):
"""Initialize the DTW classifier.

Parameters
----------
n_neighbors: int, default=1
Number of neighbors to use for k-nearest neighbors voting.
"""
self.n_neighbors = n_neighbors
self.templates = None
self.labels = None
self.classes_ = None

def fit(self, features, labels):
"""Fit the DTW classifier by storing training templates.

Parameters
----------
features: list of ndarray
A list of training samples (templates) where each sample is
a 2D array of shape (n_frames, n_features).
labels: array-like
The target labels for each template.
"""
self.templates = features
self.labels = np.array(labels)
self.classes_ = np.unique(labels)

def predict(self, samples):
"""Predict class labels for samples.

Parameters
----------
samples: list of ndarray
A list of samples to classify where each sample is a 2D array
of shape (n_frames, n_features).

Returns
-------
ndarray
Predicted class labels for each sample.
"""
# We can reuse predict_proba logic to get the class with highest probability
probas = self.predict_proba(samples)
return self.classes_[np.argmax(probas, axis=1)]

def predict_proba(self, samples, gamma=None, eps=1e-12):
"""Predict class probabilities using DTW distance-weighted voting.

Computes DTW distances to all templates, selects k-nearest neighbors,
and computes class probabilities using exponentially weighted voting.

Parameters
----------
samples: list of ndarray
A list of samples to classify where each sample is a 2D array
of shape (n_frames, n_features).
gamma: float, default=None
The kernel bandwidth for distance weighting. If None, automatically
computed based on median neighbor distance.
eps: float, default=1e-12
Small constant to prevent division by zero.

Returns
-------
ndarray
Predicted class probabilities of shape (n_samples, n_classes).
"""
if self.templates is None:
raise ValueError("Call fit() before predict_proba().")

X = np.asarray(samples, dtype=object)
out = np.zeros((len(X), len(self.classes_)), dtype=float)

for i, s in enumerate(X):
# DTW distances to templates
dists = np.array([dtw_path(t, s)[1] for t in self.templates], dtype=float)
Comment on lines +101 to +103, Copilot AI (Jan 20, 2026): The DTW distance calculation in the inner loop (line 30) computes distances to all templates for every prediction, which could be slow for large template sets. The implementation uses a list comprehension with dtw_path, which may not be optimized. Consider whether there are opportunities for caching or optimization, especially if the same samples are processed multiple times.

# kNN
nn_idx = np.argsort(dists)[:self.n_neighbors]
nn_dists = dists[nn_idx]
nn_labels = self.labels[nn_idx]

# choose gamma if not provided (scale to typical distance)
g = gamma
if g is None:
scale = np.median(nn_dists) if len(nn_dists) else 1.0
g = 1.0 / max(scale, eps)

weights = np.exp(-g * nn_dists) # closer -> bigger weight

# accumulate per class
for cls_j, cls in enumerate(self.classes_):
out[i, cls_j] = weights[nn_labels == cls].sum()

# normalize to probabilities
z = out[i].sum()
out[i] = out[i] / max(z, eps)

return out
95 changes: 95 additions & 0 deletions libemg/_discrete_models/MVLDA.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import numpy as np
from scipy import stats


class MVLDA:
"""Majority Vote Linear Discriminant Analysis classifier.

A classifier that uses Linear Discriminant Analysis (LDA) on individual frames
and aggregates predictions using majority voting. This is designed for discrete
gesture recognition where each sample contains multiple frames.

Attributes
----------
model: LinearDiscriminantAnalysis
The underlying LDA model.
classes_: ndarray
The class labels known to the classifier.
"""

def __init__(self):
"""Initialize the MVLDA classifier."""
self.model = None
self.classes_ = None

def fit(self, x, y):
"""Fit the MVLDA classifier on training data.

Parameters
----------
x: list of ndarray
A list of samples where each sample is a 2D array of shape
(n_frames, n_features). Each sample can have a different number of frames.
y: array-like
The target labels for each sample.
"""
self.model = LinearDiscriminantAnalysis()
# Create a flat array of labels corresponding to every frame in x
labels = np.hstack([[v] * x[i].shape[0] for i, v in enumerate(y)])
self.model.fit(np.vstack(x), labels)
# Store classes for consistent probability mapping
self.classes_ = self.model.classes_

def predict(self, y):
"""Predict class labels using majority voting.

Performs frame-level LDA predictions and returns the majority vote
for each sample.

Parameters
----------
y: list of ndarray
A list of samples where each sample is a 2D array of shape
(n_frames, n_features).

Returns
-------
ndarray
Predicted class labels for each sample.
"""
preds = []
for s in y:
frame_predictions = self.model.predict(s)
# Majority vote on the labels
majority_vote = stats.mode(frame_predictions, keepdims=False)[0]
preds.append(majority_vote)
return np.array(preds)

def predict_proba(self, y):
"""Predict class probabilities using soft voting.

Calculates probabilities by averaging the frame-level probabilities
for each sample (soft voting).

Parameters
----------
y: list of ndarray
A list of samples where each sample is a 2D array of shape
(n_frames, n_features).

Returns
-------
ndarray
Predicted class probabilities of shape (n_samples, n_classes).
"""
probas = []
for s in y:
# Get probabilities for each frame: shape (n_frames, n_classes)
frame_probas = self.model.predict_proba(s)

# Average probabilities across all frames in this sample
sample_proba = np.mean(frame_probas, axis=0)
probas.append(sample_proba)

return np.array(probas)