This repository provides the implementation of the Univariate Gaussian Mixture Model Neural Network (uGMM-NN). This architecture extends standard feedforward neural networks by replacing their neurons (weighted sum + nonlinearity) with probabilistic univariate Gaussian mixture neurons.
Unlike standard neurons, which compute a weighted sum followed by a fixed activation, uGMM neurons are parameterized by learnable means, variances, and mixture weights. This allows each node to model multimodality and propagate uncertainty throughout the network, offering a richer probabilistic representation and opening the door to new architectures that unify deep learning with probabilistic reasoning.
A uGMM neuron j receives N inputs (x₁, ..., xₙ) from the previous layer. Its associated Gaussian Mixture Model has exactly N components, each corresponding to one input. The means (μⱼ,ₖ), variances (σ²ⱼ,ₖ), and mixing coefficients (πⱼ,ₖ) are learnable parameters unique to neuron j.
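Under the definition above, a uGMM neuron's output can be read as the log-density of its mixture evaluated at the incoming activations. The following is a minimal sketch of that computation, not the repository's actual implementation; the function name and tensor layout are illustrative assumptions:

```python
import math
import torch

def ugmm_neuron(x, mu, log_var, pi_logits):
    # Hypothetical sketch of a single uGMM neuron with N inputs.
    # x: (N,) inputs from the previous layer.
    # mu, log_var, pi_logits: (N,) learnable parameters, one component per input.
    var = log_var.exp()                              # positive variances
    log_pi = torch.log_softmax(pi_logits, dim=0)     # normalized mixing weights
    # Per-component univariate Gaussian log-density.
    log_gauss = -0.5 * (torch.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    # Mixture log-likelihood via logsumexp for numerical stability.
    return torch.logsumexp(log_pi + log_gauss, dim=0)
```

Parameterizing variances through `log_var` and weights through `log_softmax` keeps both constraints (positivity, normalization) satisfied during gradient descent.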
The overall structure of a uGMM-NN resembles that of a conventional feedforward network, with input, hidden, and output layers. However, each neuron corresponds to a univariate Gaussian mixture, and successive layers form a hierarchical composition of uGMMs, yielding high-dimensional probabilistic models through repeated transformation.
Instead of adding dense layers, we stack univariate Gaussian Mixture Model layers (uGMM) that represent mixtures over inputs from the previous layer:
```python
import torch
from torch import nn

class uGMMNN(nn.Module):
    def __init__(self):
        super(uGMMNN, self).__init__()
        self.flatten = nn.Flatten()
        # uGMMLayer is provided by this repository.
        self.fc1 = uGMMLayer(n_nodes_in=28*28, n_nodes=128, dropout=0.5)
        self.fc2 = uGMMLayer(n_nodes_in=128, n_nodes=64, dropout=0.0)
        self.fc3 = uGMMLayer(n_nodes_in=64, n_nodes=10, dropout=0.0)

    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        return x
```

Training a uGMM-NN looks almost identical to training a standard FFNN. You define an optimizer and a loss function, then run a forward–backward pass loop:
```python
model = uGMMNN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
criterion = nn.CrossEntropyLoss()

num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    for batch_index, (inputs, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(inputs.to(device))
        loss = criterion(outputs, labels.to(device))
        loss.backward()
        optimizer.step()
```

The notebooks directory contains Jupyter notebooks that demonstrate the usage of this library with complete examples:
- Discriminative inference on the MNIST dataset
- Generative inference on the Iris dataset, using a uGMM-NN trained with an NLL loss
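For the generative setting, the training loop shown earlier changes only in the objective: if the network's output is interpreted as a per-sample log-likelihood, the NLL loss is simply its negation. A minimal sketch, assuming the model emits log-densities (the function name is illustrative, not part of this library's API):

```python
import torch

def nll_loss(log_likelihoods):
    # Negative log-likelihood over a batch: assumes each entry is a
    # per-sample log-density; minimizing this maximizes data likelihood.
    return -log_likelihoods.mean()
```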
This project is licensed under the terms of the MIT License.
For details on uGMM-NN, see the paper, and to cite it, use:
```bibtex
@article{Zakeria2025uGMM,
  author  = {Zakeria Sharif Ali},
  title   = {uGMM-NN: Univariate Gaussian Mixture Model Neural Network},
  journal = {arXiv preprint arXiv:2509.07569},
  year    = {2025},
  url     = {https://arxiv.org/abs/2509.07569}
}
```
