Maya-inspired numerical encodings for machine learning.
Two scikit-learn compatible transformers that use the mathematical structure of the ancient Maya number system and calendar to create richer feature representations.
| Encoder | Input | What it does | Use case |
|---|---|---|---|
| VFDEncoder | Numeric features | Decomposes into base-20 digits, bars (÷5), dots (%5) | Multi-scale numeric patterns |
| MayaCalendarEncoder | Dates | Extracts Tzolk'in (260d), Haab' (365d), Long Count cycles | Temporal feature engineering |
pip install maya-encoding

With optional dependencies:
pip install maya-encoding[viz] # matplotlib visualization
pip install maya-encoding[benchmarks] # xgboost, seaborn for benchmarks
pip install maya-encoding[dev] # development tools (ruff, pytest)

import numpy as np
from maya_encoding import VFDEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor
# VFD decomposes numbers into vigesimal digits, bars, and dots
encoder = VFDEncoder(components='full')
# Works seamlessly in sklearn pipelines
pipe = Pipeline([
('encode', VFDEncoder()),
('model', RandomForestRegressor())
])
pipe.fit(X_train, y_train)

How it works: the number 347 decomposes as follows.
347 = 17×20 + 7
Level 0 (ones): digit=7, bars=1, dots=2
Level 1 (twenties): digit=17, bars=3, dots=2
Feature vector: [7, 1, 2, 17, 3, 2] → normalized: [0.37, 0.33, 0.50, 0.89, 1.00, 0.50]
Three "zoom levels" per number: coarse magnitude (digits), medium grouping (bars), and fine residual (dots).
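The decomposition above can be reproduced in a few lines of plain Python. This is a minimal sketch and not the library's implementation: the function names `vfd_features` and `normalize` are hypothetical, and the normalization (digit/19, bars/3, dots/4) is an assumption inferred from the normalized vector shown above.

```python
def vfd_features(n, n_levels):
    """Return [digit, bars, dots] per base-20 level, least significant first."""
    features = []
    for _ in range(n_levels):
        n, digit = divmod(n, 20)        # peel off one vigesimal digit (0-19)
        bars, dots = divmod(digit, 5)   # bars = digit // 5 (0-3), dots = digit % 5 (0-4)
        features.extend([digit, bars, dots])
    return features


def normalize(features):
    """Divide each component by its maximum value (assumed: 19, 3, 4)."""
    maxima = [19, 3, 4]
    return [round(value / maxima[i % 3], 2) for i, value in enumerate(features)]


raw = vfd_features(347, n_levels=2)
print(raw)             # [7, 1, 2, 17, 3, 2]
print(normalize(raw))  # [0.37, 0.33, 0.5, 0.89, 1.0, 0.5]
```

Each digit satisfies digit = 5 × bars + dots, so the bars and dots columns add no new information. They re-express the same digit at a coarser and a finer scale.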
Use passthrough=True to keep original features alongside VFD features — ideal for tree-based models:
from sklearn.ensemble import GradientBoostingRegressor
# Original features + VFD features combined
pipe = Pipeline([
('encode', VFDEncoder(passthrough=True)),
('model', GradientBoostingRegressor())
])

import numpy as np
from maya_encoding import MayaCalendarEncoder
# Encode dates using Maya calendar cycles
encoder = MayaCalendarEncoder(
components=['tzolkin', 'haab', 'long_count'],
cyclical=True, # sine/cosine for smooth cycle boundaries
)
dates = np.array(["2024-01-01", "2024-06-15", "2024-12-21"])
features = encoder.fit_transform(dates)

The Maya calendar provides interlocking cycles with periods of 13, 20, 260, 365, and 360 days. The 13- and 20-day counts are coprime, so together they repeat only every 260 days. These cycles capture multi-scale temporal patterns that standard date encodings reach only through manual period selection.
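To make the cycle positions and the sine/cosine option concrete, here is a minimal standard-library sketch. It is not the library's code: the helper names (`to_jdn`, `tzolkin`, `haab`, `cyclical`) are hypothetical, and it assumes the GMT correlation constant 584283 with zero-based day-name and month indices.

```python
import math
from datetime import date

GMT_EPOCH_JDN = 584283  # Julian Day Number of Long Count 0.0.0.0.0 (GMT correlation)


def to_jdn(iso_date):
    # date.toordinal() counts days from 0001-01-01 (proleptic Gregorian);
    # adding 1721425 turns that count into a Julian Day Number.
    return date.fromisoformat(iso_date).toordinal() + 1721425


def tzolkin(jdn):
    days = jdn - GMT_EPOCH_JDN
    number = (days + 3) % 13 + 1   # 1..13; the epoch day is 4 Ajaw
    name = (days + 19) % 20        # 0..19; Ajaw has index 19
    return number, name


def haab(jdn):
    days = jdn - GMT_EPOCH_JDN
    position = (days + 348) % 365  # the epoch day is 8 Kumk'u = 17 * 20 + 8
    return divmod(position, 20)    # (month 0..18, day 0..19)


def cyclical(position, period):
    # Map a cycle position onto the unit circle so that the last day of a
    # cycle sits next to the first day instead of far away from it.
    angle = 2 * math.pi * position / period
    return math.sin(angle), math.cos(angle)


jdn = to_jdn("2012-12-21")
print(tzolkin(jdn))  # (4, 19)
print(haab(jdn))     # (13, 3)
print(cyclical(tzolkin(jdn)[1], 20))
```

The sine/cosine pair is the standard way to encode a periodic position. Without it, a linear model treats day 19 and day 0 of a 20-day cycle as maximally distant even though they are adjacent.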
from maya_encoding import maya_decompose, to_vigesimal, to_bars_dots
# Convert to vigesimal
digits = to_vigesimal(347) # [7, 17] (LSB first)
# Full decomposition
info = maya_decompose(347)
# {'digits': [7, 17], 'bars': [1, 3], 'dots': [2, 2], 'n_levels': 2}
# Visualize
from maya_encoding.visualization.glyphs import render_maya_text
print(render_maya_text(347))

from maya_encoding.core.calendar import (
gregorian_to_jdn, jdn_to_tzolkin, jdn_to_haab, jdn_to_long_count
)
# December 21, 2012 — end of the 13th b'ak'tun
jdn = gregorian_to_jdn("2012-12-21")
print(jdn_to_tzolkin(jdn)) # (4, 19) → 4 Ajaw
print(jdn_to_haab(jdn)) # (13, 3) → month 13, day 3
print(jdn_to_long_count(jdn)) # (13, 0, 0, 0, 0) → 13.0.0.0.0

| Encoding | Linear Regression | Ridge | Random Forest | Gradient Boosting |
|---|---|---|---|---|
| Raw + Scaled | 0.5530 | 0.5530 | 0.6561 | 0.6852 |
| VFD-lite | 0.5832 | 0.5812 | 0.5445 | 0.5742 |
| VFD-full | 0.5742 | 0.5723 | 0.5891 | 0.6184 |
| VFD-lite + passthrough | 0.5985 | 0.5968 | 0.6588 | 0.6899 |
| VFD-full + passthrough | 0.5908 | 0.5881 | 0.6615 | 0.6937 |
| Configuration | Train R² | Test R² |
|---|---|---|
| All components + cyclical | 0.9875 | 0.9146 |
| Tzolk'in only | 0.3656 | 0.0707 |
| Haab' only | 0.6212 | 0.5891 |
| Pipeline | Logistic Regression | Random Forest | Gradient Boosting |
|---|---|---|---|
| Baseline (PCA) | 0.7082 | 0.8961 | 0.8729 |
| VFD (replace amount) | 0.6876 | 0.8971 | 0.8816 |
| VFD + passthrough | 0.6903 | 0.8993 | 0.8816 |
Rule of thumb: Linear models → use VFD directly. Tree-based models → always use passthrough=True.
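Conceptually, passthrough keeps the original columns and appends the encoded ones, so a tree model can still split on the raw value. The NumPy sketch below illustrates that idea only. The helper `vigesimal_digits` is hypothetical, and the column order is an assumption, not the encoder's documented output layout.

```python
import numpy as np


def vigesimal_digits(X, n_levels=2):
    """Base-20 digits of non-negative integer columns, least significant level first."""
    X = np.asarray(X, dtype=np.int64)
    levels = [(X // 20**k) % 20 for k in range(n_levels)]
    return np.hstack(levels)


X = np.array([[347], [42]])
encoded = vigesimal_digits(X)                # columns: ones digit, twenties digit
with_passthrough = np.hstack([X, encoded])   # original column kept in front

print(encoded)           # [[ 7 17] [ 2  2]]
print(with_passthrough)  # [[347 7 17] [42 2 2]]
```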
| Encoder | Strong Fit | Acceptable Fit |
|---|---|---|
| VFDEncoder | Discrete/count data (retail, events, scores), linear models | Continuous features with passthrough=True for tree models |
| MayaCalendarEncoder | Tropical/biological time series (agriculture, epidemiology, climate) | General time series with unexplained seasonal variance |
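VFD decomposes numbers into a natural hierarchy: digits (×20), bars (×5), dots (×1). For integer inputs with enough levels, the decomposition loses no information, because the base-20 digits reconstruct the value, and it exposes multi-scale structure to the model explicitly. In the first benchmark table above, linear models gain about 0.02 to 0.05 R² over the scaled baseline. Tree-based models lose accuracy with VFD features alone and improve slightly with passthrough=True.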
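MCE provides interlocking cycles with periods of 13, 20, 260, and 365 days that capture patterns Gregorian features miss. The 13- and 20-day counts are coprime and combine into the 260-day Tzolk'in. The 260-day and 365-day cycles share a factor of 5, so the Tzolk'in and Haab' realign every 18,980 days (about 52 years). The 260-day Tzolk'in is close in length to human gestation and maize growing cycles, and has been linked to tropical astronomical events.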
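The relationships between these periods can be checked with the standard library. The snippet below is plain arithmetic and does not depend on the package.

```python
import math

# The 13-day number count and the 20-day name count share no common factor,
# so a (number, name) pair repeats only after 13 * 20 days.
print(math.gcd(13, 20))   # 1
print(13 * 20)            # 260

# The 260-day and 365-day cycles share a factor of 5, so they realign after
# their least common multiple rather than after their product.
common = math.gcd(260, 365)
calendar_round = 260 * 365 // common
print(common)                # 5
print(calendar_round)        # 18980
print(calendar_round // 365) # 52
```

Because the two cycles realign only every 18,980 days, a (Tzolk'in, Haab') pair identifies a day uniquely within a window of roughly 52 years.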
→ Full guide: When to Use Maya Encoding
| Parameter | Default | Description |
|---|---|---|
| n_levels | 'auto' | Vigesimal levels (auto-detected from data) |
| components | 'full' | 'full', 'lite' (digits only), 'bars_dots' |
| normalize | True | Normalize features to [0, 1] |
| handle_negative | 'abs_sign' | 'abs_sign', 'shift', 'error' |
| handle_float | 'scale' | 'scale', 'round', 'integer_part' |
| passthrough | False | Keep original features alongside VFD output |
| scale_factor | 'auto' | Decimal precision auto-detection |
| Parameter | Default | Description |
|---|---|---|
| components | ['tzolkin', 'haab', 'long_count'] | Calendar systems to use |
| tzolkin_encoding | 'separate' | 'separate' (number + name) or 'combined' (position 0-259) |
| haab_encoding | 'hierarchical' | 'hierarchical' (with bars/dots) or 'flat' (day 0-364) |
| long_count_levels | 3 | 1–5: k'in, uinal, tun, k'atun, b'ak'tun |
| cyclical | True | Add sine/cosine pairs for smooth cycle boundaries |
| epoch | 'gmt' | 'gmt' (standard), 'spinden', or custom JDN |
| wayeb_flag | True | Binary flag for the 5-day Wayeb' period |
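The five Long Count units named under long_count_levels have fixed sizes in days: k'in = 1, uinal = 20, tun = 360, k'atun = 7,200, b'ak'tun = 144,000. As a rough illustration, a day count since the epoch splits into Long Count digits as follows. The function `long_count` is a hypothetical helper, not the package's `jdn_to_long_count`, and it always returns all five levels.

```python
# Unit sizes in days, from largest to smallest.
LONG_COUNT_UNITS = [
    ("b'ak'tun", 144000),
    ("k'atun", 7200),
    ("tun", 360),
    ("uinal", 20),
    ("k'in", 1),
]


def long_count(days_since_epoch):
    """Split a day count into (b'ak'tun, k'atun, tun, uinal, k'in)."""
    digits = []
    remaining = days_since_epoch
    for _, size in LONG_COUNT_UNITS:
        digit, remaining = divmod(remaining, size)
        digits.append(digit)
    return tuple(digits)


print(long_count(1872000))         # (13, 0, 0, 0, 0)
print(long_count(1872000 + 4028))  # (13, 0, 11, 3, 8)
```

The uinal position runs 0 to 17 and all other positions below the b'ak'tun run 0 to 19. This is why a tun is 360 days and not 400, and why the Long Count is not a pure base-20 system.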
See the examples/ directory:
- 01_quickstart.ipynb: Basic VFD and MCE usage
- 02_vfd_deep_dive.ipynb: Components, visualization, performance
- 03_mce_temporal.ipynb: Calendar systems and time series
- 04_benchmark_results.ipynb: Full benchmark with passthrough analysis
- 05_fraud_detection.ipynb: Credit card fraud with VFD amount decomposition
- 06_pricing_analysis.ipynb: Demand prediction with VFD price features

git clone https://github.com/DanielRegaladoUMiami/maya-encoding.git
cd maya-encoding
pip install -e ".[dev]"
pytest # Run 124 tests
ruff check . # Lint

Run benchmarks:
pip install -e ".[benchmarks]"
python benchmarks/run_vfd_benchmarks.py
python benchmarks/run_mce_benchmarks.py

If you use maya-encoding in your research, please cite:
@software{regalado2026maya,
author = {Regalado, Daniel},
title = {maya-encoding: Maya-Inspired Numerical Encodings for Machine Learning},
year = {2026},
url = {https://github.com/DanielRegaladoUMiami/maya-encoding}
}

MIT License. See LICENSE for details.