maya-encoding

Python 3.9+ · License: MIT

Maya-inspired numerical encodings for machine learning.

Documentation · PyPI · Examples

maya-encoding provides two scikit-learn compatible transformers, Vigesimal Feature Decomposition (VFD) and Maya Calendar Encoding (MCE), which use the mathematical structure of the ancient Maya number system and calendar to create richer feature representations.

Overview

| Encoder | Input | What it does | Use case |
| --- | --- | --- | --- |
| VFDEncoder | Numeric features | Decomposes into base-20 digits, bars (÷5), dots (%5) | Multi-scale numeric patterns |
| MayaCalendarEncoder | Dates | Extracts Tzolk'in (260d), Haab' (365d), Long Count cycles | Temporal feature engineering |

Installation

pip install maya-encoding

With optional dependencies:

pip install maya-encoding[viz]         # matplotlib visualization
pip install maya-encoding[benchmarks]  # xgboost, seaborn for benchmarks
pip install maya-encoding[dev]         # development tools (ruff, pytest)

Quick Start

VFD: Numeric Feature Encoding

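import numpy as np
from maya_encoding import VFDEncoder
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestRegressor

# Toy count data so the example runs end to end
rng = np.random.default_rng(0)
X_train = rng.integers(0, 400, size=(200, 3))
y_train = X_train.sum(axis=1)

# VFD decomposes numbers into vigesimal digits, bars, and dots
encoder = VFDEncoder(components='full')
X_encoded = encoder.fit_transform(X_train)

# Works seamlessly in sklearn pipelines
pipe = Pipeline([
    ('encode', VFDEncoder()),
    ('model', RandomForestRegressor())
])
pipe.fit(X_train, y_train)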

How it works — the number 347 becomes:

347 = 17×20 + 7

Level 0 (ones):     digit=7,  bars=1, dots=2
Level 1 (twenties): digit=17, bars=3, dots=2

Feature vector: [7, 1, 2, 17, 3, 2]  →  normalized: [0.37, 0.33, 0.50, 0.89, 1.00, 0.50]

Three "zoom levels" per number: coarse magnitude (digits), medium grouping (bars), and fine residual (dots).
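
The decomposition above can be reproduced in a few lines of plain Python. This is a standalone sketch, not the library's implementation: the function names are hypothetical, and the normalization divisors (19 for digits, 3 for bars, 4 for dots) are an assumption inferred from the normalized vector shown above.

```python
def vigesimal_digits(n):
    """Return the base-20 digits of a non-negative integer, least significant first."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    digits = []
    while True:
        n, digit = divmod(n, 20)
        digits.append(digit)
        if n == 0:
            return digits


def vfd_features(n):
    """Flatten each level into (digit, bars, dots); bars = digit // 5, dots = digit % 5."""
    features = []
    for digit in vigesimal_digits(n):
        bars, dots = divmod(digit, 5)
        features.extend([digit, bars, dots])
    return features


def normalize(features):
    """Scale by the assumed per-component maxima: digit 19, bars 3, dots 4."""
    maxima = [19, 3, 4]
    return [round(value / maxima[i % 3], 2) for i, value in enumerate(features)]


features = vfd_features(347)
print(features)             # [7, 1, 2, 17, 3, 2]
print(normalize(features))  # [0.37, 0.33, 0.5, 0.89, 1.0, 0.5]
```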

Passthrough Mode: Best of Both Worlds

Use passthrough=True to keep original features alongside VFD features — ideal for tree-based models:

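from maya_encoding import VFDEncoder
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline

# Original features + VFD features combined
pipe = Pipeline([
    ('encode', VFDEncoder(passthrough=True)),
    ('model', GradientBoostingRegressor())
])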
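
Conceptually, passthrough amounts to concatenating the raw values with their encoded counterparts, column by column. The sketch below illustrates the idea in plain Python with hypothetical helpers, using digits only for brevity. The column order that VFDEncoder actually produces is not specified here and may differ.

```python
def base20_digits(n, n_levels):
    """Fixed-width base-20 digits of a non-negative integer, least significant first."""
    digits = []
    for _ in range(n_levels):
        n, digit = divmod(n, 20)
        digits.append(digit)
    return digits


def encode_rows(rows, n_levels=2, passthrough=False):
    """Encode every value in every row; optionally keep the raw values in front."""
    encoded_rows = []
    for row in rows:
        encoded = [d for value in row for d in base20_digits(value, n_levels)]
        encoded_rows.append(list(row) + encoded if passthrough else encoded)
    return encoded_rows


X = [[347, 12], [40, 399]]
print(encode_rows(X))                    # [[7, 17, 12, 0], [0, 2, 19, 19]]
print(encode_rows(X, passthrough=True))  # [[347, 12, 7, 17, 12, 0], [40, 399, 0, 2, 19, 19]]
```

Keeping the raw column lets a tree split directly on overall magnitude, which the encoded digits express only through combinations of columns.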

MCE: Temporal Feature Encoding

import numpy as np
from maya_encoding import MayaCalendarEncoder

# Encode dates using Maya calendar cycles
encoder = MayaCalendarEncoder(
    components=['tzolkin', 'haab', 'long_count'],
    cyclical=True,  # sine/cosine for smooth cycle boundaries
)

dates = np.array(["2024-01-01", "2024-06-15", "2024-12-21"])
features = encoder.fit_transform(dates)

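The Maya calendar provides interlocking cycles with periods of 13, 20, 260, 365, and 360 days. Only 13 and 20 are coprime (together they generate the 260-day Tzolk'in); the other periods share factors, so the combined cycles realign only over long spans, such as the 18,980-day Calendar Round of the Tzolk'in and Haab'. Together they capture multi-scale temporal patterns that standard encodings achieve only with manual period selection.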
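
The effect of cyclical=True can be illustrated with a generic sine/cosine encoding. The sketch below uses only the standard library and hypothetical helper names; it shows the general technique rather than the encoder's internals, and it also checks with math.gcd how the listed periods relate to each other.

```python
import math


def cyclical(position, period):
    """Map a position within a cycle to a (sin, cos) point on the unit circle."""
    angle = 2 * math.pi * (position % period) / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b):
    """Euclidean distance between two (sin, cos) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# The last and first day of a 260-day cycle end up one step apart.
wrap_gap = circle_distance(cyclical(259, 260), cyclical(0, 260))
one_step = circle_distance(cyclical(0, 260), cyclical(1, 260))
print(math.isclose(wrap_gap, one_step))  # True

# Only pairs with gcd 1 are coprime.
print(math.gcd(13, 20))                 # 1: together they generate the 260-day cycle
print(math.gcd(260, 365))               # 5: the two cycles share a factor
print(260 * 365 // math.gcd(260, 365))  # 18980 days until both realign
```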

Explore Maya Numbers

from maya_encoding import maya_decompose, to_vigesimal, to_bars_dots

# Convert to vigesimal
digits = to_vigesimal(347)  # [7, 17] (LSB first)

# Full decomposition
info = maya_decompose(347)
# {'digits': [7, 17], 'bars': [1, 3], 'dots': [2, 2], 'n_levels': 2}

# Visualize
from maya_encoding.visualization.glyphs import render_maya_text
print(render_maya_text(347))

Explore Maya Calendar

from maya_encoding.core.calendar import (
    gregorian_to_jdn, jdn_to_tzolkin, jdn_to_haab, jdn_to_long_count
)

# December 21, 2012 — end of the 13th b'ak'tun
jdn = gregorian_to_jdn("2012-12-21")
print(jdn_to_tzolkin(jdn))     # (4, 19) → 4 Ajaw
print(jdn_to_haab(jdn))        # (13, 3) → month 13, day 3
print(jdn_to_long_count(jdn))  # (13, 0, 0, 0, 0) → 13.0.0.0.0
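
The arithmetic behind these conversions can be sketched with the standard library alone. The function names below are hypothetical and independent of maya_encoding.core.calendar. The sketch assumes the GMT correlation constant 584283 and uses zero-based day-name and month indices so that the results match the tuples printed above; the library's internals may differ.

```python
from datetime import date

GMT_EPOCH_JDN = 584283  # Long Count 0.0.0.0.0 under the GMT correlation


def to_jdn(iso_date):
    """Julian Day Number of a proleptic Gregorian ISO date string."""
    return date.fromisoformat(iso_date).toordinal() + 1721425


def tzolkin(jdn):
    """(number 1-13, day-name index 0-19); the epoch day is 4 Ajaw (index 19)."""
    days = jdn - GMT_EPOCH_JDN
    return (days + 3) % 13 + 1, (days + 19) % 20


def haab(jdn):
    """(month index 0-18, day 0-19); the epoch day is 8 Kumk'u (month index 17)."""
    position = (jdn - GMT_EPOCH_JDN + 17 * 20 + 8) % 365
    return divmod(position, 20)


def long_count(jdn):
    """(b'ak'tun, k'atun, tun, uinal, k'in) from the days since the epoch."""
    days = jdn - GMT_EPOCH_JDN
    parts = []
    for unit in (144000, 7200, 360, 20, 1):
        count, days = divmod(days, unit)
        parts.append(count)
    return tuple(parts)


jdn = to_jdn("2012-12-21")
print(jdn)              # 2456283
print(tzolkin(jdn))     # (4, 19)
print(haab(jdn))        # (13, 3)
print(long_count(jdn))  # (13, 0, 0, 0, 0)
```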

Results at a Glance

VFD — California Housing Regression (R², 5-fold CV)

| Encoding | Linear Regression | Ridge | Random Forest | Gradient Boosting |
| --- | --- | --- | --- | --- |
| Raw + Scaled | 0.5530 | 0.5530 | 0.6561 | 0.6852 |
| VFD-lite | 0.5832 | 0.5812 | 0.5445 | 0.5742 |
| VFD-full | 0.5742 | 0.5723 | 0.5891 | 0.6184 |
| VFD-lite + passthrough | 0.5985 | 0.5968 | 0.6588 | 0.6899 |
| VFD-full + passthrough | 0.5908 | 0.5881 | 0.6615 | 0.6937 |

MCE — Temporal Cycle Detection (R², synthetic data)

| Configuration | Train R² | Test R² |
| --- | --- | --- |
| All components + cyclical | 0.9875 | 0.9146 |
| Tzolk'in only | 0.3656 | 0.0707 |
| Haab' only | 0.6212 | 0.5891 |

Fraud Detection (F1, 5-fold stratified CV)

| Pipeline | Logistic Regression | Random Forest | Gradient Boosting |
| --- | --- | --- | --- |
| Baseline (PCA) | 0.7082 | 0.8961 | 0.8729 |
| VFD (replace amount) | 0.6876 | 0.8971 | 0.8816 |
| VFD + passthrough | 0.6903 | 0.8993 | 0.8816 |

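Rule of thumb: for linear models, try VFD directly; for tree-based models, use passthrough=True, because VFD features alone scored below the raw baseline for both tree models on California Housing. Gains are dataset-dependent: in the fraud benchmark, Logistic Regression scored lower with VFD (0.6876) than with the baseline (0.7082).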

When to Use Maya Encoding

| Encoder | Strong Fit | Acceptable Fit |
| --- | --- | --- |
| VFDEncoder | Discrete/count data (retail, events, scores), linear models | Continuous features with passthrough=True for tree models |
| MayaCalendarEncoder | Tropical/biological time series (agriculture, epidemiology, climate) | General time series with unexplained seasonal variance |

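VFD decomposes numbers into a natural hierarchy: digits (×20), bars (×5), dots (×1). The encoding is lossless for integers, so the model receives the same information restructured at several scales. In the California Housing benchmark, linear models gain roughly 0.02 to 0.05 R² (absolute) over the raw baseline, while tree-based models need passthrough=True to match it and then improve by less than 0.01 R².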
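
The place-value hierarchy can be checked with a round trip: each base-20 digit equals 5 × bars + dots, and the value is the sum of digit × 20^level. A minimal sketch with hypothetical helper names, for non-negative integers only:

```python
def encode(n):
    """Return [(bars, dots), ...] per base-20 level, least significant first."""
    levels = []
    while True:
        n, digit = divmod(n, 20)
        levels.append(divmod(digit, 5))  # (bars, dots)
        if n == 0:
            return levels


def decode(levels):
    """Rebuild the integer: each level contributes (5 * bars + dots) * 20**level."""
    return sum((5 * bars + dots) * 20 ** level
               for level, (bars, dots) in enumerate(levels))


print(encode(347))          # [(1, 2), (3, 2)]
print(decode(encode(347)))  # 347
print(all(decode(encode(n)) == n for n in range(10_000)))  # True
```

Because the digits alone already determine the value, bars and dots add no new information; they re-express each digit at a finer granularity.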

MCE provides interlocking cycles with periods of 13, 20, 260, and 365 days (of which only 13 and 20 are coprime) that can capture patterns Gregorian features miss. The 260-day Tzolk'in is often associated with the length of human gestation, maize growing cycles, and tropical astronomical events.

Full guide: When to Use Maya Encoding

API Reference

VFDEncoder

| Parameter | Default | Description |
| --- | --- | --- |
| n_levels | 'auto' | Vigesimal levels (auto-detected from data) |
| components | 'full' | 'full', 'lite' (digits only), 'bars_dots' |
| normalize | True | Normalize features to [0, 1] |
| handle_negative | 'abs_sign' | 'abs_sign', 'shift', 'error' |
| handle_float | 'scale' | 'scale', 'round', 'integer_part' |
| passthrough | False | Keep original features alongside VFD output |
| scale_factor | 'auto' | Decimal precision auto-detection |

MayaCalendarEncoder

| Parameter | Default | Description |
| --- | --- | --- |
| components | ['tzolkin', 'haab', 'long_count'] | Calendar systems to use |
| tzolkin_encoding | 'separate' | 'separate' (number + name) or 'combined' (position 0-259) |
| haab_encoding | 'hierarchical' | 'hierarchical' (with bars/dots) or 'flat' (day 0-364) |
| long_count_levels | 3 | 1–5: k'in, uinal, tun, k'atun, b'ak'tun |
| cyclical | True | Add sine/cosine pairs for smooth cycle boundaries |
| epoch | 'gmt' | 'gmt' (standard), 'spinden', or custom JDN |
| wayeb_flag | True | Binary flag for the 5-day Wayeb' period |

Examples

See the examples/ directory.

Development

git clone https://github.com/DanielRegaladoUMiami/maya-encoding.git
cd maya-encoding
pip install -e ".[dev]"
pytest          # Run 124 tests
ruff check .    # Lint

Run benchmarks:

pip install -e ".[benchmarks]"
python benchmarks/run_vfd_benchmarks.py
python benchmarks/run_mce_benchmarks.py

Citation

If you use maya-encoding in your research, please cite:

@software{regalado2026maya,
  author = {Regalado, Daniel},
  title = {maya-encoding: Maya-Inspired Numerical Encodings for Machine Learning},
  year = {2026},
  url = {https://github.com/DanielRegaladoUMiami/maya-encoding}
}

License

MIT License. See LICENSE for details.
