Releases · MeridianAlgo/FinAI

What's Changed

Architecture Rename

MeridianForCausalLM → MeridianSMoEForCausalLM (Sparse Mixture-of-Experts, accurate name)
MeridianConfig → MeridianSMoEConfig
Backward-compatible aliases kept — no breaking changes for existing imports
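
A minimal sketch of how the backward-compatible aliases might work, assuming they are plain module-level name bindings. The class bodies and the `num_experts` parameter are hypothetical stand-ins, not the repo's actual definitions.

```python
class MeridianSMoEConfig:
    """Stand-in for the renamed config class (body is hypothetical)."""

    def __init__(self, num_experts: int = 8):
        self.num_experts = num_experts


class MeridianSMoEForCausalLM:
    """Stand-in for the renamed model class (body is hypothetical)."""

    def __init__(self, config: MeridianSMoEConfig):
        self.config = config


# Backward-compatible aliases: the old names point at the same class objects,
# so code that imports the old names keeps working unchanged.
MeridianConfig = MeridianSMoEConfig
MeridianForCausalLM = MeridianSMoEForCausalLM

model = MeridianForCausalLM(MeridianConfig())
print(type(model).__name__)  # MeridianSMoEForCausalLM
```

Because an alias is the same object as the new class, `isinstance` checks against either name behave identically.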

Critical Bug Fixes

_tied_weights_keys (transformers 5.9.0): Changed from a list to a dict to match the new API. The old list form was silently breaking save_pretrained for the custom research architecture.
_init_weights guard bypass: Replaced all module.weight.data.normal_() calls with nn.init.normal_() so that transformers' guard_torch_init_functions correctly marks weights as initialized. Previously, from_pretrained re-randomized all weights after loading them from the checkpoint.
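
The two fixes can be sketched on a toy module as follows. This is an illustration under stated assumptions, not the repo's code: `TinyLM` is a hypothetical plain `nn.Module` rather than a `PreTrainedModel`, and the dict form of `_tied_weights_keys` is assumed to map each tied parameter name to the parameter it shares storage with.

```python
import torch
from torch import nn


class TinyLM(nn.Module):
    # Assumption: dict form maps the tied parameter to its source parameter.
    # The previous list form would have been ["lm_head.weight"].
    _tied_weights_keys = {"lm_head.weight": "embed_tokens.weight"}

    def __init__(self, vocab_size: int = 32, hidden_size: int = 8,
                 initializer_range: float = 0.02):
        super().__init__()
        self.initializer_range = initializer_range
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # Tie the output projection to the input embedding.
        self.lm_head.weight = self.embed_tokens.weight
        self.apply(self._init_weights)

    def _init_weights(self, module: nn.Module) -> None:
        if isinstance(module, (nn.Linear, nn.Embedding)):
            # Go through torch.nn.init rather than module.weight.data.normal_():
            # a wrapper around the nn.init functions can only see calls that
            # are made through nn.init.
            nn.init.normal_(module.weight, mean=0.0, std=self.initializer_range)
            if isinstance(module, nn.Linear) and module.bias is not None:
                nn.init.zeros_(module.bias)


model = TinyLM()
```

Calling the in-place tensor method on `.data` writes to the weights directly, which is why a guard built around the `nn.init` functions would not notice it.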

Training Improvements

Setting	Before	After	Note
BLOCK_SIZE	256	512	2x context window
MAX_BYTES	15 MB	25 MB	~67% more data per run
EWC_LAMBDA	500.0	75.0	less overconstrained
EWC_SAMPLES	5	20	more stable Fisher estimate
FISHER_THRESHOLD	1e-4	5e-4	EWC state shrinks from 1.88 GB to ~200 MB
GRAD_ACCUM	8	4	
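
To make the EWC settings concrete, here is a minimal pure-Python sketch of the standard diagonal-Fisher EWC penalty with thresholding. It assumes the common formulation (penalty = lambda/2 * sum of F_i * (theta_i - theta*_i)^2) and that FISHER_THRESHOLD drops Fisher entries below the cutoff; the repo's actual implementation may differ, and all function names and sample values are hypothetical.

```python
EWC_LAMBDA = 75.0
FISHER_THRESHOLD = 5e-4


def estimate_fisher(grad_samples):
    """Diagonal Fisher estimate: mean of squared per-sample gradients."""
    n = len(grad_samples)
    dim = len(grad_samples[0])
    return [sum(g[i] ** 2 for g in grad_samples) / n for i in range(dim)]


def sparsify(fisher, threshold):
    """Keep only entries at or above the threshold, stored as {index: value}."""
    return {i: f for i, f in enumerate(fisher) if f >= threshold}


def ewc_penalty(params, anchor, sparse_fisher, lam):
    """Quadratic penalty pulling important parameters toward their anchors."""
    return 0.5 * lam * sum(
        f * (params[i] - anchor[i]) ** 2 for i, f in sparse_fisher.items()
    )


# Made-up gradients from two samples over three parameters.
grad_samples = [[0.1, 0.001, 0.05], [0.3, 0.002, 0.01]]
fisher = estimate_fisher(grad_samples)
sparse_fisher = sparsify(fisher, FISHER_THRESHOLD)  # drops the tiny middle entry
penalty = ewc_penalty([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], sparse_fisher, EWC_LAMBDA)
```

Raising the threshold stores fewer Fisher entries, which is consistent with the smaller EWC state, while more samples average out noise in each entry.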

Dataset Curriculum Rebalanced

Increased: sujet-ai/Sujet-Finance-Instruct-177k (12% to 18%)
Added: mhenrichsen/alpaca_data_cleaned (5%)
Reduced: nvidia/OpenMathInstruct-2 (25% to 15%), HuggingFaceFW/fineweb-edu (20% to 12%)
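
One way such mixture percentages could drive sampling is sketched below. The four listed weights come from the notes above and sum to 50%; the remaining share is assigned to a placeholder bucket because the other datasets are not named here. The `sample_sources` helper is hypothetical.

```python
import random
from collections import Counter

# Percent of the training mix per source (values from the release notes).
WEIGHTS = {
    "sujet-ai/Sujet-Finance-Instruct-177k": 18,
    "mhenrichsen/alpaca_data_cleaned": 5,
    "nvidia/OpenMathInstruct-2": 15,
    "HuggingFaceFW/fineweb-edu": 12,
}
# Placeholder for every dataset not listed in the notes.
WEIGHTS["<other datasets>"] = 100 - sum(WEIGHTS.values())


def sample_sources(n, seed=0):
    """Draw n dataset names with probability proportional to their weight."""
    rng = random.Random(seed)
    names = list(WEIGHTS)
    return rng.choices(names, weights=[WEIGHTS[k] for k in names], k=n)


counts = Counter(sample_sources(10_000))
for name in WEIGHTS:
    print(f"{name}: {counts[name] / 10_000:.1%}")
```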

Repo Cleanup

Deleted debug scripts with hardcoded API keys (check_comet.py, find_workspace.py, check_cascade.py)
Deleted duplicate docs/examples/ (canonical location is examples/)
Deleted timing_test.py and stale .cometml-runs/ artifact
Updated all stale SmolLM2-360M references to Qwen2.5-0.5B
Fixed examples/03_model_config.py (wrong param names and class names)
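
For scripts that still need credentials, a common alternative to hardcoding keys is to read them from the environment. The sketch below is a generic pattern, not code from this repo; `require_env` is a hypothetical helper, and `COMET_API_KEY` is the variable name Comet's SDK conventionally reads.

```python
import os


def require_env(name: str) -> str:
    """Return a required environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set {name} in the environment; do not hardcode it in the script."
        )
    return value


# Typical use at the top of a script:
# api_key = require_env("COMET_API_KEY")
```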

New Scripts

scripts/migrate_legacy_and_seed.py — copies checkpoint to legacy/v5.1.0/ on HF and seeds a fresh Qwen2.5-0.5B
scripts/cleanup_hf_checkpoint.py — removes duplicate pytorch_model.bin when model.safetensors exists
scripts/diagnose_and_test.py — full diagnostic: perplexity + 8 finance generation tests
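
Two of the ideas behind these scripts can be sketched in a few lines. This is an illustration only: it works on a local directory rather than the Hugging Face Hub, the function names are hypothetical, and perplexity is computed in the usual way as the exponential of the mean per-token negative log-likelihood.

```python
import math
from pathlib import Path


def remove_duplicate_bin(checkpoint_dir: Path) -> bool:
    """Delete pytorch_model.bin only when model.safetensors is also present."""
    safetensors = checkpoint_dir / "model.safetensors"
    legacy_bin = checkpoint_dir / "pytorch_model.bin"
    if safetensors.exists() and legacy_bin.exists():
        legacy_bin.unlink()
        return True
    return False


def perplexity(token_nlls) -> float:
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that assigns probability 1/4 to every token has perplexity of about 4.
uniform_ppl = perplexity([math.log(4)] * 10)
```

The cleanup check never deletes the `.bin` file when it is the only copy of the weights.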

Documentation

CHANGELOG.md — full version history v1.0.0 through v6.0.0 with 3-month training audit
Updated all CI defaults tables across README and docs
Clarified deployed model (Qwen2.5-0.5B) vs research architecture (MeridianSMoEForCausalLM)

Deployed model: https://huggingface.co/meridianal/FinAI

Meridian.AI v6.0.0 — Architecture Rename & Training Overhaul

v2.0.0 (Qwen2.5-0.5B AI)

v1.0.0 (Legacy SmolLM2-360M)
