Releases: MeridianAlgo/FinAI

Meridian.AI v6.0.0 — Architecture Rename & Training Overhaul

26 May 17:26

What's Changed

Architecture Rename

  • MeridianForCausalLM renamed to MeridianSMoEForCausalLM (Sparse Mixture-of-Experts, accurate name)
  • MeridianConfig renamed to MeridianSMoEConfig
  • Backward-compatible aliases kept — no breaking changes for existing imports
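
A minimal sketch of how such backward-compatible aliases can work. The class bodies and the num_experts parameter are hypothetical stand-ins and are not taken from the repository; only the four class names come from these notes.

```python
class MeridianSMoEConfig:
    """Stand-in for the renamed config class (body is hypothetical)."""

    def __init__(self, num_experts: int = 8):
        # num_experts is an illustrative parameter, not the real signature.
        self.num_experts = num_experts


class MeridianSMoEForCausalLM:
    """Stand-in for the renamed model class (body is hypothetical)."""

    def __init__(self, config: MeridianSMoEConfig):
        self.config = config


# Old names stay importable by binding them to the new classes.
MeridianConfig = MeridianSMoEConfig
MeridianForCausalLM = MeridianSMoEForCausalLM

# Code written against the old names builds the new classes.
model = MeridianForCausalLM(MeridianConfig())
print(type(model).__name__)
```

Because an alias is the same class object under a second name, isinstance checks and pickled references keep working under either name.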

Critical Bug Fixes

  • _tied_weights_keys (transformers 5.9.0): Changed from a list to a dict to match the new API. The old list form was silently breaking save_pretrained for the custom research architecture.
  • _init_weights guard bypass: Replaced every module.weight.data.normal_() call with nn.init.normal_() so that transformers' guard_torch_init_functions correctly marks weights as initialized. Previously, from_pretrained re-randomized all weights after loading a checkpoint.
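
The _init_weights change can be illustrated with a short sketch. It assumes PyTorch is installed. The function name, the layer types handled, and the std value are illustrative and are not copied from the repository; the sketch only shows the call-style change described above.

```python
import torch
from torch import nn

INIT_STD = 0.02  # hypothetical value; the release notes do not state it


def init_weights(module: nn.Module, std: float = INIT_STD) -> None:
    """Initialize Linear and Embedding layers through torch.nn.init."""
    if isinstance(module, nn.Linear):
        # Old style, which bypassed the guard:
        #   module.weight.data.normal_(mean=0.0, std=std)
        nn.init.normal_(module.weight, mean=0.0, std=std)
        if module.bias is not None:
            nn.init.zeros_(module.bias)
    elif isinstance(module, nn.Embedding):
        nn.init.normal_(module.weight, mean=0.0, std=std)


torch.manual_seed(0)
layer = nn.Linear(8, 4)
layer.apply(init_weights)  # apply() visits submodules and the module itself
```

Routing every initialization through the torch.nn.init functions keeps it visible to any code that wraps those functions, which calls made directly on .data are not.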

Training Improvements

Setting            Before    After     Note
BLOCK_SIZE         256       512       2x context window
MAX_BYTES          15 MB     25 MB     ~67% more data per run
EWC_LAMBDA         500.0     75.0      less overconstrained
EWC_SAMPLES        5         20        more stable Fisher estimate
FISHER_THRESHOLD   1e-4      5e-4      EWC state: 1.88 GB to ~200 MB
GRAD_ACCUM         8         4
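
For orientation, here is a sketch of how EWC_LAMBDA and FISHER_THRESHOLD could interact, using the standard elastic weight consolidation penalty (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2. It assumes that Fisher entries below the threshold are dropped from the stored state, which would explain the smaller EWC file, but the notes do not confirm this. Parameters are plain floats keyed by made-up names to keep the example dependency-free.

```python
from typing import Dict

EWC_LAMBDA = 75.0        # value from the table above
FISHER_THRESHOLD = 5e-4  # value from the table above


def sparsify_fisher(fisher: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Keep only entries whose Fisher value reaches the threshold (assumed behavior)."""
    return {name: value for name, value in fisher.items() if value >= threshold}


def ewc_penalty(
    params: Dict[str, float],
    anchor: Dict[str, float],
    fisher: Dict[str, float],
    lam: float,
) -> float:
    """Compute (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2 over stored entries."""
    total = sum(f * (params[name] - anchor[name]) ** 2 for name, f in fisher.items())
    return 0.5 * lam * total


# Hypothetical toy values: two scalar "parameters" with different importance.
fisher_full = {"layer.a": 1e-3, "layer.b": 1e-4}
anchor = {"layer.a": 0.0, "layer.b": 0.0}   # weights after the previous run
params = {"layer.a": 1.0, "layer.b": 1.0}   # weights during the current run

fisher_kept = sparsify_fisher(fisher_full, FISHER_THRESHOLD)
penalty = ewc_penalty(params, anchor, fisher_kept, EWC_LAMBDA)
```

Under this assumption, a higher threshold stores fewer entries, and a lower lambda lets the remaining parameters move further from their anchors before the penalty dominates the loss.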

Dataset Curriculum Rebalanced

  • Increased: sujet-ai/Sujet-Finance-Instruct-177k (12% to 18%)
  • Added: mhenrichsen/alpaca_data_cleaned (5%)
  • Reduced: nvidia/OpenMathInstruct-2 (25% to 15%), HuggingFaceFW/fineweb-edu (20% to 12%)
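
A sketch of how percentage shares like these can drive dataset sampling. It assumes the percentages are shares of the whole training mix, so the four listed datasets cover 50% and the rest goes to datasets these notes do not name. The placeholder bucket and the sampling function are illustrative and are not the project's loader.

```python
import random

# Shares from the release notes, in percent of the training mix.
CURRICULUM = {
    "sujet-ai/Sujet-Finance-Instruct-177k": 18,
    "mhenrichsen/alpaca_data_cleaned": 5,
    "nvidia/OpenMathInstruct-2": 15,
    "HuggingFaceFW/fineweb-edu": 12,
}

# Assumption: the remainder belongs to datasets the notes do not list.
OTHER = "<other datasets>"
weights = dict(CURRICULUM)
weights[OTHER] = 100 - sum(CURRICULUM.values())


def pick_dataset(rng: random.Random) -> str:
    """Draw one dataset name with probability proportional to its share."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


rng = random.Random(0)
draws = [pick_dataset(rng) for _ in range(10_000)]
share = draws.count("sujet-ai/Sujet-Finance-Instruct-177k") / len(draws)
print(f"Sujet-Finance share in sample: {share:.3f}")
```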

Repo Cleanup

  • Deleted debug scripts with hardcoded API keys (check_comet.py, find_workspace.py, check_cascade.py)
  • Deleted duplicate docs/examples/ (canonical location is examples/)
  • Deleted timing_test.py and stale .cometml-runs/ artifact
  • Replaced all stale SmolLM2-360M references with Qwen2.5-0.5B
  • Fixed examples/03_model_config.py (wrong param names and class names)

New Scripts

  • scripts/migrate_legacy_and_seed.py — copies checkpoint to legacy/v5.1.0/ on HF and seeds a fresh Qwen2.5-0.5B
  • scripts/cleanup_hf_checkpoint.py — removes duplicate pytorch_model.bin when model.safetensors exists
  • scripts/diagnose_and_test.py — full diagnostic: perplexity + 8 finance generation tests
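
The rule behind scripts/cleanup_hf_checkpoint.py can be sketched against a local directory. This is a simplified stand-in: the real script presumably works on the Hugging Face repository, and the function name here is hypothetical.

```python
from pathlib import Path


def remove_duplicate_bin(checkpoint_dir: Path) -> bool:
    """Delete pytorch_model.bin only when model.safetensors is also present.

    Returns True if a file was removed, False otherwise.
    """
    bin_file = checkpoint_dir / "pytorch_model.bin"
    safetensors_file = checkpoint_dir / "model.safetensors"
    if bin_file.is_file() and safetensors_file.is_file():
        bin_file.unlink()
        return True
    # Without a safetensors copy, the .bin file is the only set of weights.
    return False
```

Checking for model.safetensors first means a checkpoint is never left without weights.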

Documentation

  • CHANGELOG.md — full version history v1.0.0 through v6.0.0 with 3-month training audit
  • Updated all CI defaults tables across README and docs
  • Clarified deployed model (Qwen2.5-0.5B) vs research architecture (MeridianSMoEForCausalLM)

Deployed model: https://huggingface.co/meridianal/FinAI

v2.0.0 (Qwen2.5-0.5B AI)

05 Apr 23:17

Upgraded the base architecture to Qwen2.5-0.5B and trained it on the FinanceAlpaca instruction datasets. Ready for hourly continual training.

v1.0.0 (Legacy SmolLM2-360M)

05 Apr 23:17

Legacy release using the SmolLM2-360M base architecture. Retired due to high perplexity and format memorization.