Releases: MeridianAlgo/FinAI
Meridian.AI v6.0.0 — Architecture Rename & Training Overhaul
What's Changed
Architecture Rename
- MeridianForCausalLM → MeridianSMoEForCausalLM (Sparse Mixture-of-Experts, accurate name)
- MeridianConfig → MeridianSMoEConfig
- Backward-compatible aliases kept: no breaking changes for existing imports
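A rename like this is commonly kept backward compatible by binding the old names to the new classes. The sketch below illustrates that pattern only and is not the repository's code: the class bodies are placeholders and the `hidden_size` field is hypothetical. Only the four class names come from the notes above.

```python
class MeridianSMoEConfig:
    """Placeholder config; the real class is assumed to carry more fields."""

    def __init__(self, hidden_size=512):  # hypothetical field and default
        self.hidden_size = hidden_size


class MeridianSMoEForCausalLM:
    """Placeholder model class that just stores its config."""

    def __init__(self, config):
        self.config = config


# Old names remain importable and refer to the very same class objects.
MeridianConfig = MeridianSMoEConfig
MeridianForCausalLM = MeridianSMoEForCausalLM

# Code written against the old names keeps working unchanged.
legacy_model = MeridianForCausalLM(MeridianConfig())
```

Because the aliases are plain name bindings, `isinstance` checks and pickled class references resolve to the new classes without any wrapper layer.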
Critical Bug Fixes
- _tied_weights_keys (transformers 5.9.0): Fixed from list to dict to match the new API. This was silently breaking save_pretrained for the custom research architecture.
- _init_weights guard bypass: Fixed all module.weight.data.normal_() calls to use nn.init.normal_() so transformers' guard_torch_init_functions correctly marks weights as initialized. Previously, from_pretrained would re-randomize all weights after loading from checkpoint.
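The second fix amounts to routing every initialization through the `torch.nn.init` functions instead of mutating `.data` directly. A minimal sketch of that style follows. The function name `init_weights`, the `std=0.02` default, and the choice of layer types are assumptions made for illustration, not the project's actual `_init_weights`.

```python
import torch
from torch import nn


def init_weights(module: nn.Module, std: float = 0.02) -> None:
    """Initialize Linear and Embedding layers via torch.nn.init functions."""
    if isinstance(module, (nn.Linear, nn.Embedding)):
        # nn.init.normal_(...) replaces module.weight.data.normal_(...)
        nn.init.normal_(module.weight, mean=0.0, std=std)
    if isinstance(module, nn.Linear) and module.bias is not None:
        nn.init.zeros_(module.bias)


torch.manual_seed(0)
layer = nn.Linear(64, 64)
layer.apply(init_weights)  # apply() visits the module and all its submodules
```

Both call styles produce the same distribution of values. The difference described above is only in whether the library's guard can observe the call.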
Training Improvements
| Setting | Before | After |
|---|---|---|
| BLOCK_SIZE | 256 | 512 — 2x context window |
| MAX_BYTES | 15 MB | 25 MB — ~67% more data per run |
| EWC_LAMBDA | 500.0 | 75.0 — less overconstrained |
| EWC_SAMPLES | 5 | 20 — more stable Fisher estimate |
| FISHER_THRESHOLD | 1e-4 | 5e-4 — EWC state: 1.88 GB to ~200 MB |
| GRAD_ACCUM | 8 | 4 |
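The ratios quoted in the table can be checked with a few lines of arithmetic. This sketch assumes decimal units (1.88 GB taken as 1880 MB); the dictionary keys are illustrative names, not the project's config keys.

```python
# Values copied from the table; key names here are illustrative only.
before = {"BLOCK_SIZE": 256, "MAX_BYTES_MB": 15, "EWC_STATE_MB": 1880}
after = {"BLOCK_SIZE": 512, "MAX_BYTES_MB": 25, "EWC_STATE_MB": 200}

# Context window growth as a multiplier.
context_ratio = after["BLOCK_SIZE"] / before["BLOCK_SIZE"]

# Extra data per run, as a percentage of the old limit.
extra_data_pct = (after["MAX_BYTES_MB"] / before["MAX_BYTES_MB"] - 1) * 100

# Shrinkage of the stored EWC state, as a percentage of the old size.
ewc_shrink_pct = (1 - after["EWC_STATE_MB"] / before["EWC_STATE_MB"]) * 100

print(f"context: {context_ratio:.1f}x")        # context: 2.0x
print(f"extra data: {extra_data_pct:.0f}%")    # extra data: 67%
print(f"EWC state: -{ewc_shrink_pct:.0f}%")    # EWC state: -89%
```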
Dataset Curriculum Rebalanced
- Increased: sujet-ai/Sujet-Finance-Instruct-177k (12% to 18%)
- Added: mhenrichsen/alpaca_data_cleaned (5%)
- Reduced: nvidia/OpenMathInstruct-2 (25% to 15%), HuggingFaceFW/fineweb-edu (20% to 12%)
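Assuming the percentages are sampling shares of the training mix, a weighted draw over sources might look like the sketch below. The four listed shares sum to 50%; the notes do not itemize the rest, so it is lumped into a placeholder entry here. The sampling code is a hypothetical illustration, not the project's data loader.

```python
import random

# Shares stated in the release notes (new values only).
stated_shares = {
    "sujet-ai/Sujet-Finance-Instruct-177k": 0.18,
    "mhenrichsen/alpaca_data_cleaned": 0.05,
    "nvidia/OpenMathInstruct-2": 0.15,
    "HuggingFaceFW/fineweb-edu": 0.12,
}

# The remaining share is not broken down in the notes; keep it as one bucket.
remainder = 1.0 - sum(stated_shares.values())
weights = {**stated_shares, "other (not listed above)": remainder}

# Draw source names in proportion to their share, with a fixed seed.
rng = random.Random(0)
names = list(weights)
draws = rng.choices(names, weights=[weights[n] for n in names], k=10_000)

# Empirical share of each source among the draws.
observed = {n: draws.count(n) / len(draws) for n in names}
```

Over many draws the observed shares approach the configured ones, which is a quick sanity check that a curriculum change took effect.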
Repo Cleanup
- Deleted debug scripts with hardcoded API keys (check_comet.py, find_workspace.py, check_cascade.py)
- Deleted duplicate docs/examples/ (canonical location is examples/)
- Deleted timing_test.py and stale .cometml-runs/ artifact
- Fixed all stale SmolLM2-360M references to Qwen2.5-0.5B
- Fixed examples/03_model_config.py (wrong param names and class names)
New Scripts
- scripts/migrate_legacy_and_seed.py — copies checkpoint to legacy/v5.1.0/ on HF and seeds a fresh Qwen2.5-0.5B
- scripts/cleanup_hf_checkpoint.py — removes duplicate pytorch_model.bin when model.safetensors exists
- scripts/diagnose_and_test.py — full diagnostic: perplexity + 8 finance generation tests
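The rule behind the cleanup script (drop pytorch_model.bin only when model.safetensors is present) can be sketched for a local directory as follows. This is an assumption-laden illustration: the real script presumably works against the Hugging Face repo, and the function name `remove_duplicate_bin` is hypothetical.

```python
import tempfile
from pathlib import Path


def remove_duplicate_bin(checkpoint_dir: Path) -> bool:
    """Delete pytorch_model.bin only if model.safetensors also exists.

    Returns True when a file was removed, False otherwise.
    """
    safetensors = checkpoint_dir / "model.safetensors"
    legacy_bin = checkpoint_dir / "pytorch_model.bin"
    if safetensors.exists() and legacy_bin.exists():
        legacy_bin.unlink()
        return True
    return False


# Demonstration on a throwaway directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp)
    (ckpt / "model.safetensors").touch()
    (ckpt / "pytorch_model.bin").touch()
    removed = remove_duplicate_bin(ckpt)
    bin_still_there = (ckpt / "pytorch_model.bin").exists()
```

Checking for the safetensors file first means a checkpoint that only has the .bin weights is never left without any weights at all.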
Documentation
- CHANGELOG.md — full version history v1.0.0 through v6.0.0 with 3-month training audit
- Updated all CI defaults tables across README and docs
- Clarified deployed model (Qwen2.5-0.5B) vs research architecture (MeridianSMoEForCausalLM)
Deployed model: https://huggingface.co/meridianal/FinAI
v2.0.0 (Qwen2.5-0.5B AI)
Upgraded the base architecture to Qwen2.5-0.5B. Trained on FinanceAlpaca instruction datasets. Ready for hourly continual training.
v1.0.0 (Legacy SmolLM2-360M)
Legacy release using the SmolLM2-360M base architecture. Retired due to high perplexity and format memorization.