Turn any gradient boosted tree ensemble into pure Python (and Swift) code.
| Backend | Supported Models |
|---|---|
| scikit-learn | HistGradientBoostingRegressor, HistGradientBoostingClassifier |
| XGBoost | XGBRegressor, XGBClassifier, Booster |
| LightGBM | LGBMRegressor, LGBMClassifier, Booster |
| CatBoost | CatBoostRegressor, CatBoostClassifier, CatBoost |
| WarpGBM | WarpGBM |
Output languages: Python, Swift
```python
from export_gbm import export_model

export_model(model, "my_model.py")
```

Or from the command line:

```
python export_gbm.py --model model.pkl --output model.py
```

Key flags:

- `--language` — export language (`python` or `swift`, default: `python`)
- `--model-name` — exported class/type name (defaults to the output filename stem)
- `--force` — overwrite the output file if it already exists
- joblib (`.joblib`)
- pickle (`.pkl`)
- XGBoost native (`.json`, `.ubj`)
- LightGBM native (`.txt`)
- CatBoost native (`.cbm`)
- Numerai-style cloudpickle (auto-unwraps predict functions)
gbm2py is especially useful with Numerai's model upload feature: it can directly load Numerai-style cloudpickle files (which wrap a predict function) and auto-extract the embedded model.
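To illustrate the format being unwrapped: a Numerai-style upload pickles a predict *function*, usually via cloudpickle so that closures capturing a fitted model also serialize. The sketch below uses the stdlib `pickle` with a module-level stand-in function just to show the round trip; the names here are hypothetical, not gbm2py API.

```python
import pickle

# Hypothetical stand-in for a model's prediction logic. Real Numerai
# uploads are written with cloudpickle, which can also serialize a
# closure that captures the fitted model object.
def predict(live_features):
    # A real predict would call something like model.predict(live_features)
    return [0.5 for _ in live_features]

blob = pickle.dumps(predict)   # the file stores the function, not the model
fn = pickle.loads(blob)        # gbm2py unwraps this to reach the embedded model
print(fn([[1.0, 2.0], [3.0, 4.0]]))  # -> [0.5, 0.5]
```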
WarpGBM needs a CUDA-capable GPU for training and inference. With gbm2py you can export WarpGBM models and run them in any environment, including Numerai's model upload Docker environment.
For reference, the Numerai example model with 2000 estimators turns into a 12MB Python file. But then again, if you're using pickle for serialization, file sizes are the least of your problems. :)
Train a gradient boosted tree model with your favorite library, then point gbm2py at the saved model file. gbm2py walks the tree structure and serializes every split into a self-contained source file with zero dependencies.
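For a sense of what a self-contained exported file looks like, here is a hypothetical sketch: each tree becomes nested if/else splits on feature thresholds, and prediction sums the leaf values. The thresholds, feature indices, and leaf values below are made up, and the actual gbm2py output format may differ.

```python
# Hypothetical sketch of an exported regressor; values are illustrative,
# not taken from a real trained model.
def _tree_0(x):
    if x[2] < 0.731:
        if x[0] < -1.052:
            return 0.0213
        return -0.0142
    return 0.0391

def _tree_1(x):
    if x[5] < 1.337:
        return -0.0087
    return 0.0119

def predict(x, base_score=0.5):
    # Sum of per-tree leaf values plus the ensemble's base score.
    return base_score + _tree_0(x) + _tree_1(x)
```

Because the output is plain source with zero dependencies, the exported file can be imported and called anywhere Python runs.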
```
python tests/test_export_histgbm.py
python tests/test_export_xgboost.py
python tests/test_export_lightgbm.py
python tests/test_export_catboost.py
python tests/test_export_warpgbm.py
python tests/test_export_numerai_pickle.py
```

This project is licensed under the Apache License 2.0.
