Skip to content

Blanket58/scorecard

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

161 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

scorecard

Python package

A robust, sklearn-style scorecard toolbox for credit risk modeling. A Python alternative to R's scorecard package, providing end-to-end functionality from binning and WOE encoding to scorecard generation, SQL deployment, and model performance evaluation.

Key Features

  • Smart Binning: Supports both ChiMerge and Decision Tree-based binning methods.
  • WOE Encoding: Automatic Weight of Evidence (WOE) transformation for categorical and continuous features.
  • IV Calculation: Computes Information Value (IV) for data-driven feature selection.
  • Scorecard Generation: Creates interpretable scorecards with point values for each feature bin.
  • SQL Conversion: Converts scorecards to executable SQL CASE WHEN statements for production deployment.
  • Performance Evaluation: Includes KS, AUC, PSI (Population Stability Index), gains table, and lift analysis.
  • Sklearn Compatible: Follows sklearn API conventions for seamless integration with existing machine learning workflows.

Installation

Download the latest wheel file from Releases and install via pip:

pip install scorecard-0.0.8-py3-none-any.whl

(Note: Replace the wheel filename with the latest version from Releases.)

Quick Start

Here's a minimal end-to-end example:

import ssl
from ucimlrepo import fetch_ucirepo
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegressionCV

from scorecard import (
    ChiMergeWoeEncoder,
    scorecard,
    card2sql,
    perf_eva,
    perf_psi
)

# 1. Load sample data (German Credit Data)
ssl._create_default_https_context = ssl._create_unverified_context
statlog_german_credit_data = fetch_ucirepo(id=144)
X = statlog_german_credit_data.data.features
y = statlog_german_credit_data.data.targets.squeeze().map({1: 0, 2: 1})

# 2. WOE encoding with ChiMerge binning
woe_encoder = ChiMergeWoeEncoder()
X_woe = woe_encoder.fit_transform(X, y)

# 3. Feature selection (IV >= 0.02)
iv = woe_encoder.iv_table
variables = iv.loc[iv['total_iv'] >= 0.02, 'variable'].tolist()
bins = dict(filter(lambda x: x[0] in variables, woe_encoder.bins_result_.items()))

# 4. Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X_woe[variables], y, test_size=0.25, random_state=1, stratify=y
)

# 5. Train logistic regression model
clf = LogisticRegressionCV(
    Cs=10, fit_intercept=True, cv=5, l1_ratios=(1,),
    solver='liblinear', random_state=1
)
clf.fit(X_train, y_train)

# 6. Generate scorecard
card = scorecard(
    bins, clf.intercept_, clf.coef_.reshape(-1,), variables,
    points0=600, odds0=y_train.sum()/(y_train.size - y_train.sum()), pdo=50
)

# 7. Convert scorecard to SQL
sql = card2sql(card, to_clipboard=False)
print(sql)

Full Example

See the complete end-to-end workflow (including performance evaluation, PSI analysis, and visualization) on nbviewer:
📓 Example Notebook

About

Rewrite brand new robust scorecard toolbox, if you are looking for something similar as `scorecard` in R and `scorecardpy` in python.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages