Credit History & Utilization EDA

# Issue 3 – Credit History & Utilization EDA with OOP + Functional Report

**Goal**  
Explore how **credit history, balances, utilization, and inquiries** relate to `loan_status`.

You must use:

- **OOP**: create a class that encapsulates credit-history EDA.
- **Functional programming**: build a report function that stores and runs multiple EDA steps via functions/lambdas.

---

## Columns in scope

You are responsible for the following columns:

**Ratios**

- `dti`  
- `dti_joint`  
- `revol_util`  
- `il_util`  
- `all_util`  

**Delinquency / history**

- `delinq_2yrs`  
- `earliest_cr_line`  
- `mths_since_last_delinq`  
- `mths_since_last_record`  
- `mths_since_last_major_derog`  

**Accounts & public records**

- `open_acc`  
- `total_acc`  
- `pub_rec`  
- `acc_now_delinq`  

**Balances & limits**

- `revol_bal`  
- `total_rev_hi_lim`  
- `tot_coll_amt`  
- `tot_cur_bal`  
- `total_bal_il`  
- `max_bal_bc`  

**Recent account activity**

- `open_acc_6m`  
- `open_il_6m`  
- `open_il_12m`  
- `open_il_24m`  
- `mths_since_rcnt_il`  
- `open_rv_12m`  
- `open_rv_24m`  

**Inquiries / pulls**

- `inq_last_6mths`  
- `inq_last_12m`  
- `inq_fi`  
- `last_credit_pull_d`  

**Collections**

- `collections_12_mths_ex_med`  

**Shared target**

- `loan_status` (used for default-rate and correlation analyses)
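
Before running any analysis, it can help to confirm which of these columns are actually present in the loaded data. A minimal sketch, assuming a hypothetical helper `split_available_columns` (not part of the required API) and a toy DataFrame:

```python
from typing import List, Tuple

import pandas as pd


def split_available_columns(df: pd.DataFrame, wanted: List[str]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split `wanted` into columns present in / missing from `df`."""
    present = [c for c in wanted if c in df.columns]
    missing = [c for c in wanted if c not in df.columns]
    return present, missing


# Toy data for illustration only; the real dataset has many more columns.
toy = pd.DataFrame({"dti": [10.0, 25.0], "revol_util": [30.0, 80.0], "loan_status": [0, 1]})
present, missing = split_available_columns(toy, ["dti", "dti_joint", "revol_util"])
print(present)  # ['dti', 'revol_util']
print(missing)  # ['dti_joint']
```

Columns reported as missing can then be skipped or flagged in the structure summary rather than raising a `KeyError` halfway through the report.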

---

## Files to edit

- `src/eda_credit_history.py`
- `notebooks/eda_credit_history_demo.ipynb`

---

## 1. Implement the `CreditHistoryEDA` class

In `src/eda_credit_history.py`, create and implement:

```python
import pandas as pd
from typing import Dict, Any, Callable

# Numeric credit-history columns. The in-scope date columns
# (earliest_cr_line, last_credit_pull_d) are not numeric and are left out.
CREDIT_NUMERIC_COLS = [
    "dti", "dti_joint",
    "delinq_2yrs",
    "mths_since_last_delinq", "mths_since_last_record", "mths_since_last_major_derog",
    "open_acc", "total_acc", "pub_rec", "acc_now_delinq",
    "revol_bal", "revol_util", "total_rev_hi_lim",
    "tot_coll_amt", "tot_cur_bal", "total_bal_il",
    "open_acc_6m", "open_il_6m", "open_il_12m", "open_il_24m",
    "mths_since_rcnt_il",
    "open_rv_12m", "open_rv_24m",
    "max_bal_bc",
    "il_util", "all_util",
    "inq_last_6mths", "inq_last_12m", "inq_fi",
    "collections_12_mths_ex_med",
]

class CreditHistoryEDA:
    def __init__(self, df: pd.DataFrame, target_col: str = "loan_status"):
        """
        Store the full DataFrame and the name of the target column.
        """
        self.df = df
        self.target_col = target_col

    def credit_structure_summary(self) -> pd.DataFrame:
        """
        One row per CREDIT_NUMERIC_COLS column with:
        - column
        - dtype
        - n_missing
        - missing_pct
        - mean (if numeric)
        - std (if numeric)
        """
        ...

    def default_rate_by_bucket(self, col: str, bins: int = 4) -> pd.DataFrame:
        """
        For a numeric credit column (e.g., dti, revol_util),
        create `bins` buckets and compute default rate per bucket.

        Return a DataFrame with columns:
        - bucket (interval)
        - n_loans
        - default_rate
        """
        ...

    def correlation_with_default(self) -> pd.Series:
        """
        Compute correlation of each numeric credit column with the target
        (assuming loan_status is encoded as 0/1).
        Return a Series indexed by column name.
        """
        ...
```
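
The method bodies are left for you to implement. As a starting point, the bucketing and correlation logic could look roughly like the sketch below. It is written as standalone functions (the names `bucket_default_rate` and `numeric_correlations` are illustrative, not required), assumes the target is already encoded as 0/1, and uses quantile buckets via `pd.qcut`. That is one choice among several: `pd.cut` would give equal-width buckets instead.

```python
from typing import List

import pandas as pd


def bucket_default_rate(df: pd.DataFrame, col: str,
                        target_col: str = "loan_status", bins: int = 4) -> pd.DataFrame:
    """Illustrative helper: default rate per quantile bucket of `col`."""
    data = df[[col, target_col]].dropna()
    # duplicates="drop" avoids an error when many rows share the same value.
    buckets = pd.qcut(data[col], q=bins, duplicates="drop").rename("bucket")
    grouped = data.groupby(buckets, observed=True)[target_col]
    # With a 0/1 target, the mean of the target is the default rate.
    return grouped.agg(n_loans="size", default_rate="mean").reset_index()


def numeric_correlations(df: pd.DataFrame, cols: List[str],
                         target_col: str = "loan_status") -> pd.Series:
    """Illustrative helper: Pearson correlation of each present column with the target."""
    present = [c for c in cols if c in df.columns]
    return df[present].corrwith(df[target_col])


# Toy data for illustration only.
toy = pd.DataFrame({
    "dti": [1, 2, 3, 4, 5, 6, 7, 8],
    "loan_status": [0, 0, 0, 0, 0, 1, 1, 1],
})
table = bucket_default_rate(toy, "dti", bins=4)
corr = numeric_correlations(toy, ["dti", "revol_util"])
print(table)
print(corr)
```

Inside the class, the same logic would read from `self.df` and `self.target_col` instead of taking them as arguments.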

---

## 2. Functional credit-history report

Add a **functional report generator** that coordinates several EDA steps:

```python
def credit_history_report(eda: CreditHistoryEDA) -> Dict[str, Any]:
    """
    Build a dict of step_name -> callable and run them to produce
    a combined report.

    Example steps:
      - "structure_summary": eda.credit_structure_summary
      - "dti_buckets": lambda: eda.default_rate_by_bucket("dti", bins=5)
      - "revol_util_buckets": lambda: eda.default_rate_by_bucket("revol_util", bins=5)
      - "correlation_with_default": eda.correlation_with_default

    Iterate over this dict, call each function, and return
    a result dict: step_name -> output.
    """
    ...
```

Example idea:

```python
def credit_history_report(eda: CreditHistoryEDA) -> Dict[str, Any]:
    steps: Dict[str, Callable[[], Any]] = {
        "structure_summary": eda.credit_structure_summary,
        "dti_buckets": lambda: eda.default_rate_by_bucket("dti", bins=5),
        "revol_util_buckets": lambda: eda.default_rate_by_bucket("revol_util", bins=5),
        "correlation_with_default": eda.correlation_with_default,
    }

    report: Dict[str, Any] = {}
    for name, func in steps.items():
        report[name] = func()
    return report
```

The implementation should clearly treat functions as values, the idea behind **higher-order functions**: callables are stored in a dict and called later.
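
The same pattern generalizes beyond this report. As a small, self-contained illustration (the `run_steps` and `describe` helpers and the toy steps are hypothetical, not part of the required API), `functools.partial` can stand in for a lambda when a step only needs some arguments fixed in advance:

```python
from functools import partial
from typing import Any, Callable, Dict, List


def run_steps(steps: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Call each zero-argument step and collect the results by name."""
    return {name: step() for name, step in steps.items()}


def describe(values: List[int], label: str) -> str:
    """Toy step that needs arguments, so it must be wrapped before storing."""
    return f"{label}: n={len(values)}, max={max(values)}"


values = [3, 1, 4, 1, 5]
steps = {
    "total": lambda: sum(values),                       # lambda defers the call
    "summary": partial(describe, values, label="toy"),  # partial pre-fills the arguments
}
results = run_steps(steps)
print(results)  # {'total': 14, 'summary': 'toy: n=5, max=5'}
```

Either form works for `credit_history_report`; the point is that nothing runs until the loop (or comprehension) calls each stored function.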

---

## 3. Create the demo notebook

In `notebooks/eda_credit_history_demo.ipynb`:

1. Load the dataset:

```python
import pandas as pd
from src.eda_credit_history import CreditHistoryEDA, credit_history_report

df = pd.read_csv("data/loan_sample.csv")  # adjust the path if your data lives elsewhere
```

2. Instantiate the EDA class:

```python
eda = CreditHistoryEDA(df, target_col="loan_status")
```

3. Run the report:

```python
report = credit_history_report(eda)
```

4. Display at least:

```python
report["structure_summary"]         # structure of all credit-history columns
report["dti_buckets"]               # default rate by DTI bucket
report["revol_util_buckets"]        # default rate by revol_util bucket
report["correlation_with_default"]  # correlation of each credit feature with default
```

Optionally add one bar plot (e.g., default rate by DTI bucket), but stay within EDA (no modeling).
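
If you add the optional plot, a bar chart of the bucket table is usually enough. A minimal sketch using matplotlib, with a hand-made table standing in for `report["dti_buckets"]` (the `plot_default_rate` helper is hypothetical, and the numbers are made up for illustration, not results from the dataset):

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_default_rate(bucket_df: pd.DataFrame, title: str):
    """Illustrative helper: bar chart of default_rate per bucket."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # Intervals are converted to strings so they can be used as category labels.
    ax.bar(bucket_df["bucket"].astype(str), bucket_df["default_rate"])
    ax.set_xlabel("bucket")
    ax.set_ylabel("default rate")
    ax.set_title(title)
    ax.tick_params(axis="x", labelrotation=45)
    fig.tight_layout()
    return ax


# Made-up numbers, for illustration only.
example = pd.DataFrame({
    "bucket": ["(0, 10]", "(10, 20]", "(20, 30]"],
    "n_loans": [120, 150, 90],
    "default_rate": [0.08, 0.12, 0.19],
})
ax = plot_default_rate(example, "Default rate by DTI bucket (toy data)")
```

In the notebook, passing `report["dti_buckets"]` in place of `example` should produce the equivalent chart for the real data.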

---

## Acceptance Criteria ✅

- `CreditHistoryEDA`:
  - Initializes correctly with a DataFrame.
  - `credit_structure_summary()` returns a DataFrame with one row per credit-history numeric column and basic stats.
  - `default_rate_by_bucket(col, bins)` returns a DataFrame with bucket, n_loans, and default_rate.
  - `correlation_with_default()` returns a Series of correlations.

- Functional report:
  - `credit_history_report(eda)` uses a dict of callables or similar FP construct.
  - It iterates over steps, calls each function, and returns a dict of results.

- Notebook:
  - Runs top-to-bottom without errors.
  - Shows structure summary, bucketed default-rate tables, and correlations.
  - Contains **only EDA** (no model training).
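
A quick way to self-check the report against these criteria is a handful of assertions on the returned dict. The sketch below uses a hypothetical `check_report` helper and a hand-built stand-in report, since the real outputs depend on your implementation:

```python
from typing import Any, Dict

import pandas as pd

EXPECTED_STEPS = [
    "structure_summary",
    "dti_buckets",
    "revol_util_buckets",
    "correlation_with_default",
]


def check_report(report: Dict[str, Any]) -> None:
    """Hypothetical helper: raise AssertionError if the report misses a criterion."""
    for name in EXPECTED_STEPS:
        assert name in report, f"missing step: {name}"
    assert isinstance(report["structure_summary"], pd.DataFrame)
    for name in ("dti_buckets", "revol_util_buckets"):
        cols = set(report[name].columns)
        assert {"bucket", "n_loans", "default_rate"} <= cols, f"{name} lacks expected columns"
    assert isinstance(report["correlation_with_default"], pd.Series)


# Stand-in report with toy values, for illustration only.
buckets = pd.DataFrame({"bucket": ["low", "high"], "n_loans": [3, 2], "default_rate": [0.0, 0.5]})
fake_report = {
    "structure_summary": pd.DataFrame({"column": ["dti"], "n_missing": [0]}),
    "dti_buckets": buckets,
    "revol_util_buckets": buckets,
    "correlation_with_default": pd.Series({"dti": 0.3}),
}
check_report(fake_report)  # passes silently when every check holds
```

Calling `check_report(report)` at the end of the notebook gives a cheap guard that it still satisfies the criteria after later edits.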

