This Enterprise Credit Card Fraud Detection System is a modular, production-ready Python application designed to identify fraudulent transactions in high-volume financial datasets. Unlike simple script-based approaches, this system handles the entire machine learning lifecycle: data ingestion, cleaning, complex feature engineering, model training with class imbalance handling, and automated reporting.
The system is architected into distinct modules within the src/ directory, ensuring separation of concerns and maintainability.
- Responsibility: Efficiently loads and sanitizes raw transaction logs.
- Key Logic:
  - Validates file existence and integrity.
  - Standardizes column names (correcting inconsistencies like `oldbalanceOrg` vs `newbalanceOrig`).
  - Handles missing values (though rare in PaySim).
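
The loading steps above could be sketched roughly as follows. This is a minimal illustration that assumes pandas and a CSV input. The function name `load_transactions`, the rename map, and the choice to fill missing numeric values with 0 are assumptions made for the example, not the project's actual code.

```python
from pathlib import Path

import pandas as pd

# Assumed rename map: PaySim spells the originator's old balance
# "oldbalanceOrg" but the new balance "newbalanceOrig".
COLUMN_RENAMES = {"oldbalanceOrg": "oldbalanceOrig"}


def load_transactions(path):
    """Load a transaction CSV, validate it, and standardize column names."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Transaction file not found: {path}")

    df = pd.read_csv(path)
    if df.empty:
        raise ValueError(f"Transaction file has no rows: {path}")

    # Make the balance column names consistent.
    df = df.rename(columns=COLUMN_RENAMES)

    # Assumption: missing numeric values are filled with 0.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(0)
    return df
```

Filling with 0 is only one possible policy; dropping incomplete rows or failing loudly may be more appropriate for audit-sensitive data.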
Feature engineering is the critical component, where domain knowledge is applied to extract signal from noise.
- Temporal Features: Extracts `hour_of_day` from the `step` column to capture time-based fraud patterns.
- Behavioral Features:
  - `errorBalanceOrig`: Calculates the discrepancy between the transaction amount and the change in the originator's balance. Fraudulent transactions often show zero balance changes or unexpected amounts.
  - `errorBalanceDest`: Similar calculation for the recipient's balance.
- Categorical Encoding: Converts transaction types (`CASH_OUT`, `TRANSFER`, etc.) into one-hot encoded variables (`type_CASH_OUT`, `type_TRANSFER`) for mathematical model compatibility.
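
As a rough sketch, the three feature groups above might be computed like this with pandas. The function name `add_features`, the sign convention of the two error columns, the use of `step % 24` (assuming one PaySim step is one hour), and the standardized column name `oldbalanceOrig` are assumptions for illustration.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with temporal, behavioral, and one-hot features."""
    out = df.copy()

    # Temporal: assumes each step is one hour, so step modulo 24 is the hour.
    out["hour_of_day"] = out["step"] % 24

    # Behavioral: zero when the balances move by exactly the amount.
    # The sign convention here is an assumption.
    out["errorBalanceOrig"] = (
        out["newbalanceOrig"] + out["amount"] - out["oldbalanceOrig"]
    )
    out["errorBalanceDest"] = (
        out["oldbalanceDest"] + out["amount"] - out["newbalanceDest"]
    )

    # Categorical: one column per transaction type, e.g. type_TRANSFER.
    return pd.get_dummies(out, columns=["type"], prefix="type", dtype=int)


# Two made-up rows for illustration (not taken from PaySim).
sample = pd.DataFrame(
    {
        "step": [1, 25],
        "type": ["TRANSFER", "CASH_OUT"],
        "amount": [181.0, 100.0],
        "oldbalanceOrig": [181.0, 500.0],
        "newbalanceOrig": [0.0, 400.0],
        "oldbalanceDest": [0.0, 50.0],
        "newbalanceDest": [0.0, 150.0],
    }
)
features = add_features(sample)
```

In this toy data the first row leaves the recipient's balance unchanged despite a 181.0 transfer, so its `errorBalanceDest` is non-zero, while the second row reconciles on both sides.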
- Algorithm: Random Forest Classifier.
- Why? Random Forest is robust to noise, handles non-linear relationships effectively, and provides feature importance visibility.
- Class Imbalance Handling:
  - The dataset is highly imbalanced (roughly 0.13% of PaySim transactions are fraud).
  - We use `class_weight='balanced'`, which automatically adjusts weights inversely proportional to class frequencies. This penalizes the model heavily for missing a fraud case, discouraging it from simply predicting "Legit" for every transaction.
- Persistence: Models are serialized using `joblib` to `models/fraud_model.pkl` for immediate deployment without retraining.
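
To make the weighting concrete, the sketch below reproduces the `balanced` heuristic used by scikit-learn, `n_samples / (n_classes * count_per_class)`, then trains a small weighted forest and round-trips it through `joblib`. The data is a synthetic stand-in rather than PaySim, and the helper name `balanced_class_weights`, the hyperparameters, and the temporary output path are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def balanced_class_weights(y):
    """Weights inversely proportional to class frequency."""
    counts = np.bincount(y)
    return len(y) / (len(counts) * counts)


# Synthetic stand-in data: 990 legitimate rows and 10 fraud rows.
rng = np.random.default_rng(0)
y = np.array([0] * 990 + [1] * 10)
X = rng.normal(size=(len(y), 3))
X[y == 1] += 3.0  # shift fraud rows so the toy problem is learnable

# The rare class receives a much larger weight than the common one.
weights = balanced_class_weights(y)

model = RandomForestClassifier(
    n_estimators=25, class_weight="balanced", random_state=0
)
model.fit(X, y)

# Persist and reload; a temp directory stands in for models/.
model_path = os.path.join(tempfile.mkdtemp(), "fraud_model.pkl")
joblib.dump(model, model_path)
restored = joblib.load(model_path)
```

With 10 fraud rows out of 1,000, the fraud class weight works out to 1000 / (2 * 10) = 50, against roughly 0.505 for the legitimate class.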
- Automated Reporting: Every training run generates:
  - Classification Report: Precision, Recall, and F1-score for both classes.
  - Confusion Matrix: Counts of true positives, false positives, true negatives, and false negatives.
  - ROC Curve: Plots the True Positive Rate against the False Positive Rate across decision thresholds.
- Audit Logging (`src/utils.py`): All actions are logged to `logs/` in JSON format for compliance and debugging.
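
One way the reporting and audit logging could fit together is sketched below, using scikit-learn's metric functions and the standard `logging` module. The names `JsonFormatter`, `build_audit_logger`, `summarize_run`, the logger name, and the log field names are all hypothetical; the actual contents of `src/utils.py` are not shown here.

```python
import json
import logging
import os
import tempfile
from datetime import datetime, timezone

from sklearn.metrics import classification_report, confusion_matrix, roc_curve


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Extra structured fields are passed via extra={"details": {...}}.
        payload.update(getattr(record, "details", {}))
        return json.dumps(payload)


def build_audit_logger(log_path):
    logger = logging.getLogger("fraud_audit")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.FileHandler(log_path)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def summarize_run(y_true, y_pred, y_score):
    """Collect the report, confusion matrix, and ROC points as plain types."""
    report = classification_report(
        y_true, y_pred, output_dict=True, zero_division=0
    )
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return {
        "fraud_precision": float(report["1"]["precision"]),
        "fraud_recall": float(report["1"]["recall"]),
        "fraud_f1": float(report["1"]["f1-score"]),
        "confusion": {"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp)},
        "roc_points": [[float(a), float(b)] for a, b in zip(fpr, tpr)],
    }


# A temp directory stands in for logs/ to keep the sketch side-effect free.
log_path = os.path.join(tempfile.mkdtemp(), "audit.jsonl")
logger = build_audit_logger(log_path)
summary = summarize_run([0, 0, 1, 1], [0, 1, 1, 1], [0.1, 0.6, 0.8, 0.9])
logger.info("training_run_evaluated", extra={"details": summary})
```

Writing one JSON object per line keeps the log easy to parse with standard tooling during an audit.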
The training pipeline (main.py --mode train) executes the following steps:
- Ingest: Load the 6.3M+ row dataset.
- Split: Perform a Stratified Train-Test Split (80/20). Stratification is crucial to ensure the minority class (fraud) appears in both sets in approximately the same proportion as in the full dataset.
- Scale: Apply `StandardScaler` to normalize numerical features (mean=0, variance=1).
- Train: Fit the Weighted Random Forest on the training set.
- Validate: Predict on the held-out test set and generate metrics.
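
The split, scale, train, and validate steps can be sketched end to end on synthetic data as shown below. The function name `run_training`, the hyperparameters, and the decision to fit the scaler on the training split only are assumptions for illustration; the real pipeline in `main.py` may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def run_training(X, y, seed=42):
    # Split: 80/20, stratified so both sets keep the fraud proportion.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Scale: fit on the training split only (assumed here), then apply to
    # both, so no statistics leak from the test set.
    scaler = StandardScaler().fit(X_train)
    X_train_s = scaler.transform(X_train)
    X_test_s = scaler.transform(X_test)

    # Train: weighted random forest.
    model = RandomForestClassifier(
        n_estimators=25, class_weight="balanced", random_state=seed
    )
    model.fit(X_train_s, y_train)

    # Validate: score the held-out set using the fraud-class probability.
    fraud_scores = model.predict_proba(X_test_s)[:, 1]
    return {
        "train_fraud_rate": float(y_train.mean()),
        "test_fraud_rate": float(y_test.mean()),
        "roc_auc": float(roc_auc_score(y_test, fraud_scores)),
    }


# Synthetic stand-in data: 2% fraud, shifted so the classes are separable.
rng = np.random.default_rng(0)
y = np.array([0] * 980 + [1] * 20)
X = rng.normal(size=(len(y), 4))
X[y == 1] += 2.5
result = run_training(X, y)
```

Comparing `train_fraud_rate` and `test_fraud_rate` in the returned dictionary is a quick check that stratification preserved the class balance.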
The model achieves exceptional performance on the PaySim dataset:
- ROC-AUC: ~0.999 (Near perfect)
- Precision (Fraud): 1.00 (No false positives in test set)
- Recall (Fraud): 1.00 (Caught all fraud in test set)
Note: These near-perfect scores reflect the synthetic nature of the dataset, where fraud patterns are deterministic. On real-world production data, expect somewhat lower scores (AUC ~0.95), but the architecture remains valid.
Run the model in prediction mode on past transaction logs to identify missed fraud cases.

    python main.py --mode predict --data old_logs.csv --model_path models/fraud_model.pkl

The `FraudDetector` class in `src/model.py` can be imported into a REST API (FastAPI/Flask) to score transactions in real time before approval.

    from src.model import FraudDetector

    # Load the serialized model once at startup.
    detector = FraudDetector()
    detector.load_model('models/fraud_model.pkl')

    # new_transaction_features holds the engineered features for the transaction(s) to score.
    prediction = detector.predict_proba(new_transaction_features)
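
Inside an API handler, the returned probability still has to be turned into a decision. The sketch below shows one possible shape for that step. It assumes `predict_proba` follows the scikit-learn convention of returning one `[p_legit, p_fraud]` row per input row, which the text above does not confirm for `FraudDetector`. The function `score_transaction`, the 0.5 threshold, and `StubDetector` are hypothetical.

```python
def score_transaction(detector, features, threshold=0.5):
    """Score one transaction and map the fraud probability to a decision.

    Assumes detector.predict_proba returns rows of [p_legit, p_fraud].
    """
    proba = detector.predict_proba([features])
    fraud_probability = float(proba[0][1])
    decision = "review" if fraud_probability >= threshold else "approve"
    return {"fraud_probability": fraud_probability, "decision": decision}


class StubDetector:
    """Stand-in for a loaded model; flags larger amounts as riskier."""

    def predict_proba(self, rows):
        result = []
        for row in rows:
            p_fraud = min(row[0] / 1000.0, 1.0)  # row[0] is the amount
            result.append([1.0 - p_fraud, p_fraud])
        return result


high = score_transaction(StubDetector(), [900.0])
low = score_transaction(StubDetector(), [100.0])
```

Keeping the threshold as a parameter lets the operating point be tuned to the cost of a missed fraud versus a blocked legitimate payment.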