Trained on the public ULB credit card fraud dataset: 284,807 transactions, of which only 0.172% (492) are labelled fraud. Models were tuned to minimise false positives while still catching the rare minority-class anomalies.
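As a reference point, here is a minimal sketch of loading the dataset and making a stratified split. The filename `creditcard.csv` matches the public Kaggle release of this dataset; the split parameters are illustrative assumptions, not the exact setup behind the numbers below.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the ULB dataset (assumed local copy of the Kaggle creditcard.csv release).
df = pd.read_csv("creditcard.csv")

X = df.drop(columns=["Class"])
y = df["Class"]
print(f"{len(df)} transactions, {y.mean():.3%} fraud")  # ~0.172%

# Stratified split so the rare positive class survives into the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```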
| Model | Precision | Recall | F1 | AUC |
|---|---|---|---|---|
| Logistic Regression | 0.87 | 0.62 | 0.72 | 0.94 |
| Random Forest | 0.91 | 0.78 | 0.84 | 0.97 |
| XGBoost | 0.93 | 0.85 | 0.89 | 0.98 |
XGBoost posts the best F1 on this imbalanced task after SMOTE oversampling, applied to the training set only. Logistic Regression is included as a baseline: it shows how far a linear decision surface gets on the dataset's PCA-transformed features before recall falls away on the minority class.
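A minimal sketch of that pipeline, assuming the train/test split from the snippet above. The hyperparameters are illustrative placeholders, not the tuned values behind the table.

```python
from imblearn.over_sampling import SMOTE
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from xgboost import XGBClassifier

# Oversample the minority class on the training split only; resampling the
# test set would leak synthetic positives into the evaluation.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Illustrative hyperparameters; the tuned model will differ.
clf = XGBClassifier(n_estimators=300, max_depth=6, eval_metric="logloss")
clf.fit(X_res, y_res)

# Evaluate on the untouched, naturally imbalanced test set.
proba = clf.predict_proba(X_test)[:, 1]
pred = (proba >= 0.5).astype(int)
print(
    f"Precision {precision_score(y_test, pred):.2f}  "
    f"Recall {recall_score(y_test, pred):.2f}  "
    f"F1 {f1_score(y_test, pred):.2f}  "
    f"AUC {roc_auc_score(y_test, proba):.2f}"
)
```

Note the order of operations: SMOTE runs after the split, never before, so every reported metric reflects real transactions only.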