FraudShield

XGBoost v0.9.3 · 20% holdout

Fraud is rare.
That makes it hard.

Trained on the ULB public credit card fraud dataset: 284,807 transactions, of which only 0.172% are labelled fraud. Tuned to minimise false positives while still catching as much of the rare fraud class as possible.
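A minimal sketch of that setup, assuming the standard ULB export (a creditcard.csv with columns Time, V1–V28, Amount and a binary Class label); the file name, column layout and random seed are assumptions, not confirmed by this page.

    # Sketch: stratified 20% holdout on the ULB credit card fraud data.
    # "creditcard.csv", its column layout and the seed are assumptions.
    import pandas as pd
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("creditcard.csv")
    X = df.drop(columns=["Class"])
    y = df["Class"]

    # Stratify so the ~0.172% fraud rate is preserved in the 20% holdout.
    X_train, X_hold, y_train, y_hold = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=42
    )
    print(len(X_hold), int(y_hold.sum()))  # holdout size and fraud count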

F1 Score: 0.87 · ROC-AUC: 0.98 · False Positive Rate: 0.02%

Dataset Composition

Fraud Rate: 0.17% · Legit: 284,315 · Fraud: 492

Transaction Analyser

Adjust PCA components to see live predictions
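Behind a panel like this, scoring is presumably a single predict_proba call on the trained booster. The sketch below shows the idea; the saved-model path, feature ordering and example slider values are illustrative assumptions, and the sklearn-style load_model call needs a reasonably recent xgboost.

    # Sketch: score one hand-built transaction with a trained XGBoost model.
    # "fraudshield.json", the feature order and the slider values are made up.
    import pandas as pd
    import xgboost as xgb

    model = xgb.XGBClassifier()
    model.load_model("fraudshield.json")

    features = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]
    row = pd.DataFrame([[0.0] * len(features)], columns=features)
    row.loc[0, "V14"] = -5.0         # move one PCA component, as a slider would
    row.loc[0, "Amount"] = 149.62

    proba = float(model.predict_proba(row)[0, 1])
    print(f"fraud probability: {proba:.4f}")
    print("flagged" if proba >= 0.50 else "clear")   # the page's t=0.50 threshold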

Model Comparison

Scores on stratified holdout (n=56,961)
Model                 Precision   Recall   F1     AUC
Logistic Regression   0.87        0.62     0.72   0.94
Random Forest         0.91        0.78     0.84   0.97
XGBoost               0.93        0.85     0.87   0.98

XGBoost wins on F1 for this class-imbalanced task after SMOTE oversampling of the training set. Logistic Regression is included as a baseline: it shows how far a linear decision surface gets on the PCA features before recall falls away.
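A sketch of that resampling step, continuing from the split above and using imbalanced-learn's SMOTE with k=5 as listed in the footer; the XGBoost hyperparameters are placeholders, not the project's actual settings.

    # Sketch: SMOTE on the training split only, then fit XGBoost.
    # Requires imbalanced-learn; hyperparameters are placeholders.
    from imblearn.over_sampling import SMOTE
    from xgboost import XGBClassifier

    smote = SMOTE(k_neighbors=5, random_state=42)
    X_res, y_res = smote.fit_resample(X_train, y_train)   # the holdout is never resampled

    clf = XGBClassifier(n_estimators=400, max_depth=6, learning_rate=0.1)
    clf.fit(X_res, y_res)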

Confusion Matrix

XGBoost @ threshold 0.50 — 56,961 holdout samples
                  Predicted Legit           Predicted Fraud
True Legit        56,850 (true negative)    12 (false positive)
True Fraud        15 (false negative)       85 (true positive)
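The counts above follow from thresholding the holdout probabilities at 0.50; a sketch, reusing clf and the holdout split from the earlier sketches:

    # Sketch: confusion matrix on the holdout at the page's 0.50 threshold.
    from sklearn.metrics import confusion_matrix

    proba_hold = clf.predict_proba(X_hold)[:, 1]
    pred_hold = (proba_hold >= 0.50).astype(int)

    tn, fp, fn, tp = confusion_matrix(y_hold, pred_hold).ravel()
    print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")
    print(f"FPR = {fp / (fp + tn):.3%}   recall = {tp / (tp + fn):.1%}")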

ROC Curve

TPR vs FPR · operating point at t=0.50
AUC = 0.98
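The curve and its marked operating point come from the same holdout probabilities; a sketch with scikit-learn:

    # Sketch: ROC curve, AUC, and the FPR/TPR at the t=0.50 operating point.
    import numpy as np
    from sklearn.metrics import roc_auc_score, roc_curve

    fpr, tpr, thresholds = roc_curve(y_hold, proba_hold)
    print(f"AUC = {roc_auc_score(y_hold, proba_hold):.2f}")

    idx = int(np.argmin(np.abs(thresholds - 0.50)))   # point nearest the 0.50 threshold
    print(f"operating point: FPR = {fpr[idx]:.4f}, TPR = {tpr[idx]:.2f}")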
Dataset: ULB 2013 public · Samples: 284,807 · Fraud: 492 (0.172%) · Resampling: SMOTE k=5 (train only) · Last trained: 2025-01-14