// applied machine learning · 2023

ML Research Projects

Financial fraud detection & loan default prediction

Two applied ML research projects using real-world financial datasets. Both involved heavy feature engineering, class imbalance handling, and systematic evaluation across multiple model architectures.

IEEE-CIS Fraud Detection dataset

Financial Fraud Detection

Binary classification on transaction-level data to identify fraudulent activity. The dataset contains anonymized transaction records with engineered behavioral and temporal features.

  • Feature engineering on transaction velocity, device fingerprints, and user behavior signals
  • Class imbalance handling: focal loss and per-class weighting
  • Evaluation across gradient boosting, neural networks, and ensemble methods
  • Threshold tuning for precision/recall trade-offs in a fraud-detection context
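
The threshold tuning step can be sketched as a sweep over candidate cut-offs on a validation set. The snippet below is an illustrative, dependency-free sketch, not the project's code: the helper names (`precision_recall_at`, `pick_threshold`), the `min_precision` constraint, and the toy labels and scores are all assumptions made for the example. It picks the threshold with the highest recall among those that keep precision above a chosen floor.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when flagging fraud for scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(y_true, scores, min_precision):
    """Return (threshold, precision, recall) with the highest recall among
    candidate thresholds whose precision is at least min_precision,
    or None if no candidate qualifies."""
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate
        p, r = precision_recall_at(y_true, scores, t)
        if p >= min_precision and (best is None or r > best[2]):
            best = (t, p, r)
    return best


# Toy validation set (made-up numbers): 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
result = pick_threshold(y_true, scores, min_precision=0.75)
print(result)
```

Framing the choice as "maximize recall subject to a precision floor" is one option among several; a cost-weighted objective would be an equally reasonable way to express the same trade-off.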

LendingClub loan dataset

Loan Default Prediction

Multi-class classification on loan applications to predict default risk. The features cover credit history, income ratios, and loan term characteristics.

  • Feature engineering on DTI ratios, credit utilization, and loan grade signals
  • TensorFlow training pipeline with configurable class weights
  • Evaluation across 8+ model architectures including LSTMs and dense networks
  • Calibration analysis for risk-scored output probabilities
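
A calibration analysis of this kind usually compares predicted probabilities with observed outcome rates, bin by bin. The following is a minimal sketch under stated assumptions: it scores only the predicted probability of a single class (here "default") as a binary outcome, uses equal-width bins, and the function names and toy data are hypothetical rather than taken from the project.

```python
def reliability_table(y_true, probs, n_bins=5):
    """Group predictions into equal-width probability bins and return, per
    non-empty bin, (mean predicted probability, observed positive rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for _, p in members) / len(members)
            observed = sum(y for y, _ in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows


def expected_calibration_error(y_true, probs, n_bins=5):
    """Count-weighted average gap between predicted and observed rates."""
    rows = reliability_table(y_true, probs, n_bins)
    total = sum(n for _, _, n in rows)
    return sum(n * abs(mp - obs) for mp, obs, n in rows) / total


# Toy data (made-up numbers): 1 = defaulted, 0 = repaid.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.9, 0.9]
rows = reliability_table(y_true, probs)
ece = expected_calibration_error(y_true, probs)
for mean_pred, observed, count in rows:
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

A well-calibrated risk score keeps the predicted and observed columns close in every bin. For a multi-class model, the same table can be built once per class.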

Methodology

Both projects used a consistent experimental framework: baseline model → feature engineering iteration → class imbalance intervention → architecture search → evaluation. The goal was not to find the best model on a leaderboard but to understand which interventions produced the largest marginal gains and why.

Class imbalance was handled in two ways: focal loss, which down-weights easy, well-classified examples (mostly the abundant negatives) during training, and explicit class weight schedules. Both approaches were compared across architectures.
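
To make the two interventions concrete, here is a plain-Python sketch of the binary focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), next to a simple inverse-frequency class weighting. It is an illustration only, not the TensorFlow training code: the function names are hypothetical, `gamma=2.0` and `alpha=0.25` are the commonly cited defaults rather than the projects' settings, and the static weights shown are an assumed starting point, not the weight schedules that were actually used.

```python
import math
from collections import Counter


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)            # avoid log(0)
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Made-up example: an easy negative versus a hard positive.
easy_negative = binary_focal_loss([0], [0.05], gamma=2.0, alpha=0.5)
hard_positive = binary_focal_loss([1], [0.20], gamma=2.0, alpha=0.5)
# gamma=0 removes the modulating factor, leaving alpha-scaled cross-entropy.
plain_ce = binary_focal_loss([0], [0.05], gamma=0.0, alpha=0.5)
weights = balanced_class_weights([0] * 95 + [1] * 5)
print(easy_negative, hard_positive, plain_ce, weights)
```

With `gamma=2.0`, the easy negative's loss is scaled by (1 - 0.95)^2 relative to plain cross-entropy, so confident correct predictions contribute very little gradient. Class weights instead rescale every example of a class by the same factor, regardless of how well it is classified.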

Stack

TensorFlow · Python · Jupyter · pandas · scikit-learn · IEEE-CIS · LendingClub · Feature Engineering