// applied machine learning · 2023
ML Research Projects
Financial fraud detection & loan default prediction
Two applied ML research projects using real-world financial datasets. Both involved heavy feature engineering, class imbalance handling, and systematic evaluation across multiple model architectures.
IEEE-CIS Fraud Detection dataset
Financial Fraud Detection
Binary classification on transaction-level data to identify fraudulent activity. The dataset contains anonymized transaction records with engineered behavioral and temporal features.
- Feature engineering on transaction velocity, device fingerprints, and user behavior signals
- Class imbalance handling with focal loss and per-class weighting
- Evaluation across gradient boosting, neural networks, and ensemble methods
- Threshold tuning for precision/recall trade-offs in fraud context
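The threshold tuning step can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: the function names `precision_recall_at` and `pick_threshold`, the selection rule (highest recall subject to a minimum precision), and the toy labels and scores are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def precision_recall_at(
    y_true: List[int], scores: List[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when every score >= threshold is flagged as fraud."""
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # nothing flagged: no false alarms
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pick_threshold(
    y_true: List[int], scores: List[float], min_precision: float
) -> Optional[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) with the highest recall among
    thresholds whose precision is at least min_precision, or None if none qualify.
    Candidate thresholds are the distinct observed scores (an assumption)."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall_at(y_true, scores, t)
        if p >= min_precision and (best is None or r > best[2]):
            best = (t, p, r)
    return best


# Toy data for illustration only; 1 = fraud, 0 = legitimate.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.80, 0.95]
print(pick_threshold(y_true, scores, min_precision=0.75))
```

Fixing a precision floor and maximizing recall is only one possible operating rule. A cost-weighted objective would be an equally reasonable choice when the cost of a missed fraud and the cost of a false alarm are known.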
LendingClub loan dataset
Loan Default Prediction
Multi-class classification on loan applications to predict default risk. Inputs include credit history features, income ratios, and loan term characteristics.
- Feature engineering on DTI ratios, credit utilization, and loan grade signals
- TensorFlow training pipeline with configurable class weights
- Evaluation across 8+ model architectures including LSTMs and dense networks
- Calibration analysis for risk-scored output probabilities
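One common way to run a calibration analysis is a reliability table plus expected calibration error (ECE). The sketch below assumes that approach; the source does not say which calibration metric was used. It also assumes a one-vs-rest view in which `probs` holds the predicted probability of the default class, and the function names and the equal-width binning are illustrative choices.

```python
from typing import List, Tuple


def reliability_table(
    y_true: List[int], probs: List[float], n_bins: int = 5
) -> List[Tuple[int, float, float]]:
    """Per non-empty bin: (count, mean predicted probability, observed positive rate)."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((y, p))
    table = []
    for members in bins:
        if not members:
            continue
        count = len(members)
        mean_pred = sum(p for _, p in members) / count
        observed = sum(y for y, _ in members) / count
        table.append((count, mean_pred, observed))
    return table


def expected_calibration_error(
    y_true: List[int], probs: List[float], n_bins: int = 5
) -> float:
    """Count-weighted average of |mean predicted - observed rate| over bins."""
    table = reliability_table(y_true, probs, n_bins)
    total = sum(count for count, _, _ in table)
    if total == 0:
        return 0.0
    return sum(count / total * abs(mean_pred - obs) for count, mean_pred, obs in table)
```

A low ECE means the risk scores can be read roughly as frequencies: among loans scored near 0.2, about one in five defaults. Rows of the reliability table where the predicted mean and the observed rate diverge show where the scores are over- or under-confident.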
Methodology
Both projects used a consistent experimental framework: baseline model → feature engineering iteration → class imbalance intervention → architecture search → evaluation. The goal was not to find the best model on a leaderboard but to understand which interventions produced the largest marginal gains and why.
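The marginal-gain comparison amounts to differencing a metric between consecutive stages of the pipeline. A minimal sketch follows, assuming one scalar validation metric per stage; the helper name `marginal_gains` and the scores are hypothetical placeholders, not results from either project.

```python
from typing import List, Tuple


def marginal_gains(stage_scores: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Gain of each stage over the stage before it, sorted largest first."""
    gains = [
        (name, score - prev_score)
        for (_, prev_score), (name, score) in zip(stage_scores, stage_scores[1:])
    ]
    return sorted(gains, key=lambda g: g[1], reverse=True)


# Placeholder scores for illustration only; NOT measured results.
stages = [
    ("baseline", 0.5),
    ("feature engineering", 0.75),
    ("class imbalance intervention", 0.875),
    ("architecture search", 0.9375),
]
for name, gain in marginal_gains(stages):
    print(f"{name}: +{gain}")
```

Because the stages are applied in a fixed order, each gain is conditional on everything before it. Reordering the interventions could change how the total improvement is attributed.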
Class imbalance was handled in two ways: focal loss, which down-weights well-classified examples during training (in these datasets, mostly easy negatives), and explicit class weight schedules. Both approaches were compared across architectures.
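Both interventions can be sketched in a few lines. The focal loss below follows the standard binary form, -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults `gamma=2.0` and `alpha=0.25` are commonly used values and are an assumption here, as is the inverse-frequency weighting rule; the source does not state the settings or the schedule actually used. This is plain Python for readability, not the TensorFlow implementation.

```python
import math
from collections import Counter
from typing import Dict, List


def focal_loss(
    y_true: List[int],
    probs: List[float],
    gamma: float = 2.0,
    alpha: float = 0.25,
    eps: float = 1e-7,
) -> float:
    """Mean binary focal loss. probs are predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)**gamma shrinks the loss of confidently correct examples
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


def balanced_class_weights(y_true: List[int]) -> Dict[int, float]:
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(y_true)
    n_samples = len(y_true)
    return {cls: n_samples / (len(counts) * k) for cls, k in counts.items()}
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the ordinary binary cross-entropy, which makes a convenient sanity check. A dictionary of per-class weights in this shape can be passed to the `class_weight` argument of Keras `Model.fit`.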