Engineering project
QuantLab
Research-to-production ML backtesting framework with leakage-safe validation and transaction-cost-aware evaluation.
Tags: Python · pandas · scikit-learn · XGBoost · Financial ML · Backtesting
Problem
Financial ML experiments are easy to overfit and hard to trust without careful temporal validation, cost assumptions, and baseline comparisons.
Current status
Portfolio skeleton with planned reproducible examples.
What I built
- Designed a pipeline from market data caching through features, labels, walk-forward splits, training, and backtesting.
- Added comparison points for simple baselines and ML-driven strategies.
- Structured reports around risk, turnover, transaction costs, and reproducibility.
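The market-data caching step could look like the following sketch. The `load_prices` name and the `fetch` callback are illustrative, not part of the current codebase, and CSV is used here only so the example runs without optional parquet dependencies:

```python
from pathlib import Path

import pandas as pd


def load_prices(symbol: str, cache_dir: str = "cache", fetch=None) -> pd.DataFrame:
    """Return price data for `symbol`, fetching and caching it on a cache miss.

    `fetch` is a user-supplied downloader: symbol -> DataFrame. Caching keeps
    repeated experiments reproducible and avoids re-downloading market data.
    """
    path = Path(cache_dir) / f"{symbol}.csv"
    if path.exists():
        # Cache hit: read the stored copy instead of calling the downloader.
        return pd.read_csv(path, index_col=0)
    df = fetch(symbol)
    path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(path)
    return df
```

A second call for the same symbol reads from disk and never touches the network, which is the property the pipeline relies on for reproducible reports.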
Architecture / system design
1. Market Data
2. Feature Engineering
3. Label Generation
4. Walk-Forward Split
5. Model Training
6. Backtest
7. Metrics / Report
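The walk-forward split stage (step 4) could be sketched as a small generator. The `walk_forward_splits` name and its parameters are illustrative, not an existing API; the `gap` parameter skips rows between train and test so forward-looking labels computed at the end of training cannot overlap the test window:

```python
import numpy as np


def walk_forward_splits(n_samples: int, train_size: int, test_size: int, gap: int = 0):
    """Yield (train_idx, test_idx) index pairs that only ever look forward in time.

    Each split trains on a contiguous window, skips `gap` rows, then tests on
    the next `test_size` rows; the window then advances by `test_size`.
    """
    start = 0
    while start + train_size + gap + test_size <= n_samples:
        train = np.arange(start, start + train_size)
        test = np.arange(start + train_size + gap,
                         start + train_size + gap + test_size)
        yield train, test
        start += test_size
```

Every test index is strictly later than every train index, which is the invariant a leakage-safe backtest depends on. scikit-learn's `TimeSeriesSplit` offers similar behavior with an expanding window, if a library implementation is preferred.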
Technical highlights
- Validation design is treated as part of the system architecture.
- Transaction costs and risk metrics are included early instead of after model selection.
- Reports are built to make failed experiments useful rather than invisible.
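As a sketch of what "costs included early" can mean in practice, a hypothetical `net_returns` helper charges turnover at a fixed basis-point rate before any model comparison happens, so a high-churn strategy cannot look good on gross returns alone:

```python
import numpy as np


def net_returns(gross_returns, positions, cost_bps: float = 10.0):
    """Subtract turnover-proportional transaction costs from strategy returns.

    `positions` is the target position per period (e.g. -1/0/+1). Turnover is
    the absolute change in position, charged at `cost_bps` basis points.
    """
    positions = np.asarray(positions, dtype=float)
    # Absolute position change per period; prepend 0 so entering the first
    # position is also charged.
    turnover = np.abs(np.diff(positions, prepend=0.0))
    costs = turnover * cost_bps / 1e4
    return np.asarray(gross_returns, dtype=float) * positions - costs
```

Running risk metrics (Sharpe, drawdown) on these net series rather than the gross ones is what keeps cost assumptions inside model selection instead of after it.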
Future work
- Add public toy datasets and deterministic example reports.
- Compare tree-based models, linear baselines, and simple rule-based strategies.
- Document leakage checks and experiment review criteria.
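A minimal leakage check along the lines of the last bullet might assert that training label windows never reach into the test period. The `assert_no_overlap` helper is hypothetical, assuming labels look `label_horizon` forward in time:

```python
import pandas as pd


def assert_no_overlap(train_times, test_times, label_horizon: pd.Timedelta) -> None:
    """Raise if any training label window overlaps the test period.

    With labels computed `label_horizon` forward, the last training timestamp
    plus the horizon must still precede the first test timestamp.
    """
    last_train = max(train_times)
    first_test = min(test_times)
    if last_train + label_horizon >= first_test:
        raise ValueError(
            f"leakage: train label window ends at {last_train + label_horizon}, "
            f"test starts at {first_test}"
        )
```

Checks like this can run inside the split generator or as pytest assertions over every saved experiment, so a leaking configuration fails loudly instead of producing an optimistic report.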
Tech stack
Python, pandas, NumPy, scikit-learn, XGBoost, Matplotlib, pytest
Demo / screenshots
Example reports will use public or synthetic data only.
Resume bullet draft
- Built a financial ML backtesting framework with data caching, feature engineering, walk-forward validation, transaction costs, and reproducible reports.
- Compared baseline and ML strategies on risk-aware metrics, using walk-forward validation to limit leakage and overfitting risk.