
2025

Machine Learning Library (From Scratch)

NumPy-only machine learning and gradient-boosted tree library.

  • Python
  • NumPy
  • Machine Learning
  • XGBoost (inspiration)

Overview

This project focused on building a machine learning library entirely from scratch using only NumPy, with the goal of deeply understanding model optimization and learning dynamics.

Implemented Models

I implemented linear regression and logistic regression without relying on external ML frameworks; training, loss computation, and the optimization routines are written directly on top of NumPy array operations.
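
The sketch below shows the kind of NumPy-only training loop this involves: binary logistic regression fit with full-batch gradient descent. The class name, hyperparameters, and defaults are illustrative assumptions, not the library's actual API.

```python
import numpy as np

class LogisticRegressionGD:
    """Binary logistic regression trained with full-batch gradient descent.

    Illustrative sketch only; names and hyperparameters are assumptions,
    not the library's actual interface.
    """

    def __init__(self, lr=0.1, n_iters=1000):
        self.lr = lr
        self.n_iters = n_iters
        self.w = None
        self.b = 0.0

    @staticmethod
    def _sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.w = np.zeros(n_features)
        for _ in range(self.n_iters):
            # Forward pass: predicted probabilities.
            p = self._sigmoid(X @ self.w + self.b)
            # Gradients of the mean binary cross-entropy loss.
            error = p - y
            grad_w = X.T @ error / n_samples
            grad_b = error.mean()
            # Gradient descent update.
            self.w -= self.lr * grad_w
            self.b -= self.lr * grad_b
        return self

    def predict_proba(self, X):
        return self._sigmoid(X @ self.w + self.b)

    def predict(self, X):
        return (self.predict_proba(X) >= 0.5).astype(int)
```

Linear regression follows the same pattern, with the sigmoid removed and the squared-error gradient in place of the cross-entropy gradient.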

Beyond basic models, I designed and implemented a custom gradient-boosted decision tree algorithm inspired by XGBoost, including tree construction and boosting logic.
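
To keep the example short, the sketch below swaps in plain gradient boosting for squared-error loss with depth-1 stumps as weak learners; the actual library builds deeper trees and uses XGBoost-style split gains, but the core boosting loop, fitting each new tree to the current residuals, is the same idea. All names here are illustrative.

```python
import numpy as np

class Stump:
    """Depth-1 regression tree (decision stump) fit to residuals."""

    def fit(self, X, residuals):
        # Fallback: predict the mean if no useful split is found.
        self.feature, self.threshold = 0, -np.inf
        self.left_val = self.right_val = residuals.mean()
        best_sse = np.inf
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                left = X[:, j] <= thr
                right = ~left
                if left.sum() == 0 or right.sum() == 0:
                    continue
                left_val = residuals[left].mean()
                right_val = residuals[right].mean()
                sse = (((residuals[left] - left_val) ** 2).sum()
                       + ((residuals[right] - right_val) ** 2).sum())
                if sse < best_sse:
                    best_sse = sse
                    self.feature, self.threshold = j, thr
                    self.left_val, self.right_val = left_val, right_val
        return self

    def predict(self, X):
        go_left = X[:, self.feature] <= self.threshold
        return np.where(go_left, self.left_val, self.right_val)

class GradientBoostedTrees:
    """Gradient boosting for squared-error loss with stumps as weak learners."""

    def __init__(self, n_estimators=100, learning_rate=0.1):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.trees = []

    def fit(self, X, y):
        self.base_prediction = y.mean()
        pred = np.full(y.shape, self.base_prediction)
        for _ in range(self.n_estimators):
            # Negative gradient of squared error is just the residual.
            residuals = y - pred
            tree = Stump().fit(X, residuals)
            self.trees.append(tree)
            pred += self.learning_rate * tree.predict(X)
        return self

    def predict(self, X):
        pred = np.full(X.shape[0], self.base_prediction)
        for tree in self.trees:
            pred += self.learning_rate * tree.predict(X)
        return pred
```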

Evaluation and Data

The models were evaluated using decades of historical NBA and NFL game data. Performance was analyzed not only using standard predictive metrics, but also through expected-value evaluation against historical betting lines.
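
As a rough illustration of the expected-value framing (not the project's exact evaluation code), the snippet below scores model win probabilities against historical American moneyline odds and flags the bets the model considers positive-EV.

```python
import numpy as np

def moneyline_payout(odds):
    """Profit per unit stake for a winning bet at American moneyline odds."""
    odds = np.asarray(odds, dtype=float)
    return np.where(odds > 0, odds / 100.0, 100.0 / np.abs(odds))

def expected_value(p_win, odds):
    """Expected profit per unit stake, given a model win probability and
    the historical moneyline. Illustrative sketch; the project's actual
    evaluation procedure may differ."""
    payout = moneyline_payout(odds)
    return p_win * payout - (1.0 - p_win)

# Hypothetical example inputs.
p_win = np.array([0.62, 0.48, 0.55])
odds = np.array([-150, +120, +105])
ev = expected_value(p_win, odds)
print(ev)                # per-bet expected profit per unit staked
print(ev[ev > 0].sum())  # total EV of the positive-EV bets only
```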

This framing allowed the project to connect classical supervised learning performance with real-world decision-making utility.

What I Learned

Building learning algorithms from first principles significantly improved my intuition for optimization, bias–variance trade-offs, and ensemble behavior. It also reinforced the importance of data quality and feature design when deploying predictive systems in noisy, real-world domains.
