Why Gradient Boosting Over Deep Learning for NBA Predictions
Research consistently shows that gradient-boosted decision trees (GBDT) outperform neural networks on tabular data. Here's why I chose scikit-learn over PyTorch for all 12 prediction models.
The Architecture Decision
This is the single most important ML architecture decision in Hoop Almanac. The NBA data is tabular/structured (player stats, game logs — rows and columns), not images or text sequences. Research (Grinsztajn et al., NeurIPS 2022) shows GBDT outperforms neural nets on tabular data.
Why GBDT Wins
Feature engineering matters more than architecture here: hand-crafted features (rolling windows, interaction terms, cumulative fatigue scores) already extract most of the signal. The dataset is small (~600 players, ~1,200 games per season), so deep learning overfits. SHAP values make every prediction interpretable. And the operational story is simple: models train in seconds on CPU and serve in milliseconds via joblib, with no GPU needed.
When I'd Use PyTorch
I'd reach for PyTorch for images (CNNs), sequences (Transformers), millions of rows with high-cardinality categoricals (learned embeddings), or transfer learning. None of those apply to structured NBA stats.