Why Gradient Boosting Over Deep Learning for NBA Predictions
Research consistently shows that gradient-boosted decision trees (GBDT) outperform neural networks on tabular data. Here's why I chose scikit-learn over PyTorch for all 12 prediction models.
The Architecture Decision
This is the single most important ML architecture decision in Hoop Almanac. The NBA data is tabular/structured (player stats, game logs — rows and columns), not images or text sequences. Research (Grinsztajn et al., NeurIPS 2022) shows GBDT outperforms neural nets on tabular data.
Why GBDT Wins
Feature engineering matters more than architecture here: hand-crafted features (rolling windows, interaction terms, cumulative fatigue scores) already extract most of the signal. The dataset is small (~600 players, ~1,200 games per season), so deep learning overfits. SHAP values make every prediction interpretable. And the operational story is simple: models train in seconds on CPU and serve in milliseconds via joblib, with no GPU needed.
When I'd Use PyTorch
I'd reach for PyTorch for images (CNNs), sequences (Transformers), millions of rows with high-cardinality categoricals (learned embeddings), or transfer learning. None of those apply to structured NBA stats.