- One of the biggest problems in predictive modeling is the conflation between classic hypothesis testing with careful model specification vis-a-vis pure data mining.
- “The key bottleneck is that most data comparison algorithms today rely on a human expert to specify what ‘features’ of the data are relevant for comparison.
- there are many viable approaches to model building that leverage, e.g., Lasso, LAR, stepwise algorithms or “elephant models” that use all of the available information. The reality is that, even with AWS or a supercomputer, you can’t use all of the available information at the same time