Web-based forecasting tool and model building tips
We get some tips on model building (specifically variable selection) and then also get our hands on a very sloppily made forecasting web tool.
I hope your week is going well!
New articles
Actor-Motivation Forecasting Web Tool
If you belong to the tiny intersection of people who (a) follow me, (b) have read some Bueno de Mesquita, and (c) are interested in trying to use his forecasting methods yourself (e.g. to win at office politics), I created a web-based tool based on my best interpretation of how Bueno de Mesquita did things throughout the '80s and maybe early '90s.
If you belong to the even narrower intersection of people who fulfill all the criteria above and (d) understand his forecasting methods even better than me, I'd like voluminous feedback on every misunderstanding built into the tool. I'm fairly certain it's wrong, because its predictions differ slightly from the few worked examples Bueno de Mesquita has published.
Full article (1–4 minute read): Actor-Motivation Forecasting Web Tool
Flashcard of the week
This is from one of the many books on regression I've read. Probably Applied Logistic Regression. (Good book!)
When selecting variables to include in the preliminary model, the authors suggest not fitting all variables against the response at once, but fitting them individually, one by one, and then picking those with significant coefficients. Why is that?
There's also a bonus question related to this, but not on the same flashcard: the threshold the authors select for significance in this process is p<=0.25, rather than the more common p<=0.05. Why is that?
The answer to the first question is
Because when the model is overfit, the Wald significance of coefficients goes all over the place and no longer serves as a useful guide to predictive power.
The answer to the bonus question is that p<=0.05 is too strict a rule – it tends to discard variables that are known to have predictive power.
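For the curious, here is a minimal sketch of that one-by-one screening. It is not taken from the book: the data is simulated, the logistic fit is a hand-rolled Newton-Raphson for a single predictor plus intercept, and the names fit_univariable_logit, strong and noise are made up for illustration. In practice you would likely reach for a statistics library instead.

```python
import math
import random


def fit_univariable_logit(x, y, iterations=25):
    """Fit y ~ b0 + b1*x by Newton-Raphson; return (b1, two-sided Wald p-value)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # score for the intercept
            g1 += (yi - p) * xi   # score for the slope
            h00 += w              # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Wald test: slope divided by its standard error, compared to a normal.
    # The information matrix is from the last iteration, fine once converged.
    se = math.sqrt(h00 / det)
    z = b1 / se
    return b1, math.erfc(abs(z) / math.sqrt(2))


# Simulated data (an assumption for this example): the response depends
# on "strong" only, while "noise" is unrelated to it.
random.seed(1)
n = 500
strong = [random.gauss(0, 1) for _ in range(n)]
noise = [random.gauss(0, 1) for _ in range(n)]
y = [1 if random.random() < 1 / (1 + math.exp(-s)) else 0 for s in strong]

# Fit each candidate on its own and keep those passing the lenient threshold.
candidates = {"strong": strong, "noise": noise}
p_values = {name: fit_univariable_logit(x, y)[1] for name, x in candidates.items()}
kept = [name for name, p in p_values.items() if p <= 0.25]
print(p_values, kept)
```

Note the flip side of the lenient threshold: a pure-noise variable will pass p<=0.25 roughly a quarter of the time, so this step only produces a preliminary shortlist for the later steps to prune.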
The authors then go on to list several more steps for systematic yet pragmatic model building. These steps are universally useful – not just for logistic regression. I do recommend reading that book if you've ever sat with some data in front of you and not known where to begin.
Premium newsletter
I have so much going on in my life that I'm still not finished with the next premium newsletter. Sorry! It will contain a list of the 20 best technical books I've read in the past five years – as well as how one can possibly approach creating such a list without having to manually create a global book ranking, because that would suck.
If this sounds interesting, you should upgrade to a premium subscription! If you upgrade now, you will pay only $2/month – as interest increases and I learn to hit my stride with publishing these, the price will be set higher for future subscribers. As always, you can cancel at any time.
To upgrade, click the subscription link at the top of this newsletter and fill in your email again.
Your opinions
I cannot improve without feedback. Reply to this email to share your thoughts on any of the topics above, or anything else!