Improving CausalImpact with a BSTS custom forecast
CausalImpact is an R package that helps analyse the causal effect of an event on a metric of interest.
It is often used by media analysts trying to pinpoint the uplift caused by campaigns or by SEOs wanting to prove their impact to clients.
Mark Edmondson has made a tool called GA Effect where you can try it online with your Google Analytics data.
The output of the analysis can be visualised like this:
- The first plot shows the actual data (black line) and a forecast of what would have happened, based on the historical data before the vertical dotted line.
- The key assumption is that any difference between the forecast and the actuals is because of the change that happened on the day marked by the dotted line.
- The following two charts show the size of the daily effect and the cumulative overall effect.
- The pale blue ribbon is a predictive interval. If this is entirely above or below zero then people will reject the hypothesis that there was no change and conclude that there was a positive/negative impact.
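For orientation, a minimal run with the default model looks something like this. The data is simulated purely for illustration; the day numbers and effect size are made up:

library(CausalImpact)

set.seed(42)
y <- 50 + cumsum(rnorm(100, sd = 0.5))   # a gently wandering baseline
y[71:100] <- y[71:100] + 5               # add an uplift after day 70

pre.period  <- c(1, 70)
post.period <- c(71, 100)

impact <- CausalImpact(y, pre.period, post.period)
plot(impact)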
However, in real life the results don't always look this clear. Sometimes you end up with something like this:
- The dotted blue line in the first plot doesn't track the black line at all; this shows that the CausalImpact forecast is poor at predicting what happened in the past.
- Eyeballing the data shows there was a strong negative trend starting on day 30 and continuing until the intervention on day 75.
- It looks as if the intervention had a positive effect.
- But the model appears to be comparing the post-intervention period against the average of the whole pre-intervention period, rather than recognising the negative trend that ran right up to the intervention.
- As a result, the cumulative difference between the forecast and the actuals isn't large enough for the model to conclude that the intervention caused a positive change.
To me, this looks like a false negative; the intervention has had a positive effect and the CausalImpact test is wrong to say there is no effect.
It is particularly concerning that there seems to be a very poor match between the forecast and the actual values during the pre-intervention period.
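To make the scenario concrete, a series with roughly this shape (flat, then a negative trend from day 30 up to the intervention on day 75, then a step up) can be simulated like this; all the numbers are made up for illustration:

set.seed(7)
n <- 110
y <- numeric(n)
y[1:30]  <- 100                        # flat baseline
y[31:75] <- 100 - 0.8 * (1:45)         # negative trend up to the intervention
y[76:n]  <- y[75] + 10                 # positive step at the intervention
y <- y + rnorm(n, sd = 2)              # observation noise

Feeding a series like this into CausalImpact(y, c(1, 75), c(76, n)) with the default model is the kind of situation where this problem shows up.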
You can get better results by using a custom forecast.
The default forecast in CausalImpact is a Local Level forecast. You can see this in the code on GitHub.
Local Level forecasts are very simple: the model says the next value in the series is equal to the previous value plus some (small) random amount.
This is also known as a "random walk plus noise" model.
The good thing about this model is that there are very few assumptions and, at least for short-range predictions, it can give good results for a very wide range of time series.
But the downside is that the simplicity and lack of structure in the model mean that it misses things like trends in the data, and this can "fool" it into false negative results, as seen above.
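With no covariates, this is roughly the model CausalImpact builds for you, and you can fit the same thing directly with bsts. A sketch, assuming your pre-intervention response is in a vector y:

library(bsts)

ss <- AddLocalLevel(list(), y)                  # random walk plus noise
local.level.model <- bsts(y, ss, niter = 1000)
plot(local.level.model)                         # inspect the fit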
One solution is to make a custom BSTS model that better fits your data in the pre-intervention period. The CausalImpact docs have a section on how to do this.
Which model to use? I like to start with an AutoAr component because it is less flexible in how much it allows the trend to vary; this gives a narrower credible interval for longer-term forecasts.
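The snippets below assume you have already prepared a pre.period.response vector. One way to set this up, following the pattern in the CausalImpact docs (the day numbers here are illustrative), is to take the full series and blank out the post-intervention values so that bsts will forecast them:

# y is the full observed series; the intervention happens on day 76 (illustrative)
post.period <- c(76, length(y))
post.period.response <- y[post.period[1]:post.period[2]]

pre.period.response <- y
pre.period.response[post.period[1]:post.period[2]] <- NA  # bsts will forecast these points

With that in place, add the AutoAr component: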
ss <- AddAutoAr(list(), pre.period.response)
You can also add a static intercept with this kind of forecast:
ss <- AddStaticIntercept(ss, pre.period.response)
Then fit and plot the results like the documentation says.
bsts.model <- bsts(pre.period.response, ss, niter = 1000)
impact <- CausalImpact(bsts.model = bsts.model,
                       post.period.response = post.period.response)
plot(impact)
Much better!
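Beyond the plot, CausalImpact can also give you a numerical summary and an automatically generated written report of the estimated effect:

summary(impact)            # tabular summary of the estimated effect
summary(impact, "report")  # plain-English description of the result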
AutoAr isn't necessarily the best forecast you can use in all circumstances; it just works with this example. For serious analysis you should aim to build the best forecast you can on the pre-intervention data and only then run it through CausalImpact to see the results for the post-intervention period.
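One way to do that comparison is bsts's CompareBstsModels, which plots cumulative absolute one-step-ahead prediction errors for each candidate model fitted to the same data. A sketch, assuming y.pre holds just the observed pre-intervention values:

y.pre <- pre.period.response[!is.na(pre.period.response)]  # observed pre-intervention values only

ss1 <- AddLocalLevel(list(), y.pre)
local.level.model <- bsts(y.pre, ss1, niter = 1000)

ss2 <- AddStaticIntercept(AddAutoAr(list(), y.pre), y.pre)
autoar.model <- bsts(y.pre, ss2, niter = 1000)

CompareBstsModels(list("Local level" = local.level.model,
                       "AutoAr + intercept" = autoar.model))

The model whose cumulative prediction error grows slowest is the one doing the best job of tracking the pre-intervention data.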
Another way to solve this problem is through the skillful use of regression columns. These add extra data to help the model make predictions. If you have some other metric that is closely correlated with the one you are interested in, and that should be unaffected by the intervention, then you can use it to do a better analysis.
This is how some SEO split test tools work; half the pages get the intervention and the traffic to the other half is used as a regressor. If the pages are split randomly between the two groups then there should be a very good correlation between traffic to the test and control groups in the pre-intervention period. This makes it easier for the algorithm to find real changes in performance.
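A sketch of that setup, assuming y is traffic to the test pages and x.control is traffic to the control pages (both names are illustrative): put the response in the first column and the control series in the following column(s), and CausalImpact treats them as regressors:

data <- cbind(y, x.control)        # column 1 = response, remaining columns = covariates
pre.period  <- c(1, 75)            # illustrative day ranges
post.period <- c(76, nrow(data))

impact <- CausalImpact(data, pre.period, post.period)
plot(impact)

The regressor must not itself be affected by the intervention, which is exactly why a randomly selected control group works well here.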