Statistics

Analysis strategies in online A/B experiments: Intention-to-treat, per-protocol, and other lessons from clinical trials

Epidemiologists analyse clinical trials to estimate the intention-to-treat and per-protocol effects. This post applies their strategies to online experiments.

Many is not enough: Counting simulations to bootstrap the right way

A deeper look at how to correctly test different methods for bootstrap estimation of confidence intervals.

Bootstrapping the right way?

Video and summary of a talk I gave at YOW! Data on bootstrap estimation of confidence intervals.

Hackers beware: Bootstrap sampling may be harmful

Bootstrap sampling has been promoted to hackers without much statistical knowledge as an easy way of modelling uncertainty. But things aren’t that simple.

The most practical causal inference book I’ve read (is still a draft)

Causal Inference by Miguel Hernán and Jamie Robins is a must-read for anyone interested in the area.

Defining data science in 2018

Updating my definition of data science to match changes in the field. It is now broader than before, but its ultimate goal is still to support decisions.

Customer lifetime value and the proliferation of misinformation on the internet

There’s a lot of misleading content on the estimation of customer lifetime value. Here’s what I learned about doing it well.

If you don’t pay attention, data can drive you off a cliff

Seven common mistakes to avoid when working with data, such as ignoring uncertainty and confusing observed and unobserved quantities.

Is Data Scientist a useless job title?

It seems like anyone who touches data can call themselves a data scientist, which makes the title useless. The work they do can still be useful, though.

Making Bayesian A/B testing more accessible

A web tool I built to interpret A/B test results in a Bayesian way, including prior specification, visualisations, and decision rules.