Introducing pipe, The Automattic Machine Learning Pipeline

This is one of the main projects I’ve been working on over the past year.

Data for Breakfast

A generalized machine learning pipeline, pipe serves the entire company and helps Automatticians seamlessly build and deploy machine learning models to predict the likelihood that a given event may occur, e.g., installing a plugin, purchasing a plan, or churning.

A team effort, pipe provides general, long-term, and robust solutions to common or important problems our product and marketing teams face. When I first joined Automattic almost exactly three years ago, my tasks were two-fold:

  1. I had the autonomy and freedom to delve deep into topics of my choice, which at the time revolved around uncovering the networks hiding within our communities using network science.
  2. But like most data scientists in the industry, most of my time was spent serving product and marketing teams by providing answers to their data questions which ranged from running simple SQL-like queries to doing more in-depth — but one-off — statistical analyses.

We soon…
