Google's Rules of Machine Learning still apply in the age of large language models

Despite the excitement around large language models, building with machine learning remains an engineering problem with established best practices.

September 21, 2023

Was data science a failure mode of software engineering?

Yes, data science projects have suffered from classic software engineering mistakes, but the field is maturing with the rise of new engineering roles.

June 30, 2023

How hackable are automated coding assessments?

Exploring the hackability of speed-based coding tests, using CodeSignal’s Industry Coding Framework as a case study.

May 26, 2023

Building useful machine learning tools keeps getting easier: A fish ID case study

Lessons learned building a fish ID web app with and Streamlit, in an attempt to reduce my fear of missing out on the latest deep learning developments.

March 20, 2022

My work with Automattic

Back-dated meta-post that gathers my posts on Automattic blogs into a summary of the work I’ve done with the company.

October 7, 2021

Software commodities are eating interesting data science work

Being a data scientist can sometimes feel like a race against software commodities that replace interesting work. What can one do to remain relevant?

January 11, 2020

Bootstrapping the right way?

Video and summary of a talk I gave at YOW! Data on bootstrap estimation of confidence intervals.

October 6, 2019

Hackers beware: Bootstrap sampling may be harmful

Bootstrap sampling has been promoted as an easy way of modelling uncertainty to hackers without much statistical knowledge. But things aren’t that simple.

January 7, 2019

Exploring and visualising Reef Life Survey data

Web tools I built to visualise Reef Life Survey data and assist citizen scientists in underwater visual census work.

June 3, 2017

Is Data Scientist a useless job title?

It seems like anyone who touches data can call themselves a data scientist, which makes the title useless. The work they do can still be useful, though.

August 4, 2016

Migrating a simple web application from MongoDB to Elasticsearch

Migrating BCRecommender from MongoDB to Elasticsearch made it possible to offer a richer search experience to users at a similar cost, among other benefits.

November 4, 2015

The wonderful world of recommender systems

Giving an overview of the field and common paradigms, and debunking five common myths about recommender systems.

October 2, 2015


Migrating my web apps away from due to reliability issues. Self-hosting is a better solution.

July 31, 2015

First steps in data science: author-aware sentiment analysis

I became a data scientist by doing a PhD, but the same steps can be followed without a formal education program.

May 2, 2015

Automating bulk data imports

A script for importing data into the Parse backend-as-a-service.

January 15, 2015

What is data science?

Data science has been a hot term in the past few years. Still, there isn’t a single definition of the field. This post discusses my favourite definition.

October 23, 2014

Building a recommender system on a shoestring budget (or: BCRecommender part 2 – general system layout)

Iterating on my BCRecommender service with the goal of keeping costs low while providing a valuable music recommendation service.

September 7, 2014