Software Engineering

Keep learning: Your career is never truly done

Podcast chat on my career journey from software engineering to data science and independent consulting.

Stay alert! Security is everyone's responsibility

Questions to assess the security posture of a startup, focusing on basic hygiene and handling of sensitive data.

Is your tech stack ready for data-intensive applications?

Questions to assess the quality of tech stacks and lifecycles, with a focus on artificial intelligence, machine learning, and analytics.

Dealing with endless data changes

Quotes from Demetrios Brinkmann on the relationship between MLOps and DevOps: MLOps extends DevOps to manage the changes that come from data.

How to avoid startups with poor development processes

Questions that prospective data specialists and engineers should ask about development processes before accepting a startup role.

Assessing a startup's data-to-AI health

Reviewing the areas that should be assessed to determine a startup’s opportunities and challenges on the data/AI/ML front.

AI does not obviate the need for testing and observability

It’s easy to prototype with AI, but production-grade AI apps require even more thorough testing and observability than traditional software.

Questions to consider when using AI for PDF data extraction

Discussing considerations that arise when attempting to automate the extraction of structured data from PDFs and similar documents.

Avoiding AI complexity: First, write no code

Two stories of getting AI functionality to production, which demonstrate the risks inherent in custom development versus starting with a no-code approach.

Nudging ChatGPT to invent books you have no time to read

Getting ChatGPT Plus to elaborate on possible book content and produce a PDF cheatsheet, with the goal of learning about its capabilities.

Future software development may require fewer humans

Reflecting on an interview with Jason Warner, CEO of poolside.

Supporting volunteer monitoring of marine biodiversity with modern web and data tools

Summarising the work Uri Seroussi and I did to improve Reef Life Survey’s Reef Species of the World app.

You don't need a proprietary API for static maps

For many use cases, libraries like cartopy are better than the likes of Mapbox and Google Maps.

Lessons from reluctant data engineering

Video and summary of a talk I gave at DataEngBytes Brisbane on what I learned from doing data engineering as part of every data science role I had.

Google's Rules of Machine Learning still apply in the age of large language models

Despite the excitement around large language models, building with machine learning remains an engineering problem with established best practices.

Was data science a failure mode of software engineering?

Yes, data science projects have suffered from classic software engineering mistakes, but the field is maturing with the rise of new engineering roles.

How hackable are automated coding assessments?

Exploring the hackability of speed-based coding tests, using CodeSignal’s Industry Coding Framework as a case study.

Building useful machine learning tools keeps getting easier: A fish ID case study

Lessons learned building a fish ID web app with fast.ai and Streamlit, in an attempt to reduce my fear of missing out on the latest deep learning developments.

My work with Automattic

Back-dated meta-post that gathers my posts on Automattic blogs into a summary of the work I’ve done with the company.

Software commodities are eating interesting data science work

Being a data scientist can sometimes feel like a race against software commodities that replace interesting work. What can one do to remain relevant?

Bootstrapping the right way?

Video and summary of a talk I gave at YOW! Data on bootstrap estimation of confidence intervals.

Hackers beware: Bootstrap sampling may be harmful

Bootstrap sampling has been promoted to hackers without much statistical knowledge as an easy way of modelling uncertainty. But things aren’t that simple.

Exploring and visualising Reef Life Survey data

Web tools I built to visualise Reef Life Survey data and assist citizen scientists in underwater visual census work.

Is Data Scientist a useless job title?

It seems like anyone who touches data can call themselves a data scientist, which makes the title useless. The work they do can still be useful, though.

Migrating a simple web application from MongoDB to Elasticsearch

Migrating BCRecommender from MongoDB to Elasticsearch made it possible to offer a richer search experience to users at a similar cost, among other benefits.

The wonderful world of recommender systems

Giving an overview of the field and common paradigms, and debunking five common myths about recommender systems.

Goodbye, Parse.com

Migrating my web apps away from Parse.com due to reliability issues. Self-hosting is a better solution.

First steps in data science: author-aware sentiment analysis

I became a data scientist by doing a PhD, but the same steps can be followed without a formal education program.

Automating Parse.com bulk data imports

A script for importing data into the Parse backend-as-a-service.

What is data science?

Data science has been a hot term in the past few years. Still, there isn’t a single definition of the field. This post discusses my favourite definition.

Building a recommender system on a shoestring budget (or: BCRecommender part 2 – general system layout)

Iterating on my BCRecommender service with the goal of keeping costs low while providing valuable music recommendations.