AI/ML lifecycle models versus real-world mess
The real world of AI/ML doesn’t fit into a neat diagram, so I created another diagram and a maturity heatmap to model the mess.
Questions to assess the quality of tech stacks and lifecycles, with a focus on artificial intelligence, machine learning, and analytics.
Quotes from Demetrios Brinkmann on the relationship between MLOps and DevOps, with MLOps adding the ability to manage changes that come from data.
Reviewing the areas that should be assessed to determine a startup’s opportunities and challenges on the data/AI/ML front.
It’s easy to prototype with AI, but production-grade AI apps require even more thorough testing and observability than traditional software.
Discussing the use of AI to automate underwater marine surveys as an example of the uneven distribution of technological advancement.
Discussing considerations that arise when attempting to automate the extraction of structured data from PDFs and similar documents.
Classifying startups as ML-centric or non-ML is a helpful exercise to uncover the data challenges they’re likely to face.
Two stories of getting AI functionality to production, which demonstrate the risks inherent in custom development versus starting with a no-code approach.
An interesting approach to bidding energy storage assets, showing that a model trained on New York data transfers to Queensland.
Summarising the work Uri Seroussi and I did to improve Reef Life Survey’s Reef Species of the World app.
Despite the excitement around large language models, building with machine learning remains an engineering problem with established best practices.
My perspective after a week of using ChatGPT: This is a step change in finding distilled information, and it’s only the beginning.
Reviewing the first three chapters of the book Causal Machine Learning by Robert Osazuwa Ness.
Lessons learned building a fish ID web app with fast.ai and Streamlit, in an attempt to reduce my fear of missing out on the latest deep learning developments.
Overview of a talk I gave at a deep learning course, focusing on AI ethics as the need for humans to think about the context and consequences of applying AI.
Back-dated meta-post that gathers my posts on Automattic blogs into a summary of the work I’ve done with the company.
Updating my definition of data science to match changes in the field. It is now broader than before, but its ultimate goal is still to support decisions.
Causality is often overlooked but is of much higher relevance to most data scientists than deep learning.
Nutritionism is a special case of misinterpretation and miscommunication of scientific results – something many data scientists encounter in their work.
Giving an overview of the field and common paradigms, and debunking five common myths about recommender systems.
Progress on my album cover classification project, highlighting lessons that would be useful to others who are getting started with deep learning.
To become proficient at solving data science problems, you need to get your hands dirty. Here, I used album cover classification to learn about deep learning.
I became a data scientist by doing a PhD, but the same steps can be followed without a formal education program.
An overview of my PhD in data science / artificial intelligence. Thesis title: Text Mining and Rating Prediction with Topical User Models.
My team’s solution to the Yandex Search Personalisation competition (finished 9th out of 194 teams).
Insights on search personalisation and SEO from participating in a Kaggle competition (finished 9th out of 194 teams).
Exploring an approach to choosing the optimal number of iterations in stochastic gradient boosting, following a bug I found in scikit-learn.