The wonderful world of recommender systems

I recently gave a talk about recommender systems at the Data Science Sydney meetup (the slides are available here). This post roughly follows the outline of the talk, expanding on some of the key points in non-slide form (i.e., complete sentences and paragraphs!). The first few sections give a broad overview of the field and the common recommendation paradigms, while the final part is dedicated to debunking five common myths about recommender systems....

October 2, 2015 · Yanir Seroussi

You don’t need a data scientist (yet)

The hype around big data has caused many organisations to hire data scientists without giving much thought to what these data scientists are going to do and whether they’re actually needed. This is a source of frustration for all parties involved. This post discusses some questions you should ask yourself before deciding to hire your first data scientist. Q1: Do you know what data scientists do? Somewhat surprisingly, there are quite a few companies that hire data scientists without having a clear idea of what data scientists actually do....

August 24, 2015 · Yanir Seroussi

Goodbye, Parse.com

Over the past year, I’ve been using Parse’s free backend-as-a-service and web hosting to serve BCRecommender (music recommendation service) and Price Dingo (now-closed shopping comparison engine). The main lesson: You get what you pay for. Despite some improvements, Parse remains very unreliable, and any time saved by using their APIs and SDKs tends to be offset by having to work around the restrictions of their sandboxed environment. This post details some of the issues I faced and the transition away from the service....

July 31, 2015 · Yanir Seroussi

Learning about deep learning through album cover classification

In the past month, I’ve spent some time on my album cover classification project. The goal of this project is for me to learn about deep learning by working on an actual problem. This post covers my progress so far, highlighting lessons that would be useful to others who are getting started with deep learning. Initial steps summary: The following points were discussed in detail in the previous post on this project....

July 6, 2015 · Yanir Seroussi

Hopping on the deep learning bandwagon

I’ve been meaning to get into deep learning for the last few years. Now, the stars have finally aligned and I have the time and motivation to work on a small project that will hopefully improve my understanding of the field. This is the first in a series of posts that will document my progress on this project. As mentioned in a previous post on getting started as a data scientist, I believe that the best way of becoming proficient at solving data science problems is by getting your hands dirty....

June 6, 2015 · Yanir Seroussi

First steps in data science: author-aware sentiment analysis

People often ask me what the best way of becoming a data scientist is. The way I got there was by first becoming a software engineer and then doing a PhD in what was essentially data science (before it became such a popular term). This post describes my first steps in the field with the goal of helping others who are interested in making the transition from pure software engineering to data science....

May 2, 2015 · Yanir Seroussi

My divestment from fossil fuels

This post covers recent choices I've made to reduce my exposure to fossil fuels, including practical steps that can be taken by Australians and generally applicable lessons. I recently read Naomi Klein’s This Changes Everything, which deeply influenced me. The book describes how the world has been dragging its feet when it comes to reducing carbon emissions, and how we are coming very close to a point where climate change is likely to spin out of control....

April 24, 2015 · Yanir Seroussi

The long road to a lifestyle business

Almost a year ago, I left my last full-time job and decided to set out on an independent path that includes data science consulting and work on my own projects. The ultimate goal is to generate enough passive income to live comfortably, so that I don’t have to sell my time for money. My five main areas of focus are – in no particular order – personal branding & networking, data science contracting, Bandcamp Recommender, Price Dingo, and marine conservation....

March 22, 2015 · Yanir Seroussi

Learning to rank for personalised search (Yandex Search Personalisation – Kaggle Competition Summary – Part 2)

This is the second and last post summarising my team’s solution for the Yandex search personalisation Kaggle competition. See the first post for a summary of the dataset, evaluation approach, and some thoughts about search engine optimisation and privacy. This post discusses the algorithms and features we used. To quickly recap the first post, Yandex released a 16GB dataset of query & click logs. The goal of the competition was to use this data to rerank query results such that the more relevant results appear before less relevant results....

February 11, 2015 · Yanir Seroussi

Is thinking like a search engine possible? (Yandex search personalisation – Kaggle competition summary – part 1)

About a year ago, I participated in the Yandex search personalisation Kaggle competition. I started off as a solo competitor, and then added a few Kaggle newbies to the team as part of a program I was running for the Sydney Data Science Meetup. My team didn’t do too badly, finishing 9th out of 194 teams. As is usually the case with Kaggle competitions, the most valuable part was the lessons learned from the experience....

January 29, 2015 · Yanir Seroussi