First steps in data science: author-aware sentiment analysis

People often ask me what’s the best way of becoming a data scientist. The way I got there was by first becoming a software engineer and then doing a PhD in what was essentially data science (before it became such a popular term). This post describes my first steps in the field with the goal of helping others who are interested in making the transition from pure software engineering to data science....

May 2, 2015 · Yanir Seroussi

Automating Parse.com bulk data imports

Parse is a great backend-as-a-service (BaaS) product. It removes much of the hassle involved in backend devops with its web hosting service, SDKs for all the major mobile platforms, and a generous free tier. Parse does have its share of flaws, including various reliability issues (which seem to be getting rarer), and limitations on what you can do (which is a reasonable price to pay for working within a sandboxed environment). One such limitation is the lack of APIs to perform bulk data imports....

January 15, 2015 · Yanir Seroussi

What is data science?

Data science has been a hot term in the past few years. Despite this fact (or perhaps because of it), it still seems like there isn't a single unifying definition of data science. This post discusses my favourite definition.

"Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician." (Josh Wills, @josh_wills, May 3, 2012)

One of my reasons for doing a PhD was wanting to do something more interesting than “vanilla” software engineering....

October 23, 2014 · Yanir Seroussi

Building a recommender system on a shoestring budget (or: BCRecommender part 2 – general system layout)

This is the second part of a series of posts on my BCRecommender – personalised Bandcamp recommendations project. Check out the first part for the general motivation behind this project. BCRecommender is a hobby project whose main goal is to help me find music I like on Bandcamp. Its secondary goal is to serve as a testing ground for ideas I have and things I’d like to explore. One question I’ve been wondering about is: how much money does one need to spend on infrastructure for a simple web-based product before it reaches meaningful traffic?...

September 7, 2014 · Yanir Seroussi