Discussion: Why Machine Learning Beginners Shouldn’t Avoid the Math

In a post I published yesterday, I argued that it is important for students of machine learning to understand the algorithms and underlying mathematics before using tools or libraries that black box the code. I suggested that skipping this step is likely to result in a lot of “time-wasting confusion”, because students won’t have the understanding they need to configure parameters or interpret results. One of the examples I provided for the opposing view was this blog post from BigML, which argues that beginners don’t need courses such as those provided by Coursera if they use BigML’s tool.

Francisco J. Martin, CEO of BigML, has tweeted in response.


So Kids shouldn’t avoid assembler, automata, and compilers when learning to code?

This is a very good question and one that grants us an opportunity to dig deeper into the issue. I am responding here because I don’t believe it’s a question I can answer in 140 characters.

The short answer is no, I’m perfectly ok with beginner programmers starting out in high-level languages and working their way down, or even stopping there and not working their way down. But this is not analogous to machine learning.

I see three big differences.

First of all, learning a high-level language is actually a constructive step towards learning lower-level languages. If that’s the goal, and you started with something like Java, you could potentially learn quite a lot about programming in general. Then trying C++ would help to fill in the blanks with respect to some of the aspects of programming that Java glosses over. Likewise, assembler could take you a step further.

If playing with the parameters of black-boxed algorithms offers a path at all towards becoming proficient at machine learning, it’s an incredibly inefficient one. It’s an awfully big search space to approach by trial and error when you consider the combinations of parameters, feature selection and the question of whether you have enough or appropriate data examples.
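To give a rough sense of that scale, here is a minimal Python sketch that counts how many runs an exhaustive trial-and-error search would need. The parameter names, candidate values and feature count are all hypothetical, chosen only for illustration; they are not taken from BigML or any other tool.

```python
from itertools import product
from math import prod

# Hypothetical grid: a handful of candidate values for four made-up parameters.
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "max_depth": [2, 4, 8, 16],
    "n_estimators": [50, 100, 200, 400],
    "regularization": [0.0, 0.1, 1.0],
}

# Hypothetical number of candidate input features in the dataset.
n_features = 10


def count_parameter_settings(grid):
    """Number of distinct parameter combinations in the grid."""
    return prod(len(values) for values in grid.values())


def count_feature_subsets(n):
    """Number of non-empty subsets of n features."""
    return 2 ** n - 1


n_settings = count_parameter_settings(param_grid)
n_subsets = count_feature_subsets(n_features)
total_runs = n_settings * n_subsets

# Sanity check: enumerating the grid gives the same count as the formula.
assert len(list(product(*param_grid.values()))) == n_settings

print(f"{n_settings} parameter settings x {n_subsets} feature subsets "
      f"= {total_runs} candidate runs")
```

Even this small made-up grid yields close to two hundred thousand candidate runs, and that is before asking whether the data itself is adequate. Understanding the algorithm is what lets you rule out most of those combinations without running them.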

The second difference is that high-level programming does not require an understanding of low-level programming. I can do anything that Java or C# will let me do without knowing anything about assembly language. In comparison, a machine learning tool requires me to know how to set appropriate values for the parameters that are passed into the hidden algorithms. It also requires me to understand whether or not I have an appropriate (representative) dataset with appropriate features. Then, when it finishes, I need to be able to interpret the results and take appropriate actions. Better outcomes come from more informed decisions.

The third difference relates to the potential benefits of exploring the low-level languages. There are some exceptions to this, but generally speaking, writing more efficient algorithms in low-level languages comes at such great expense in comparison to the constantly falling cost of computation, that it just isn’t worthwhile.

In my last post I cited Kaggle’s chief scientist, Jeremy Howard, who said there was a massive difference in capability between good and average data scientists. I take this to indicate that in machine learning, more knowledge leads to exponentially better outcomes.  Unlike low-level programming, there is a huge benefit to having a detailed knowledge of machine learning.

I have come across some arguments suggesting that as Moore’s law reaches its limit, low-level coding will become much more sought after. If that happens I’ll revisit my position on low-level coding, but for now I’m betting that specialist processors like GPUs will help to bridge the gap before the next paradigm of computation comes along to keep the gravy train of exponential price-performance improvement going.

Published by

James Burkill

Veteran software engineer and student of all things AI. LinkedIn: https://ie.linkedin.com/in/james-burkill-459a1513

2 thoughts on “Discussion: Why Machine Learning Beginners Shouldn’t Avoid the Math”

  1. If you consider an extended view of what Machine Learning is, you’ll realize that the algorithm (math) part is only one small piece. Have a look at this paper by Kiri Wagstaff for some refreshing inspiration: http://teamcore.usc.edu/WeeklySeminar/Aug31.pdf

    Kaggle competitions have nothing to do with real-world machine learning. Do not get me wrong. They are a great tool to learn a little bit about how algorithms perform, but you can be the best in those competitions and be totally incapable of solving a real-world problem with Machine Learning.

    So I strongly believe that the best way to learn Machine Learning is solving a basic real-world problem first (e.g., collecting and transforming some data, building a model independently of how good it is, making some predictions, measuring the impact, and repeating over and over). You’ll soon see that the learning algorithm is important but that the other pieces are even more so.

    1. Thanks for the link, it’s a good critique of the failings of many universities. I’m fortunate in that I attend an institution that is much more industry focused, specialising in recommender systems. But I know that isn’t everyone’s experience and I have spent enough time immersed in the operations research literature to see what can happen. I’m not exaggerating when I say that almost all research into automated personnel scheduling algorithms has been conducted on nurse rostering datasets.

      However, I can’t see how this applies to Kaggle. The vast majority of their competitions are industry projects that are sponsored by companies with a specific interest in those projects. Often the competitions are conducted as part of a recruitment process.

      There are currently five open competitions on the Kaggle web site, all but one of which are based on a real-world industry problem where competitors work with real-world data. Santander customer satisfaction prediction, Home Depot product search relevance, BNP Paribas Cardif claims management and Yelp restaurant photo classification all look like real-world predictive analytics problems to me.
