Three recent prize winners of Kaggle competitions have taken the Coursera class in Machine Learning, and this seems to be more than a coincidence.
Coursera co-founder Andrew Ng, who teaches this class, which he originated at Stanford University, doesn't think it is a coincidence either. He told GigaOm's Derrick Harris:
"Machine learning has matured to the point where if you take one class you can actually become pretty good at applying it."
And given that machine learning has become such a sought-after skill, taking this class can significantly boost someone's salary and job prospects at companies where such knowledge is still in short supply, he suggested, adding:
"I bet many students are going on to great things because of these courses."
According to Ng, the only real prerequisite for his course is a basic understanding of programming, although familiarity with algebra and probability is "helpful". My own conversations with students on the current presentation of the course suggest that you need a solid foundation in both maths, in particular matrix algebra and calculus, and programming.
The programming environment used by the class is Octave, an open-source interpreted language that is pragmatic rather than theoretically pure. It supports matrix operations, has a good plotting facility for visualizing data, and offers lots of built-in maths functions. On the other hand, it has a clunky command-line interface. However, if you can get to grips with Octave, you have a real advantage for many of the types of problems requiring predictive analysis that you'll come across in Kaggle competitions, as this video explains:
In his 10-week course, Ng takes an engineering-oriented approach to Machine Learning that concentrates on statistical models. If you are looking for an alternative, Coursera also has Neural Networks for Machine Learning, a class taught by University of Toronto professor Geoffrey Hinton, a leading figure in the field who approaches it from a cognitive science perspective. His eight-week course sets out to teach students about artificial neural networks and how they're being used for machine learning, as applied to speech and object recognition, image segmentation, and the modeling of language and human motion. Its prerequisites are programming proficiency in Matlab, Octave or Python, plus knowledge of calculus, linear algebra and probability theory.
A third machine learning course, a 10-week class from Pedro Domingos of the University of Washington, has now been added to Coursera's catalog. In this one the emphasis is on supervised learning, and it will cover decision trees, rules, instances, Bayesian techniques, neural networks, model ensembles, and support vector machines. It has similar prerequisites: basic knowledge of programming, with some previous exposure to probability, statistics, linear algebra, calculus and/or logic being useful but not essential.
The FAQs for this course suggest that "Machine learning is the scientific method on steroids", and that's probably a fair description. It also has all the advantages of a relatively young discipline that has only come into its own as computers have become powerful enough to take the drudgery out of handling massive amounts of data. If you've missed out on this topic to date, there is plenty of time to catch up.