A couple of months ago I accidentally ran into some lectures about machine learning, and they immediately caught my attention. Since then I have watched a couple of online lecture series on the subject. I also saw some videos showing how machine learning can be applied in robotics, and I got so excited about the field that I bought a Lego Mindstorms EV3 robot and decided to write some cool algorithm that would prove that robots can learn from experience (but that is another story, which I will share with you in the next post). Today I’m going to explain very briefly what machine learning is, and then I will point you to some great sources where you can learn everything you need to build your own learning algorithms.
We all interact with systems that use machine learning to some degree. Consider these examples:
- Handwriting recognition – How would you solve this problem? It’s hard to imagine writing a program that covers every style of handwriting people around the world use. It’s better to use an algorithm that can learn from the training examples you provide.
- Chess AI – Now think about playing chess against a computer, and imagine that you had to write the program that plays against a human being and wins. Not a trivial problem, right? Think about how many different scenarios can occur and how you would choose the right moves to beat the human. Wouldn’t it be nice to let the computer play thousands of games against itself and learn from that experience?
- Fraud detection – How do banks find out about suspicious activity on your credit card? Do you think their programs cover all possible scenarios? No, they learn from previously reported cases.
- Spam filters – Many new spam e-mails are created every day, so it is impossible to maintain a list of all of them for filtering. Instead, we use machine learning to learn what spam e-mails look like and how they spread.
- Netflix recommendations – Netflix has millions of users, and there is no way you could hand-write a program that makes personalized recommendations for all of them. Instead, you want the system to learn from what other people who watch similar movies like.
- Other examples include image recognition, driverless cars, natural language processing, …
Machine learning was defined by Arthur Samuel in 1959 as the “field of study that gives computers the ability to learn without being explicitly programmed”.
In 1997 Tom Mitchell came up with a more formal definition: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”
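To make E, T, and P concrete, here is a minimal sketch in Python. Everything in it is my own toy setup and not something from Mitchell's text: the task T is telling apart two groups of made-up numbers, the experience E is a set of labelled examples, and the performance P is accuracy on a separate test set. The learner (a nearest-centroid classifier) and all function names are just illustrative choices.

```python
import random

def make_examples(n, rng):
    """Return n labelled points: class 0 is centred at 0.0, class 1 at 3.0."""
    examples = []
    for i in range(n):
        label = i % 2  # alternate labels so both classes are always present
        x = rng.gauss(3.0 * label, 1.0)
        examples.append((x, label))
    return examples

def train(examples):
    """Learn one centroid (mean) per class from the experience E."""
    centroids = {}
    for label in (0, 1):
        xs = [x for x, y in examples if y == label]
        centroids[label] = sum(xs) / len(xs)
    return centroids

def predict(centroids, x):
    """Task T: assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def accuracy(centroids, examples):
    """Performance measure P: fraction of correctly classified examples."""
    correct = sum(1 for x, y in examples if predict(centroids, x) == y)
    return correct / len(examples)

rng = random.Random(0)            # fixed seed so runs are repeatable
test_set = make_examples(1000, rng)

results = {}
for n in (2, 20, 200):            # growing amounts of experience E
    model = train(make_examples(n, rng))
    results[n] = accuracy(model, test_set)
    print(f"E = {n:3d} examples -> P = {results[n]:.3f}")
```

With more training examples the centroids are estimated more reliably, so P will usually go up as E grows, although on a single random run a small training set can get lucky.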
Since then, many machine learning algorithms have been developed. They are usually grouped into several categories. The two best known are:
- Supervised learning algorithms – In this case you provide the algorithm with a training dataset that contains the correct answers to the problem, so the algorithm can learn from this data. An example is a spam filter that you train on a dataset of e-mails flagged as spam or non-spam. The algorithm then learns which combinations of characteristics most likely lead to an e-mail being flagged as spam. This is a classification problem (the result of the algorithm is the flag “spam” or “non-spam”, a discrete value). Another example is an algorithm that predicts stock prices from historical data. This is a regression problem, since we are trying to predict a continuous value.
- Unsupervised learning algorithms – With this kind of algorithm we let the computer learn by itself. You don’t provide a training dataset that contains the correct answers to the problem. Instead, you let the algorithm find some structure in the data. An example is a clustering algorithm that puts friends from your social network into several groups based on the similarities they share.
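The difference between the two categories can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the e-mails and the numbers are made up, the supervised part uses a tiny naive Bayes classifier with add-one smoothing, and the unsupervised part uses a two-cluster variant of k-means on one-dimensional data. Real spam filters and friend-grouping systems are far more involved.

```python
import math
from collections import Counter

# --- Supervised: the training data comes WITH the correct answers ---
training = [
    ("win money now", "spam"),
    ("cheap pills win prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
]

def train_naive_bayes(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)  # log prior
        n_words = sum(word_counts[label].values())
        for word in text.split():
            # add-one smoothing so unseen words do not zero out the score
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

spam_model = train_naive_bayes(training)
print(classify(spam_model, "win prize money"))
print(classify(spam_model, "lunch meeting"))

# --- Unsupervised: no labels, the algorithm looks for structure itself ---
def two_means(points, iterations=10):
    """Split 1-D points into two clusters (k-means with k fixed to 2)."""
    centroids = [min(points), max(points)]  # simple starting guess
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            nearest = 0 if abs(p - centroids[0]) <= abs(p - centroids[1]) else 1
            clusters[nearest].append(p)
        # move each centroid to the mean of its cluster (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

unlabelled = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
groups = two_means(unlabelled)
print(groups)
```

Notice that the classifier needed the "spam" and "ham" labels to learn anything, while the clustering function was given nothing but raw numbers and still found the two groups.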
I’m not going to go deeper into the topic here; instead, I will tell you about some great sources where you can learn more about machine learning.
Probably the best learning source is the Machine Learning course from Stanford University taught by Andrew Ng. You can find the recorded lectures and all the other materials on Coursera (which, by the way, was co-founded by Andrew). I personally recommend this one because Andrew’s explanations are easy to understand and very practically oriented, so you actually write and apply several machine learning algorithms during the course. Moreover, the Coursera platform offers great features such as subtitles in several languages, interactive quizzes, and the ability to change the playback speed. You also get access to a community of people who can help answer your questions. At the end, you can receive a certificate of completion if you earn more than 80% of all possible points from the assignments.
Another good source is the book Learning from Data written by Caltech professor Yaser S. Abu-Mostafa. This book is a very good starting point for beginners (and has only around 200 pages). If you want to go into more detail, you can also watch Yaser’s YouTube videos.