No Free Lunch Theorem


One thing that I loved about the first lecture in this module is that it brought me back a few years in time, to when I was taking a stand-alone course in Evolutionary Biology at Uppsala University in Sweden. It was a night course I took while working, because the field has always interested me and I have read many books on the subject. My initial plan was to study biology and become a zoologist.


Many of the topics in this module fascinate me, and I love that after all these years of technological innovation, we look back at nature and try to simulate behaviour that we see other entities performing. Evolution has been around for so long that nature is pretty much bound to have figured out some useful solutions for us.


Reading the paper on the “No Free Lunch Theorem”, I found the overall concept a bit hard to grasp, specifically when you focus on the actual theorem: X is our finite search space, and you have to specify an objective function f: X -> Y, where Y is a finite subset of the real numbers. Gosh, too much for me to take in just by reading plain text.
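For anyone else who got stuck on the same spot: the core result from Wolpert and Macready's paper (their Theorem 1, as far as I understand it) says that if you average over all possible objective functions f, any two search algorithms a_1 and a_2 perform exactly the same:

    \sum_{f} P(d^{y}_{m} \mid f, m, a_1) = \sum_{f} P(d^{y}_{m} \mid f, m, a_2)

Here d^y_m is the sequence of cost values the algorithm has seen after m evaluations, so the probability of seeing any particular sequence of values, summed over every possible f, does not depend on which algorithm you use.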


This can easily turn my head upside down when I am trying to grasp something new. I plowed through the paper, somewhat more confused than when I started. But by rereading my notes and looking up additional information about the theorem, I think I managed to get some sort of grasp of it.


Let's say that someone is actually offering you a “free lunch”. By doing this, he or she most likely wants something in return for that “free lunch” they just offered you.
So there is an underlying trade-off to that “free lunch”, very similar to the trade-offs we face when choosing between different machine learning classification algorithms.


The "No Free Lunch Theorem" is basically a formalisation of the fact that there is no learning algorithm that is inherently superior.


What I mean by “no inherently superior classifier” is that there is no master algorithm that works for any given problem. You cannot get good performance on one problem without paying the price of bad performance on another, so there is a trade-off here as well. There are many different methods for solving our problems, but no single method dominates all others over all possible data sets.
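To convince myself of this, I put together a tiny toy sketch in Python (my own illustration, not code from the paper): it enumerates every possible objective function over a three-point search space and checks that two different search strategies see exactly the same collection of cost values once you average over all functions.

    import itertools

    # Toy check of the "averaged over all problems" claim. The search
    # space X has three points and the cost values are {0, 1}, so there
    # are 2**3 = 8 possible objective functions f: X -> Y in total.
    X = [0, 1, 2]
    Y = [0, 1]
    all_functions = [dict(zip(X, values))
                     for values in itertools.product(Y, repeat=len(X))]

    def trace(order, f, m):
        # The sequence of cost values seen after m evaluations when the
        # search visits points in a fixed, non-revisiting order.
        return tuple(f[x] for x in order[:m])

    algo_a = [0, 1, 2]  # one strategy: search left to right
    algo_b = [2, 1, 0]  # another strategy: search right to left

    m = 2  # number of evaluations
    traces_a = sorted(trace(algo_a, f, m) for f in all_functions)
    traces_b = sorted(trace(algo_b, f, m) for f in all_functions)

    # Over ALL possible objective functions, both strategies produce
    # exactly the same multiset of value sequences, so any performance
    # measure based on the values seen is identical on average.
    print(traces_a == traces_b)  # prints: True

It is a ridiculously small example, of course, but this kind of bookkeeping over all possible functions is what the theorem formalises for arbitrary finite search spaces.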


I found an analogy that I liked, and it helped me grasp the topic.

Imagine that we have a big room with a brand new wooden floor. The floor represents all of our possible data sets. We can then lay any of our machine learning algorithms (MLAs) on top of this floor as a carpet. There is no way to cover the whole floor with a single master MLA carpet.


There are good MLAs for very specific tasks, but this means there will always be some “blank” spots on our floor. Some MLAs will be subsets of each other and overlap, but they will still not cover the whole floor.


That is why you might have to choose a specific algorithm for the specific task you want to explore, to get the best possible outcome.



