It is easy to find the bottom of a bowl no matter where you start -- if you toss a marble anywhere into the bowl, it will roll downhill and find its way to the bottom.

What does this have to do with Machine Learning? A natural way to try to construct an accurate classifier is to minimize the number of prediction errors the classifier makes on training data. The trouble is, even for moderate-sized data sets, minimizing the number of training errors is a computationally intractable problem. A popular way around this is to assign different training errors different costs and to minimize the total cost. If the costs are assigned in a certain way (according to a “convex loss function”), the total cost can be efficiently minimized the way a marble rolls to the bottom of a bowl.

In a recent paper, Rocco Servedio and I show that no algorithm that works this way can achieve a simple and natural theoretical noise-tolerance guarantee that can be achieved by other kinds of algorithms. A result like this is interesting for two reasons: first, it's important to understand what you cannot do with convex optimization in order to get a fuller understanding of what you can do with it. Second, this result may spur more research into noise-tolerant training algorithms using alternative approaches.