As simple as possible, and then some

September 20, 2007

This semester I’m taking a machine learning class with Geoffrey Hinton. The other day in class, he was talking about the balance between correctly learning a concept and overfitting the data. Say you have a program that is trying to recognize people: an example of overfitting is if your program learns that only people between 5'4" and 6'3" are people, just because it has never seen a person taller or shorter than that.
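To make the height example concrete, here is a toy sketch in Python. It is not anything from the class: the function names, the training heights, and the `margin` parameter are all made up for illustration. The "learner" just memorizes the shortest and tallest person it has seen, which is exactly the overfitting described above.

```python
def fit_height_range(heights_in_inches):
    """Learn the tightest range that covers every training example."""
    return min(heights_in_inches), max(heights_in_inches)


def is_person(height, learned_range, margin=0.0):
    """Accept a height if it falls inside the learned range, widened by margin."""
    low, high = learned_range
    return low - margin <= height <= high + margin


# Hypothetical training data: everyone seen so far is between 5'4" and 6'3".
training = [64, 66, 70, 72, 75]
learned = fit_height_range(training)

print(is_person(68, learned))            # a height inside the range seen in training
print(is_person(78, learned))            # a 6'6" person, rejected by the overfit rule
print(is_person(78, learned, margin=6))  # a more tolerant rule accepts them
```

The rule fits the training data perfectly and still gets a 6'6" person wrong, because it treats an accident of the sample as part of the concept.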

There’s a famous Einstein quote that goes something like: “Everything should be made as simple as possible, but no simpler.” According to Prof. Hinton, the principle in machine learning is more like: “Everything should be made as simple as possible, and then a bit simpler.”

This also seems like a good approach for designing software. Make things as simple as you can possibly bear, then remove a few features. Without fail, you can still remove a few features from whatever you consider to be an absolutely minimal feature set.

Today Mike and I went to see Joel Spolsky talk about FogBugz in Toronto. During the Q+A, there were two questions that I really liked Joel’s answer to:

Does FogBugz support cloning bugs?
Can FogBugz create dependencies between bugs?

The answer to both was really “no”, but Joel said “Sure, just type ‘clone of case 123’” and “Sure, just type ‘depends on case 123’.”

That’s a perfect illustration of what I’m talking about. If someone were coming up with a list of features their bug tracking system must absolutely have, there’s a good chance they would include those two features. But specifically adding features to do those things is an example of overfitting: really, linking achieves 95% of what you need anyway, and it’s much simpler.

So my new design mantra is “as simple as possible, and then some.”