Are short methods actually worse?
I ran across an interesting post on programming.reddit called “Anecdote Driven Development, or Why I Don’t Do TDD”. The article focused on testing, but what I found most interesting was the part about how long a method or function should be:
I recently wrote some code for Class::Sniff which would detect “long methods” and report them as a code smell. […] Ben Tilly asked an embarrassingly obvious question: how do I know that long methods are a code smell?
I threw out the usual justifications, but he wouldn’t let up. He wanted information and he cited the excellent book Code Complete as a counter-argument. I got down my copy of this book and started reading “How Long Should A Routine Be” (page 175, second edition). The author, Steve McConnell, argues that routines should not be longer than 200 lines. Holy crud! That’s waaaaaay too long. If a routine is longer than about 20 or 30 lines, I reckon it’s time to break it up.
Regrettably, McConnell has the cheek to cite six separate studies, all of which found that longer routines were not only not correlated with a greater defect rate, but were also often cheaper to develop and easier to comprehend.
I’d never heard this before, but this is great, because it verifies what I’ve believed for a long time…
That which obscures my code is bad
Last year I wrote a post called “If this is Object Calisthenics, I think I’ll stay on the couch”, where I argued (among other things) that making your methods as short as possible is NOT a good idea. My justification was that it just makes the code more complicated: “That which obscures my code is bad.” But this is even better: actual empirical evidence.
I don’t have a copy of Code Complete, so I did a bit more research to see if I could find the actual studies. I found a good summary here (links added by me):
McConnell cites the findings of several studies of the correlation between the size of routines and the cost and/or fault rate of routines. Some findings which favor longer routines are:
- Routine size is inversely correlated with errors, up to 200 lines of code. [Basili and Perricone, 1984]
- Larger routines (65 lines of code or more) are cheaper to develop per line of code. [Card, Church, and Agresti, 1986; Card and Glass, 1990]
- Routines with fewer than 143 source statements (including comments) had 23% more errors per line of code than larger routines. [Selby and Basili, 1991]
- Routines averaging 100 to 150 lines of code need to be changed least. [Lind and Vairavan, 1989]
Hmmm. It looks like the studies are all about 20-25 years old. I wonder if — or how — the results would apply now. I took a quick look at the papers (the ones I could get my hands on), and the programming languages used were: Fortran [Card ‘86], Pascal and Fortran [Lind ‘89], and a mix of custom languages (one being PL/1-like) and assembly [Selby ‘91].
Does anyone know of any more current results? (Greg?) It would be interesting to see if this can be shown with more modern languages. But intuitively, it makes sense. In her book Software Engineering: Theory and Practice, Joanne Atlee summarizes it nicely:
Card and Glass (1990) point out that the design complexity really involves two aspects: the complexity within each component and the complexity of the relationships among components.
By making your methods shorter, you’re just trading one kind of complexity for another.
Update: In the comments, Stephane Vaucher pointed to a much more recent study (from 2002): The Optimal Class Size for Object-Oriented Software. Its authors point out that the conclusion that shorter methods are more error-prone is misleading, at best:
The observed phenomenon of smaller components having a higher fault density is due to an arithmetic artifact. First, note that the above conclusions were drawn based exclusively on examination of the relationship between fault density versus size […] However, by definition, if we model the relationship between any variable X and 1/X, we will get a negative association as long as the relationship between size and faults is growing at most linearly.
Another way of putting it is: short methods may have more defects per line, but they still have fewer defects overall. There may be a justification for not making methods too short, but these studies do not provide one.
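To make the arithmetic concrete, here is a small Python sketch. The fault model is made up purely for illustration (a fixed overhead of one expected fault per routine, plus one more per 50 lines); it is not taken from any of the studies. Total faults rise with size, yet faults per line fall, which is the pattern the quote describes:

```python
# Hypothetical fault model, for illustration only: every routine carries
# a fixed overhead of 1 expected fault, plus 1 more per 50 lines.
def expected_faults(lines: int) -> float:
    return 1.0 + lines / 50.0


def fault_density(lines: int) -> float:
    """Expected faults per line of code (faults divided by size)."""
    return expected_faults(lines) / lines


sizes = [10, 25, 50, 100, 200]
faults = [expected_faults(n) for n in sizes]
densities = [fault_density(n) for n in sizes]

for n, f, d in zip(sizes, faults, densities):
    print(f"{n:>4} lines: {f:.2f} faults total, {d:.4f} faults per line")
```

Under these assumed numbers, the 10-line routine has the highest fault density and the lowest fault count. If faults were exactly proportional to size (no fixed overhead), density would be flat instead, so the size of the effect depends entirely on the model you pick.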
The one sure thing is that the more code you write, the more bugs you will have.



