Blue Mountain
A fable.
A small company once hired an intern and informed him that one of his duties was to purchase a dozen doughnuts for the weekly company meeting. On the day of the meeting, the intern went to a local doughnut shop, fortunately located in the same building as the business, and purchased one doughnut. This he brought back to the company and placed on a plate in the meeting room. He then went to the doughnut shop a second time, purchased a second doughnut, again brought it back to the company's meeting room, and put it on the plate. A third time he went to the doughnut shop ...
The next week he repeated this procedure, making twelve trips to the doughnut shop to purchase the dozen doughnuts for the company meeting.
On the third week, one of the management team asked the intern, "Do you not know that you can purchase all twelve doughnuts at once, and the person at the doughnut shop will put them into a box for you, which you can then bring back to the meeting room? That way, you will save yourself eleven trips to the shop every week. And that approach works for 24, 36, and even 100 doughnuts."
And thus the intern was enlightened.
Strange as it may seem, a scenario similar to this played out not once but twice these past two weeks. In my job I do the maintenance work on the company's legacy application. Instead of getting doughnuts, the program I'm working on makes database calls. In two separate functions, written by separate people, the programmer made hundreds (and in one case, thousands) of separate calls, all SELECT statements that had to be read, parsed, executed, and returned by the database engine to fetch individual pieces of information. With a bit of foresight, some additional programming, and better-written SQL queries, probably well over 95% of these calls could have been eliminated.
I refactored the function that took hundreds of calls and got it down to three. That's right: only three SQL calls to do the work that initially took over 500. Better still, it doesn't matter how much input there is; those three calls do the work whether there's one piece of source data to report on or a hundred. The original code took six calls for every piece of source data. The original programmer has a master's degree in computer science and is now the lead programmer on the new version of our company's core product.
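The article doesn't show the actual code, but the shape of the fix is the classic cure for the N+1 query antipattern: fetch the whole "dozen" in one query instead of one row per trip. Here's a minimal sketch in Python with an in-memory SQLite database; the `items` table and its columns are hypothetical stand-ins, not the application's real schema.

```python
import sqlite3

# Hypothetical schema for illustration; the post names no real tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items (id, name, price) VALUES (?, ?, ?)",
    [(i, f"item{i}", i * 1.5) for i in range(1, 101)],
)

ids = list(range(1, 101))

# Doughnut-at-a-time: one SELECT per id -- 100 separate statements
# for the engine to read, parse, execute, and return.
slow = [
    conn.execute("SELECT name, price FROM items WHERE id = ?", (i,)).fetchone()
    for i in ids
]

# The boxed dozen: one SELECT with an IN clause -- a single round trip,
# regardless of whether there are one or a hundred ids.
placeholders = ",".join("?" * len(ids))
fast = conn.execute(
    f"SELECT name, price FROM items WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()

assert slow == fast  # same rows, a hundredth of the statements
```

The batched version also scales the way the post describes: the query count stays constant as the input grows (up to the database's bound-parameter limit, after which a temporary table or a JOIN does the same job).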
The other piece of code was worse: a simple report required nearly 6,000 separate SELECT statements (again, all of which had to be read, parsed, executed, and returned by the database driver) for reporting on a single year's worth of results. I thought briefly about refactoring it, but to do that properly I'd need to spend probably a day writing and testing a new method in the data model before starting on reworking the code itself. That code was written by a person no longer with the company. He was a decent programmer but had an unfortunate tendency to copy and paste code all over the place.
One sad part of this tale is that, because the hardware we run the system on is impressively fast, as is the database program we're using, this horribly inefficient code doesn't seem slow. Despite these bogosities I keep turning up in the code base, the product has scaled decently well. We're also a niche product, so we're not having to handle thousands of hits a second on the web site--it's more like hundreds a day.
What do you do if you find bad code in your systems? Fix it? Leave it alone? Analyse to determine if it's worth fixing? Fix the worst of it and leave the rest?