Wednesday, July 05, 2006

(debugging) Out of the box programming.

Today, following a link from hexblog.org, I read an interesting article on programming bugs.

First of all, check the following Java code (it's not that language specific - anyone can read it):
1:     public static int binarySearch(int[] a, int key) {
2: int low = 0;
3: int high = a.length - 1;
4:
5: while (low <= high) {
6: int mid = (low + high) / 2;
7: int midVal = a[mid];
8:
9: if (midVal < key)
10: low = mid + 1;
11: else if (midVal > key)
12: high = mid - 1;
13: else
14: return mid; // key found
15: }
16: return -(low + 1); // key not found.
17: }
This might look like code you've used at least once in your programming life. I know I've used similar code. It is a binary search algorithm. What is does is search for an integer (aka key) inside a table (aka a). The table must be sorted ofcourse. It finds the middle value in the table and if key is greater that it, searches in the upper middle of the table, else in the lower middle. Quite simple right? Yet, it holds a terrible bug. I call it terrible because it has to do with things programmers handle every day and consider common (and therefore safe to use): integer variables.

If by now you haven't found the bug, here it is:
6: int mid =(low + high) / 2;
This will fail when the sum of low and high exceedes the maximum possible value with is (2^31)-1. Take a moment to think this... How many times have you used "int i;" and not consider the possible range of values it may hold?
I bet if you check your last programs you will find such a bug.

The most astonishing fact is that this very bug existed in an example in Chapter 5 of Programming Perls by Jon Bentley for nine years!

Ways of fixing this are:
6:             int mid = low + ((high - low) / 2);
6: int mid = (low + high) >>> 1;
6: mid = ((unsigned) (low + high)) >> 1;

Anyway, the article that hinted me writting all this also mentions ways of preventing such things reaching the market. One, nothing special - quite common and known to the public, is building test cases for your code so if something blows it will be contained, the program will terminate but not crash and of course leave a decent error message. Test cases are a funny thing. They can never be too many and have the detency of tripling the size of code and slowing down the program. Java has a good way of handling unexpected events: Exceptions. When something out of the ordinary happens, an exception is thrown. But what next? How you handle that exception is what matters. C and C++ don't have exceptions but that's not so important right now.

Nowadays an effort is made to constuct code testing tools but there are many problems towards that direction. What to do then?

In the middle of a software demanding market running at hundreds of miles, are we allowed to make a pit stop or concentrate and finish the race?

Programmers must be open-minded, equiped with radical vision and a small ego. A program must be designed to withstand a hurricane of unforseen events. Every day practices, taken for granted, must be tested and revised. A program is like a ship. It must control and contain its self. Otherwise, it may sink from a tiny leak.

No comments: