PCPlus 257: Efficient search tools

I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there's an Xmas issue as well — so it's a bit more than monthly). The column is called Theory Workshop and appears in the back of every issue. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so. After all, the PDFs do appear on each issue's DVD after a couple of months.

PCPlus logo This was my third article and the first one in which I attempted to put in some code. The topic was binary search, and, to be frank, the published article turned out to be a bit of a mess from the one I sent in.

A quick aside here. For an article, I submit a Word document in a certain format. My editor and typesetter then take over and coerce my linear document into the magazine format so that it fits on the required two pages. Unfortunately, the schedule is so tight (13 issues a year, four weeks in between issues) that there is no time to have a back-and-forth with the author. So, the first I see of an article after I've written it and sent it off is when I go down to my local Barnes & Noble to pick up one of the three copies they get.

So first off, the title should be, if anything, "Efficient search tool" since I'm only talking about one, binary search. Then: the strap is a tautology since the binary search algorithm is a "basic halving strategy", the code is hard to follow because of the narrow width of the columns, and the "Spotlight on" section has a nonsensical title (admittedly, my original one was way too long).

All told, I'd have to say this is my least satisfying published article to date in PCPlus, which is a shame because the topic is extremely important. It's incredible that such a simple algorithm to formulate is so hard to implement correctly — the latest example was the binary search in the generics run-time library in Delphi 2009 on release was broken and had to be fixed in the first minor version afterwards.

Having said all that, though, I did learn a lot from this experience as to the limitations of this particular monthly column, so all was not lost. Plus I can rectify things a little here. So download and open up the PDF and use the following code snippets instead of the code in the article.

First of all, here's the basic framework that supports a binary search:

static int BinarySearch(int[] array, int item) {
    int lower = 0;
    int upper = array.Length - 1;
    int middle;
    int foundAtIndex = -1;
    bool stillLooking = true;
    while (stillLooking) {
      //...code that searches...
    }
    return foundAtIndex;
}

Next, here's the actual searching code:

if (lower > upper) {
    stillLooking = false;
}
else {
    middle = (lower + upper) / 2;
    if (array[middle] == item) {
        stillLooking = false;
        foundAtIndex = middle;
    }
    else if (array[middle] < item) {
        lower = middle + 1;
    }
    else {
        upper = middle - 1;
    }
}

And, finally, here's the newer corrected code for finding the mid-point:

middle = lower + (upper - lower) / 2;

Do note that, despite this implementation's apparent verbosity, any speed issues won't be with this but with array accesses (especially with large arrays that span RAM pages, etc).

The article first appeared in issue 257, July 2007.

You can download the PDF here.

Album cover for The Age of Consent Now playing:
Bronski Beat - Love and Money
(from The Age of Consent)

Loading similar posts...   Loading links to posts on similar topics...

No Responses

Feel free to add a comment...

Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

  •  Emphasize with italics: surround word with underscores _emphasis_
  •  Emphasize strongly: surround word with double-asterisks **strong**
  •  Link: surround text with square brackets, url with parentheses [text](url)
  •  Inline code: surround text with backticks `IEnumerable`
  •  Unordered list: start each line with an asterisk, space * an item
  •  Ordered list: start each line with a digit, period, space 1. an item
  •  Insert code block: start each line with four spaces
  •  Insert blockquote: start each line with right-angle-bracket, space > Now is the time...
Preview of response