Priority queues – part one

First up in my “reprints from The Delphi Magazine, cast in JavaScript” posts must be the priority queue. It is after all, an important data structure, can be implemented quickly (although perhaps in not a very efficient manner), and introduces an excellent algorithm, the heap.

So, then, a priority queue. Something for queuing priorities?

First Cut is the Deepest

Let’s review quickly what an ordinary queue is. A queue is a data structure that has two main operations: add an item to the queue (usually known as enqueue()), and retrieve the oldest item from the queue (usually known as dequeue()). Generally the queue does not allow you to fiddle with other items in the queue (they’re deemed to be inaccessible until they’re at the front of the queue - in other words, the oldest - in which case they can be retrieved via the method discussed above), nor is it allowed to insert an item in the queue at some random place, or indeed at the front (in other words, to make it the oldest). The queue is known as a FIFO structure: First In, First Out (as compared with a stack, which is a LIFO structure: Last In, First Out).

In implementing a queue, we don’t bother to give each item the date and time of its arrival in the queue just so that we can find the oldest; the queue keeps the items in order of their arrival. Just like the checkout at a supermarket, in fact. The customers wait patiently (sometimes) in line, until their turn with the checkout person arrives. They tag on to the end of the queue and are processed in order from the front. Queue jumpers are not allowed (at least not in my supermarket, they’re not).

We’re actually pretty lucky in JavaScript: the normal array has the ability to act as a queue or a stack (or even a deque!), it has push(), pop(), shift(), unshift() predefined and they work as expected.

All fine and dandy, and the queue is an important data structure in its own right. However, it has a limitation in that items are processed in the order of their arrival. Suppose we want to process items in some other order entirely. In other words, a queue that still has the “add an item” operation, but whose second operation is not “retrieve the oldest”, but “retrieve the largest (or smallest).” We want to replace the simplistic age ordering with another ordering criterion completely.

The Legend of Xanadu

This new data structure is a priority queue. A priority queue has two main operations: add an item (as before), and retrieve the item with the largest priority. (We assume that each item has an associated priority.) What do I mean by priority in this context? Well, it can be anything. Classically, it’s usually a numeric value that denotes the item’s priority in some process. I’m thinking here of print queues in operating systems (or job queues, or threads in a multithreaded environment). Each print job is assigned a priority, a value that indicates how important that particular print job is. High priority print jobs would need to be processed before low priority print jobs. The operating system finishes off a particular print job, and then goes to the print queue and from it retrieves the print job with the highest priority. As work is done in the operating system, other print jobs get added to the print queue with various priorities.

Going back to the supermarket analogy, the deli counter in my supermarket operates a numbering scheme. You grab a number on a piece of paper from a dispenser and then wait around in some amorphous mass of customers until your number is called and you can go up and order your potato salad or whatever. This is almost the same as the traditional queue where the numbers represent how long you’ve been waiting, but note that we, the customers, don’t have to wait is a nice orderly line like a standard queue. Because we have a priority (a number on a piece of paper) we can shuffle around in a completely different configuration.

Do note that the value we are using as ‘priority’ doesn’t need to be a classic priority number in this vein. It can be any value, just so long as the queue is able to determine the item with the largest value. In other words, the values have to have some meaningful ordering. An example: the priority could be a name, and the ordering the standard alphabetic order. So the retrieval operation of getting the item with the largest priority would instead become getting the item earliest in alphabetic sequence (that is, the A’s before the B’s, etc). Imagine the uproar at my deli counter if that was the definition of ‘priority’! Certainly the Young family would never shop there. Still, when programming, the point is that we can select the retrieval ordering that suits our application.

The Carnival is Over

Enough chatter, let’s consider how to code a priority queue. The queue must have three main attributes: (1) store an arbitrary number of items, (2) add an item with associated priority to the queue, and (3) identify and retrieve the item with the largest priority. (Other attributes might be a function to return the number of items in the queue, the ability to peek at the item with the highest priority but not retrieve it, and so on, but let’s keep it simple for now.)

Traditionally, the first attribute (storing an arbitrary number of items) has been implemented by a linked list. This structure gives us a extensible container without too much overhead or too many efficiency constraints. Let’s be different, however, and use the data structure we’ve already mentioned, the JavaScript array. It can be grown; accessing items is fast; efficiency suffers a little when deleting a item from the middle of the list. Since we won’t be using this first design anyway (we’re just illustrating the concept of a priority queue at the moment, I’ll be revealing a better algorithm shortly), this last problem won’t cause us too much heartache. So we’ll use an internal array to store the items we get.

The next attribute (adding an item to the queue) is easy with our internal array: just call the push() method. We’ll take the view that the items we add to the priority queue will be objects of some description, with their priority as a property of the object. If we want the priority to be separate from the objects (or strings, or integers, or whatever) we add to the priority queue, then the details just become a little more involved (create a super-object that contains the original object and priority and push that to the array instead) and will just obscure what we’re trying to show.

The third attribute (finding the highest priority and returning the associated object, removing it from the priority queue in the process) is a teensy bit more involved but still quite simple. Essentially we iterate through the items in the array (the forEach() method) and for each item we see whether its priority is larger than the largest priority we’ve found so far. If it is, we take note of the index of the item in the array with this newer largest priority, and move on to the next item. After we’ve checked all of the items in the array, we know which is the largest (we took note of its index) and so we just remove it from the array (using the splice() method) and pass it back.

function createPriorityQueueArray(comparer) {
  "use strict";
  var queue = [];

  var push = function (item) {
    queue.push(item);
  };
  
  var pop = function () {
    if (queue.length === 0) {
      return null;
    }

    var indexOfHighest = 0;
    var highest = queue[0];
    queue.forEach(
      function(item, index) {
        if (comparer(item, highest) > 0) {
          highest = item;
          indexOfHighest = index;
        }
      }
    );

    queue.splice(indexOfHighest, 1);
    return highest;
  };

  return {
    push: push, 
    pop: pop
  };
}

The code in the above listing shows this simple priority queue. It uses a comparison function that you set up in order to determine whether an item’s ‘priority’ is greater than another’s. The priority queue therefore doesn’t need to know how to compare priorities (and hence whether they’re numbers or strings or something else): it just calls the comparison function, passing the two items whose priorities it needs to compare. This function is assumed to return a positive value if the first argument’s priority is greater than the second’s, 0 if they’re equal, or a negative value otherwise. Note also that the queue also doesn’t need to know what the items are, it just stores them.

(Aside: createPriorityQueueArray() is a function that returns an object. That object has just two methods, push() and pop(). Yes, I could have set up a class with a constructor to do the same, but I’ve kind of weaned myself off class hierarchies when using JavaScript. Your mileage may vary, etc.)

Here’s some basic code that shows this priority queue in operation:

function simpleComparer(a, b) {
  "use strict";
  if (a > b) { return 1; }
  if (a < b) { return -1; }
  return 0;
}

var PQ = createPriorityQueueArray(simpleComparer);
PQ.push(5);
PQ.push(7);
PQ.push(1);
PQ.push(3);
PQ.push(9);

console.log("pop (should be 9): ", PQ.pop());
console.log("pop (should be 7): ", PQ.pop());
console.log("pop (should be 5): ", PQ.pop());
console.log("pop (should be 3): ", PQ.pop());
console.log("pop (should be 1): ", PQ.pop());

Pretty good, I’d say. From a discussion of a concept to a working data structure in a few paragraphs. Before we get too carried away though, let’s think about the efficiency of our design. Firstly, adding an item will get done in pretty much constant time (ignoring time lost due to growing the array – I’m assuming here that JavaScript creates the array with some “spare” space at the end). In other words, adding an item to an empty queue or to a queue containing thousands of items will take roughly the same amount of time. In computer science speak, we say that the algorithm is O(1) - big-Oh of 1 – that is, no matter how many items there are, we consider the time taken to be constant.

Now let’s look at the opposite operation: removing an item. Here, we need to read through all of the items in the array, in order to find the one with the highest priority. The time taken for this operation with a queue with one item would obviously take a shorter amount of time that one with thousands of items. The time taken, in fact, is proportional to the number of items, or in computer science speak, O(N). (Again, I’m making some assumptions about how splice() works, that it leaves spare space at the end of the array rather than creating a new array and copying everything over. I could be cleverer/more efficient and move the last item to the position where the item with the highest priority is, and then delete the last item in the array, but I leave that to the interested reader.)

So we have a structure that implements a priority queue in which adding an item is an O(1) operation and removing it is an O(N) operation. For small numbers of items this structure is perfectly acceptable and usable.

I Like It

But, and this is a fairly big but, we can easily do better. I’m sure you can think of one efficiency improvement straight away: maintain the internal array in priority order, keep it sorted. If you think about it, this means we shift the work of the queue from item removal to item insertion. When we add an item we find its correct place inside the array, which is after all items with lower priority and before all those with higher priority. If we do this extra work during the add phase, the array will have all the items in priority order and hence, when we remove an item, all we need to do is to delete the last item. Pretty simple, huh? In fact, removal becomes an O(1) operation (we know exactly where the item with the largest priority is - it’s at the end, so removing it doesn’t depend on how many items there are).

function createPriorityQueueArray2(comparer) {
  "use strict";
  var queue = [];

  var push = function (item) {
    queue.push(item);

    if (queue.length > 1) {
      var i = queue.length - 2;
      while ((i >= 0) && (comparer(queue[i], item) > 0)) {
        queue[i + 1] = queue[i];
        i -= 1;
      }
      queue[i + 1] = item;
    }
  };
  
  var pop = function () {
    if (queue.length === 0) {
      return null;
    }

    return queue.pop();
  };

  return {
    push: push, 
    pop: pop
  };
}

Calculating the time required for insertion in this sorted array is a little more involved. The simplest way to think of how it’s done is to think of it as an insertion sort: grow the array by one item, and then move items along by one into the spare hole (like beads on a thread) starting from the end of the array. You stop when you reach an item that has a priority less than the one you are trying to insert. You then have a ‘hole’ in the array where you can put the new item. If you think about this for a moment, on average you’d move n/2 items for n items in the array. Hence insertion is an O(N) operation (the time taken is again proportional to the number of items in the queue), although with this improvement the time taken on average would be somewhat less than the previous implementation.

We’ve now moved from fast insertion/slow deletion to slow insertion/fast deletion. Can we do better than this? Yes, we can, but that’s going to have to wait until next time.

Delphi Magazine logo

(A long time ago I used to write a monthly algorithms column for The Delphi Magazine, called Algorithms Alfresco. This was a Delphi programmer magazine in the UK, which sadly closed down in 2007. I've decided to reuse/revamp those articles in a long series here on my blog, but coding in JavaScript instead of Delphi. This article first appeared in the November 1998 issue.)

Loading Zone

Loading similar posts...   Loading links to posts on similar topics...

No Responses

Feel free to add a comment...

Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

  •  Emphasize with italics: surround word with underscores _emphasis_
  •  Emphasize strongly: surround word with double-asterisks **strong**
  •  Link: surround text with square brackets, url with parentheses [text](url)
  •  Inline code: surround text with backticks `IEnumerable`
  •  Unordered list: start each line with an asterisk, space * an item
  •  Ordered list: start each line with a digit, period, space 1. an item
  •  Insert code block: start each line with four spaces
  •  Insert blockquote: start each line with right-angle-bracket, space > Now is the time...
Preview of response