Checking that a Binary Tree is a Heap

published: Mon, 19-Apr-2004   |   updated: Wed, 21-Apr-2004

Every now and then, I check out the search phrases used to access my web site. Some are funny, some are downright weird, and some are fairly thought-provoking. Naturally, given the bias of this site overall, the thought-provoking ones tend to be algorithm-based.

One search phrase that was used recently was "algorithms to test binary tree is a heap". This is an excellent example of a possibly interesting algorithm, and one that I haven't considered before.

First, let's recap what a heap is. A heap is a complete binary tree where every node's key is larger than or equal to its childrens' (should they exist). (Alternately, each node's key could be smaller than or equal to its childrens'; the two alternatives are called a max-heap and a min-heap respectively.)

So what do we have to prove to show that a given binary tree is a heap?

  • That the binary tree is complete.
  • That every node's key is larger than or equal to its childrens'.

We'll take these two attributes in turn.

A complete binary tree is one where all levels of the tree are filled with nodes, except for possibly the bottom row where the nodes are filled in from the left to the right. Let's assume that the binary tree we're testing for completeness is defined in the standard form where each node is a separate object with a key and links to the left and right children. If there is no child in a particular position its link is null (or nil, for the Delphi crowd). We're given the tree to test by being passed the root node (and from this we can access the entire tree).

So, how do we show that the tree is complete? We're not given any information about the tree at all, only its root node, remember. Taking the definition literally, one method would be to visit the nodes in level order, and make sure that there are no gaps. By level order I mean, visit the levels from top to bottom, and for every level, visit the nodes from left to right.

How do we visit nodes in level order? By using a queue, that's how. Enqueue the root into the queue. Now proceed as follows: Dequeue the node at the top of the queue, enqueue its left child (if it has one), then its right child (if it has one), and then process the node we just dequeued. Repeat until the queue is empty. So, to list the keys in level order, the processing we would do for each node would be to print its key. A fairly simple algorithm, but one that's not generally known compared to the usual pre-order, in-order, and post-order traversals.

So how can we change this to check for completeness? Well, we use a little trick. Proceed as for the general level-order traversal algorithm, except that when we find the a missing child, we enqueue a special value. We don't try and enqueue missing children of missing children, for that way lies infinity. No, just enqueue the special value for missing children of existing nodes. For each node, we don't have to do any processing.

Now, eventually, we shall dequeue one of these special values. What do we do then? Well, we switch processing: we check that every single remaining item in the queue is one of these special values. If there is an ordinary node left in the queue, we had a gap in the binary tree and therefore the tree was not complete.

(It seems we could improve a little on the algorithm: in reality we only need to enqueue one special value, one for the very first missing child we encountered. Ignore the second and subsequent missing children. When we dequeue the special value (which we can justifiably call a sentinel value), there should be no nodes left in the queue at all, it should be completely empty. However, the implementation of this "improvement" is full of if statements and much harder to read and understand.)

Next up is to check that the key of every node is greater than or equal to the keys in the node's children. By the way, this is known as the heap property. Compared to the prior check, this is fairly simple. In fact we can incorporate it into the level traversal algorithm. Every time we dequeue a node, all we've done is to use it to enqueue its children. We didn't do any processing on the node. Instead, now we'll do the check that each node's key is greater than or equal to its children.

Here's the code that checks a binary tree to be complete, written in C# using the Queue class in System.Collections. The special value I use for a missing child is of course the value null.

using System;
using System.Collections;

namespace CheckForHeap {

    class Node {
        private string key;
        private Node left;
        private Node right;

        public Node(string key) {
            this.key = key;
        }

        public string Key {
            get { return key; }
        }

        public Node Left {
            get { return left; }
            set {left = value; }
        }

        public Node Right {
            get { return right; }
            set {right = value; }
        }
    }

    class HeapChecker {
        private HeapChecker() {
        }

        static public bool IsValidHeap(Node root) {

            bool ScanningNulls = false;

            Queue q = new Queue();
            q.Enqueue(root);

            while (q.Count != 0) {
                Node n = q.Dequeue() as Node;

                if (ScanningNulls) {
                    if (n != null) 
                        return false;
                }
                else {
                    if (n == null) {
                        ScanningNulls = true;
                    }
                    else {
                        q.Enqueue(n.Left);
                        q.Enqueue(n.Right);
                        
                        if (!ValidHeapProperty(n)) {
                            return false;
                        }
                    }
                }
            }
            return true;
        }

        static private bool ValidHeapProperty(Node n) {
            if ((n.Left != null) && (n.Key.CompareTo(n.Left.Key) < 0))
                return false;
            if ((n.Right != null) && (n.Key.CompareTo(n.Right.Key) < 0))
                return false;
            return true;
        }
    }

    class Tester {

        static void CheckHeap(Node root) {
            if (HeapChecker.IsValidHeap(root)) 
                Console.WriteLine("passes");
            else 
                Console.WriteLine("FAILS");
        }

        static void CheckNotHeap(Node root) {
            if (!HeapChecker.IsValidHeap(root)) 
                Console.WriteLine("passes");
            else 
                Console.WriteLine("FAILS");
        }

        static void Main(string[] args) {

            Console.WriteLine("Checking null tree is heap");
            CheckHeap(null);

            Console.WriteLine("Checking single node tree is heap");
            CheckHeap(new Node("aaa"));

            Console.WriteLine("Checking tree with root and its left child is heap");
            Node root = new Node("ccc");
            root.Left = new Node("bbb");
            CheckHeap(root);
            
            Console.WriteLine("Checking tree with root & two children is heap");
            root.Right = new Node("aaa");
            CheckHeap(root);
            
            Console.WriteLine("Checking tree with root bigger child is not heap");
            root.Right = new Node("ddd");
            CheckNotHeap(root);

            Console.WriteLine("Checking tree with just right child is not heap");
            root.Left = null;
            CheckNotHeap(root);

            Console.ReadLine();
        }
    }
}

And there you have it: the implementation of an algorithm that can verify that a given binary tree is a heap.