JavaScript for C# programmers: object inheritance (part 2)

Continuing to learn JavaScript from the viewpoint of a die-hard C# programmer, using Firebug as our test engine.

In this episode, we continue writing the expression evaluator. (Please review part 1 before continuing.) You might want to have an extra browser open at the C# code, so you can follow along.

The next thing on the agenda are the result objects. In reality, the way I wrote the original code, there is only one failed result, whereas the successful result contains the RPN expression (or token, come to that) of the bit of the algebraic expression we got to. So let's code that up:

var failedResult = {
    failed: true,
    expression: null
};

var makeSuccessfulResult = function(expression) {
    var result = Object.create(failedResult);
    result.failed = false;
    result.expression = expression;
    return result;
};

Having coded it up, I'm not that happy with it. It seems as if I'm trying way too hard to force a class inheritance model approach to what are, after all, very simple objects. It would work equally well without the call to Object.create in there. I'll ignore my doubts for the moment and forge on.

Next is the parserState object. There's only one of these, so no inheritance semantics to worry about.

var makeParserState = function(expression) {
    var expr = expression;
    var position = -1;
    var current;
    var advancePosition = function() {
        position++;
        if (position >= expression.Length) {
            current = '';
        }
        else {
            current = expression.charAt(position);
        }
        return current;
    };

    advancePosition();

    return {
        getCurrent: function() { return current; },
        advance: function() { return advancePosition(); }
    };
};

I've coded this as a standard closure type function to give the state some privacy. The object that's returned has two public methods, getCurrent and advance, but a whole set of private members from the closure. There's the original expression string that we're going to read through, the current position, the current character, and a method to advance the string pointer and read the next character. (Remember that JavaScript has no character type; a character is represented as a one-character string.)

After I wrote it like this I discovered that charAt will return the empty string if the index is out of bounds; I had my C# hat on, obviously: I was assuming that it would throw an exception. So the private advancePosition method could be rewritten as

    var advancePosition = function() {
        position++;
        current = expression.charAt(position);
        return current;
    };

Now we get to the fun stuff, the actual parsing. In the original C# code, I wrote it all as a static class with static methods, but for now I wrote it in JavaScript as a set of methods. This is certainly not the best way, but I will get to that later.

First, parsing the operators:

var parseAdd = function(state) {
    var current = state.getCurrent();
    if ((current === '+') || (current === '-')) {
        state.advance();
        return makeSuccessfulResult(operator[current]);
    }
    return failedResult;
};

var parseMultiply = function(state) {
    var current = state.getCurrent();
    if ((current === '*') || (current === '/')) {
        state.advance();
        return makeSuccessfulResult(operator[current]);
    }
    return failedResult;
};

Notice something subtle going on. The call to makeSuccessfulResult is being passed a token and not an expression, as the implementation of that method would indicate. Keep that thought that at the back of your mind for now.

Parsing a parenthesized expression:

var parseParenthesizedExpression = function(state) {
    if (state.getCurrent() !== '(') { return failedResult; }
    state.advance();

    var result = parseExpression(state);
    if (result.failed) { return result; }

    if (state.getCurrent() !== ')') { return failedResult; }
    state.advance();

    return result;
};

This makes a call to parseExpression that we haven't written yet, to parse the bits in between the parentheses.

Parsing a number:

var parseNumber = function(state) {
    var current = state.getCurrent();
    var value = '';
    
    while (('0' <= current) && (current <= '9')) {
        value += current;
        current = state.advance();
    }
    
    if (current === '.') {
        value += '.';
        current = state.advance();
        while (('0' <= current) && (current <= '9')) {
            value += current;
            current = state.advance();
        }
    }

    var number = +value; // force conversion to number type
    if (isNaN(number) || !value) { return failedResult; }
    return makeSuccessfulResult(makeNumber(number));
};

This involves a couple of tricky bits, so follow along as I describe them. The value variable is going to be a string that we'll grow with the number-like characters we find in the expression. (Number-like in this respect means the digits and the decimal point — sorry, no internationalization yet). So we first gather all the digits we can. If there's then a decimal point, we add that, and then gather all the digits after the decimal point. Fun bit next: we force the string to be converted into a number type. We do this by using JavaScript's interpreter: we start the expression off with a plus sign. JavaScript will decide that the expression is going to be a number since this plus sign could only be a unary plus operator. We then give it the value string. Since the interpreter is in "making a number" mode, it will convert the string to a number which is what we want. (The alternative is to use parseFloat, or to use something like "1 * value".)

The big problem with this is if we start this method off with the current character not being a digit (say, a letter), then the string-to-number conversion will produce 0 without error. So we fail the parse if either the conversion failed (the value of number will then be NaN) or if the string is empty (remember an empty string is equivalent to false, so !value would evaluate to true).

Parsing a factor is easy, it's either a parenthesized expression or a number:

var parseFactor = function(state) {
    if (state.getCurrent() === '(') {
        return parseParenthesizedExpression(state);
    }
    else {
        return parseNumber(state);
    }
};

Parsing a term and parsing the expression are roughly the same (and I haven't yet extracted out the commonality, like I did with the C# code).

var parseTerm = function(state) {
    var operand = parseFactor(state);
    if (operand.failed) { return operand; }
    var rpn = operand.expression;

    var operator = parseMultiply(state);
    while (!operator.failed) {
        operand = parseFactor(state);
        if (operand.failed) { return operand; }
        rpn = joinRpnParts(rpn, operand.expression, operator.expression); 
        operator = parseMultiply(state);
    }

    return makeSuccessfulResult(rpn);
};

parseExpression = function(state) {
    var operand = parseTerm(state);
    if (operand.failed) { return operand; }
    var rpn = operand.expression;

    var operator = parseAdd(state);
    while (!operator.failed) {
        operand = parseTerm(state);
        if (operand.failed) { return operand; }
        rpn = joinRpnParts(rpn, operand.expression, operator.expression);
        operator = parseAdd(state);
    }

    return makeSuccessfulResult(rpn);
};

Both make use of a routine called joinRpnParts to stitch together the operands and operator postfix style.

var joinRpnParts = function(first, second, operator) {
    var result = makeExpression();

    var addTokens = function(operand) {
        if (operand.isToken) {
            result.add(operand);
        }
        else {
            operand.forEach(function(token) { result.add(token); });
        }
    };

    addTokens(first);
    addTokens(second);
    addTokens(operator);

    return result;
};

makeExpression is a slightly changed version of rpnExpression from last time. I added a isToken field to both the token ancestor and the RPN expression objects so that I could tell them apart. This is where I tried to resolve the token versus expression problem I alluded to before: I thought I was being clever here in using the lack of type safety to help me (and feeling all JavaScripty about it) and then having to hack this "well, is it a token or not" boolean in two different places (which to me says I'm not being JavaScripty enough). We'll sort it out later.

addTokens is just a helper function to add the a part onto the RPN expression. Before you get excited by the forEach call there, it's a function I wrote for the RPN expression object:

var makeExpression = function() {
    var expr = [];
    return {
        isToken: false,

        add: function(node) {
            expr.push(node);
        },

        clear: function() {
            expr = [];
        },

        evaluate: function() {
            var stack = [];
            for (var i = 0; i < expr.length; i++) {
                var token = expr[i];
                if (token.isOperator()) {
                    var rhs = stack.pop();
                    var lhs = stack.pop();
                    stack.push(token.evaluate(lhs, rhs));
                }
                else {
                    stack.push(token.value);
                }
            }
            return stack.pop();
        },

        forEach: function(action) {
            for (var i = 0; i < expr.length; i++) {
                action(expr[i]);
            }
        }
    };
};

As you can see, the forEach method takes an action function to call for each token in the expression array, and that for joinRpnParts just adds it to the expression being stitched together.

Finally we need a function to tie it all together:

var evaluateExpression = function(expression) {
    var state = makeParserState(expression);
    var result = parseExpression(state);
    if (result.failed) { return NaN; }
    return result.expression.evaluate();
};

console.log(evaluateExpression("1+2")); // outputs 3
console.log(evaluateExpression("1+2*3")); // outputs 7
console.log(evaluateExpression("((1+2)*(3-4))+((5*6)-(7/8))")); // outputs 26.125

Now that I've shown you all the code, I can reveal that I'm just not happy about it. Yes, it's a pretty close translation of the C# code to JavaScript (and I did copy/paste code from the C# implementation to help move things along — just a case of removing all the type identifiers, mostly), but it's just not good JavaScript to me.

Let me enumerate the problems as I see them:

So, next time, some major refactoring. We're going to be JavaScripty if it kills us.

Album cover for The Mirror Conspiracy Now playing:
Thievery Corporation - Le Monde
(from The Mirror Conspiracy)

Loading similar posts...   Loading links to posts on similar topics...

7 Responses

 avatar
#1 Sandeep said...
07-Mar-09 1:13 AM

Nice work sir keep posting such good stuff

 avatar
#2 Sandeep said...
07-Mar-09 1:14 AM

nice work

#3 Dew Drop - March 7, 2009 | Alvin Ashcraft's Morning Dew said...
07-Mar-09 7:31 AM

Pingback from Dew Drop - March 7, 2009 | Alvin Ashcraft's Morning Dew

julian m bucknall avatar
#4 julian m bucknall said...
07-Mar-09 9:28 AM

Sandeep: Glad you're enjoying the series. The next one gets hairy :).

Cheers, Julian

#5 JavaScript for C# programmers: object inheritance (part 2) : Algorithms for the masses - Julian M Bucknall said...
07-Mar-09 9:32 AM

Thank you for submitting this cool story - Trackback from DotNetShoutout

#6 My Bad Attitude » JavaScript for C# programmers: object inheritance (part 2) said...
07-Mar-09 6:43 PM

Pingback from My Bad Attitude » JavaScript for C# programmers: object inheritance (part 2)

#7 JavaScript for C# programmers: refactoring the expression evaluator said...
08-Mar-09 2:53 PM

Another in the series in learning JavaScript from the viewpoint of a C# programmer, using Firebug as our test engine. In this episode, we take the functioning expression evaluator from the last post and clean it up JavaScript style. Wrap in an object

Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

  •  Emphasize with italics: surround word with underscores _emphasis_
  •  Emphasize strongly: surround word with double-asterisks **strong**
  •  Link: surround text with square brackets, url with parentheses [text](url)
  •  Inline code: surround text with backticks `IEnumerable`
  •  Unordered list: start each line with an asterisk, space * an item
  •  Ordered list: start each line with a digit, period, space 1. an item
  •  Insert code block: start each line with four spaces
  •  Insert blockquote: start each line with right-angle-bracket, space > Now is the time...
Preview of response