Beware standards: a JSON story

Actually it’s more like a couple of little cautionary tales in one.

If you’re a developer, I’m sure you’re familiar with JSON, a very simple data format used for transferring data from computer to computer, from application to application. So simple, in fact, that it’s pretty much replaced XML as the way to transfer data over the internet. It was originally specified by Douglas Crockford as a natural extension of his work with JavaScript, with the name standing for “JavaScript Object Notation”.

The first tale is this. Paul Usher, DevExpress Tech Evangelist par excellence, and I were preparing the content for some webinar or other, and part of that was coding up some JSON to display in some client-side widget or other. He had an existing URL that returned JSON data, but there was an issue. He thought it was due to a problem with the widget (he was right), but I, for an unfathomable reason, fixated on the JSON. The returned data was coming in as a JSON array but I was 100% certain that JSON meant that the data should be an object (you know, because the “ON” part means Object Notation after all), with that object having a single property that was an array.

{ "data" : [ … ] }

I jumped over to json.org, Crockford’s site, and, yes, it seemed to imply that I was correct: object was at the top of the hierarchy. (As it happens, a few years back, that’s how I coded up an extension to the blog engine I use: I needed to return an array to the client, so I used an object with a single data member.) Yet…

Further investigation (ECMA-404, RFC 7159, etc.) refuted that assertion. These documents expressly state that (ignoring whitespace) JSON-text is a value, and a value is either a string, a number, an object, an array, or one of the three literal tokens true, false, or null. So a JSON packet can be not only an object, as I’d assumed all along, or an array, as I’d been shown in practice, but also a “primitive” type like a number or a string. And as for the three preset values, well.
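To make that concrete, here is a small sketch using the json module from Python's standard library, which accepts any value at the top level. The sample strings are my own illustrations, not taken from either standard.

```python
import json

# Each string below is a complete JSON text: an object, an array,
# a string, a number, and the three literal tokens.
samples = ['{"data": [1, 2]}', '[1, 2]', '"hello"', '42', 'true', 'false', 'null']

# json.loads maps each JSON value onto the corresponding Python type.
parsed = [json.loads(text) for text in samples]

for text, value in zip(samples, parsed):
    print(f"{text} parses to a Python {type(value).__name__}")
```

None of these raises an error, so at least this parser agrees with the standards about what may sit at the top level.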

In fact, this leads to the second thought: I don’t feel so bad about my long-held misunderstanding. It seems that there are JSON parsers out there that also don’t know about this top-level definition and always assume an object or an array. (This older JSON text checker makes this mistake, for example.) In their partial defense, the earlier RFC 4627 really did restrict a JSON text to an object or an array, and it was the later documents that relaxed this. Now I’ll agree that the vast majority of JSON passed around only uses those two types of value, but that’s no excuse. (By the way, handy hint: if you really did want to pass back true or one of the other preset values, wrap it in [ ] and you can ignore that nagging JSON parser you may be forced to use.)
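The wrapping hint could look something like the sketch below. The helper names encode_wrapped and decode_wrapped are hypothetical, made up for this illustration; the only assumption is that both ends agree to always send a one-element array.

```python
import json

def encode_wrapped(value):
    # Hypothetical helper: always emit a one-element array, so the
    # top level of the JSON text is never a bare primitive.
    return json.dumps([value])

def decode_wrapped(text):
    # Hypothetical helper: the reverse of encode_wrapped. Insist on
    # exactly one element, then hand back that element.
    items = json.loads(text)
    if not isinstance(items, list) or len(items) != 1:
        raise ValueError("expected a one-element JSON array")
    return items[0]

wire = encode_wrapped(True)      # the text that goes over the wire
received = decode_wrapped(wire)  # the original value at the other end
```

The cost is that the receiver has to know about the convention, which is why it only makes sense when you control both sides.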

Mind you, the whole issue of what the definitive JSON standards actually are is pretty fraught and you start to feel bad for the developers writing parsers. We in the real world though just have to bite the bullet: we’re pretty much given the parser so it’s up to us to make sure the JSON we produce matches what our (defective?) parser can understand.

Which kind of leads onto the third bit. I remember a while back reading about a developer who was converting a system that used XML as a data transfer format to JSON. I think the rationale was that JSON was less “wordy” and more compact, but I may be misremembering. Anyway, they did the conversion, generated the JSON, but the tests were failing because the JSON was invalid. The reason? The original XML files had a comment block at the top, containing values like a copyright statement, date created, and other bits of documentation, and they assumed that they could port that across too. After all, it’s all JavaScript, right? Double slash at the start of a line and BOOM! we’re done, move on. Wrong: the JSON format has no support for comments at all.
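A quick sketch of that failure, again with Python's json module. The sample texts are invented, and the "_comment" key is just one common workaround I am assuming here, not anything the standards define: it moves the documentation into the data as an ordinary string member.

```python
import json

def is_valid_json(text):
    # Report whether the standard-library parser accepts the text.
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# A JavaScript-style comment ported straight across from the old header.
with_comment = '// Copyright notice, date created\n{"data": [1, 2, 3]}'

# Assumed workaround: carry the same information as a normal member.
# The key name "_comment" is arbitrary.
with_member = '{"_comment": "Copyright notice, date created", "data": [1, 2, 3]}'

comment_ok = is_valid_json(with_comment)
member_ok = is_valid_json(with_member)
```

The first text is rejected and the second is accepted, at the price of consumers having to ignore a member they don't care about.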

So, all in all, be aware that the JSON standards landscape is not as smooth and sunny as you may have assumed, or as the various toolkits and libraries would like you to believe in their advertising. Don’t try to be too clever, and be prepared to simplify if necessary.

And a top-level array is perfectly valid JSON.


1 Response

#1 Juan Perez said...
26-Dec-16 4:52 AM

It seems that in some cases it could be dangerous: http://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx/
