Which one among JSON and XML is the best format to release annotated texts?

Question

Which one among JSON and XML is the best format to release annotated texts?

commented Aug 17, 2015 by bsmith89 (0 points)

commented Aug 17, 2015 by Daniel Mietchen (2.8k points)

commented Aug 17, 2015 by Scott Chamberlain (470 points)

commented Aug 17, 2015 by Alexander Konovalov (155 points)

commented Aug 17, 2015 by Franck Dernoncourt (725 points)

2 Answers

Rex Kerr · Answer 1 · 2015-08-09T18:06:29+0000

If the structure of the data is not exceedingly complex, you should favor JSON: it is faster and easier to parse automatically, and also easier for humans to read. In principle, an XML schema could be used to automatically identify the parts of your data, but in practice there are so many competing conventions for presenting data that this doesn't really work.

Thus, if you have to choose one, JSON. If you want to be super-nice, you could provide both.
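To illustrate how little code the JSON route can need, here is a minimal sketch. The record layout (a `text` field plus stand-off `annotations` with `start`/`end` character offsets and a `label`) is an assumption made up for this example, not a standard; your corpus may well call for a different shape.

```python
import json

# Hypothetical record: the raw text plus stand-off annotations
# expressed as character offsets into that text.
doc_json = """
{
  "text": "Marie Curie was born in Warsaw.",
  "annotations": [
    {"start": 0, "end": 11, "label": "PERSON"},
    {"start": 24, "end": 30, "label": "LOCATION"}
  ]
}
"""

doc = json.loads(doc_json)

# Recover each annotated span by slicing the text with its offsets.
spans = [
    (doc["text"][ann["start"]:ann["end"]], ann["label"])
    for ann in doc["annotations"]
]

for surface, label in spans:
    print(f"{label}: {surface}")
```

Keeping the text untouched and the annotations separate makes the file trivial to load, at the price of offsets that break if anyone edits the text.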

There is one problem with JSON: it cannot represent infinite or not-a-number floating-point values. If you have lots of these in your data, you need some way to deal with them. Your favorite tools probably already have their own workaround; alas, none of them is standard yet. But XML doesn't even have a standard for how to present a number, so you're still modestly ahead with JSON.
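As a sketch of what such a workaround might look like in Python: by default the standard `json` module writes the non-standard tokens `NaN` and `Infinity`, and passing `allow_nan=False` makes it raise `ValueError` instead. The helper `encode_special` and its string-tag convention are hypothetical, invented here for illustration. Whoever reads the file has to know the convention to decode it.

```python
import json
import math

def encode_special(value):
    """Replace non-finite floats with string tags (a made-up convention)."""
    if isinstance(value, float) and not math.isfinite(value):
        if math.isnan(value):
            return "NaN"
        return "Infinity" if value > 0 else "-Infinity"
    if isinstance(value, dict):
        return {key: encode_special(item) for key, item in value.items()}
    if isinstance(value, list):
        return [encode_special(item) for item in value]
    return value

scores = {"confidence": [0.9, float("nan"), float("inf")]}

# Strict mode refuses non-finite floats rather than emitting invalid JSON.
try:
    json.dumps(scores, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# After tagging, the output is plain, standards-compliant JSON.
portable = json.dumps(encode_special(scores), allow_nan=False)
print(strict_ok)
print(portable)
```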

This post has been migrated from the Open Science private beta at StackExchange (A51.SE)

Robin Berjon · Answer 2 · 2015-08-10T06:45:38+0000

JSON is generally poorly suited for markup. Unless the structure of the documents is extremely simple, it will be difficult for humans to read the source and make much sense of it. To the best of my knowledge, JSON is used in some text-oriented systems, but only to save an internal representation that the application makes use of, not for interchange purposes.

For document-oriented content, XML will work much better. You can interleave content and structure more readably. Don't bother with anything like XML Schema; it's absolutely useless for documents. If you can reuse an existing language, even better, as your users might then be able to reuse tooling they already have.
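Here is a rough sketch of what interleaving content and structure looks like, and how it can be read back with Python's standard library. The element and attribute names (`s`, `name`, `type`) are invented for the example and not taken from any existing language; in practice you would substitute whichever vocabulary you decide to reuse.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline markup: the annotations wrap the words they describe.
xml_doc = (
    '<s><name type="person">Marie Curie</name> was born in '
    '<name type="place">Warsaw</name>.</s>'
)

root = ET.fromstring(xml_doc)

# The running text is still there once the tags are ignored.
plain_text = "".join(root.itertext())

# Each annotation is an element: its text plus its attributes.
entities = [(el.text, el.get("type")) for el in root.iter("name")]

print(plain_text)
print(entities)
```

The source stays readable as a sentence, which is the point being made above about documents.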

Finally, have you considered HTML? It has a lot of tooling (you're likely reading this in one such tool) and a lot of users. It has become pretty good at capturing the core structure of documents, and it has extension points that make it possible to overlay richer semantics onto a document. I would certainly recommend giving it a second look; it's usually a very good choice.
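One possible reading of "extension points" is HTML's `data-*` attributes. The sketch below assumes annotations are stored as `<span data-entity="...">`; the attribute name `data-entity` and the class `AnnotationExtractor` are made up for this example, and nested annotated spans are deliberately not handled.

```python
from html.parser import HTMLParser

class AnnotationExtractor(HTMLParser):
    """Collect (text, label) pairs from <span data-entity="..."> elements."""

    def __init__(self):
        super().__init__()
        self.annotations = []
        self._label = None   # label of the span currently open, if any
        self._buffer = []    # text seen inside that span

    def handle_starttag(self, tag, attrs):
        label = dict(attrs).get("data-entity")
        if tag == "span" and label is not None:
            self._label = label
            self._buffer = []

    def handle_data(self, data):
        if self._label is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._label is not None:
            self.annotations.append(("".join(self._buffer), self._label))
            self._label = None

html_doc = (
    '<p><span data-entity="person">Marie Curie</span> was born in '
    '<span data-entity="place">Warsaw</span>.</p>'
)

extractor = AnnotationExtractor()
extractor.feed(html_doc)
extractor.close()
print(extractor.annotations)
```

The same file opens in any browser as ordinary text, so readers who don't care about the annotations never have to see them.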

This post has been migrated from the Open Science private beta at StackExchange (A51.SE)
