Support JS-style comments in JSON

Apologies if this isn't the right place. If it's not, feel free to redirect me to the right one.

I propose we extend JSON with comments using JS's grammar:

  • Single-line comments start with // and end with line terminator sequences (CR, LF, LS, or PS).
  • Multi-line comments start with /* and end with */.
  • Implementations should (not "must") treat this as equivalent to whitespace.
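To illustrate the "equivalent to whitespace" rule, here is a minimal sketch of a preprocessor that blanks out comments so an unmodified strict parser can read the result. The name stripJsonComments is hypothetical and not part of any standard or library. The sketch assumes comments may appear anywhere whitespace may, and that comment markers inside string literals are left alone.

```javascript
// Hypothetical helper (an assumption, not a standard API): replaces
// JS-style comments with spaces so that JSON.parse can read the result.
function stripJsonComments(text) {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (ch === '"') {
      // Copy a string literal verbatim, honoring backslash escapes.
      let j = i + 1;
      while (j < text.length && text[j] !== '"') {
        j += text[j] === "\\" ? 2 : 1;
      }
      out += text.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "/" && next === "/") {
      // Single-line comment: runs up to the next line terminator.
      let j = i + 2;
      while (j < text.length && !/[\n\r\u2028\u2029]/.test(text[j])) j++;
      // LS and PS are not JSON whitespace, so blank them out as well.
      if (text[j] === "\u2028" || text[j] === "\u2029") j++;
      out += " ".repeat(j - i);
      i = j;
    } else if (ch === "/" && next === "*") {
      // Multi-line comment: blank everything through */, keeping CR and LF.
      const end = text.indexOf("*/", i + 2);
      if (end === -1) throw new SyntaxError("Unterminated comment");
      out += text.slice(i, end + 2).replace(/[^\n\r]/g, " ");
      i = end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

// Usage: strip first, then hand the text to the ordinary strict parser.
const source = '{\n  // listening port\n  "port": 8080 /* default */\n}';
const config = JSON.parse(stripJsonComments(source));
console.log(config.port);
```

Replacing comments with spaces of the same length, rather than deleting them, keeps the offsets the same, so any position a parser reports in an error message still points at the right place in the original file.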

Despite the extremely performance-sensitive nature of JSON parsers, I don't expect this to cause performance problems for existing parsers.

I don't really need to demonstrate the demand here (just search "json with comments" and it'll show up within the first page), and many tools (ranging from ESLint and npm to Visual Studio and even Chrome in a few places) support it already. .NET has native support for it (off by default) in its built-in JSON parser. There are also many spec extensions providing just that. I feel at this point it's time to place it in the language proper, RFC update and all. If anything, it's already starting to result in a partial ecosystem fork (.NET's standard library and the increasing use of JSON5 in place of native JSON are evidence of this), and that's not exactly a good thing for interoperability.

There are sooo many JSON parsers out there (including many legacy ones that aren't supported anymore) that I doubt we can just update the JSON standard the same way we can with ECMAScript. The only safe way I can think of to update the standard would be to make a separate standard for it, call it something else, and have everyone get on board with it. This unfortunately seems to be what many groups of people have already tried to do independently of each other, and we're really struggling with the whole "get everyone on board with a single new standard" thing. I guess it could help if whoever's in charge of the JSON standard came out with their own standard, or pointed to an existing standard and declared it as the next version. But the "next" version would need to have a different name from just JSON, so old JSON parsers can still be considered valid.

As the previous comment says, JSON can never be updated without breaking a great many consumers, which we are not inclined to do. We could in principle add a new JSON5.parse or whatever, but it seems premature to do that at this point in time.
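To make the compatibility concern concrete, here is a small sketch showing how a parser written against the current grammar reacts to a document containing a comment. The helper name parsesAsStrictJson is made up for illustration, and the built-in JSON.parse stands in for any existing consumer.

```javascript
// Hypothetical helper: reports whether the built-in strict parser accepts the text.
function parsesAsStrictJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    // JSON.parse signals malformed input with a SyntaxError.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const plain = '{"port": 8080}';
const commented = '{"port": 8080 /* default */}';

console.log(parsesAsStrictJson(plain));     // true
console.log(parsesAsStrictJson(commented)); // false
```

Any producer that starts emitting the second form under the plain "JSON" name would fail against every consumer that behaves like this one.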

I guess it could help if whoever's in charge of the JSON standard came out with their own standard

That's also TC39.

How exactly would making currently invalid syntax valid be potentially breaking?

Of course, legacy, unmaintained parsers won't be updated, but consider that we were able to add two previously invalid characters to the list of valid string characters. I don't see why that would be possible and this wouldn't.

In this case, it's slightly larger, but it still won't impact the semantics of JSON for most uses. (For the few it would, they're probably already using this proposed extension.)

Would love to be able to use comments inside JSON too, although I understand it is complicated.

Actually, I think it would not be such a breaking change. Most of the old or deprecated JSON parsers will keep reading the same old JSON files they have always read, and in most places where the new JSON standard is used, the parser will be updated accordingly.
The case where an old parser is given new JSON files should be quite rare, and if that ever happens, it also means that the project is probably still maintained (so there is a good chance that the parser will be updated too).

Also, if at some point we do decide to update the current JSON standard, I propose to take this opportunity to also add support for trailing commas, because this would also be an easy fix for such a pain.

I would love to have comments and trailing commas in JSON as well. I'm not sure what the performance implications of all of this are for parsers. I doubt there would be any issues, but if there are, I would rather JSON be tailored more to the "readable-data-transfer" idea, and less to the "config file" idea it was never meant to be used for.

As much as I hate to admit it, I also think there's value in having a JSON format that does not support comments - it provides some nice guarantees. For example, maybe you're on a server, with a config folder full of json config files. You might think you could add comments to these files, but if the server ever writes to any of these files, it'll blow away your comments. If some of the files were named .json5 for example, then it would be easier to know which files can have comments and which ones can't.

...I guess this isn't a very strong argument, but it's a little something to consider.

I don't think it's possible or sensible to change the JSON standard, but one could make JSON5 an internet standard and provide native support for it in JavaScript - in addition to, not as a replacement of, JSON.

Note there is also JSON6: GitHub - d3x0r/JSON6: JSON for Humans (ES6)

This whole thing is a bit sticky. JSON is intended for data transfer, and is supposed to be written by computers, not humans, and yet we use it for config files and whatnot all the time. Why? Maybe because it's so easy to do so. Most major languages come with good JSON support built in. Many other config-file parsers have to be installed and are more cumbersome to use (it's hard to beat the simplicity of JSON.parse()).

As much as I would love to add support for comments, trailing commas, etc., I worry that we'll deviate too far from its intended purpose and turn JSON into a configuration language. If we add comments, someone else will ask for multiline string support. We add that, then someone will ask for another thing. All of these changes, while convenient for configuration, have nothing to do with data transfer, and add complexity to JSON parsers that are only being used for data-transfer purposes (think of IoT devices that need to have really slim parsers).

Yes, I realize I employed the slippery-slope fallacy there, but I think it's true to a degree. Whatever argument we give for one desired configuration-specific feature can apply to many. Why should one feature be let in and not another?

Maybe we need to look a little deeper and see what we can do to uproot the real issue, which is the fact that we keep using JSON for config files. Maybe what's needed is an entirely new, JSON-like proposal that looks like JSON and will receive native language support just like JSON, but will have a number of nice-to-have config-file-related features and will be a living proposal, updated periodically to include whatever new features the general public wants.

We can base this new standard off of JSON5 or whatever, then deviate from it as needs be.

I guess this isn't very different from what we're already saying, just reinforcing the idea that this should be a completely different standard from JSON, and should be updated more frequently than the JSON standard.

The biggest issue with creating something else is you run the risk of just creating another thing that nobody actually adopts. So far, JSON5 and YAML seem to be the only two things anyone's attempting, but as I've noted above, there are several different JSON extensions that support comments in subtly different ways. I was hoping there could be a way to standardize those either as part of JSON (ideally) or as a standard that sits alongside JSON and is likewise maintained by TC39.

Also, just to be clear, I'm explicitly not proposing anything beyond this. I'm not even proposing adding trailing commas, and in fact I'd even question the value of it. What makes mine special is that we've already got some proliferation of apps and utilities just silently accepting and ignoring comments, as well as at least .NET supporting it natively, or else I wouldn't have even bothered (I know it's virtually impossible to get changed).

The damage has already been done a long time ago, in many different ways. It's a simple syntax, and people love simple syntaxes for configuration files.

I guess I just used that exact argument described in XKCD, didn't I ...

Though, if we decide that we don't want to modify the existing JSON standard, and would rather create a "next version" standard, then what I proposed doesn't seem too far-fetched. It wouldn't be that hard to make that next-version standard pretty close to the original, then let it drift over time, and hopefully people will be willing to incorporate it as part of their JSON-parsing utilities, e.g. in JavaScript just add a JSON.nonStrictParse(text), in Python add a json.loads(text, strict=False), or whatever. In fact, I'm not sure there's any other way to do it: if we create a new standard that's tailored for JSON as config, then it'll naturally receive a ton of requests to be updated in different ways, and that should be fine, because we just created a new living standard for this purpose.

Maybe we even start by just adding support for comments. As long as it's clear that this new standard will be updated periodically.

The principle you are espousing is a good one. Legacy issues make it all but impossible.

That being said, I have always liked the idea of a JavaScript Active Notation which would include not only JSON but also comments and fat-arrow (this-less) functions. This format would provide the ability to generate itself as output or interpret itself as input. This would eliminate ingestion and export as separate functions related to the data.

I doubt it will get traction, but how cool would it be if one could GraphQL in and out of REST APIs without writing extra code?

I'm not advocating that we should implement comments, nor am I against such a proposal.
I'm mostly thinking about the backward compatibility aspect of it all.

If package.json had comments and we needed to add some dependencies to it,
then it would not be as simple as just doing:

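const conf = JSON.parse(pkg)           // pkg: the text of package.json
conf.dependencies[name] = version      // add or update one dependency
write(JSON.stringify(conf, null, 2))   // write: whatever saves the file back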

Parsing the JSON would strip out all the comments, so writing anything back loses them.
So you would need a better tool, one that can tell programmatically what the comments are and where they sit in the text file.
If we are going to introduce a new parser, then I would also suggest a lazy one that doesn't need to parse everything.

json = new JSON5(text)
json.get('user.preferences.food[0]')
json.get('user.preferences.food').forEach(xyz)
json.set('user.preferences.food[0]', 'pizza')
json.stringify() // preserves comments as is.
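As a rough sketch of the "know where the comments are" part, the scanner below records every comment together with its offsets in the source text. The function name findComments and the shape of the returned objects are assumptions made for illustration. It only locates comments; deciding where to re-attach them after the data has been modified is the harder problem and is not attempted here.

```javascript
// Hypothetical helper: lists each comment with its start/end offsets so a
// tool could keep track of them while rewriting the surrounding data.
function findComments(text) {
  const comments = [];
  let i = 0;
  while (i < text.length) {
    if (text[i] === '"') {
      // Skip string literals so "//" inside a value is not taken for a comment.
      i++;
      while (i < text.length && text[i] !== '"') i += text[i] === "\\" ? 2 : 1;
      i++;
    } else if (text.startsWith("//", i)) {
      // Single-line comment: ends before the next line terminator.
      let end = i;
      while (end < text.length && !/[\n\r\u2028\u2029]/.test(text[end])) end++;
      comments.push({ start: i, end, text: text.slice(i, end) });
      i = end;
    } else if (text.startsWith("/*", i)) {
      // Multi-line comment: ends after the closing */.
      const close = text.indexOf("*/", i + 2);
      if (close === -1) throw new SyntaxError("Unterminated comment");
      comments.push({ start: i, end: close + 2, text: text.slice(i, close + 2) });
      i = close + 2;
    } else {
      i++;
    }
  }
  return comments;
}

const pkg = '{\n  // pinned on purpose\n  "dependencies": { "left-pad": "1.3.0" }\n}';
console.log(findComments(pkg));
```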
