Proposal: Add an annotation system and handier block comments

Abstract

Add indentation-based block comments with annotation support to address the poor adoption of documentation in JS.

Problem

It seems that current JS comments are not very effective and are not widely used for writing documentation. While many developers use JSDoc to describe their APIs, far more don't. I think the problem lies in the peculiar properties of JS block comments.

There are two main causes of this technical debt: doc tooling is expensive for code editor developers to implement, and the current JS syntax. The first is a consequence of the second. I think JS syntax has non-obvious problems and should be updated so that this issue can be solved. These problems are in JS block comments:

  • Block comments cannot be nested.
  • Block comments cannot include blocks of code in JS and other languages without potential harm or additional work.
  • Lack of namespaces.
  • No native parser.

Here is how these problems affect the development process.

1. Block comments cannot be nested

A block comment cannot be put inside another block comment. As a result, a JSDoc code example cannot contain multiline comments. This forces developers to write single-line comments, which are hard to keep tidy when editing a block of text: every new line requires going back to the line start to add the comment marker.

2. Block comments cannot include blocks of code in JS and other languages without potential harm

You cannot just copy/paste code into a block comment if that code contains block comments of its own. The first */ in the pasted code closes the outer comment, which breaks the source code and requires developers to convert the block comments into single-line comments.

3. Lack of namespaces

There is no way to determine which comment is JSDoc and which is just a copyright notice or an ESLint ignore directive without parsing it. And if parsing fails, we cannot emit an error, because it could actually be a valid comment for another tool. To mark a comment as JSDoc, JSDoc uses a second asterisk as a prefix (/** ... */).

4. No native parser

The native JS parser just ignores comments and does not notify the developer if there is a problem with the provided values.

All this leads to a situation where:

  1. Dev-tool developers cannot write dead-simple utilities; they must know every possible exception and use heuristics to detect when things have gone wrong.
  2. Editor developers cannot create reliable tools, because there is no standard.
  3. Developers avoid writing long, detailed comments because of the amount of manual work.
  4. Every dev-tool developer has to create a custom parser, even for pretty simple DSLs.
  5. Dev-tool developers "fight" for the global namespace and have to invent very complex syntax constructions to avoid conflicts.
  6. A malformed annotation must always be treated as valid, because it could belong to another tool.
  7. There is no well-known way to run additional checks with a simple tool or a single command.
  8. Developers' experience is hard to reuse across tools, so we get no synergy.

Motivation

Modern languages and environments have built-in documentation generation tools. Rust and Go publish full documentation alongside their package registries: crates.io (with docs.rs) and pkg.go.dev. Deno also supports documentation generation out of the box.

Some languages, such as Haskell and Elm, have block comments ({- ... -}) that can be nested. Combining nesting with indentation instead of a closing character sequence seems very user-friendly.

Proposal

  1. Add annotations that contain a namespace and an indentation-based block of text.
  2. Add an indentation-based block comment.

Spec

Syntax

Annotation
  = SinglelineAnnotation
  | MultilineAnnotation
  | CommentBlock

SinglelineAnnotation
  = "{?" Tag "?}"
  | "{?" Tag JSValue "?}"

MultilineAnnotation
  = "{?" Tag Newline IndentBlock "?}"
  | "{?" Tag JSValue Newline IndentBlock "?}"

CommentBlock
  = "{?" Newline IndentBlock "?}"

Newline = "\n" | "\r\n"

IndentBlock
  = Indent String Newline IndentBlock
  | Indent String Newline

Warning: JSValue is a JS literal of one of the following types: Null, Boolean, Number, String, Object, Array.
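
As a rough illustration of the grammar above, here is a minimal sketch of a line-based extractor. It is not part of the proposal: the names parseAnnotations, parseHead, and dedent are hypothetical, the JSValue is assumed to be JSON and to sit on the opening line, and a multiline annotation is assumed to end at the first "?}" that has the same indentation as its "{?".

```javascript
// Hypothetical sketch: extract top-level `{? ... ?}` annotations from source text.
// Assumption: JSValue is restricted to JSON and sits on the opening line.
function parseAnnotations(source) {
  const lines = source.split(/\r?\n/);
  const result = [];
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    const trimmed = line.trim();
    if (!trimmed.startsWith("{?")) continue;
    const indent = line.length - line.trimStart().length;

    if (trimmed.length >= 4 && trimmed.endsWith("?}")) {
      // SinglelineAnnotation: "{?" Tag [JSValue] "?}"
      result.push({ ...parseHead(trimmed.slice(2, -2)), block: null });
      continue;
    }

    // MultilineAnnotation or CommentBlock: collect lines up to the "?}"
    // that sits at the same indentation as the opening "{?".
    const body = [];
    let j = i + 1;
    for (; j < lines.length; j++) {
      const current = lines[j];
      const currentIndent = current.length - current.trimStart().length;
      if (current.trim() === "?}" && currentIndent === indent) break;
      body.push(current);
    }
    if (j === lines.length) {
      throw new SyntaxError(`Unclosed annotation at line ${i + 1}`);
    }
    result.push({ ...parseHead(trimmed.slice(2)), block: dedent(body) });
    i = j; // skip the consumed block, so nested annotations stay plain text
  }
  return result;
}

// Split "tag value" into a tag and a JSON-parsed value (both optional).
function parseHead(head) {
  const text = head.trim();
  if (text === "") return { tag: null, value: undefined };
  const space = text.search(/\s/);
  if (space === -1) return { tag: text, value: undefined };
  return {
    tag: text.slice(0, space),
    value: JSON.parse(text.slice(space).trim()),
  };
}

// Remove the indentation shared by all non-empty lines of the block.
function dedent(lines) {
  const widths = lines
    .filter((l) => l.trim() !== "")
    .map((l) => l.length - l.trimStart().length);
  const min = widths.length ? Math.min(...widths) : 0;
  return lines.map((l) => l.slice(min)).join("\n");
}
```

Because the block is consumed as plain text, a nested "{?" never reaches the parser, which is the property the proposal relies on for nesting.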

Additions

To resolve conflicts in package development, annotation tags could be resolved to globally unique values:

  1. For Node.js, an annotations directive could be added to package.json to resolve annotation tags to package names.
  2. For browsers there is no need for such resolution, because annotations should be ignored at runtime.
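
To make the Node.js case more concrete, here is one possible shape for that directive. Everything in it is an assumption for illustration: the layout of the annotations field (a map from a tag namespace to a package name), the resolveTag helper, and the package name @example/man-pages are all made up.

```javascript
// Hypothetical shape of the "annotations" field in package.json:
// a map from a local tag namespace to the package that owns it.
const packageJson = {
  name: "my-app",
  annotations: {
    eslint: "eslint",
    man: "@example/man-pages", // made-up package name
  },
};

// Resolve a tag such as "eslint.disable" to its owning package,
// or return null when the namespace is not declared.
function resolveTag(tag, pkg) {
  const [namespace, ...rest] = tag.split(".");
  const table = pkg.annotations ?? {};
  if (!Object.prototype.hasOwnProperty.call(table, namespace)) return null;
  return { package: table[namespace], name: rest.join(".") };
}
```

With this table, resolveTag("eslint.disable", packageJson) yields the package "eslint" and the name "disable", while an undeclared namespace such as "v8" yields null.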

Examples

Block Comment

A simple block comment, which shouldn't be treated as an annotation or a service comment:

{?
  Hello this is a block comment.
  It uses indentation instead of a closing sequence. Thus it can be nested...
  {?
    ...more than once. Because nested code is just text, not JS source.
  ?}
  This is why such comment can contain almost anything.
?}

JSDoc

JSDoc could be used almost without changes, like this:

{? jsdoc
  Add returns sum of `a` and `b`.

  @param {number} a Left number
  @param {number} b Right number
  @returns {number} Sum of a and b
  @example
     add(1, 2) // -> 3
?}
function add(a, b) {
  return a + b
}

ESLint

ESLint can use single-line annotations:

{? eslint.disable ["eqeqeq", "no-console"] ?}
if (x != null) {
  console.log('Not null-alike')
}

Node debugger

Currently Node.js and V8 rely on the debugger statement for debugging purposes. It could be replaced with an inline annotation:

import run from './app'

{? v8.debug ?}

run()

Markdown and nesting

Safely use Markdown and JS within it:

{? man
  The function squares passed values:

  ```
  // Nested annotation works fine
  {? assert 4 ?}
  square(2)
  ```
?}
function square(a) {
  return a ** 2
}

Notes:

  • Annotations nest easily, because a nested annotation is just part of the text block.
  • A code editor can determine the highlighting mode of a text block by resolving the man tag against a dictionary or by using package.json#annotations.

File metadata

Describe file metadata:

{? meta {
  authors: [{
    name: "Paul Rumkin",
    homepage: "https://rumk.in",
  }],
  date: "2021-02-02",
  license: "MIT",
  tags: ["info"]
} ?}

export const authors = [{
  name: "Paul Rumkin",
  homepage: "https://rumk.in",
}]

Notes:

  • It's possible to replace the author URL in the metadata without affecting the code. This work could be automated.
  • It's possible to add, remove, or update the tags assigned to a source file and use them for autogenerated commit messages.
  • This data could be extracted by dev tools for analysis; e.g., GitHub could use the tags in search.
  • The annotation value could be checked against a JSON Schema to verify its format.
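
As an illustration of the last note, the sketch below checks a meta value against a schema. It does not use a real JSON Schema library: validate and typeOf are hypothetical helpers that understand only a tiny subset of the vocabulary (type, required, properties, additionalProperties: false) for flat objects, and the schema itself is an assumption about what a meta annotation might require.

```javascript
// A tiny, hypothetical subset of JSON Schema, enough to catch a
// misspelled or missing key in a flat `meta` value.
const metaSchema = {
  type: "object",
  required: ["authors", "license"],
  additionalProperties: false,
  properties: {
    authors: { type: "array" },
    date: { type: "string" },
    license: { type: "string" },
    tags: { type: "array" },
  },
};

// Map a JS value to a JSON Schema type name.
function typeOf(value) {
  if (value === null) return "null";
  return Array.isArray(value) ? "array" : typeof value;
}

// Return a list of human-readable errors; an empty list means "valid".
function validate(value, schema) {
  if (typeOf(value) !== schema.type) {
    return [`expected ${schema.type}, got ${typeOf(value)}`];
  }
  const errors = [];
  if (schema.type !== "object") return errors;
  for (const key of schema.required ?? []) {
    if (!(key in value)) errors.push(`missing property "${key}"`);
  }
  for (const [key, item] of Object.entries(value)) {
    const rule = (schema.properties ?? {})[key];
    if (rule === undefined) {
      if (schema.additionalProperties === false) {
        errors.push(`unknown property "${key}"`);
      }
    } else if (typeOf(item) !== rule.type) {
      errors.push(`"${key}" should be ${rule.type}`);
    }
  }
  return errors;
}
```

A value with a misspelled key, for example { authors: [], licens: "MIT" }, produces two errors here (a missing "license" and an unknown "licens"), which is the kind of feedback a plain comment can never give.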

Pros

  1. Standardized behavior helps developers build sustainable and reliable tools.
  2. Other languages could be seamlessly nested in JS code.
  3. The absence of escape sequences in text blocks simplifies manual editing.
  4. Namespaces solve the conflict issue.
  5. Annotations can help to finally solve the decades-old documentation generation issue with better, simpler, and more reliable tools.
  6. An annotation handler could be implemented as a simple function, without the need to write a custom parser in simple cases (TypeScript-style signature):
    function annotationHandler(tag: string, param: any, block?: string) {}

  7. Parsing errors are reported if the annotation syntax is invalid.
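
The handler idea from item 6 could look roughly like the sketch below. The registry, the onAnnotation and dispatch helpers, and the shape of the annotation object ({ tag, value, block }) are assumptions made for this example; the proposal itself only suggests the (tag, param, block) signature.

```javascript
// Hypothetical registry: one plain function per tag, following the
// (tag, param, block) signature suggested in the list above.
const handlers = new Map();

function onAnnotation(tag, handler) {
  handlers.set(tag, handler);
}

// Call the handler registered for an already-parsed annotation.
function dispatch(annotation) {
  const handler = handlers.get(annotation.tag);
  if (!handler) return undefined; // unknown tags are ignored in this sketch
  return handler(annotation.tag, annotation.value, annotation.block);
}

// Example tool: collect the rules disabled via `{? eslint.disable [...] ?}`.
const disabledRules = new Set();
onAnnotation("eslint.disable", (tag, rules) => {
  for (const rule of rules) disabledRules.add(rule);
});

dispatch({
  tag: "eslint.disable",
  value: ["eqeqeq", "no-console"],
  block: null,
});
```

The tool author writes only the handler body; the tag lookup and the parsing of the value happen before the function is called.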

Cons

  1. Indentation-based syntax looks foreign in JS.
  2. Indentation-based syntax cannot be minified with current tools, so such blocks must be either removed or left unchanged.
  3. Inconsistent tabs/spaces might be an issue in some cases.

Alternatively, existing doc tooling could standardise on Rust-style doc comments, which use a triple slash (see the "Documentation" chapter of Rust By Example).

They can be nested. They can include code samples. They are simple to parse. They are backwards compatible with existing ECMAScript standards.

/// Strips the comments from an input string:
/// @param code - the input string
///
/// @example
/// ```ts
/// // prints " some text":
/// console.log(stripComments("/* a comment */ some text"));
/// ```
function stripComments(code: string): string;

Code example from "Triple slash support (C# style)", Issue #160, microsoft/tsdoc on GitHub.

Blocks of single-line comments don't solve the other issues raised (custom parsers, a shared namespace, error-prone behavior). And they still have bad UX.

  • Users cannot just copy/paste such a block as text without the /// prefixes. This reduces interoperability.
  • It's still impossible to use tab indentation inside such a comment block to format code.
  • Uncommenting part of a block and commenting it back produces different code if there is additional indentation after ///.

JS code is used in many environments and can be presented as text, HTML, and other formats across the Web. Sometimes it's just impossible to use smart tools to remove the slashes.

I agree that huge indentation-based blocks are harder to parse, and I will add this to the cons section. But I've never seen comments longer than two or three screens in production code, and I don't think even tens or hundreds of kilobytes of indentation-based comments would be a real issue, especially after binary sources are rolled out.

Documentation is just one part of a bigger problem. I'd like to solve the bigger problem and implement a solution that will serve for decades, not just a patch.

UPD. I couldn't update the cons section because of the platform's restriction on editing old posts.

Also, I've found the dedent proposal, which is at stage 1. So it seems like indentation-based parsing is about to be part of the language, and it's a breaking change.

This breaking change could also be transpiled to backward-compatible code. So it does seem like a solvable problem.
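
To show what such a transpilation step might look like, here is a minimal sketch that rewrites every top-level annotation into ordinary // line comments. The function name toLineComments is hypothetical, and the sketch assumes that a multiline annotation ends at the first "?}" with the same indentation as its "{?"; a real transpiler would work on the token stream, not on raw lines.

```javascript
// Hypothetical transform: turn every top-level `{? ... ?}` annotation into
// `//` line comments so that the output is plain, current JS.
function toLineComments(source) {
  const out = [];
  let closeIndent = null; // indentation of the "{?" we are inside, if any
  for (const line of source.split(/\r?\n/)) {
    const trimmed = line.trim();
    const indent = line.length - line.trimStart().length;
    if (closeIndent === null) {
      if (!trimmed.startsWith("{?")) {
        out.push(line); // ordinary code stays untouched
        continue;
      }
      // A single-line annotation closes on the same line;
      // anything else opens a block that we must track.
      const singleLine = trimmed.length >= 4 && trimmed.endsWith("?}");
      if (!singleLine) closeIndent = indent;
    } else if (trimmed === "?}" && indent === closeIndent) {
      closeIndent = null; // matching close: leave the block after this line
    }
    out.push("// " + line);
  }
  if (closeIndent !== null) throw new SyntaxError("Unclosed annotation");
  return out.join("\n");
}
```

The line count is preserved, so error positions and source maps in the transpiled file would still line up with the original.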

Just to clarify process: stage 1 in no way means "about to be part of the language". It can take a very long time for stage 1 proposals to advance, and many never do.


One more use case. Today there is an annotation that is in wide use: "use strict". It exploits JS's ability to skip standalone expressions. But it has always looked like a hack and a temporary solution, not like a mature language design feature. Somehow we decided it's OK to have such hacks and to make them a standard, instead of solving this with a mature and permanent solution.

Today there is a proposal to add more annotations like this one: tc39/proposal-function-implementation-hiding on GitHub (a JavaScript language proposal for function implementation hiding). I believe a special syntax should be standardized for this, to make it recognizable and easy to learn. This proposal could be extended to support standard ECMA annotations.

Those are pragmas, not annotations. Pragmas can impact not only runtime behavior, but parsing behavior as well. They're subtly different.

Thanks for this clarification. AFAIK the only difference between annotations and pragmas is that pragmas affect runtime behavior. But I think they have enough in common. Both consist of a single expression, and both affect the current block and all nested blocks. I think a few additional characters could be enough to differentiate them and make the difference more recognizable.

Example:

// Annotation. Does not affect runtime behavior.
{? JS.useStrict ?}

// Pragma. Affects runtime behavior
{? <JS.useStrict> ?}

(I think there could be a better syntax for that; the current one is used for example purposes only.)

Is there anything else I missed?