Improve sub-language support for template literals

My point is that the syntax is not sufficient. Editors do need to be able to parse the syntax, but they also need other things, like the scoping rules. So just specifying the syntax by itself isn't enough.

Editors rely on tools that use grammars to convert syntax into some kind of normalized data structure. The normalized structure is used for tasks like syntax highlighting.

I see no fundamental distinction that prevents the same mechanisms being applied to scope, except that until now nobody has built those tools.

What you're describing is teaching the editor not just the syntax of a particular language, but also the semantics of how the language operates. I'm not sure there's a practical way to go about doing that, since each language has slightly different semantics. A "scope" in one language doesn't always behave the same as a "scope" in another, so we can't just state that "this is a scope" and expect an editor to know what to do with it.

For example, let's say you perform a rename operation on this JavaScript code snippet:

let x = 2; // A
const obj = { x: 3 } // B
with (obj) {
  x = 4; // C <-- I want to rename this variable
}

What should the rename operation do? If we were to encode the semantics of scoping rules into a grammar, then an editor would try to use that knowledge to rename both A and C in that code snippet. But that's not necessarily correct in this scenario, is it? We would also need to encode the semantics of a with statement into the grammar somehow, to teach it that this is a very special scope where properties of an object can shadow variables from outer scopes. So a proper rename implementation would rename the field in the object (B and C), or would just give you an error saying this sort of thing is way too crazy for an editor to handle robustly.
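
To make the with case concrete, here is a minimal sketch of the same snippet that can be run anywhere. The name resolveC is made up for illustration, and the snippet is wrapped in a Function constructor only because with is a syntax error in strict mode and in modules, while Function-constructor bodies are sloppy mode by default. It shows that whether C refers to A or to B depends on the runtime shape of obj, which is exactly what a purely syntactic description cannot know.

```javascript
// Hypothetical helper: runs the snippet from the post against a given obj.
// Function-constructor bodies are sloppy mode, so `with` is allowed here
// even if the surrounding file is a module or uses "use strict".
const resolveC = new Function("obj", `
  let x = 2;      // A
  with (obj) {
    x = 4;        // C: writes obj.x if obj has an x, otherwise the outer x
  }
  return { outerX: x, objX: obj.x };
`);

// obj has an x property (B), so C shadows A and only B changes.
console.log(resolveC({ x: 3 })); // { outerX: 2, objX: 4 }

// obj has no x property, so C falls through to A.
console.log(resolveC({}));       // { outerX: 4, objX: undefined }
```

The same line of source code resolves to two different bindings, so a rename driven only by static scope information can be wrong in either direction.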

Another, simpler example would be renaming a variable found in a function. Should the editor go looking for a variable in the outer scope when this happens, or not? That depends on the semantics of how variable capture works, which vary widely from language to language. Perhaps the variable being used in the function isn't the one from an outer scope; instead, it's some global variable. How do we encode these behaviors into a grammar?
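
As a small illustration of the capture question, here is how JavaScript specifically answers it. All of the names (label, readsOuter, readsLocal, callFromElsewhere) are made up for this sketch. JavaScript uses lexical scoping, so a free variable in a function refers to the binding visible where the function was defined. A language with dynamic scoping would resolve the last call differently, and a rename tool would have to know which rule applies.

```javascript
const label = "outer";

function readsOuter() {
  return label;            // captured from the enclosing (defining) scope
}

function readsLocal() {
  const label = "local";   // shadows the outer binding inside this function
  return label;
}

function callFromElsewhere() {
  const label = "caller";  // ignored by readsOuter under lexical scoping
  return readsOuter();
}

console.log(readsOuter());        // "outer"
console.log(readsLocal());        // "local"
console.log(callFromElsewhere()); // "outer"
```

Renaming the top-level label should touch readsOuter but leave readsLocal and callFromElsewhere alone, even though all three mention the same identifier.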

That being said, what I'm describing is trying to bring 100% of the power of language plugins into a completely declarative form within this metadata file, which I believe is Bakkot's concern. I don't think that's possible. We can gradually make these metadata files more and more powerful, and perhaps we can eventually even find a way of declaratively explaining to editors how to do a rename operation in most scenarios, while also telling the editor the kinds of rename operations we can't properly support (this could be taught within the grammar itself, or elsewhere). But these metadata files will never be as powerful as just executing arbitrary code within a plugin.

Still, being able to receive basic language features within a template tag automatically, without having to install anything, would be really nice.

@theScottyJam Congrats on coming up with an example that is maximally awful to try to analyze, but I don't really see it as a problem. The core of my technology doesn't include renaming variables, it just describes what it can understand about the program in a unified structure. If you want to write a variable renaming operation it's up to you if you want to blow it all up if you hit a with. For now it seems like the sane thing to do.

I also have an ace up my sleeve: because my architecture is generator-based, at any time it is possible to pause evaluation and ask the user what the correct thing to do is!
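
A rough sketch of what "pause and ask" could look like with plain JavaScript generators. This is not the actual architecture being described; planRename, runRename, and the flat references list with its insideWith flag are all hypothetical and exist only for illustration. The generator yields a question whenever it reaches a reference it cannot resolve on its own, and the caller resumes it with the answer.

```javascript
// Hypothetical: walk a flat list of references and collect rename edits.
// Ambiguous references (here: anything inside a `with`) pause the generator.
function* planRename(references, oldName, newName) {
  const edits = [];
  for (const ref of references) {
    if (ref.name !== oldName) continue;
    if (ref.insideWith) {
      // Pause here; the caller answers through next(answer).
      const shouldRename = yield {
        question: `Rename ambiguous reference on line ${ref.line}?`,
        ref,
      };
      if (!shouldRename) continue;
    }
    edits.push({ line: ref.line, from: oldName, to: newName });
  }
  return edits;
}

// Drives the generator, forwarding each question to an answer callback.
function runRename(plan, answer) {
  let step = plan.next();
  while (!step.done) {
    step = plan.next(answer(step.value));
  }
  return step.value;
}

const references = [
  { name: "x", line: 1, insideWith: false },
  { name: "x", line: 4, insideWith: true },
];

// A real editor would show a prompt; this callback always declines.
const edits = runRename(planRename(references, "x", "y"), () => false);
console.log(edits); // [ { line: 1, from: 'x', to: 'y' } ]
```

The answer callback could just as well throw, which would be the "blow it all up if you hit a with" option.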

To whom it might concern, I've given template literal tags a chance to represent the ESX tokenizer I discussed and proposed in the other thread.

I am pretty pleased so far with its outcome, performance, and features, and I think I have found a very handy, yet portable, way to have scoped components within the esx tag, through the main factory function.

In a nutshell:

import {ESX, Token} from '@ungap/esx';

// the factory function accepts an object
// that will be used to tokenize components
const esx = ESX({MyComponent});

esx`
  <div data-any="value">
    <MyComponent a="1" b=${2} />
  </div>
`;

function MyComponent(props, ...children) {
  // note: props doesn't need ${...props} or ...${props}
  //       because any value can be passed as interpolation
  return esx`
    <div ${props}>
      ${children}
    </div>
  `;
}

If I had full highlighting in my IDE, I most likely wouldn't even notice that I'm using a template literal tag rather than ESX through its Babel transformer. There are also benefits for cross-realm / client / server / IoT applications, made possible by the serialization helpers in this module.

Any feedback is welcome. It's possible that "upsetting me" in the other thread made me create a wonderful new pattern and a general-purpose tokenizer for XML-like structures. At the very least, what I came up with is extremely close to what I wanted previously with the Babel ESX transformer, except that here I have the uniqueness provided by the template: structs / tokens are always unique by default, whether they are outer or inner tokens. Everything is created once and never again, while interpolations and/or dynamic attribute values are updated on repeated invocations of the same template.
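
The "created once" behaviour rests on a language guarantee: a tagged template call site passes the same strings array on every evaluation. The sketch below only demonstrates that guarantee and is not how @ungap/esx is implemented; tag, cache, parseCount, and render are illustrative names, and the "parsing" is a trivial stand-in for real tokenizing.

```javascript
// Cache keyed by the template's strings array, which is identical
// for every evaluation of the same call site.
const cache = new WeakMap();
let parseCount = 0;

function tag(strings, ...values) {
  let parsed = cache.get(strings);
  if (!parsed) {
    parseCount++;
    // Stand-in for an expensive tokenizing step over the static parts.
    parsed = { parts: strings.map((part) => part.trim()) };
    cache.set(strings, parsed);
  }
  // Static structure is reused; only the interpolated values change.
  return { parsed, values };
}

function render(value) {
  return tag`<p>${value}</p>`;
}

const first = render(1);
const second = render(2);

console.log(first.parsed === second.parsed); // true
console.log(parseCount);                     // 1
console.log(second.values);                  // [ 2 ]
```

A different call site with the same text gets its own strings array, so each template in the source is parsed at most once.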

This looks a lot like what I had in mind. Thanks for exploring this avenue.

One interesting topic we could look into is quasiquotation in Haskell: https://wiki.haskell.org/Quasiquotation

@WebReflection I really like that. And I am hoping that I can provide the editor experience that ensures that editing the syntax inside the template tags is just as rich as editing the same syntax when it isn't embedded.

That’d be awesome! It’s basically about enabling JSX within the tag, with an extra $ before curly braces and without needing the spread operator. I had a look yesterday at the default JSX highlighter within the TypeScriptReact rules, but I’m not too familiar with VSCode plugins, even though I wrote one for hyperHTML.

If anyone could help me, it’d be very appreciated.