Just like Java and the .NET CLR (and countless other still-modern environments), ES internally represents a String as a sequence of UTF-16 code units. You may be confusing that internal representation with UTF-8 encoding of text output or decoding of text input. But those are different concerns: a String is not the same as "text," despite the terms being used interchangeably at times.
The ES spec itself is agnostic about text encodings used at runtime; those are governed by other standards.
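To illustrate the distinction between the internal representation and text encoding, here's a small sketch using the standard `TextEncoder` API: the String is a sequence of UTF-16 code units no matter what, and UTF-8 only enters the picture when text is explicitly encoded for output.

```javascript
// A String is a sequence of UTF-16 code units.
const s = "🐱"; // U+1F431, outside the Basic Multilingual Plane

// .length counts UTF-16 code units, not characters:
// the cat emoji is a surrogate pair, so the length is 2.
console.log(s.length); // 2

// Encoding is a separate concern that applies when text leaves the program.
// TextEncoder produces UTF-8 bytes from the same String.
const bytes = new TextEncoder().encode(s);
console.log(bytes.length); // 4 (UTF-8 needs four bytes for U+1F431)
```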
Perhaps you could explain your proposal to use UTF-8 as the internal representation of Strings in more detail? It's true that very modern languages (Go, Rust) use UTF-8 internally, and that there's a slow creep (enabled, of course, by faster processors and cheaper storage) toward UTF-8 in brand-new technologies. But establishing that UTF-16 is "awful and not acceptable" is a tall order, given its continued dominance.
There is, unfortunately, absolutely zero chance of changing the way that JS strings encode their characters. That would be a massive breaking change across the entire ecosystem.
Modern string APIs often address text as code points (String.prototype.codePointAt(), for example), which is generally what you want.
(You rarely, if ever, actually want UTF-8; encodings are a detail you usually shouldn't have to be aware of at all, as opposed to just getting code points and/or grapheme clusters. The problem with JS is that it exposes encoding details, and in particular the details of a really bad encoding (UCS-2-ish, not even proper UTF-16).)
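A quick sketch of the difference: codePointAt() decodes through surrogate pairs, while the older charCodeAt() hands you raw UTF-16 code units, including lone surrogates.

```javascript
const face = "😀"; // U+1F600, stored as the surrogate pair 0xD83D 0xDE00

// charCodeAt() returns a single UTF-16 code unit: here, a lone high surrogate.
console.log(face.charCodeAt(0).toString(16)); // "d83d"

// codePointAt() decodes the full surrogate pair into one code point.
console.log(face.codePointAt(0).toString(16)); // "1f600"
```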
Again, UTF-8 is not "Unicode"; it's an encoding of Unicode, a way of turning Unicode characters into bits (and back). JS already supports all of Unicode. The default string indexing ("foo"[0]) is busted, because it indexes the string by UCS-2 code units rather than characters. That's unfortunate and bad, but it's impossible to change. JS has many newer ways of interacting with strings that do work on characters: [..."foo"] is character-based, "foo".codePointAt(0) is character-based, String.fromCodePoint(0xfffd) is character-based. Regexes have also recently gained ways of interacting with strings properly as Unicode characters (and are continuing to evolve in that direction).
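The character-based APIs mentioned above, side by side with the broken default indexing:

```javascript
const s = "a💘b"; // U+1F498 is outside the BMP

// Index-based access sees UTF-16 code units: length is 4, not 3.
console.log(s.length); // 4

// Spreading iterates by code point, so it yields 3 "characters".
console.log([...s].length); // 3

// codePointAt / fromCodePoint round-trip whole code points.
console.log(String.fromCodePoint(s.codePointAt(1))); // "💘"

// The u flag makes regexes match by code point too:
// without it, "." only matches a single code unit.
console.log(/^.$/u.test("💘")); // true
console.log(/^.$/.test("💘"));  // false
```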
So everything you need is already present or upcoming. We're stuck with the bad parts forever.
It looks like the MDN snippet you linked to was updated in the last few days to say "unicode" instead of "UTF-16" (git history says it happened 18 days ago, but I'm not sure how often MDN updates its content from the git repo, so perhaps you were viewing the older content?). From what I understand, codePointAt() returns the number that Unicode assigns to a specific character (like a unique ID for that character), which is independent of however the engine may be encoding that character internally.
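To make that concrete, here's a sketch (using the standard `TextEncoder` API for the UTF-8 side): the code point is the same stable number regardless of which encoding produces the bytes.

```javascript
const euro = "€";

// codePointAt() returns the Unicode code point: a stable, encoding-independent ID.
console.log(euro.codePointAt(0)); // 8364, i.e. U+20AC

// The same character has different byte representations in different encodings:
// three bytes in UTF-8...
console.log([...new TextEncoder().encode(euro)]); // [226, 130, 172]

// ...and one 16-bit code unit in the UTF-16 representation JS strings use
// (for BMP characters like €, the code unit value equals the code point).
console.log(euro.charCodeAt(0)); // 8364
```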