The main function
rEntDecode() is imported like this:
It's a function which takes two input arguments: the string to decode and an optional plain object, the Optional Options Object.
The Optional Options Object has the following shape:
The Optional Options Object completely matches the he.js options as of
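| Options Object's key | Default | Description |
| --- | --- | --- |
| `isAttributeValue` | `false` | If on, entities will be decoded as if they were in attribute values. If off (default), entities will be decoded as if they were in HTML text. See the he.js documentation for details. |
| `strict` | `false` | If on, entities that can cause parsing errors will make the function throw. See the he.js documentation for details. |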
Here are all defaults in one place for copying:
The function returns ranges: either `null` or an array of one or more range arrays.
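To make the return shape concrete, here is a minimal sketch. It assumes each range array has the form `[fromIndex, toIndex, whatToInsert]`. The `applyRanges` helper is hypothetical and not part of this library; it only illustrates how such instructions could be turned back into a string.

```javascript
// Assumed range shape: [fromIndex, toIndex, whatToInsert].
// Hypothetical result for the input "a &amp; b":
const source = "a &amp; b";
const ranges = [[2, 7, "&"]];

// Hypothetical helper (not part of this library): applies ranges to a string.
function applyRanges(str, rangesArr) {
  if (rangesArr === null) return str; // null means there is nothing to change
  // Process ranges from the end of the string so earlier indexes stay valid.
  return [...rangesArr]
    .sort((a, b) => b[0] - a[0])
    .reduce(
      (acc, [from, to, insert = ""]) =>
        acc.slice(0, from) + insert + acc.slice(to),
      str
    );
}

console.log(applyRanges(source, ranges)); // "a & b"
```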
You can also import the defaults, which are a plain object.
The main function calculates the options to be used by merging the options you passed with these defaults.
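The merging step can be sketched as below. The `defaults` object here is an assumption that mirrors the two he.js decode options (`isAttributeValue` and `strict`, both off), and `resolveOpts` is a hypothetical helper showing one common way to merge; the library's actual implementation may differ.

```javascript
// Assumed defaults, mirroring the two he.js decode options.
const defaults = { isAttributeValue: false, strict: false };

// Hypothetical helper: keys passed by the caller override the defaults,
// and keys the caller omitted fall back to the defaults.
function resolveOpts(opts) {
  return { ...defaults, ...opts };
}

console.log(resolveOpts({ strict: true }));
// { isAttributeValue: false, strict: true }
```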
You can import
The biggest pain to code and the main USP of this library is being able to recursively decode and give the result as ranges.
By recursively, we mean that the input string is decoded over and over until a decoding pass no longer changes the result. In practice, this lets us tackle the unlikely but possible cases of double- and triple-encoded strings. For example, this is a double-encoded string: `&amp;mdash;`. The original m-dash was turned into `&mdash;` in the first encoding round; then, during the second round, its ampersand was turned into `&amp;`, which led to `&amp;mdash;`.
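The "decode until nothing changes" idea can be sketched as follows. This is an illustration only: it returns a plain string rather than ranges, and `decodeOnce` uses a tiny stand-in entity map instead of he.js. All names here are hypothetical.

```javascript
// Tiny stand-in entity map for illustration only; a real decoder
// such as he.js knows the full set of named and numeric entities.
const ENTITIES = {
  "&amp;": "&",
  "&mdash;": "\u2014",
  "&lt;": "<",
  "&gt;": ">",
};

// One decoding pass: replace every known named entity once.
function decodeOnce(str) {
  return str.replace(/&[a-z]+;/g, (match) => ENTITIES[match] ?? match);
}

// Keep decoding until a pass produces no further change.
function decodeRecursively(str) {
  let previous = str;
  let current = decodeOnce(previous);
  while (current !== previous) {
    previous = current;
    current = decodeOnce(previous);
  }
  return current;
}

console.log(decodeRecursively("&amp;mdash;") === "\u2014"); // true
```

A single pass over the double-encoded input yields `&mdash;`; only the second pass produces the m-dash, which is why the loop compares each result with the previous one.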
By ranges, we mean that the result is not a decoded string but a set of instructions: what to change in the string in order to decode it. In practice, this means we decode without losing the original character indexes. In turn, this means we can gather more “instructions” (ranges) and join them later.
If you wonder where `encode()` is in ranges, we don't need it! When you traverse the string and gather ranges, you can pass each grapheme (where an emoji of length six should be counted as “one”) through he.js encode, compare “before” and “after” and, if the two are different, create a new range for it.
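A rough sketch of that encode-by-comparison approach is shown below. Two simplifications are assumed: `encodeChar` is a hypothetical stand-in for he.js encode, and the loop iterates by code point with `for...of` rather than by full grapheme, so multi-code-point emoji would be split into several ranges.

```javascript
// Stand-in encoder for illustration; in practice this would be he.js encode.
function encodeChar(char) {
  const named = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };
  if (named[char]) return named[char];
  const codePoint = char.codePointAt(0);
  // Encode non-ASCII characters as hexadecimal numeric entities.
  return codePoint > 127
    ? `&#x${codePoint.toString(16).toUpperCase()};`
    : char;
}

function encodeToRanges(str) {
  const ranges = [];
  let index = 0;
  // for...of iterates by code point, so astral characters stay whole.
  for (const char of str) {
    const encoded = encodeChar(char);
    // Compare "before" and "after"; record a range only when they differ.
    if (encoded !== char) {
      ranges.push([index, index + char.length, encoded]);
    }
    index += char.length;
  }
  return ranges.length ? ranges : null;
}

console.log(encodeToRanges("a & b"));
// [ [ 2, 3, '&amp;' ] ]
```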
`decode()` is not that simple, because the input string has to be processed: an encoded entity spans several characters, so you can't iterate grapheme by grapheme (or character by character, if you don't care about Unicode's astral characters).