charcode-is-valid-xml-name-character 1.10.64

Checks whether a given character belongs to the XML spec's "Production 4" or "Production 4a" type (that is, whether it is acceptable in an XML element's name)

§ Quick Take

import { strict as assert } from "assert";
import {
  isProduction4,
  isProduction4a,
  validFirstChar,
  validSecondCharOnwards,
} from "charcode-is-valid-xml-name-character";

// Spec: https://www.w3.org/TR/REC-xml/#NT-NameStartChar

assert.equal(isProduction4("Z"), true);
assert.equal(isProduction4("?"), false);

assert.equal(isProduction4a("?"), false);
assert.equal(isProduction4a("-"), true);

assert.equal(validFirstChar("a"), true);
assert.equal(validFirstChar("1"), false);

assert.equal(validSecondCharOnwards("a"), true);
assert.equal(validSecondCharOnwards("?"), false);

§ What it does

It returns a Boolean telling whether the given character satisfies Production 4 or 4a of the XML spec, or in human terms, whether the character is allowed in an XML element's name.

This library is used to detect where (X)HTML element names end.

This article contains an in-depth explanation of the spec terminology: https://www.xml.com/pub/a/2001/07/25/namingparts.html — it helped me to get up to speed on this subject.

§ In practice

Let's say you are iterating through a string, character by character, and it contains (X)HTML source code. This library evaluates any given character and tells you whether it is a valid character for an element name. You use it to detect where element names end.

In the simplest scenario:

<img class="">
    ^     ^
    1     2

The space (1) and the equals sign (2) in the example above mark where the names end (the element name img and the attribute name class). So we know that spaces and equals signs are not allowed in names and therefore mark their ending. Are there more such characters? Oh yes. Quite a lot, according to the spec, which warrants a proper library dedicated only to that purpose.
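The scan described above can be sketched roughly as follows. This is only an illustration: isAsciiNameChar() is a hypothetical, ASCII-only stand-in for the library's validSecondCharOnwards(), and findNameEnd() is a made-up helper, not part of this package.

```javascript
// Hypothetical stand-in for validSecondCharOnwards():
// it covers only the ASCII part of the XML name character rules.
function isAsciiNameChar(char) {
  return /^[A-Za-z0-9:_.-]$/.test(char);
}

// Hypothetical helper: walks forward from index `from` and returns
// the index of the first character that cannot be part of a name.
function findNameEnd(str, from) {
  let i = from;
  while (i < str.length && isAsciiNameChar(str[i])) {
    i++;
  }
  return i;
}

const source = '<img class="">';

const tagNameEnd = findNameEnd(source, 1); // the name starts right after "<"
console.log(source.slice(1, tagNameEnd)); // => img

const attrNameEnd = findNameEnd(source, 5); // "class" starts at index 5
console.log(source.slice(5, attrNameEnd)); // => class
```

In real code you would presumably swap the stand-in for validSecondCharOnwards(), so that non-ASCII names are handled as well.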

§ API

Two functions, each exported under two names: one checks the requirements for the first character, the other checks the requirements for the second character and onwards. Both functions return a Boolean.

Function's name                           Purpose
isProduction4 / validFirstChar            To tell whether a character is suitable to be the first character
isProduction4a / validSecondCharOnwards   To tell whether a character is suitable to be the second character and onwards
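As a rough sketch of how the two checks fit together, a whole name could be validated by applying one predicate to the first character and the other to the rest. The helper isValidName() and the two ASCII-only predicates below are hypothetical and exist only to keep the example self-contained; with the library installed, you would presumably pass validFirstChar and validSecondCharOnwards instead.

```javascript
// Hypothetical helper: validates a whole name using two injected predicates.
function isValidName(name, firstCharOk, laterCharOk) {
  // Array.from splits by code point, so surrogate pairs stay together
  const chars = Array.from(name);
  if (chars.length === 0) {
    return false; // an empty string is not a name
  }
  return chars.every((char, idx) =>
    idx === 0 ? firstCharOk(char) : laterCharOk(char)
  );
}

// ASCII-only stand-ins (assumptions for this sketch, not the library's code)
const asciiFirstCharOk = (char) => /^[A-Za-z:_]$/.test(char);
const asciiLaterCharOk = (char) => /^[A-Za-z0-9:_.-]$/.test(char);

console.log(isValidName("data-id", asciiFirstCharOk, asciiLaterCharOk)); // => true
console.log(isValidName("-data", asciiFirstCharOk, asciiLaterCharOk)); // => false
console.log(isValidName("1st", asciiFirstCharOk, asciiLaterCharOk)); // => false
```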

§ isProduction4() / validFirstChar() - requirements for 1st char

XML spec production 4 defines the requirements for the first character of an XML element's name. It is stricter than the requirements for the subsequent characters; see production 4a below.

Pass any character (including an astral one) into isProduction4(), and it will return a Boolean telling whether the character is acceptable as the first character of an XML name.

const {
  isProduction4,
  validFirstChar,
  // isProduction4a,
} = require("charcode-is-valid-xml-name-character");

const res1 = isProduction4("-"); // or use validFirstChar(), the same
console.log("res1 = " + res1);
// => "res1 = false" (minus is not allowed as the first character)

const res2 = isProduction4("z"); // or use validFirstChar(), the same
console.log("res2 = " + res2);
// => "res2 = true"

The function takes a single character (any Unicode character, including an astral one made of two surrogates) and returns a Boolean telling whether it is acceptable as the first character of an XML element's name.
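To illustrate what "astral character made of two surrogates" means for such a check, here is a minimal sketch that tests a character's code point against the NameStartChar ranges listed in production 4 of the XML 1.0 spec. The names NAME_START_RANGES and inNameStartRange() are hypothetical; this is not the library's source and its behaviour may differ in details.

```javascript
// Inclusive code point ranges of NameStartChar (XML 1.0, production 4)
const NAME_START_RANGES = [
  [0x3a, 0x3a], // ":"
  [0x41, 0x5a], // A-Z
  [0x5f, 0x5f], // "_"
  [0x61, 0x7a], // a-z
  [0xc0, 0xd6],
  [0xd8, 0xf6],
  [0xf8, 0x2ff],
  [0x370, 0x37d],
  [0x37f, 0x1fff],
  [0x200c, 0x200d],
  [0x2070, 0x218f],
  [0x2c00, 0x2fef],
  [0x3001, 0xd7ff],
  [0xf900, 0xfdcf],
  [0xfdf0, 0xfffd],
  [0x10000, 0xeffff], // astral planes
];

// Hypothetical check: reads the full code point, so a surrogate pair
// is treated as one character rather than two separate code units.
function inNameStartRange(char) {
  const codePoint = char.codePointAt(0);
  return NAME_START_RANGES.some(
    ([low, high]) => codePoint >= low && codePoint <= high
  );
}

const astral = "\u{1F600}"; // one code point, two UTF-16 code units
console.log(astral.length); // => 2
console.log(astral.codePointAt(0).toString(16)); // => 1f600
console.log(inNameStartRange(astral)); // => true (inside #x10000-#xEFFFF)
console.log(inNameStartRange("z")); // => true
console.log(inNameStartRange("-")); // => false
```

Reading the character with codePointAt(0) instead of charCodeAt(0) is the design choice that matters here: the latter would only see the first surrogate, which lies outside every range above.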

§ isProduction4a() / validSecondCharOnwards() - requirements for 2nd char onwards

XML spec production 4a defines the requirements for the second character onwards in an XML element's name.

Pass any character (including an astral one) into isProduction4a(), and it will return a Boolean telling whether the character is acceptable as the second or a later character of an XML name. The requirements are a bit more permissive than for the first character: everything valid as a first character is valid here too, plus a few extras such as the minus.

const {
  // isProduction4,
  isProduction4a,
} = require("charcode-is-valid-xml-name-character");

const res1 = isProduction4a("-"); // or use validSecondCharOnwards(), the same
console.log("res1 = " + res1);
// => "res1 = true" (minus is allowed for the second character onwards)

const res2 = isProduction4a("z"); // or use validSecondCharOnwards(), the same
console.log("res2 = " + res2);
// => "res2 = true"

The function takes a single character (any Unicode character, including an astral one made of two surrogates) and returns a Boolean telling whether it is acceptable as the second or a subsequent character in an XML element's name.
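For reference, the extra characters that production 4a (NameChar) allows on top of production 4 are few enough to list in full. The sketch below encodes them as given in the XML 1.0 spec; isNameCharExtra() is a hypothetical name used for illustration and is not one of this package's exports.

```javascript
// Hypothetical helper: true only for the code points that NameChar
// (production 4a) adds on top of NameStartChar (production 4).
function isNameCharExtra(char) {
  const cp = char.codePointAt(0);
  return (
    cp === 0x2d || // "-"
    cp === 0x2e || // "."
    (cp >= 0x30 && cp <= 0x39) || // digits 0-9
    cp === 0xb7 || // middle dot
    (cp >= 0x300 && cp <= 0x36f) || // combining diacritical marks
    (cp >= 0x203f && cp <= 0x2040) // undertie and character tie
  );
}

console.log(isNameCharExtra("-")); // => true
console.log(isNameCharExtra("7")); // => true
console.log(isNameCharExtra("?")); // => false
// "z" is allowed by production 4 already, so it is not one of the extras
console.log(isNameCharExtra("z")); // => false
```

A full production 4a check would accept a character if it passes either the first-character check or this list of extras.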

§ Licence

MIT

Copyright © 2010–2020 Roy Revelt and other contributors

Related packages:

📦 emlint 2.18.17
Pluggable email template code linter
📦 is-media-descriptor 1.2.19
Is given string a valid media descriptor (including media query)?
📦 is-relative-uri 1.0.20
Is given string a relative URI?