Answers the question: is the input string more HTML or XHTML (or neither)?

§ Example

const detect = require("detect-is-it-html-or-xhtml");
// pass the string to the exported function; the self-closing slash points to XHTML
detect('<img src="some.jpg" width="zzz" height="zzz" border="0" style="display:block;" alt="zzz"/>');
// => 'xhtml'

§ Purpose

Feed a string into this library. If it looks more like HTML, it will output the string "html". If it looks more like XHTML, it will output the string "xhtml". If your code doesn't contain any tags, or it does but there is no doctype and it's impossible to distinguish between the two, it will output null.

§ API - Input

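This package exports a function that takes one input argument:

| Input argument | Type   | Obligatory? | Description                                 |
| -------------- | ------ | ----------- | ------------------------------------------- |
| htmlAsString   | String | yes         | String, hopefully containing some HTML code |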

If the input is not String type, this package will throw an error. If the input is missing completely, it will return null.
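
Because non-string input throws, callers that receive untrusted values may want to guard the call. Below is a minimal sketch of such a guard. The helper name `safeDetect` and its `detectFn` parameter are hypothetical and not part of this package; `detectFn` stands for any function that follows the input contract described above.

```javascript
// Hypothetical helper, NOT part of the package.
// `detectFn` is assumed to follow the documented contract:
// string in, 'html' / 'xhtml' / null out, throws on non-string input.
function safeDetect(detectFn, input) {
  // Anything that is not a string is treated as "cannot tell"
  // instead of letting the underlying function throw.
  if (typeof input !== "string") {
    return null;
  }
  return detectFn(input);
}
```

With the real package, the first argument would be the function returned by `require("detect-is-it-html-or-xhtml")`.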

§ API - Output

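| Type           | Value                   | Description                   |
| -------------- | ----------------------- | ----------------------------- |
| String or null | 'html', 'xhtml' or null | Identified type of your input |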

§ Under the hood

The algorithm is the following:

  1. Look for doctype. If recognised, Bob's your uncle, here's your answer.
  2. IF there's no doctype or it's messed up beyond recognition, DO scan all singleton tags (<img>, <br> and <hr>) and see which type the majority is (closed or not closed).
  3. In the rare case when there is an equal number of closed and unclosed tags, lean towards html.
  4. If (there are no tags in the input) OR (there are no doctype tags and no singleton tags), return null.
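
The four steps above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the package's actual source: the function name `sketchDetect` is hypothetical, a doctype counts as "recognised" if it merely mentions "xhtml" or "html", and tags are found with simple regular expressions that will misread attribute values containing ">".

```javascript
// Hypothetical sketch of the algorithm described above (not the real implementation).
function sketchDetect(input) {
  // Documented input contract: missing input gives null, non-strings throw.
  if (input === undefined) return null;
  if (typeof input !== "string") {
    throw new TypeError("input must be a string");
  }

  // Step 1: look for a doctype.
  // Assumption: mentioning "xhtml" means XHTML, otherwise "html" means HTML.
  const doctype = input.match(/<!doctype\b[^>]*>/i);
  if (doctype) {
    if (/xhtml/i.test(doctype[0])) return "xhtml";
    if (/html/i.test(doctype[0])) return "html";
    // Unrecognised doctype: fall through to the tag scan.
  }

  // Step 2: collect singleton tags (<img>, <br> and <hr>).
  const singletons = input.match(/<(?:img|br|hr)\b[^>]*>/gi) || [];

  // Step 4: nothing to go on, so the type cannot be determined.
  if (singletons.length === 0) return null;

  // Count tags closed with a slash, e.g. <br/> or <hr />.
  const closed = singletons.filter((tag) => /\/\s*>$/.test(tag)).length;
  const unclosed = singletons.length - closed;

  // Step 3: on a tie, lean towards html.
  return closed > unclosed ? "xhtml" : "html";
}
```

Note that the strict `closed > unclosed` comparison is what implements the tie-break: an equal count falls through to "html".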

§ Licence

MIT

Copyright © 2015–2020 Roy Revelt and other contributors

