For an exported function, string-in, string-out API is awesome because it’s simple. The problem happens later when you want to add more to the output, for example, a log with time spent. Or an alternative output, like locations of string indexes. Or the version
from package.json.
If we could send a message to the past ourselves, we would send “always return a plain object from a function, never return a string”.
Chummy API would be to keep string-in, string-out function, but switch between different outputs using options object flags. Like we did with opts.returnRangesOnly
on previous versions of string-strip-html
.
Today we’ve released v.5 of string-strip-html
to end the chumminess.
A plain object is returned now:
stripHtml("abc<a>click me</a>def");
// => {
// log: {
// timeTakenInMilliseconds: 6
// },
// result: "abc click me def",
// ranges: [
// [3, 6, " "],
// [14, 18, " "],
// ],
// allTagLocations: [
// [3, 6],
// [14, 18],
// ],
// filteredTagLocations: [
// [3, 6],
// [14, 18],
// ],
// }
Why change what’s returned, upon user’s request, when we can return everything?
We removed opts.returnRangesOnly
— no need to choose — you always get everything now.
Function’s output is a plain object now, containing:
- a cleaned string (considering
opts.ignoreTags
andopts.onlyStripTags
) - gathered ranges, used to produce cleaned string (considering
opts.ignoreTags
andopts.onlyStripTags
) - tag locations of all spotted HTML tags IGNORING the whitelist/blacklist
opts.ignoreTags
andopts.onlyStripTags
- locations of filtered HTML tags (considering
opts.ignoreTags
andopts.onlyStripTags
) - plus, some statistics
New additions
allTagLocations
reports simple from-to string index locations of all detected tags — it can be used for syntax highlighting, for example. It’s different from ranges
output which contains whitespace corrections which are meant to be applied onto a string.
log
is handy for the perf investigations or in GUI web apps.
string-strip-html
migration instructions from v.4.x
to v.5
TLDR: Grab the key you need from an output object.
before, v.4
:
const result = stripHtml("abc<a>click me</a>def");
console.log(result);
// => abc click me def
now, v.5
— destructure what you need:
const { result } = stripHtml("abc<a>click me</a>def");
console.log(result);
// => abc click me def
If you need ranges, now they’re always returned:
before, v.4
:
const result = stripHtml("abc<a>click me</a>def", {
returnRangesOnly: true,
});
console.log(result);
// => [[3, 6, " "], [14, 18 ," "]]
now, v.5
:
const { ranges } = stripHtml("abc<a>click me</a>def");
console.log(ranges);
// => [[3, 6, " "], [14, 18 ," "]]
opts.filteredTagLocations
While allTagLocations
contains locations of all HTML tags, the filteredTagLocations
takes into consideration opts.ignoreTags
and opts.onlyStripTags
. This way, you can, for example, ask program to strip only tr
tags, but then you actually grab the indexes of their locations:
const stripHtml = require("string-strip-html");
const input = `<table width="100">
<tr>
<td>
<table width="100">
<tr>
<td>
This is content.
</td>
</tr>
</table>
</td>
</tr>
</table>`;
const { filteredTagLocations } = stripHtml(input, {
onlyStripTags: ["tr"],
});
console.log("Here are TR tags: ${JSON.stringify(filteredTagLocations, null, 4)}")
// => [
[22, 26],
[70, 74],
[143, 148],
[176, 181],
]
const gatheredExtractedTagStrings = [];
filteredTagLocations.forEach(([from, to]) => {
gatheredExtractedTagStrings.push(input.slice(from, to));
});
console.log(JSON.stringify(gatheredExtractedTagStrings, null, 4));
// => [`<tr>`, `<tr>`, `</tr>`, `</tr>`]
For even more control over the result, use opts.cb
.