Quick Take

import { strict as assert } from "assert";
import { splitByW } from "string-split-by-whitespace";

// Split by whitespace is easy - use native String.prototype.split()
assert.deepEqual("abc  def ghi".split(/\s+/), ["abc", "def", "ghi"]);

const source = `\n     \n    a\t \nb    \n      \t`;

// this program is nearly equivalent to regex-based split:
assert.deepEqual(source.split(/\s+/), ["", "a", "b", ""]);
assert.deepEqual(splitByW(source), ["a", "b"]);
// the regex-based split needs extra filtering, but it is a native solution


// this program can also exclude certain index ranges:
assert.deepEqual(
  splitByW("a b c d e", {
    ignoreRanges: [[0, 2]], // that's "a" and the space after it
  }),
  ["b", "c", "d", "e"]
);


When String.prototype.split(/\s+/) is not enough, for example when you need to exclude certain substrings, this program will help.

It splits the string by whitespace, where "whitespace" means "anything that trims to zero length": tabs, line breaks (CR and LF), the space character and the raw non-breaking space. There are quite a few such characters across the whole Unicode range.
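The "trims to zero length" definition can be checked directly with the native String.prototype.trim(). The sketch below only illustrates that definition; `isWhitespace` is a hypothetical helper name and is not part of this package's API.

```javascript
// Hypothetical helper (not exported by this package): a character counts as
// whitespace if trimming it leaves an empty string.
const isWhitespace = (char) => char.trim().length === 0;

const samples = {
  space: " ",
  tab: "\t",
  lineFeed: "\n",
  carriageReturn: "\r",
  nonBreakingSpace: "\u00A0",
  letter: "a",
};

// Build a name -> boolean report for each sample character.
const report = Object.fromEntries(
  Object.entries(samples).map(([name, char]) => [name, isWhitespace(char)])
);
console.log(report);
// every entry is true except `letter`, which is false
```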


splitByW(str, [opts])

In other words, it's a function which takes two input arguments, the second one being optional (marked by square brackets).

API - Input

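| Input argument | Type         | Obligatory? | Description                                       |
| -------------- | ------------ | ----------- | ------------------------------------------------- |
| str            | String       | yes         | Source string upon which to perform the operation |
| opts           | Plain object | no          | Optional Options Object, see below for its API    |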

An Optional Options Object

| Optional Options Object's key | Type of its value                 | Default | Description |
| ----------------------------- | --------------------------------- | ------- | ----------- |
| ignoreRanges                  | Array of zero or more range arrays | []      | Feed zero or more string slice ranges, arrays of two natural number indexes, like [[1, 5], [6, 10]]. The algorithm will not include these string index ranges in the results. |

The opts.ignoreRanges can be an empty array, but if it contains anything other than arrays, an error will be thrown.
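As a rough illustration of that rule, a check along the following lines would accept an empty array or an array of range arrays and reject anything else. `checkIgnoreRanges` and its error messages are assumptions made for this sketch; the package's actual validation and wording may differ.

```javascript
// Hypothetical validator, written only to illustrate the rule above;
// the package's real checks and error messages may differ.
function checkIgnoreRanges(ignoreRanges) {
  if (!Array.isArray(ignoreRanges)) {
    throw new TypeError("ignoreRanges must be an array");
  }
  ignoreRanges.forEach((range, idx) => {
    if (!Array.isArray(range)) {
      throw new TypeError(
        `ignoreRanges[${idx}] must be an array, got ${typeof range}`
      );
    }
  });
  return ignoreRanges;
}

checkIgnoreRanges([]); // fine: an empty array
checkIgnoreRanges([[1, 5], [6, 10]]); // fine: only range arrays inside

let message = null;
try {
  checkIgnoreRanges([[1, 5], "6-10"]); // second element is a string
} catch (err) {
  message = err.message;
}
console.log(message);
// => ignoreRanges[1] must be an array, got string
```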

API - Output

The program returns an array of zero or more strings. An empty string yields an empty array.
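For comparison, the native regex split does not behave this way on empty or whitespace-padded input, which is part of the extra filtering a native solution needs. The snippet below uses only built-in JavaScript; `nativeSplit` is a hypothetical helper name used for illustration.

```javascript
// Plain regex split keeps empty strings at the edges:
console.log("".split(/\s+/)); // [ '' ]
console.log("  a b  ".split(/\s+/)); // [ '', 'a', 'b', '' ]

// Hypothetical native approximation: drop the empty strings afterwards.
const nativeSplit = (str) => str.split(/\s+/).filter(Boolean);

console.log(nativeSplit("")); // []
console.log(nativeSplit("  a b  ")); // [ 'a', 'b' ]
```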


Some basics first. When we say "heads" or "tails", we mean the templating literals that wrap a value. "Heads" is the opening part, for example {{ below; "tails" is the closing part, for example }} below:

Hi {{ firstName }}!

Now imagine that we extracted heads and tails and we know their ranges: [[3, 5], [16, 18]]. (If you count characters from the start of the string, in front of "Hi", to where each head and tail starts and ends, you'll see that these numbers match.)

Now imagine we want to split Hi {{ firstName }}! into the array ["Hi", "firstName", "!"].

For that, we need to skip two ranges: those of the head and the tail.

That's where opts.ignoreRanges comes in handy.
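To make the idea concrete, here is a minimal self-contained sketch of a splitter that skips ignored ranges. It is not this package's implementation: `splitSketch` is a hypothetical function, and it assumes that each range is [start, end) with the end index excluded (as in String.prototype.slice) and that ignored characters act as separators, which is what the examples in this document suggest.

```javascript
// Hypothetical sketch, not the package's source code.
// Assumption: ranges are [start, end) and ignored characters split tokens.
function splitSketch(str, ignoreRanges = []) {
  const isIgnored = (i) =>
    ignoreRanges.some(([from, to]) => i >= from && i < to);
  const res = [];
  let start = null; // index where the current token began, if any

  for (let i = 0; i <= str.length; i++) {
    const boundary =
      i === str.length || str[i].trim() === "" || isIgnored(i);
    if (boundary) {
      if (start !== null) {
        res.push(str.slice(start, i)); // close the current token
        start = null;
      }
    } else if (start === null) {
      start = i; // first character of a new token
    }
  }
  return res;
}

const result = splitSketch("Hi {{ firstName }}!", [
  [3, 5],
  [16, 18],
]);
console.log(JSON.stringify(result));
// => ["Hi","firstName","!"]
```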

In the example below, we use the library string-find-heads-tails to extract the ranges of variables' heads and tails in a string, then split by whitespace:

import { strFindHeadsTails } from "string-find-heads-tails";
import { splitByW } from "string-split-by-whitespace";

const input = "some interesting {{text}} {% and %} {{ some more }} text.";
// find every head/tail pair, then collect each head and each tail as its own range
const headsAndTails = strFindHeadsTails(
  input,
  ["{{", "{%"],
  ["}}", "%}"]
).reduce((acc, curr) => {
  acc.push([curr.headsStartAt, curr.headsEndAt]);
  acc.push([curr.tailsStartAt, curr.tailsEndAt]);
  return acc;
}, []);
const res1 = splitByW(input, {
  ignoreRanges: headsAndTails,
});
console.log(`res1 = ${JSON.stringify(res1, null, 4)}`);
// => ['some', 'interesting', 'text', 'and', 'some', 'more', 'text.']

You can also ignore whole variables, from heads to tails, including the variables' names:

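import { strFindHeadsTails } from "string-find-heads-tails";
import { splitByW } from "string-split-by-whitespace";

const input = "some interesting {{text}} {% and %} {{ some more }} text.";
// one range per variable, from the start of its head to the end of its tail
const wholeVariables = strFindHeadsTails(
  input,
  ["{{", "{%"],
  ["}}", "%}"]
).reduce((acc, curr) => {
  acc.push([curr.headsStartAt, curr.tailsEndAt]);
  return acc;
}, []);
const res2 = splitByW(input, {
  ignoreRanges: wholeVariables,
});
// => ['some', 'interesting', 'text.']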

We need array.reduce to adapt the string-find-heads-tails output, which is an array of plain objects in the following format (index values are elided):

[
  {
    headsStartAt: ...,
    headsEndAt: ...,
    tailsStartAt: ...,
    tailsEndAt: ...,
  },
  ...
]

and with the help of array.reduce we turn it into our format:

(first example with res1)

[
  [headsStartAt, headsEndAt],
  [tailsStartAt, tailsEndAt],
  ...
]

(second example with res2)

[
  [headsStartAt, tailsEndAt],
  ...
]
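The conversion can be tried without either library by feeding it hand-written data. In the sketch below, `found` is a made-up sample in the object shape described above, with indexes chosen to match the string Hi {{ firstName }}!; it is not captured from a real string-find-heads-tails call.

```javascript
// Hand-written sample in the shape described above, matching the string
// "Hi {{ firstName }}!"; it is not captured from a real library call.
const found = [
  { headsStartAt: 3, headsEndAt: 5, tailsStartAt: 16, tailsEndAt: 18 },
];

// First example (res1): ignore heads and tails as separate ranges.
const headsAndTails = found.reduce((acc, curr) => {
  acc.push([curr.headsStartAt, curr.headsEndAt]);
  acc.push([curr.tailsStartAt, curr.tailsEndAt]);
  return acc;
}, []);

// Second example (res2): ignore the whole variable, from head to tail.
const wholeVariables = found.reduce((acc, curr) => {
  acc.push([curr.headsStartAt, curr.tailsEndAt]);
  return acc;
}, []);

console.log(JSON.stringify(headsAndTails)); // [[3,5],[16,18]]
console.log(JSON.stringify(wholeVariables)); // [[3,18]]
```

Either array can then be passed as opts.ignoreRanges.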


See it in the monorepo, on GitHub.


To report bugs or request features or assistance, raise an issue on GitHub.

Any code contributions are welcome! All Pull Requests will be dealt with promptly.


MIT

Copyright © 2010–2021 Roy Revelt and other contributors

Related packages:

📦 detergent 8.0.1
Extracts, cleans and encodes text
📦 string-extract-class-names 7.0.1
Extracts CSS class/id names from a string
📦 string-match-left-right 8.0.1
Match substrings on the left or right of a given index, ignoring whitespace
📦 string-unfancy 5.0.1
Replace all n/m dashes, curly quotes with their simpler equivalents
📦 string-extract-sass-vars 3.0.1
Parse SASS variables file into a plain object of CSS key-value pairs
📦 string-remove-duplicate-heads-tails 6.0.1
Detect and (recursively) remove head and tail wrappings around the input string
📦 string-remove-thousand-separators 6.0.1
Detects and removes thousand separators (dot/comma/quote/space) from string-type digits