The String.prototype.match method, as specified in the ECMAScript spec:
If regexp is not an object whose [[Class]] property is "RegExp", it is replaced with the result of the expression new RegExp(regexp). Let string denote the result of converting the this value to a string. Then do one of the following:
- If regexp.global is false: Return the result obtained by invoking RegExp.prototype.exec (see 15.10.6.2) on regexp with string as parameter.
- If regexp.global is true: Set the regexp.lastIndex property to 0 and invoke RegExp.prototype.exec repeatedly until there is no match. If there is a match with an empty string (in other words, if the value of regexp.lastIndex is left unchanged), increment regexp.lastIndex by 1.
Let n be the number of matches. If n=0, then the value returned is null; otherwise, the value returned is an array with the length property set to n and properties 0 through n-1 corresponding to the first elements of the results of all matching invocations of RegExp.prototype.exec.
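The two branches of the spec text are easier to see side by side. This is a small sketch only; the sample string and variable names are made up for illustration:

```javascript
var text = 'a1b22c333';

// Without the g flag: same shape as /(\d)\d*/.exec(text).
// Index 0 holds the full match, index 1 the first capture group.
var single = text.match(/(\d)\d*/);

// With the g flag: an array of full matches only, no capture groups.
var all = text.match(/(\d)\d*/g);

// No match at all: null, with or without the g flag.
var none = text.match(/x/g);

console.log(single[0], single[1], single.index); // 1 1 1
console.log(all);                                // [ '1', '22', '333' ]
console.log(none);                               // null
```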
The first thing to note: even if you don’t pass a regular expression to .match(), it’ll be put through new RegExp(). So, this:
'foo'.match('fo+');
… is functionally the same as these:
'foo'.match(/fo+/);
'foo'.match(RegExp('fo+'));
'foo'.match(new RegExp('fo+'));
(Yes, calling RegExp() as a function instead of a constructor still creates a new instance.)
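Here is a quick way to check that equivalence for yourself. It is only a sketch; the variable names are mine, and the last line shows a side effect worth keeping in mind: a string argument is treated as a pattern, not as literal text.

```javascript
var fromString  = 'foo'.match('fo+');
var fromLiteral = 'foo'.match(/fo+/);
var fromCall    = 'foo'.match(RegExp('fo+'));
var fromNew     = 'foo'.match(new RegExp('fo+'));

// All four produce the same match.
console.log(fromString[0], fromLiteral[0], fromCall[0], fromNew[0]); // foo foo foo foo

// RegExp() without `new` still returns a RegExp instance.
console.log(RegExp('fo+') instanceof RegExp); // true

// Because the string becomes a pattern, '+' is a quantifier here,
// so the literal text 'a+b' does not match itself.
var literalPlus = 'a+b'.match('a+b');
console.log(literalPlus); // null
```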
Another thing to take from the ECMAScript spec on .match() is the following:
If n=0, then the value returned is null; otherwise, the value returned is an array with the length property set to n and properties 0 through n-1 corresponding to the first elements of the results of all matching invocations of RegExp.prototype.exec.
So, .match() will only ever return a non-empty array or null. Since its return value on a non-match (null) is falsey, we can do some expression magic to make sure we get something useful. First consider how it’s normally done:
var match = fooString.match(/.../);
if (match) {
  // ... act on the match
}
Depending on the circumstance this can be a nuisance. Plus, a positive match doesn’t always represent an important branch in your program, meaning that the if statement is redundant in that it doesn’t add anything to the program, but instead panders to the requirement of the language.
Let’s say, for example, I want to extract the result_count number from the string "result_count=23&err=1". If it doesn’t exist, I’d like it to default to, say, 5.
This is the regular expression we’ll be using:
/result_count=(\d+)/
The number we want will be in the first capture group:
thatString.match(/result_count=(\d+)/)[1]
That’s fine if we *know* that thatString will contain a result_count and that the result_count is a number (matched by \d+), but what if it’s not? .match() will return null, which’ll make the last expression throw a TypeError, since you can’t read a property (here, 1) from null.
And remember, if there is no match, we want resultCount to default to 5.
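The failure is easy to reproduce. In this sketch the two input strings and the variable names are made up for illustration:

```javascript
var withCount    = 'result_count=23&err=1';
var withoutCount = 'err=1';

// Works: the match array exists, and slot 1 holds the capture group.
console.log(withCount.match(/result_count=(\d+)/)[1]); // 23

// Fails: .match() returns null, and reading [1] from null throws.
var failure = null;
try {
  withoutCount.match(/result_count=(\d+)/)[1];
} catch (err) {
  failure = err;
}
console.log(failure instanceof TypeError); // true
```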
To get around this, and to avoid the use of extra unnecessary vars, we can do:
( thatString.match(/result_count=(\d+)/) || [,5] )[1]
It even takes care of the default result count (5).
We can split the above expression like so:
( A || B )[1]
The logical OR operator (||) will behave like so: If A is truthy then A is returned. If A is falsey then B is evaluated and returned.
So, if our .match() returns null, which is falsey, then the OR operator evaluates the right hand side of the OR operation ([,5]) which is an array and therefore truthy.
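A few throwaway lines show both halves of that: || hands back one of its operands rather than a boolean, and [,5] is a two-slot array whose first slot is empty. The variable names here are only for illustration:

```javascript
// || returns the first truthy operand, or the last operand if none is truthy.
var pickedRight = null || 'fallback';
var pickedLeft  = 'first' || 'second';
console.log(pickedRight, pickedLeft); // fallback first

// [,5] has length 2: an empty first slot and 5 in the second.
var sparse = [, 5];
console.log(sparse.length, sparse[0], sparse[1]); // 2 undefined 5

// Any array is truthy, even an empty one.
console.log(Boolean([])); // true
```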
This expression can only evaluate to one of two things:
thatString.match(/result_count=(\d+)/) || [,5]
… either the match array from the .match() method, or the array we specified on the right hand side, both of which have an item in the second slot ([...,HERE]). So, we can bung it in parentheses and access the second slot:
( thatString.match(/result_count=(\d+)/) || [,5] )[1]
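If you need this in more than one place, the expression wraps up neatly in a function. The following is only a sketch: getResultCount and its parameters are hypothetical names, not part of any library. One detail to watch is that a successful match gives you the string "23" while the default is the number 5, so this version passes the result through Number() to keep the return type consistent.

```javascript
// Hypothetical helper: extract result_count from a query string,
// falling back to a default when it is missing.
function getResultCount(query, fallback) {
  var found = (query.match(/result_count=(\d+)/) || [, fallback])[1];
  // A match yields a string; the fallback is whatever the caller passed.
  return Number(found);
}

console.log(getResultCount('result_count=23&err=1', 5)); // 23
console.log(getResultCount('err=1', 5));                 // 5

// Without Number(), the two paths give different types:
var raw = ('result_count=23'.match(/result_count=(\d+)/) || [, 5])[1];
console.log(typeof raw); // string
```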
This is what makes JS fun for me. Going mad with expressions!
Thanks for reading! Please share your thoughts with me on Twitter. Have a great day!
Nice! I hate when the regexp doesn’t match and the array element is not there making everything fail… Your solution is a good practice 😀 Thanks!
I used to use this technique all the time when accessing fields from the eBay web service. For instance, if I was pulling the prices from an item I’d have something like this:
instead of
Sometimes I’d have result sets where there could be about 5 optional fields, so this simplified things without having to write a separate function. It meant I could separate my logic from the parsing very easily. It was a trick I’d taught myself so I was really proud at the time, it’s one of the things that made me eager to refactor as much as possible 🙂 Of course, if there had been too many, I’d just write a function to handle it and let that function do the extra leg work to keep the code tight.
I do agree, though, that this is one of the sweeter sides to JS.
Genius, as always, James!
I love these little language gotchas, especially when they’re useful like this!
I used this pretty often with event global object for Internet Explorer.
The way javascript handles logical OR in assignment is one of my favorite things about the language. It allows you to do so many things more elegantly:
etc… good times.
If you wanted to make sure you were dealing with a number you can wrap it even more!
Or, Craig, you can do my favorite trick…
Nice post mate, enjoyed it!
Solid. Thanks so much James! I love this language more every day…
@MGoulart
Actually this is double redundant/bad:
var e = e || event;
because 1) e already exists in the scope and 2) you might be assigning e to e like this. Better (and perfect and the only right syntax) would be:
e || (e=event);
Still bad would be:
e = e || event; // at least no var
@DataZombies
You forgot to remove the , 10. Now the comma operator comes into play and it always uses 10 and never the regex result. Comma operator? Si.
Great post