Regular Expression - ignore or exclude a specific word, find everything else
Posted in development on March 24, 2021 by Adrian Wyssmann ‐ 2 min read
I want to use a regular expression to exclude a complete word. I need this for a particular situation, which I explain below.
Problem
As part of Implementing a vulnerability Waiver Process for infected 3rd party libraries, I have a Jira transition dialog, which expects the user to set some values. There are two drop-down fields or, as Jira calls them, “Select List (single choice)”. These always present the value None in case nothing is selected.
To ensure that a proper value is selected when transitioning to a specific state, we use jira-validators. These validators support regular expressions, so the question is: how do I ensure that the selected value is not None?
Solution
After some searching on the web, I found a solution in the regular-expressions-cookbook, whose sample is easy to read. So the solution is
\b(?!None\b)\w+
The result is a proper evaluation of the value in the dialog.
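To try the pattern outside of Jira, a minimal sketch in Python may help. It rests on assumptions: the option names High and Low are made up, the helper is_valid_selection is hypothetical, and the Jira validator may use a different regex engine or matching mode than re.search.

```python
import re

# Any word except the whole word "None".
NOT_NONE = re.compile(r"\b(?!None\b)\w+")

def is_valid_selection(value: str) -> bool:
    """Return True if the value contains a word other than 'None'."""
    return NOT_NONE.search(value) is not None

# Hypothetical drop-down values, for illustration only.
for value in ["None", "High", "Low", "Nonexistent"]:
    print(f"{value!r}: {is_valid_selection(value)}")
```

Only the exact word None is rejected here; a value such as Nonexistent still passes because None is not a whole word in it.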
Explain the details
The regular-expressions-cookbook and regular-expressions.info explain why the above solution works:
Typing a caret ^ after the opening square bracket negates the character class. The result is that the character class matches any character that is not in the character class.
The issue with this is the part “matches any character”: [^None] excludes the individual characters N, o, n and e rather than the word None, but we care about the whole word.
\b allows you to perform a “whole words only” search using a regular expression in the form of \bword\b
The issue with that is that \b[^None]\w+\b still relies on the character class, so it rejects any word that starts with N, o, n or e, not just the word None.
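The two failed attempts can be reproduced with a short Python sketch. The sample words are made up for illustration, and re.search is an assumption about how the expression is applied; the Jira validator may behave differently.

```python
import re

# Hypothetical sample values; only "None" should be rejected.
words = ["None", "one", "Neon", "email", "High"]

# Attempt 1: a negated character class tests single characters, not words.
char_class = re.compile(r"[^None]")
# Attempt 2: with word boundaries, the class still tests one character,
# namely the first character of the word.
bounded = re.compile(r"\b[^None]\w+\b")

attempt1 = {w: bool(char_class.search(w)) for w in words}
attempt2 = {w: bool(bounded.search(w)) for w in words}

print(attempt1)  # "one" and "Neon" are rejected along with "None"
print(attempt2)  # "email" is rejected too, because it starts with "e"
```

In both attempts, legitimate values are rejected only because of the characters they contain, which is why a character class is the wrong tool for excluding a whole word.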
What is needed instead is a negative lookahead, (?!regex): similar to positive lookahead, except that negative lookahead only succeeds if the regex inside the lookahead fails to match.
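A small Python sketch can show how a negative lookahead behaves, and why a word boundary inside it matters. The sample values and the helper matched are hypothetical and serve only as an illustration.

```python
import re

without_boundary = re.compile(r"(?!None)\w+")
with_boundary = re.compile(r"(?!None\b)\w+")

def matched(pattern, value):
    # re.match anchors the attempt at the start of the value.
    return pattern.match(value) is not None

print(matched(without_boundary, "High"))         # True
print(matched(without_boundary, "None"))         # False
print(matched(without_boundary, "Nonexistent"))  # False: the prefix "None" blocks it
print(matched(with_boundary, "Nonexistent"))     # True: "None" is not a whole word here
print(matched(with_boundary, "None"))            # False
```

The lookahead consumes no characters. It only checks what follows the current position, so \w+ afterwards still matches the word from its first character.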
So the final solution, \b(?!None\b)\w+, combines the techniques mentioned above:
\b asserts the position at a word boundary
(?! starts the negative lookahead, i.e. not followed by
None the word we want to “ignore”, i.e. the word that should not match
\b asserts the position at a word boundary
) ends the negative lookahead
\w+ matches any other word
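The same breakdown can be written as a commented pattern using Python's re.VERBOSE flag. This is a sketch for experimenting, not the Jira configuration itself, and the sample text is made up.

```python
import re

PATTERN = re.compile(
    r"""
    \b          # position at a word boundary
    (?!         # start negative lookahead: not followed by ...
        None    # ... the word we want to ignore
        \b      # ... ending at a word boundary
    )           # end of the negative lookahead
    \w+         # match the word itself
    """,
    re.VERBOSE,
)

# Hypothetical sample text, for illustration only.
text = "None High Nonexistent Low"
print(PATTERN.findall(text))  # ['High', 'Nonexistent', 'Low']
```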