Regex: match everything but:
- a string starting with a specific pattern (e.g. any string, including an empty one, that does not start with `foo`):
  - Lookahead-based solution for NFAs:
  - Negated character class based solution for regex engines not supporting lookarounds:
- a string ending with a specific pattern (say, no `world.` at the end):
  - Lookbehind-based solution:
  - Lookahead solution:
  - POSIX workaround:
- a string containing specific text (say, do not match a string containing `foo`):
  - Lookaround-based solution:
  - POSIX workaround:
- a string containing a specific character (say, avoid matching a string containing a `|` symbol):
- a string equal to some string (say, not equal to `foo`):
- a sequence of characters:
- a certain single character or a set of characters:
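Several of the cases in the list can be sketched with Python's `re` module. The patterns below are illustrative assumptions chosen to fit each description, not necessarily the ones the list originally pointed to, and the helper name `matches` is hypothetical.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the pattern matches somewhere in text."""
    return re.search(pattern, text) is not None

# Not starting with "foo": negative lookahead right after the start anchor.
NOT_STARTING_FOO = r"^(?!foo).*$"
# Same idea without lookarounds: one of the first three chars differs
# from "foo", or the string has fewer than three chars.
NOT_STARTING_FOO_NO_LOOKAROUND = r"^(([^f].{2}|.[^o].|.{2}[^o]).*|.{0,2})$"

# Not ending with "world.": negative lookbehind just before the end anchor...
NOT_ENDING_LOOKBEHIND = r"^.*(?<!world\.)$"
# ...or a negative lookahead that scans ahead to the end of the string.
NOT_ENDING_LOOKAHEAD = r"^(?!.*world\.$).*$"

# Not containing "foo" anywhere in the string.
NOT_CONTAINING_FOO = r"^(?!.*foo).*$"

# Not containing "|": a negated character class is enough here.
NOT_CONTAINING_PIPE = r"^[^|]*$"

# Not equal to "foo" (longer strings that merely start with "foo" still match).
NOT_EQUAL_FOO = r"^(?!foo$).*$"

# Runs of any characters other than lowercase ASCII letters.
NON_LOWERCASE_RUN = r"[^a-z]+"

if __name__ == "__main__":
    for sample in ["foobar", "fo", "hello world.", "a|b", "foo"]:
        print(repr(sample), matches(NOT_STARTING_FOO, sample))
```

These were checked only against single-line strings; multi-line input needs the adjustments described in the notes that follow.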
Demo note: the newline `\n` is used inside negated character classes in demos to avoid match overflow to the neighboring line(s). It is not necessary when testing individual strings.
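A minimal Python sketch of the overflow the demo note describes; the sample text `demo` is made up for illustration.

```python
import re

demo = "a|b\nc|d"  # two lines in one test string (assumed sample)

# Without \n in the class, a match can run across the line break.
overflowing = re.findall(r"[^|]+", demo)

# With \n excluded as well, every match stays on its own line.
per_line = re.findall(r"[^|\n]+", demo)

print(overflowing)
print(per_line)
```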
Anchor note: In many languages, use `\A` to define the unambiguous start of string, and `\z` (in Python, it is `\Z`; in JavaScript, `$` is OK) to define the very end of the string.
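As a rough illustration in Python (where the end-of-string anchor is spelled `\Z`), the difference shows up with a trailing newline and with the multiline flag. The sample strings are assumptions.

```python
import re

trailing_newline = "foo\n"
two_lines = "bar\nfoo"

# $ also matches just before a final newline; \Z only at the very end.
dollar_hit = re.search(r"foo$", trailing_newline) is not None
end_hit = re.search(r"foo\Z", trailing_newline) is not None

# In multiline mode ^ matches at every line start; \A only at the string start.
caret_hit = re.search(r"^foo", two_lines, re.MULTILINE) is not None
start_hit = re.search(r"\Afoo", two_lines, re.MULTILINE) is not None

print(dollar_hit, end_hit, caret_hit, start_hit)
```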
Dot note: In many flavors (but not POSIX, TRE, TCL), `.` matches any char but a newline char. Make sure you use a corresponding DOTALL modifier (`/s` in PCRE/Boost/.NET/Python/Java and `/m` in Ruby) for the `.` to match any char including a newline.
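In Python, for instance, the DOTALL modifier is spelled `re.DOTALL` (or `re.S`, or the inline `(?s)`). A small sketch with an assumed two-line sample:

```python
import re

text = "line one\nline two"

# By default the dot stops at the newline.
default_match = re.match(r".*", text).group()

# With DOTALL the dot also consumes the newline.
dotall_match = re.match(r".*", text, re.DOTALL).group()

# The inline flag has the same effect as passing re.DOTALL.
inline_match = re.match(r"(?s).*", text).group()

print(repr(default_match))
print(repr(dotall_match))
```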
Backslash note: In languages where you have to declare patterns with C strings allowing escape sequences (like `\n` for a newline), you need to double the backslashes escaping special characters so that the engine can treat them as literal characters (e.g. in Java, `world\.` will be declared as `"world\\."`, or use a character class: `"world[.]"`). Alternatively, use raw string literals (Python `r'world\.'`), C# verbatim string literals `@"world\."`, or slashy strings/regex literal notations like `/world\./`.
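The same point can be sketched in Python, which has both ordinary string literals (backslashes must be doubled) and raw string literals. The variable names are made up for the example.

```python
import re

escaped = "world\\."    # ordinary literal: the backslash is doubled
raw = r"world\."        # raw literal: the backslash is written once
char_class = "world[.]" # character class: no backslash needed
unescaped = "world."    # unescaped dot: matches any character

# Both spellings produce the same pattern text.
same_pattern = escaped == raw

literal_dot_hit = re.search(raw, "hello world.") is not None
literal_dot_miss = re.search(raw, "hello worldX") is not None
class_miss = re.search(char_class, "hello worldX") is not None
any_char_hit = re.search(unescaped, "hello worldX") is not None

print(same_pattern, literal_dot_hit, literal_dot_miss, class_miss, any_char_hit)
```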