Matching a backslash, not matching a given string, and matching with a regex literal


Not matching a given string

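A negative lookahead asserts that a given string does not occur at the current position. Anchored at the start of the input, as in the example below, it rejects any input that begins with that string: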

Regex syntax: (?!string-to-not-match)

Example:

// not matching input that starts with "popcorn"
String regexString = "^(?!popcorn).*$";
System.out.println("[popcorn] " + ("popcorn".matches(regexString) ? "matched!" : "nope!"));
System.out.println("[unicorn] " + ("unicorn".matches(regexString) ? "matched!" : "nope!"));

Output:

[popcorn] nope!
[unicorn] matched!
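
Because the lookahead in the example is evaluated only at the start of the input, a string such as "I like popcorn" still matches. The following is a minimal sketch of how the two cases differ, assuming single-line input; the class and method names are hypothetical and chosen only for illustration. Placing .* inside the lookahead makes it scan the rest of the input, so the string is rejected wherever it occurs.

```java
import java.util.regex.Pattern;

public class NegativeLookaheadDemo {
    // Rejects only input that STARTS with "popcorn": the lookahead is checked at position 0.
    static final Pattern NOT_STARTING = Pattern.compile("^(?!popcorn).*$");

    // Rejects input that contains "popcorn" anywhere: ".*" lets the lookahead scan ahead.
    static final Pattern NOT_CONTAINING = Pattern.compile("^(?!.*popcorn).*$");

    static boolean doesNotStartWithPopcorn(String s) {
        return NOT_STARTING.matcher(s).matches();
    }

    static boolean doesNotContainPopcorn(String s) {
        return NOT_CONTAINING.matcher(s).matches();
    }

    public static void main(String[] args) {
        String input = "I like popcorn";
        System.out.println(doesNotStartWithPopcorn(input)); // true
        System.out.println(doesNotContainPopcorn(input));   // false
    }
}
```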

Matching with a regex literal

If you need to match characters that are a part of the regular expression syntax you can mark all or part of the pattern as a regex literal.

\Q marks the beginning of the regex literal. \E marks the end of the regex literal.

// the following throws a PatternSyntaxException because of the un-closed bracket
"[123".matches("[123");
 
// wrapping the bracket in \Q and \E allows the pattern to match as you would expect.
"[123".matches("\\Q[\\E123"); // returns true

An easier way to do this, without having to remember the \Q and \E escape sequences, is to use Pattern.quote():

"[123".matches(Pattern.quote("[") + "123"); // returns true
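
Pattern.quote() is most useful when the literal part comes from elsewhere and may contain several metacharacters. The sketch below assumes a hypothetical label, "price (USD)*: ", that must be matched literally and followed by digits; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Builds a pattern that matches the given text literally, followed by one or more digits.
    static Pattern literalThenDigits(String literal) {
        return Pattern.compile(Pattern.quote(literal) + "\\d+");
    }

    public static void main(String[] args) {
        // The parentheses and the asterisk are treated as plain characters.
        Pattern p = literalThenDigits("price (USD)*: ");
        System.out.println(p.matcher("price (USD)*: 42").matches()); // true

        // Pattern.quote() simply wraps its argument in \Q and \E.
        System.out.println(Pattern.quote("[")); // \Q[\E
    }
}
```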

Matching a backslash

If you want to match a backslash in your regular expression, you have to escape it.

The backslash is an escape character in regular expressions, so the regular expression \\ refers to a single literal backslash.

However, the backslash is also an escape character in Java string literals. To write a regular expression as a string literal, you have to escape each of its backslashes again. The string literal "\\\\" therefore produces the regular expression \\, which in turn matches a single \ character.
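
The two layers of escaping can be checked by looking at string lengths. This is a minimal sketch with hypothetical class and method names: the four-character source literal holds a two-character regular expression, which matches a one-character input.

```java
public class BackslashLayers {
    // Source text has four backslashes; the resulting string holds two: \\
    static String regexForOneBackslash() {
        return "\\\\";
    }

    // Source text has two backslashes; the resulting string holds one: \
    static String oneBackslash() {
        return "\\";
    }

    public static void main(String[] args) {
        System.out.println(regexForOneBackslash().length());                  // 2
        System.out.println(oneBackslash().length());                          // 1
        System.out.println(oneBackslash().matches(regexForOneBackslash()));   // true
    }
}
```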

For example, consider matching strings like "C:\dir\myfile.txt". The regular expression ([A-Za-z]):\\(.*) matches such a path and captures the drive letter in group 1. Note the doubled backslash.

To express that pattern in a Java string literal, each of the backslashes in the regular expression needs to be escaped.

String path = "C:\\dir\\myfile.txt";
System.out.println( "Local path: " + path ); // "C:\dir\myfile.txt"
 
String regex = "([A-Za-z]):\\\\(.*)"; // Four to match one
System.out.println("Regex: " + regex ); // "([A-Za-z]):\\(.*)"
 
Pattern pattern = Pattern.compile( regex );
Matcher matcher = pattern.matcher( path );
if ( matcher.matches()) {
	System.out.println( "This path is on drive " + matcher.group( 1 ) + ":.");
	// This path is on drive C:.
}

If you want to match two backslashes, you'll find yourself using eight in a literal string, to represent four in the regular expression, to match two.

String path = "\\\\myhost\\share\\myfile.txt";
System.out.println( "UNC path: " + path ); // \\myhost\share\myfile.txt
 
String regex = "\\\\\\\\(.+?)\\\\(.*)"; // Eight to match two
System.out.println("Regex: " + regex ); // \\\\(.+?)\\(.*)
 
Pattern pattern = Pattern.compile( regex );
Matcher matcher = pattern.matcher( path );
 
if ( matcher.matches()) {
	System.out.println( "This path is on host '" + matcher.group( 1 ) + "'.");
	// This path is on host 'myhost'.
}
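
One way to reduce the backslash count is to let Pattern.quote() handle the regex-level escaping, so that only the string-literal escaping remains. The sketch below is an alternative to the eight-backslash pattern, not a replacement required by the API; the class name UncHostDemo and the method hostOf are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UncHostDemo {
    // Each quoted piece is matched literally, so two backslashes need only four in source.
    static final Pattern UNC = Pattern.compile(
            Pattern.quote("\\\\") + "(.+?)" + Pattern.quote("\\") + "(.*)");

    // Returns the host part of a UNC path, or null if the path does not match.
    static String hostOf(String path) {
        Matcher m = UNC.matcher(path);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(hostOf("\\\\myhost\\share\\myfile.txt")); // myhost
    }
}
```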
