Java – Why doesn’t Pattern.pattern() embed flags?


I’ve been working on regular expressions recently and noticed this.

Pattern pNoEmbed = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
Pattern pEmbed = Pattern.compile("(?i)[ a-z]+");

Below is the output of the pattern() method, which is supposed to return the pattern string. toString() seems to return the same thing.

Both are case-insensitive, so why doesn't the first one include (?i)?
And if I want it, how do I get it other than "(?i)" + pattern?

System.out.println(pNoEmbed.pattern());  // [ a-z]+
System.out.println(pEmbed.pattern());    // (?i)[ a-z]+

As a sanity check, both match as expected.

String s = "hello World";
System.out.println(pNoEmbed.matcher(s).matches());  // true
System.out.println(pEmbed.matcher(s).matches());    // true

(Tested with Java 8.)


To clarify:

I want to embed one regular expression in another regular expression:

Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
Pattern p2 = Pattern.compile(p1.pattern() + "\\s+");

This is a bad example, because I know I can do this:

Pattern p2 = Pattern.compile(p1.pattern() + "\\s+", p1.flags());

But basically, I want p2.pattern() to be "(?i)[ a-z]+\\s+".

Solution

Both have case-insensitivity, so why no (?i) in the first one?

Most directly, because that is how Pattern.pattern() is documented:

Returns the regular expression from which this pattern was compiled.

I guess this raises the question of why there is no additional or alternative way to get a regular expression string that combines the original regular expression with the applied flags. Only speculative answers are possible, but I observe:

  • Pattern also has a flags() method through which the flags can be retrieved. Using it together with pattern() allows you to compile a new pattern that is equivalent to the original, provided that the pattern does not modify flags globally (see the comments on the question for more on this qualification).

  • It is conceivable that users of Pattern want to distinguish between flags that are embedded in the regular expression string and flags that are passed separately as an argument.
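
The first observation can be sketched as follows. This is only an illustration: RecompileDemo and recompile are hypothetical names, not part of java.util.regex, and the sketch assumes the pattern string contains no embedded flag expressions of its own.

```java
import java.util.regex.Pattern;

final class RecompileDemo {
    // Hypothetical helper: rebuilds a pattern from the two pieces that
    // Pattern exposes separately, the source string and the flag bits.
    static Pattern recompile(Pattern original) {
        return Pattern.compile(original.pattern(), original.flags());
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
        Pattern copy = recompile(p1);

        // The source string is unchanged; the flag travels in flags().
        System.out.println(copy.pattern());                                  // [ a-z]+
        System.out.println((copy.flags() & Pattern.CASE_INSENSITIVE) != 0);  // true
        System.out.println(copy.matcher("hello World").matches());          // true
    }
}
```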

And if I wanted it, how would I get it other than "(?i)" + pattern?

As far as I know, there is no built-in mechanism for getting the regular expression string you want. You could build one with the help of Pattern.flags(), but the way it works would probably not differ much from what you describe.
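
A rough sketch of such a mechanism is shown below. FlagEmbedder and embedFlags are hypothetical names introduced for illustration. The sketch maps each flag bit reported by flags() to its embedded letter; it rejects LITERAL and CANON_EQ, which have no embedded form. If the pattern string already contains embedded flags, the result may repeat them, which is redundant but should be harmless.

```java
import java.util.regex.Pattern;

final class FlagEmbedder {
    // Flags that have an embedded form, paired with their letters.
    private static final int[] FLAGS = {
        Pattern.UNIX_LINES, Pattern.CASE_INSENSITIVE, Pattern.COMMENTS,
        Pattern.MULTILINE, Pattern.DOTALL, Pattern.UNICODE_CASE,
        Pattern.UNICODE_CHARACTER_CLASS
    };
    private static final char[] LETTERS = {'d', 'i', 'x', 'm', 's', 'u', 'U'};

    // Hypothetical helper: returns the pattern string prefixed with an
    // embedded flag expression such as (?i) that mirrors p.flags().
    static String embedFlags(Pattern p) {
        int flags = p.flags();
        if ((flags & (Pattern.LITERAL | Pattern.CANON_EQ)) != 0) {
            throw new IllegalArgumentException(
                "LITERAL and CANON_EQ cannot be expressed as embedded flags");
        }
        StringBuilder letters = new StringBuilder();
        for (int i = 0; i < FLAGS.length; i++) {
            if ((flags & FLAGS[i]) != 0) {
                letters.append(LETTERS[i]);
            }
        }
        // No flags set: return the source string untouched.
        return letters.length() == 0
            ? p.pattern()
            : "(?" + letters + ")" + p.pattern();
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
        Pattern p2 = Pattern.compile(embedFlags(p1) + "\\s+");
        System.out.println(p2.pattern());  // (?i)[ a-z]+\s+
    }
}
```

Note that a leading (?i) stays in effect for everything appended after it. If the flag should apply only to the embedded part, the scoped form (?i:[ a-z]+) is an alternative, at the cost of a slightly different string.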
