Why doesn’t Pattern.pattern() embed flags?
I’ve been working on regular expressions recently and noticed this.
Pattern pNoEmbed = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
Pattern pEmbed = Pattern.compile("(?i)[ a-z]+");
Here is the output of the pattern() method, which should return the pattern string. toString() seems to return the same thing.
System.out.println(pNoEmbed.pattern()); // [ a-z]+
System.out.println(pEmbed.pattern()); // (?i)[ a-z]+
Both are case-insensitive, so why doesn’t the first one include (?i)?
And if I want it, how do I get it other than "(?i)" + pattern?
As a sanity check, both match as expected.
String s = "hello World";
System.out.println(pNoEmbed.matcher(s).matches()); // true
System.out.println(pEmbed.matcher(s).matches()); // true
(Tested with Java 8.)
To be clearer:
I want to embed one regular expression in another:
Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
Pattern p2 = Pattern.compile(p1.pattern() + "\\s+");
This is a bad example, because I know I can do this:
Pattern p2 = Pattern.compile(p1.pattern() + "\\s+", p1.flags());
But basically, I want p2.pattern() to be "(?i)[a-z]+\\s+".
Solution
Both have case-insensitivity, so why no (?i) in the first one?
Most directly, because Pattern.pattern() is documented to do exactly this:
"Returns the regular expression from which this pattern was compiled."
I suppose that raises the question of why there isn’t an additional or different method that returns a regex string combining the original regular expression with the applied flags. Only speculative answers are possible, but I observe that Pattern also has a flags() method through which the flags can be retrieved. Using it together with pattern() allows you to compile a new pattern that is functionally identical to the original, provided that the pattern does not modify the flags globally (see the comments on the question for more on this qualification). As things stand, users of Pattern can distinguish between flags that are embedded in the regular expression string and flags that are passed separately.
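A minimal sketch of the pattern() plus flags() round trip described above. The class name RecompileDemo and the helper recompile are made up for illustration. Only Pattern.compile(String, int), pattern() and flags() come from the JDK.

```java
import java.util.regex.Pattern;

class RecompileDemo {
    // Hypothetical helper: rebuilds a pattern from its source string and its flags.
    static Pattern recompile(Pattern original) {
        return Pattern.compile(original.pattern(), original.flags());
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
        Pattern copy = recompile(p1);

        // The source string is unchanged; the flag travels separately.
        System.out.println(copy.pattern());                        // [ a-z]+
        System.out.println(copy.matcher("hello World").matches()); // true
    }
}
```

The copy behaves like the original because the flag is passed alongside the string rather than written into it.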
And if I wanted it, how would I get it other than "(?i)" + pattern?
As far as I know, there is no built-in mechanism for getting the regex string you want. You can build one with the help of Pattern.flags(), but its basic mode of operation would probably not be much different from what you describe.
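One way such a mechanism might look, as a rough sketch rather than anything the JDK provides. The class EmbeddedFlags and the method toEmbeddedString are hypothetical names. The sketch assumes flags() reports only the flags that were passed to compile(), and it rejects LITERAL and CANON_EQ because those two have no embedded flag expression.

```java
import java.util.regex.Pattern;

class EmbeddedFlags {
    // Flags that have an embedded-flag letter, listed in the same order as LETTERS.
    private static final int[] FLAGS = {
        Pattern.UNIX_LINES, Pattern.CASE_INSENSITIVE, Pattern.COMMENTS,
        Pattern.MULTILINE, Pattern.DOTALL, Pattern.UNICODE_CASE,
        Pattern.UNICODE_CHARACTER_CLASS
    };
    private static final char[] LETTERS = {'d', 'i', 'x', 'm', 's', 'u', 'U'};

    // Hypothetical helper: returns the pattern source prefixed with its flags,
    // for example "(?i)[ a-z]+". Returns the source unchanged if no flags are set.
    static String toEmbeddedString(Pattern p) {
        int flags = p.flags();
        if ((flags & (Pattern.LITERAL | Pattern.CANON_EQ)) != 0) {
            throw new IllegalArgumentException("LITERAL and CANON_EQ have no embedded form");
        }
        StringBuilder letters = new StringBuilder();
        for (int i = 0; i < FLAGS.length; i++) {
            if ((flags & FLAGS[i]) != 0) {
                letters.append(LETTERS[i]);
            }
        }
        return letters.length() == 0
                ? p.pattern()
                : "(?" + letters + ")" + p.pattern();
    }

    public static void main(String[] args) {
        Pattern p1 = Pattern.compile("[ a-z]+", Pattern.CASE_INSENSITIVE);
        Pattern p2 = Pattern.compile(toEmbeddedString(p1) + "\\s+");
        System.out.println(p2.pattern()); // (?i)[ a-z]+\s+
    }
}
```

Note that a leading (?i) also applies to whatever is concatenated after it. If the flags should cover only the embedded expression, a scoped group such as (?i:[ a-z]+) may be the safer form to emit.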