Ruby – Lazy (non-greedy) regular expression matching of multiple groups

I want to grab the content between each pair of <tag></tag> tags.

<tag>
This is one block of text
</tag>

<tag>
This is another one
</tag>

The regular expression I came up with is

/<tag>(.*)<\/tag>/m

However, it appears to be greedy: it captures everything between the parentheses up to the last </tag>. I want it to be as lazy as possible, so that each time it sees a closing tag it treats that as the end of a match group and starts over.

How do I write a regular expression so that I get multiple matches in this scenario?

I have included an example of what I describe at the following link:

http://rubular.com/r/JW5M3rnqIE

Note: This is not XML, nor is it really based on any existing standard format. I don’t need anything complicated like a full-fledged library with a nice parser.

Solution

Use the regular expression pattern:

/<tag>(.*?)<\/tag>/im

The lazy (non-greedy) quantifier is .*?, not .*.
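
To make the difference concrete, here is a minimal sketch that pulls the first capture out of the sample input with each quantifier. The variable names (text, greedy_first, lazy_first) are made up for illustration, and the sample string is simply the two blocks from the question.

```ruby
# Sample input taken from the question: two <tag> blocks.
text = "<tag>\nThis is one block of text\n</tag>\n\n<tag>\nThis is another one\n</tag>\n"

# Greedy: .* runs as far as it can, so the capture extends to the LAST </tag>
# and swallows the first closing tag and the second opening tag along the way.
greedy_first = text[/<tag>(.*)<\/tag>/m, 1]

# Lazy: .*? stops at the FIRST </tag> it can reach, so the capture is one block.
lazy_first = text[/<tag>(.*?)<\/tag>/m, 1]

puts greedy_first.include?("</tag>")  # the greedy capture spans both blocks
puts lazy_first.strip                 # the lazy capture is just the first block
```

The m flag matters here as well: in Ruby it lets . match newlines, which is needed because each block spans several lines.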

To find multiple occurrences, use:

string.scan(/<tag>(.*?)<\/tag>/im)
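
As a rough usage sketch, assuming the input from the question: when the pattern contains a capture group, scan returns an array of arrays (one inner array per match), so flattening and stripping the results is a convenient follow-up. The names input and blocks are hypothetical.

```ruby
# The two blocks from the question, written as a squiggly heredoc (Ruby 2.3+).
input = <<~TEXT
  <tag>
  This is one block of text
  </tag>

  <tag>
  This is another one
  </tag>
TEXT

# scan yields [[capture], [capture]]; flatten to a plain list and
# strip the newlines that surround each block's content.
blocks = input.scan(/<tag>(.*?)<\/tag>/im).flatten.map(&:strip)

p blocks  # => ["This is one block of text", "This is another one"]
```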