Lazy (non-greedy) regular expression to match multiple groups
I want to capture the content between each pair of <tag></tag> tags.
<tag>
This is one block of text
</tag>
<tag>
This is another one
</tag>
The regular expression I came up with is
/<tag>(.*)<\/tag>/m
However, it appears to be greedy: it captures everything inside the parentheses up to the last </tag>. I want it to be as lazy as possible, so that every time it sees a closing tag it treats what came before as one match group and starts over.
How do I write the regular expression so that I get multiple matches in this scenario?
I have included an example of what I described at the following link:
http://rubular.com/r/JW5M3rnqIE
Note: This is not XML, nor is it really based on any existing standard format. I don’t need anything complicated like a full-fledged library with a nice parser.
Solution
Use the regular expression pattern:
/<tag>(.*?)<\/tag>/im
The lazy (non-greedy) quantifier is .*?, not .*.
To find multiple occurrences, use:
string.scan(/<tag>(.*?)<\/tag>/im)
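To see the difference on the input from the question, here is a minimal sketch. The variable names (text, greedy, lazy, blocks) are made up for illustration, and the call to strip is an assumption about what you want to do with the surrounding newlines; drop it if you need the raw capture.

```ruby
# Sample input mirroring the two <tag> blocks from the question.
text = "<tag>\nThis is one block of text\n</tag>\n<tag>\nThis is another one\n</tag>\n"

# Greedy: .* runs to the LAST </tag>, so both blocks land in one match.
greedy = /<tag>(.*)<\/tag>/m

# Lazy: .*? stops at the FIRST </tag> it can reach.
# In Ruby, the /m flag lets . match newlines; /i ignores case.
lazy = /<tag>(.*?)<\/tag>/im

greedy_matches = text.scan(greedy)

# scan returns one array per match, each holding that match's capture groups.
# Destructure the single group and trim the surrounding newlines.
blocks = text.scan(lazy).map { |(body)| body.strip }

puts "greedy found #{greedy_matches.length} match(es)"
puts "lazy found #{blocks.length} match(es)"
blocks.each { |b| puts b }
```

With the greedy pattern, scan yields a single match spanning both blocks; with the lazy pattern it yields one match per <tag>...</tag> pair.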