Find all the matching substrings, not only the "most extended" one


The code

String s = "y z a a a b c c z";
Pattern p = Pattern.compile("(a )+(b )+(c )*c");
Matcher m = p.matcher(s);
while (m.find()) {
    System.out.println(m.group());
}

prints

a a a b c c

which is right.

But logically, the substrings

a a a b c
a a b c c
a a b c
a b c c
a b c

match the regex too.

So, how can I make the code find those substrings too, i.e. not only the most extended one, but also its children?


You can use the reluctant quantifiers such as *? and +?. These match as little as possible, in contrast to the standard * and +, which are greedy, i.e. match as much as possible. Still, this only lets you find particular "sub-matches", not all of them, because find() reports at most one match per starting position and then continues after the end of that match. Lookahead and non-capturing groups, also described in the Pattern docs, give you some more control. But to really find all sub-matches, you would probably have to do the work yourself, i.e. build the automaton that corresponds to the regex and navigate it with custom code.
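If building the automaton by hand is more than you need, a brute-force alternative is to test every (start, end) window of the input with Matcher.region() and Matcher.matches(). This is only a sketch, not the automaton approach described above: the class name AllSubMatches and the helper findAll are made up for illustration, and the pattern is assumed to be (a )+(b )+(c )*c, which is the reading under which the substrings listed in the question actually match. It makes a quadratic number of match attempts, so it is only reasonable for short inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AllSubMatches {

    // Hypothetical helper: returns every non-empty substring of s that the
    // pattern matches in full, ordered by start index, then by end index.
    static List<String> findAll(Pattern p, String s) {
        List<String> result = new ArrayList<>();
        Matcher m = p.matcher(s);
        for (int start = 0; start < s.length(); start++) {
            for (int end = start + 1; end <= s.length(); end++) {
                // region() resets the matcher and restricts it to [start, end)
                m.region(start, end);
                // matches() succeeds only if the whole region matches
                if (m.matches()) {
                    result.add(m.group());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed pattern, see the note above the code
        Pattern p = Pattern.compile("(a )+(b )+(c )*c");
        for (String match : findAll(p, "y z a a a b c c z")) {
            System.out.println(match);
        }
    }
}
```

For the string from the question this prints

a a a b c
a a a b c c
a a b c
a a b c c
a b c
a b c c

Identical substrings found at different positions are reported once per position; collect into a Set instead if you only want distinct strings.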