Greedy | Reluctant | Possessive | Description
X? | X?? | X?+ | X, once or not at all
X* | X*? | X*+ | X, zero or more times
X+ | X+? | X++ | X, one or more times
X{n} | X{n}? | X{n}+ | X, exactly n times
X{n,} | X{n,}? | X{n,}+ | X, at least n times
X{n,m} | X{n,m}? | X{n,m}+ | X, at least n but not more than m times
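To make the three columns of the table concrete, here is a small sketch that runs the bounded quantifier X{n,m} in each mode. The class name BoundedQuantifierDemo and the helper matches are made up for this illustration; only the java.util.regex calls are standard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the harness shown in this post.
class BoundedQuantifierDemo {
    // Returns the text of every match found by repeated find() calls.
    static List<String> matches(String regex, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "aaaaa";
        System.out.println(matches("a{2,4}", text));   // greedy:     [aaaa]
        System.out.println(matches("a{2,4}?", text));  // reluctant:  [aa, aa]
        System.out.println(matches("a{2,4}+", text));  // possessive: [aaaa]

        // With a trailing 'a' in the pattern, the greedy form gives one
        // character back, while the possessive form refuses to.
        System.out.println(matches("a{2,4}a", "aaaa"));  // [aaaa]
        System.out.println(matches("a{2,4}+a", "aaaa")); // []
    }
}
```

The greedy and possessive forms agree as long as nothing after the quantifier needs the characters it consumed; they differ only when backtracking would be required.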
import java.io.*;
import java.util.regex.*;

public class RegExHarness {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        System.out.println("Enter the string");
        String text = br.readLine();
        if (text == null) {
            return; // no input at all
        }

        while (true) {
            System.out.println("Enter Regular Expression");
            String regex = br.readLine();
            if (regex == null) {
                break; // end of input: stop instead of looping forever
            }
            try {
                Matcher matcher = Pattern.compile(regex).matcher(text);
                // Report every match with its start (inclusive) and end (exclusive) index.
                while (matcher.find()) {
                    System.out.print("I found the text " + matcher.group());
                    System.out.print(" starting at index " + matcher.start());
                    System.out.println(" Ending at index " + matcher.end());
                }
            } catch (PatternSyntaxException e) {
                // An invalid regex should not end the session.
                System.out.println(e.getMessage());
            }
        }
    }
}
Greedy Quantifier (Longest Match)
A greedy quantifier gives you the longest possible match in the given string. For
example, with the input '12345', the pattern '\d+' could match 1, 12, 123, 1234,
or 12345 starting at index 0, and the greedy quantifier picks the longest of
these, '12345'. Run the program above with the input string '12345':
Enter the string
12345
Enter Regular Expression
\d+
I found the text 12345 starting at index 0 Ending at index 5
Now try the input string 'xfooxxxxxxfoo' with the regular expression '.*foo':
Enter the string
xfooxxxxxxfoo
Enter Regular Expression
.*foo
I found the text xfooxxxxxxfoo starting at index 0 Ending at index 13
A greedy quantifier first matches as much as possible, so the .* initially
matches the entire string. The matcher then tries to match the 'f' that follows
in the regex, but there are no characters left. So it "backtracks", making the
greedy quantifier match one character fewer (leaving the final "o" unmatched).
That "o" doesn't match the 'f' in the regex, so it backtracks one more step
(leaving the "oo" at the end of the string unmatched). That still doesn't match
the 'f' in the regex, so it backtracks once more (leaving the "foo" at the end
of the string unmatched). Now the matcher finally matches the 'f' in the regex,
and the two 'o's that follow are matched too. So it returns the entire string as
the match for the given regular expression.
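The same two greedy runs can be reproduced without typing into the console. This is only a sketch: the class GreedyDemo and its findAll helper are names invented here, and the "text@start-end" format is just a compact stand-in for the harness output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it mirrors the harness loop in a testable form.
class GreedyDemo {
    // Collects every match as "text@start-end" (end index is exclusive).
    static List<String> findAll(String regex, String text) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            results.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return results;
    }

    public static void main(String[] args) {
        // \d+ takes all five digits in one match.
        System.out.println(findAll("\\d+", "12345"));          // [12345@0-5]
        // .* grabs everything, then backtracks just far enough for the last "foo".
        System.out.println(findAll(".*foo", "xfooxxxxxxfoo")); // [xfooxxxxxxfoo@0-13]
    }
}
```

Note that the first "foo" at index 1 is not reported separately: the greedy .* has already swallowed it on the way to the last one.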
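Reluctant Quantifier
A reluctant quantifier starts by consuming nothing. It matches as little as
possible, taking one more character only when the rest of the pattern fails to
match. Once a match is found, the search starts again from the end of that match
and repeats until the string is exhausted.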
Enter the string
xxxfooxxfoo
Enter Regular Expression
.*?foo
I found the text xxxfoo starting at index 0 Ending at index 6
I found the text xxfoo starting at index 6 Ending at index 11
Enter the string
12345
Enter Regular Expression
\d+?
I found the text 1 starting at index 0 Ending at index 1
I found the text 2 starting at index 1 Ending at index 2
I found the text 3 starting at index 2 Ending at index 3
I found the text 4 starting at index 3 Ending at index 4
I found the text 5 starting at index 4 Ending at index 5
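To see the difference side by side, the sketch below runs the greedy and reluctant forms against the same input. ReluctantDemo and findAll are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class comparing greedy and reluctant matching.
class ReluctantDemo {
    // Returns the matched text of each successive find().
    static List<String> findAll(String regex, String text) {
        List<String> results = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    public static void main(String[] args) {
        String text = "xxxfooxxfoo";
        // Greedy: one match that runs to the last "foo".
        System.out.println(findAll(".*foo", text));   // [xxxfooxxfoo]
        // Reluctant: stops at the first "foo", then searches again.
        System.out.println(findAll(".*?foo", text));  // [xxxfoo, xxfoo]
        // Reluctant \d+? is satisfied by a single digit each time.
        System.out.println(findAll("\\d+?", "12345")); // [1, 2, 3, 4, 5]
    }
}
```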
Possessive Quantifier
A possessive quantifier is just like a greedy quantifier, but it doesn't
backtrack: once it has consumed characters, it never gives them back.
Enter the string
xxxfooxxfoo
Enter Regular Expression
.*+foo
Enter Regular Expression
In the above case, the .*+ starts out by matching the entire string, leaving
nothing unmatched. There is then nothing left to match the 'f' in the regex.
Since the possessive quantifier doesn't backtrack, the match fails there, and
the harness prints no match before prompting again.
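A quick way to check this behaviour is to compare find() results for the greedy and possessive forms. The sketch below assumes nothing beyond java.util.regex; the class PossessiveDemo and the helper found are invented for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing where possessive matching fails.
class PossessiveDemo {
    // True if the regex matches anywhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "xxxfooxxfoo";
        System.out.println(found(".*foo", text));   // true: greedy backtracks to "foo"
        System.out.println(found(".*+foo", text));  // false: .*+ keeps the whole string

        // A possessive quantifier still matches when nothing after it
        // needs the characters it consumed.
        System.out.println(found("\\d++", "12345"));  // true
        System.out.println(found("\\d++5", "12345")); // false: the final 5 is already taken
        System.out.println(found("\\d+5", "12345"));  // true: greedy gives the 5 back
    }
}
```

Possessive quantifiers are mainly useful when you know backtracking can never help, since they let a failing match give up sooner.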
Enter the string
\d+?
Enter Regular Expression
\d+?
Enter Regular Expression