Friday, 19 February 2016

Julia: Regular expressions using the match function

In the previous post, I explained regular expressions using the ismatch function, which returns true if the given regex matches the string and false otherwise. Sometimes you need more information, such as which part of the string the regex matched. You can capture that information using the match function.

match(r::Regex, s::AbstractString[, idx::Integer[, addopts]])
Searches for the first match of the regular expression r in s. match() returns nothing if the regex does not match; otherwise it returns a RegexMatch object.

Argument   Description
r          Regular expression
s          String to match against the regular expression
idx        Optional. Specifies the index at which to start the search.


regEx.jl
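# Search for the first occurrence of "llo" in the string
m = match(r"llo", "Hello, How are you, stello")
if m === nothing
  # match() returns nothing when the regex does not match
  println("no match")
else
  println(m)
end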

$ julia regEx.jl
RegexMatch("llo")
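
The script above only exercises the branch where the regex matches. As a minimal sketch of both outcomes, the hypothetical helper describe_match below (a name introduced here for illustration, not part of Julia) turns the result of match into a short message:

```julia
# Hypothetical helper (not part of Julia): describe the outcome of a search.
function describe_match(r, s)
    m = match(r, s)
    if m === nothing
        # match() found nothing, so there is no RegexMatch to inspect
        return "no match"
    else
        # m.match holds the substring that matched
        return string("matched: ", m.match)
    end
end

println(describe_match(r"llo", "Hello, How are you, stello"))  # matched: llo
println(describe_match(r"xyz", "Hello, How are you, stello"))  # no match
```

Checking for nothing before touching the result matters, because fields such as m.match only exist on a RegexMatch object.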


You can pass the optional argument idx to specify the index at which the search starts. match() then returns the first match found at or after that index.
julia> m = match(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4", 1)
RegexMatch("1")

julia> m = match(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4", 10)
RegexMatch("2")

julia> m = match(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4", 15)
RegexMatch("2")

julia> m = match(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4", 25)
RegexMatch("3")

julia> m = match(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4", 32)
RegexMatch("4")
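
The idx argument also makes it possible to walk through every match in a string by restarting the search just past the previous match. The sketch below is one way to do that; collect_matches is a hypothetical helper name, and it assumes the pattern never matches the empty string (otherwise the loop would not advance).

```julia
# Hypothetical helper (not part of Julia): collect every match of `r` in `s`
# by restarting the search just past the end of the previous match.
# Assumes `r` never matches the empty string.
function collect_matches(r, s)
    found = []
    idx = 1
    while idx <= sizeof(s)
        m = match(r, s, idx)
        m === nothing && break
        push!(found, m.match)
        # String indices are byte offsets, so skip the match's byte length
        idx = m.offset + sizeof(m.match)
    end
    return found
end

digits_found = collect_matches(r"[0-9]", "Hello 1, i met 2 asked about 3 and 4")
println(join(digits_found, ", "))  # 1, 2, 3, 4
```

Julia also provides eachmatch for iterating over all matches, so in practice the manual loop is mainly useful for seeing how idx behaves.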


As mentioned above, the match() function returns a RegexMatch object. A RegexMatch object contains the following fields.

  match    :: SubString{UTF8String}
  captures :: Array{Union{SubString{UTF8String},Void},1}
  offset   :: Int64
  offsets  :: Array{Int64,1}
  regex    :: Regex

Field      Description
match      The entire substring that matched
captures   The captured substrings, as an array of strings
offset     The offset at which the whole match begins
offsets    The offsets of the captured substrings, as a vector
regex      The regular expression used for the match

julia> m = match(r"(w|x|y)(z[a-z])(a)", "wzbawza")
RegexMatch("wzba", 1="w", 2="zb", 3="a")

julia> m.match
"wzba"

julia> m.captures
3-element Array{Union{SubString{UTF8String},Void},1}:
 "w" 
 "zb"
 "a" 

julia> m.offset
1

julia> m.offsets
3-element Array{Int64,1}:
 1
 2
 4

julia> m.regex
r"(w|x|y)(z[a-z])(a)"
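
The Void in the type of captures is there because a capture group that does not take part in the match yields nothing instead of a string. The following sketch illustrates this with a pattern whose second group is optional; the pattern and input are chosen here purely for illustration.

```julia
# Group 2, (c)?, is optional and does not take part in this match.
m = match(r"(a|b)(c)?(d)", "ad")

for i in 1:length(m.captures)
    c = m.captures[i]
    if c === nothing
        # An unmatched group is stored as nothing in m.captures
        println("group ", i, ": not matched")
    else
        println("group ", i, ": ", c, " at offset ", m.offsets[i])
    end
end
# Prints:
# group 1: a at offset 1
# group 2: not matched
# group 3: d at offset 2
```

Code that reads m.captures should therefore check each entry for nothing before treating it as a string.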






