length(s)
returns the number of characters in the string s, while endof(s) returns the last
index of the string s.
What exactly is the difference?
You can see the difference when you
work with Unicode characters whose UTF-8 encoding takes more than one byte. Julia
supports all Unicode characters. Unicode characters are specified using \u
(followed by up to four hexadecimal digits) or \U (followed by up to eight).
For example,
julia> s = "\u2900 x \U2903 y"
"⤀ x ⤃ y"
All non-ASCII characters are encoded
using UTF-8. UTF-8 is a variable-length encoding that uses 8-bit code
units: it encodes every Unicode character in one to four bytes.
Code points with lower numeric values are encoded with fewer bytes.
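To make the byte widths concrete, here is a small sketch that measures a few sample characters with sizeof, which reports the size of a string in bytes rather than in characters. The sample characters and the names samples and widths are chosen here only for illustration.

```julia
# One sample character for each UTF-8 width (1 to 4 bytes).
# "\u00e9" is é, "\u2900" is ⤀, "\U1f600" is an emoji outside the 16-bit range.
samples = ["x", "\u00e9", "\u2900", "\U1f600"]

# sizeof on a string returns the number of bytes it occupies.
widths = [sizeof(c) for c in samples]

for (c, w) in zip(samples, widths)
    println(c, " takes ", w, " byte(s)")
end
```

The higher the code point, the more bytes the character needs, which is why byte indices and character counts drift apart.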
julia> s = "\u2900x\u2903y"
"⤀x⤃y"

julia> s[1]
'⤀'

julia> s[2]
ERROR: UnicodeError: invalid character index
 in next at /Applications/Julia-0.4.1.app/Contents/Resources/julia/lib/julia/sys.dylib
 in getindex at strings/basic.jl:37

julia> s[3]
ERROR: UnicodeError: invalid character index
 in next at /Applications/Julia-0.4.1.app/Contents/Resources/julia/lib/julia/sys.dylib
 in getindex at strings/basic.jl:37

julia> s[4]
'x'
In the snippet above, s[1]
refers to the character '⤀',
which takes three bytes to represent, so s[2] and s[3] are invalid indices for
the string s. The next character, 'x', starts at index 4.
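Rather than guessing which indices are valid, you can ask Julia for the next one. The following is a minimal sketch using nextind, which returns the index of the character that follows a given index. The helper name char_starts is hypothetical and not part of Julia.

```julia
# Hypothetical helper: collect the byte indices at which each character of s starts.
function char_starts(s)
    starts = Int[]
    i = 1
    while i <= sizeof(s)      # sizeof(s) is the total number of bytes in s
        push!(starts, i)
        i = nextind(s, i)     # jump to the start of the next character
    end
    return starts
end

s = "\u2900x\u2903y"
println(char_starts(s))       # [1, 4, 5, 8]
```

Indexing s at any of the returned positions is safe. If you only need the characters and not their positions, iterating with for c in s avoids indices altogether.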
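julia> s = "\u2900x\u2903y"
"⤀x⤃y"

julia> length(s)
4

julia> endof(s)
8

length(s) counts the four characters, while endof(s) returns 8, the byte index at
which the last character, 'y', starts.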
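The sessions above were run on Julia 0.4.1. In Julia 1.0 and later, endof was renamed lastindex; the sketch below assumes such a version (on 0.4, substitute endof). It compares an ASCII-only string with the Unicode string from above, and the variable names are only illustrative.

```julia
ascii_s   = "xy"               # every character takes one byte
unicode_s = "\u2900x\u2903y"   # two of the four characters take three bytes

# For ASCII-only strings the character count and the last index agree.
println(length(ascii_s), " ", lastindex(ascii_s))       # 2 2

# With multi-byte characters the last index is larger than the count.
println(length(unicode_s), " ", lastindex(unicode_s))   # 4 8
```

The two functions only disagree once a string contains a character that needs more than one byte.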