a. Most traditional programming
languages like C, C++, Java use eager evaluation strategy, where an expression
is evaluated as soon as it is bound to a variable, argument to a function etc.,
Lazy evaluation is completely opposite to eager evaluation, where expression is
evaluated when it is required.
For example,
let fun x y = x
Above function returns the value of x,
it don’t depend on the value y. No point in evaluating the second argument. In
case of eager evaluation strategy, both arguments x and y are evaluated, before
calling the function. But lazy programming evaluates the value only when it is
required, so it don’t evaluate the second argument at all in this case.
Prelude> let fun x y = x Prelude> Prelude> fun 10 20 10 Prelude> fun 10 [1..] 10 Prelude> fun 10 (2^123456789) 10 Prelude>
Observe above snippet,
fun 10 [1..] : Second argument is
infinite list, eager programming languages end up in calculating the infinite
list, where as lazy programs return first argument and exit.
fun 10 (2^123456789) : Second argument
is 2^123456789, eager programming languages calculate the value 2^123456789 even
though it is not required, where as lazy programs return first argument and
exit.
b. One advantage of eager evaluation is
that it eliminates the need to track and schedule the evaluation of
expressions. Where as lazy evaluation needs to maintain track of the
expressions. Haskell keep track of the expressions using thunk. I explained
about this in my previous post “How Lazy
evaluation works in Haskell”.
What
is thunk?
Thunk represents a value that yet to be
evauated. Haskell don’t evaluate thunk unless it has to.
Prelude> let fun x y = x Prelude> fun 10 [1..] 10
In above case
thunk represents [1..], since it is not used any where, Haskell don’t evaluate
this infinite list.
Problem with thunks
If you keep
building complex expressions to solve later, it keeps on consuming memory,
obviously affects performance and you may end up in memory overflow error.
foldl (+) 0 [1, 2, 3, 4] ==> foldl (+) (0 + 1) [2, 3, 4] ==> foldl (+) ((0 + 1) + 2) [3, 4] ==> foldl (+) (((0 + 1) + 2) + 3) [4] ==> foldl (+) ((((0 + 1) + 2) + 3) + 4) [] ==> (((0 + 1) + 2) + 3) + 4 ==> ((1 + 2) + 3) + 4 ==> (3 + 3) + 4 ==> 6 + 4 ==> 10
Explicit strictness
To solve above
kind of problems, Haskell provides a way to enforce strict evaluation on
expressions, instead of maintaining long thunk.
By using
bang(!) operator, we can make an argument strict explicitly.
For example, let me try to show you the memory
consumption of a program which calculates the sum of elements of a list.
Without Bang pattern
listUtil.hs
sumList :: [Integer] -> Integer sumList xs = sumOfEle xs 0 sumOfEle :: [Integer] -> Integer -> Integer sumOfEle [] acc = acc sumOfEle (x:xs) acc = sumOfEle xs (acc+x) main = print (sumList [1..1000000])
Following
commands are used to calculate the memory usage of above program.
ghc listUtil.hs
-rtsopts
./listUtil +RTS
-s -h -i0.001
$ ghc listUtil.hs -rtsopts [1 of 1] Compiling Main ( listUtil.hs, listUtil.o ) Linking listUtil ... $ ./listUtil +RTS -s -h -i0.001 500000500000 161,344,208 bytes allocated in the heap 1,533,989,880 bytes copied during GC 65,033,816 bytes maximum residency (54 sample(s)) 776,072 bytes maximum slop 98 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 266 colls, 0 par 0.090s 0.094s 0.0004s 0.0010s Gen 1 54 colls, 0 par 2.510s 2.587s 0.0479s 0.1063s INIT time 0.000s ( 0.001s elapsed) MUT time 0.049s ( 0.053s elapsed) GC time 2.600s ( 2.681s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 2.650s ( 2.734s elapsed) %GC time 98.1% (98.0% elapsed) Alloc rate 3,264,689,261 bytes per MUT second Productivity 1.9% of total user, 1.8% of total elapsed
Now update the
program using bangpattern, and rerun the profiling.
listUtil.hs
{-# LANGUAGE BangPatterns #-} sumList :: [Integer] -> Integer sumList xs = sumOfEle xs 0 sumOfEle :: [Integer] -> Integer -> Integer sumOfEle [] acc = acc sumOfEle (x:xs) !acc = sumOfEle xs (acc+x) main = print (sumList [1..1000000])
Re run
profiling again.
$ ./listUtil +RTS -s -h -i0.001 500000500000 128,051,920 bytes allocated in the heap 157,128 bytes copied during GC 44,312 bytes maximum residency (34 sample(s)) 21,224 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 213 colls, 0 par 0.000s 0.000s 0.0000s 0.0000s Gen 1 34 colls, 0 par 0.002s 0.005s 0.0002s 0.0035s INIT time 0.000s ( 0.001s elapsed) MUT time 0.029s ( 0.032s elapsed) GC time 0.002s ( 0.006s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.032s ( 0.038s elapsed) %GC time 6.2% (15.4% elapsed) Alloc rate 4,342,656,763 bytes per MUT second Productivity 93.5% of total user, 79.7% of total elapsed
Following table summarizes the memory
usage.
Name
|
With
out Bang pattern
|
With
Bang pattern
|
Bytes allocated in the heap
|
161,344,208
|
128,051,920
|
Bytes copied during GC
|
1,533,989,880
|
157,128
|
Bytes maximum residency
|
65,033,816
|
44,312
|
Bytes maximum slop
|
776,072
|
21,224
|
Total memory in use
|
98 MB
|
1 MB
|
No comments:
Post a Comment