Haskell is a lazy language, but lazy
programming is not always good. Lazy evaluation needs to maintain track of the
expressions. Haskell keep track of the expressions using thunk. I explained
about this in my previous post “How Lazy
evaluation works in Haskell”.
What
is thunk?
Thunk represents a value that yet to be
evauated. Haskell don’t evaluate thunk unless it has to.
Prelude> let fun x y = x Prelude> fun 10 [1..] 10
In above case thunk represents [1..],
since it is not used any where, Haskell don’t evaluate this infinite list.
Problem
with thunks
If you keep building complex expressions
to solve later, it keeps on consuming memory, obviously affects performance and
you may end up in memory overflow error.
foldl (+) 0 [1, 2, 3, 4] ==> foldl (+) (0 + 1) [2, 3, 4] ==> foldl (+) ((0 + 1) + 2) [3, 4] ==> foldl (+) (((0 + 1) + 2) + 3) [4] ==> foldl (+) ((((0 + 1) + 2) + 3) + 4) [] ==> (((0 + 1) + 2) + 3) + 4 ==> ((1 + 2) + 3) + 4 ==> (3 + 3) + 4 ==> 6 + 4 ==> 10
Explicit
strictness
To solve above kind of problems, Haskell
provides a way to enforce strict evaluation on expressions, instead of
maintaining long thunk.
By using Bang pattern we can easily
write strict programs in Haskell.
For example, let me try to show you the
memory consumption of a program which calculates the sum of elements of a list.
Without
Bang pattern
listUtil.hs
sumList :: [Integer] -> Integer sumList xs = sumOfEle xs 0 sumOfEle :: [Integer] -> Integer -> Integer sumOfEle [] acc = acc sumOfEle (x:xs) acc = sumOfEle xs (acc+x) main = print (sumList [1..1000000])
Following commands are used to calculate
the memory usage of above program.
ghc listUtil.hs
-rtsopts
./listUtil +RTS
-s -h -i0.001
$ ghc listUtil.hs -rtsopts [1 of 1] Compiling Main ( listUtil.hs, listUtil.o ) Linking listUtil ... $ ./listUtil +RTS -s -h -i0.001 500000500000 161,344,208 bytes allocated in the heap 1,533,989,880 bytes copied during GC 65,033,816 bytes maximum residency (54 sample(s)) 776,072 bytes maximum slop 98 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 266 colls, 0 par 0.090s 0.094s 0.0004s 0.0010s Gen 1 54 colls, 0 par 2.510s 2.587s 0.0479s 0.1063s INIT time 0.000s ( 0.001s elapsed) MUT time 0.049s ( 0.053s elapsed) GC time 2.600s ( 2.681s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 2.650s ( 2.734s elapsed) %GC time 98.1% (98.0% elapsed) Alloc rate 3,264,689,261 bytes per MUT second Productivity 1.9% of total user, 1.8% of total elapsed
Now update the
program using bangpattern, and rerun the profiling.
listUtil.hs
{-# LANGUAGE BangPatterns #-} sumList :: [Integer] -> Integer sumList xs = sumOfEle xs 0 sumOfEle :: [Integer] -> Integer -> Integer sumOfEle [] acc = acc sumOfEle (x:xs) !acc = sumOfEle xs (acc+x) main = print (sumList [1..1000000])
Re run profiling
again.
$ ./listUtil +RTS -s -h -i0.001 500000500000 128,051,920 bytes allocated in the heap 157,128 bytes copied during GC 44,312 bytes maximum residency (34 sample(s)) 21,224 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 213 colls, 0 par 0.000s 0.000s 0.0000s 0.0000s Gen 1 34 colls, 0 par 0.002s 0.005s 0.0002s 0.0035s INIT time 0.000s ( 0.001s elapsed) MUT time 0.029s ( 0.032s elapsed) GC time 0.002s ( 0.006s elapsed) EXIT time 0.000s ( 0.000s elapsed) Total time 0.032s ( 0.038s elapsed) %GC time 6.2% (15.4% elapsed) Alloc rate 4,342,656,763 bytes per MUT second Productivity 93.5% of total user, 79.7% of total elapsed
Following table summarizes the memory
usage.
Name
|
With
out Bang pattern
|
With
Bang pattern
|
Bytes allocated in the heap
|
161,344,208
|
128,051,920
|
Bytes copied during GC
|
1,533,989,880
|
157,128
|
Bytes maximum residency
|
65,033,816
|
44,312
|
Bytes maximum slop
|
776,072
|
21,224
|
Total memory in use
|
98 MB
|
1 MB
|
No comments:
Post a Comment