In this post, I am going to explain how
Haskell programs are executed on a machine. By knowing the execution sequence
on a machine, you can write more efficient programs. Haskell is a functional
programming language, and all Haskell implementations are lazy. In the following
sections, I am going to explain how the reduction of expressions is
performed, what normal form is, how lazy evaluation works, and so on.
Reductions
Functions are first-class entities in
Haskell. Suppose you want to implement the function (a+b)²; your code looks like
below.

square x = x * x
twiceAB a b = 2 * a * b
aPlusbSquare a b = (square a) + (square b) + (twiceAB a b)

Evaluating aPlusbSquare 2 3 step by step:

aPlusbSquare 2 3 = (square 2) + (square 3) + (twiceAB 2 3)
                 = (2 * 2) + (square 3) + (twiceAB 2 3)
                 = 4 + (square 3) + (twiceAB 2 3)
                 = 4 + (3 * 3) + (twiceAB 2 3)
                 = 4 + 9 + (twiceAB 2 3)
                 = 4 + 9 + (2 * 2 * 3)
                 = 4 + 9 + (4 * 3)
                 = 4 + 9 + 12
                 = 13 + 12
                 = 25
As you can observe in the above evaluation
sequence, every reduction step replaces a sub-expression, called a reducible expression
(redex). An expression that contains no redex is said to be in normal
form.
Following are standard definitions.

a. redex: reducible expression.

b. leftmost redex: the redex that is textually to the left of all other redexes.
For example, in the expression (x+2)(y+10), the leftmost redex is (x+2).

c. rightmost redex: the redex that is textually to the right of all other redexes.
For example, in the expression (x+2)(y+10), the rightmost redex is (y+10).

d. innermost redex: a redex that contains no other redex.

e. outermost redex: a redex that is contained within no other redex.
For example, in the expression x * (x + (y * (z + 10))), the innermost redex is
z + 10 and the outermost redex is x * (....).
Reduction strategies
A reduction strategy specifies the way that
a complex expression is reduced to a simpler expression by successive reduction
steps. Following are the popular reduction strategies.
Strategy | Description
Full beta reduction | Any redex can be reduced at any time. It doesn't follow any particular reduction order.
Applicative order | The rightmost, innermost redex is always reduced first. When evaluating a function application, the arguments are evaluated first.
Normal order | The leftmost, outermost redex is always reduced first.
Call by name | Like normal order, but no reductions are performed inside abstractions. The arguments to a function are not evaluated before the function is called; rather, they are substituted directly into the function body.
Call by value | Only outermost redexes are reduced, and a redex is reduced only when its right-hand side has been reduced to a value. The argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function body.
Call by need | A reduction is performed only when its result is needed; also called lazy evaluation.
Do reduction strategies matter?
Yes, of course reduction strategies
matter.
For example, let us try to evaluate the
expression ‘fst (square 10, square 20)’ using the innermost and outermost reduction
strategies.

If you use the innermost reduction strategy,
the evaluation looks like below.

fst (square 10, square 20) = fst (square 10, 20 * 20)
                           = fst (square 10, 400)
                           = fst (10 * 10, 400)
                           = fst (100, 400)
                           = 100
If you use the outermost reduction strategy,
the evaluation looks like below.

fst (square 10, square 20) = square 10
                           = 10 * 10
                           = 100
In the above scenario, outermost reduction
takes fewer reductions than the innermost reduction strategy, because the fst function
doesn't require its second argument to be evaluated.
Sometimes the innermost reduction strategy
ends up performing unnecessary calculations, which are not required to get the
final result.
For example,

infLoop = 1 + infLoop
fst (42, infLoop)

If you use the innermost reduction strategy,
the system ends up evaluating infLoop forever, which is not at all required to get the
first element of the tuple.
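Because Haskell uses call by need (essentially outermost reduction plus sharing), this is safe to try in GHCi:

Prelude> let infLoop = 1 + infLoop
Prelude> fst (42, infLoop)
42

The second component is packaged up unevaluated, and fst never touches it.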
Does that mean outermost reduction is better than the innermost reduction strategy?
No, it depends on the expression.
For example,

cube x = x * x * x

Using the innermost reduction strategy:

cube (1 + 2 + 3) = cube (3 + 3)
                 = cube 6
                 = 6 * 6 * 6
                 = 36 * 6
                 = 216

The innermost reduction strategy requires 5
reduction steps to evaluate the expression.
Using the outermost reduction strategy:

cube (1 + 2 + 3) = (1 + 2 + 3) * (1 + 2 + 3) * (1 + 2 + 3)
                 = (3 + 3) * (1 + 2 + 3) * (1 + 2 + 3)
                 = 6 * (1 + 2 + 3) * (1 + 2 + 3)
                 = 6 * (3 + 3) * (1 + 2 + 3)
                 = 6 * 6 * (1 + 2 + 3)
                 = 36 * (1 + 2 + 3)
                 = 36 * (3 + 3)
                 = 36 * 6
                 = 216

The outermost reduction strategy requires 9
reduction steps to evaluate the expression, because it reduces the
duplicated expression ‘(1 + 2 + 3)’ three times.
We can solve this problem of outermost
reduction using the outermost graph reduction technique.
Outermost graph reduction

Outermost graph reduction = reduction + sharing.

In outermost graph reduction, each
argument is evaluated at most once. The shared argument reduces once:

(1 + 2 + 3) = (3 + 3)
            = 6

and the whole evaluation becomes

cube (1 + 2 + 3) = (1 + 2 + 3) * (1 + 2 + 3) * (1 + 2 + 3)
                 = 6 * 6 * 6
                 = 36 * 6
                 = 216
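We can observe this sharing directly with Debug.Trace; here is a small illustrative sketch (the trace message prints only when the traced expression is actually forced):

import Debug.Trace (trace)

cube :: Int -> Int
cube x = x * x * x

main :: IO ()
main = print (cube (trace "argument evaluated" (1 + 2 + 3)))
-- "argument evaluated" appears once, even though x is used three
-- times in the body of cube: all three uses share one evaluated result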
It evaluates the argument (1 + 2 + 3)
once and reuses the result in the other places. As I said, Haskell uses lazy evaluation (don't
do anything unless you have to), so let's see how lazy evaluation works in
Haskell.
How does lazy evaluation work in Haskell?
Before explaining lazy
evaluation, I want to show some examples of strict evaluation, which is
the opposite of lazy. In a strict evaluation strategy, the arguments to a function are
evaluated before calling the function. You may ask, what is wrong with this
approach? Observe the following Java snippet.
java.util.Random rand = new java.util.Random();

// Perform some calculation and return the result
long getData() {
    long data = 0;
    for (int i = 0; i < 1000000; i++) {
        data = data + rand.nextInt();
    }
    return data;
}

boolean isDataValid(boolean b, long data) {
    return b && (data > 1000000);
}
The ‘getData()’ function performs some calculation
and returns a value. The ‘isDataValid’ function takes two arguments and returns whether the
data is valid or not.
Suppose you call the method ‘isDataValid(false,
getData());’. Do you think it is really required to evaluate the function
getData()? No, it is not: since the first argument to isDataValid is
false, false && (data > 1000000) is false anyway, so there is no need to evaluate it. But
when you run the above snippet in a strict language like Java, getData()
is evaluated before isDataValid(false, getData()) is called. In a lazy
functional language, getData() would not be evaluated at all.
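For comparison, here is a rough Haskell counterpart of the Java snippet (the names and the dummy computation are mine, just for illustration):

-- getData stands in for some expensive pure computation
getData :: Integer
getData = sum [1..1000000]

isDataValid :: Bool -> Integer -> Bool
isDataValid b d = b && (d > 1000000)

main :: IO ()
main = print (isDataValid False getData)
-- prints False immediately: getData is passed as a thunk, and since
-- (&&) never inspects its second operand when the first is False,
-- the sum is never computed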
Lazy evaluation is an execution
strategy which delays the evaluation of an expression until its value is
needed, and which also avoids repeated evaluation of the same expression (I
discussed this under outermost graph reduction).
Technically, lazy evaluation = call-by-name + sharing.
Sharing means that if the same expression is
repeated in multiple places, its value is calculated on
demand, the result is saved, and the saved result is reused wherever it is
required again (we eliminate re-evaluating the same expression).
Prelude> let fun x y = x
Prelude> fun 10 [1..]
10
fun x y = x simply returns the
value of x. So when I called ‘fun 10
[1..]’, it returned 10 immediately; it never evaluated the second argument
[1..], which is an infinite list.
When evaluating fun 10 [1..], the second
argument is simply packaged up into a thunk without doing any actual
computation, and fun is called immediately. Since fun never uses its
second argument, the thunk is just thrown away by the garbage collector.
What is a thunk?
A thunk represents a value that is yet to be
evaluated. Haskell doesn't evaluate a thunk unless it has to.
Prelude> let fun x y = x
Prelude> fun 10 [1..]
10
In the above case, the thunk represents [1..];
since it is not used anywhere, Haskell doesn't evaluate this infinite list.
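If you want to watch thunks yourself, GHCi's :sprint command prints a value without forcing it, showing the unevaluated parts as _ (the transcript below is illustrative):

Prelude> let xs = [1..] :: [Int]
Prelude> :sprint xs
xs = _
Prelude> take 3 xs
[1,2,3]
Prelude> :sprint xs
xs = 1 : 2 : 3 : _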
Problem with thunks
If you keep building complex expressions
to evaluate later, they keep consuming memory, which obviously affects performance, and
you may end up with an out-of-memory error.
foldl (+) 0 [1, 2, 3, 4]
==> foldl (+) (0 + 1) [2, 3, 4]
==> foldl (+) ((0 + 1) + 2) [3, 4]
==> foldl (+) (((0 + 1) + 2) + 3) [4]
==> foldl (+) ((((0 + 1) + 2) + 3) + 4) []
==> (((0 + 1) + 2) + 3) + 4
==> ((1 + 2) + 3) + 4
==> (3 + 3) + 4
==> 6 + 4
==> 10
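For this particular pattern, the standard library already offers a strict left fold, foldl' from Data.List, which forces the accumulator at every step so the chain of thunks above never builds up. A minimal sketch:

import Data.List (foldl')

main :: IO ()
main = print (foldl' (+) 0 [1, 2, 3, 4])
-- the accumulator is forced at each step (1, 3, 6, 10),
-- so no nested (+) thunks pile up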
Explicit strictness
To solve this kind of problem, Haskell
provides a way to enforce strict evaluation of expressions, instead of
maintaining a long chain of thunks.
By using a bang (!) pattern, we can make an
argument strict explicitly.
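A bang pattern is, roughly, sugar for the Prelude function seq, which evaluates its first argument to weak head normal form before returning its second. As a sketch, the strict accumulator used in the program below could also be written without the extension (sumOfEleSeq is my illustrative name):

-- equivalent to: sumOfEleSeq (x:xs) !acc = sumOfEleSeq xs (acc + x)
sumOfEleSeq :: [Integer] -> Integer -> Integer
sumOfEleSeq [] acc = acc
sumOfEleSeq (x:xs) acc = let acc' = acc + x
                         in acc' `seq` sumOfEleSeq xs acc'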
For example, let me show you the memory
consumption of a program which calculates the sum of the elements of a list.
Without bang patterns
listUtil.hs
sumList :: [Integer] -> Integer
sumList xs = sumOfEle xs 0

sumOfEle :: [Integer] -> Integer -> Integer
sumOfEle [] acc = acc
sumOfEle (x:xs) acc = sumOfEle xs (acc+x)

main = print (sumList [1..1000000])
The following commands are used to measure the memory usage of the above program.

ghc listUtil.hs -rtsopts
./listUtil +RTS -s -h -i0.001
$ ghc listUtil.hs -rtsopts
[1 of 1] Compiling Main             ( listUtil.hs, listUtil.o )
Linking listUtil ...
$ ./listUtil +RTS -s -h -i0.001
500000500000
     161,344,208 bytes allocated in the heap
   1,533,989,880 bytes copied during GC
      65,033,816 bytes maximum residency (54 sample(s))
         776,072 bytes maximum slop
              98 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       266 colls,     0 par    0.090s   0.094s     0.0004s    0.0010s
  Gen  1        54 colls,     0 par    2.510s   2.587s     0.0479s    0.1063s

  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    0.049s  (  0.053s elapsed)
  GC      time    2.600s  (  2.681s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    2.650s  (  2.734s elapsed)

  %GC     time      98.1%  (98.0% elapsed)

  Alloc rate    3,264,689,261 bytes per MUT second

  Productivity   1.9% of total user, 1.8% of total elapsed
Now update the program to use a bang pattern.
listUtil.hs

{-# LANGUAGE BangPatterns #-}

sumList :: [Integer] -> Integer
sumList xs = sumOfEle xs 0

sumOfEle :: [Integer] -> Integer -> Integer
sumOfEle [] acc = acc
sumOfEle (x:xs) !acc = sumOfEle xs (acc+x)

main = print (sumList [1..1000000])
Re-run the profiling.
$ ./listUtil +RTS -s -h -i0.001
500000500000
     128,051,920 bytes allocated in the heap
         157,128 bytes copied during GC
          44,312 bytes maximum residency (34 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0       213 colls,     0 par    0.000s   0.000s     0.0000s    0.0000s
  Gen  1        34 colls,     0 par    0.002s   0.005s     0.0002s    0.0035s

  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    0.029s  (  0.032s elapsed)
  GC      time    0.002s  (  0.006s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.032s  (  0.038s elapsed)

  %GC     time       6.2%  (15.4% elapsed)

  Alloc rate    4,342,656,763 bytes per MUT second

  Productivity  93.5% of total user, 79.7% of total elapsed
The following table summarizes the memory usage.
Name | Without bang pattern | With bang pattern
Bytes allocated in the heap | 161,344,208 | 128,051,920
Bytes copied during GC | 1,533,989,880 | 157,128
Bytes maximum residency | 65,033,816 | 44,312
Bytes maximum slop | 776,072 | 21,224
Total memory in use | 98 MB | 1 MB