Wednesday, September 13, 2006

Erlang in 5 seconds.

There was recently a thread on erlang-questions that discussed how you would present Erlang in 5 seconds. In an unrelated thread, Richard Carlsson quite possibly managed to nail it:

In most respects, Erlang _is_ a concurrent, nondestructive Lisp with a Prolog-inspired syntax that focuses on pattern matching and rules.

For my own reference, the original email:

Nick Linker wrote:
I wonder why Erlang is not Lisp? I mean, why did the inventors of Erlang
choose to create their own language instead of just creating an
ERTS-specific library for LISP (or at least Scheme)?

Here are some reasons why Lisp might not be a perfect match for the
problem they wanted to solve:
 - no built-in concurrency
 - destructive updates abound (in Scheme, too)
 - no pattern matching

And if you're going to fix those things, you might as well use a syntax
that feels more comfortable to you. In particular, pattern matching
makes function definitions and selective receives much more readable,
which I assume was an important goal for the kind of industrial
applications that Erlang was created for.

In most respects, Erlang _is_ a concurrent, nondestructive Lisp with
a Prolog-inspired syntax that focuses on pattern matching and rules.

 /Richard

The Monad Laws

Every now and then I come across a post that explains a concept so clearly it is inspiring. I'd like to thank Albert Lai for just such a post: Re: [Haskell-cafe] Monad laws

Deokhwan Kim  writes:
>
>What is the practical meaning of monad laws?
>
>  1. (return x) >>= f == f x
>  2. m >>= return == m
>  3. (m >>= f) >>= g == m >>= (\x -> f x >>= g)

I offer to re-write the laws in do-notation.  (Please view with a
fixed-width (non-proportional) font.)

1. do { x' <- return x            do { f x
      ; f x'               ==        }
      }

2. do { x <- m             ==     do { m
      ; return x }                   }

3. do { y <- do { x <- m          do { x <- m
                ; f x                ; do { y <- f x
                }          ==             ; g y
      ; g y                               }
      }                              }

                                  do { x <- m
                    using 3.14       ; y <- f x
                           ==        ; g y
                                     }

I think in this notation everyone sees the laws as plain common sense.
If you do write a monad that doesn't follow some common sense, the
dire consequence (practical or theoretical) is obvious.

Just in case it is still not obvious to somebody...

When we see a program written in a form on the LHS, we expect it to do
the same thing as the corresponding RHS; and vice versa.  And in
practice, people do write code like the lengthier LHS once in a while.
First example: beginners tend to write

  skip_and_get = do { unused <- getLine
                    ; line <- getLine
                    ; return line
                    }

and it would really throw off both beginners and veterans if that did
not act like (by law #2)

  skip_and_get = do { unused <- getLine
                    ; getLine
                    }

Second example: you go ahead and use skip_and_get:

  main = do { answer <- skip_and_get
            ; putStrLn answer
            }

The most popular way of comprehending this program is by inlining
(whether the compiler does so or not is an orthogonal issue):

  main = do { answer <- do { unused <- getLine
                           ; getLine
                           }
            ; putStrLn answer
            }

and applying law #3 so you can pretend it is

  main = do { unused <- getLine
            ; answer <- getLine
            ; putStrLn answer
            }

Law #3 is amazingly pervasive: you have always assumed it, and you
have never noticed it.  (To put it into perspective, you hardly notice
yourself breathing, but this only makes the practical meaning of
breathing more profound, not less.)

Whether compilers exploit the laws or not, you still want the laws for
your own sake, just so you can avoid pulling your hair out over
counter-intuitive program behaviour that brittly depends on how many
redundant "return"s you insert or how you nest your do-blocks.

It is also worth reading apfelmus' followup for further elaboration on the intuition behind the monad laws.

Deokhwan Kim wrote:
> But what practical problems can failing to satisfy them cause? In other
> words, I wonder if declaring an instance of the Monad class but not
> checking it for monad laws may cause any problems, except for not being
> qualified as a theoretical monad?

This question is likely to be a result of an unlucky introduction to
monads where they are introduced top down: "Hear ye, a monad, this is
some mystic thing obeying the spiritual laws 1., 2. and 3.", isn't it?
It is this way that monads get the attribute "theoretical".

Asking what the practical meaning of the monad laws might be is like
asking what the practical meaning of the laws for natural number
addition could be: what does
i)  a + (b+c) == (a+b) + c mean?
How can I understand
ii)  a + 0 == a ?
What does
iii)  a + b == b + a signify?

These questions are unlikely to arise because you have an intuition of
what a natural number is: a number of bullets in a sack, coins in your
pocket, people on the mailing list, etc. With this knowledge, you will
most likely not have any problems explaining the laws i), ii), iii) to
somebody else, and most likely you will have no doubt about *why* they
must be true.



For monads, my intuition is as follows: a value of type (M a) is an
action, something producing a value of type  a  and (or by) executing a
side-effect like drawing on the screen or screwing up the hard drive.


With the operator >>=, I can execute such actions in a specific
sequence. For the sequence, it is of course unimportant how I group my
actions: I can group actions act1 and act2 first and then append act3,
or I can group act2 and act3 first and then prepend act1.

To simplify writing down a formula corresponding to this fact, we
introduce the operator >> defined by
    act1 >> act2 = act1 >>= \x -> act2
which sequences actions but for simplicity discards the computed value x
of type a. It is only the side-effect of act1 we are interested in.

Now, the thought about grouping, written down as a formula, is just
    (act1 >> act2) >> act3  ==  act1 >> (act2 >> act3)
and this is the simplified version of law 3. Of course, we know that
this property is called "associativity".

The actual law 3 is just a formulation for >>= that takes proper care of
the intermediate calculation result x.
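
(An aside from me, not from apfelmus' mail: to make the grouping point concrete, here is a minimal sketch in IO. The names act1..act3 come from the mail above; the bodies and the grouped definitions are mine, purely for illustration. Both groupings print the same three lines in the same order.)

  -- Three illustrative actions; the bodies are assumptions of mine.
  act1, act2, act3 :: IO ()
  act1 = putStrLn "one"
  act2 = putStrLn "two"
  act3 = putStrLn "three"

  -- The simplified law 3: both groupings describe the same sequence
  -- of side-effects.
  groupedLeft, groupedRight :: IO ()
  groupedLeft  = (act1 >> act2) >> act3
  groupedRight = act1 >> (act2 >> act3)

  main :: IO ()
  main = groupedLeft >> putStrLn "---" >> groupedRight
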


With  return x , we can create an action which computes the value x but
 has absolutely no side-effects.
This can also be stated in formulas, as Mr "return" explains:
1. "if i am prepended to guys doing side-effects, i give them the value
x but do not take any responsibility for side-effects happening"
   (return x) >>= (\y -> f y) ==  f x
2. "if i am postponed to an action which computes a value x, i don't do
any additional side-effects but just return the value i have been given"
   m >>= (\x -> return x)  ==  m
which is of course equivalent to
   m >>= return  ==  m
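
(Another aside from me, not from the mail: here is what the two return laws look like instantiated at IO. The string literal and the names are mine, chosen only for illustration.)

  -- Law 1 at IO: prepending return adds no side-effects.
  lawOneLHS, lawOneRHS :: IO ()
  lawOneLHS = return "hello" >>= putStrLn
  lawOneRHS = putStrLn "hello"

  -- Law 2 at IO: appending return neither adds side-effects nor
  -- changes the value handed back.
  lawTwoLHS, lawTwoRHS :: IO String
  lawTwoLHS = getLine >>= return
  lawTwoRHS = getLine
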



So to answer your question:
> In other words, I wonder if declaring an instance of the Monad class
> but not checking it for monad laws may cause any problems, except for not
> being qualified as a theoretical monad?
A thing you declare to be an instance of the Monad class, but one that
does not fulfill the three laws above, simply does not match the
intuition behind a monad. That is, your definitions of (>>=) and (return)
are most likely void of the intended meaning.
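
One practical way to see whether a given instance matches that intuition is to write the three laws down as executable checks. Here is a minimal sketch of my own (not from the thread), using the Maybe monad and two hand-picked functions; a property-testing library such as QuickCheck would do the same job more thoroughly.

  -- Hypothetical Kleisli functions, chosen only for the check.
  f :: Int -> Maybe Int
  f n = if n > 0 then Just (n * 2) else Nothing

  g :: Int -> Maybe Int
  g n = if even n then Just (n + 1) else Nothing

  law1 :: Int -> Bool
  law1 x = (return x >>= f) == f x

  law2 :: Maybe Int -> Bool
  law2 m = (m >>= return) == m

  law3 :: Maybe Int -> Bool
  law3 m = ((m >>= f) >>= g) == (m >>= \x -> f x >>= g)

  -- Spot-check a few cases; all three should hold for Maybe.
  main :: IO ()
  main = print ( all law1 [-1, 0, 3]
              && all law2 [Nothing, Just 0, Just 5]
              && all law3 [Nothing, Just 0, Just 5] )
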

Friday, September 01, 2006

Why I prefer LALR parsers

I was asked on a mailing list why I prefer LALR to LL-based parser generators. I ended up spending enough time on it to justify posting it here. Summary: "Reduced complexity in the semantic analyser is well worth the 'cost' of thinking bottom-up in the parser."

How come you prefer LALR? What about LR(k)?

I haven't really found the need for the added expressiveness of LR(k); this, combined with the dearth of good GLR and LR(k) generators for many languages, means that I haven't ended up using one for a project yet. When people write production generators (as opposed to experimental ones), they seem to be either LL(k) or yacc+.

I'm not surprised at the popularity of LL: as you say, if you're going to write your parser by hand, you're going to write it top-down, recursive descent - i.e. LL. Consequently, for people coming to the problem of parsing for the first time, top-down parsers are much easier to visualise and, as you point out, to debug. I recently had to assist a colleague with a couple of SableCC[0] bugs that were a direct result of visualising the parse top-down while using a bottom-up parser.

However, once you've written a handful of parsers, the expressive limitations of LL(k) and the software-engineering challenges top-down imposes on semantic analysis quickly teach you to appreciate LALR. Consequently, I do not voluntarily use LL parsers in new projects.

I prefer LL(k) as it generates code similar to what you'd write yourself (a top-down parser), and that's useful when you are debugging. LL(k) is quicker :).

You're right, debugging the parser is simpler with LL(k). That, of course, is beside the point: as I mentioned in my original email, the parser is the easy stage - the hard part is the semantic analysis, and LL makes this tougher. Simplifying the parser at the expense of the SA is a false economy.

The added complexity derives from two places. The first is the additional information available at non-terminal productions in bottom-up parsers: because a bottom-up parser reduces a non-terminal only after its children have been fully parsed, S- or L-attribute grammars are feasible for the first stage of SA. Top-down parsing denies you that information, and when working with LL I have come across three ways of dealing with its absence:

  1. Don't use attribute grammars, and use gradually (i.e. most of the time, partially) initialised parser-global mutable state.
  2. Use nested parsers as accumulators.
  3. Use an Euler walk to convert top-down into bottom-up by attaching the SA to the post-order events.
In my experience the general approach is a combination of 1 and 2 - and they both cause serious structural problems in the resulting code, with all manner of coupling and structural aliasing between the parser and the SA that complicate ongoing maintenance. 3 wastes a lot of memory (although admittedly that is less of a problem now), but more importantly you have thrown away the very reason you went LL(k) in the first place - top-down parsing.

The second source of complexity is the consequence of losing access to left-recursive grammars, which has two key impacts:

  1. Elimination of left-recursion replaces one simple rule with two coupled rules.
  2. This elimination is achieved by converting left-recursion to right-recursion, which will always[4] convert an S-attribute grammar into an L-attribute grammar, and will seriously complicate any L-attribute grammar.

This causes problems analogous to the interaction of non-associative operators with foldr.
Consider the following grammar (where n is a terminal matching {digit}+ )

E -> E - n | n
converted[1] becomes
E -> n T*
T -> - n
"10 - 5 - 3" is then parsed:
  • L-recursive : (((10) - 5) - 3)
  • R-recursive : (10 (- 5 (- 3)))
Now of course the SA can evaluate the R-form correctly; it just takes more work[2], less intuitive code[3], greater coupling, and greater complexity.
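
To make the foldr analogy concrete, here is a small Haskell sketch of my own (it is not part of the original email): evaluating "10 - 5 - 3" from the two parse shapes above.

  -- The operands that follow the leading 10 in "10 - 5 - 3".
  rest :: [Int]
  rest = [5, 3]

  -- L-recursive shape: a plain left fold gives the intended answer.
  leftEval :: Int
  leftEval = foldl (-) 10 rest            -- ((10 - 5) - 3) = 2

  -- Folding the R-recursive shape naively gets it wrong, because
  -- subtraction is not associative.
  naiveRightEval :: Int
  naiveRightEval = 10 - foldr (-) 0 rest  -- 10 - (5 - (3 - 0)) = 8

  -- Recovering the correct value from the R-recursive shape means
  -- threading an accumulator through the recursion - the extra work
  -- and coupling referred to above.
  rightEval :: Int
  rightEval = go 10 rest
    where go acc []       = acc
          go acc (n : ns) = go (acc - n) ns

  main :: IO ()
  main = print (leftEval, naiveRightEval, rightEval)
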

LL parsers can't handle the L-recursive grammar; LR parsers can.

Reduced complexity in the SA is well worth the 'cost' of thinking bottom-up; seriously, it's not that difficult to do.

[1] The Kleene star simplifies this a bit; using e for the empty match, the classic conversion is

    E -> nE'
    E' -> - nE' | e
See Aho et al. for details.

[2] The L-recursive SA is a trivial S-attribute grammar defined as

E -> E' - n { E.val := E'.val - n.val }
E -> n      { E.val := n.val }
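
For what it's worth, here is the same S-attribute definition rendered as a Haskell fold over the L-recursive parse tree (my rendering, not part of the original footnote):

  -- The L-recursive parse tree for E -> E - n | n.
  data E = Sub E Int | Lit Int

  -- The attribute rules, directly: E.val := E'.val - n.val; E.val := n.val.
  eval :: E -> Int
  eval (Sub e n) = eval e - n
  eval (Lit n)   = n

  -- "10 - 5 - 3" parses L-recursively as Sub (Sub (Lit 10) 5) 3;
  -- eval of that tree is 2.
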

[3] In this case the L-recursive form is so trivial that the resulting R-form produces a trivial accumulator-based SA. However, note that even at this basic level, we have converted a simple declarative form into a recursive form that is fundamentally more complex. In cases where the L-form is naturally recursive, the resulting R-form invariably introduces mutable shared state, with its attendant complications.

[4] I'm reasonably certain this is the case; I don't, however, have time to double-check it. Presumably someone will correct me if I'm mistaken here.