Writing monadic Haskell to evaluate arithmetic expression

Question

I have been studying Haskell by myself for about a little over a year. And I have been stuck at monad/monad transformers for quite a while until recently some examples I read online enlightened me. So I decided to try on the following problem with writing monadic Haskell code.

The problem is to evaluate a string that contains only 0-9, +, - and *, which represents addition, subtraction and multiplication separately. The string itself should represent a valid math expression and starts with a number always.

"3+5" -> 8 "3+25*4" -> 103 "1-2*2*2+7" -> 0

The goal of the exercise is not to write a perfect parsing engine to evaluate any math expression but to try to learn to use monad as a tool to write program that could be relatively straight forward in an imperative language such as C++.

It is a linear algorithm and the main the idea is to use two stacks to track numbers and operators.

On a new digit, update the current on-the-run number
On any operator, push the on-the-run number to the number stack. Update the stacks if the existing operator on the top of the stack is '*'. If this new operator is a '+' or '-', update the stacks if only the existing operator is '+' or '-'. Once the update is done, push the new operator to the stack
repeat the process until there is one number left.

This algorithm is used to develop the solutions in both C++ and Haskell.

C++ solution:

 #include <stack> #include <iostream> #include <string> #include <stdexcept> using namespace std; int calc(char c, int n1, int n2) { // cout << c << "-->" << n1 << " and " << n2 << endl; if (c == '+') return n1+n2; else if (c == '-') return n1-n2; else if (c == '*') return n1*n2; else throw runtime_error("bad operator"); } void update(stack<int>& numbers, stack<char>& operators) { if (operators.size() + 1 != numbers.size()) throw runtime_error("bad"); char op = operators.top(); operators.pop(); int n2 = numbers.top(); numbers.pop(); int n1 = numbers.top(); numbers.pop(); numbers.push(calc(op, n1, n2)); } int processMath(const string& input) { int num = 0; stack<int> numbers; stack<char> operators; for (char c : input) { if (c == '+' || c == '-' || c == '*') { numbers.push(num); num = 0; // reset number if (c == '*' && !operators.empty() && operators.top() == '*') { update(numbers, operators); } else if (c == '+' || c == '-') { // c is + or - while (!operators.empty()) update(numbers, operators); } operators.push(c); } else { num = num*10+(c-'0'); // cout << "num=" << num << endl; } } numbers.push(num); while (!operators.empty()) update(numbers, operators); return numbers.top(); } // To execute C++, please define "int main()" int main() { string exp1 = "13+15"; string exp2 = "3+25*4"; string exp3 = "1-2*2*2+7"; cout << exp1 << endl << processMath(exp1) << endl << endl; cout << exp2 << endl << processMath(exp2) << endl << endl; cout << exp3 << endl << processMath(exp3) << endl << endl; return 0; }

The following part is the Haskell program I came up with, without using anything specific for parsing or math evaluation.

import Control.Monad.State import Data.Char data MathStacks = MathStacks { numbers :: [Int] , operators :: [Char] , current :: Int } deriving Show data EvalErr = ParseErr { position :: Int, reason :: String } | StackErr String | OpErr String deriving Show collapseOn :: MathStacks -> [Char] -> Either EvalErr MathStacks collapseOn ms@(MathStacks ns ops _) permittedOps | null ops = return ms | length ns < 2 = Left $ StackErr ("numbers length < 2:" ++ show ns) | not $ op `elem` "+-*" = Left $ OpErr ("invalid op=" ++ [op]) | not $ op `elem` permittedOps = return ms | otherwise = do n <- calc op n1 n2 return $ ms { numbers=(n:nrest), operators=oprest } where (n2:n1:nrest) = ns (op:oprest) = ops calc :: Char -> Int -> Int -> Either EvalErr Int calc c n1 n2 | c == '+' = return $ n1 + n2 | c == '-' = return $ n1 - n2 | c == '*' = return $ n1 * n2 | otherwise = Left $ OpErr ("invalid op=" ++ [c]) exec :: MathStacks -> Either EvalErr MathStacks exec ms@(MathStacks ns ops curr) | nlen /= olen + 1 = Left $ StackErr ("inconsistent stacks") | olen == 0 = Right ms | otherwise = do let (n2:n1:nrest) = ns (op:oprest) = ops n <- calc op n1 n2 return $ MathStacks (n:nrest) oprest curr where nlen = length ns olen = length ops exec' :: MathStacks -> Either EvalErr MathStacks exec' ms@(MathStacks ns ops _) | null ops = return ms | otherwise = (exec ms) >>= exec' eval :: MathStacks -> Either EvalErr Int eval (MathStacks ns ops curr) | nlen /= 1 || olen /= 0 = Left $ StackErr ("inconsistent stacks") | otherwise = Right $ head ns where nlen = length ns olen = length ops horner :: Int -> Int -> Int horner digit num = num * 10 + digit updateCurr :: Int -> MathStacks -> MathStacks updateCurr digit ms@(MathStacks _ _ curr) = ms { current=horner digit curr } updateOps :: Char -> MathStacks -> Either EvalErr MathStacks updateOps op ms@(MathStacks _ ops _) | op `elem` ['+', '-', '*'] = return $ ms { operators=(op:ops) } | otherwise = Left $ OpErr ("invalid op=" ++ [op]) updateNum :: MathStacks -> MathStacks updateNum ms@(MathStacks ns _ curr) = ms { numbers=(curr:ns), current=0 } parse :: (Char, Int) -> MathStacks -> Either EvalErr MathStacks parse (c, idx) ms@(MathStacks ns ops curr) | c `elem` ['+', '-', '*'] = do -- current number run is done let ms0 = updateNum ms -- if there is existing multiplication on top. collapse it ms1 <- collapseOn ms0 "*" ms2 <- if c == '+' || c == '-' -- if there is existing addition or subtraction, do it then collapseOn ms1 "+-" else return ms1 updateOps c ms2 | isDigit c = Right $ updateCurr (digitToInt c) ms | otherwise = Left $ ParseErr idx ("err char at pos=" ++ show idx ++ " char:" ++ [c]) where nlen = length ns olen = length ops updateOnceT :: StateT MathStacks (Either EvalErr) () updateOnceT = do -- in side of StateT MathStacks (Either EvalErr) monad ms <- get ms' <- lift $ exec ms put ms' evalCharT :: (Char, Int) -> StateT MathStacks (Either EvalErr) () evalCharT (c, idx) = do ms <- get -- ms :: MathStacks -- promotes from Either EvalErr MathStacks type to StateT monad ms' <- lift $ parse (c, idx) ms put ms' evalStringT :: String -> StateT MathStacks (Either EvalErr) () evalStringT s = mapM_ evalCharT $ zip s [1..] evalStringE :: String -> Either EvalErr MathStacks evalStringE s = foldM (flip parse) emptyStack $ zip s [1..] calcStringE :: String -> Either EvalErr MathStacks calcStringE s = do (_, ms) <- (runStateT $ evalStringT s) emptyStack return ms top :: MathStacks -> Either EvalErr Int top ms = do let ns = numbers ms if null ns then Left $ StackErr "no value left" else return $ head ns calcString :: String -> Either EvalErr Int calcString s = do ms <- evalStringE s -- or use ms <- calcStringE s ms' <- exec' $ updateNum ms top ms' emptyStack = MathStacks [] [] 0 main :: IO () main = do print $ calcString "13+15" print $ calcString "3+25*4" print $ calcString "1-2*2*2+7"

The solution is a much longer program than the C++ counterpart, which is not the impression I got with Haskell program. The part that I used StateT monad transformer is probably not necessary (function evalStringT and function calcStringE), however even without these functions I don't think my solution will get much shorter. I thought using State monad could be a natural solution as it involves quite some state updates in the whole process but it looks like foldM over Either monad seems doable. Overall I am not even sure my solution is Haskellish enough so please point out anything that I can improve on my code.

Can you explain why "1-2*2*2+7" yields "7" and not 0 (\$1-8+7\$)? You should probably add more detail on the grammar behind your math expressions. — Zeta, CommentedJan 22, 2019 at 11:29

sshine · Accepted Answer · 2019-01-24 13:51:58Z

Update:Zeta suggested that what I gave was not really a review. So I present a review first:

The data type EvalErr has both ParseErr, StackErr and OpErr. A common error type for your entire pipeline seems like an OK idea, since the individual parts (parser, evaluator) will not be used independently.
Your error values are all parameterised with a String, which can be super useful as you write the parser, but this makes testing negative cases more difficult. A StackErr may be parameterised with the actual stack that broke. This also makes negative testing easier; a good sign of code quality is testability. You can always produce meaningful error messages based on StackErr ns (and whatever remaining context that makes for a good message; what was the operator that failed?).
Similarly, OpErr could take a single Char.
You perform unsafe pattern matching in the where of collapseOn:
```
where (n2:n1:nrest) = ns (op:oprest) = ops 
```
You justify this by guarding against too short lists.
But this creates a dependency between multiple lines.
You can avoid this either by using pattern matching to restrict access to executing code: A function body that will only execute once a pattern matches is safe. Or you can extract values monadically, providing for more abstraction (including implicit error handling). For example, a monadic stack may work like:
```
eval Add = do v1 <- pop v2 <- pop push (v1 + v2) 
```
I'm not sure exactly what collapseOn does. It handles a bunch of types of errors that are at different levels of abstraction. And then it calls eval, pushes the result to the stack, and removes an operator from a list of operators for some reason or other.
Is collapse a metaphor for error handling? Or for reducing the stack?
So I'd say it does too many things.
You can check that there are enough elements via pattern matching or monadic popping from the stack without calculating the entire length (a full traversal of the stack) every time you handle a new element.
Your list of supported operators is repeated many times. This makes adding new ones difficult and error-prone. The precedence and associativity of your operators is embedded in the ordering of your code and makes it hard to derive, extend or verify that they're right.
The following StateT functions seem a little off:
```
updateOnceT :: StateT MathStacks (Either EvalErr) () updateOnceT = do ms <- get ms' <- lift $ exec ms put ms' evalCharT :: (Char, Int) -> StateT MathStacks (Either EvalErr) () evalCharT (c, idx) = do ms <- get -- ms :: MathStacks ms' <- lift $ parse (c, idx) ms put ms' 
```
There is a modify combinator. But I would probably ditch the StateT altogether to begin with and either
1. Build a non-monadic stack-based parser from scratch, simplify it and extend it. (You'll eventually end up with something that is somewhat equivalent to parser combinators, since they're also recursive descent parsers, but not explicitly recursive.)
2. Build a parser using parser combinators (see below) and either construct a syntax tree or make the parser produce the evaluator directly.
I'd recommend reading up on separation of concerns.

Previous:I wrote this suggestion to solve the problem by dividing the problem into parsing and evaluation, and to use other abstractions than a stack-based algorithm.

What you can do is convert the expression into a syntax tree using a parser combinator library like Megaparsec and evaluate that syntax tree. The author of Megaparsec, Mark Karpov, wrote a tutorial called Parsing a simple imperative language. It has a section called Expressions where he demonstrates the makeExprParser combinator:

aExpr :: Parser AExpr aExpr = makeExprParser aTerm aOperators aOperators :: [[Operator Parser AExpr]] aOperators = [ [ Prefix (Neg <$ symbol "-") ] , [ InfixL (ABinary Multiply <$ symbol "*") , InfixL (ABinary Divide <$ symbol "/") ] , [ InfixL (ABinary Add <$ symbol "+") , InfixL (ABinary Subtract <$ symbol "-") ] ]

As for building a monadic evaluator, I'd read The Monadic Way on the Haskell Wiki. It starts by building a regular evaluator and then adds features that are greatly complicated by the lack of monads, and then it introduces them.

It seems that your examples do not mention division, which is a pretty good example of something that may fail during evaluation because of division by zero. If you had the following syntax tree,

data AExpr = IntConst Integer | Neg AExpr | ABinary ABinOp AExpr AExpr deriving (Show) data ABinOp = Add | Subtract | Multiply | Divide deriving (Show, Eq) data Error = DivisionByZero deriving (Show, Eq)

you could write something like,

eval :: AExpr -> Either Error Integer eval (IntConst i) = return i eval (Neg e) = negate <$> eval e eval (ABinary op e1 e2) = do i1 <- eval e1 i2 <- eval e2 if op == Divide && i2 == 0 then Left DivisionByZero else return $ binop op i1 i2 binop :: ABinOp -> (Integer -> Integer -> Integer) binop Add = (+) binop Subtract = (-) binop Multiply = (*) binop Divide = quot

This separates the concerns of syntax analysis and evaluation. This also means that different kinds of errors are handled at different layers of abstraction. And it means you get a declarative, high-level way of expressing the precedence and associativity of your operators.

More:I added this after to suggest a middle ground.

If you think makeExprParser feels like cheating, and you think the intermediate syntax-tree representation is redundant, you can

Make your life harder by writing your expression parser using the chainl, chainr etc. combinators, or write your own hierarchy of nested combinators (makes precedence and associativity slightly more obscure, but much less so than an explicitly recursive function would).
Make the program simpler by writing a parser that generates a function that evaluates:
```
evalP :: Parser (String -> Integer) evalP = ... 
```

Hm. You have presented a well-thought alternative solution, but haven't reviewed the code. Please explain your reasoning (why it is better than the original) so that the author and other readers can learn from your thought process. I'd start with the saparation of concerns first, as it is the most prominent issue of the original code. — Zeta, CommentedJan 24, 2019 at 9:02

Stack Exchange Network

Writing monadic Haskell to evaluate arithmetic expression

1 Answer 1

Hot Network Questions

Writing monadic Haskell to evaluate arithmetic expression

1 Answer 1

Related

Hot Network Questions