2
\$\begingroup\$

I teach a data structures and algorithms course. Students can choose C++ or Java. As an exercise for myself, I am doing the assignments in Haskell. My primary goal is to understand what aspects of my class are easier in Haskell, and which aspects are more challenging. I am a fairly noob Haskell coder. Hence, I tend to be unaware of changes within the Haskell community, best practices, and "the right things". This is my secondary goal.

The class includes a List, Stack, Queue, Heap, and hash table. I'm up to the Queue structure so far.

Questions

  1. For the List, I implemented Foldable, Functor, Monoid, Semigroup to allow my List to be "list like" with the rest of Haskell's ecosystem, e.g., fmap, and <>. Should I do that for the Queue too?

  2. In the C++/Java version of the class we implement a doubly-linked list then apply a Stack Interface on top of it. This is to show the flexibility of the doubly-linked list structure. However for the Haskell version it seemed more prudent to express the Queue structure directly, i.e, a thing with a beginning and ending, with some stuff in the middle. Is this a good choice? Is there a better expression of the Queue structure?

    1. Is push still constant time, i.e., O(1)? The recursive call for the More case will match either One or Two and be done.
  3. Errors. head Nil is undefined. This is lame. While head :: Queue a -> Maybe a is an option, is there a better option?

  4. Is there a question I should have asked?

Related Work

  1. A simple queue implementation in Haskell
    1. Uses Maybe which is a hint my implementation should as well.
    2. Derives Eq. That makes sense to compare two Queue's. I'll add that to mine.

Queue Data Structure

module Queue ( Queue (Nil, Unit, More), head, pop, push, fromList, empty, ) where import Data.List (foldl') import Prelude hiding (head) data Queue a = Nil | Unit a | Two a a | More a (Queue a) a deriving (Show) head :: Queue a -> a head Nil = undefined head (Unit x) = x head (Two _ x) = x head (More _ _ x) = x pop :: Queue a -> (Queue a, a) pop Nil = (Nil, undefined) pop (Unit x) = (Nil, x) pop (Two x y) = (Unit x, y) pop (More x middle y) = (More x middle' newEnd, y) where (middle', newEnd) = pop middle empty :: Queue a -> Bool empty Nil = True empty _ = False push :: a -> Queue a -> Queue a push x Nil = Unit x push x (Unit y) = Two x y push x (Two y z) = More x (Unit y) z push w (More x middle y) = More w (push x middle) y fromList :: [a] -> Queue a fromList = foldl' push' Nil where push' = flip push 

Tests

module QueueSpec (spec) where import Data.Foldable (toList) import Data.List (sort) import Queue as Q import Test.Hspec import Test.Hspec.QuickCheck import Test.QuickCheck import Test.QuickCheck.Modifiers spec :: Spec spec = do describe "head" $ do it "put gets back" $ do Q.head (Unit 5) `shouldBe` 5 it "pop unit" $ do let x = Nil let x2 = push 5 x let (q, y) = pop x2 y `shouldBe` 5 it "pop more" $ do let x = Unit 1 let x2 = push 2 x let (q, y) = pop x2 y `shouldBe` 1 it "pop more with priority" $ do let x = Unit 2 let x2 = push 1 x let (q, y) = pop x2 y `shouldBe` 2 it "push a list" $ do let x = fromList [1, 2, 3, 4, 5] let (q, y) = pop x y `shouldBe` 1 it "first element of list is first of queue" $ property prop_firstListIsFirstQueue prop_firstListIsFirstQueue :: NonEmptyList Int -> Bool prop_firstListIsFirstQueue xs = Prelude.head l == Q.head q where l = getNonEmpty xs q = fromList l 
\$\endgroup\$

    2 Answers 2

    1
    \$\begingroup\$

    Use the Right Data Structure

    In Haskell, a double-linked list is, as you know, inefficient, as removing an element from the end requires copying all other elements of the list.

    In this case, you define four cases, which try to build a list from the middle out. Your push and pop operations still run in linear time, needing to recurse to the root of the structure and generate an entirely new list. This vitiates any advantage of storing the first and last elements in the outermost node.

    You are better off using a self-balancing binary tree here, such as a red-black tree, which would be able to re-use the majority of the nodes on each operation. There are also functional FIFO queue structures using, e.g., a push list and a pop list, where the pop list is created by reversing the push list.

    If you’re going to perform a linear-time copy of the data, the structure you probably want to use is a Data.Vector, which can add data to the right or shift it to the left as efficiently as possible, even amortizing most of the operations. For example, enqueueing elements could be a constant-time insertion at the end, and dequeueing them could be a constant-time increment of an offset from the beginning. Eventually, your queue will run out of space, and you will need to copy the entire thing to a new Vector. But many smaller queues might never need to do this, and would just get constant time. When the vector does get copied, this is fast because it performs one allocation and then copies a contiguous block of memory.

    Replace undefined with a Permanent Solution

    If pop on an empty queue should be a runtime error, declare it as an error with a meaningful message. This will help you, or anyone else using your library, debug their code.

    Alternatively, pop could return a Maybe value, and pop on an empty queue could then return Nothing, rather than crash the program.

    There is, Indeed, a Bug

    As you thought, pop Nil can get called. Consider the following sequence of operations. Please read from the bottom up.

    test = fst . pop $ -- Error! Attempts to pop Nil. fst . pop $ -- More 3 Nil 2 push 3 $ -- More 3 (Unit 2) 1 push 2 $ -- Two 2 1 push 1 $ -- Unit 1 Nil 

    Or with the aid of import Data.Function ((&)):

     Nil & push 1 & -- Unit 1 push 2 & -- Two 2 1 push 3 & -- More 3 (Unit 2) 1 pop & fst & -- More 3 Nil 2 pop & fst -- Error! Attempts to pop Nil. 

    Consider Using the Names Enqueue and Dequeue

    This seems less ambiguous to me than push and pop operations, but those names are widely-used enough for queues that I won’t tell you they’re wrong.

    \$\endgroup\$
    1
    • \$\begingroup\$> push and pop operations still run in linear time Well shoot. Thank you for pointing that out. Add that to my list of challenges: estimating time complexity presents differently than imperative languages. Thank you for the thoughtful answer. I'll keep working on it.\$\endgroup\$CommentedJul 6, 2022 at 4:00
    2
    \$\begingroup\$

    Unordered remarks:

    • You have an explicit export list. Nice! This is a good habit to get into for the sake of tight abstraction boundaries.

    • (partial response to question #2) I don't think I agree with your choice of representation for Queue as four cases (empty, thing, two things, first-middle-last). The main operations you define (push, pop, empty, and head) can all be implemented in O(1) time with a standard data Queue a = Empty | Cons a (Queue a).

    • (partial response to question #2) Generally speaking, I don't think there's anything wrong in Haskell with having a type that uses another, more complex, type, under the hood.

      That being said, I would be wary about using any old list-like structure sitting around to implement a Queue without taking time to consider. This holds true in any language, but I think can bite especially hard in Haskell, where it's eas(y|ier) to accidentally write beautiful-but-not-especially-performant code.

    • (response to question #3) Using undefined like this is not idiomatic. Generally speaking idiomatic Haskell tries to convey as much information in a type as (reasonably) possible, especially errors. Hence, head :: Queue a -> Maybe a is appropriate.

      As per "is there a better option?" to Maybe a -- perhaps? It depends on what you mean by better. Without more info, I would recommend use of Maybe as the idiomatic solution.

      Coming from other languages, representing errors with types like Maybe can feel clunky. When I pop the item off the top of my Queue a, I've received a Maybe a, but I want to work with an a! So now I have to use a case to get my a out. Annoying! Haskell has a number of affordances to improve ergonomics in these kinds of situations, such as <$>, <*>, and do notation, so with practice it gets far less annoying and far more natural!

      Anyway, I would recommend using head :: Queue a -> Maybe a and also pop :: Queue a -> Maybe (Queue a, a)

    • (response to question #1) I think implementing these classes for Queue would be reasonable, yes.

    • (response to question #4) Nothing comes to mind!

    • You have tests! Nice.

      Aside: take a look at QuickCheck if you have the time and don't already know about property testing. Cool stuff, and very appropriate for structures

    • Nice use of _ in the implementation for empty

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.