upd: I am very sorry about my mistake. The old version of data is download from luogu, but the website do NOT allow me to download the data of the worst case. In fact, when facing same amount of operations, the memory usage depends on the type of operations and their orders. I re-generate some data, and update the link and profiling logs.
I am new in Haskell, and I try to implement an AVL tree. To test my code, I submit it to an Online Judge, which has a problem of balanced-tree. However, I get a MLE (Memory Limit Exceeded).
In fact, when facing 10^5 times of operations, my code requires 58 MB memories, while native C++ code only need about 2 to 3 MB. The speed of C++ code is also 10x faster.
I wonder if haskell can deal with such large data structure while having performance close to native code. What should I do?
Here is my code:
type Info = (Int, Int) data Tree a = Empty | Ord a => Node (Tree a) a Info (Tree a) just :: Maybe a -> a just (Just a) = a height :: Tree a -> Int height Empty = 0 height (Node _ _ (h,_) _) = h size :: Tree a -> Int size Empty = 0 size (Node _ _ (_,s) _) = s maintainInfo :: Tree a -> Tree a maintainInfo (Node ls v _ rs) = Node ls v (max (height ls) (height rs) + 1, size ls + size rs + 1) rs rotateL :: Tree a -> Tree a rotateL (Node (Node ll lv _ lr) v _ rs) = maintainInfo (Node ll lv (0,0) (maintainInfo (Node lr v (0,0) rs))) rotateR :: Tree a -> Tree a rotateR (Node ls v _ (Node rl rv _ rr)) = maintainInfo (Node (maintainInfo (Node ls v (0,0) rl)) rv (0,0) rr) rotateL_ :: Tree a -> Tree a rotateL_ (Node (Node ll lv _ lr) v _ rs) | height ll >= height lr = rotateL (Node (Node ll lv (0,0) lr) v (0,0) rs) | otherwise = rotateL (Node (rotateR (Node ll lv (0,0) lr)) v (0,0) rs) rotateR_ :: Tree a -> Tree a rotateR_ (Node ls v _ (Node rl rv _ rr)) | height rl > height rr = rotateR (Node ls v (0,0) (rotateL (Node rl rv (0,0) rr))) | otherwise = rotateR (Node ls v (0,0) (Node rl rv (0,0) rr)) maintain :: Tree a -> Tree a maintain (Node ls v info rs) | height ls > height rs + 1 = rotateL_ (Node ls v info rs) | height rs > height ls + 1 = rotateR_ (Node ls v info rs) | otherwise = maintainInfo (Node ls v info rs) insert :: Ord a => a -> Tree a -> Tree a insert val Empty = Node Empty val (1,1) Empty insert val (Node ls v _ rs) | val < v = maintain (Node (insert val ls) v (0,0) rs) | val >= v = maintain (Node ls v (0,0) (insert val rs)) iter :: Tree a -> [a] iter Empty = [] iter (Node ls v _ rs) = iter ls ++ v : iter rs maxi :: Tree a -> Maybe a maxi Empty = Nothing maxi (Node _ v _ Empty) = Just v maxi (Node _ _ _ rs) = maxi rs mini :: Tree a -> Maybe a mini Empty = Nothing mini (Node Empty v _ _) = Just v mini (Node ls _ _ _) = mini ls eraseSwap :: Tree a -> Tree a eraseSwap Empty = Empty eraseSwap (Node ls _ _ Empty) = ls eraseSwap (Node ls _ _ rs) = let next = just (mini rs) in maintain (Node ls next (0,0) (erase next rs)) erase :: a -> Tree a -> Tree a erase _ Empty = Empty erase val (Node ls v _ rs) | val < v = maintain (Node (erase val ls) v (0,0) rs) | val > v = maintain (Node ls v (0,0) (erase val rs)) | val == v = eraseSwap (Node ls v (0,0) rs) next :: a -> Tree a -> Maybe a next _ Empty = Nothing next val (Node ls v _ rs) | val < v = Just (maybe v (min v) (next val ls)) | val >= v = next val rs prev :: a -> Tree a -> Maybe a prev _ Empty = Nothing prev val (Node ls v _ rs) | val <= v = prev val ls | val > v = Just (maybe v (max v) (prev val rs)) rank :: a -> Tree a -> Int rank _ Empty = 1 rank val (Node ls v _ rs) | val <= v = rank val ls | val > v = size ls + 1 + rank val rs select :: Int -> Tree a -> Maybe a select _ Empty = Nothing select k (Node ls v _ rs) | size ls >= k = select k ls | size ls == k - 1 = Just v | otherwise = select (k - size ls - 1) rs update :: [Int] -> Tree Int -> Tree Int update [1,x] = insert x update [2,x] = erase x ask :: [Int] -> Tree Int -> Int ask [3,x] tree = rank x tree ask [4,x] tree = just (select x tree) ask [5,x] tree = just (prev x tree) ask [6,x] tree = just (next x tree) solve :: Int -> Tree Int -> IO () solve 0 _ = return () solve n tree = do inputQuery <- getLine let l = map read (words inputQuery) :: [Int] if head l < 3 then do solve (n-1) (update l tree) else do print (ask l tree) solve (n-1) tree return () main :: IO () main = do inputN <- getLine let n = read inputN :: Int solve n Empty return ()
I humbly accept all suggestions, including those that are not aimed at performance but at coding style.
To test my code, I submit it to Luogu P3369, which is in Chinese. The simple translation is as follow:
You need to dynamically maintain a re-set M and provide the following operations:
- Insert a number x into M.
- Delete a number x from M (if there are multiple identical numbers, only delete one).
- Query how many numbers in M are smaller than x, and add one to the answer.
- Query the number that ranks in the x-th position after M is sorted from small to large.
- Query the predecessor of x in M (the predecessor is defined as the largest number that is smaller than x).
- Query the successor of x in M (the successor is defined as the smallest number that is greater than x).
- For operations 3, 5, and 6, it is not guaranteed that the number x exists in the current re-set.
- The first line is n, which indicates the number of operations. Each of the following n lines has two numbers opt and x. opt indicates the sequence number of the operation (1 <= opt <= 6)
- For operations 3,4,5,6, each line outputs a number, indicating the corresponding answer.
If anyone has difficulty to login luogu, here is also some data files to download:
Here is the output of ghc -O2 -rtsopts main.hs && ./main < avl.in > tmp.out +RTS -s
:
1,579,041,840 bytes allocated in the heap 309,624,880 bytes copied during GC 19,305,064 bytes maximum residency (21 sample(s)) 186,640 bytes maximum slop 58 MiB total memory in use (0 MB lost due to fragmentation) Tot time (elapsed) Avg pause Max pause Gen 0 360 colls, 0 par 0.324s 0.329s 0.0009s 0.0055s Gen 1 21 colls, 0 par 0.419s 0.425s 0.0202s 0.0500s INIT time 0.001s ( 0.001s elapsed) MUT time 0.897s ( 0.907s elapsed) GC time 0.743s ( 0.754s elapsed) EXIT time 0.000s ( 0.009s elapsed) Total time 1.641s ( 1.671s elapsed) %GC time 0.0% (0.0% elapsed) Alloc rate 1,759,782,364 bytes per MUT second Productivity 54.7% of total user, 54.3% of total elapsed