2
\$\begingroup\$

A thought exercise on my part as I'm relatively new to Haskell. I wanted an interesting project to work on so I decided to implement the Hashcash Algorithm, which is most commonly used as the basis of Bitcoin Proof of Work scheme. I am implementing the original specification that utilizes SHA1 and the description of the algorithmic steps are described well in the above Wikipedia article.

This appears to function correctly to the best of my knowledge, however I feel it is somewhat slower than it should be. Any potential suggestions for performance improvements are welcome here. Furthermore, as I am new to writing Haskell, if I am violating common expected conventions here then please feel free to point out how I can write more readable and standard code here.

{-# LANGUAGE BangPatterns #-} module HashCash where import Data.Int import Data.List import Data.List.Split (splitOn) import Data.Char import Data.Function import System.Random import Data.Bits import Data.Either import Data.Binary.Strict.Get import System.IO as SIO import Data.Word (Word32) import Data.ByteString as B import Data.ByteString.Char8 as BC import Data.ByteString.UTF8 as BU import Data.ByteString.Base64 as B64 import Data.ByteString.Conversion as BCON import Data.ByteArray as BA import Crypto.Random import Crypto.Hash startingCounter :: Int32 startingCounter = 1 difficulty :: Int difficulty = 20 headerPrefix = "X-Hashcash: " template = "1:{:{:{::{:{" dateTemplate = "YYMMDDhhmmss" address = "a@a" -- example date because I dont want to mess with date formatting just now exampleDate = "150320112233" convertToString :: ByteString -> String convertToString b = BU.toString b convertFromString :: String -> ByteString convertFromString s = BU.fromString s convertIntToString :: Int -> String convertIntToString a = convertToString . BCON.toByteString' $ a encodeInt32 :: Int32 -> ByteString encodeInt32 a = B64.encode . BCON.toByteString' $ a mahDecoder :: Get Word32 mahDecoder = do first32Bits <- getWord32be return first32Bits firstBitsZero :: (Bits a) => a -> Bool firstBitsZero val = Data.List.foldr (\x acc -> ((not $ testBit val x) && acc)) True [0..(difficulty - 1)] formatTemplate :: String -> [String] -> String formatTemplate base [] = base formatTemplate base (x:xs) = let splix = (Data.List.Split.splitOn "{" base) :: [String] splixHead = Data.List.head splix ++ x splixTail = Data.List.tail splix concatSplitTail = Data.List.init $ Data.List.concatMap (++ "{") splixTail in formatTemplate (splixHead ++ concatSplitTail) xs get16RandomBytes :: (DRG g) => g -> IO (ByteString, g) get16RandomBytes gen = do let a = randomBytesGenerate 16 gen return $ a getBaseString :: ByteString -> Int32 -> String getBaseString bs counter = let encodedVal = B64.encode bs encodedCounter = encodeInt32 counter baseParams = [(convertIntToString difficulty), exampleDate, address, (convertToString encodedVal), (convertToString encodedCounter)] in formatTemplate template baseParams hashSHA1Encoded :: ByteString -> ByteString hashSHA1Encoded bs = let hashDigest = hash bs :: Digest SHA1 byteString = B.pack . BA.unpack $ hashDigest in byteString -- Pass a counter and if the first 20 bits are zero then return the same counter value else increment it -- signifying it is time to test the next number (NOTE: recursive style, may overflow stack) testCounter :: ByteString -> Int32 -> Int32 testCounter rb !counter = let baseString = getBaseString rb counter hashedString = hashSHA1Encoded $ convertFromString baseString !eitherFirst32 = runGet mahDecoder hashedString incCounter = counter + 1 in case eitherFirst32 of (Left first32, _) -> testCounter rb incCounter (Right first32, _) -> if (firstBitsZero first32) then counter else testCounter rb incCounter testCounterBool :: ByteString -> Int32 -> Bool testCounterBool rb counter = let baseString = getBaseString rb counter hashedString = hashSHA1Encoded $ convertFromString baseString eitherFirst32 = runGet mahDecoder hashedString in case eitherFirst32 of (Left first32, _) -> False (Right first32, _) -> firstBitsZero first32 -- Keep taking incrementing counters from an infinite list and testing them until we find a counter -- that generates a valid header findValidCounter :: ByteString -> Int32 findValidCounter ran = Data.List.last $ Data.List.takeWhile (not . testCounterBool ran) [1..] generateHeader :: IO String generateHeader = do g <- getSystemDRG (ran, _) <- get16RandomBytes g let validCounter = findValidCounter ran let validHeader = getBaseString ran validCounter return $ headerPrefix ++ validHeader main :: IO () main = do header <- generateHeader SIO.putStrLn header return () 
\$\endgroup\$

    1 Answer 1

    3
    \$\begingroup\$

    That's a lot of imports, and several seem strange. For example Data.List, since you only use head, tail, foldr or other Prelude functions. The problem is that you've use as, but left the qualified. This should be:

    import Crypto.Hash import Crypto.Random import Data.Binary.Strict.Get import Data.Bits import Data.ByteString (ByteString) import qualified Data.ByteArray as BA import qualified Data.ByteString as B import qualified Data.ByteString.Base64 as B64 import qualified Data.ByteString.Char8 as BC import qualified Data.ByteString.Conversion as BCON import qualified Data.ByteString.UTF8 as BU import Data.Either (either) import Data.Int (Int32) import Data.List.Split (splitOn) import Data.Word (Word32) import System.Random 

    qualified will prevent names like head from Data.ByteString getting imported into the global namespace. See "import" for more information. The whitespace is purely custom. I like to sort the modules by name, but that's up to you.

    Next, I would make clear that your convert* functions are only new names, e.g.:

    convertToString :: ByteString -> String convertToString = BU.toString convertFromString :: String -> ByteString convertFromString = BU.fromString convertIntToString :: Int -> String convertIntToString = convertToString . BCON.toByteString' encodeInt32 :: Int32 -> ByteString encodeInt32 = B64.encode . BCON.toByteString' mahDecoder :: Get Word32 mahDecoder = getWord32be 

    The mahDecoder change is a little bit different, since you've used originally something like

    foo = do x <- func return x 

    However, this is fine due to the right-identity monad law:

    func >>= return === func 

    Next, firstBitsZero can be rewritten using all:

    firstBitsZero :: (Bits a) => a -> Bool firstBitsZero val = all (\x -> not $ testBit val x) [0..(difficulty - 1)] 

    Note that a bit mask would be faster would make it faster, e.g.:

    bitMask :: Num a => a bitMask = (2 ^ difficulty) - 1 firstBitsZero :: Int32 -> Bool firstBitsZero val = bitMask .&. val == zeroBits 

    That way you only have to create the mask once and then only use bitwise AND, which usually gets compiled into a single CPU instruction.

    When working with list, you want to add elements at the front, not at the back. Instead of

    init (concatMap (++ "{") splixTail) 

    you should use

    tail (concatMap ('{' :) splixTail) 

    The problem is that ++ is linear in its first argument:

    (x:xs) ++ ys = x : (xs ++ ys) 

    (:) on the other hand is constant in terms of time.

    We end up with:

    formatTemplate :: String -> [String] -> String formatTemplate base [] = base formatTemplate base (x:xs) = let (splixHead:splixTail) = splitOn "{" base concatSplitTail = tail $ concatMap ("{" :) splixTail in formatTemplate (splixHead ++ (x : concatSplitTail)) xs 

    Your next function doesn't need to be in IO:

    get16RandomBytes :: (DRG g) => g -> (ByteString, g) get16RandomBytes = randomBytesGenerate 16 

    In your following functions, I would prefer to use where, but that's completely up to personal preference. Also, if you use an expression only once, it might make sense to get rid of its binding if the code stays readable:

    hashSHA1Encoded :: ByteString -> ByteString hashSHA1Encoded bs = B.pack . BA.unpack $ (hash bs :: Digest SHA1) 

    or

    hashSHA1Encoded :: ByteString -> ByteString hashSHA1Encoded bs = B.pack . BA.unpack $ hashDigest where hashDigest = hash bs :: Digest SHA1 

    Both testCounter and testCounterBool use some duplicate code, which should be placed in its own function:

    decodeFirst32 :: ByteString -> Int32 -> Either String Word32 decodeFirst32 rb = fst . runGet mahDecoder . hashSHA1Encoded . convertFromString . getBaseString rb 

    This makes testCounter and testCounterBool a lot simpler:

    testCounter :: ByteString -> Int32 -> Int32 testCounter rb !counter = case decodeFirst32 rb counter of Right f32 | firstBitsZero f32 -> counter _ -> testCounter rb (counter + 1) testCounterBool :: ByteString -> Int32 -> Bool testCounterBool rb counter = either (const False) firstBitsZero $ decodeFirst32 rb counter 

    With findValidCounter, I'm somewhat sure that your logic isn't 100% correct. According to the documentation, you check with increasing counters, but last . takeWhile p will take the last element for which p holds. Since p = not . testCounterBool ran, you will end up with the last element for which testCounterBool ran does not hold.

    Either way, if you look for the first element that holds a predicate, you can use find from Data.List instead:

    findValidCounter ran = fromJust $ find (testCounterBool ran) [1..] 

    Given the changes with get16RandomBytes, we end up with

    generateHeader :: IO String generateHeader = do (ran, _) <- fmap get16RandomBytes getSystemDRG let validCounter = findValidCounter ran let validHeader = getBaseString ran validCounter return $ headerPrefix ++ validHeader 
    \$\endgroup\$
    5
    • 1
      \$\begingroup\$I'll have to stop the review there, I'll try to review the rest on Saturday.\$\endgroup\$
      – Zeta
      CommentedJan 12, 2017 at 21:26
    • \$\begingroup\$This is such an informative answer. Thank you so much! I knew I could solve this with a bit mask but for some reason I wasnt sure if it would be faster but it makes sense when you explain it in terms of assembly.\$\endgroup\$CommentedJan 12, 2017 at 22:58
    • \$\begingroup\$I only understand the do syntax for monadic actions. Looks like I need to do more reading.\$\endgroup\$CommentedJan 12, 2017 at 23:45
    • \$\begingroup\$@maple_shaft: basically do { x <- foo; bar} is the same as foo >>= (\x -> bar), which sheds some light on the error you'll get if you use x <- foo as the last expression in a do block. I'll elaborate on Saturday.\$\endgroup\$
      – Zeta
      CommentedJan 13, 2017 at 5:49
    • \$\begingroup\$Small bug in your optimized bitMask expression, you need to swap the x and n arguments: bitMask = foldr (\x n -> setBit n x) zeroBits [0..(difficulty - 1)]\$\endgroup\$CommentedJan 13, 2017 at 13:52

    Start asking to get answers

    Find the answer to your question by asking.

    Ask question

    Explore related questions

    See similar questions with these tags.