Custom Non-Crypto Hashing Algorithm

Question

Rule of Thumb: Don't roll your own encryption--I'm no cryptographer obviously. Also note my apologies for my awful C-style C++.

For a fun little project I've written my own hash algorithm as a practice project. The goal was to write an algorithm that took an input digest and gave me a generally unique output hash in the scope of 1024 bits per hash. I believe I've achieved something along those lines. On top of that the idea was to base the hash on as much of the digest as possible--theoretically making the hash more unique.

As far as I can tell I've accomplished two things:

The avalanche affect.
Basic level: general uniqueness (to some degree however much or little that may be actually turn out to be...).

The idea behind the algorithm is pretty simple. I stole some things from SHA-512 (SIGMA0/1, ROT macros and prime cuberoots). From there I setup the algorithm in two steps where I largely based the the hash on the data in the actual digest. All rotations and iterations in the hash are based on the data in the digest. The idea was that it'd be harder to find a collision if more of the hash is based on the actual digest.

/* XOR: exclusive-or. SUB: subtract ADD: add ROT: rotate N/NOT: bit-wise NOT. R/L: right/left. FBA: Fast byte/bit alignment of powers of 2. */ #ifndef CUBE_HASH #define CUBE_HASH // Align some data by n (where n is some power of 2). #define FBA(i,n) ((i + (n-1)) & ~(n-1)) // Rotate bits and also bitwise NOT rotated bits by a rotation of 0 <= N < 64. #define NROTR(i,n) ~((i >> n) | (i << (sizeof(__uint64) - n))) #define NROTL(i,n) ~((i << n) | (i >> (sizeof(__uint64) - n))) #define ROTR(i,n) ((i >> n) | (i << (sizeof(__uint64) - n))) #define ROTL(i,n) ((i << n) | (i >> (sizeof(__uint64) - n))) // MOD(i,p) returns i MOD 2^p and does not work on negative numbers. #define MOD(i,p) (i & ((1 << p) - 1)) // SEE: SHA-512 for SGMA0, SIGMA1 uses ~ROTR for it's first rotation (SHA-512 doesn't). #define SGMA0(x) (ROTR(x,7) ^ ROTR(x,18) ^ ROTR(x,3)) #define SGMA1(x) (NROTR(x,28) ^ ROTR(x,34) ^ ROTR(x,39)) #define SGMA2(x) (NROTR(x,25) ^ ROTL(x,43) ^ NROTR(x,9)) typedef unsigned __int64 __uint64; /* The 64 fractional parts of the cuberoots of the first 64 prime numbers. */ const __uint64 fractional_cuberoots[64]{ 0x428a2f98d728ae22, 0x7137449123ef65cd, 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc, 0x3956c25bf348b538, 0x59f111f1b605d019, 0x923f82a4af194f9b, 0xab1c5ed5da6d8118, 0xd807aa98a3030242, 0x12835b0145706fbe, 0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2, 0x72be5d74f27b896f, 0x80deb1fe3b1696b1, 0x9bdc06a725c71235, 0xc19bf174cf692694, 0xe49b69c19ef14ad2, 0xefbe4786384f25e3, 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65, 0x2de92c6f592b0275, 0x4a7484aa6ea6e483, 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5, 0x983e5152ee66dfab, 0xa831c66d2db43210, 0xb00327c898fb213f, 0xbf597fc7beef0ee4, 0xc6e00bf33da88fc2, 0xd5a79147930aa725, 0x06ca6351e003826f, 0x142929670a0e6e70, 0x27b70a8546d22ffc, 0x2e1b21385c26c926, 0x4d2c6dfc5ac42aed, 0x53380d139d95b3df, 0x650a73548baf63de, 0x766a0abb3c77b2a8, 0x81c2c92e47edaee6, 0x92722c851482353b, 0xa2bfe8a14cf10364, 0xa81a664bbc423001, 0xc24b8b70d0f89791, 0xc76c51a30654be30, 0xd192e819d6ef5218, 0xd69906245565a910, 0xf40e35855771202a, 0x106aa07032bbd1b8, 0x19a4c116b8d2d0c8, 0x1e376c085141ab53, 0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8, 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb, 0x5b9cca4f7763e373, 0x682e6ff3d6b2b8a3, 0x748f82ee5defb2fc, 0x78a5636f43172f60, 0x84c87814a1f0ab72, 0x8cc702081a6439ec, 0x90befffa23631e28, 0xa4506cebde82bde9, 0xbef9a3f7b2c67915, 0xc67178f2e372532b }; /* The 16 fractional parts of the cuberoots of the first 64 prime numbers. */ const __uint64 fractional_cubehash[16]{ 0x6a09e667f3bcc908, 0xbb67ae8584caa73b, 0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1, 0x510e527fade682d1, 0x9b05688c2b3e6c1f, 0x1f83d9abfb41bd6b, 0x5be0cd19137e2179, 0xcbbb9d5dc1059ed8, 0x629a292a367cd507, 0x9159015a3070dd17, 0x152fecd8f70e5939, 0x67332667ffc00b31, 0x8eb44a8768581511, 0xdb0c2e0d64f98fa7, 0x47b5481dbefa4fa4 }; class CUBE1024 { public: static __uint64* hash(const __int8* message) { /* NOTE: MOD(i,p) because the macro returns: (i % (2^p)) in bitwise (i & ((1 << p) - 1)). Find the digets length aligned to 1024 bits--128 bytes. */ size_t message_length = strlen(message), digest_length = FBA(message_length + 8, 128); /* Make sure both arrays rae zero'd out: (). Build the array of 16--64-bit base hash values. Allocate the digest. */ __uint64* cube_hash = new __uint64[16](); memcpy(cube_hash, fractional_cubehash, 16); /* Allocate the digest and copy the message into the digest array and append the ~length of the message to the end. */ __uint64* digest = new __uint64[digest_length](); memcpy(digest, message, message_length); digest[digest_length - 1] = NROTR(message_length, MOD(message_length, 3)); size_t z; for (size_t i = 0; i < digest_length; i++) { /* Z: (i + 1) if Z > digest_length, Z wraps back to 0 due to modulo digest_length. Two Steps: XOR the current digest block with the current digets block NOTR of by N-bits which is then SUB by the the SIGMA0 of the current digets block. The current digest block digested the next(i + 1) digest block in the previous step. We don't want to use this same data again. So we ADD the current digest block by it's XOR of the NOT of the SIGMA0 of the current cube_hash where the current cube_hash is i % 16 (the cube hash array is only 16 entries long, so we wrap i). */ z = MOD(i + 1, digest_length); digest[i] ^= NROTR(digest[z], MOD(digest[i], z)) - SGMA0(digest[i]); digest[z] += digest[i] ^ ~SGMA1(cube_hash[MOD(i, 4)]); } size_t x, y; for (size_t i = 0; i < 64; i++) { /* NOTE: We previosuly loaded in all of the fractional_cubehash[0...15] in the cube_hash array before the loop. X: (i) if X > 16, X wraps back to 0 due to modulo 16 (1 << 4). Y: i,n) if y > V, Y wraps back to 0 due to modulo 64. Where V is the ROTR of the cube_hash[x] % B. Where B is the cube[16 - x] % 64. Three Steps: XOR the current cube_hash by the the NROTL of the fractional_cuberoots[y] OR'd by the constant fractional_cubehash[x]. We previously consumed the NROTL of the cuberoot, so now we'll use the NOT of the cuberoot. So we ADD the NOT of the fractional_cuberoot[y] ADD SIGMA2 of the current digest[V] Where V is the i % digest_length. NOTE: if the digest_length > 64, we may not consume the whole digets in this loop. Finally XOR the NROTL of the cube_hash[x] by 2-bytes ADD the opposite fracitonal_cuberoot[64 - i]. */ x = MOD(i, 4); y = MOD(ROTR(cube_hash[x], MOD(cube_hash[16 - x], 6)), 6); cube_hash[x] ^= NROTL(fractional_cuberoots[y], y) | fractional_cubehash[x]; cube_hash[x] += ~fractional_cuberoots[y] + SGMA2(digest[MOD(i, digest_length)]); cube_hash[x] ^= NROTL(cube_hash[x], 16) + fractional_cuberoots[64 - i]; } /* Cleanup the digest. */ delete[] digest; return cube_hash; }; }; #endif

What language version are you using? Can you use something current like C++14 or are you bound to a specific version? Would you be open to using the standard library, or are you intentionally avoiding everything except strlen() and memcpy()? — amon, CommentedFeb 28, 2017 at 10:12
<cstdint> defines fixed width integers which, although not the prettiest, still look better and are more portable than the ones with two leading underscores used in your code. — D. Jurcau, CommentedFeb 28, 2017 at 18:19
Ill add it to my list of fixes, Ill redo the code and use more standardized and portable approaches when an answer is checked off. — Kayle, CommentedFeb 28, 2017 at 18:27

Loki Astari · Accepted Answer · 2017-02-28 20:26:33Z

Cant comment on your algorithm as I don't have enough knowledge about hashes to comment.

Code Wise.

Don't use types with double underscore. __int64. These are implementation specific types that are not designed for use by general programmers. And definitely do not define your own identifiers with a double underscore.

 typedef unsigned __int64 __uint64;

The standard (Since C++11 (so 6 years now)) has supported 64 bit integer types directly in <cstdint> You can use: uint64_t.

Don't use macros where function can be used.

#define NROTR(i,n) ~((i >> n) | (i << (sizeof(__uint64) - n)))

You can define a function and it will be exactly as efficient as a macro.

uint64_t nrotr(uint64_t i, int n) { return ~((i >> n) | (i << (sizeof(uint64_t) - n))); }

Don't define class names that are all uppercase.

class CUBE1024

These identifiers are traditionally reserved for macros. Don't use their namespace it dangerous.

Interface Semantics

What is the meaning of this interface?

static __uint64* hash(const __int8* message)

Do I pass dynamically allocated memory (as you will free it for me) as the message or can I pass the address of a local variable (then you better not free or delete it)?

The return value has this been dynamically allocated? Am I supposed to free/delete it when I am finished or is this an internal buffer in your class that I should make a copy of before the next call to this function.

After reading the function. I think a better interface would be:

static std::unique_ptr<uint64_t[]> hash(std::array_view const& message); // Actually got that wrong. Which shows how important // that the interface is correct is. // What you really want is: // You are returned a reference to the buffer. It is not yours // If you want to keep it you need to copy the content. static std::array_view const& hash(std::array_view const& message);

Prefer to use C++ standard library over the C standard library

Do you really want to use this?

memcpy(cube_hash, fractional_cubehash, 16);

You don't think this is more readable?

std::copy(fractional_cubehash, fractional_cubehash + 16, cube_hash);

Don't use new/delete

There are all sorts of issues with memory management lets not make things worse by doing the memory management manually.

 __uint64* digest = new __uint64[digest_length](); // Lots of stuff // It was all convoluted but it could quite easily contain a // a return added by a maintainer. delete[] digest;

Prefer to use the standard library which does all that memory management for you. It does it correctly no matter what people do in the code so it is hard to break.

 std::vector<uint64_t> digest(digest_length);

That's it. Not complex. Use it.

Modern Language constructs

Prefer to use the modern range based for()

for (size_t i = 0; i < digest_length; i++) {

Easier to write and read when written like this:

for(auto const& value: digest) {

Tob Ernack · Accepted Answer · 2017-02-28 22:54:19Z

Some general comments about the code:

/* The 16 fractional parts of the cuberoots of the first 64 prime numbers. */ const __uint64 fractional_cubehash[16]{

Is perhaps the comment / array name misleading? The fractional_cuberoots[64] array supposedly contains the fractional parts of the cuberoots of the first 64 prime numbers, and fractional_cubehash[16] contains the 16 first. But the values look totally different from those in fractional_cuberoots[64]. So how is the second array generated?

Also, in this line:
memcpy(cube_hash, fractional_cubehash, 16);
You are only copying 16 bytes, instead of 16 entries of __uint64. If your intent was to copy the entire array, you would need to use
memcpy(cube_hash, fractional_cubehash, 16 * sizeof (__uint64));

I don't have time to delve into the actual hashing algorithm right now, but I might edit my answer to add more details when I have a look.

EDIT
Ok, there are actually a lot more problems that don't involve specifically hashing.

Your usage of the MOD function is incorrect given its definition. You often calculate things like MOD(i + 1, digest_length) or MOD(digest[i], z) where the second argument will likely be larger than 64 (the digest_length is calculated as digest_length = FBA(message_length + 8, 128);, so for messages of more than 64 bytes). In this case the 1 << p in the MOD(i, p) macro gets shifted past its bit size and becomes 0: #define MOD(i,p) (i & ((1 << p) - 1)). Note that if p >= 64, then 1 << p is 0 and the macro results in i & (-1) = i. (In fact... since you are actually shifting the constant '1', this might become 0 even when p >= 32, sorry for the confusion here).

There might also be similar issues with your use of the other *ROT* macros since the shifts could well be beyond the bit sizes used. And also, for example this line:
#define ROTL(i,n) ((i << n) | (i >> (sizeof(__uint64) - n)))
You use sizeof(__uint64) as a bit size while this returns bytes...

Also you are allocating the digest as __uint64* digest = new __uint64[digest_length]();. Thus the digest length will in fact be 8 times larger than what you might expect from the digest_length variable (if you were intending to use that as byte length, otherwise ignore this comment).

This is also inconsistent with your usage of message_length, which contains the length of the message in bytes, while digest_length is counting in 8-byte chunks.

There are lots of these types of confusions in the code, so I suggest resolving these inconsistencies before anything else since it is hard to understand what the hashing does if such details are unclear.

You're correct, the memcpy issue was a type indeed! Ill note that for the changes I need to make. I'll check back if you find anything else wrong. — Kayle, CommentedFeb 28, 2017 at 18:08
I actually mentioned that I am using modulo specifically for the wrapping effect, see the comments the first comment in the loop for(size_t i = 0; i < digest_length;...). So my usage here is purely intentional. — Kayle, CommentedMar 1, 2017 at 2:12
Let's say i = 127 and digest_length = 128. What do you expect MOD(i + 1, digest_length) to evaluate to? The macro expands as i + 1 & ((1 << digest_length) - 1) = 128 & ((1 << 128) - 1) = 128 & (0 - 1) = 128 & (-1) = 128. In fact it will always evaluate to i + 1, since i + 1 & (-1) equals i + 1. So it's effectively a NOP. — Tob Ernack, CommentedMar 1, 2017 at 3:24
I will also provide an example for my comment about the ROT macros. You have this expression: NROTR(digest[z], MOD(digest[i], z)). Let's say we are still in that loop where i = 127, z = 128 (from the above). Then MOD(digest[i], z) will evaluate to digest[127]. Now suppose digest[127] >= 64 (let's say equals 128 again). Then NROTR(digest[z], digest[127]) = NROTR(digest[128], 128) = ~((digest[128] >> 128) | (digest[128] << (sizeof(__uint64) - 128))) = ~((0) | (digest[128] << (8 - 128))) = ~(digest[128] << (-120)). Negative left-shifts are undefined, and this is probably not what you indended. — Tob Ernack, CommentedMar 1, 2017 at 3:32
At any rate, a bitwise left-shift of -120 bits will either be undefined behaviour, or reasonably evaluate to 0, and the final bitwise NOT will evaluate to 0xffffffffffffffff. If this is what you intended then I guess this remark is not going to be helpful, but I suspect this will cause the hashing function to have some very predictable behaviour in certain circumstances. — Tob Ernack, CommentedMar 1, 2017 at 3:44

D. Jurcau · Accepted Answer · 2017-02-28 18:39:18Z

Memory Management

In your function, you allocate memory for the result and return a pointer to it. This passes the responsibility of deallocating the memory to the caller which must also be informed how to do that, as new is not the only way of allocating memory.

I would prefer the function receiving a large enough buffer (fixed-size array) in which it is to store the result, thus leaving it up to the caller to provide the memory (could just as well be on the stack).

For memory you only temporary allocate in the function and then free, consider using a std::unique_ptr. It relieves you of the burder of having to remember to free the memory and also does it for you should an exception interrupt your function's normal flow.

Determining the length of the input

Hashes should not make assumptions about the input such that the input is a null terminating string. Consider requesting the size of the input from the caller

Static member function

Instead of a class with just a single, static function, consider simply providing a function.

Macros

There's no need for macros in this code. Inline functions serve a much better purpose and don't have all the downsides of using macros.

Prefer: std::vector<> over std::unique_ptr<> if you are not going to return the value. — Loki Astari, CommentedFeb 28, 2017 at 20:28

smac89 · Accepted Answer · 2017-02-28 22:50:55Z

Use C++

If your goal was to write this in C++ and with C++14 especially, you should atleast make the attempt to have some C++-specific code in there.

cube_hash.hpp

#pragma once #include <type_traits> #include <string> #include <cstdint> #include <cstring> #include <vector> #include <array> namespace cube_hash { class BitManipPolicy { public: virtual uint64_t fast_bit_align(uint64_t n, uint64_t p2) const = 0; virtual uint64_t not_rotate_right(uint64_t n, uint64_t p2) const = 0; virtual uint64_t not_rotate_left(uint64_t n, uint64_t p2) const = 0; virtual uint64_t rotate_right(uint64_t n, uint64_t p2) const = 0; virtual uint64_t rotate_left(uint64_t n, uint64_t p2) const = 0; virtual uint64_t mod(uint64_t n, uint64_t p2) const = 0; virtual uint64_t sgma0(uint64_t x) const = 0; virtual uint64_t sgma1(uint64_t x) const = 0; virtual uint64_t sgma2(uint64_t x) const = 0; }; namespace detail { struct DefaultBitManipPolicy : public BitManipPolicy { uint64_t fast_bit_align(const uint64_t i, uint64_t n) const; uint64_t not_rotate_right(uint64_t i, uint64_t n) const; uint64_t not_rotate_left(uint64_t i, uint64_t n) const; uint64_t rotate_right(uint64_t i, uint64_t n) const; uint64_t rotate_left(uint64_t i, uint64_t n) const; uint64_t mod(uint64_t i, uint64_t p) const; uint64_t sgma0(uint64_t x) const; uint64_t sgma1(uint64_t x) const; uint64_t sgma2(uint64_t x) const; }; } /* The 64 fractional parts of the cuberoots of the first 64 prime numbers. */ constexpr std::array<uint64_t, 64> fractional_cuberoots { 0x428a2f98d728ae22, 0x7137449123ef65cd, 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc, 0x3956c25bf348b538, 0x59f111f1b605d019, 0x923f82a4af194f9b, 0xab1c5ed5da6d8118, 0xd807aa98a3030242, 0x12835b0145706fbe, 0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2, 0x72be5d74f27b896f, 0x80deb1fe3b1696b1, 0x9bdc06a725c71235, 0xc19bf174cf692694, 0xe49b69c19ef14ad2, 0xefbe4786384f25e3, 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65, 0x2de92c6f592b0275, 0x4a7484aa6ea6e483, 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5, 0x983e5152ee66dfab, 0xa831c66d2db43210, 0xb00327c898fb213f, 0xbf597fc7beef0ee4, 0xc6e00bf33da88fc2, 0xd5a79147930aa725, 0x06ca6351e003826f, 0x142929670a0e6e70, 0x27b70a8546d22ffc, 0x2e1b21385c26c926, 0x4d2c6dfc5ac42aed, 0x53380d139d95b3df, 0x650a73548baf63de, 0x766a0abb3c77b2a8, 0x81c2c92e47edaee6, 0x92722c851482353b, 0xa2bfe8a14cf10364, 0xa81a664bbc423001, 0xc24b8b70d0f89791, 0xc76c51a30654be30, 0xd192e819d6ef5218, 0xd69906245565a910, 0xf40e35855771202a, 0x106aa07032bbd1b8, 0x19a4c116b8d2d0c8, 0x1e376c085141ab53, 0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8, 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb, 0x5b9cca4f7763e373, 0x682e6ff3d6b2b8a3, 0x748f82ee5defb2fc, 0x78a5636f43172f60, 0x84c87814a1f0ab72, 0x8cc702081a6439ec, 0x90befffa23631e28, 0xa4506cebde82bde9, 0xbef9a3f7b2c67915, 0xc67178f2e372532b }; /* The 16 fractional parts of the cuberoots of the first 64 prime numbers. */ constexpr std::array<uint64_t, 16> fractional_cubehash { 0x6a09e667f3bcc908, 0xbb67ae8584caa73b, 0x3c6ef372fe94f82b, 0xa54ff53a5f1d36f1, 0x510e527fade682d1, 0x9b05688c2b3e6c1f, 0x1f83d9abfb41bd6b, 0x5be0cd19137e2179, 0xcbbb9d5dc1059ed8, 0x629a292a367cd507, 0x9159015a3070dd17, 0x152fecd8f70e5939, 0x67332667ffc00b31, 0x8eb44a8768581511, 0xdb0c2e0d64f98fa7, 0x47b5481dbefa4fa4 }; template <typename T = detail::DefaultBitManipPolicy, typename std::enable_if<std::is_base_of<BitManipPolicy, T>::value>::type* = nullptr> uint64_t* hash(const std::string &message) { T bitManip; /* NOTE: MOD(i,p) because the macro returns: (i % (2^p)) in bitwise (i & ((1 << p) - 1)). Find the digets length aligned to 1024 bits--128 bytes. */ size_t message_length = message.length(); size_t digest_length = bitManip.fast_bit_align(message_length + 8, 128); /* Make sure both arrays rae zero'd out: (). Build the array of 16--64-bit base hash values. Allocate the digest. */ uint64_t* cube_hash = new uint64_t[16](); memcpy(cube_hash, fractional_cubehash.data(), fractional_cubehash.size()); /* Allocate the digest and copy the message into the digest array and append the ~length of the message to the end. */ std::vector<uint64_t> digest(digest_length); memcpy(digest.data(), message.c_str(), message_length); digest[digest_length - 1] = bitManip.not_rotate_right(message_length, bitManip.mod(message_length, 3)); size_t z; for (size_t i = 0; i < digest_length; i++) { /* Z: (i + 1) if Z > digest_length, Z wraps back to 0 due to modulo digest_length. Two Steps: XOR the current digest block with the current digets block NOTR of by N-bits which is then SUB by the the SIGMA0 of the current digets block. The current digest block digested the next(i + 1) digest block in the previous step. We don't want to use this same data again. So we ADD the current digest block by it's XOR of the NOT of the SIGMA0 of the current cube_hash where the current cube_hash is i % 16 (the cube hash array is only 16 entries long, so we wrap i). */ z = bitManip.mod(i + 1, digest_length); digest[i] ^= bitManip.not_rotate_right(digest[z], bitManip.mod(digest[i], z)) - bitManip.sgma0(digest[i]); digest[z] += digest[i] ^ ~bitManip.sgma1(cube_hash[bitManip.mod(i, 4)]); } size_t x, y; for (size_t i = 0; i < 64; i++) { /* NOTE: We previosuly loaded in all of the fractional_cubehash[0...15] in the cube_hash array before the loop. X: (i) if X > 16, X wraps back to 0 due to modulo 16 (1 << 4). Y: i,n) if y > V, Y wraps back to 0 due to modulo 64. Where V is the ROTR of the cube_hash[x] % B. Where B is the cube[16 - x] % 64. Three Steps: XOR the current cube_hash by the the NROTL of the fractional_cuberoots[y] OR'd by the constant fractional_cubehash[x]. We previously consumed the NROTL of the cuberoot, so now we'll use the NOT of the cuberoot. So we ADD the NOT of the fractional_cuberoot[y] ADD SIGMA2 of the current digest[V] Where V is the i % digest_length. NOTE: if the digest_length > 64, we may not consume the whole digets in this loop. Finally XOR the NROTL of the cube_hash[x] by 2-bytes ADD the opposite fracitonal_cuberoot[64 - i]. */ x = bitManip.mod(i, 4); y = bitManip.mod(bitManip.rotate_right(cube_hash[x], bitManip.mod(cube_hash[16 - x], 6)), 6); cube_hash[x] ^= bitManip.not_rotate_left(fractional_cuberoots[y], y) | fractional_cubehash[x]; cube_hash[x] += ~fractional_cuberoots[y] + bitManip.sgma2(digest[bitManip.mod(i, digest_length)]); cube_hash[x] ^= bitManip.not_rotate_left(cube_hash[x], 16) + fractional_cuberoots[64 - i]; } return cube_hash; } }

cube_hash.cpp

#include <cstdint> #include "cube_hash.hpp" namespace cube_hash { namespace detail { inline uint64_t DefaultBitManipPolicy::fast_bit_align(const uint64_t i, uint64_t n) const { return ~((i >> n) | (i << (sizeof(uint64_t) - n))); } inline uint64_t DefaultBitManipPolicy::not_rotate_right(uint64_t i, uint64_t n) const { return ~((i >> n) | (i << (sizeof(uint64_t) - n))); } inline uint64_t DefaultBitManipPolicy::not_rotate_left(uint64_t i, uint64_t n) const { return ~((i >> n) | (i << (sizeof(uint64_t) - n))); } inline uint64_t DefaultBitManipPolicy::rotate_right(uint64_t i, uint64_t n) const { return (i >> n) | (i << (sizeof(uint64_t) - n)); } inline uint64_t DefaultBitManipPolicy::rotate_left(uint64_t i, uint64_t n) const { return (i << n) | (i >> (sizeof(uint64_t) - n)); } inline uint64_t DefaultBitManipPolicy::mod(uint64_t i, uint64_t p) const { return i & ((1 << p) - 1); } inline uint64_t DefaultBitManipPolicy::sgma0(uint64_t x) const { return rotate_right(x,7) ^ rotate_right(x,18) ^ rotate_right(x,3); } inline uint64_t DefaultBitManipPolicy::sgma1(uint64_t x) const { return not_rotate_right(x,28) ^ rotate_right(x,34) ^ not_rotate_right(x,39); } inline uint64_t DefaultBitManipPolicy::sgma2(uint64_t x) const { return not_rotate_right(x,25) ^ rotate_left(x,43) ^ not_rotate_right(x,9); } } }

This is quite the attempt and although not complete, it shows what you can do with C++ without going as far as C++14.

Use of Policy

You can think of the use of policies as a way to help you easily modify your algorithm on the fly for different endian machines. I believe the way your algorithm works now, it conforms to little-endian architectures, but you can create a class that implements the interface BitManipPolicy for big-endian machines and call the function with that class and everything should still work as intended. Policy can even help you change the way the code works.

Example usage:

struct MyBigEndianPolicy: public cube_hash::BitManipPolicy { ... }; cube_hash::hash<MyBigEndianPolicy>("A loooott of bytess. Big chunk of data");

What you did right

Separating the bit shifts used in the algorithm into different macros is definitely good programming practice (although functions would have been preferred), so kudos on that

I'd actually avoid endian-ness. My code is not endian-dependant. Endian-ness is handled by the compiler via the machine's architecture IIRC. Never pursue endian-ness. — Kayle, CommentedMar 3, 2017 at 4:34

Stack Exchange Network

Custom Non-Crypto Hashing Algorithm

4 Answers 4

Code Wise.

Interface Semantics

Prefer to use C++ standard library over the C standard library

Don't use new/delete

Modern Language constructs

Use C++

Use of Policy

Example usage:

What you did right

Hot Network Questions

Custom Non-Crypto Hashing Algorithm

4 Answers 4

Code Wise.

Interface Semantics

Prefer to use C++ standard library over the C standard library

Don't use new/delete

Modern Language constructs

Use C++

Use of Policy

Example usage:

What you did right

Related

Hot Network Questions