List with hash collisions

11.02.2025

My friend and fellow blogger Sam recently published his  
third illustrated guide on hashing. There’s no real need to read his guide before reading this article, but I highly recommend you do so. If only to see the amazing animations that are hard to tear yourself away from. Honestly, they’re just amazing. Unfortunately, there are no animations here, so watch them in that article and then come back here. You can have some fun finding collisions in the MurmurHash3 hashing algorithm here.

I want to point out right away that I wrote this material purely for fun. Nothing I show here is intended for  production code . And here, as always,  is a link  to the GitHub repository where you can find the code for this article.

The idea for this post came about after I was asked to help with collision mining. I was then overcome with an overwhelming desire to know how many hashes per second I could squeeze out of the hardware available to me. In fact, here I will talk about searching for MurmurHash3 hash collisions at a speed of 200 gigahashes per second.

Snakes and Ladders

Sam kindly shared  a Python script for this . On my aging machine (i7-6700k), this script produced 3,311,080 hashes in 10 seconds:

import time
import mmh3
from uuid import uuid4

start = time.perf_counter()
inc = 0

while True:
val = str(uuid4())
h = mmh3.hash(val)
if h == 1228476406:
print(val)
stop = time.perf_counter()
duration = stop - start
inc = inc + 1
if duration >= 10:
print(inc)
break

This script worked for me for about half an hour, I did not find a single collision during this time. My working programming language is C#, so I decided to transfer the calculations to a more familiar environment.

Sharpening of tools

The first attempt  to rewrite this Python script in C#, without resorting to any tricks, gave 37759012 hashes in 10 seconds:

while (true)
{
var guidString = Guid.NewGuid().ToString();
var stringBytes = Encoding.ASCII.GetBytes(guidString);
var computeHash = murmurHash32.ComputeHash(stringBytes);
var int32 = BitConverter.ToInt32(computeHash);

if (int32 == 1228476406)
{
Console.WriteLine(guidString);
}
}

This is not bad, but I thought there were some things I could improve. First, I decided to switch from UTF8 to ASCII. This brought the speed up to 39656732 hashes in 10 seconds:

39,656,732 hashes per ten seconds
Took: 10,000 ms
Allocated: 7,435,745 kb
Peak Working Set: 26,180 kb
Gen 0 collections: 1820
Gen 1 collections: 1
Gen 2 collections: 0

I ran this version of the code in 4-6 copies and was able to find about twenty collisions in about half an hour.

Random bytes

Now we  use the method  to get the GUID v4Guid.NewGuid() . But we don’t really need the GUID v4. We’ll be fine with a value similar to this identifier:

byte[] guidBytes = new byte[16];
var random = new Random();

while (true)
{
    Array.Clear(guidBytes);
    random.NextBytes(guidBytes);

    var guidString = new Guid(guidBytes).ToString();
    // сокращено
}

By filling a byte array with random values ​​and constructing our GUID variant based on it, we were able to achieve a speed of 49788883 hashes in 10 seconds:

49,788,883 hashes per ten seconds
Took: 10,000 ms
Allocated: 9,335,517 kb
Peak Working Set: 26,416 kb
Gen 0 collections: 2285
Gen 1 collections: 1
Gen 2 collections: 0

And switching from a byte array to  Span<byte> gave another small, but pleasant, speed boost:

Span<byte> guidBytes = stackalloc byte[16];
var random = new Random();

while (true)
{
guidBytes.Clear();
random.NextBytes(guidBytes);

var guidString = new Guid(guidBytes).ToString();
// сокращено
}

Here’s how this version of the code performed:

53,015,764 hashes per ten seconds
Took: 10,016 ms
Allocated: 9,940,562 kb
Peak Working Set: 26,448 kb
Gen 0 collections: 2434
Gen 1 collections: 1
Gen 2 collections: 0

Faster implementation of hash function

You can get some more performance by converting the string into a structure  Span:

Span<byte> guidBytes = stackalloc byte[16];
Span<byte> stringBytes = stackalloc byte[36];

while (true)
{
guidBytes.Clear();
stringBytes.Clear();
random.NextBytes(guidBytes);

var guidString = new Guid(guidBytes).ToString().AsSpan();
Encoding.ASCII.GetBytes(guidString, stringBytes);
// сокращено
}

At this stage of our code development, the most significant improvement that could be made to it is to abandon the work with strings (because strings are  evil ). Currently, we use the implementation of the MurmurHash3 algorithm from the FastHashes library . Before getting rid of strings, let’s see if there are other implementations of this algorithm. To evaluate the performance of the code, we will use  BenchmarkDotNet  – this will ensure accurate and comparable research results.

Turning to Google, I immediately found  Oren Eini’s article , whose ideas were, in the comments, refined by  Stuart Lang . The code inspired by all this is what I called  OrenAndStu. Performance evaluation of my original code (A_FastHashes in the following table) and the new one (B_OrenAndStu) over a million iterations showed that the variant  OrenAndStu is much faster than the variant  FastHashes:

MethodAverage, msError, msStandard deviation, msGen0Selected
A_FastHashes18.2470.35970.41427625.000032000043 B
B_OrenAndStu8.6580.04830.04289 B

Now my task was to find an implementation of the algorithm that would be faster than  OrenAndStu. I was too lazy to do much work, so I just installed a few packages from  nuget  and let BenchmarkDotNet work:

MethodAverage,
ms
Error,
ms
Standard deviation, msGen0Selected
A_FastHashes17.8840.29830.27917625.000032000043 B
B_OrenAndStu8.6120.00990.00879 B
C_HashDepot12.1390.08680.07699 B
D_MurmurHash_Net12.5870.06770.06349 B
E_MurmurHash_net_core68.3010.61980.579815250.000064000123 B
F_System_Data_HashFunction_MurmurHash98.6311.95282.248968800.0000288000192 B
G_JeremyEspresso_MurmurHash7.8400.03710.0328

The only implementation of the algorithm that was faster than  OrenAndStuwas JeremyEspresso . Switching to it allowed to overcome the barrier of 60,000,000 hashes in 10 seconds:

60,358,517 hashes per ten seconds
Took: 10,000 ms
Allocated: 5,658,711 kb
Peak Working Set: 25,992 kb
Gen 0 collections: 1385
Gen 1 collections: 1
Gen 2 collections: 0

Bytes and nothing but bytes

If a program has a lot of lines (and therefore a lot of memory allocation operations), this is usually not very good for performance. Right now, we work like this:

  • Generate random bytes.
  • We convert what we get into a GUID.
  • Let’s convert this GUID to a string.
  • Extract bytes from the string representation of the GUID.

Ideally, we would work only with bytes all the time. Then we would be able to get rid of most of the memory allocation operations, and hopefully, we would be able to speed up the code. In order to do this, we need to look into the class responsible for working with GUIDs and borrow  a few  internal  methods from there :

Span<byte> guidBytes = stackalloc byte[16];
Span<byte> stringBytes = stackalloc byte[36];

while (true)
{
guidBytes.Clear();
stringBytes.Clear();
random.NextBytes(guidBytes);

Ugly.GuidBytesToRegularBytes(guidBytes, stringBytes);

ReadOnlySpan<byte> stringBytes1 = stringBytes;
var int32 = MurmurHash.MurmurHash3.Hash32(ref stringBytes1, 0);
}

With this approach, we will always work with bytes. This only slightly speeds up the code, but it also eliminates almost all memory allocation operations:

64,052,282 hashes per ten seconds
Took: 10,031 ms
Allocated: 110 kb
Peak Working Set: 21,224 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

Getting rid of GUID

Given that we are interested in finding any input data that has a specific hash (i.e., collisions), we don’t need to stick to the GUID. Any input data is fine. Let’s try generating an eight-byte array based on alphanumeric data ( a-z,  A-Z,  0-9) and see what happens:

Span<byte> stringBytes = stackalloc byte[8];
var chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789".Select(x => (byte)x).ToArray();

while (true)
{
stringBytes.Clear();
stringBytes[0] = chars[random.Next(0, chars.Length)];
stringBytes[1] = chars[random.Next(0, chars.Length)];
stringBytes[2] = chars[random.Next(0, chars.Length)];
stringBytes[3] = chars[random.Next(0, chars.Length)];
stringBytes[4] = chars[random.Next(0, chars.Length)];
stringBytes[5] = chars[random.Next(0, chars.Length)];
stringBytes[6] = chars[random.Next(0, chars.Length)];
stringBytes[7] = chars[random.Next(0, chars.Length)];
// сокращено
}

And this is what happened:

156,659,240 hashes per ten seconds
Took: 10,047 ms
Allocated: 102 kb
Peak Working Set: 22,132 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

What if we remove the alphanumeric data and limit ourselves to an eight-byte array filled, as before, with random values?

Span<byte> stringBytes = stackalloc byte[8];

while (true)
{
stringBytes.Clear();
random.NextBytes(stringBytes);
}

Here is the result:

329,339,000 hashes per ten seconds
Took: 10,031 ms
Allocated: 102 kb
Peak Working Set: 21,076 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

While we have achieved incredible performance, the many collisions we find are not particularly useful. The problem is that they are not easily (maybe impossible?) displayed on the screen. I will probably conclude that the “alphanumeric” version of the code is the best for practical use.

(Hash)cat: a cat among pigeons

So far, the bottleneck in our hashing system has been the CPU. But the thing is, these types of workloads are not really suited for CPUs. GPUs are the best choice for this type of work.  Hashcat  is a command-line-only program designed to crack password hashes (another interesting story: we used it in production to crack production passwords, but that’s a story for another time). It has a good  help file .

There is one subtlety in using Hashcat. The thing is that it expects the target value to be in a certain format. Usually, it is  hash:salt. Running the command  hashcat.exe --hash-info -m 27800shows that the value  hashcat, represented as plain text, has a hash of  23e93f65. This is a bit of a problem, because so far we have been using integers to represent hashes. If we run it  hashcat through one of the versions of our code, we get an integer value of  602488677. Then, to get the same string, we need to reverse the hashed byte array and generate a string representing the hexadecimal value:

var buffer = Encoding.ASCII.GetBytes("hashcat");
var computeHash = new FastHashes.MurmurHash32().ComputeHash(buffer);
Console.WriteLine(BitConverter.ToString(computeHash.Reverse().ToArray()).Replace("-", "").ToLower());
Console.WriteLine(BitConverter.ToInt32(computeHash, 0));

I’m pretty sure the array needs to be reversed because of  the byte order  of the system, so if you’re reproducing my experiments, make sure you have everything represented correctly. After doing the same thing for one of the known “integer” hash inputs ( 1228476406), we learned that the “hexadecimal” string we’re interested in is  49390ff6.

.\hashcat.exe -a 3 -m 27800 49390ff6:00000000 ?a?a?a?a?a?a?a?a --keep-guessing --runtime=10

My ancient  HD6950 video card  returned 31 input values ​​that had the hash collisions we were interested in. Hashcat also outputs statistics after it has finished:

Session..........: hashcat
Status...........: Aborted (Runtime)
Hash.Mode........: 27800 (MurmurHash3)
Hash.Target......: 49390ff6:00000000
Time.Started.....: Sun Jun 18 00:04:22 2023 (10 secs)
Time.Estimated...: Sun Jun 25 01:15:33 2023 (7 days, 1 hour; Runtime limit exceeded)
Kernel.Feature...: Optimized Kernel
Guess.Mask.......: ?a?a?a?a?a?a?a?a [8]
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 10892.3 MH/s (4.10ms) @ Accel:128 Loops:256 Thr:64 Vec:4
Recovered........: 0/1 (0.00%) Digests
Progress.........: 107105746944/6634204312890625 (0.00%)
Rejected.........: 0/107105746944 (0.00%)
Restore.Point....: 0/7737809375 (0.00%)
Restore.Sub.#1...: Salt:0 Amplifier:544512-544768 Iteration:0-256
Candidate.Engine.: Device Generator
Candidates.#1....: Uz#erane -> C3$B?tan

Rent GPU

Considering that my video card was made in 2010, I am sure that modern hardware will help improve the situation with the seven-day wait for results that Hashcat warned us about:  Time.Estimated...: Sun Jun 25 01:15:33 2023 (7 days, 1 hour; Runtime limit exceeded). I decided to rent an RTX 3090 from  genesiscloud.com  (I don’t want to hide anything – this is a referral link, using it you will get bonuses of $ 50). Creating new instances in this service is very easy, but installing Hashcat on Ubuntu was not such a simple task. The thing is that I mainly work in Windows. But I still managed to get everything running (I had to, I admit, copy-paste a lot). As a result, the same code already produced 433 input values.

Session..........: hashcat
Status...........: Aborted (Runtime)
Hash.Mode........: 27800 (MurmurHash3)
Hash.Target......: 49390ff6:00000000
Time.Started.....: Sat Jun 17 23:21:21 2023 (10 secs)
Time.Estimated...: Sun Jun 18 07:57:16 2023 (8 hours, 35 mins; Runtime limit exceeded)
Kernel.Feature...: Optimized Kernel
Guess.Mask.......: ?a?a?a?a?a?a?a?a [8]
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:   214.3 GH/s (6.32ms) @ Accel:64 Loops:1024 Thr:256 Vec:1
Recovered........: 0/1 (0.00%) Digests
Progress.........: 2112133758976/6634204312890625 (0.03%)
Rejected.........: 0/2112133758976 (0.00%)
Restore.Point....: 1343488/7737809375 (0.02%)
Restore.Sub.#1...: Salt:0 Amplifier:714752-715776 Iteration:0-1024
Candidate.Engine.: Device Generator
Candidates.#1....: UcX}eTON -> xw(>N223
Hardware.Mon.#1..: Temp: 49c Fan:  0% Util: 99% Core:1890MHz Mem:9501MHz Bus:16

The moral of the story is: if you need to find hash collisions (for non-cryptographic hash algorithms), don’t bother writing your own code. Just call (Hash)cat.