What the fork is a Merkle Tree?

Nov 12, 2022

NFT whitelists, allowlists and private sales have become essential tools for launching and managing the mint of an NFT project. They are also a way to reward your community’s initial supporters and active users, while being an effective instrument to prevent fraud. We recently tried storing a whitelist on-chain and found it to be quite gas-expensive (case in point: 120 WL ETH addresses = 0.89 GoerliETH 😱). So instead of storing all users’ data on-chain, we can generate a Merkle tree from the users’ data and use that instead. From that tree, we store only the tree’s root (32 bytes) on-chain, and each whitelisted user proves they are on the list by presenting their leaf together with a Merkle proof that is checked against the root. Hopefully it will save a ton of gas fees!

Hashing

Hash functions are mathematical algorithms that take an input of any size and produce a fixed-length output. The same input always gives the same output, and for a good cryptographic hash function it is practically impossible to find two different inputs with the same output. Hashing is the process of transforming a given key or string of characters into such a value, which is usually shorter than the original and makes it easier to find or reference the original string. The most popular use for hashing is the implementation of hash tables.

Some of the most common hashing functions are MD5, SHA-3, and SHA-256 — the last of which is used by Bitcoin.
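To make the "fixed-length output" idea concrete, here is a small sketch using Python's standard hashlib module. The helper name sha256_hex and the sample strings are just placeholders for illustration.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short = sha256_hex("abc")           # a 3-character input
long_ = sha256_hex("abc" * 1000)    # a 3000-character input
changed = sha256_hex("abd")         # one letter different from "abc"

print(short)
print(len(short), len(long_))       # both digests have the same length
print(short == changed)             # a tiny change gives a different digest
```

Whatever the input size, the digest is always 32 bytes (64 hex characters), which is why a single hash can stand in for a much larger piece of data.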

Hash: The Puzzle of Bitcoin (What exactly are the miners mining?)

Bitcoin mining is a key part of the security of the Bitcoin system. The idea is that Bitcoin miners group a bunch of Bitcoin transactions into a block, then repeatedly perform a cryptographic operation called hashing zillions of times until someone finds a special, extremely rare hash value. At this point, the block has been mined and becomes part of the Bitcoin blockchain. The hashing task doesn’t accomplish anything useful in itself, but because finding a successful block is so difficult, it ensures that no individual has the resources to take over the Bitcoin system. For more details on mining, see my Bitcoin mining article.

A cryptographic hash function takes a block of input data and creates a smaller, unpredictable output. The hash function is designed so there’s no “short cut” to get the desired output — you just have to keep hashing blocks until you find one by brute force that works. For Bitcoin, the hash function is a function called SHA-256. To provide additional security, Bitcoin applies the SHA-256 function twice, a process known as double-SHA-256.

In Bitcoin, a successful hash is one that starts with enough zeros.[1] Just as it is rare to find a phone number or license plate ending in multiple zeros, it is rare to find a hash starting with multiple zeros. But Bitcoin is exponentially harder. Currently, a successful hash must start with approximately 17 zeros, so only one out of 1.4 × 10^20 hashes will be successful. In other words, finding a successful hash is harder than finding a particular grain of sand out of all the grains of sand on Earth.

The following diagram shows a block in the Bitcoin blockchain along with its hash. The yellow bytes are hashed to generate the block hash. In this case, the resulting hash starts with enough zeros so mining was successful. However, the hash will almost always be unsuccessful. In that case, the miner changes the nonce value or other block contents and tries again.
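The try-a-nonce-and-hash-again loop can be sketched in a few lines of Python. This is a toy, not real Bitcoin mining: the block contents are a made-up byte string rather than an 80-byte block header, the difficulty is only 4 leading hex zeros, and the check is a simple string prefix test instead of Bitcoin's numeric target comparison. The names double_sha256 and mine are hypothetical helpers.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice: hash the input, then hash the 32-byte digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(block_data: bytes, zero_hex_digits: int, max_nonce: int = 2**32):
    """Try nonces 0, 1, 2, ... until the hash starts with enough zero hex digits.

    Returns (nonce, hash_hex) on success, or None if no nonce below
    max_nonce works.
    """
    prefix = "0" * zero_hex_digits
    for nonce in range(max_nonce):
        # Toy assumption: the nonce is simply appended as 4 little-endian bytes.
        candidate = block_data + nonce.to_bytes(4, "little")
        digest_hex = double_sha256(candidate).hex()
        if digest_hex.startswith(prefix):
            return nonce, digest_hex
    return None

block_data = b"toy block: alice pays bob 1 BTC"
result = mine(block_data, zero_hex_digits=4)
print(result)
```

Each extra hex zero makes the search about 16 times longer on average, which gives a feel for why 17 or so zeros takes a whole network of specialized hardware.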

What are Merkle Trees?

The hash tree is named after Ralph Merkle, who patented it in 1979. He’s one of the inventors of public-key cryptography and the inventor of cryptographic hashing. He described the hash tree, or Merkle tree, in his Ph.D. thesis entitled: SECRECY, AUTHENTICATION, AND PUBLIC KEY SYSTEMS.

Merkle trees are essentially a way to quickly verify data integrity in a set. For example, if you want to check whether a certain piece of data, such as an ETH address, is in a list, you can use the Merkle root and a Merkle proof to confirm that it is present without having to see all of the data in the list.
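Here is a minimal sketch of what that check could look like, assuming a tiny four-entry list with a hand-built tree. The addresses are placeholders, and SHA-256 is used only because it ships with Python's standard library (Ethereum contracts typically hash with keccak256 instead). The proof format, a list of (sibling hash, side) pairs, is also an assumption made for this example.

```python
import hashlib
from typing import List, Tuple

def h(data: bytes) -> bytes:
    """SHA-256 digest, used here as the tree's hash function."""
    return hashlib.sha256(data).digest()

def verify_proof(leaf: bytes, proof: List[Tuple[bytes, str]], root: bytes) -> bool:
    """Recompute the path from a leaf up to the root using only sibling hashes.

    Each proof step is (sibling_hash, side), where side is "left" or "right"
    and tells us on which side of the current node the sibling sits.
    """
    current = h(leaf)
    for sibling, side in proof:
        if side == "left":
            current = h(sibling + current)
        else:
            current = h(current + sibling)
    return current == root

# Hand-built tree over four placeholder addresses.
addresses = [b"0xAAA", b"0xBBB", b"0xCCC", b"0xDDD"]
l0, l1, l2, l3 = (h(a) for a in addresses)
n01, n23 = h(l0 + l1), h(l2 + l3)
root = h(n01 + n23)

# Proof for 0xCCC: its sibling leaf (right), then the other branch (left).
proof_for_ccc = [(l3, "right"), (n01, "left")]

print(verify_proof(b"0xCCC", proof_for_ccc, root))  # True
print(verify_proof(b"0xEEE", proof_for_ccc, root))  # False
```

The verifier only needs the root plus two sibling hashes, not the other three addresses. For a list of n entries the proof has roughly log2(n) hashes, which is what keeps the on-chain cost low.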

How do Merkle Trees work?

First you take a list of data and apply a hash function to each item to form a Merkle leaf. Then you group the leaves together by twos to form pairs, concatenate (or join) the two hashes in each pair, and hash the result again to form their parent node, one level up the tree. You repeat this process on each new level until you have nothing left to concatenate and hash, and all you have is a single hash: the Merkle root.
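The steps above can be sketched in Python as follows. This is one possible construction, not the only one: it assumes SHA-256 as the hash function, and when a level has an odd number of nodes it duplicates the last one (the convention Bitcoin uses); other implementations handle odd levels differently. The function name merkle_root is a hypothetical helper.

```python
import hashlib
from typing import List

def h(data: bytes) -> bytes:
    """SHA-256 digest, used here as the tree's hash function."""
    return hashlib.sha256(data).digest()

def merkle_root(items: List[bytes]) -> bytes:
    """Hash every item into a leaf, then pair-and-hash until one node remains."""
    if not items:
        raise ValueError("need at least one item")
    level = [h(item) for item in items]            # the leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])                # assumption: duplicate the last node
        # Concatenate each pair and hash it to get the next level up.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

whitelist = [b"0xAAA", b"0xBBB", b"0xCCC", b"0xDDD"]
root = merkle_root(whitelist)
tampered_root = merkle_root([b"0xAAA", b"0xBBB", b"0xCCC", b"0xEEE"])

print(root.hex())
print(root == tampered_root)   # changing any item changes the root
```

Because every parent depends on both of its children, changing a single entry anywhere in the list changes the root.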

Merkle trees in Bitcoin

Merkle trees are also used in building Bitcoin block headers, as they help maintain the integrity and validity of the data. The header stores just the Merkle root instead of the whole transaction data, which saves space and lets a lightweight client check that a transaction is in a block using only the header and a Merkle proof. This is called Simplified Payment Verification (SPV).

Other use-cases of Merkle trees

Merkle DAG: when you upload a folder to IPFS, its contents are organized as a Merkle Directed Acyclic Graph (DAG), which is a way to verify whether the directory contents have been changed.

Patricia Merkle Trie: one of the key data structures of Ethereum’s storage layer. Instead of having just one Merkle root, each block header actually has three (3): stateRoot, transactionsRoot, receiptsRoot.