Merkle Trees and SPVs in Bitcoin

Open-Source Internship opportunity by OpenGenus for programmers. Apply now.

In this article, we learn about Merkle trees and simplified payment verification (SPV) and how the two relate to each other.

Table of contents.

Introduction.
Merkle Trees in Bitcoin.
Merkle Trees and SPVs.
Summary.
References.

Introduction.

In blockchains, a Merkle tree is a data structure used to encode blockchain data efficiently and securely. Transactions are passed through a hashing function to produce a transaction hash. A block stores all transactions in the blockchain. These transactions in a block after being hashed pairs of transactions are concatenated together to form a single hash. This process continues until there is a single hash for the whole block of transactions. In the case there are an odd number of transactions, one will be double hashed and its hash is concatenated with itself.

A Merkle tree / binary hash tree is used to summarize and verify the integrity of these transactions. These trees hold cryptographic hashes at their nodes and the tree is usually displayed up-side-down whereby the root is at the top and the leaves at the bottom.

Merkle Trees in Bitcoin.

In Bitcoin, merkle trees are used to summarize all transactions in a block to produce an overall digital fingerprint of all transactions in the block. This process produces a hash such that if a transaction in the block is mutated or removed, it would result in a totally different hash rendering the block invalid.

The resulting hash is referred to as the Merkle root and the hashing algorithm used is SHA-256.
Let's consider the case where we have four transactions, A, B, C, and S. An even number of transactions. In this case, we will have four leaves HA, HB, HC, and HD where;
HA = SHA-256(SHA-256(Transaction A))
To obtain the parent node HAB, we concatenate the two hashes HA and HB to obtain a 64-byte string(32-byte each). We then double hash the string to obtain the parent;
HAB = SHA-256(SHA-256(HA + HB))

To obtain the parent node HCD, the process is repeated;
HCD = SHA-256(SHA-256(HC + HD))

The following image summarizes the above concept nicely;

bit125

In the case there is an odd number of transactions, the last transaction hash is duplicated to create an even number of leaves. This is demonstrated in the image below;

bit126

In this case, we have three transactions HA, HB, and HC. HC is duplicated to form a balanced tree.

This concept can be used to construct Merkle trees of all sizes. In Bitcoin, a block is mined after every 10 minutes, meaning it will contain transactions in the past 10 minutes. This number ranges from hundreds to thousands of transactions in a block depending on the network activity. This number of transactions is combined to form a single hash, the root of the tree. The beauty of hash functions is that any size or length of transaction passed through the hashing function results in a fixed-length string(32-byte).

Now, to make sure that a transaction is in a block, a node produces $\log_{2}$ (N) 32-byte hashes that constitute if an authentication/Merkle path that connects the transaction to the root. In this case, the base-2 logarithm increases much more steadily as the number of transaction increase. This enables the efficient production of 10-path to 12-path hashes that provide proof that a transaction exists within the tree.

Given the following Merkle tree;

bit127

A node can prove that transaction K is in the block by producing a 32-byte hash long Merkle/authentication path consisting of four hashes(HL, HIJ, HMNOP, and HABCDEFGH). With this path, any node can verify that a transaction(HK) exists in a block by computing four pair-wise hashes, HKL, HIJKL, HIJKLMNOP + root.

Note that, as the number of blocks and transactions increases, the Merkle tree only gets better and more efficient. The following image summarizes its efficiency;

bit128

Merkle trees in Bitcoin allow nodes to just download block headers and be able to know if a transaction has been included in the block instead of storing the whole copy of the blockchain. This is applicable in mobile wallets where there is not enough space to store the full blockchain. In this case, the app will retrieve a small Merkle path from a full node without the need to store or transmit it.
These nodes are referred to as simplified payment verification nodes. They use Merkle paths without having to download and store the full blockchain copy.

Merkle Trees and SPVs.

As mentioned SPV nodes don't store the full blockchain copy and therefore don't have all transactions. They only download block headers. To verify a transaction is included in a block, they use an authentication/Merkle path.

For example, we have an SPV node that verifies payments to a wallet address. It will establish a bloom filter on its peers. This filters the number of transactions that are received and results in only those containing the wallet address.
When a transaction matches the filter, the node sends the block using the merkleblock message. This message contains the block header and the merkle path linking the transaction in question.
The node uses the path to connect the transaction to the block and verify its inclusion in the block.
The block header is also used t link the block to the rest of the chain. These two links(transaction-block, block-blockchain) when combined are all that is needed to prove a transaction is included in the blockchain.

Summary.

In blockchains, a Merkle tree is a data structure used to encode blockchain data efficiently and securely.

Merkle trees are used to summarize all transactions in a block to produce an overall digital fingerprint of all transactions in the block.

Merkle trees in Bitcoin allow nodes to just download block headers and be able to know if a transaction has been included in the block instead of storing the whole copy of the blockchain.

SPV nodes don't store the full blockchain copy. They only download block headers. To verify a transaction is included in a block, they use an authentication/Merkle path.

With this article at OpenGenus, you must have the complete idea of Merkle Trees and SPVs in Bitcoin.

Merkle Trees and SPVs in Bitcoin

Blockchain

Table of contents.

Introduction.

Merkle Trees in Bitcoin.

Merkle Trees and SPVs.

Summary.

Ethereum's Four Development Stages

7 Components of Ethereum