Beyond the Blockchain: Exploring Merkle Trees in Ethereum

Beyond the Blockchain: Exploring Merkle Trees in Ethereum #merkletree #ethereum Exploring Merkle Trees: Beyond the Blockchain

Feb 26, 2024

Beyond the Blockchain: Exploring Merkle Trees in Ethereum

The Merkle Patricia Tree (MPT), also known as the Trie, is a data structure used in Ethereum to securely store data, including transactions, balances, and the state of smart contracts. It combines the benefits of a Merkle tree (cryptographic proof of included data) and a Patricia tree (a compact form of a trie for efficient lookups). Here's a step-by-step explanation of how the Merkle Patricia Tree hash is calculated:

Each leaf node represents a single piece of data, usually a transaction in a blockchain context.

Let's say you have four transactions: Tx1, Tx2, Tx3, and Tx4.

You calculate the cryptographic hash of each transaction: H(Tx1), H(Tx2), H(Tx3), H(Tx4).

These hashes become the values stored in the leaf nodes of the Merkle tree.

Intermediate Nodes:
- Level 1: Pair up the leaf nodes.
  - Concatenate H(Tx1) + H(Tx2) and compute the hash, let's call it H12.
  - Concatenate H(Tx3) + H(Tx4) and compute the hash, let's call it H34.
  - H12 and H34 become the values of the two intermediate nodes at this level.
- Level 2 (Root):
  - Concatenate H12 + H34 and compute the final hash. This hash is the Merkle root (HRoot).

Diagram Representation:

               HRoot
               /  \
             H12   H34
            / \    / \
        H(Tx1) H(Tx2) H(Tx3) H(Tx4)

This is merkle tree code.

This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters

Show hidden characters

	from typing import List
	import hashlib

	class Node:
	def __init__(self, left, right, value: str, content) -> None:
	self.left = left
	self.right = right
	self.value = value
	self.content = content

	@staticmethod
	def hash(val: str) -> str:
	return hashlib.sha256(val.encode("utf-8")).hexdigest()

	def __str__(self):
	return str(self.value)

	class MerkleTree:
	def __init__(self, values: List[str]) -> None:
	self.__buildTree(values)

	def __buildTree(self, values: List[str]) -> None:
	leaves = [Node(None, None, Node.hash(e), e) for e in values]
	while len(leaves) % 2 == 1:
	leaves.append(leaves[-1]) # Duplicate last element if odd number of elements
	self.root = self.__buildTreeRec(leaves)

	def __buildTreeRec(self, nodes: List[Node]) -> Node:
	if len(nodes) == 1:
	return nodes[0]
	new_level = []
	for i in range(0, len(nodes), 2):
	left = nodes[i]
	right = nodes[i + 1]
	value = Node.hash(left.value + right.value)
	content = left.content + "+" + right.content
	new_level.append(Node(left, right, value, content))
	return self.__buildTreeRec(new_level)

	def printTree(self) -> None:
	self.__printTreeRec(self.root)

	def __printTreeRec(self, node) -> None:
	if node is not None:
	if node.left is not None:
	print("Left: " + str(node.left))
	print("Right: " + str(node.right))
	else:
	print("Input")
	print("Value: " + str(node.value))
	print("Content: " + str(node.content))
	print("")
	self.__printTreeRec(node.left)
	self.__printTreeRec(node.right)

	def getRootHash(self) -> str:
	return self.root.value

	def mixmerkletree() -> None:
	elems = ["x", "y", "z", "f", "s", "e", "r", "n"]
	print("Inputs: ")
	print(*elems, sep=" \| ")
	print("")
	mtree = MerkleTree(elems)
	print("Root Hash: " + mtree.getRootHash() + "\n")
	mtree.printTree()

	mixmerkletree()

view raw MerkleTree.py hosted with ❤ by GitHub

Step-by-step diagram of Merkle Tree creation:

Inputs: x | y | z | f | s | e | r | n

1. Initial Hashes:

         x       y       z       f       s       e       r       n
         |       |       |       |       |       |       |       |
  Hash(x)  Hash(y) Hash(z) Hash(f) Hash(s) Hash(e) Hash(r) Hash(n)

2. Level 1:

         Hash(m)     Hash(b)      Hash(v)      Hash(c)
        /                \            /             \
  Hash(x+y)          Hash(z+f)   Hash(s+e)      Hash(r+n)

3. Level 2:

         Hash(m+b)                                  Hash(v+c)
        /        \                                 /        \
 Hash(Hash(x+y)+Hash(z+f))                    Hash(Hash(s+e)+Hash(r+n))

4. Root:

         Hash(Hash(m+b)+Hash(v+c))
                        |
                    Root Hash

You can use also the following online tool for understanding the merkle tree.

Let's explore why this system is beneficial, particularly in the contexts of blockchain and distributed ledger technologies:

1. Data Integrity and Verification

The primary reason for using a Merkle Tree structure is to ensure data integrity and provide an efficient way to verify the contents of large datasets without needing to compare the entire dataset. Each node in a Merkle Tree is a hash of its children, and the root hash represents the entire dataset.

2. Efficient Data Verification

Merkle Trees allow for efficient and secure verification of specific data elements within a large set. This is particularly useful in distributed systems like blockchains, where it is impractical to download and verify the entire blockchain.

3. Tamper Evident

Any change to a data element in a Merkle Tree changes the hash of the leaf node representing that data. Because parent nodes are hashes of their children, this change propagates all the way up to the root node, changing the root hash. Therefore, any tampering with the data becomes immediately evident when the root hash is checked.

4. Blockchain Application

In the context of blockchain, Merkle Trees are used to summarize all the transactions in a block into a single hash (the Merkle Root), which is included in the block's header. This allows nodes in the network to quickly and securely verify whether a transaction is included in a block without needing the entire block's content.

Good resource.

Conclusion

The Merkle Tree system offers a powerful method for ensuring data integrity, enabling efficient verification, and providing a tamper-evident structure. These features make it invaluable for blockchain applications, where security, efficiency, and the ability to verify data without possessing the entire dataset are crucial.

Engin YILMAZ (@veridelisi)

VeriDelisi

Discussion about this post