How to Use Distributed SQLite and Merkle Trees to Detect Divergence

How to Use Distributed SQLite and Merkle Trees to Detect Divergence blog post fallback blur image How to Use Distributed SQLite and Merkle Trees to Detect Divergence blog post main image
Stephen CollinsSep 29, 2025
4 mins

Key Points

What problem does using Merkle trees with multiple SQLite databases solve?

Using Merkle trees with multiple SQLite databases allows each database to act as a peer in a network, with a Merkle root summarizing its change history. This makes it easy to detect if any peer's record has been tampered with or has diverged from others, simulating distributed immutability without needing complex infrastructure.

How does a Merkle root help peers detect if their histories have diverged?

A Merkle root is a single cryptographic hash representing all events in a peer's ledger. If two peers compare their Merkle roots and find a difference, they immediately know their histories are not identical, indicating divergence without needing to compare entire datasets.

Why are the ledgers append-only, and how does this improve trust?

Ledgers are append-only to ensure no record can be altered or removed after being added. This, combined with timestamps and SHA256 hashes, provides an immutable audit trail, which is essential for building trust in distributed or tamper-evident systems.

How does this demo relate to real-world systems like Git or blockchains?

The demo models how distributed systems such as Git and blockchains maintain independent histories and use cryptographic fingerprints (Merkle roots) to compare or synchronize their states. Just like in these systems, peers quickly detect divergence and maintain data integrity.

What is the next step after having peers compare Merkle roots, and why is it important?

After peers can detect divergence by comparing Merkle roots, the next step is to spread this information across the network using gossip protocols. This is crucial for keeping all peers informed and synchronized in real distributed systems.

Distributed Systems with SQLite Series

In Part 1, we explored how a Merkle tree inside SQLite can turn a local database into a tamper-evident ledger. That gave us local immutability: a single .db file could prove its history hadn’t been altered.

Now we’ll take the next step: imagine each database file is a peer in a network. Each peer appends events to its own ledger and computes a Merkle root as a cryptographic fingerprint. By exchanging and comparing roots, peers can quickly detect whether their histories diverge.

This pattern — multiple independent ledgers, compared by fingerprints — underlies real systems like Git, blockchains, and distributed databases.


Why “Distributed SQLite”?

SQLite itself is not distributed. But by treating multiple .db files as independent peers, we can:

  • Simulate distributed immutability without infrastructure overhead.
  • Teach how Merkle roots summarize history compactly.
  • Show how peers can detect divergence with minimal data exchange.

This demo isn’t production-ready, but it’s the perfect teaching tool.

All the code is available on GitHub.


Step 1: Append-Only Ledgers in SQLite

Each peer maintains its own ledger table. In our repo, this is managed by ledger.py:

import sqlite3, time, hashlib

def init_db(db_file: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_file)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS ledger (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp REAL NOT NULL,
            data TEXT NOT NULL,
            hash TEXT NOT NULL
        )
    """)
    conn.commit()
    return conn

def add_record(conn: sqlite3.Connection, data: str) -> int:
    timestamp = time.time()
    record_hash = hashlib.sha256(data.encode('utf-8')).hexdigest()
    cursor = conn.execute(
        "INSERT INTO ledger (timestamp, data, hash) VALUES (?, ?, ?)",
        (timestamp, data, record_hash)
    )
    conn.commit()
    return cursor.lastrowid

Records are append-only: each entry gets a timestamp, data, and SHA256 hash.


Step 2: Compute Merkle Roots

The merkle.py module builds a Merkle tree from ledger rows:

def sha256(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def compute_merkle_root(hashes: list[str]) -> str:
    if not hashes: return sha256("")
    if len(hashes) == 1: return hashes[0]
    next_level = []
    for i in range(0, len(hashes), 2):
        left = hashes[i]
        right = hashes[i+1] if i+1 < len(hashes) else left
        next_level.append(sha256(left + right))
    return compute_merkle_root(next_level)

def merkle_root(conn) -> str:
    cursor = conn.execute("SELECT id, data FROM ledger ORDER BY id")
    records = cursor.fetchall()
    if not records: return sha256("")
    record_hashes = [sha256(f"{id_val}:{data}") for id_val, data in records]
    return compute_merkle_root(record_hashes)

This gives each peer a single hash that represents its entire ledger. Any change — even one byte in one row — produces a completely different root.


Step 3: Object-Oriented Peers

To make things easier to work with, the repo includes a Node class in node.py:

from ledger import init_db, add_record
from merkle import merkle_root

class Node:
    def __init__(self, name, db_file):
        self.name = name
        self.db_file = db_file
        self.conn = None

    def initialize(self):
        self.conn = init_db(self.db_file)
        return self

    def add_event(self, data: str):
        return add_record(self.conn, data)

    def get_merkle_root(self) -> str:
        return merkle_root(self.conn)

    def compare(self, other):
        root_self = self.get_merkle_root()
        root_other = other.get_merkle_root()
        return root_self == root_other

This abstraction makes it easy to spin up multiple nodes and compare them.


Step 4: Compare Peers

Let’s see it in action. Create two nodes with the same events:

from node import Node

nodeA = Node("Alice", "alice.db").initialize()
nodeB = Node("Bob", "bob.db").initialize()

for event in ["Alice pays Bob 10", "Bob pays Carol 5"]:
    nodeA.add_event(event)
    nodeB.add_event(event)

print("Node A root:", nodeA.get_merkle_root())
print("Node B root:", nodeB.get_merkle_root())
print("Identical?", nodeA.compare(nodeB))

✅ Both roots match — the ledgers are consistent.

Now diverge:

nodeB.add_event("Carol pays Dave 3")
print("Identical?", nodeA.compare(nodeB))

⚠️ Roots differ — the nodes know they’re out of sync.


Step 5: Network-Wide Analysis

The repo also includes a compare_network function to check many nodes at once:

from node import compare_network

network = [nodeA, nodeB, Node("Charlie", "charlie.db").initialize()]
analysis = compare_network(network)

print("Sync groups:", analysis["sync_groups"])
print("Network healthy?", analysis["network_healthy"])

This shows which nodes are in sync and whether the network is “healthy” (e.g. 80%+ agree).


Why This Matters

This demo is a simplified version of how real distributed systems work:

  • Git: Each commit is a Merkle root. Repos diverge when their roots differ.
  • Blockchains: Each block contains a Merkle root of transactions. Nodes gossip these roots to detect forks.
  • Distributed databases: Systems like Cassandra use Merkle trees to detect replica drift.

The principle is the same everywhere: compact cryptographic fingerprints let peers compare histories without exchanging everything.


What’s Next

Now that each peer can detect divergence, the next step is:

👉 How do peers actually share and spread their roots across a network?

That’s the role of gossip protocols, which we’ll cover in Part 3.


Conclusion

By moving from a single SQLite ledger to multiple peers, we’ve shown how distributed immutability works:

  • Each peer has its own append-only ledger.
  • Each computes a Merkle root as a fingerprint of history.
  • Peers can compare roots to instantly detect divergence.

This simple pattern is the foundation of Git, blockchains, and distributed databases — and now you’ve built it yourself with just SQLite and Python.