How to Use Distributed SQLite and Merkle Trees to Detect Divergence

Distributed Systems with SQLite Series

Merkle Trees in SQLite with Python: A Practical Tutorial
How to Use Distributed SQLite and Merkle Trees to Detect Divergence
How to Implement Gossip Protocols with SQLite (Coming Soon)
How to Reconcile Divergent SQLite Ledgers with Merkle Trees (Coming Soon)

In Part 1, we explored how a Merkle tree inside SQLite can turn a local database into a tamper-evident ledger. That gave us local immutability: a single .db file could prove its history hadn’t been altered.

Now we’ll take the next step: imagine each database file is a peer in a network. Each peer appends events to its own ledger and computes a Merkle root as a cryptographic fingerprint. By exchanging and comparing roots, peers can quickly detect whether their histories diverge.

This pattern — multiple independent ledgers, compared by fingerprints — underlies real systems like Git, blockchains, and distributed databases.

Why “Distributed SQLite”?

SQLite itself is not distributed. But by treating multiple .db files as independent peers, we can:

Simulate distributed immutability without infrastructure overhead.
Teach how Merkle roots summarize history compactly.
Show how peers can detect divergence with minimal data exchange.

This demo isn’t production-ready, but it’s the perfect teaching tool.

All the code is available on GitHub.

Step 1: Append-Only Ledgers in SQLite

Each peer maintains its own ledger table. In our repo, this is managed by ledger.py:

import sqlite3, time, hashlib

def init_db(db_file: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_file)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS ledger (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp REAL NOT NULL,
            data TEXT NOT NULL,
            hash TEXT NOT NULL
        )
    """)
    conn.commit()
    return conn

def add_record(conn: sqlite3.Connection, data: str) -> int:
    timestamp = time.time()
    record_hash = hashlib.sha256(data.encode('utf-8')).hexdigest()
    cursor = conn.execute(
        "INSERT INTO ledger (timestamp, data, hash) VALUES (?, ?, ?)",
        (timestamp, data, record_hash)
    )
    conn.commit()
    return cursor.lastrowid

Records are append-only: each entry gets a timestamp, data, and SHA256 hash.

Step 2: Compute Merkle Roots

The merkle.py module builds a Merkle tree from ledger rows:

def sha256(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

def compute_merkle_root(hashes: list[str]) -> str:
    if not hashes: return sha256("")
    if len(hashes) == 1: return hashes[0]
    next_level = []
    for i in range(0, len(hashes), 2):
        left = hashes[i]
        right = hashes[i+1] if i+1 < len(hashes) else left
        next_level.append(sha256(left + right))
    return compute_merkle_root(next_level)

def merkle_root(conn) -> str:
    cursor = conn.execute("SELECT id, data FROM ledger ORDER BY id")
    records = cursor.fetchall()
    if not records: return sha256("")
    record_hashes = [sha256(f"{id_val}:{data}") for id_val, data in records]
    return compute_merkle_root(record_hashes)

This gives each peer a single hash that represents its entire ledger. Any change — even one byte in one row — produces a completely different root.

Step 3: Object-Oriented Peers

To make things easier to work with, the repo includes a Node class in node.py:

from ledger import init_db, add_record
from merkle import merkle_root

class Node:
    def __init__(self, name, db_file):
        self.name = name
        self.db_file = db_file
        self.conn = None

    def initialize(self):
        self.conn = init_db(self.db_file)
        return self

    def add_event(self, data: str):
        return add_record(self.conn, data)

    def get_merkle_root(self) -> str:
        return merkle_root(self.conn)

    def compare(self, other):
        root_self = self.get_merkle_root()
        root_other = other.get_merkle_root()
        return root_self == root_other

This abstraction makes it easy to spin up multiple nodes and compare them.

Step 4: Compare Peers

Let’s see it in action. Create two nodes with the same events:

from node import Node

nodeA = Node("Alice", "alice.db").initialize()
nodeB = Node("Bob", "bob.db").initialize()

for event in ["Alice pays Bob 10", "Bob pays Carol 5"]:
    nodeA.add_event(event)
    nodeB.add_event(event)

print("Node A root:", nodeA.get_merkle_root())
print("Node B root:", nodeB.get_merkle_root())
print("Identical?", nodeA.compare(nodeB))

✅ Both roots match — the ledgers are consistent.

Now diverge:

nodeB.add_event("Carol pays Dave 3")
print("Identical?", nodeA.compare(nodeB))

⚠️ Roots differ — the nodes know they’re out of sync.

Step 5: Network-Wide Analysis

The repo also includes a compare_network function to check many nodes at once:

from node import compare_network

network = [nodeA, nodeB, Node("Charlie", "charlie.db").initialize()]
analysis = compare_network(network)

print("Sync groups:", analysis["sync_groups"])
print("Network healthy?", analysis["network_healthy"])

This shows which nodes are in sync and whether the network is “healthy” (e.g. 80%+ agree).

Why This Matters

This demo is a simplified version of how real distributed systems work:

Git: Each commit is a Merkle root. Repos diverge when their roots differ.
Blockchains: Each block contains a Merkle root of transactions. Nodes gossip these roots to detect forks.
Distributed databases: Systems like Cassandra use Merkle trees to detect replica drift.

The principle is the same everywhere: compact cryptographic fingerprints let peers compare histories without exchanging everything.

What’s Next

Now that each peer can detect divergence, the next step is:

👉 How do peers actually share and spread their roots across a network?

That’s the role of gossip protocols, which we’ll cover in Part 3.

Conclusion

By moving from a single SQLite ledger to multiple peers, we’ve shown how distributed immutability works:

Each peer has its own append-only ledger.
Each computes a Merkle root as a fingerprint of history.
Peers can compare roots to instantly detect divergence.

This simple pattern is the foundation of Git, blockchains, and distributed databases — and now you’ve built it yourself with just SQLite and Python.