Binary Search Tree

Introduction

A Binary Search Tree (BST) is a node-based binary tree data structure that maintains a strict order among keys to support fast search, insertion, and deletion. For any node X:

  • Every key in X’s left subtree is strictly less than X’s key.
  • Every key in X’s right subtree is strictly greater than X’s key.
  • Both left and right subtrees are themselves BSTs.

Because of this ordering, an in-order traversal of a BST always yields the keys in sorted order. Operations on a BST take time proportional to the tree’s height h. In a well-shaped (balanced) BST, h ≈ log2(n), giving O(log n) operations; in a fully skewed tree, h can reach n − 1, giving O(n) operations. To keep the focus narrow, we’ll assume unique keys; handling duplicates is discussed briefly in the theory and in code comments.
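The invariant can be checked mechanically by threading (low, high) bounds through the recursion; a minimal Python sketch (the three-field Node here is a stand-in for the node types used later in this article):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def is_bst(node, low=float("-inf"), high=float("inf")):
    """Return True iff the subtree rooted at `node` satisfies the BST invariant."""
    if node is None:
        return True
    # Every key must lie strictly between the bounds inherited from ancestors.
    if not (low < node.key < high):
        return False
    return (is_bst(node.left, low, node.key) and
            is_bst(node.right, node.key, high))

root = Node(13, Node(7, Node(3), Node(8)), Node(15))
print(is_bst(root))   # True
# Swapping 3 and 8 breaks the invariant at node 7 (8 is not < 7).
bad = Node(13, Node(7, Node(8), Node(3)), Node(15))
print(is_bst(bad))    # False
```

Note that checking only parent-child pairs is not enough; the bounds carry the constraint from every ancestor, which is exactly what the recursive definition demands.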

Theory

Authoritative textbooks define the binary search tree property as a recursive invariant: for any node with key K, keys in the left subtree are less than K, and keys in the right subtree are greater than K. A direct corollary is that an in-order traversal (Left, Node, Right) visits keys in non-decreasing order, which both proves correctness of search and enables ordered operations (min, max, predecessor, successor, range queries).

  • Height and runtime: Let h be the height (number of edges on longest root-to-leaf path). Search, insert, and delete take Θ(h) time because they follow a single path determined by comparisons. Thus, Θ(log n) in balanced trees and Θ(n) in worst-case skewed trees.
  • Insertion: Insertions always occur at a leaf position reached by following comparisons. This preserves the BST invariant by placing the new key where it fits the ordering.
  • Deletion: Three canonical cases:
    1. Leaf node: remove directly.
    2. One child: splice the child up.
    3. Two children: replace the node’s key with its in-order successor (minimum of right subtree) or in-order predecessor (maximum of left subtree), then delete that successor/predecessor node (which will be in case 1 or 2). This preserves ordering.
  • Predecessor/Successor: In-order predecessor is the maximum in the left subtree; in-order successor is the minimum in the right subtree. If the appropriate subtree is missing, walk up ancestors.
  • Duplicates: Common book strategies include: (a) disallow duplicates, (b) store a count in the node, or (c) place equals consistently to one side (e.g., “duplicates to the right”). The invariant must be updated accordingly.
  • Space: Pointer-based BSTs store two child pointers per node; recursive algorithms consume O(h) stack space.
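The ancestor-walk for successors can be avoided when the root is at hand: descending from the root and remembering the last node where the search turned left yields the same answer. A hedged sketch (the Node record and function name are illustrative):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def successor(root, key):
    """Return the smallest key strictly greater than `key`, or None."""
    succ = None
    cur = root
    while cur is not None:
        if key < cur.key:
            succ = cur        # candidate successor; look for a smaller one on the left
            cur = cur.left
        else:
            cur = cur.right   # cur.key <= key: successor must lie to the right
    return succ.key if succ else None

# Tree from the worked example below: 13, 7, 15, 3, 8, 14, 19, 18
root = Node(13, Node(7, Node(3), Node(8)),
                Node(15, Node(14), Node(19, Node(18))))
print(successor(root, 15))  # 18
print(successor(root, 8))   # 13
print(successor(root, 19))  # None (19 is the maximum)
```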

Algorithm Steps

The steps below reflect classic textbook algorithms.

Search(key)

search(root, key):
  cur = root
  while cur != null:
    if key == cur.key: return cur
    else if key < cur.key: cur = cur.left
    else: cur = cur.right
  return null

Insert(key) [no duplicates]

insert(node, key):
  if node == null: return new Node(key)
  if key < node.key:
    node.left = insert(node.left, key)
  else if key > node.key:
    node.right = insert(node.right, key)
  else:
    // duplicate policy: ignore or handle count
    return node
  return node

Delete(key)

delete(node, key):
  if node == null: return null
  if key < node.key: node.left = delete(node.left, key)
  else if key > node.key: node.right = delete(node.right, key)
  else:
    // node.key == key: remove this node
    if node.left == null and node.right == null:
      free node; return null
    else if node.left == null:
      tmp = node.right; free node; return tmp
    else if node.right == null:
      tmp = node.left; free node; return tmp
    else:
      // two children: use inorder successor (min in right)
      succ = minNode(node.right)
      node.key = succ.key
      node.right = delete(node.right, succ.key)
  return node

minNode(node):
  while node.left != null: node = node.left
  return node
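The Θ(h) costs above depend entirely on insertion order. A quick sketch, transcribing the insert routine just described, that shows sorted input degenerating the tree into a path (height is counted in edges, so an empty tree has height −1):

```python
import random

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(node, key):
    # Same textbook insertion as the pseudocode above (duplicates ignored).
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node

def height(node):
    # Height counted in edges; an empty tree has height -1 by convention.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

n = 127
skewed = None
for k in range(n):                    # sorted input: every new key goes right
    skewed = insert(skewed, k)

shuffled = None
for k in random.sample(range(n), n):  # random order: shape is usually shallow
    shuffled = insert(shuffled, k)

print(height(skewed))    # 126: a path, h = n - 1
print(height(shuffled))  # usually far smaller, on the order of 2*log2(n)
```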

Example

We’ll build a BST by inserting the keys: [13, 7, 15, 3, 8, 14, 19, 18]

Insert 13:      13

Insert 7:        13
                /
               7

Insert 15:       13
                /  \
               7    15

Insert 3:         13
                 /  \
                7    15
               /
              3

Insert 8:         13
                 /  \
                7    15
               / \
              3   8

Insert 14:        13
                 /  \
                7    15
               / \   /
              3   8 14

Insert 19:        13
                 /  \
                7    15
               / \   / \
              3   8 14 19

Insert 18:        13
                 /  \
                7    15
               / \   / \
              3   8 14 19
                       /
                      18
  • In-order traversal yields: 3, 7, 8, 13, 14, 15, 18, 19 (sorted).
  • Searching for 8 follows comparisons: 13 → left to 7 → right to 8 (found).
  • Deleting 15 (two children): successor is min of right subtree = 18. Replace 15 with 18; then delete the leaf 18 under node 19.

Code (C & Python)

C Implementation

#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

Node* newNode(int key) {
    Node* n = (Node*)malloc(sizeof(Node));
    n->key = key;
    n->left = n->right = NULL;
    return n;
}

Node* insert(Node* root, int key) {
    if (root == NULL) return newNode(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    // else duplicate: ignore or handle separately
    return root;
}

Node* search(Node* root, int key) {
    Node* cur = root;
    while (cur != NULL) {
        if (key == cur->key) return cur;
        else if (key < cur->key) cur = cur->left;
        else cur = cur->right;
    }
    return NULL;
}

Node* minNode(Node* root) {
    Node* cur = root;
    while (cur && cur->left != NULL) cur = cur->left;
    return cur;
}

Node* delete(Node* root, int key) {
    if (root == NULL) return NULL;

    if (key < root->key) {
        root->left = delete(root->left, key);
    } else if (key > root->key) {
        root->right = delete(root->right, key);
    } else {
        // found node to delete
        if (root->left == NULL && root->right == NULL) {
            free(root);
            return NULL;
        } else if (root->left == NULL) {
            Node* tmp = root->right;
            free(root);
            return tmp;
        } else if (root->right == NULL) {
            Node* tmp = root->left;
            free(root);
            return tmp;
        } else {
            Node* succ = minNode(root->right);
            root->key = succ->key;
            root->right = delete(root->right, succ->key);
        }
    }
    return root;
}

void inorder(Node* root) {
    if (root == NULL) return;
    inorder(root->left);
    printf("%d ", root->key);
    inorder(root->right);
}

int main() {
    int keys[] = {13, 7, 15, 3, 8, 14, 19, 18};
    int n = sizeof(keys)/sizeof(keys[0]);
    Node* root = NULL;
    for (int i = 0; i < n; i++) {
        root = insert(root, keys[i]);
    }

    printf("In-order: ");
    inorder(root);
    printf("\n");

    int q = 8;
    printf("Search %d: %s\n", q, search(root, q) ? "found" : "not found");

    root = delete(root, 15);
    printf("After delete 15, In-order: ");
    inorder(root);
    printf("\n");

    // Optionally free remaining nodes (omitted for brevity)
    return 0;
}

Explanation:

  • insert descends left/right and attaches a new leaf node.
  • search performs iterative descent by comparisons.
  • delete handles the three deletion cases; for two children it replaces the key with the in-order successor (leftmost in right subtree) and removes the duplicate from the right subtree.
  • inorder confirms the sorted property of a BST.

Python Implementation

from __future__ import annotations
from typing import Optional, List

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key: int):
        self.key: int = key
        self.left: Optional[Node] = None
        self.right: Optional[Node] = None

class BST:
    def __init__(self):
        self.root: Optional[Node] = None

    def insert(self, key: int) -> None:
        def _insert(node: Optional[Node], key: int) -> Node:
            if node is None:
                return Node(key)
            if key < node.key:
                node.left = _insert(node.left, key)
            elif key > node.key:
                node.right = _insert(node.right, key)
            # else duplicate: ignore or increment a counter if maintained
            return node
        self.root = _insert(self.root, key)

    def search(self, key: int) -> Optional[Node]:
        cur = self.root
        while cur:
            if key == cur.key:
                return cur
            elif key < cur.key:
                cur = cur.left
            else:
                cur = cur.right
        return None

    def _min_node(self, node: Node) -> Node:
        cur = node
        while cur.left:
            cur = cur.left
        return cur

    def delete(self, key: int) -> None:
        def _delete(node: Optional[Node], key: int) -> Optional[Node]:
            if node is None:
                return None
            if key < node.key:
                node.left = _delete(node.left, key)
            elif key > node.key:
                node.right = _delete(node.right, key)
            else:
                # delete this node
                if not node.left and not node.right:
                    return None
                if not node.left:
                    return node.right
                if not node.right:
                    return node.left
                succ = self._min_node(node.right)
                node.key = succ.key
                node.right = _delete(node.right, succ.key)
            return node
        self.root = _delete(self.root, key)

    def inorder(self) -> List[int]:
        out: List[int] = []
        def _in(node: Optional[Node]):
            if not node: return
            _in(node.left)
            out.append(node.key)
            _in(node.right)
        _in(self.root)
        return out

if __name__ == "__main__":
    bst = BST()
    for k in [13, 7, 15, 3, 8, 14, 19, 18]:
        bst.insert(k)
    print("In-order:", bst.inorder())
    print("Search 8:", "found" if bst.search(8) else "not found")
    bst.delete(15)
    print("After delete 15, In-order:", bst.inorder())

Explanation:

  • BST wraps node operations and exposes insert/search/delete.
  • _min_node finds the leftmost node for successor during deletion.
  • inorder returns a Python list of keys in sorted order.
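The recursive methods above each consume O(h) call-stack space. In-order traversal can instead manage its own explicit stack; a sketch assuming the same minimal three-field node shape as the class above:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder_iter(root):
    """Return keys in sorted order using an explicit stack instead of recursion."""
    out, stack, cur = [], [], root
    while stack or cur:
        while cur:                # dive to the leftmost unvisited node
            stack.append(cur)
            cur = cur.left
        cur = stack.pop()
        out.append(cur.key)       # visit, then continue in the right subtree
        cur = cur.right
    return out

root = Node(13, Node(7, Node(3), Node(8)),
                Node(15, Node(14), Node(19, Node(18))))
print(inorder_iter(root))  # [3, 7, 8, 13, 14, 15, 18, 19]
```

The explicit stack still holds up to h nodes, but it lives on the heap and sidesteps the interpreter's recursion limit on deep (skewed) trees.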

Dry Run

Starting from the example BST built above:

          13
        /    \
       7      15
      / \    /  \
     3   8  14  19
               /
              18
  1. Search 8
    • Compare with 13 → 8 < 13 → go left.
    • Compare with 7 → 8 > 7 → go right.
    • Compare with 8 → found.
  2. Delete 15 (two children)
    • Find in-order successor: min of right subtree rooted at 19 is 18.
    • Copy 18 into node 15 (node becomes key=18).
    • Delete the original 18 from the right subtree: it’s a leaf under 19 → remove directly.

    Tree after deletion:

              13
            /    \
           7      18
          / \    /  \
         3   8  14  19
    

Time Complexity

| Operation            | Best / Balanced (h ≈ log n) | Average (random keys) | Worst (skewed)    | Notes                                         |
|----------------------|-----------------------------|-----------------------|-------------------|-----------------------------------------------|
| Search               | O(log n)                    | O(log n)              | O(n)              | Follows one root-to-leaf path; cost is Θ(h)   |
| Insert               | O(log n)                    | O(log n)              | O(n)              | Inserts at a leaf position                    |
| Delete               | O(log n)                    | O(log n)              | O(n)              | Successor/predecessor step is O(h)            |
| Min / Max            | O(1) to O(log n)            | O(log n)              | O(n)              | Walk to the extreme left/right                |
| Traversal (in-order) | O(n)                        | O(n)                  | O(n)              | Visits every node exactly once                |
| Space                | O(n) + O(log n) stack       | O(n) + O(log n) stack | O(n) + O(n) stack | Iterative variants reduce extra space to O(1) |

Advantages & Disadvantages


Advantages

  • Fast ordered operations: Search/insert/delete are O(log n) on balanced trees; in-order traversal returns a sorted sequence directly.
  • No shifting in memory: Unlike arrays, insertions/deletions do not require shifting elements; only pointer changes.
  • Supports order queries: min/max, predecessor/successor, range queries, k-th element with augmentations (e.g., subtree sizes).
  • Foundation for self-balancing trees: AVL, Red-Black Trees provide worst-case guarantees by maintaining BST structure with extra invariants.

Disadvantages

  • Shape-sensitive performance: Without balancing, adversarial insert order (e.g., sorted keys) degrades to O(n) like a linked list.
  • Pointer overhead: Extra memory for left/right pointers; typically less cache-friendly than arrays.
  • Duplicates handling: Requires a policy (ignore, count, or side-placement); incorrect handling can violate invariants.
  • Implementation complexity for robustness: Correct deletion logic and edge cases (two children) are subtle; self-balancing variants are more complex.
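The duplicate-handling concern above can be addressed with policy (b) from the Theory section, a per-node count; a hedged sketch with field and function names of my own choosing:

```python
class CNode:
    def __init__(self, key):
        self.key, self.count = key, 1
        self.left = self.right = None

def insert(node, key):
    """Insert `key`; equal keys bump the node's count instead of adding a node."""
    if node is None:
        return CNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        node.count += 1   # duplicate: no structural change, invariant stays intact
    return node

def inorder(node, out):
    if node:
        inorder(node.left, out)
        out.extend([node.key] * node.count)  # expand each key by its multiplicity
        inorder(node.right, out)
    return out

root = None
for k in [5, 3, 8, 3, 5, 5]:
    root = insert(root, k)
print(inorder(root, []))  # [3, 3, 5, 5, 5, 8]
```

Counting keeps the tree no larger than the number of distinct keys, at the cost of one extra integer per node; deletion would decrement the count and only remove the node when it reaches zero.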

Applications

  • Symbol tables and dictionaries: Maintain a dynamic set of key-value pairs with ordered iteration.
  • Range queries: Retrieve all keys within [L, R] efficiently via guided traversal.
  • Auto-complete and spell-check (as a component): Order-aware lookups and prefix ranges over keys.
  • Indexing concepts: While production databases favor B/B+ trees for disk locality, BSTs model core indexing ideas in memory.
  • Event simulation and schedulers: Use ordered sets to pick the next event by time-stamp.
  • Tree sort: Insert all items into a BST and perform in-order traversal to output sorted order.
  • Building blocks: Basis for AVL, Red-Black Trees, Treaps, Order-Statistic Trees.
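The range-query application listed above relies on the invariant to prune entire subtrees; a Python sketch (node shape and function name are illustrative, not from a specific library):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def range_query(node, lo, hi, out=None):
    """Collect all keys in [lo, hi], skipping subtrees the bounds rule out."""
    if out is None:
        out = []
    if node is None:
        return out
    if lo < node.key:                # left subtree may contain keys >= lo
        range_query(node.left, lo, hi, out)
    if lo <= node.key <= hi:
        out.append(node.key)
    if node.key < hi:                # right subtree may contain keys <= hi
        range_query(node.right, lo, hi, out)
    return out

root = Node(13, Node(7, Node(3), Node(8)),
                Node(15, Node(14), Node(19, Node(18))))
print(range_query(root, 8, 18))  # [8, 13, 14, 15, 18]
```

Because pruned subtrees are never visited, the cost is O(h + k) for k reported keys, rather than O(n).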

Conclusion

Binary Search Trees provide a clean, recursive structure for maintaining ordered data with efficient search, insert, and delete operations tied to tree height. They are easy to reason about, enable rich ordered queries, and serve as the conceptual backbone for self-balancing trees that guarantee O(log n) performance. For robust performance in practice, prefer self-balancing variants; for learning and many in-memory tasks on modest inputs, a well-implemented BST is simple and effective.
