Binary Search Tree

Introduction

A Binary Search Tree (BST) is a binary tree organized so that keys in the left subtree of any node are smaller than the node’s key, and keys in the right subtree are larger. This simple ordering makes dynamic set operations such as search, insert, delete, and ordered traversals efficient. In balanced cases, these operations run in logarithmic time; in the worst case (a skewed tree), they degrade to linear time. BSTs are foundational for ordered maps/sets and underpin many advanced balanced trees (AVL, Red-Black, etc.).

Theory

  • BST property: For any node with key K:
    • All keys in its left subtree are less than K.
    • All keys in its right subtree are greater than K.
    • Both left and right subtrees are themselves BSTs (recursively).
  • In-order traversal yields sorted order: Visiting Left, Node, Right lists all keys in non-decreasing order. This is a direct consequence of the BST property.
  • Height and complexity: Let h be the tree height. Search, insert, and delete cost O(h). For a balanced tree, h ≈ ⌊log2 n⌋, giving O(log n). For a skewed tree, h = n − 1, giving O(n).
  • Deletion cases:
    1. Deleting a leaf: remove it directly.
    2. Deleting a node with one child: replace the node by its single child.
    3. Deleting a node with two children: replace the node’s key by its in-order successor (minimum of right subtree) or in-order predecessor (maximum of left subtree), then delete that successor/predecessor node.
  • Duplicates: Standard theoretical treatment assumes distinct keys. Practical variants store a frequency count or choose a side (e.g., “equal goes right”).
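The "in-order traversal yields sorted order" fact above doubles as a correctness check: a binary tree satisfies the BST property exactly when its in-order key sequence is strictly increasing (assuming distinct keys). A minimal sketch, using a hypothetical three-field node class for illustration:

```python
# Sketch: validate the BST property via in-order traversal.
# This Node class is a minimal stand-in, not the implementations later on.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def inorder_keys(node):
    # Left, Node, Right: sorted order for a valid BST.
    if node is None:
        return []
    return inorder_keys(node.left) + [node.key] + inorder_keys(node.right)

def is_bst(root):
    keys = inorder_keys(root)
    # Strictly increasing <=> BST property holds (distinct keys).
    return all(a < b for a, b in zip(keys, keys[1:]))

valid = Node(13, Node(7, Node(3), Node(8)), Node(15))
broken = Node(13, Node(20), Node(15))  # 20 in the left subtree violates the property

print(is_bst(valid))    # True
print(is_bst(broken))   # False
```

Note this check must look at the whole traversal; comparing each node only with its immediate children is not sufficient, as the `broken` example would pass a children-only check at node 20.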

Algorithm Steps

Search(x)

curr = root
while curr is not null:
  if x == curr.key: return curr
  else if x < curr.key: curr = curr.left
  else: curr = curr.right
return null

Insert(x)

def insert(node, x):
  if node is null: return new Node(x)
  if x < node.key: node.left = insert(node.left, x)
  elif x > node.key: node.right = insert(node.right, x)
  else:
    # duplicate: either ignore, count++, or choose a side; here we ignore
    return node
  return node

Delete(x)

def delete(node, x):
  if node is null: return null
  if x < node.key: node.left = delete(node.left, x)
  elif x > node.key: node.right = delete(node.right, x)
  else:
    # found node to delete
    if node.left is null and node.right is null:
      return null
    elif node.left is null:
      return node.right
    elif node.right is null:
      return node.left
    else:
      succ = min(node.right)  # in-order successor
      node.key = succ.key
      node.right = delete(node.right, succ.key)
  return node

Useful helpers

min(node): while node.left: node = node.left; return node
max(node): while node.right: node = node.right; return node
inorder(node): L, Node, R  (prints sorted order)
preorder(node): Node, L, R
postorder(node): L, R, Node

Example

Build a BST by inserting keys in this order: [13, 7, 15, 3, 8, 14, 19, 18]

After all insertions, one valid BST is:

        13
       /  \
      7    15
     / \   / \
    3   8 14  19
              /
             18

In-order traversal: 3, 7, 8, 13, 14, 15, 18, 19  (sorted)
Search 14: 13 → right (15) → left (14) → found.
Delete 15 (has two children):
  - In-order successor = min(right subtree of 15) = 18
  - Replace 15 with 18, then delete original 18 leaf

Result:

        13
       /  \
      7    18
     / \   / \
    3   8 14  19

Code (C & Python)

C Implementation


// Binary Search Tree in C (integers, ignoring duplicates)
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

Node* newNode(int key) {
    Node* n = (Node*)malloc(sizeof(Node));
    n->key = key;
    n->left = n->right = NULL;
    return n;
}

Node* insert(Node* root, int key) {
    if (root == NULL) return newNode(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    // if equal, ignore (or handle duplicates as needed)
    return root;
}

Node* search(Node* root, int key) {
    Node* curr = root;
    while (curr != NULL) {
        if (key == curr->key) return curr;
        else if (key < curr->key) curr = curr->left;
        else curr = curr->right;
    }
    return NULL;
}

Node* minNode(Node* root) {
    Node* curr = root;
    while (curr && curr->left) curr = curr->left;
    return curr;
}

Node* delete(Node* root, int key) {
    if (root == NULL) return NULL;

    if (key < root->key) {
        root->left = delete(root->left, key);
    } else if (key > root->key) {
        root->right = delete(root->right, key);
    } else {
        // found
        if (root->left == NULL && root->right == NULL) {
            free(root);
            return NULL;
        } else if (root->left == NULL) {
            Node* tmp = root->right;
            free(root);
            return tmp;
        } else if (root->right == NULL) {
            Node* tmp = root->left;
            free(root);
            return tmp;
        } else {
            Node* succ = minNode(root->right);
            root->key = succ->key;
            root->right = delete(root->right, succ->key);
        }
    }
    return root;
}

void inorder(Node* root) {
    if (!root) return;
    inorder(root->left);
    printf("%d ", root->key);
    inorder(root->right);
}

void preorder(Node* root) {
    if (!root) return;
    printf("%d ", root->key);
    preorder(root->left);
    preorder(root->right);
}

void postorder(Node* root) {
    if (!root) return;
    postorder(root->left);
    postorder(root->right);
    printf("%d ", root->key);
}

int main() {
    int keys[] = {13, 7, 15, 3, 8, 14, 19, 18};
    int n = sizeof(keys)/sizeof(keys[0]);

    Node* root = NULL;
    for (int i = 0; i < n; i++) root = insert(root, keys[i]);

    printf("In-order: ");
    inorder(root); // should print sorted order
    printf("\n");

    int target = 14;
    printf("Search %d: %s\n", target, search(root, target) ? "found" : "not found");

    root = delete(root, 15);
    printf("After deleting 15, In-order: ");
    inorder(root);
    printf("\n");

    return 0;
}

Explanation (C):

  • insert recursively finds the leaf position to maintain the BST property.
  • delete handles 3 cases (leaf, one child, two children); for two children, it copies the in-order successor's key from the right subtree into the node, then deletes that successor.
  • inorder prints keys in sorted order, a quick sanity check for BST correctness.

Python Implementation


from __future__ import annotations
from dataclasses import dataclass
from typing import Optional, Iterable, List

@dataclass
class Node:
    key: int
    left: Optional['Node'] = None
    right: Optional['Node'] = None

class BST:
    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, key: int) -> None:
        def _ins(node: Optional[Node], k: int) -> Node:
            if node is None:
                return Node(k)
            if k < node.key:
                node.left = _ins(node.left, k)
            elif k > node.key:
                node.right = _ins(node.right, k)
            # if equal, ignore (or handle duplicates as needed)
            return node
        self.root = _ins(self.root, key)

    def search(self, key: int) -> Optional[Node]:
        cur = self.root
        while cur:
            if key == cur.key:
                return cur
            cur = cur.left if key < cur.key else cur.right
        return None

    def _min_node(self, node: Node) -> Node:
        cur = node
        while cur.left:
            cur = cur.left
        return cur

    def delete(self, key: int) -> None:
        def _del(node: Optional[Node], k: int) -> Optional[Node]:
            if node is None:
                return None
            if k < node.key:
                node.left = _del(node.left, k)
            elif k > node.key:
                node.right = _del(node.right, k)
            else:
                # found
                if node.left is None and node.right is None:
                    return None
                if node.left is None:
                    return node.right
                if node.right is None:
                    return node.left
                succ = self._min_node(node.right)
                node.key = succ.key
                node.right = _del(node.right, succ.key)
            return node
        self.root = _del(self.root, key)

    def inorder(self) -> List[int]:
        res: List[int] = []
        def _in(n: Optional[Node]) -> None:
            if not n: return
            _in(n.left)
            res.append(n.key)
            _in(n.right)
        _in(self.root)
        return res

    def preorder(self) -> List[int]:
        res: List[int] = []
        def _pre(n: Optional[Node]) -> None:
            if not n: return
            res.append(n.key)
            _pre(n.left)
            _pre(n.right)
        _pre(self.root)
        return res

    def postorder(self) -> List[int]:
        res: List[int] = []
        def _post(n: Optional[Node]) -> None:
            if not n: return
            _post(n.left)
            _post(n.right)
            res.append(n.key)
        _post(self.root)
        return res

if __name__ == "__main__":
    bst = BST()
    for k in [13, 7, 15, 3, 8, 14, 19, 18]:
        bst.insert(k)

    print("In-order:", bst.inorder())   # sorted
    print("Search 14:", "found" if bst.search(14) else "not found")
    bst.delete(15)
    print("After deleting 15, In-order:", bst.inorder())

Explanation (Python):

  • BST.insert and BST.delete are recursive, returning the (possibly new) subtree root so the parent's child link is rewired correctly.
  • inorder/preorder/postorder return lists for easy verification and testing.

Dry Run

Using the Python or C code with input [13, 7, 15, 3, 8, 14, 19, 18]:

  1. Insert 13 → becomes root.
  2. Insert 7 → 7 < 13 → left of 13.
  3. Insert 15 → 15 > 13 → right of 13.
  4. Insert 3 → 3 < 13 → left; 3 < 7 → left of 7.
  5. Insert 8 → 8 < 13 → left; 8 > 7 → right of 7.
  6. Insert 14 → 14 > 13 → right; 14 < 15 → left of 15.
  7. Insert 19 → 19 > 13 → right; 19 > 15 → right of 15.
  8. Insert 18 → 18 > 13 → right; 18 > 15 → right; 18 < 19 → left of 19.

In-order now prints: 3, 7, 8, 13, 14, 15, 18, 19.

Delete 15 (two-children case):

  • Successor = min(right subtree of 15) = 18.
  • Copy 18 into node 15’s position (that node now holds 18).
  • Recursively delete the original 18 (a leaf) from the right subtree.

Final in-order: 3, 7, 8, 13, 14, 18, 19.

Time Complexity

| Operation                     | Best     | Average (random-like) | Worst (skewed) | Aux Space                        |
| ----------------------------- | -------- | --------------------- | -------------- | -------------------------------- |
| Search                        | O(log n) | O(log n)              | O(n)           | O(h) recursive or O(1) iterative |
| Insert                        | O(log n) | O(log n)              | O(n)           | O(h) recursive or O(1) iterative |
| Delete                        | O(log n) | O(log n)              | O(n)           | O(h) recursive or O(1) iterative |
| In-/pre-/post-order traversal | O(n)     | O(n)                  | O(n)           | O(h) recursion                   |
| Find min/max                  | O(1)     | O(log n)              | O(n)           | O(1) iterative                   |

Note: h is the height of the tree; h ≈ log2 n when balanced, h ≈ n − 1 when skewed.
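The height dependence is easy to observe directly: inserting keys in sorted order produces a chain of height n − 1, while a shuffled insertion order typically stays near log2 n. A small sketch (Node and insert are restated in minimal form here so the snippet is self-contained):

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(node, key):
    # Standard recursive BST insert, ignoring duplicates.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node

def height(node):
    # Height of the empty tree is -1; a single node has height 0.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

n = 127

sorted_root = None
for k in range(n):            # sorted input -> right-skewed chain
    sorted_root = insert(sorted_root, k)

random.seed(1)
keys = list(range(n))
random.shuffle(keys)
random_root = None
for k in keys:                # shuffled input -> roughly balanced
    random_root = insert(random_root, k)

print(height(sorted_root))    # n - 1 = 126
print(height(random_root))    # much smaller; near log2(n) = 7 for random input
```

This is exactly why self-balancing variants exist: they keep h in O(log n) regardless of insertion order.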

Advantages & Disadvantages

Advantages:

  • Maintains dynamic data in sorted order; in-order traversal gives a sorted list without extra work.
  • Efficient search/insert/delete in O(log n) when reasonably balanced, with no need to shift elements in memory (unlike arrays).
  • Supports order-based queries naturally: min/max, floor/ceil, predecessor/successor, range queries, k-th smallest (with augmentation).
  • Pointer-based structure allows flexible growth and memory usage without contiguous allocation.
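The order-based queries mentioned above fall out of the same BST property; a range query, for example, can prune entire subtrees that cannot contain keys in the range. A hedged sketch, reusing a minimal Node/insert shape like the Python implementation earlier:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(node, key):
    # Standard recursive BST insert, ignoring duplicates.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node

def range_query(node, lo, hi, out):
    # Collect all keys in [lo, hi], skipping subtrees that cannot contain them.
    if node is None:
        return
    if node.key > lo:                 # left subtree may still hold keys >= lo
        range_query(node.left, lo, hi, out)
    if lo <= node.key <= hi:
        out.append(node.key)
    if node.key < hi:                 # right subtree may still hold keys <= hi
        range_query(node.right, lo, hi, out)

root = None
for k in [13, 7, 15, 3, 8, 14, 19, 18]:
    root = insert(root, k)

result = []
range_query(root, 8, 18, result)
print(result)   # [8, 13, 14, 15, 18] -- emitted in sorted order
```

Because the traversal is in-order restricted to the range, the results come out already sorted, in O(h + m) time for m reported keys.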

Disadvantages:

  • Performance is input-order sensitive; can degrade to O(n) if the tree becomes skewed.
  • Pointer overhead per node can be higher than array-based structures.
  • Requires careful deletion logic (three cases) to maintain invariants.
  • Compared to hash tables, average search can be slower (O(log n) vs O(1)), though BSTs provide ordering which hash tables do not.

Comparisons:

  • BST vs Hash Table: Hash tables offer average O(1) search/insert/delete but no ordering; BSTs offer ordering and range queries with O(log n) average/balanced time.
  • BST vs Sorted Array: Binary search on an array is O(log n) search, but insert/delete are O(n) due to shifting. BSTs offer O(log n) insert/delete when balanced.
  • BST vs Balanced BSTs (AVL/Red-Black): Self-balancing trees guarantee O(log n) worst-case by performing rotations; plain BSTs are simpler but risk O(n) in the worst case.
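The sorted-array trade-off above can be seen concretely with Python's standard bisect module: lookup is a binary search, but insertion must shift every element after the insertion point, which is the O(n) cost a BST avoids when balanced.

```python
import bisect

a = [3, 7, 8, 13, 14, 15, 18, 19]

# O(log n) search: locate 14 by binary search.
i = bisect.bisect_left(a, 14)
found = i < len(a) and a[i] == 14
print(found)   # True

# O(n) insert: insort shifts all elements after the insertion point.
bisect.insort(a, 10)
print(a)       # [3, 7, 8, 10, 13, 14, 15, 18, 19]
```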

Applications

  • Ordered maps/sets in libraries (e.g., TreeMap/TreeSet, std::map) are typically backed by balanced BSTs.
  • Symbol tables in compilers and interpreters (name lookup with ordering).
  • Range queries and interval lookups (e.g., retrieve all keys in [L, R]).
  • Autocomplete and dictionary features where ordered iteration and floor/ceil are needed.
  • Implementing priority-like operations when exact ordering and predecessor/successor are required.
  • Foundation for more advanced structures: AVL trees, Red-Black trees, Treaps, Order-statistics trees.

Conclusion

Binary Search Trees provide an elegant, order-aware structure for dynamic sets. Their power lies in the BST property, enabling fast search, insertion, deletion, and ordered traversals. While the plain BST is simple and effective for many workloads, its performance depends on shape; when worst-case guarantees are needed, self-balancing variants are preferred. The C and Python implementations above demonstrate the essential mechanics—search, insert, delete, and traversals—forming a solid base for study and extension.
