Introduction
A Binary Search Tree (BST) is a binary tree organized so that keys in the left subtree of any node are smaller than the node’s key, and keys in the right subtree are larger. This simple ordering makes dynamic set operations such as search, insert, delete, and ordered traversals efficient. In balanced cases, these operations run in logarithmic time; in the worst case (a skewed tree), they degrade to linear time. BSTs are foundational for ordered maps/sets and underpin many advanced balanced trees (AVL, Red-Black, etc.).
Theory (from book)
- BST property: For any node with key K:
- All keys in its left subtree are less than K.
- All keys in its right subtree are greater than K.
- Both left and right subtrees are themselves BSTs (recursively).
- In-order traversal yields sorted order: Visiting Left, Node, Right lists all keys in non-decreasing order. This is a direct consequence of the BST property.
- Height and complexity: Let h be the tree height. Search, insert, and delete cost O(h). For a balanced tree, h ≈ ⌊log2 n⌋, giving O(log n). For a skewed tree, h = n − 1, giving O(n).
- Deletion cases:
- Deleting a leaf: remove it directly.
- Deleting a node with one child: replace the node by its single child.
- Deleting a node with two children: replace the node’s key by its in-order successor (minimum of right subtree) or in-order predecessor (maximum of left subtree), then delete that successor/predecessor node.
- Duplicates: Standard theoretical treatment assumes distinct keys. Practical variants store a frequency count or choose a side (e.g., “equal goes right”).
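The height claim above can be seen concretely: the same seven keys produce very different tree shapes depending on insertion order. A minimal sketch (the `Node`, `insert`, `height`, and `build` helpers here are illustrative, not taken from the implementations later in this article):

```python
# Sketch: BST height depends on insertion order, not just on the key set.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def height(root):
    # Height counted in edges; an empty tree has height -1.
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

mixed = build([4, 2, 6, 1, 3, 5, 7])   # balanced insertion order
skewed = build([1, 2, 3, 4, 5, 6, 7])  # sorted order -> right spine

print(height(mixed))   # 2  (≈ ⌊log2 7⌋)
print(height(skewed))  # 6  (= n - 1)
```

Inserting keys in sorted order is the classic worst case: every new key goes right, and search degrades to a linked-list scan.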
Algorithm Steps
Search(x)
```
curr = root
while curr is not null:
    if x == curr.key: return curr
    else if x < curr.key: curr = curr.left
    else: curr = curr.right
return null
```
Insert(x)
```
def insert(node, x):
    if node is null: return new Node(x)
    if x < node.key: node.left = insert(node.left, x)
    elif x > node.key: node.right = insert(node.right, x)
    # else: duplicate — either ignore, count++, or choose a side; here we ignore
    return node
```
Delete(x)
```
def delete(node, x):
    if node is null: return null
    if x < node.key: node.left = delete(node.left, x)
    elif x > node.key: node.right = delete(node.right, x)
    else:
        # found the node to delete
        if node.left is null and node.right is null: return null   # leaf
        elif node.left is null: return node.right                  # one child
        elif node.right is null: return node.left                  # one child
        else:
            succ = min(node.right)   # in-order successor
            node.key = succ.key
            node.right = delete(node.right, succ.key)
    return node
```
Useful helpers
```
min(node):       while node.left:  node = node.left;  return node
max(node):       while node.right: node = node.right; return node
inorder(node):   visit Left, Node, Right   (prints sorted order)
preorder(node):  visit Node, Left, Right
postorder(node): visit Left, Right, Node
```
Example
Build a BST by inserting keys in this order: [13, 7, 15, 3, 8, 14, 19, 18]
After all insertions, one valid BST is:
```
        13
       /  \
      7    15
     / \   /  \
    3   8 14   19
              /
            18
```
In-order traversal: 3, 7, 8, 13, 14, 15, 18, 19 (sorted)
Search 14: 13 → right (15) → left (14) → found.
Delete 15 (has two children):
- In-order successor = min(right subtree of 15) = 18
- Copy 18 into the node that held 15, then delete the original 18 (a leaf)
Result:
```
        13
       /  \
      7    18
     / \   / \
    3   8 14  19
```
Code (C & Python)
C Implementation
```c
// Binary Search Tree in C (integers, ignoring duplicates)
#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

Node* newNode(int key) {
    Node* n = (Node*)malloc(sizeof(Node));
    n->key = key;
    n->left = n->right = NULL;
    return n;
}

Node* insert(Node* root, int key) {
    if (root == NULL) return newNode(key);
    if (key < root->key) root->left = insert(root->left, key);
    else if (key > root->key) root->right = insert(root->right, key);
    // if equal, ignore (or handle duplicates as needed)
    return root;
}

Node* search(Node* root, int key) {
    Node* curr = root;
    while (curr != NULL) {
        if (key == curr->key) return curr;
        else if (key < curr->key) curr = curr->left;
        else curr = curr->right;
    }
    return NULL;
}

Node* minNode(Node* root) {
    Node* curr = root;
    while (curr && curr->left) curr = curr->left;
    return curr;
}

Node* delete(Node* root, int key) {
    if (root == NULL) return NULL;
    if (key < root->key) {
        root->left = delete(root->left, key);
    } else if (key > root->key) {
        root->right = delete(root->right, key);
    } else {
        // found
        if (root->left == NULL && root->right == NULL) {
            free(root);
            return NULL;
        } else if (root->left == NULL) {
            Node* tmp = root->right;
            free(root);
            return tmp;
        } else if (root->right == NULL) {
            Node* tmp = root->left;
            free(root);
            return tmp;
        } else {
            Node* succ = minNode(root->right);
            root->key = succ->key;
            root->right = delete(root->right, succ->key);
        }
    }
    return root;
}

void inorder(Node* root) {
    if (!root) return;
    inorder(root->left);
    printf("%d ", root->key);
    inorder(root->right);
}

void preorder(Node* root) {
    if (!root) return;
    printf("%d ", root->key);
    preorder(root->left);
    preorder(root->right);
}

void postorder(Node* root) {
    if (!root) return;
    postorder(root->left);
    postorder(root->right);
    printf("%d ", root->key);
}

int main(void) {
    int keys[] = {13, 7, 15, 3, 8, 14, 19, 18};
    int n = sizeof(keys) / sizeof(keys[0]);
    Node* root = NULL;
    for (int i = 0; i < n; i++) root = insert(root, keys[i]);

    printf("In-order: ");
    inorder(root);  // should print sorted order
    printf("\n");

    int target = 14;
    printf("Search %d: %s\n", target, search(root, target) ? "found" : "not found");

    root = delete(root, 15);
    printf("After deleting 15, In-order: ");
    inorder(root);
    printf("\n");
    return 0;
}
```
Explanation (C):
- `insert` recursively descends to the leaf position that maintains the BST property.
- `delete` handles the three cases (leaf, one child, two children); for two children, it copies in the in-order successor's key from the right subtree and then deletes that successor.
- `inorder` prints keys in sorted order, a quick sanity check for BST correctness.
Python Implementation
```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional['Node'] = None
    right: Optional['Node'] = None


class BST:
    def __init__(self) -> None:
        self.root: Optional[Node] = None

    def insert(self, key: int) -> None:
        def _ins(node: Optional[Node], k: int) -> Node:
            if node is None:
                return Node(k)
            if k < node.key:
                node.left = _ins(node.left, k)
            elif k > node.key:
                node.right = _ins(node.right, k)
            # if equal, ignore (or handle duplicates as needed)
            return node
        self.root = _ins(self.root, key)

    def search(self, key: int) -> Optional[Node]:
        cur = self.root
        while cur:
            if key == cur.key:
                return cur
            cur = cur.left if key < cur.key else cur.right
        return None

    def _min_node(self, node: Node) -> Node:
        cur = node
        while cur.left:
            cur = cur.left
        return cur

    def delete(self, key: int) -> None:
        def _del(node: Optional[Node], k: int) -> Optional[Node]:
            if node is None:
                return None
            if k < node.key:
                node.left = _del(node.left, k)
            elif k > node.key:
                node.right = _del(node.right, k)
            else:
                # found
                if node.left is None and node.right is None:
                    return None
                if node.left is None:
                    return node.right
                if node.right is None:
                    return node.left
                succ = self._min_node(node.right)
                node.key = succ.key
                node.right = _del(node.right, succ.key)
            return node
        self.root = _del(self.root, key)

    def inorder(self) -> List[int]:
        res: List[int] = []
        def _in(n: Optional[Node]) -> None:
            if not n:
                return
            _in(n.left)
            res.append(n.key)
            _in(n.right)
        _in(self.root)
        return res

    def preorder(self) -> List[int]:
        res: List[int] = []
        def _pre(n: Optional[Node]) -> None:
            if not n:
                return
            res.append(n.key)
            _pre(n.left)
            _pre(n.right)
        _pre(self.root)
        return res

    def postorder(self) -> List[int]:
        res: List[int] = []
        def _post(n: Optional[Node]) -> None:
            if not n:
                return
            _post(n.left)
            _post(n.right)
            res.append(n.key)
        _post(self.root)
        return res


if __name__ == "__main__":
    bst = BST()
    for k in [13, 7, 15, 3, 8, 14, 19, 18]:
        bst.insert(k)
    print("In-order:", bst.inorder())  # sorted
    print("Search 14:", "found" if bst.search(14) else "not found")
    bst.delete(15)
    print("After deleting 15, In-order:", bst.inorder())
```
Explanation (Python):
- `BST.insert` and `BST.delete` are recursive, returning the (possibly new) subtree root so parent links stay correct.
- `inorder`/`preorder`/`postorder` return lists for easy verification and testing.
Dry Run
Using the Python or C code with input [13, 7, 15, 3, 8, 14, 19, 18]:
- Insert 13 → becomes root.
- Insert 7 → 7 < 13 → left of 13.
- Insert 15 → 15 > 13 → right of 13.
- Insert 3 → 3 < 13 → left; 3 < 7 → left of 7.
- Insert 8 → 8 < 13 → left; 8 > 7 → right of 7.
- Insert 14 → 14 > 13 → right; 14 < 15 → left of 15.
- Insert 19 → 19 > 13 → right; 19 > 15 → right of 15.
- Insert 18 → 18 > 13 → right; 18 > 15 → right; 18 < 19 → left of 19.
In-order now prints: 3, 7, 8, 13, 14, 15, 18, 19.
Delete 15 (two-children case):
- Successor = min(right subtree of 15) = 18.
- Copy 18 into node 15’s position (that node now holds 18).
- Recursively delete the original 18 (a leaf) from the right subtree.
Final in-order: 3, 7, 8, 13, 14, 18, 19.
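The dry run can also be replayed mechanically. This standalone sketch condenses the recursive insert/delete logic shown earlier into a few helpers and checks both traversal results:

```python
# Sketch: replay the dry run and verify the in-order output before and
# after deleting 15. Helper names are condensed versions of the code above.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node

def min_node(node):
    while node.left:
        node = node.left
    return node

def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        succ = min_node(node.right)       # in-order successor
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return node

def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []

root = None
for k in [13, 7, 15, 3, 8, 14, 19, 18]:
    root = insert(root, k)
print(inorder(root))  # [3, 7, 8, 13, 14, 15, 18, 19]
root = delete(root, 15)
print(inorder(root))  # [3, 7, 8, 13, 14, 18, 19]
```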
Time Complexity
| Operation | Best | Average (random-like) | Worst (skewed) | Aux Space |
|---|---|---|---|---|
| Search | O(log n) | O(log n) | O(n) | O(h) recursion or O(1) iterative |
| Insert | O(log n) | O(log n) | O(n) | O(h) recursion or O(1) iterative |
| Delete | O(log n) | O(log n) | O(n) | O(h) recursion or O(1) iterative |
| In-order/Pre/Post traversal | O(n) | O(n) | O(n) | O(h) recursion |
| Find min/max | O(1) | O(log n) | O(n) | O(1) iterative |
Note: h is the height of the tree; h ≈ log2 n when balanced, h ≈ n − 1 when skewed.
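The O(1) auxiliary-space entries in the table assume an iterative formulation that replaces the recursion stack with a simple loop. One possible sketch of iterative insert (the `Node` class and the `insert_iter` name are illustrative):

```python
# Sketch: iterative insert, O(h) time and O(1) auxiliary space (no recursion).
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert_iter(root, key):
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        elif key > cur.key:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right
        else:
            return root  # duplicate: ignore

root = None
for k in [13, 7, 15, 3]:
    root = insert_iter(root, k)
print(root.key, root.left.key, root.right.key, root.left.left.key)  # 13 7 15 3
```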
Advantages & Disadvantages
Advantages (in practice):
- Maintains dynamic data in sorted order; in-order traversal gives a sorted list without extra work.
- Efficient search/insert/delete in O(log n) when reasonably balanced, with no need to shift elements in memory (unlike arrays).
- Supports order-based queries naturally: min/max, floor/ceil, predecessor/successor, range queries, k-th smallest (with augmentation).
- Pointer-based structure allows flexible growth and memory usage without contiguous allocation.
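As one example of an order-based query, floor(x) (the largest key ≤ x) falls out of a single root-to-leaf walk in O(h) time. A sketch, assuming a minimal `(key, left, right)` tuple representation for brevity:

```python
# Sketch: floor(x) on a BST — track the best "<= x" key seen while descending.
def floor_key(node, x):
    best = None
    while node is not None:
        key, left, right = node
        if key == x:
            return key
        if key < x:
            best = key      # candidate; a larger one may exist to the right
            node = right
        else:
            node = left
    return best

# Tree from the example section: 13 at the root, 7 and 15 below, etc.
leaf = lambda k: (k, None, None)
tree = (13, (7, leaf(3), leaf(8)), (15, leaf(14), (19, leaf(18), None)))

print(floor_key(tree, 16))  # 15
print(floor_key(tree, 2))   # None (no key <= 2)
```

ceil(x), predecessor, and successor follow the same pattern with the comparisons mirrored.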
Disadvantages:
- Performance is input-order sensitive; can degrade to O(n) if the tree becomes skewed.
- Pointer overhead per node can be higher than array-based structures.
- Requires careful deletion logic (three cases) to maintain invariants.
- Compared to hash tables, average search can be slower (O(log n) vs O(1)), though BSTs provide ordering which hash tables do not.
Comparisons:
- BST vs Hash Table: Hash tables offer average O(1) search/insert/delete but no ordering; BSTs offer ordering and range queries with O(log n) average/balanced time.
- BST vs Sorted Array: Binary search on an array is O(log n) search, but insert/delete are O(n) due to shifting. BSTs offer O(log n) insert/delete when balanced.
- BST vs Balanced BSTs (AVL/Red-Black): Self-balancing trees guarantee O(log n) worst-case by performing rotations; plain BSTs are simpler but risk O(n) in the worst case.
Applications
- Ordered maps/sets in libraries (e.g., TreeMap/TreeSet, std::map) are typically backed by balanced BSTs.
- Symbol tables in compilers and interpreters (name lookup with ordering).
- Range queries and interval lookups (e.g., retrieve all keys in [L, R]).
- Autocomplete and dictionary features where ordered iteration and floor/ceil are needed.
- Implementing priority-like operations when exact ordering and predecessor/successor are required.
- Foundation for more advanced structures: AVL trees, Red-Black trees, Treaps, Order-statistics trees.
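The range-query application illustrates why ordered structure matters: subtrees that cannot contain in-range keys are pruned entirely, giving O(h + m) time for m matches. A sketch (the `Node` class and `range_query` name are illustrative):

```python
# Sketch: collect all keys in [lo, hi], pruning out-of-range subtrees.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def range_query(node, lo, hi, out=None):
    if out is None:
        out = []
    if node is None:
        return out
    if node.key > lo:                 # left subtree may hold keys >= lo
        range_query(node.left, lo, hi, out)
    if lo <= node.key <= hi:
        out.append(node.key)          # in-order visit -> sorted output
    if node.key < hi:                 # right subtree may hold keys <= hi
        range_query(node.right, lo, hi, out)
    return out

# Tree from the example section.
root = Node(13,
            Node(7, Node(3), Node(8)),
            Node(15, Node(14), Node(19, Node(18))))
print(range_query(root, 8, 18))  # [8, 13, 14, 15, 18]
```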
Conclusion
Binary Search Trees provide an elegant, order-aware structure for dynamic sets. Their power lies in the BST property, enabling fast search, insertion, deletion, and ordered traversals. While the plain BST is simple and effective for many workloads, its performance depends on shape; when worst-case guarantees are needed, self-balancing variants are preferred. The C and Python implementations above demonstrate the essential mechanics—search, insert, delete, and traversals—forming a solid base for study and extension.