
@@ -51,8 +46,7 @@ What is important is that this **'good' property holds even after every change**
The 'good' property in AVL Trees is the **height-balanced** property. Height-balanced on a node is defined as
**difference in height between the left and right child node being not more than 1**.
We say the tree is height-balanced if every node in the tree is height-balanced. Be careful not to conflate
-the concept of "balanced tree" and "height-balanced" property. They are not the same; the latter is used to achieve the
-former.
+the concept of "balanced tree" and "height-balanced" property. They are not the same; the latter is used to achieve the former.
Ponder..
@@ -63,8 +57,8 @@ Yes! In fact, you can always construct a large enough AVL tree where their diffe
-It can be mathematically shown that a **height-balanced tree with n nodes, has at most height <= 2log(n)** (
-in fact, using the golden ratio, we can achieve a tighter bound of ~1.44log(n)).
+It can be mathematically shown that a **height-balanced tree with n nodes has height at most `2 log(n)`**
+(in fact, using the golden ratio, we can derive a tighter bound of `~1.44 log(n)`).
Therefore, following the definition of a balanced tree, AVL trees are balanced.
@@ -73,19 +67,35 @@ Therefore, following the definition of a balanced tree, AVL trees are balanced.
Credits: CS2040s Lecture 9
+### Balance Factor
+To detect imbalance, each node tracks a **balance factor**:
+
+```
+balance factor = height(left subtree) - height(right subtree)
+```
+
+A node is height-balanced if its balance factor is in `{-1, 0, 1}`. When `|balance factor| > 1`, rebalancing is required.
+
+- **Positive** balance factor → left-heavy
+- **Negative** balance factor → right-heavy
+
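As an illustrative sketch (not the repository's actual implementation), the balance factor check might look like this, assuming each node caches its subtree height, with `-1` for an empty subtree:

```java
// Hypothetical sketch: a node caches its height, and the balance factor
// is the difference between its children's heights.
class AvlNode {
    AvlNode left;
    AvlNode right;
    int height; // height of the subtree rooted here; a leaf has height 0
}

class BalanceCheck {
    // an empty subtree is conventionally given height -1
    static int height(AvlNode node) {
        return node == null ? -1 : node.height;
    }

    // balance factor = height(left subtree) - height(right subtree)
    static int balanceFactor(AvlNode node) {
        return height(node.left) - height(node.right);
    }

    // height-balanced means the balance factor is in {-1, 0, 1}
    static boolean isHeightBalanced(AvlNode node) {
        return Math.abs(balanceFactor(node)) <= 1;
    }
}
```

Caching heights is what keeps the check `O(1)` per node; recomputing heights from scratch would cost `O(n)` per query.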
## Complexity Analysis
-**Search, Insertion, Deletion, Predecessor & Successor queries Time**: O(height) = O(logn)
+**Time:**
+| Operation | Complexity |
+|-----------|------------|
+| Search | `O(log n)` |
+| Insert | `O(log n)` |
+| Delete | `O(log n)` |
+| Predecessor/Successor | `O(log n)` |
+| Single Rotation | `O(1)` |
-**Space**: O(n)
-where n is the number of elements (whatever the structure, it must store at least n nodes)
+**Space**: `O(n)` where `n` is the number of elements
## Operations
-Minimally, an implementation of AVL tree must support the standard **insert**, **delete**, and **search** operations.
-**Update** can be simulated by searching for the old key, deleting it, and then inserting a node with the new key.
+An AVL tree supports the standard **insert**, **delete**, and **search** operations.
+**Update** can be simulated by deleting the old key and inserting the new one.
-Naturally, with insertions and deletions, the structure of the tree will change, and it may not satisfy the
-"height-balance" property of the AVL tree. Without this property, we may lose our O(log(n)) run-time guarantee.
-Hence, we need some re-balancing operations. To do so, tree rotation operations are introduced. Below is one example.
+Insertions and deletions can violate the height-balanced property. To restore it, we use **rotations**.

@@ -93,19 +103,56 @@ Hence, we need some re-balancing operations. To do so, tree rotation operations
Credits: CS2040s Lecture 10
-Prof Seth explains it best! Go re-visit his slides (Lecture 10) for the operations :P
-Here is a [link](https://www.youtube.com/watch?v=dS02_IuZPes&list=PLgpwqdiEMkHA0pU_uspC6N88RwMpt9rC8&index=9)
-to prof's lecture on trees.
-_We may add a summary in the near future._
-
-## Application
-While AVL trees offer excellent lookup, insertion, and deletion times due to their strict balancing,
-the overhead of maintaining this balance can make them less preferred for applications
-where insertions and deletions are significantly more frequent than lookups. As a result, AVL trees often find itself
-over-shadowed in practical use by other counterparts like RB-trees,
-which boast a relatively simple implementation and lower overhead, or B-trees which are ideal for optimizing disk
-accesses in databases.
-
-That said, AVL tree is conceptually simple and often used as the base template for further augmentation to tackle
-niche problems. Orthogonal Range Searching and Interval Trees can be implemented with some minor augmentation to
-an existing AVL tree.
+### The 4 Rotation Cases
+After an insert or delete, we walk back up to the root, checking balance factors. When a node has `|balance factor| > 1`, one of four cases applies:
+
+| Case | Condition | Fix |
+|------|-----------|-----|
+| **Left-Left (LL)** | Left-heavy, left child is left-heavy or balanced | Single right rotation |
+| **Right-Right (RR)** | Right-heavy, right child is right-heavy or balanced | Single left rotation |
+| **Left-Right (LR)** | Left-heavy, left child is right-heavy | Left rotate left child, then right rotate node |
+| **Right-Left (RL)** | Right-heavy, right child is left-heavy | Right rotate right child, then left rotate node |
+
+
+**How to identify the case:**
+
+1. Node has balance factor `> 1` (left-heavy):
+ - If left child's balance factor `>= 0` → **LL case**
+ - If left child's balance factor `< 0` → **LR case**
+
+2. Node has balance factor `< -1` (right-heavy):
+ - If right child's balance factor `<= 0` → **RR case**
+ - If right child's balance factor `> 0` → **RL case**
+
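The case analysis above can be condensed into a small dispatcher. Below is a hedged sketch under assumed conventions (height-caching nodes with leaf height 0 and empty subtree height -1; helper names are illustrative, not the repository's code):

```java
// Hedged sketch of the four-case rotation dispatch.
class TreeNode {
    int key;
    TreeNode left;
    TreeNode right;
    int height; // leaf defaults to 0

    TreeNode(int key) {
        this.key = key;
    }
}

class Rebalancer {
    static int height(TreeNode n) {
        return n == null ? -1 : n.height;
    }

    static void updateHeight(TreeNode n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    static int balanceFactor(TreeNode n) {
        return height(n.left) - height(n.right);
    }

    // single right rotation; returns the new subtree root
    static TreeNode rotateRight(TreeNode y) {
        TreeNode x = y.left;
        y.left = x.right;
        x.right = y;
        updateHeight(y);
        updateHeight(x);
        return x;
    }

    // single left rotation; returns the new subtree root
    static TreeNode rotateLeft(TreeNode x) {
        TreeNode y = x.right;
        x.right = y.left;
        y.left = x;
        updateHeight(x);
        updateHeight(y);
        return y;
    }

    // dispatch on the four cases from the table above
    static TreeNode rebalance(TreeNode n) {
        updateHeight(n);
        if (balanceFactor(n) > 1) {            // left-heavy
            if (balanceFactor(n.left) < 0) {   // LR: left child is right-heavy
                n.left = rotateLeft(n.left);
            }
            return rotateRight(n);             // LL (or completed LR)
        }
        if (balanceFactor(n) < -1) {           // right-heavy
            if (balanceFactor(n.right) > 0) {  // RL: right child is left-heavy
                n.right = rotateRight(n.right);
            }
            return rotateLeft(n);              // RR (or completed RL)
        }
        return n;                              // already balanced
    }
}
```

For example, inserting keys 3, 2, 1 in order creates an LL-shaped chain, and rebalancing the root promotes 2 to the new subtree root.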
+
+
+**Interview tip:** Rotations are `O(1)`: just pointer updates. The `O(log n)` cost of insert/delete comes from traversing the height of the tree, not from the rotations themselves.
+
+Prof Seth explains it best! For visual demonstrations, see [Prof Seth's lecture 10](https://www.youtube.com/watch?v=dS02_IuZPes&list=PLgpwqdiEMkHA0pU_uspC6N88RwMpt9rC8&index=9) on trees.
+
+## Notes
+1. **Height guarantee**: AVL trees have height at most `~1.44 log(n)`, tighter than Red-Black trees' `2 log(n)`. This makes AVL faster for lookup-heavy workloads.
+
+2. **Rebalancing frequency**: AVL may rotate more often than RB-trees on insert/delete since it enforces stricter balance. This is the trade-off for faster lookups.
+
+3. **Duplicate keys**: The implementation here does not support duplicate keys. To handle duplicates, you could store a count in each node or use a list as the value.
+
+4. **Augmentation**: AVL trees are a great base for augmented structures. Store additional info (e.g., subtree size for order statistics) and update it during rotations.
+
+## Applications
+AVL trees offer excellent lookup times due to strict balancing, but the overhead of maintaining balance
+can make them less preferred when insertions/deletions vastly outnumber lookups.
+
+| Use Case | Best Choice | Why |
+|----------|-------------|-----|
+| Lookup-heavy workloads | AVL | Stricter balance → faster search |
+| Insert/delete-heavy | Red-Black | Fewer rotations on average |
+| Disk-based storage | B-tree | Optimized for block I/O |
+| In-memory databases | AVL or RB | Both work well |
+
+**Interview tip:** "When would you choose AVL over Red-Black?" → When reads dominate writes, AVL's tighter height bound (`1.44 log n` vs `2 log n`) gives faster lookups.
+
+AVL trees are also commonly used as a base for augmented structures:
+- **Order Statistics Tree** - find k-th smallest element in `O(log n)`
+- **Interval Tree** - find all intervals overlapping a point
+- **Orthogonal Range Tree** - 2D range queries
diff --git a/src/main/java/dataStructures/binarySearchTree/BinarySearchTree.java b/src/main/java/dataStructures/binarySearchTree/BinarySearchTree.java
index 5466d0f0..7af11e25 100644
--- a/src/main/java/dataStructures/binarySearchTree/BinarySearchTree.java
+++ b/src/main/java/dataStructures/binarySearchTree/BinarySearchTree.java
@@ -38,13 +38,17 @@ public class BinarySearchTree
, V> {
private void insert(Node node, T key, V value) {
if (node.getKey().compareTo(key) < 0) {
if (node.getRight() == null) {
- node.setRight(new Node<>(key, value));
+ Node newNode = new Node<>(key, value);
+ newNode.setParent(node);
+ node.setRight(newNode);
} else {
insert(node.getRight(), key, value);
}
} else if (node.getKey().compareTo(key) > 0) {
if (node.getLeft() == null) {
- node.setLeft(new Node<>(key, value));
+ Node newNode = new Node<>(key, value);
+ newNode.setParent(node);
+ node.setLeft(newNode);
} else {
insert(node.getLeft(), key, value);
}
@@ -116,11 +120,15 @@ private Node delete(Node node, T key) {
}
/**
- * Removes a key from the tree, if it exists
+ * Removes a key from the tree, if it exists.
*
* @param key to be removed
+ * @throws RuntimeException if key does not exist
*/
public void delete(T key) {
+ if (root == null) {
+ throw new RuntimeException("Key does not exist!");
+ }
root = delete(root, key);
}
@@ -206,10 +214,13 @@ private T predecessor(Node node) {
* Search for the predecessor of a given key.
*
* @param key find predecessor of this key
- * @return generic type value; null if key has no predecessor
+ * @return generic type value; null if key has no predecessor or key not found
*/
public T predecessor(T key) {
Node curr = root;
+ if (curr == null) {
+ return null;
+ }
while (curr != null) {
if (curr.getKey().compareTo(key) == 0) {
break;
@@ -219,7 +230,9 @@ public T predecessor(T key) {
curr = curr.getLeft();
}
}
-
+ if (curr == null) {
+ return null; // key not found
+ }
return predecessor(curr);
}
@@ -249,10 +262,13 @@ private T successor(Node node) {
* Search for the successor of a given key.
*
* @param key find successor of this key
- * @return generic type value; null if key has no successor
+ * @return generic type value; null if key has no successor or key not found
*/
public T successor(T key) {
Node curr = root;
+ if (curr == null) {
+ return null;
+ }
while (curr != null) {
if (curr.getKey().compareTo(key) == 0) {
break;
@@ -262,7 +278,9 @@ public T successor(T key) {
curr = curr.getLeft();
}
}
-
+ if (curr == null) {
+ return null; // key not found
+ }
return successor(curr);
}
@@ -294,7 +312,7 @@ public List getInorder() {
}
/**
- * Stores in-order traversal of tree rooted at node into a list
+ * Stores pre-order traversal of tree rooted at node into a list
*
* @param node node which the tree is rooted at
*/
diff --git a/src/main/java/dataStructures/binarySearchTree/README.md b/src/main/java/dataStructures/binarySearchTree/README.md
index b8368371..69905f9b 100644
--- a/src/main/java/dataStructures/binarySearchTree/README.md
+++ b/src/main/java/dataStructures/binarySearchTree/README.md
@@ -1,74 +1,77 @@
# Binary Search Tree
-## Overview
+## Background
-A Binary Search Tree (BST) is a tree-based data structure in which each node has at most two children, referred to as
-the left child and the right child. Each node in a BST contains a unique key and an associated value. The tree is
-structured so that, for any given node:
+A Binary Search Tree (BST) is a node-based tree structure where each node has at most two children. The key property (**BST invariant**):
+- Left subtree contains only nodes with keys **less than** the node's key
+- Right subtree contains only nodes with keys **greater than** the node's key
-1. The left subtree contains nodes with keys less than the node's key.
-2. The right subtree contains nodes with keys greater than the node's key.
+This ordering enables efficient search: each comparison discards one subtree, roughly halving the remaining nodes when the tree is balanced. Most operations run in time proportional to the tree's height, so they are efficient on balanced trees.
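As a minimal sketch of this halving behaviour (illustrative `int`-keyed nodes, not the repository's generic implementation):

```java
// Each comparison discards one subtree, so a balanced tree of n nodes
// is searched in O(log n) comparisons.
class SearchNode {
    int key;
    SearchNode left;
    SearchNode right;

    SearchNode(int key) {
        this.key = key;
    }
}

class BstSearch {
    static SearchNode search(SearchNode node, int key) {
        while (node != null && node.key != key) {
            node = key < node.key ? node.left : node.right; // discard the other subtree
        }
        return node; // null if the key is absent
    }
}
```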
-This property makes BSTs efficient for operations like searching, as the average time complexity for many operations is
-proportional to the tree's height.
+### Predecessor and Successor
-Note: in the following explanation a "smaller" node refers to a node with a smaller key and a "larger" node refers to a
-node with a larger key.
+- **Predecessor**: The largest key smaller than the given key
+- **Successor**: The smallest key larger than the given key
-## Implementation
+Finding these involves two cases:
+1. **In subtree**: Predecessor is the rightmost node in left subtree; successor is the leftmost node in right subtree
+2. **In ancestors**: If that subtree is empty, traverse up via parent pointers; the successor is the first ancestor reached from its left child, and the predecessor is the first ancestor reached from its right child
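To illustrate the two cases, here is a hedged sketch of the successor logic (hypothetical `BstNode` with a parent pointer; not the repository's exact code):

```java
// Successor: either the leftmost node of the right subtree (case 1),
// or the first ancestor reached from a left child (case 2).
class BstNode {
    int key;
    BstNode left;
    BstNode right;
    BstNode parent;

    BstNode(int key) {
        this.key = key;
    }
}

class SuccessorDemo {
    static BstNode successor(BstNode node) {
        if (node.right != null) {
            BstNode curr = node.right; // case 1: leftmost node of the right subtree
            while (curr.left != null) {
                curr = curr.left;
            }
            return curr;
        }
        BstNode curr = node; // case 2: climb while we are a right child
        while (curr.parent != null && curr == curr.parent.right) {
            curr = curr.parent;
        }
        return curr.parent; // null when node holds the maximum key
    }
}
```

Predecessor is symmetric: rightmost node of the left subtree, or climb while the current node is a left child.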
-### BinarySearchTree Class
+### Delete Operation
-The BinarySearchTree class is a generic implementation of a BST. It supports a variety of operations that allow
-interaction with the tree:
+Delete has three cases based on the node's children:
-- root(): Retrieve the root node of the tree.
-- insert(T key, V value): Insert a key-value pair into the tree.
-- delete(T key): Remove a key and its associated value from the tree.
-- search(T key): Find a node with a specified key.
-- predecessor(T key): Find the key of the predecessor of a specified key.
-- successor(T key): Find the key of the successor of a specified key.
-- searchMin(): Find the node with the minimum key in the tree.
-- searchMax(): Find the node with the maximum key in the tree.
-- getInorder(): Return an in-order traversal of the tree.
-- getPreorder(): Return a pre-order traversal of the tree.
-- getPostorder(): Return a post-order traversal of the tree.
-- getLevelorder(): Return a level-order traversal of the tree.
+| Case | Strategy |
+|------|----------|
+| **0 children** (leaf) | Simply remove the node |
+| **1 child** | Replace node with its child |
+| **2 children** | Replace node's key/value with its **successor**, then delete the successor |
-We will expand on the delete implementation due to its relative complexity.
+Why does the 2-children case work?
-#### Delete Implementation Details
+The successor (smallest node in right subtree) maintains the BST invariant because:
+1. It's larger than everything in the left subtree (since it's larger than the deleted node)
+2. It's smaller than everything else in the right subtree (since it's the minimum there)
-The delete operation is split into three different cases - when the node to be deleted has no children, one child or
-two children.
+Using the predecessor (largest in left subtree) works equally well.
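The three cases can be sketched with a simplified `int`-keyed recursive delete (a hedged illustration, not the repository's generic implementation):

```java
// Recursive delete covering all three cases; the two-children case copies
// the successor's key into the node, then deletes the successor.
class DelNode {
    int key;
    DelNode left;
    DelNode right;

    DelNode(int key) {
        this.key = key;
    }
}

class DeleteDemo {
    static DelNode delete(DelNode node, int key) {
        if (node == null) {
            return null; // key not found
        }
        if (key < node.key) {
            node.left = delete(node.left, key);
        } else if (key > node.key) {
            node.right = delete(node.right, key);
        } else if (node.left == null) {
            return node.right; // 0 or 1 child: splice the node out
        } else if (node.right == null) {
            return node.left;
        } else {
            DelNode succ = node.right; // 2 children: find the successor
            while (succ.left != null) {
                succ = succ.left;
            }
            node.key = succ.key; // copy successor's key, then remove it below
            node.right = delete(node.right, succ.key);
        }
        return node;
    }
}
```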
-**No children:** Simply delete the node.
+
-**One child:** Reassign the parent attribute of the child to the parent of the node to be deleted. This will not violate
-the binary search tree property as the right child will definitely be smaller than the parent of the deleted node.
+## Complexity Analysis
-**Two children:** Replace the deleted node with its successor. This works because the binary search tree property is
-maintained:
+| Operation | Average | Worst Case | Notes |
+|-----------|---------|------------|-------|
+| `search()` | `O(log n)` | `O(n)` | Worst case: degenerate (linear) tree |
+| `insert()` | `O(log n)` | `O(n)` | Same as search |
+| `delete()` | `O(log n)` | `O(n)` | Involves search + potential successor lookup |
+| `predecessor()` / `successor()` | `O(log n)` | `O(n)` | May traverse full height |
+| `searchMin()` / `searchMax()` | `O(log n)` | `O(n)` | Traverse to leftmost/rightmost |
+| Traversals | `O(n)` | `O(n)` | Must visit all nodes |
-1. the entire left subtree will definitely be smaller than the successor as the successor is larger than the deleted
- node
-2. the entire right subtree will definitely be larger than the successor as the successor will be the smallest node in
- the right subtree
+**Space**: `O(n)` for storing n nodes
-### Node
+**Interview tip:** The worst case `O(n)` occurs when keys are inserted in sorted order, creating a "linked list" shape, also known as a "degenerate tree". This is why self-balancing trees (AVL, Red-Black) exist.
-The Node class represents the nodes within the BinarySearchTree. Each Node instance contains:
+## Notes
-- key: The unique key associated with the node.
-- value: The value associated with the key.
-- left: Reference to the left child.
-- right: Reference to the right child.
-- parent: Reference to the parent node.
+1. **No duplicates**: This implementation throws an exception on duplicate keys. To support duplicates, you could store counts in nodes or use a list as the value.
-## Complexity Analysis
+2. **Parent pointers**: Nodes maintain parent references to enable upward traversal for predecessor/successor finding. This adds space overhead but simplifies these operations.
+
+3. **Key-value pairs**: The implementation stores both keys (for ordering) and values (for data). Keys must be `Comparable`.
+
+4. **Unbalanced by design**: A basic BST does not self-balance. For guaranteed `O(log n)` operations, use [AVL Tree](../avlTree) or Red-Black Tree.
+
+## Applications
-**Time Complexity:** For a balanced tree, most operations (insert, delete, search) can be performed in O(log n) time,
-except tree traversal operations which can be performed in O(n) time. However, in the worst case (an unbalanced tree),
-these operations may degrade to O(n).
+| Use Case | Why BST? |
+|----------|----------|
+| In-memory sorted data | In-order traversal yields sorted sequence |
+| Range queries | Find all keys in `[a, b]` efficiently |
+| Floor/ceiling queries | Find largest key ≤ x or smallest key ≥ x |
+| Symbol tables | Key-value lookup with ordering |
-**Space Complexity:** O(n), where n is the number of elements in the tree.
+**When to use BST vs alternatives:**
+- Need sorted iteration? → BST (HashMap can't do this)
+- Need guaranteed `O(log n)`? → Use AVL/Red-Black instead of basic BST
+- Only need insert/search/delete? → HashMap is `O(1)` average, simpler
diff --git a/src/main/java/dataStructures/disjointSet/README.md b/src/main/java/dataStructures/disjointSet/README.md
index 23de0b52..a97cb4ee 100644
--- a/src/main/java/dataStructures/disjointSet/README.md
+++ b/src/main/java/dataStructures/disjointSet/README.md
@@ -27,10 +27,23 @@ Querying for connectivity and updating usually tracked with an internal array.
a balanced tree and hence complexity does not necessarily improve
- Note, this is not implemented but details can be found under weighted union folder.
-3. **Weighted Union** - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to
-improved complexities.
+3. **Weighted Union** - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to
+improved complexities.
- Can be further augmented with path compression.
+## Complexity Analysis
+
+| Implementation | Union | Find | Space |
+|----------------|-------|------|-------|
+| Quick Find | `O(n)` | `O(1)` | `O(n)` |
+| Quick Union | `O(n)` | `O(n)` | `O(n)` |
+| Weighted Union | `O(log n)` | `O(log n)` | `O(n)` |
+| Weighted Union + Path Compression | `O(α(n))`* | `O(α(n))`* | `O(n)` |
+
+*`α(n)` is the inverse Ackermann function, which grows so slowly that it's effectively constant (`≤ 4`) for all practical input sizes.
+
+**Interview tip:** When asked about Union-Find complexity with path compression, say "amortized nearly constant time" or "O(α(n)) where α is the inverse Ackermann function, practically constant."
+
## Applications
Because of its efficiency and simplicity in implementing, Disjoint Set structures are widely used in practice:
1. As mentioned, it is often used as a helper structure for Kruskal's MST algorithm
diff --git a/src/main/java/dataStructures/disjointSet/quickFind/README.md b/src/main/java/dataStructures/disjointSet/quickFind/README.md
index 69291d4c..ee7313d9 100644
--- a/src/main/java/dataStructures/disjointSet/quickFind/README.md
+++ b/src/main/java/dataStructures/disjointSet/quickFind/README.md
@@ -21,12 +21,18 @@ Simply use the component identifier array to query for the component identity of
and check if they are equal. This is why this implementation is known as "Quick Find".
## Complexity Analysis
-Let n be the number of elements in consideration.
-**Time**:
- - Union: O(n)
- - Find: O(1)
+| Operation | Time | Notes |
+|-----------|------|-------|
+| Find | `O(1)` | Direct lookup in map |
+| Union | `O(n)` | Must scan all elements to update identifiers |
-**Space**: O(n) auxiliary space for the component identifier
+**Space**: `O(n)` for the component identifier map
-## Notes
\ No newline at end of file
+## Notes
+
+1. **When to use**: Quick Find is suitable when finds vastly outnumber unions. If you have many union operations, consider Weighted Union instead.
+
+2. **HashMap vs Array**: Our implementation uses `HashMap` to support arbitrary object types. If elements are integers `0` to `n-1`, a simple array suffices and is faster.
+
+3. **Union cost adds up**: Performing `n` union operations costs `O(n²)` total, which becomes prohibitive for large datasets. This is the main limitation of Quick Find.
\ No newline at end of file
diff --git a/src/main/java/dataStructures/disjointSet/weightedUnion/DisjointSet.java b/src/main/java/dataStructures/disjointSet/weightedUnion/DisjointSet.java
index 9f1419f1..c5b76f08 100644
--- a/src/main/java/dataStructures/disjointSet/weightedUnion/DisjointSet.java
+++ b/src/main/java/dataStructures/disjointSet/weightedUnion/DisjointSet.java
@@ -92,9 +92,12 @@ public void add(T obj) {
* Checks if object a and object b are in the same component.
* @param a
* @param b
- * @return
+ * @return true if in same component, false otherwise or if either key doesn't exist
*/
public boolean find(T a, T b) {
+ if (!parents.containsKey(a) || !parents.containsKey(b)) {
+ return false;
+ }
T rootOfA = findRoot(a);
T rootOfB = findRoot(b);
return rootOfA.equals(rootOfB);
@@ -106,17 +109,26 @@ public boolean find(T a, T b) {
* @param b
*/
public void union(T a, T b) {
+ if (!parents.containsKey(a) || !parents.containsKey(b)) {
+ return; // key(s) does not exist; do nothing
+ }
+
T rootOfA = findRoot(a);
T rootOfB = findRoot(b);
+
+ if (rootOfA.equals(rootOfB)) {
+ return; // already in same component
+ }
+
int sizeA = size.get(rootOfA);
int sizeB = size.get(rootOfB);
if (sizeA < sizeB) {
- parents.put(rootOfA, rootOfB); // update root A to be child of root B
- size.put(rootOfB, size.get(rootOfB) + size.get(rootOfA)); // update size of bigger tree
+ parents.put(rootOfA, rootOfB);
+ size.put(rootOfB, sizeA + sizeB);
} else {
- parents.put(rootOfB, rootOfA); // update root B to be child of root A
- size.put(rootOfA, size.get(rootOfA) + size.get(rootOfB)); // update size of bigger tree
+ parents.put(rootOfB, rootOfA);
+ size.put(rootOfA, sizeA + sizeB);
}
}
diff --git a/src/main/java/dataStructures/disjointSet/weightedUnion/README.md b/src/main/java/dataStructures/disjointSet/weightedUnion/README.md
index 83b2912f..fbbe4245 100644
--- a/src/main/java/dataStructures/disjointSet/weightedUnion/README.md
+++ b/src/main/java/dataStructures/disjointSet/weightedUnion/README.md
@@ -28,11 +28,11 @@ For each of the node, we traverse up the tree from the current node until the ro
two roots are the same
## Complexity Analysis
-**Time**: O(n) for Union and Find operations. While union-ing is indeed quick, it is possibly undermined
+**Time**: `O(n)` for Union and Find operations. While the union step itself is quick, it is possibly undermined
by O(n) traversal in the case of a degenerate tree. Note that at this stage, there is nothing to ensure the trees
are balanced.
-**Space**: O(n), implementation still involves wrapping the n elements with some structure / wrapper (e.g. Node class).
+**Space**: `O(n)`; still involves wrapping the n elements with some structure / wrapper (e.g. Node class).
# Weighted Union
@@ -70,9 +70,13 @@ In other words, using internal arrays or hash maps to track is sufficient to sim
Our implementation uses hash map to account for arbitrary object type.
## Complexity Analysis
-**Time**: O(log(n)) for Union and Find operations.
-**Space**: Remains at O(n)
+| Operation | Time | Notes |
+|-----------|------|-------|
+| Find | `O(log n)` | Height bounded by `log(n)` |
+| Union | `O(log n)` | Dominated by two find operations |
+
+**Space**: `O(n)` for parent and size tracking
### Path Compression
We can further improve on the time complexity of Weighted Union by introducing path compression. Specifically, during
@@ -89,12 +93,20 @@ grandparents._
Credits: CS2040s Lecture Slides
-The analysis with compression is a bit trickier here and talks about the inverse-Ackermann function.
-Interested readers can find out more [here](https://dl.acm.org/doi/pdf/10.1145/321879.321884).
+The analysis with compression is trickier and involves the **inverse Ackermann function** `α(n)`.
+
+| Operation | Amortized Time |
+|-----------|----------------|
+| Find | `O(α(n))` |
+| Union | `O(α(n))` |
+
+**What is `α(n)`?** The inverse Ackermann function grows *incredibly* slowly: for all practical values of `n` (up to ~10^80, more than the number of atoms in the observable universe), `α(n) ≤ 4`. For all practical purposes, it is constant.
+
+**Interview tip:** "With weighted union and path compression, union-find operations are amortized O(α(n)), which is effectively constant time for any realistic input size."
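For illustration, here is a hedged sketch of find-root with the "point to grandparents" (path-halving) compression described in the slides, using an `int[]` parent array for brevity; the repository's `HashMap`-based version differs in bookkeeping:

```java
// Path halving: every node on the search path is redirected to its
// grandparent, roughly halving the path length each time.
class WeightedUnionFind {
    int[] parent;

    WeightedUnionFind(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // each element starts as its own root
        }
    }

    int findRoot(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // redirect to grandparent
            x = parent[x];
        }
        return x;
    }
}
```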
-**Time**: O(alpha)
+For the formal analysis, see [Tarjan's original paper](https://dl.acm.org/doi/pdf/10.1145/321879.321884).
-**Space**: O(n)
+**Space**: `O(n)`
## Notes
### Sample Demo - LeetCode 684: Redundant Connections
diff --git a/src/main/java/dataStructures/heap/MaxHeap.java b/src/main/java/dataStructures/heap/MaxHeap.java
index 9c105d37..d71b77a3 100644
--- a/src/main/java/dataStructures/heap/MaxHeap.java
+++ b/src/main/java/dataStructures/heap/MaxHeap.java
@@ -14,9 +14,10 @@
* offer(T item) - O(log(n))
* poll() - O(log(n)); Often named as extractMax(), poll is the corresponding counterpart in PriorityQueue
* remove(T obj) - O(log(n))
+ * updateKey(T obj) - O(log(n))
* decreaseKey(T obj) - O(log(n))
* increaseKey(T obj) - O(log(n))
- * heapify(List lst) - O(n)
+ * heapify(List lst) - O(n)
* heapify(T ...seq) - O(n)
* toString()
*
@@ -66,17 +67,17 @@ public T poll() {
/**
* Inserts item into heap.
+ * Note: Duplicates are not supported due to the Map augmentation.
*
* @param item item to be inserted
*/
public void offer(T item) {
- // shouldn't happen as mentioned in README; do nothing, though should customize behaviour in practice
if (indexOf.containsKey(item)) {
-
+ return; // duplicates not supported
}
heap.add(item); // add to the end of the arraylist
- indexOf.put(item, size() - 1); // add item into index map; here becomes problematic if there are duplicates
+ indexOf.put(item, size() - 1); // add item into index map
bubbleUp(size() - 1); // bubbleUp to rightful place
}
@@ -86,8 +87,8 @@ public void offer(T item) {
* @param obj object to be removed
*/
public void remove(T obj) {
- if (!indexOf.containsKey(obj)) { // do nothing
-
+ if (!indexOf.containsKey(obj)) {
+ return; // object not in heap
}
remove(indexOf.get(obj));
}
@@ -99,11 +100,13 @@ public void remove(T obj) {
* @return deleted element
*/
private T remove(int i) {
- T item = get(i); // remember element to be removed
+ T item = get(i);
swap(i, size() - 1); // O(1) swap with last element in the heap
- heap.remove(size() - 1); // O(1)
- indexOf.remove(item); // remove from index map
- bubbleDown(i); // O(log n)
+ heap.remove(size() - 1);
+ indexOf.remove(item);
+ if (i < size()) { // only bubbleDown if not removing the last element
+ bubbleDown(i);
+ }
return item;
}
@@ -111,17 +114,16 @@ private T remove(int i) {
* Decrease the corresponding value of the object.
*
* @param obj old object
- * @param updatedObj updated object
+ * @param updatedObj updated object with smaller value
*/
public void decreaseKey(T obj, T updatedObj) {
- // shouldn't happen; do nothing, though should customize behaviour in practice
- if (updatedObj.compareTo(obj) > 0) {
-
+ if (!indexOf.containsKey(obj) || updatedObj.compareTo(obj) > 0) {
+ return; // object not found or updatedObj is not smaller
}
- int idx = indexOf.get(obj); // get the index of the object in the array implementation
- heap.set(idx, updatedObj); // simply replace
- indexOf.remove(obj); // no longer exists
+ int idx = indexOf.get(obj);
+ heap.set(idx, updatedObj);
+ indexOf.remove(obj);
indexOf.put(updatedObj, idx);
bubbleDown(idx);
}
@@ -130,21 +132,41 @@ public void decreaseKey(T obj, T updatedObj) {
* Increase the corresponding value of the object.
*
* @param obj old object
- * @param updatedObj updated object
+ * @param updatedObj updated object with larger value
*/
public void increaseKey(T obj, T updatedObj) {
- // shouldn't happen; do nothing, though should customize behaviour in practice
- if (updatedObj.compareTo(obj) < 0) {
- return;
+ if (!indexOf.containsKey(obj) || updatedObj.compareTo(obj) < 0) {
+ return; // object not found or updatedObj is not larger
}
- int idx = indexOf.get(obj); // get the index of the object in the array implementation
- heap.set(idx, updatedObj); // simply replace
- indexOf.remove(obj); // no longer exists
+ int idx = indexOf.get(obj);
+ heap.set(idx, updatedObj);
+ indexOf.remove(obj);
indexOf.put(updatedObj, idx);
bubbleUp(idx);
}
+ /**
+ * Update the value of an object in the heap.
+ * Delegates to increaseKey or decreaseKey based on the comparison.
+ * In practice, this unified method is often sufficient.
+ *
+ * @param obj old object
+ * @param updatedObj updated object
+ */
+ public void updateKey(T obj, T updatedObj) {
+ if (!indexOf.containsKey(obj)) {
+ return;
+ }
+ int cmp = updatedObj.compareTo(obj);
+ if (cmp > 0) {
+ increaseKey(obj, updatedObj);
+ } else if (cmp < 0) {
+ decreaseKey(obj, updatedObj);
+ }
+ // if equal, no change needed
+ }
+
/**
* Takes in a list of objects and convert it into a heap structure.
*
@@ -183,6 +205,9 @@ public void heapify(T... seq) {
*/
@Override
public String toString() {
+ if (size() == 0) {
+ return "[]";
+ }
StringBuilder ret = new StringBuilder("[");
for (int i = 0; i < size(); i++) {
ret.append(heap.get(i));
@@ -284,7 +309,8 @@ private void bubbleUp(int i) {
* @return boolean value that determines is leaf or not
*/
private boolean isLeaf(int i) {
- // actually, suffice to compare index of left child of a node and size of heap
+        // a node is a leaf when it has neither a left nor a right child;
+        // in fact, checking the left child index alone would suffice, since the right child index is always larger
return getRightIndex(i) >= size() && getLeftIndex(i) >= size();
}
@@ -296,25 +322,25 @@ private boolean isLeaf(int i) {
*/
private void bubbleDown(int i) {
while (!isLeaf(i)) {
- T maxItem = get(i);
- int maxIndex = i; // index of max item
+ T biggestItem = get(i);
+            int biggestItemIndex = i; // index of the biggest item so far
// check if left child is greater in priority, if left exists
- if (getLeftIndex(i) < size() && maxItem.compareTo(getLeft(i)) < 0) {
- maxItem = getLeft(i);
- maxIndex = getLeftIndex(i);
+ if (getLeftIndex(i) < size() && biggestItem.compareTo(getLeft(i)) < 0) {
+ biggestItem = getLeft(i);
+ biggestItemIndex = getLeftIndex(i);
}
// check if right child is greater in priority, if right exists
- if (getRightIndex(i) < size() && maxItem.compareTo(getRight(i)) < 0) {
- maxIndex = getRightIndex(i);
+ if (getRightIndex(i) < size() && biggestItem.compareTo(getRight(i)) < 0) {
+ biggestItem = getRight(i);
+ biggestItemIndex = getRightIndex(i);
}
- if (maxIndex != i) {
- swap(i, maxIndex);
- i = maxIndex;
- } else {
- break;
+ if (biggestItemIndex == i) {
+ break; // heap property is achieved
}
+ swap(i, biggestItemIndex);
+ i = biggestItemIndex;
}
}
}
diff --git a/src/main/java/dataStructures/heap/README.md b/src/main/java/dataStructures/heap/README.md
index 554f7990..e8a29a0c 100644
--- a/src/main/java/dataStructures/heap/README.md
+++ b/src/main/java/dataStructures/heap/README.md
@@ -61,47 +61,84 @@ After all, the log factor in the order of growth will turn log(E) = log(V^2) in
to 2log(V) = O(log(V)).
### Heapify - Choice between bubbleUp and bubbleDown
-Heapify is a process used to create a heap data structure from an unordered array. One can also call `offer()` or
-some insertion equivalent starting from an empty array, but that would take O(nlogn).
+Heapify converts an unordered array into a heap. Two approaches exist:
+
+1. **Naive approach**: Insert elements one by one using `offer()` → `O(n log n)`
+2. **Efficient approach**: BubbleDown from back to front → `O(n)`
+
+Both are **correct**, but bubbleDown is more efficient. Here's why: