Exercise 12.3.5
Suppose that instead of each node $x$ keeping the attribute $x.p$ pointing to $x$'s parent, it keeps $x.succ$, pointing to $x$'s successor. Give pseudocode for
SEARCH
,INSERT
, andDELETE
on a binary search tree $T$ using this representation. These procedures should operate in time $\O(h)$, where $h$ is the height of the tree $T$. (Hint: You may wish to implement a subroutine that returns the parent of a node.)
I've got to admit, I was skeptical about this one. It was quite a good exercise, however, as it forces you to internalize a lot about binary search trees and their predecessor/successor invariants.
The pseudocode and the C code differ a bit, as the C code is a bit more complicated to avoid a few extra steps. It even can be optimized a bit further, but it's not worth the complexity. The asymptotic bounds are going to be the same as the pseudocode
Some thoughts
I pondered a bit about why it would be useful to keep track of the successor
instead of the parent. I may be missing something, but I can find a single
advantage – the tree can be walked inorder without extra space and a bit more
optimally. Another benefit is that it simplifies finding the successor of the
deleted node in DELETE
. Apart from that, not keeping track of the parent
introduces some overhead, especially when deleting.
Notable differences
The pseudocode needs two change in two significant ways:
As the parent is no longer findable through a pointer, we need to walk the three from the root and find it. On the flip side, we don't need to update parent pointers.
Whenever we remove a node, we need to preserve the invariant of the successor field. This means that we need to find the predecessor of the node we are deleting, and have it successor point to the deleted node's successor.
Pseudocode
SEARCH
remains unchanged:
TREE-SEARCH(x, k)
if x = NIL or k == x.key
return x
if k < x.key
return TREE-SEARCH(x.left, k)
else
return TREE-SEARCH(x.right, k)
INSERT
needs to allocate the correct successor of the newly inserted node and
update it's predecessor. Since we're only inserting leaf nodes, both predecessor
and successor are findable through their invariants:
The predecessor is going to be the last parent at which we branched right. It might be the parent of the inserted node, if we put it at the left position.
The successor is going to be the last parent at which at branched left.
-
Notably, we don't need to explore the new node's children, as they don't exist because it's a leaf node.
TREE-INSERT(T, z) y = NIL x = T.root pred = NIL
while x != NIL y = x if z.key < x.key z.succ = x x = x.left else pred = x x = x.right if y == NIL T.root = z else if z.key < y.key y.left = z else y.right = z pred = y if pred != NIL pred.succ = z
The final one, DELETE
is a bit more complicated. It needs TREE-PARENT
, that
finds the parent of a node, starting from the root, and TREE-PREDECESSOR
that
finds finds the predecessor of a node, starting from the root. It also needs
a modified version of TRANSPLANT
. Since those are pretty self-explanatory,
let's start with DELETE
and explore them afterwards.
It's worth noting that we don't need to call TREE-MINIMUM
on z.right
in the
two-children case, as we already know its successor.
TREE-DELETE(T, z)
pred = TREE-PREDECESSOR(z)
if z.left == NIL
TRANSPLANT(T, z, z.right)
else if z.right == NIL
TRANSPLANT(T, z, z.left)
else
y = z.succ
if (z.right != y)
TRANSPLANT(T, y, y.right)
y.right = z.right
TRANSPLANT(T, z, y)
y.left = z.left
if pred != NIL
pred.succ = z.succ
Node that y.p != z
needs to be changed to z.right != y
, which means the same
thing.
TRANSPLANT
now needs to explicitly find the parent:
TRANSPLANT(T, u, v)
p = TREE-PARENT(T, u)
if p == NIL
T.root = v
else if u == p.left
p.left = v
else
p.right = v
The notable change here is that we don't need to update the v
's parent, as we
don't keep track of it.
Finally, TREE-PREDECESSOR
is fairly needs to start from the root, instead of
the node:
TREE-PREDECESSOR(T, x)
y = NIL
z = T.root
while z != x
if z.key < x.key
y = z
z = z.right
else
z = z.left
z = z.left
while z != NIL
y = z
z = z.right
return y
The first loop traverses the tree from the root to x
, keeping track of the
last right branch to mark it as a potential predecessor. When it finishes, z
equals x
. If there is no left child, we're done, and the ancestor is the
predecessor. If there is a left child, we follow it and keep going left, until
there are no no mode left children to follow. The last is essentially what
TREE-MINIMUM
does. We could change the order of the loops (find the minimum of
the left child and go through the root only if it doesn't exist). While it will
improve the constant, it's asymptotically the same.
The C code merges TREE-PREDECESSOR
and TREE-PARENT
into one.
It's pretty easy to illustrate that the result is $\O(h)$, so I won't bother.
C code
#include <stdlib.h> struct node_t { struct node_t *left; struct node_t *right; struct node_t *succ; int key; }; typedef struct node_t node_t; typedef struct { node_t *root; } tree_t; tree_t *make_tree() { tree_t *tree = malloc(sizeof(tree_t)); tree->root = NULL; return tree; } node_t *make_node(int key) { node_t *node = malloc(sizeof(node_t)); node->succ = NULL; node->left = NULL; node->right = NULL; node->key = key; return node; } node_t *insert(tree_t *tree, int key) { node_t *parent = NULL; node_t *current = tree->root; node_t *new = make_node(key); node_t *pred = NULL; while (current) { parent = current; if (new->key < current->key) { new->succ = current; current = current->left; } else { pred = current; current = current->right; } } if (!parent) { tree->root = new; } else if (new->key < parent->key) { parent->left = new; } else { parent->right = new; pred = parent; } if (pred) pred->succ = new; return new; } node_t *find_parent(tree_t *tree, node_t *node) { node_t *previous = NULL; node_t *current = tree->root; while (current && current->key != node->key) { if (current->key < node->key) { previous = current; current = current->right; } else { previous = current; current = current->left; } } return previous; } void find_parent_and_predecessor(tree_t *tree, node_t *node, node_t **parent, node_t **predecessor) { *parent = NULL; *predecessor = NULL; node_t *current = tree->root; while (current && current->key != node->key) { if (current->key < node->key) { *parent = current; *predecessor = current; current = current->right; } else { *parent = current; current = current->left; } } if (!current) return; current = current->left; while (current) { *predecessor = current; current = current->right; } } void transplant(tree_t *tree, node_t *parent, node_t *target, node_t *source) { if (!parent) { tree->root = source; } else if (target == parent->left) { parent->left = source; } else { parent->right = source; } } void delete_tree(tree_t *tree, node_t *node) { node_t *parent; node_t *predecessor; find_parent_and_predecessor(tree, node, &parent, &predecessor); if (!node->left) { transplant(tree, parent, node, node->right); } else if (!node->right) { transplant(tree, parent, node, node->left); } else { node_t *successor = node->succ; if (node->right != successor) { node_t *sparent = find_parent(tree, successor); transplant(tree, sparent, successor, successor->right); successor->right = node->right; } transplant(tree, parent, node, successor); successor->left = node->left; } if (predecessor) predecessor->succ = node->succ; } node_t *search(tree_t *tree, int key) { node_t *node = tree->root; while (node) { if (node->key == key) { return node; } else if (node->key < key) { node = node->right; } else { node = node->left; } } return NULL; } typedef void (*callback_t)(node_t *); void inorder_walk(node_t *node, callback_t callback) { if (!node) return; inorder_walk(node->left, callback); callback(node); inorder_walk(node->right, callback); } void successor_walk(node_t *node, callback_t callback) { while (node->left) node = node->left; while (node) { callback(node); node = node->succ; } }