Problem 12.2

Radix trees

Given two strings $a = a_0 a_1 \ldots a_p$ and $b = b_0 b_1 \ldots b_p$, where each $a_i$ and each $b_j$ is in some ordered set of characters, we say that string $a$ is lexicographically less than string $b$ if either:

  1. there exists an integer $j$, where $0 \le j \le \min(p, q)$, such that $a_i = b_i$ for all $i = 0, 1, \ldots, j - 1$ and $a_j < $b_j$, or
  2. $p < q$ and $a_i = b_i$ for all $i = 0, 1, \ldots p$.

For example, if $a$ and $b$ are bit strings, then $10100 < 10110$ by rule 1 (letting $j = 3$) and $10100 < 101000$ by rule 2. This ordering is similar to that used in English-language dictionaries.

The radix tree data structure shown in Figure 12.5 stores the bit strings $1011$, $10$, $011$, $100$, and $0$. When searching for a key $a = a_0 a_1 \ldots a_p$, we go left at node of depth $i$ if $a_i = 0$ and right if $a_i = 1$. Let $S$ be a set of distinct bit strings whose lengths sum to $n$. Show how to use a radix tree to sort $S$ lexicographically in $\Theta(n)$ time. For example, in Figure 12.5, the output of the sort should be the sequence $0, 011, 10, 100, 1011$.

This is as simple as building the tree and then doing a preorder walk. Each insertion will take $m$ steps, where $m$ is the length of the string being inserted. Given that upper nodes sort lexicographically before lower nodes, and zeroes sort before ones, we need to report the parent before the children, and the left subtree before the right.


C code

#include <stdlib.h>
#include <stdio.h>

struct node_t {
    struct node_t *zero;
    struct node_t *one;
    const char *str;
};

typedef struct node_t node_t;

typedef struct {
    node_t *root;
} radix_tree_t;

typedef void (*callback_t)(const char *);

node_t *make_node() {
    node_t *new = malloc(sizeof(node_t));
    new->zero = NULL;
    new->one = NULL;
    new->str = NULL;
    return new;
}

radix_tree_t *make_radix_tree() {
    radix_tree_t *new = malloc(sizeof(radix_tree_t));
    new->root = make_node();
    return new;
}

void insert(radix_tree_t *tree, const char *string) {
    const char *s = string;
    node_t *current = tree->root;

    while (*s != '\0') {
        if (*s == '0') {
            if (!current->zero) current->zero = make_node();
            current = current->zero;
        } else if (*s == '1') {
            if (!current->one) current->one = make_node();
            current = current->one;
        } else {
            fprintf(stderr, "Invalid string: %s", s);
            exit(1);
        }
        s++;
    }

    current->str = string;
}

void walk(node_t *node, callback_t callback) {
    if (node->str) callback(node->str);
    if (node->zero) walk(node->zero, callback);
    if (node->one) walk(node->one, callback);
}

void walk_sorted(radix_tree_t *tree, callback_t callback) {
    walk(tree->root, callback);
}