We showed that the worst-case number $T(n)$ of comparisons used by

`SELECT`

to select the $i$th order statistic from $n$ numbers satisfies $T(n) = \Theta(n)$, but the constant hidden by the $\Theta$-notation is rather large. When $i$ is small relative to $n$, we can implement a different procedure that uses`SELECT`

as a subroutine but makes fewer comparisons in the worst case.

- Describe an algorithm that uses $U_i(n)$ comparisons to find the $i$th smallest of $n$ elements, where $$ U_i(n) = \begin{cases} T(n) & \text{if } i \ge n/2 \\ \lfloor n/2 \rfloor + U_i(\lceil n/2 \rceil) + T(2i) & \text{otherwise} \end{cases} $$ (
Hint:Begin with $\lfloor n/2 \rfloor$ disjoint pairwise comparisons, and recurse on the set containing the smaller element from each pair.)- Show that, if $i < n/2$, then $U_i(n) = n + \O(T(2i)\lg(n/i))$.
- Show that, if $i$ is a constant less than $n/2$, then $U_i(n) = n + \O(\lg{n})$.
- Show that, if $i = n/k$ for $k \ge 2$, then $U_i(n) = n + \O(T(2n/k)\lg{k})$.

This is a modified version of `SELECT`

. Not only it finds the $i$th order
statistic, but it also partitions the array, thus finding the $i-1$ smaller
elements.

- If $i \ge n/2$, we just use
`SELECT`

- Otherwise, split the array in pairs and compare each pair.
- We take the smaller elements of each pair, but keep track of the other one.
- We recursively find the first $i$ elements among the smaller elements
- The $i$th order statistic is among the pairs containing the smaller
elements we found in the previous step. We call
`SELECT`

on those $2i$ elements. That's the final answer.

Just picking the smaller element of each pair is not enough. For example, if
we're looking for the 2nd order statistic and our pairs are `1, 2`

, `3, 4`

,
`5, 6`

, `7, 8`

, `9, 10`

, the answer is in the larger part of the first pair.
That's why we need to keep track and later perform `SELECT`

on $2i$ elements.

Steps 1-4 can be implemented in place by modifying the algorithm to put the
larger elements of the pairs on the inactive side of the pivot and modifying
`PARTITION`

to swap the elements on the inactive side every time it swaps
elements on the active side. More details can be found in the Instructor's
Manual.

We can prove (b) by induction. This is the step:

$$ \begin{align} U_i(n) &= \lfloor n/2 \rfloor + U_i(\lceil n/2 \rceil) + T(2i) \\ &= \lfloor n/2 \rfloor + \lceil n/2 \rceil + \O(T(2i)\lg(\lfloor n/2 \rfloor / i)) + T(2i) \\ &= n + \O(T(2i)\lg(n/i)) + T(2i) \\ &= n + \O(T(2i)\lg(n/i)) \end{align} $$

This is a bit more sloppy that doing it with the substitution method, but that feels like grunt work to me at this point.

The other two are obvious:

$$ \begin{align} U_i(n) &= n + \O(T(2i)\lg(n/i)) \\ &= n + \O(\O(1)\lg(n/i)) \\ &= n + \O(\lg{n} - \lg{i}) \\ &= n + \O(\lg{n} - \O(1)) \\ &= n + \O(\lg{n}) \end{align} $$

$$ \begin{align} U_i(n) &= n + \O(T(2i)\lg(n/i)) \\ &= n + \O(T(2n/k)\lg(n/(n/k))) \\ &= n + \O(T(2n/k)\lg{k}) \\ \end{align} $$

Again, this reasoning is sloppy, but I don't feel like applying the substitution method.