Problem 5.2
Searching an unsorted array
The problem examines three algorithms for searching for a value $x$ in an unsorted array $A$ consisting for $n$ elements.
Consider the following randomized strategy: pick a random index $i$ into $A$. If $A[i] = x$, then we terminate; otherwise, we continue the search by picking a new random index into $A$. We continue picking random indices into $A$ until we find an index $j$ such that $A[j] = x$ or until we have checked every element of $A$. Note that we pick from the whole set of indices each time, so that we may examine a given element more than once.
- Write pseudocode for a procedure
RANDOM-SEARCH
to implement the strategy above. Be sure that your algorithm terminates when all indices into $A$ have been picked.- Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and
RANDOM-SEARCH
terminates?- Generalizing your solution to part (b), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we find $x$ and
RANDOM-SEARCH
terminates? Your answer should be a function of $n$ and $k$.- Suppose that there are no indices $i$ such that $A[i] = x$. What is the expected number of indices into $A$ that we must pick before we have checked all elements of $A$ and
RANDOM-SEARCH
terminates?Now consider a deterministic linear search algorithm, which we refer to as
DETERMINISTIC-SEARCH
. Specifically, the algorithm searches $A$ for $x$ in order, considering $A[1], A[2], A[3], \ldots, A[n]$ until either it finds $A[i] = x$ or it reaches the end of the array. Assume that possible permutations of the input array are equally likely.
- Suppose that there is exactly one index $i$ such that $A[i] = x$. What is the average-case running time of
DETERMINISTIC-SEARCH
? What is the worst-case running time ofDETERMINISTIC-SEARCH
?- Generalizing your solution to part (e), suppose that there are $k \ge 1$ indices $i$ such that $A[i] = x$. What is the average-case running time of
DETERMINISTIC-SEARCH
? What is the worst-case running time ofDETERMINISTIC-SEARCH
? Your answer should be a function of $n$ and $k$.- Suppose that there are no indices $i$ such that $A[i] = x$. What is the average-case running time of
DETERMINISTIC-SEARCH
? What is the worst-case running time ofDETERMINISTIC-SEARCH
?Finally, consider a randomized algorithm
SCRAMBLE-SEARCH
that works by first randomly permuting the input array and then running the deterministic linear search given above on the resulting permuting array.
- Letting $k$ be the number of indices $i$ such that $A[i] = x$, give the worst-case and expected running time of
SCRAMBLE-SEARCH
for the cases in which $k = 0$ and $k = 1$. Generalizing your solution to handle the case in which $k \ge 1$.- Which of the three searching algorithms would you use? Explain your answer.
RANDOM-SEARCH pseudocode
RANDOM-SEARCH(x, A, n):
v = ∅
while |∅| ≠ n
i = RANDOM(1, n)
if A[i] = x
return i
else:
v = v ∩ i
return ␀
v
can be implemented in multiple ways - a hash table, a tree or a bitmap. The
last one would probabily perform best and consume the least space.
RANDOM-SEARCH with one match
RANDOM-SEARCH
is well-modelled by Bernoulli trials. The expected number of
picks is $n$.
RANDOM-SEARCH with multiple matches
In similar fashion, the expected number of picks is $n/k$.
RANDOM-SEARCH with no matches
This is modelled by the balls and bins problem, explored in section 5.4.2. The answer is $n(\ln{n} + \O(1))$.
DETERMINISTIC-SEARCH with one match
The worst-case running time is $n$. The average-case is $(n+1)/2$ (obviously).
DETERMINISTIC-SEARCH with multiple matches
The worst-case running time is $n-k+1$. The average-case running time is $(n+1)/(k+1)$. Let $X_i$ be an indicator random variable that the $i$th element is a match. $\Pr\{X_i\} = 1/(k+1)$. Let $Y$ be an indicator random variable that we have found a match after the first $n-k+1$ elements ($\Pr\{Y\} = 1$). Thus:
$$ \E[X] = \E[X_1 + X_2 + \ldots + X_{n-k} + Y] = 1 + \sum_{i=1}^{n-k}\E[X_i] = 1 + \frac{n-k}{k+1} = \frac{n+1}{k+1} $$
DETERMINISTIC-SEARCH with no matches
Both the worst-case and average case is $n$.
SCRAMBLE-SEARCH matches
It's the same as DETERMINISTIC-SEARCH
, only we replace "average-case" with "expected".
Best algorithm
Definitelly DETERMINISTIC-SEARCH
. SCRAMBLE-SEARCH
gives better expected
results, but for the cost of randomly permuting the array, which is a linear
operation. In the same time we could have scanned the full array and reported a
result.