Homework 2, Advanced Data Structures

Due in class Wednesday, December 5th (well, you can have until Friday the 7th). The problems vary in difficulty. Some quite easy, some harder.

1. Hashing

In class we saw that if we had a weak universal family of hash functions, and setting m=n2, then there is a constant probability that there will be no collisions. What if instead of caring that there be no collisions, we care about ensuring that there only be a constant number of items per bucket. Define a hash family H to be k-weak-universal if the prob[h ∈ H | h(x1) = h(x2) = ... = h(xk)] ≤ c/m(k-1) for some c. (Notice that any k-independent hash family is also k-weak-universal.)

Prove that with constant probability there are at most two items per bucket if m > n(3/2) for k=3. Generalize the result for arbitrary k.

2. Soft Heaps

Consider this statement from the top of page 1014 of Chazelle's paper.

Indeed, consider a sequence of n inserts of keys in decreasing order, intermixed with n findmins, and set ε = 1/2. Despite the high value of the error rate ε, the findmins must actually return the minimum key at least half of the time.

In Chazelle's Soft Heaps, the items go through the following life cycle: The item is inserted uncorrupted. It may become corrupted (possibly instantly when it is inserted). Once corrupted, it remains corrupted until it is deleted. Armed with this clarification, (and the fact that the number of corrupted items that exist if there have been n prior insertions is at most εn) prove the statement.

Using the Sleator/Derryberry bucket-based non-meldable soft-heap implementation, the statement does not hold. Explain this. (Note that in this algorithm, a findmin always returns an item whose key is among the smallest εn items. But that is all that you know about the item returned.

3. Ancestor Relations in a Static Tree

Suppose you're given a static rooted tree. You want to place labels on the nodes of the tree such that in O(1) time you can answer queries of the form "is node A an ancestor of node B?" Explain the labels to use, how to compute them, and how to do the ancestor query.

4. Simple Potential Proof

Consider the following representation of positive integers. For each 1 digit in the binary representation there's a element of a linked list containing the position of that 1. So for example 13 would be represented as a list containing three numbers: [0, 2, 3]. (Low order first.)

The function add(a,b) takes two numbers in this representation and returns their sum in this representation. The input numbers are destroyed in the process. The function create() returns the list [0], the representation of the number 1.

Give an algorithm for add(a,b) and prove, using a potential function, that its running time is O(1).


Danny Sleator
Last modified: Mon Nov 26 21:54:26 2007