![]() |
|
|||
![]() |
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
![]() |
|
![]() |
[an error occurred while processing this directive]
THEOREM 7.1 Suppose h : X → Z is a hash function where |X| and |Z| are finite and |X| ≥ 2|Z|. Suppose A is an inversion algorithm for h. Then there exists a probabilistic Las Vegas algorithm which finds a collision for h with probability at least 1/2. PROOF Consider the algorithm B presented in Figure 7.2. Clearly B is a probabilistic algorithm of the Las Vegas type, since it either finds a collision or returns no answer. Thus our main task is to compute the probability of success. For any x ∈ X, define x ~ x1 if h(x) = h(x1). It is easy to see that ~ is an equivalence relation. Define Each equivalence class [x] consists of the inverse image of an element of Z, so the number of equivalence classes is at most |Z|. Denote the set of equivalence classes by C. Now, suppose x is the element of X chosen in step 1. For this x, there are |[x]| possible x1s that could be returned in step 3. |[x]| - 1 of these x1s are different from x and thus lead to success in step 4. (Note that the algorithm A does not know the representative of the equivalence class [x] that was chosen in step 1.) So, given a particular choice x ∈ X, the probability of success is (|[x]| - 1)/|[x]|.
The probability of success of the algorithm B is computed by averaging over all possible choices for x: Hence we have constructed a Las Vegas algorithm with success probability at least 1/2. Hence, it is sufficient that a hash function satisfy the strongly collision-free property, since it implies the other two properties. So in the remainder of this chapter we restrict our attention to strongly collision-free hash functions. 7.3 The Birthday AttackIn this section, we determine a necessary security condition for hash functions that depends only on the cardinality of the set Z (equivalently, on the size of the message digest). This necessary condition results from a simple method of finding collisions which is informally known as the birthday attack. This terminology arises from the so-called birthday paradox, which says that in a group of 23 random people, at least two will share a birthday with probability at least 1/2. (Of course this is not a paradox, but it is probably counter-intuitive). The reason for the terminology birthday attack will become clear as we progress. As before, let us suppose that h : X → Z is a hash function, X and Z are finite, and |X| ≥ 2|Z|. Denote |X| = m and |Z| = n. It is not hard to see that there are at least n collisions the question is how to find them. A very naive approach is to choose k random distinct elements x1,...,xk ∈ X, compute zi = h(xi), 1 ≤ i ≤ k, and then determine if a collision has taken place (by sorting the zis, for example). This process is analogous to throwing k balls randomly into n bins and then checking to see if some bin contains at least two balls. (The k balls correspond to the k random xis, and the n bins correspond to the n possible elements of Z.) We will compute a lower bound on the probability of finding a collision by this method. This lower bound will depend on k and n, but not on m. Since we are interested in a lower bound on the collision probability, we will make the assumption that Since the inverse images are all (roughly) the same size and the xis are chosen at random, the resulting zis can be thought of as random (not necessarily distinct) elements of Z. But it is a simple matter to compute the probability that k random elements z1,...,zk ∈ Z are distinct. Consider the zis in the order z1,...,zk. The first choice z1 is arbitrary; the probability that z2 ≠ z1 is 1 - 1/n; the probability that z3 is distinct from z1 and z2 is 1 - 2/n, etc. Hence, we estimate the probability of no collisions to be If x is a small real number, then Then our estimated probability of no collisions is So we estimate the probability of at least one collision to be If we denote this probability by ∈, then we can solve for k as a function of n and ∈. If we ignore the term -k, then we estimate If we take ∈ = .5, then our estimate is So this says that hashing just over If X is the set of all human beings, Y is the set of 365 days in a non-leap year (i.e., excluding February 29), and h(x) denotes the birthday of person x, then we are dealing with the birthday paradox. Taking n = 365 in our estimate, we get
Copyright © CRC Press LLC
![]() |
![]() |
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
![]() |
![]() |