M. Boreale

2004-07-06 15:55:50 UTC

Hello,

I'm an outsider (though well informed) of cryptography, so apologies

in advance if this turn out to be a trivial question.

Is there any result relating Shannon entropy to collision probability?

I.e., to be more precise, suppose:

- X is a discrete r.v. with prob. distribution p1,...,pN;

- IC(X) is the probability of collision for X, Sum_i pi^2 (i.e. the

probability of getting twice the same outcome in two independent

experiments);

- H(X) is Shannon entropy of X, -Sum_i pi*log(pi);

My question is if there is any theorem relating IC(X) to H(X).

The (obviuous) intuition is that the higher IC(X), the lower H(X).

E.g., for X a constant (IC(X)=1) then H(X)=0, while when X has a

uniform distribution, IC(X) reaches its minimum (1/N) and H(X) its

maximum (-log(N)). Does this generalize to arbitrary X, and how?

I'm asking this because, when the distribution of X is unknown, it is

apparently simpler to estimate IC(X) (by, say, measuring the fraction

of collisions over repeated experiments on X) than H(X).

Many thanks

- Michele

I'm an outsider (though well informed) of cryptography, so apologies

in advance if this turn out to be a trivial question.

Is there any result relating Shannon entropy to collision probability?

I.e., to be more precise, suppose:

- X is a discrete r.v. with prob. distribution p1,...,pN;

- IC(X) is the probability of collision for X, Sum_i pi^2 (i.e. the

probability of getting twice the same outcome in two independent

experiments);

- H(X) is Shannon entropy of X, -Sum_i pi*log(pi);

My question is if there is any theorem relating IC(X) to H(X).

The (obviuous) intuition is that the higher IC(X), the lower H(X).

E.g., for X a constant (IC(X)=1) then H(X)=0, while when X has a

uniform distribution, IC(X) reaches its minimum (1/N) and H(X) its

maximum (-log(N)). Does this generalize to arbitrary X, and how?

I'm asking this because, when the distribution of X is unknown, it is

apparently simpler to estimate IC(X) (by, say, measuring the fraction

of collisions over repeated experiments on X) than H(X).

Many thanks

- Michele