Immanuel Scholz
2004-02-14 17:27:50 UTC
Hello group,
while reading rfc2246 (TLS 1.0) I came up to the pseudo random number
generator using chained keyed hash functions to generate randomness.
These functions got the only requirement, that they must be "collision
resistent". (In fact, it refers to rfc2104 (HMAC)).
I found no word about another required feature: That an attacker get
no additional information about the original text from only looking at
the hash value. This means, the hash function some kind of "encrypts"
the message before it hashes it.
This second requirement is proofable not implicit needed by "collision
resistance":
Let H(x) be a collision resistant hash function. And let H'(x) be a
hash function in that way, that it computes H(x) and concatenate the
last 20 bytes of input data at the end of the hash value (so the lengt
of the hash of H' is 20 bytes more than that of H).
It is clear, that H' is collision resistant, since the concatination
does not makes it easier to find any collision (it makes it harder,
since the last 20 bytes are perfectly protected).
But it is also clear, that using H', an attacker get information about
the input message by looking at the hash value - at least the last 20
bytes.
Next, chaining a hash function to generate random numbers from a true
random seed need functions which "protect" their input data.
Otherwise, an attacker can reconstruct pieces of the random stream of
the same chain by reconstructing them from younger values.
(You know, a pseudo random generator is broken, if you can good
predict elements from only looking at other elements of the stream.)
Is it such a common requirement for all things that get the word
"hash", that no information about the message can be extracted from
the hash value? I do not think so, since hash functions are used in
environments where this is not important. "Hash" is used long before
not in contexts of "additional encrypt it content"..
So I searched for typical definitions of "hash function", beginning
with the papers about hash functions commonly used. I hoped, that if
all postet hash functions already include this statement, this is the
reason it not appears in papers about random number generators...
Rivest says in his MD5-rfc1321 nothing about my second requirement -
only about collision resistance.
In rfc3174 (SHA1) I got the same picture. Nothing about "No
information can be extracted about the original message from the hash
value."
Since I can't believe, the guys from Netscape involving SSL (it did
not change from SSL to TLS) simple forgot this, the mistake must be on
my side... (and this is the reason I post it to sci.crypt.research and
not to the SSL-guys ;)
Are there any research stuff I did not found about encryption security
of common hash algorithms?
Any updates on TLS or SSL I did not read?
Any comments?
Ciao, Imi.
references:
rfc1321 - MD5: "It is conjectured that it is computationally
infeasible to produce two messages having the same message digest, or
to produce any message having a given prespecified target message
digest."
rfc3174 - SHA1: "The SHA-1 is called secure because it is
computationally infeasible to find a message which corresponds to a
given message digest, or to find two different messages which produce
the same message digest."
rfc2104 - HMAC: "The security of the message authentication mechanism
presented here depends on cryptographic properties of the hash
function H: the resistance to collision finding (...) and the message
authentication property of the compression function of H when applied
to single blocks (...)."
while reading rfc2246 (TLS 1.0) I came up to the pseudo random number
generator using chained keyed hash functions to generate randomness.
These functions got the only requirement, that they must be "collision
resistent". (In fact, it refers to rfc2104 (HMAC)).
I found no word about another required feature: That an attacker get
no additional information about the original text from only looking at
the hash value. This means, the hash function some kind of "encrypts"
the message before it hashes it.
This second requirement is proofable not implicit needed by "collision
resistance":
Let H(x) be a collision resistant hash function. And let H'(x) be a
hash function in that way, that it computes H(x) and concatenate the
last 20 bytes of input data at the end of the hash value (so the lengt
of the hash of H' is 20 bytes more than that of H).
It is clear, that H' is collision resistant, since the concatination
does not makes it easier to find any collision (it makes it harder,
since the last 20 bytes are perfectly protected).
But it is also clear, that using H', an attacker get information about
the input message by looking at the hash value - at least the last 20
bytes.
Next, chaining a hash function to generate random numbers from a true
random seed need functions which "protect" their input data.
Otherwise, an attacker can reconstruct pieces of the random stream of
the same chain by reconstructing them from younger values.
(You know, a pseudo random generator is broken, if you can good
predict elements from only looking at other elements of the stream.)
Is it such a common requirement for all things that get the word
"hash", that no information about the message can be extracted from
the hash value? I do not think so, since hash functions are used in
environments where this is not important. "Hash" is used long before
not in contexts of "additional encrypt it content"..
So I searched for typical definitions of "hash function", beginning
with the papers about hash functions commonly used. I hoped, that if
all postet hash functions already include this statement, this is the
reason it not appears in papers about random number generators...
Rivest says in his MD5-rfc1321 nothing about my second requirement -
only about collision resistance.
In rfc3174 (SHA1) I got the same picture. Nothing about "No
information can be extracted about the original message from the hash
value."
Since I can't believe, the guys from Netscape involving SSL (it did
not change from SSL to TLS) simple forgot this, the mistake must be on
my side... (and this is the reason I post it to sci.crypt.research and
not to the SSL-guys ;)
Are there any research stuff I did not found about encryption security
of common hash algorithms?
Any updates on TLS or SSL I did not read?
Any comments?
Ciao, Imi.
references:
rfc1321 - MD5: "It is conjectured that it is computationally
infeasible to produce two messages having the same message digest, or
to produce any message having a given prespecified target message
digest."
rfc3174 - SHA1: "The SHA-1 is called secure because it is
computationally infeasible to find a message which corresponds to a
given message digest, or to find two different messages which produce
the same message digest."
rfc2104 - HMAC: "The security of the message authentication mechanism
presented here depends on cryptographic properties of the hash
function H: the resistance to collision finding (...) and the message
authentication property of the compression function of H when applied
to single blocks (...)."