Discussion:
requirements for a hash function in random generators
(too old to reply)
Immanuel Scholz
2004-02-14 17:27:50 UTC
Permalink
Hello group,


while reading rfc2246 (TLS 1.0) I came up to the pseudo random number
generator using chained keyed hash functions to generate randomness.

These functions got the only requirement, that they must be "collision
resistent". (In fact, it refers to rfc2104 (HMAC)).
I found no word about another required feature: That an attacker get
no additional information about the original text from only looking at
the hash value. This means, the hash function some kind of "encrypts"
the message before it hashes it.


This second requirement is proofable not implicit needed by "collision
resistance":

Let H(x) be a collision resistant hash function. And let H'(x) be a
hash function in that way, that it computes H(x) and concatenate the
last 20 bytes of input data at the end of the hash value (so the lengt
of the hash of H' is 20 bytes more than that of H).
It is clear, that H' is collision resistant, since the concatination
does not makes it easier to find any collision (it makes it harder,
since the last 20 bytes are perfectly protected).
But it is also clear, that using H', an attacker get information about
the input message by looking at the hash value - at least the last 20
bytes.


Next, chaining a hash function to generate random numbers from a true
random seed need functions which "protect" their input data.
Otherwise, an attacker can reconstruct pieces of the random stream of
the same chain by reconstructing them from younger values.
(You know, a pseudo random generator is broken, if you can good
predict elements from only looking at other elements of the stream.)


Is it such a common requirement for all things that get the word
"hash", that no information about the message can be extracted from
the hash value? I do not think so, since hash functions are used in
environments where this is not important. "Hash" is used long before
not in contexts of "additional encrypt it content"..

So I searched for typical definitions of "hash function", beginning
with the papers about hash functions commonly used. I hoped, that if
all postet hash functions already include this statement, this is the
reason it not appears in papers about random number generators...


Rivest says in his MD5-rfc1321 nothing about my second requirement -
only about collision resistance.

In rfc3174 (SHA1) I got the same picture. Nothing about "No
information can be extracted about the original message from the hash
value."


Since I can't believe, the guys from Netscape involving SSL (it did
not change from SSL to TLS) simple forgot this, the mistake must be on
my side... (and this is the reason I post it to sci.crypt.research and
not to the SSL-guys ;)


Are there any research stuff I did not found about encryption security
of common hash algorithms?

Any updates on TLS or SSL I did not read?

Any comments?


Ciao, Imi.



references:


rfc1321 - MD5: "It is conjectured that it is computationally
infeasible to produce two messages having the same message digest, or
to produce any message having a given prespecified target message
digest."

rfc3174 - SHA1: "The SHA-1 is called secure because it is
computationally infeasible to find a message which corresponds to a
given message digest, or to find two different messages which produce
the same message digest."

rfc2104 - HMAC: "The security of the message authentication mechanism
presented here depends on cryptographic properties of the hash
function H: the resistance to collision finding (...) and the message
authentication property of the compression function of H when applied
to single blocks (...)."
Gregory G Rose
2004-02-14 17:50:02 UTC
Permalink
Post by Immanuel Scholz
while reading rfc2246 (TLS 1.0) I came up to the pseudo random number
generator using chained keyed hash functions to generate randomness.
These functions got the only requirement, that they must be "collision
resistent". (In fact, it refers to rfc2104 (HMAC)).
I found no word about another required feature: That an attacker get
no additional information about the original text from only looking at
the hash value. This means, the hash function some kind of "encrypts"
the message before it hashes it.
It's very difficult to mathematically define "no
information about ..." because by definition the
hash function output provides some information
about its input. It has to be defined in terms of
computational difficulty or indistinguishability.

Generally, though, the property you are describing
is called "pre-image resistance" (or
"one-way-ness"). That is, it must be difficult to
find an input that produces a particular output.
Note that collision resistance implies pre-image
resistance, because if you can find a pre-image,
you could hash an arbitrary input value, find a
pre-image of the hash output, and now you have two
inputs producing the same output... by definition
a collision.

Greg.
--
Greg Rose
232B EC8F 44C6 C853 D68F E107 E6BF CD2F 1081 A37C
Qualcomm Australia: http://www.qualcomm.com.au
Nick Maclaren
2004-02-17 06:22:13 UTC
Permalink
Post by Gregory G Rose
Post by Immanuel Scholz
while reading rfc2246 (TLS 1.0) I came up to the pseudo random number
generator using chained keyed hash functions to generate randomness.
These functions got the only requirement, that they must be "collision
resistent". (In fact, it refers to rfc2104 (HMAC)).
I found no word about another required feature: That an attacker get
no additional information about the original text from only looking at
the hash value. This means, the hash function some kind of "encrypts"
the message before it hashes it.
It's very difficult to mathematically define "no
information about ..." because by definition the
hash function output provides some information
about its input. It has to be defined in terms of
computational difficulty or indistinguishability.
NO!!!

That is ONE such meaning, which is the one used in the cryptography
context. When using the concept of 'randomness' in the statistical
context, there is a different meaning. The reason that this matters
is that an approach that 'generates randomness' (though, of course,
it doesn't) may be effective in one of the two contexts but not the
other.

Sorry about riding this hobby horse, but an awful lot of naive people
get caught out by this one, and I suspect from the posting that the
original poster isn't entirely clear on the difference, and one of the
classic errors is to use one type of method in the wrong context.
Post by Gregory G Rose
Generally, though, the property you are describing
is called "pre-image resistance" (or
"one-way-ness"). That is, it must be difficult to
find an input that produces a particular output.
Definitely MUCH better terms than "randomness", because they make it
clear which meaning is wanted!


Regards,
Nick Maclaren.
Immanuel Scholz
2004-02-17 06:32:55 UTC
Permalink
[ Perhaps there is confusion over terminology.
Collision resistance implies 2nd-preimage resistance,
but not (1st) preimage resistance. See Chapter 9 of
_The Handbook of Applied Cryptography_. --DW, co-moderator ]


Hi,
Post by Gregory G Rose
Note that collision resistance implies pre-image
resistance, because if you can find a pre-image,
you could hash an arbitrary input value, find a
pre-image of the hash output, and now you have two
inputs producing the same output... by definition
a collision.
I disagree. The definition is to find to DIFFERENT
input values.

So to find a collision, you have to find two different
pre-images of a hash output (or a different output
to the one you use to calculate the output).

If you use a fixed hash input, a collision resistant
hash function may give you always hints to exact
this input and nothing about any other "collision
input values", so it is not pre-image-resistant but
still collision resistant.


As I showed in my original posting, it is easy to
construct a hash function which is NOT pre-image
resistant and which still IS collision resistant by
simple taking the output of any collision resistant
hash and concatinating pieces of the original hash
input to the output. So the hash give information about
the input (and so is not pre-image resistant) but still
collision resistant, since adding input information
does not make it easier to find a collision but harder.



(Note also, that this must not be confused with
one-way-permutations. Of course, finding the inverse to
a permutation always leads to collisions of pairs of
permutations.)


Ciao,
Imi.
Nick Maclaren
2004-02-17 16:33:10 UTC
Permalink
Post by Immanuel Scholz
[ Perhaps there is confusion over terminology.
Collision resistance implies 2nd-preimage resistance,
but not (1st) preimage resistance. See Chapter 9 of
_The Handbook of Applied Cryptography_. --DW, co-moderator ]
Interesting. That sounds as if it may be the difference between
an unintelligent and an intelligent opponent, which leads to the
difference between statistical game theory and number theoretic
ditto. I must look it up.
Post by Immanuel Scholz
As I showed in my original posting, it is easy to
construct a hash function which is NOT pre-image
resistant and which still IS collision resistant by
simple taking the output of any collision resistant
hash and concatinating pieces of the original hash
input to the output. So the hash give information about
the input (and so is not pre-image resistant) but still
collision resistant, since adding input information
does not make it easier to find a collision but harder.
Yes. As a real-life example of collision-resistant functions, I use
the following cores in my checksum program:

for (i = 0; i < bufflen; ++i)
checksum = (13*checksum+buffer[i])&mask;

for (i = 0; i < bufflen; ++i)
checksum = ((checksum<<8)&mask)^
CRC_table[(checksum>>24)^buffer[i]];

Both are very collision-resistant in practical use, with the first
being more so. The second is POSIX, of course. In terms of
cryptographic security, they are laughable. I use MD5 for that,
because it was the leader when I wrote the code.

Note that there are some very good number-theoretic grounds to
justify the collision-resistance of the first method against most
data not specifically created to trick it. CRCs are fairly good,
but not as robust against long-period variations, though they are
better against random bit flipping.


Regards,
Nick Maclaren.

Protego
2004-02-17 16:32:40 UTC
Permalink
For our TRNG:s we have avoided conventional hash functions as
they have a very small internal memory, typically a string of
16 bytes. It is advantageous to use a larger internal memory,
as the complexity can more easily be increased. Better still,
for a TRNG, is to exploit known properties of the hardware,
but this requires possible tedious electrical/statistical testing.

Your question refer to merely a pseudo-random generator (or stream
cipher). Using an existing function, or algorithm, have the advantage
that published hash functions have been thoroughly investigated to
give high security compared to the internal memory size.

Under fairly general assumptions is it possible to argue that the
quality of the PRNG/TRNG may increase with an increasing internal
memory size, thus enabling better randomness or more secure encryption.

Bo Dömstedt
Chief Cryptographer
Protego Information AB/Gamma
SE - 223 70 Lund
SWEDEN
http://www.protego.se
Loading...