*Post by Ilya O. Levin*I would like to bring to your attention the paper

"Sapparot-2. Fast Pseudo-Random Number Generator".

The purpose of this paper is to introduce an enhanced

version of Sapparot. It is a high-speed pseudo-random

number generator efficient in both software and hardware

yet very simple.

The paper is available at

http://www.literatecode.com/get/sapparot2.pdf

Your paper talks about this as a PRNG, and yet you have a section in it

about security, and you're posting to sci.crypt.research; both of these

lead me to the impression that you think it might be useful as a stream

cipher. By the standards stream ciphers are currently held to, I don't

think it is. Although it might be a perfectly good PRNG for statistical

or gaming applications.

There is a class of attacks called "Guess-and-Determine" attacks. In

these we first tally up what we know about the various unknown values,

usually considered as single bits. Things like "this xor that xor carry

= known". Choose the equation with the least unknowns, and guess the

value of one of them. Eventually this will allow us to infer

("determine") the value of some other unknown bit, so we get to a place

where we start to get two-for-the-price-of-one. Soon after that

happens, we start to get to a point where we can check whether the

results seem to "add up" -- where, if we guessed wrong, we'll get a

contradiction, and can backtrack. Our paper

http://www.qualcomm.com.au/PublicationsDocs/new_snow_gd3.pdf shows one

application of this technique in all its glory, where we manage to

recover the key for the original version of SNOW in

less-than-brute-force time. (And I'm not talking about time-memory

tradeoffs here, which you mention in your paper.)

I can see a fairly straightforward way to apply this kind of attack to

Sapparot-2. There's only one thing in there that prevents it being

pretty easy, and that's the variable rotation by the B column, applied

to the C column. (The carries from the integer additions are only a

minor complication, best handled by working from least significant bits

upward where possible.) So, we work around the rotation. Assuming that

it really is a fairly OK PRNG, we can assume that any desired value for

that rotation occurs uniformly at random. (In all the discussion that

follows, I'll use numbers for the t = 32 bit version.) It seems to me

that it will be advantageous to the attacker to keep the incoming "A"

column and the C column "lined up", so we will first look for places

where the high order 5 bits of the incoming B are "00111". This will

rotate the C column by 7 places, same as A. To do this we'll have to

gather about 33 consecutive words of output from the generator, and try

to apply the attack to each pair; generally it will fail, but we expect

it to work for one of them. This multiplies the effective complexity of

the attack by 2^5, but then gives us 5 bits of information we can use

already.

Anyway, we now can assume the value for those five bits is what we

needed it to be. Note that these are also the same five bits that get

xored to the low-order values taken from A into the B column! (This is

lucky for this attack; you might want to change that 5 bit rotation to

something different.) Now look at the least significant bit of the

first output. It's the xor of the three LSBs. But the first thing you

do to it xor A into C (we're talking about the LSB here, so add is

equivalent to xor since there's no carry.) So we know that that bit is

actually, now, just the LSB of B xor C... one variable has already

disappeared! If we guess the LSB of input B, for example, we now know

the corresponding LSB of input C, which becomes bit 7 of output C. Now

look at the B column; we already guessed that LSB, so we know it. It

gets xored with a 1 bit (that we knew from the rotation amount) and a

constant 1 bit from the A column, which cancel out, so the LSB of

output A is now known. ...

Now this kind of analysis is most amenable to machine optimization, but

even in the example I started above, where I just consider one pair of

consecutive outputs, we have 64 bits of information about the state,

and only 96 bits of state. Our experience tells me that we can probably

recover the internal state in something not much worse than 2^32

guesses, counting backtracking when we can check internal

contradiction; the depth of the search tree will probably be greater

than 32, but many branches will get pruned before they are anywhere

near that deep. Note: I'm describing the attack as if I just look at a

single pair of words, but in reality I think it will probably work

better on three or four consecutive outputs, with corresponding

requirements for more plaintext. The tradeoff here is that you get much

more information about the bits, but have to manage more complexity. I

wish I had finished my "generic G&D attack" program, this would be a

great exercise for it. But I never did, and I don't have time to go off

and implement this specific attack.

Hmmm... going back and looking again, I think that choosing the

rotation to be 5, and not 7, actually works even better. But anyway.

Any honours student out there who wants to take my start and run with

it?

Anyway, I could be wrong, particularly about the expected complexity of

such an attack, but I really don't think this generator is

cryptographically sound.

Greg.