
Title:
FDIV Software Testing

Description:

All right. Let's try to work out the math.

So what we have is, by assumption, we can run 1 test every 10 microseconds.

Now, there are a million microseconds in a second, 60 seconds in a minute,

and 60 minutes in an hour, and 24 hours in a day.

So now we're going to do some unit cancellation.

We can kill microseconds, we can kill minutes, we can kill seconds,

and we can kill hours.

So if we do the multiplication, we can get tests per day.

And if we do that, we get 8,640,000,000 tests per day.

If we multiply this testing throughput by the failure rate,

which by assumption is about 1 failure per 9 billion tests,

We can cancel tests, do the division, and arrive at 0.96 expected failures per day.
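
The arithmetic above is easy to check with a few lines of Python (just a sketch; the 10 microseconds per test and the 1-in-9-billion failure rate are the assumptions stated above, not measured values):

```python
# Assumption from above: we can run one test every 10 microseconds.
tests_per_second = 1_000_000 / 10          # a million microseconds per second
seconds_per_day = 60 * 60 * 24             # 86,400 seconds in a day

tests_per_day = tests_per_second * seconds_per_day
print(tests_per_day)                       # 8,640,000,000 tests per day

# Assumption from above: roughly 1 failure per 9 billion tests.
failure_rate = 1 / 9_000_000_000
expected_failures_per_day = tests_per_day * failure_rate
print(round(expected_failures_per_day, 2)) # 0.96
```

The units cancel exactly as described: microseconds, seconds, and hours all drop out, leaving tests per day, and then tests cancel against the failure rate to leave failures per day.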

So under what I think are fairly modest assumptions here,

if we perform completely random testing of the input space for fdiv,

we should be able to find this bug in about a day.

And so now this kind of testing is going to need some sort of an oracle.

So we're going to need a way to tell if our particular output from fdiv is correct or not.

And the way this is going to work is that IEEE floating point,

which is what fdiv is implementing, is specified very tightly.

That is to say, one implementation of IEEE floating point division

has to return the same bit pattern as another implementation.

That's one of the nice things about that particular specification: it's fairly tight.
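
Because the specification is bit-exact, the oracle reduces to a bit-pattern comparison. Here is a minimal sketch in Python (the `fdiv_under_test` function is a hypothetical stand-in for the hardware divider; here it is just Python's own division):

```python
import struct

def bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a double."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def fdiv_under_test(a: float, b: float) -> float:
    # Hypothetical hook into the hardware being tested;
    # faked with Python division for this sketch.
    return a / b

def oracle_ok(a: float, b: float, reference_div) -> bool:
    """True iff the unit under test returns exactly the reference bit pattern."""
    return bits(fdiv_under_test(a, b)) == bits(reference_div(a, b))
```

Since both divisions in the sketch are correctly rounded, the oracle always passes here; against a buggy divider, any mismatch in even one bit of the result flags a failure.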

So we ask ourselves, what would have been the oracle for fdiv?

And probably it would have been Intel's existing 487 floating point unit,

which had been around for some years by the time they were developing the Pentium.

So what I think this shows, unless I've messed up sort of egregiously on the math somewhere,

is that random testing would have been a perfectly good way

to find the Intel Pentium fdiv flaw, presuming, of course,

that we could have found a way to rig up a Pentium in concert with a 487

in such a way that they could have cooperated to do the testing.
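
Putting the pieces together, the rig described above might look something like the following sketch: draw random 64-bit patterns, reinterpret them as doubles, run the division on both units, and compare bit patterns. The functions `pentium_fdiv` and `fpu487_fdiv` are hypothetical hooks into the two chips, both faked with Python division here:

```python
import random
import struct

def random_double(rng: random.Random) -> float:
    # Draw a uniformly random 64-bit pattern and view it as a double,
    # so the whole input space (including NaNs and infinities) is covered.
    return struct.unpack("<d", struct.pack("<Q", rng.getrandbits(64)))[0]

def pentium_fdiv(a: float, b: float) -> float:
    return a / b  # hypothetical hook into the unit under test

def fpu487_fdiv(a: float, b: float) -> float:
    return a / b  # hypothetical hook into the reference 487

def run_random_tests(n: int, seed: int = 0) -> int:
    """Run n random divisions; return the number of bit-pattern mismatches."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        a, b = random_double(rng), random_double(rng)
        if b == 0.0:
            continue  # Python raises on division by zero; hardware would not
        out = pentium_fdiv(a, b)
        ref = fpu487_fdiv(a, b)
        # Compare raw bit patterns rather than values, so identical NaN
        # results still count as agreement.
        if struct.pack("<d", out) != struct.pack("<d", ref):
            failures += 1
    return failures
```

With two identical software dividers this of course finds zero mismatches; the point is only to show the shape of the harness that a Pentium-plus-487 rig would have needed.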