
Title:
Problems With Random Tests - Software Testing

Description:

Now, below, we have some code from Wikipedia that does the same thing.

And I'm not going to go through the logic here.

But what you can see is this is quite a bit more idiomatic Python.

It's actually quite a bit nicer than the code that I wrote.

So if you like that better then use this as a model instead of the code that I wrote.

The code that I wrote is pretty kind of dumb and obvious.

Equivalently, we have a Luhn validity check using the Wikipedia algorithm,

which just does the obvious thing.

And then what I have here is a random tester, which generates a random credit card number

with a certain prefix and 15 digits and then ensures that it's valid.

The validity checking function for credit card numbers simply makes sure

that it has the right length, that it has the right prefix, and that the checksum comes out to be 0.

That is to say, the isLuhnvalid function returns true. So that's all there is to it.
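
As a rough sketch of the pieces just described (not the code shown in the lecture), the checker and the generator might look something like the following. The names `is_luhn_valid`, `is_valid`, and `generate` and their signatures are assumptions made for illustration, and the prefix `37` is simply the one mentioned later on.

```python
import random


def is_luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk the digits from right to left, doubling every second one.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


def is_valid(number: str, prefix: str, length: int = 15) -> bool:
    """Right length, right prefix, and a checksum that comes out to 0."""
    return (
        len(number) == length
        and number.startswith(prefix)
        and is_luhn_valid(number)
    )


def generate(prefix: str, length: int = 15) -> str:
    """Build a random number with the given prefix that is valid by construction."""
    filler = length - len(prefix) - 1  # leave room for one check digit
    body = prefix + "".join(str(random.randrange(10)) for _ in range(filler))
    # Exactly one final digit makes the Luhn total a multiple of 10.
    for check in "0123456789":
        if is_luhn_valid(body + check):
            return body + check
    raise AssertionError("unreachable: some check digit always works")


if __name__ == "__main__":
    # The random tester: everything we generate should pass the validity check.
    for _ in range(1000):
        assert is_valid(generate("37"), "37")
```

The point of this design is that the generator computes the check digit rather than hoping to hit it, so every generated number gets past the validity check.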

But what I want to do finally is take a look at one other issue.

I'm going to comment out my code here and comment in some different code.

What we're doing here is generating completely random 15-digit numbers.

What I'm doing is generating a random integer between 0 and the smallest 16-digit number.

The largest number that could be generated here is the largest 15-digit number.

And then we're going to convert that to a string and do a zero-fill operation,

which adds leading zeros.

Now we have a completely random number that is 15 digits long.

And if that checks out to be a valid credit card number,

we're going to increment our count of valid numbers, and then finally, after doing this 100 thousand times,

we're going to print the number of valid credit card numbers that we got.
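
A minimal sketch of this naive experiment, under some assumptions, could look like this. The function names, the `seed` parameter, and the six-digit prefix `372542` are hypothetical stand-ins, not necessarily what the lecture code uses.

```python
import random


def is_luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:  # every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def is_valid(number: str, prefix: str, length: int = 15) -> bool:
    """Right length, right prefix, and a valid Luhn checksum."""
    return (
        len(number) == length
        and number.startswith(prefix)
        and is_luhn_valid(number)
    )


def count_valid(prefix: str, length: int = 15, trials: int = 100_000, seed=None) -> int:
    """Count how many completely random numbers happen to be valid."""
    rng = random.Random(seed)
    valid = 0
    for _ in range(trials):
        # randrange excludes the upper bound (the smallest 16-digit number),
        # so the largest possible value is the largest 15-digit number.
        candidate = str(rng.randrange(10 ** length)).zfill(length)  # add leading zeros
        if is_valid(candidate, prefix, length):
            valid += 1
    return valid


if __name__ == "__main__":
    print(count_valid("372542"))  # hypothetical six-digit prefix
```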

So let's run this and see what happens. Okay.

We got no valid credit card numbers out of 100 thousand.

So the problem is the prefix was too long.

With a 6-digit prefix, the chance is one in a million that we'll generate just this prefix,

and then it goes down to one in 10 million that we'll meet both the prefix and the checksum requirements.

So if we start off with a much shorter prefix, like just 37, which I think is basically anything

in the American Express system, let's see what happens.

We generated 100 thousand credit card numbers and 104 of them came out to be valid.

So even with just a two-digit prefix, it's pretty unlikely that we generate valid credit card numbers.
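
The arithmetic behind these numbers can be checked with a quick back-of-the-envelope calculation. This sketch assumes every digit is uniform and independent, so a prefix of k digits matches with probability 1 in 10^k and the Luhn checksum passes with probability 1 in 10. The helper name `expected_valid` is made up for illustration.

```python
from fractions import Fraction


def expected_valid(prefix_len: int, trials: int) -> Fraction:
    """Expected number of valid numbers among purely random candidates."""
    p_prefix = Fraction(1, 10 ** prefix_len)  # all prefix digits must match
    p_checksum = Fraction(1, 10)              # one final digit in ten passes Luhn
    return trials * p_prefix * p_checksum


if __name__ == "__main__":
    print(expected_valid(2, 100_000))  # two-digit prefix such as 37
    print(expected_valid(6, 100_000))  # six-digit prefix
```

For a two-digit prefix this works out to an expected 100 valid numbers per 100 thousand tries, which is in line with the 104 observed. For a six-digit prefix it is 1/100 of a valid number per 100 thousand tries, so seeing none is what we'd expect.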

And so what that means is, if we're generating lots of invalid credit card numbers,

of course we're stressing only the very small bit of the transaction processing logic

that checks for valid credit card numbers, and we're not stressing the rest of it.

So what I hope I've accomplished here is, first of all, to motivate the fact that this problem of generating valid data

is a real one and, second of all, to give you a little bit of a feel for what the code looks like

that we usually have to write to generate valid inputs.

And so, if we go back to our web browser example, you can see that we would be doing a

similar exercise, but it would just be quite a bit more sophisticated to generate,

for example, valid HTML, or valid HTML with scripts and other things,

which is what it would take to actually do meaningful fuzzing of a web browser, as shown by the blue line here.

And so now, to do this, instead of spending half an hour or however long you spent

on the PR quiz, maybe you're going to be spending many weeks.

But on the other hand, what we're going to get out of this, if we do it right,

is a lot of high-value bugs, including security bugs, in our web browser. And it's strongly possible,

but of course not guaranteed, that given the value we get out of those bugs

that we find in the web browser, the effort would have been worth it.