Password generation insufficiently random?

Lucent · December 2018

When generating a 15-character password with 1Password for Windows 7.2.581 with “allow digits” checked and "allow ambiguous characters" unchecked, the chance of a password containing only letters and no numbers should be [(26+26-4) / (26+26-4 + 10-2)] ^ 15 = 10% or 1 in 10. However, if I continue to click the generate password button over and over, more than 20% will contain no digits. Statistically, this is off, and points to a problem with the way passwords are randomized or at which point ambiguous characters are removed. Perhaps someone can try generating passwords programmatically to see if the correct proportion contain digits?

Lucent · December 2018

This was easier to see and test with a 4 character password. About half (53%) should have no numbers. In testing, more like 3/4 had no numbers.

If this holds up in testing, this is a definite security vulnerability, as someone could brute force passwords faster if there are fewer bytes of entropy than implied by a password's complexity. Depending on the nature of the missing entropy, at the very least, brute forcing only letters first would consistently allow cracking of alphanumeric passwords earlier than 50% through the problem space.

AGAlumB · December 2018

@Lucent: I get where you're coming from because certainly you would expect to get different results...but probability doesn't work that way: it doesn't care about human expectations, or anything actually; randomness gets you what it gets you, nothing more, nothing less. You'd need to do hundreds to get anything even remotely statistical. You will get similar results flipping coins: probability is 50/50, but you won't get exactly 50 heads and 50 tails from 100 flips. Randomness isn't neat and tidy. That's the whole point. I do understand the impulse to pattern and order though. :)

AGAlumB · December 2018

Also, your premise doesn't make sense. Each position in a password is random, and has an equal chance of being any of the allowed characters. Since there are way more letters (capital and lowercase) than numbers, the probability of each position being filled by a letter is much higher. Does that help?

jpgoldberg · December 2018

I'm going to think out loud as I work through this, @Lucent.

I haven't done any calculation at this point, my intuition is that 10% is actually high for your first example. So please bear with me as I work through this.

In our "new" Secure Password Generator (not rolled out to all clients, but don't worry, I will get back to Windows in due course) our set of upper and lowercase letters are what you expect (and so there are 52 of those). And there are ten digits. Our set of ambitious characters are 0O1Il5S

So there are 7 non-ambiguous digits and 48 non-ambiguous letters. (Lucent may have missed a digit, probably the "5" in ambiguous.

Two ways to calculate

Let's calculate the chances of not getting a digit in a password of length 15 in two different ways. First Lucent's way of taking the chances of not getting a digit in a single position and then multiplying that by itself for the length of the password. That is, if p is the probability of not getting a digit in a single position than p^n is the chance for a password of length n

So p will be 48/55. p to the 15th power is ... 0.13. OK, so my intuition was wrong.

Now the second way to calculate is just something I want to show off. It uses the method described by @rob in his recent PasswordsCon talk. This is actually a technique for calculating the number of ways to generate a password when you require a digit (or digits and symbols, etc). What we can do is use this method to calculate the number of possibilities for generating a password with letters allowed and digits required and also the number of possibilities in which the digits are allowed but not required. The math behind this is detailed in this draft PDF.

As it happens, the password generator I linked to above has , SuccessProbability() method for character password recipes that does exactly this.

So let's give it a twirl

package main
import (
    "fmt"
    "go.1password.io/spg"
)

func main() {
    recipe := &spg.CharRecipe{
        Length:  15,
        Allow:   spg.Letters,
        Require: spg.Digits,
        Exclude: spg.Ambiguous,
    }
    p := recipe.SuccessProbability()

    fmt.Printf("Chance of not getting a digit is %.3g\n", 1.0-p)
}

And when we run that we get "Chance of not getting a digit is 0.13".

Hurrah! Both computation mechanisms (using very different algorithms) come to the same result.

So the expected rate of not getting a digit 13%.

We would be closer to Lucent's 10% if we didn't consider "5" ambiguous. So the same program, but with the recipe of

recipe := &spg.CharRecipe{
        Length:  15,
        Allow:   spg.Letters,
        Require: spg.Digits,
        ExcludeChars: "0O1IlS",
    }

gets us an answer of 0.099, which is very nearly 10%.

Returning to Windows

So should we be expecting 10% or 13% to not have digits? Our Windows code for this is a port of the go package into Rust, and I see that the set of ambiguous characters is consistent 0O1Il5S. So it's 13%.

So now the question is whether what you observed is consistent with 13%. And for this, I'd need to know how many trials there were and how many included digits. Ideally, it would be nice to generate a whole bunch and check. That will also avoid errors of possibly overlooking digits. I'd love to play with our Windows code to test generating a bunch of these, but it's 1:30 AM, and I will ask @SergeyTheAgile to take a look.

Lucent · December 2018

Very thorough answer, and you were exactly right about me missing 5 and S. I did about 100 trials with 5-digit passwords, and with the updated character list, the number of those that should include no digits is 50%. In this test, it was 55%, which is reasonable. I was also curious if there was a statistical difference between generating longer passwords with the entire character set and then removing ambiguous characters before trimming vs. only using allowed characters. Against my intuition, there was not, at least not for the purpose of determining the probability of containing digits. Excuse the PHP; it's the language I have string functions memorized best in:

$LENGTH = 5;
$SAMPLES = 100000;
$all = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
$ambig = ["0","O","1","I","l","5","S"];
$unambig = str_replace($ambig, "", $all);

function generate_passwords($remove, $set, $length, $samples) {
        global $ambig;
        $numbers_yes = 0;
        $numbers_no = 0;
        for ($a = 0; $a < $samples; $a++) {
                $password = "";
                for ($c = 1; $c <= $length * 2; $c++)
                        $password .= $set[random_int(0, strlen($set)-1)];
                if ($remove)
                        $password = str_replace($ambig, "", $password);
                $password = substr($password, 0, $length);
//              echo $password, "\n";
                if (strpbrk($password, "0123456789") == TRUE)
                        $numbers_yes++;
                else
                        $numbers_no++;
        }
        return $numbers_no / ($numbers_yes + $numbers_no);
}
echo "No numbers when ambiguous characters never included:\t", generate_passwords(FALSE, $unambig, $LENGTH, $SAMPLES), "\n";
echo "No numbers when ambiguous characters removed:\t", generate_passwords(TRUE, $all, $LENGTH, $SAMPLES), "\n";

It could be I've done an insufficient number of tests to get statistically significant findings, but I had a funny feeling way too few passwords contained digits. Knowing that 3 out of 10 digits are excluded from the get-go makes it more reasonable, but I first suspected a more mundane issue, like the first click of the randomize button incorrectly ignored the "allow digits" checkbox. I'll test again under different conditions, like the initial generation of a password rather than a regeneration, to see if it makes a difference. At the very least, I think simple tests of the entropy of generated passwords is worth a test harness.

This also relates to the paradoxical result that requiring digits in an 8-character password that was already randomized with digits in mind reduces its complexity by 1/3.

SergeyTheAgile · December 2018

Just to be clear, currently stable release of 1Password 7.2 does have older password generator, newer version is in the works and will be more consistent with @jpgoldberg explanation. It will take some time, I'll update this thread once it's out to beta channel.

jpgoldberg · December 2018

Thanks! @SergeyTheAgile. There's no real reason to dig into current behavior when we know we are going to be using either the code or algorithms of the new SPG everywhere.

I'm not ruling out that @Lucent is correct, but we'd need more data to perform the statistical analysis to know either way. And one of the reasons for using common code for all of this is so that we can just worry about testing codebase. The code that I described is used in some clients, and it will get to all 1Password clients. So that code is correctly our focus.

I'm finding that when I generate 1024 (because I like round numbers) of passwords, I get close to the expected portion (0.8702256) turning up with digits. For 1024 trials, the 95% confidence interval is between 0.8583837 and 0.8991687. And my own tests fall within that. (I started playing with Golang probability packages, but ended up just pasting things into R).

So again, I'm not ruling out a bug in the Windows password generator, and please feel free to experiment more. But the fix for any such bug is to move to the newer code and algorithm, which is something we are already working on. Also note that while embarrassing, a small bias in the generator would be hard to exploit if people are generating long enough passwords. That is not justification for any bias in the generator. A biased generator would be a bug to be fixed. But if there is a bug, it will be fixed by continuing with our roll out of the new generator.

jpgoldberg · December 2018

Thanks for looking again @lucent,

I did about 100 trials with 5-digit passwords, and with the updated character list, the number of those that should include no digits is 50%. In this test, it was 55%, which is reasonable.

Very reasonable, that falls comfortably in the 95% confidence interval: 0.447 – 0.650

I was also curious if there was a statistical difference between generating longer passwords with the entire character set and then removing ambiguous characters before trimming vs. only using allowed characters. Against my intuition, there was not,

Intuitions are curious things. But if you think about this from another perspective, you should see that the results have to be the same. Both schemes lead you to exactly the same set of possible passwords with the same distribution of them, and so they had better come out with the same results. Instead of thinking about how they are generated, just think of the set of possible passwords created by the scheme which each possible password being just as likely as all others.

When the generation scheme is non-uniform, that is when we'd run into trouble. But any algorithm that results in a uniform distribution of the same set has got to have the same statistical properties.

It's a lot easier to generate with the restricted set instead of generating and filtering. But when it came to requiring things like digits and symbols we found that generate and reject was the only way we could figure out to be able to do that and get a uniform distribution. (See @rob's talk and the paper I linked to earlier for what wen't wrong with alternatives.)

This also relates to the paradoxical result that requiring digits in an 8-character password that was already randomized with digits in mind reduces its complexity by 1/3.

Just remember, the more requirements you make the fewer ways there are to satisfy those requirements. And so when generating uniformly, the fewer requirements the better.

Password generation insufficiently random?

Comments

Two ways to calculate

Returning to Windows