World Password Day cracking challenge

jpgoldberg · August 2018

One bit hints are now live at https://github.com/agilebits/crackme/blob/master/password-day-2018-1bitHints.json

The README has also been updated to include the hints:

ID	Status	Successful password	Hint	Submission date	By whom	Place
3UOKUEBO	Sample	governor washout beak	0b0	N/A	Sample	0th
AJPYJUTN	Sample	glassy ubiquity absence	0b1	N/A	Sample	0th
IV2DL67Q	Sample	splendor excel rarefy	0b0	N/A	Sample	0th
NO4VRU4S	Not found		0b1			Nth
33YRS77A	Not found		0b0			Nth
J6J4QUWQ	Not found		0b0			Nth
SFELTO3W	Not found		0b0			Nth
DOHB6DC7	Not found		0b0			Nth
2SB5OP3G	Not found		0b0			Nth
5BSLBTKR	Not found		0b1			Nth

nagele · August 2018

@bb8 I believe* the hints just speed things up. If hashing each potential password is what takes the most time, then the hint lets you do a much smaller calculation first (SHA256 on the password). If the result of this calculation matches the hint, then proceed with the PBKDF to check the full hash. If it doesn't match, then you save some time by skipping the PBKDF for that password. Since the value of the bit can only be one of two things, I believe this cuts out half of the work.

*I'm also a newbie so I might not be right on some (or all) of this!

bb8 · August 2018

@nagele Haha, okay. I'll leave it to the pros. :P

jpgoldberg · August 2018

The purpose of the hints is to be able to quickly reject half the guesses. If you generate a guess, p, you take the SHA256 hash of p (not SHA1, as this post originally said) and look at its first bit. If the first bit of that hash is 0 then you rule out that p is a solution to challenge NO4VRU4S, whose hint is 0b1. You don't have to do the 100,000 rounds of PBKDF2-HMAC-SHA256 to be able to rule out p, so you can quickly move on to your next guess. The point is that it is much quicker to calculate the SHA256 hashes than it is to run a full test on a guess.

There are other optimizations that people may make, but here I leave it to the experts. As it happens two of the hints are 0b1 and the other five are 0b0. On average for any guess, you should have a 50% chance of 1 and a 50% chance of 0. So perhaps an optimization strategy would be to only look at guesses that result in a 0 and then test those guesses against the five for which that hint applies.

Again, I don't know that this sort of optimization will work, but we left the SHA256 hashes unsalted exactly so that people only need to compute that SHA256 once per guess no matter how many targets they are going after.
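A minimal Python sketch of the filtering described above, applied to several targets at once. Only the hint check (first bit of the unsalted SHA256) and the 100,000 rounds of PBKDF2-HMAC-SHA256 come from the description; the target record layout ("hint", "salt", "derived_key"), the UTF-8 encoding of guesses, and the function names are assumptions made for illustration, not the format of the actual challenge files.

```python
import hashlib

ROUNDS = 100_000  # PBKDF2-HMAC-SHA256 iterations, as stated above


def first_bit(guess: str) -> int:
    """Return the leading bit of the unsalted SHA256 hash of guess."""
    return hashlib.sha256(guess.encode("utf-8")).digest()[0] >> 7


def matching_targets(guess, targets):
    """Return the IDs of the targets that guess solves.

    targets maps a challenge ID to a dict with hypothetical keys:
    "hint" (0 or 1), "salt" (bytes) and "derived_key" (bytes).
    """
    bit = first_bit(guess)  # computed once per guess, shared by all targets
    hits = []
    for target_id, target in targets.items():
        if target["hint"] != bit:
            continue  # cheap rejection: the expensive PBKDF2 is skipped
        derived = hashlib.pbkdf2_hmac(
            "sha256",
            guess.encode("utf-8"),
            target["salt"],
            ROUNDS,
            dklen=len(target["derived_key"]),
        )
        if derived == target["derived_key"]:
            hits.append(target_id)
    return hits
```

Because the hint hash is unsalted, `first_bit` runs once per guess regardless of how many targets are in the dictionary; only targets whose hint matches that bit cost a full PBKDF2 computation.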

bb8 · August 2018

@jpgoldberg thanks for that explanation!

Using the three original sample hashes/passwords, I took each passphrase and calculated their SHA1 hashes. Next, I took the hash and converted it to binary, however, the first bit for each of the three came out to be 0b0, even the "glassy ubiquity absence" password with the bitHint of 0b1. Is there a step I'm missing?

jpgoldberg · August 2018

Hmm. @bb8. One of the reasons that I did this for the samples was so that people could see if there is a step that I missed, but I checked my results (for the samples) in two different ways.

1. I calculated those using the code in the source repository.
2. I calculated those hashes on the command line using openssl dgst.

$ echo -n "glassy ubiquity absence" | openssl dgst -sha1 -hex
(stdin)= c599712fe6c8e7291c114043c6c54fe2984006eb

So the leading byte of that is c5, and c is greater than 7, so that really should be a "1" bit. That is, this byte should be 1100 0101.

What do you get for the SHA1 hash of "glassy ubiquity absence"?

Note the -n in the echo -n command. Without that, the newline character would be part of the string.

bb8 · August 2018

@jpgoldberg No, you're right. I figured out my mistake. I get the same hash as you noted, however, in my conversion to binary (I was trying to use Python) the SHA1 value was 'str' so it wasn't correctly converting. Once I fixed that, I see the correct binary output.

My next step is figuring out the proper method to get the first bit in Python... I'm still learning the bits and bytes. ;-)
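One possible way to read off the first bit in Python, sketched under the assumption that the hash is available either as raw bytes or as a hex string. The function names are illustrative only. The sketch also shows a common pitfall: `bin(int(h, 16))` drops leading zeros, so the first character of its output is not the first bit of the hash.

```python
import hashlib


def first_bit_from_bytes(digest: bytes) -> int:
    """Leading bit of a raw digest: the top bit of its first byte."""
    return digest[0] >> 7


def first_bit_from_hex(hex_digest: str) -> int:
    """Leading bit of a hex digest: the top bit of its first hex digit."""
    return int(hex_digest[0], 16) >> 3


def bits_of(hex_digest: str) -> str:
    """Binary expansion of a hex string, zero-padded to 4 bits per digit."""
    return format(int(hex_digest, 16), "0{}b".format(4 * len(hex_digest)))


# SHA1 hex digest quoted earlier in this thread for "glassy ubiquity absence"
sample = "c599712fe6c8e7291c114043c6c54fe2984006eb"
print(first_bit_from_hex(sample))                    # 1
print(first_bit_from_bytes(bytes.fromhex(sample)))   # 1

# Pitfall: bin() drops leading zeros, so 0x7a looks as if it started with a 1
print(bin(int("7a", 16)))  # 0b1111010
print(bits_of("7a"))       # 01111010

# Working directly from hashlib avoids the string conversion altogether
digest = hashlib.sha256("glassy ubiquity absence".encode("utf-8")).digest()
print(first_bit_from_bytes(digest))
```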

bb8 · August 2018

@jpgoldberg also, pardon my ignorance of the Go language. The source code for generating the hints shows "h := sha256.New()". Is it not calculating the SHA256 hash, as opposed to the SHA1 hash?

jpgoldberg · August 2018

Yikes I screwed up badly.

Hmm. ...

// MakeBitHint returns a string representing the first bits bits of the
// SHA256 hash of the string
func MakeBitHint(s string, bits int) string {
    if bits < 1 {
        return "0b"
    }
    if bits > 8 {
        bits = 8
    }
    h := sha256.New()
    h.Write([]byte(s))
    lead := h.Sum(nil)[0]

    // will right shift lead by 8 - bits
    lead >>= 8 - uint(bits)

    // then print result to bits places
    fmtString := fmt.Sprintf("0b%%0%db", bits)
    out := fmt.Sprintf(fmtString, lead)
    return out
}

Crap! You are right.

jpgoldberg · August 2018

I can understand me misremembering whether I used SHA1 or SHA256 by the time I announced things, but then I don't understand why I have the samples wrong.

I'm supposed to get on a plane tomorrow morning, but will try to fix this now. For the moment, I need to ask everyone to not put confidence in the hints as it is clear that I must have done something wrong.

bb8 · August 2018

@jpgoldberg no worries! Looking forward to the updated hints. :)

jpgoldberg · August 2018

Announcement: 1 bit hints use SHA256 (not SHA1)

Previously I incorrectly described how the 1 bit hints are generated. Instead of unsalted SHA1, they are created with unsalted SHA256.

Although this is my screwup in my announcements, you should put away your torches and pitchforks because:

1. The code used to generate the hints was made public (and attention was drawn to it).
2. If you tested the hints on the samples, you would have seen that SHA1 didn't work for the "governor washout beak" sample.

jpgoldberg · August 2018

Thanks @bb8, I don't need to update the hints. I just need to tell the world that they were created as it says in the source. SHA256.

Even my comments in the code for both the hint creation function and the tests say SHA256. Anyway, here is an "independent" test.

for p in one two three four "governor washout beak" "glassy ubiquity absence" "splendor excel rarefy"; do
    h=$(echo -n $p | shasum -a256 | cut -b1-2)
    echo "$p:  $h"
done

yielding,

one:  76
two:  3f
three:  8b
four:  04
governor washout beak:  7a
glassy ubiquity absence:  e6
splendor excel rarefy:  40

bb8 · August 2018

@jpgoldberg Awesome, thanks!!

eigenl0ss · August 2018

At best this only cuts the potential hashing time in half. It won't necessarily be enough to get any passwords cracked before the end of September or October.

jpgoldberg · August 2018

@eigenl0ss correctly points out that,

At best this only cuts the potential hashing time in half.

Yep. That is the point.

People should keep their head starts

It won't necessarily be enough to get any passwords cracked before the end of September or October.

I'm thinking of people who started back in May. If they made some progress, this cuts in half the remaining space that they need to search. Giving too big a hint at this point would be less fair to the people who have already been working on this.

Bigger hints if needed

And the system is set up to produce hints of up to eight bits. Once we have done this once (and learned from my mistakes), it should be easier to offer hints in the future. So the code for creating the hints is already written, and if we do have to do this again, I will surely remember that it was SHA256. If you take a look at the test code for the hint generator, you will see test vectors for the samples going up to three bits.
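For anyone following along in Python, here is a rough port of the MakeBitHint Go function quoted earlier in this thread. It is a sketch rather than the project's code: the name `make_bit_hint` and the UTF-8 encoding of the input are assumptions, and the Go source remains the authority.

```python
import hashlib


def make_bit_hint(s: str, bits: int) -> str:
    """Return the first `bits` bits (capped at 8) of SHA256(s) as '0b...'."""
    if bits < 1:
        return "0b"
    bits = min(bits, 8)
    lead = hashlib.sha256(s.encode("utf-8")).digest()[0]
    lead >>= 8 - bits  # keep only the top `bits` bits of the first byte
    return "0b" + format(lead, "0{}b".format(bits))


# One-bit and three-bit hints for the three published samples
for sample in (
    "governor washout beak",
    "glassy ubiquity absence",
    "splendor excel rarefy",
):
    print(sample, make_bit_hint(sample, 1), make_bit_hint(sample, 3))
```

With the leading bytes shown in the shasum output earlier in the thread (7a, e6 and 40), the one-bit hints come out as 0b0, 0b1 and 0b0, matching the hint column in the README table.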

Annoying opsec

The most annoying part of doing this is that the actual solutions are stored in a "secure location" on removable media. And then, once we've generated the (new) hints, we don't want to have to keep that file secret for too long. So there are just some irritating opsec logistics to deal with. Nobody really wants to be a $30,000 target. Unlike other flags and challenges, where reproducible demonstrations are needed, this is the kind of thing where knowledge of the solution could just mean that you "get lucky" in your testing order. There is no way to prove whether someone who wins by getting lucky (say, hitting a solution after a search of just 10% of the space) did so honestly or with advance knowledge.

Keeping the solutions

Indeed, we came close to not preserving the solutions at all. If we didn't need these hints, and if the challenges were within reach, there would never be a need for us to hold on to the original passwords. But we figured that if it turns out that the challenge is much harder than we'd thought (and this proved to be the case), we might need to be able to prove at some point that solutions do exist that are genuinely created as we claim they are.

eigenl0ss · August 2018

I think you're overestimating the number of people trying to crack the passwords. Anybody with the cracking hardware available to do so is likely to be using it for mining cryptocurrency or cracking the passwords of more valuable targets. It's not very reasonable to put themselves at risk for loss by sinking funds into hashing for months on end, only to find the competition modified a few months later, (as it now has been!) forcing them to scrap all previous hashes and/or rewrite their cracking code midway, lest they get scooped by better-leveraged competitors.

jpgoldberg · August 2018

It's not very reasonable to put themselves at risk for loss by sinking funds into hashing for months on end, only to find the competition modified a few months later, (as it now has been!) forcing them to scrap all previous hashes and/or rewrite their cracking code midway, lest they get scooped by better-leveraged competitors.

I don't know if you've read the previous discussion leading up to the hints. You will see that the hints were designed in consultation with people who had already been participating. Furthermore the full details, including hints on the samples and the source code for the hint generation, were announced in advance.

Ideally, we would have priced the challenge correctly from the outset. We did not think that it would be "months on end". And if you look at the very original announcement, you will see that we sought to price this so that it would be worthwhile for people to shift over from mining.

It's not that we didn't consider these things. We simply estimated the cost wrong. We wanted this wrapped up in a few weeks. But we've doubled the prizes twice now, and have added hints.

TuxToaster · August 2018

@eigenl0ss while I agree that they may have overestimated the number of people participating, I'd have to disagree with the rest of your assessment. I've got hardware dedicated to this that has been running for the past few months, and while there is definitely a risk of coming out at a loss, the potential gain is much higher than what would likely be generated on the same hardware over the same period mining crypto currencies, at least at this point. Not to mention the non-monetary value gained in experience and research for both 1Password and those participating in the exercise.

There is no reason that the hints provided would force a participant to "scrap all previous hashes and/or rewrite their cracking code midway", as any previously checked hashes would have already been ruled as valid or invalid. The hints simply reduce the amount of work needed to validate remaining potential candidates by allowing one to cut out those which do not match the hint without needing to perform the work of calculating a full 100,000 iterations of PBKDF2.

As Jeff pointed out, given that sample code was provided in advance, adding the hints to whatever methodology is in use should have been a minor adjustment to reduce work, not a complete rewrite. If anything, I think they've been more than fair. If you've been following the discussion here, you'd see that a lot of thought has been put into ensuring that any hints provided would provide an equal benefit to all involved, and it's obvious that great care was taken to avoid the exact scenario you suggest of the hint giving someone coming in a better chance than those who have been participating since the beginning.

eigenl0ss · September 2018

@TuxToaster @jpgoldberg

I am not saying the challenge sucks. AgileBits has been great, doing everything short of rewriting Hashcat to support the hints and revisions. I am saying that there is not enough of an incentive, by my assessment, for more than ~10 people to be participating in the competition, and that it will probably not be solved - not even one password - by the end of the year.

jpgoldberg · September 2018

I am saying that there is not enough of an incentive, by my assessment, for more than ~10 people to be participating in the competition, and that it will probably not be solved - not even one password - by the end of the year.

Well, we can always offer more hints, so we can always make it easier. The hint that we've offered turns a 42.5 bit challenge into a 41.5 bit challenge. If we offer another hint in a few weeks and then another in October if necessary, we will have moved it to a 39.5 bit challenge, reducing the cracking cost by 8 times. And if we need to do this again in November and December, we will have cut the cracking cost by 32 times.
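The arithmetic behind those figures can be checked with a short sketch: each hint bit removes one bit from the search space, so n hint bits divide the expected cracking cost by 2^n. The 42.5-bit starting point is taken from the paragraph above; the function names are illustrative.

```python
def remaining_bits(start_bits: float, hint_bits: int) -> float:
    """Size of the search space, in bits, after hint_bits bits of hints."""
    return start_bits - hint_bits


def cost_reduction(hint_bits: int) -> int:
    """Factor by which hint_bits bits of hints cut the cracking cost."""
    return 2 ** hint_bits


for hint_bits in (1, 3, 5):
    print(hint_bits, remaining_bits(42.5, hint_bits), cost_reduction(hint_bits))
# 1 41.5 2
# 3 39.5 8
# 5 37.5 32
```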

Maybe this is all wishful thinking on my part. I was hoping the whole thing would be wrapped up within two months of the initial challenge. I was very very wrong. But we are going to find a way to give out the money that we promised and reward people for the effort that they put in.

CyberSpaceCowboy · September 2018

What I'm honestly starting to think is that there's a split in the qualifications of people working on this, because the challenge has 3 prereqs. You have to have a moderate understanding of functional CS. You must have a moderate to advanced understanding of data encoding and encryption (of course, both of these can be googled through). And finally (the one I struggle with the most), you have to have a moderate to large supply of cracking computer hardware. Many people with the proper mining equipment probably lack the EE/CMPENG/CS classes that teach you exactly what needs to be done and how, and don't have the time to google the obscure questions needed to make a successful setup, so they choose to just use their hardware for traditional mining. On the other hand, you have people with the knowledge, but maybe not the time to invest, or the hardware to apply it to (and the willingness to invest in it). So out of all the people working on this, the only true contenders are those who have optimized a setup, have the hardware to leave it running, and are willing to get paid out only on chance.

jpgoldberg · September 2018

I'm not sure that I agree with @CyberSpaceCowboy's assessment of the skills and resources needed to productively participate in this challenge, but I fully agree that these are some highly specialized skills.

You have to have a moderate understanding of functional CS.

Keep in mind that you don't need to write hashcat or John the Ripper (which takes an enormous amount of skill), but you do need to be able to run them and know how to configure them. There are active communities of people who do this with prior experience of using those tools. But it is a lot to learn for someone who has never used those tools before.

You must have a moderate to advanced understanding of data encoding and encryption.

You need to understand what is meant by PBKDF2-H256 and the data formats. And you have to be able to transform what is offered into something that works as input to hashcat or John the Ripper. But again, people who have used those tools will generally understand this stuff.

You have to have a moderate to large supply of cracking computer hardware

Yep. And you are correct that a lot of people who have the gear and have only been using it for mining will not have the skills listed above. But I suspect that a lot of people who have the skills listed above have gear which they turn over to mining when there isn't a worthwhile set of passwords to attack.

Quite simply, the kinds of skills and resources needed to meaningfully participate are not independent of each other. Experience with cracking tools is going to be correlated with experience in password/hash encodings, and some of these people will have rigs that can be used for, or easily converted to, cracking.

Me: I've used both JtR and hashcat once or twice in the distant past, just to see how they work. It took me time to understand how to set things up, and this was long enough ago that I would need to relearn those. I don't have the gear. But I was hoping for participation from those communities. And I think there would be more if we'd initially priced things better (and made the challenge easier).

CyberSpaceCowboy · September 2018

@jpgoldberg OK, I might have used slightly exaggerated wording. I tried to justify it with "everything is a google search away", but the requirements can definitely look very intimidating at first glance.

AGAlumB · September 2018

@CyberSpaceCowboy: I think we can all agree about that. Cheers! :)

jpgoldberg · September 2018

Hey, I think I finally got @eigenl0ss's point. (I can be a bit slow, and it takes time for things to percolate through my brain).

If I can put it in Econ-101 terms, it's that even if the incentives we offer are sufficient for covering the cost of the cracking effort, there is still a larger opportunity cost that we can't realistically match.

I'm not entirely convinced, but I'm not unconvinced either. A lot of it has to do with the extent to which I believe in efficient market theory. For those unfamiliar with efficient market theory, it can be summarized by this joke:

Two economists were walking down the street. One says to the other, "Hey, is that a $20 bill lying on the sidewalk?" The other answers, "No, it can't be. If it were, someone would have picked it up already." They continue walking.

The idea is that if things on a reasonably free market aren't priced "correctly", someone would be able to take advantage of that, and through that process the price would move to the "correct" place. So if bitcoin mining were a "sure thing", more people would do it, which would bring the price down closer to mining costs.

bb8 · September 2018

I don't have the cracking power (though if I can sell my Mac Pro that isn't needed I'll build me a rig here soon!), but I messed around with this for a couple weeks. I didn't have a clue about the type of hashes used, or how to get the proper hashes for use in hashcat, so I googled a lot, and asked some questions of folks already cracking away.

I ended up writing some python scripts to aid in generating phrase lists, as well as utilizing the hints provided.

So while chances of me getting to any of the prizes is next to none, I still had fun and learned a lot in the process.

Thanks @jpgoldberg for the opportunity!

Ben · September 2018

That is great to hear @bb8. Thanks for sharing your experience. :+1:

Ben

jpgoldberg · September 2018

That is terrific @bb8. So yeah, whoever wins is very likely to be someone who has done this kind of thing before, but if people also learn from the experience that is great, too.

eigenl0ss · September 2018

If I can put it in Econ-101 terms, it's that even if the incentives we offer are sufficient for covering the cost of the cracking effort, there is still a larger opportunity cost that we can't realistically match.

You got it. If you offered $30-100k for a single solution, it would be an absolute no-brainer and the passwords would be cracked before any of us (forum users) even got the chance to see the announcement.

The flip side of that is that even if someone has a password solved, they may elect to hold onto it in the expectation that 1Password will increase the reward again (as it has several times in the past), thereby holding out on $10k cash now in the hopes of receiving >>$10k in the future (provided nobody else beats the astronomical odds and cracks a password).

If 1P makes a clarification that it will never raise the rewards again, I'd wager the probability distribution (of seeing a password release over time) shrinks and moves closer to the present.