Separate password breaches last week at LinkedIn, eHarmony and Last.fm exposed millions of credentials, and once again raised the question of whether any company can get password security right. To understand more about why companies keep making the same mistakes and what they might do differently to prevent future password debacles, I interviewed Thomas H. Ptacek, a security researcher with Matasano Security.
Ptacek is just one of several extremely smart researchers I’ve been speaking with about this topic. Below are some snippets from a conversation we had last week.
BK: I was just reading an article by Eric Chabrow, which pointed out that LinkedIn — a multi-billion dollar company that holds personal information on some of world’s most important executives — has neither a chief information officer nor a chief information security officer. Is it too much to ask for a company like this to take security seriously enough to do a better job protecting and securing their users’ passwords?
Ptacek: There is no correlation between how much money a company or service has or takes in — or whether it’s free or not free — and how good their password storage practices are. Nobody gets this right. I think it’s a problem of generalist developers writing password storage systems. They may be good developers, but they’re almost never security domain specialists. There are very few good developers who are also security domain specialists. So if you’re a smart and talented developer but not a security domain specialist, and you’ve got to put together a password storage system, even if it’s just MD5, to you that’s a space alien technology. That’s a cryptographic algorithm. You can tell your boss that’s a cryptographic hash. You feel good about the fact that you’re storing passwords in a cryptographic hash. But you have to be a domain expert to know that the term cryptographic hash doesn’t really mean much.
BK: Why doesn’t cryptographic hash mean much? Maybe LinkedIn shouldn’t have been using a plain old SHA-1 cryptographic hash function, but shouldn’t developers be seeking to secure their passwords with a solid cryptographic algorithm?
Ptacek: The basic mechanism by which SHA-1 passwords are cracked, or MD5 or SHA-512 — it doesn’t matter what algorithm you use — hasn’t changed since the early 1990s. As soon as code to implement SHA-1 came out, it was also available to John the Ripper and other password cracking tools. It’s a really common misconception — including among security people — that the problem here is using SHA-1. It would not have mattered at all if they had used SHA-512, they would be no better off at all.
BK: I’ve heard people say, you know this probably would not have happened if LinkedIn and others had salted the passwords — or added some randomness to each of the passwords, thus forcing attackers to expend more resources to crack the password hashes. Do you agree with that?
Ptacek: That’s actually another misconception, the idea that the problem is that the passwords were unsalted. UNIX passwords have been salted forever — since the ’70s — and they have been cracked forever. The idea of a salt in your password is a ’70s solution. Back in the ’90s, when people broke into UNIX servers, they would steal the shadow password file and would crack that. Invariably when you lost the server, you lost the passwords on that server.
BK: Okay. So if the weakness isn’t with the strength of the cryptographic algorithm, and not with the lack of salt added to the hashed passwords, what’s the answer?
Ptacek: In LinkedIn’s case, and with many other sites, the problem is they’re using the wrong kind of algorithm. They use a cryptographic hash, when they need to use a password hash.
BK: I’ll bite: What’s the difference?
Ptacek: The difference between a cryptographic hash and a password storage hash is that a cryptographic hash is designed to be very, very fast. And it has to be because it’s designed to be used in things like IP-sec. On a packet-by-packet basis, every time a packet hits an Ethernet card, these are things that have to run fast enough to add no discernible latencies to traffic going through Internet routers and things like that. And so the core design goal for cryptographic hashes is to make them lightning fast.
Well, that’s the opposite of what you want with a password hash. You want a password hash to be very slow. The reason for that is a normal user logs in once or twice a day if that — maybe they mistype their password, and have to log in twice or whatever. But in most cases, there are very few interactions the normal user has with a web site with a password hash. Very little of the overhead in running a Web application comes from your password hashing. But if you think about what an attacker has to do, they have a file full of hashes, and they have to try zillions of password combinations against every one of those hashes. For them, if you make a password hash take longer, that’s murder on them.
So, if you use a modern password hash — even if you are hardware accelerated, even if you designed your own circuits to do password hashing, there are modern, secure password hashes that would take hundreds or thousands of years to test passwords on.
BK: Can you give me an example of the difference between the time and resources it would take to crack, say, a SHA-1 password versus a good password hash?
Ptacek: Let’s say you have a reasonably okay password and you’re using salted, randomized SHA-1 hash. I can, off-the-shelf, build a system that will try many, many tens of thousands of possible passwords per second. I can go into Best Buy or whatever and buy a card off the shelf that will do that, and the software to do it is open source.
If we were instead using something like Bcrypt, which would have just been the difference of using a different [software] library when I built my application, I can set Bcrypt up so that a single attempt — to test whether my password was “password” or “himom” — just that one test could take a hundred milliseconds. So all of a sudden you’re talking about tens of tries per second, instead of tens of thousands.
What you do with a password hash is you design it in the opposite way you would design a standard cryptographic hash. A cryptographic hash wants to do the minimum amount of work possible in order to arrive at a secure result. But a password hash wants to deliberately be designed to do the maximum amount of work.
BK: Can you explain in layman’s terms what it is that makes a password hash like Bcrypt take so much longer to crack?
Ptacek: It’s similar to if you said, let’s take SHA-1 password, but instead of just using SHA-1, let’s run SHA-1 on itself thousands of times. So, we’ll take the output of SHA-1 and feed it back to SHA-1, and we’ll do that thousands and thousands of times, and you’ll only know if your password hash is right, when you look at the result of that 1,000th SHA-1 run. So, in order to make that password hash work, you have to run the algorithm 1,000 times for each guess. That’s roughly the tactic that the modern, secure password hashes take. These are algorithms that are designed so that you can’t arrive at the result without lots and lots of work. I mean, we’re talking about 100 milliseconds [one-tenth of a second] worth of work on modern hardware to get the results of a single password attempt.
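To make the iterate-the-hash tactic Ptacek describes concrete, here is a minimal Python sketch. This is a toy for illustration only — real password hashes such as bcrypt, scrypt, or PBKDF2 implement this idea far more carefully — and the function name and round count are made up for the example:

```python
import hashlib

def stretched_sha1(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Toy key stretching: feed SHA-1's output back into itself.

    Every guess an attacker makes must pay for all `rounds`
    iterations, so the cost of cracking scales with `rounds`.
    """
    digest = salt + password.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha1(digest).digest()
    return digest

# Checking a login means redoing the same work exactly once:
stored = stretched_sha1("hunter2", b"\x01\x02\x03\x04")
assert stretched_sha1("hunter2", b"\x01\x02\x03\x04") == stored
assert stretched_sha1("password", b"\x01\x02\x03\x04") != stored
```

The defender pays the cost once per login; the attacker pays it once per guess, for every hash in the stolen file.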
BK: So what’s the catch here? Are these password hashing systems any harder or more expensive to deploy?
Ptacek: It’s a knowledge issue. This is easier to deploy than building it yourself. If you build it yourself, it takes a fair amount of effort. Whereas with the libraries for secure password hashes, you just drop them in — it’s like one function call. And they’re all free. There are no patents to worry about or anything like that.
The Web application’s costs increase very, very slightly — some imperceptible amount. But cleverly, they’ve done it in a way that drastically explodes the costs for attackers. If a normal password try takes a microsecond [one millionth of a second] but a bcrypt password trial takes 10 milliseconds [one hundredth of a second], you might as well give up. It will take years and years.
Let’s say we have a Web application that lost 10 million password hashes, but they’re bcrypted passwords. So, I think in that last dump we saw from LinkedIn, about three million of those passwords had been cracked. It was really quick and easy to see if your password had been in there. In this case, if we lost 10 million bcrypted passwords, instead of 3 million of them being published and compromised, you might be looking at tens or hundreds of user passwords being compromised. Because you could maybe effectively look for every password that was “password,” but you certainly couldn’t run a dictionary against every password there anymore.
BK: Isn’t it likely that lots of companies that are relying on cryptographic hashes haven’t switched to password hashes because it might be disruptive, expensive or hard to do? Do you agree, or are there other issues getting in the way here?
Ptacek: At a certain point, the cost of migrating that is incredibly expensive. And securing Web applications that are as complex as LinkedIn is an incredibly hard problem. At the same time, trying to find talent to make sure they’re building it correctly…there’s just not enough application security talent out there. It’s hard to hire people who know what they’re talking about. There’s so much bulk knowledge that gets substituted for real expertise. You can find a lot of general developers sharing wisdom about how to develop secure applications, but these are by and large just general developers trying to be smart; they’re not people who really know what they’re talking about.
To me, the fundamental problem is passwords. People should use better storage algorithms if they’re going to inflict passwords on their users. They should do a better job of it. But the real answer is things like two-factor authentication with smart phones. Two factor authentication seems like the answer to me. I think ten years from now this is going to be a common approach.
” A feature of Erewhon is the absence of machines; this is due to the widely shared perception by the Erewhonians that they are potentially dangerous.”
http://en.wikipedia.org/wiki/Erewhon
“Part of the problem may be that there are few consequences for companies with a devil-may-care attitude toward data.”
http://www.nytimes.com/2012/06/11/technology/linkedin-breach-exposes-light-security-even-at-data-companies.html?pagewanted=2&_r=1&hp
I learned something new again in security. The core idea of requiring a longer login process makes lots of sense and I think users would prefer that to additional hoops to jump through to access online services.
However, one thing that still seems unaddressed to me is preventing password theft by keylogging and the continuing use of weak authentication. There are solutions available to encrypt keystrokes so that user passwords are not viewable, yet everyone seems focused on anti-virus methods to prevent keylogging. A keystroke encryption tool seems to cut these approaches off at the knees and is very low cost (less than $1 per computer).
Multi-factor and out-of-band authentication looks like it’s getting more attention, at least in financial services, but still has a long way to go.
Hey – good one.
You don’t hear much about keystroke encryption anywhere. Thanks for posting this reminder and info.
If it truly is low cost and very low processing overhead, one has to wonder why more firms / web sites are not doing this…??
@Kris Tuttle: “There are available solutions out there to encrypt keystrokes so that user passwords are not viewable yet everyone seems focused on anti-virus methods to prevent keylogging.”
Since a keyboard CPU is not user programmable, the encryption, and especially decryption, will occur in the main CPU. That CPU eventually takes encrypted keystrokes back to plaintext keystrokes (and, thus, passwords) before handing them off for local use or transmission.
That means passwords end up being exposed as plaintext in running applications, and so are at least potentially available to resident bots. The benefits of keystroke encryption thus depend upon having incompetent bots, an issue totally under attacker control.
I’ll repeat my post from the LinkedIn article:
Crypto expert and creator of md5crypt Poul-Henning Kamp on LinkedIn’s mistakes…
LinkedIn Password Leak: Salt Their Hide
http://queue.acm.org/detail.cfm?id=2254400
This guy explicitly says that any hashing algorithm that takes less than one full second is too weak.
The problem with that for Web servers should be obvious. No Web site getting even dozens of user login requests per second (or even per minute) is going to want every one of them to take a full second to log in, even if each user only logs in once a day. People close the Web page if the site is too slow for even five seconds…
Of course, we’re only talking login requests here, not every page request, but still…
I suppose one option is for Web sites to ASSIGN you a strong password and NOT allow you to change it except on their schedule. A lot of people would hate this because the password would probably not be memorable. But this would force everyone to use password managers.
The problem with password managers is…they’re vulnerable. One analysis of cracked passwords discovered in a trojan file dump indicated sixty percent of the passwords came from Web browser password managers.
So an end user would have to use a stronger password manager such as KeePass – hopefully with a strong master password…
See how this goes? More complexity, more motivation for the end user to SUBVERT his OWN security…
That’s a losing game. The alternative is…a losing game.
A-a-a-n-n-n-d-d-d we’re back to my meme again…
If you are worried about too much effort for the server, just do the multiple hashing when users do their first-time registration/password change. Then when users log in, offload the hashing to run on the client.
Same problem. Depending on how many users are logging in for the first time, which on a large site might not be a trivial number, this could slow the server.
One way this could be avoided is the user’s first registration isn’t immediate. You take his info, offload it to another server which does nothing but generate the hashes, then email him his registration confirmation. Most sites do this sort of registration anyway so handing off the actual hashing to another server which can take the time to do it makes sense.
But the problem is now the subsequent login. You have to take a clear-text password and hash it to match the hash in the server database. Offloading to the client just shifts the slowness from the server to the client. This would probably work, since the end user PC is doing a lot less work, and another second or so on login really won’t be that noticeable — most users expect login to take longer than a simple page request.
But now your hashing algorithm code – presumably in JavaScript – is now visible to malware or intruders on the end user PC (unless obfuscated somehow) – including your salt (not to mention the end user’s password in the first place.) Once your salt is revealed, hackers can build a pre-hashed database of every likely password. If they can subsequently download your hash database, they can still crack many of the passwords.
The only way to avoid that is to have a second secret salt which is added to the password and first salt on the server side – which again means some slow hashing has to be done on the server. However, at that point perhaps the hashing needn’t be that slow since the initial hashing has already been done on the client side.
With the use of a secret salt, the malware hacker can capture the end user’s password and the salt on the client side, and even the server hash database on the server side, but still not be able to compromise more than that one password on the server side, even with the use of a pre-hashed database – as long as the secret salt is not leaked or captured in some way.
So the secret salt now becomes the single point of vulnerability. So it better not be stored in plain text anywhere on the server – either as a single salt or as an algorithm. It should exist only in server memory as executable code.
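The “secret salt” this commenter describes is usually called a pepper. One common construction — a sketch of the general idea, not necessarily the commenter’s exact scheme, with invented names — applies a keyed HMAC with a key held outside the database before the slow, salted hash:

```python
import hashlib
import hmac
import os

# The "secret salt" (pepper): kept only in server memory or a key
# store, never written to the password database. (Regenerated every
# run here only because this is a standalone sketch.)
PEPPER = os.urandom(32)

def peppered_hash(password: str, salt: bytes) -> bytes:
    # Keyed step first: without PEPPER, a stolen hash database plus a
    # precomputed table of likely passwords is useless to an attacker.
    keyed = hmac.new(PEPPER, salt + password.encode(), hashlib.sha256).digest()
    # Then the slow, salted hash (PBKDF2 standing in for bcrypt).
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 100_000)
```

As the commenter notes, the pepper then becomes a single point of vulnerability, which is why it belongs in memory, an environment secret, or a hardware key store rather than in the database alongside the hashes.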
I worked at City College of San Francisco for a while as a database developer. The college had the commercial Banner system for administration, which has an algorithm for generating certain critical identification data. The algorithm was very minimal and dependent on a couple relatively simple variables. The documentation said these variables should be removed from the PL/SQL code before using the system.
They weren’t.
If you allow the client to pass the hash, why would a hacker bother to crack the hash?
It’s all about protecting the password should the hash be compromised. Most current hashes are perfectly known and documented. But they are considered 1-way. Meaning that it is possible to check whether the password is correct, but not to recover what the actual password is.
With the increase in computing power, the hashes are increasingly easy to reverse. The whole point here is that by making the hashing more CPU intensive, it becomes infeasible to go from hash -> password.
It will always be possible to go from hash to password by using a hash database of all likely passwords. Then the hacker only needs to seize the server database through some other security flaw and he can crack it.
Making the hashing more CPU intensive does not stop that as you can always put more CPU power into the task – probably faster than you can develop new and TESTED hashing algorithms that are adopted widely.
Making the hashing CPU intensive at the registration or login point has some value but not if the hash database is compromised via some other security flaw. In that situation you still can’t prevent the cracking of a large number of passwords.
The only defense against that is a secret salt which is NOT stored in the hash database so that even if the database is seized or a password or hashing algorithm is compromised on the client side it is impossible to reproduce the hash.
So Web sites have three choices:
1) Use an efficient hashing algorithm with a secret salt to minimize server CPU and protect the hash database.
2) Use a less efficient hashing algorithm without a secret salt which is harder to crack but imposes a CPU penalty and does not protect the hash database except temporarily (until CPU power increases enough to overpower it.)
3) Use a less efficient hashing algorithm with a secret salt which secures the hashing database – and take the CPU hit as the cost of security.
As long as you can protect the secret salt, I’d pick 1). For sites that really need security and can afford it, pick 3).
“It will always be possible to go from hash to password by using a hash database of all likely passwords.”
So an answer could be using passwords that are definitely not likely ?
I use many different passwords. Part of a couple of them, for example, is my nickname from (way back) high school. Not a word that any dictionary or lexicon would recognize. Nor is it a derivative of any recognizable word. The two classmates of mine who used to use it and might remember have unfortunately passed away.
Other passwords of mine are similarly “not likely”.
I am not trying to prove any brilliance on my part. I think my purpose is to point out that users have the huge part of the responsibility. If they do not step up to it then it should be forced on them.
Agreed. Weak passwords defeat all of these advanced security measures.
Unfortunately, forcing strong passwords on end users is next to impossible, unless it is in connection with some device that does the password handling for them so they don’t have to worry about it.
I think eventually some sort of common security device, owned and used by everyone, will have to be the at least feasible, if not perfect, solution.
And that device may or may not be built into a cellphone. It definitely won’t BE a cell phone — cell phones are just as insecure as any other computer.
“It will always be possible to go from hash to password by using a hash database of all likely passwords.”
If those passwords are salted with a 128-bit salt, you would need more matter than is in the entire universe to store a hash database of *just one* password. If you don’t understand this, then you don’t understand salting, and I would advise you to quietly leave the discussion before you embarrass yourself further.
And don’t give me this rubbish about ‘but you can’t guarantee that someone won’t find a way in 2 years time!!!!’. No. The laws of physics don’t allow for it. To be able to create that hash database would *literally* require rewriting the laws of physics. If you can do that, why are you bothering with some silly hash database? Apply for the Nobel Prize, man!
>>If you allow the client to pass the hash, why would a
>>hacker bother to crack the hash?
>Its all about making it diffecult to protect the password
>should it be compromised. Most current hashes are
>perfectly known and documented. But they are considered
>1 way.
Did you read what he wrote, and what he was replying to? Someone suggested the client do all the CPU computation to generate the hash and then give it over to the server. To repeat him:
If you allow the client to pass the hash, why would a
hacker bother to crack the hash?
Remember: this is how Windows authentication works, the Chinese hackers are using this in their attacks. They DO NOT crack the password, they just steal the hash and use it WITHOUT KNOWING THE PASSWORD.
Exactly. 1st rule of a secure app
Never trust the client!
I know it won’t solve the problem, but SSO could reduce the login delay to one system.
On the other hand, you need to trust one SSO provider, and when they screw up, all your accounts are compromised 😉
To be fair, I don’t think PHK would consider himself a crypto expert, but he has tried to do the right thing over the years.
Brian, thank you for the nice article. And the guy is absolutely correct, very few developers really understand this issue and raising awareness is important. I also only learned about the modern approach to password security relatively recently, namely from these articles:
http://www.h-online.com/security/features/Storing-passwords-in-uncrackable-form-1255576.html?view=print
http://cslyon.net/2011/05/10/sha-512-w-per-user-salts-is-not-enough/
https://blog.whitehatsec.com/hash-length-extension-attacks/
Great article — I wasn’t really aware of the issue, and the explanation about password and crypto hashes was enlightening. Thanks!
It is in fact really easy to migrate from one password hash function to another. The standard password hashing functions use a storage format which identifies which function was used for each password. The password checking function can therefore support several hashes: the current favoured one and old ones for backwards compatibility. When a user logs in, the software can check if their password is stored in an obsolete format and quietly upgrade it there and then without any special action from the user.
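A sketch of the upgrade-on-login pattern this commenter describes, with hypothetical helper names and a simplified modular-crypt-style prefix identifying which scheme produced each stored hash (real implementations typically use standard `$2b$`/`$pbkdf2$` format strings):

```python
import hashlib
import hmac
import os

def legacy_check(password: str, stored: str) -> bool:
    # Old scheme: unsalted SHA-1, stored as "sha1$<hex>".
    return hmac.compare_digest(
        hashlib.sha1(password.encode()).hexdigest(), stored.split("$", 1)[1])

def modern_hash(password: str) -> str:
    salt = os.urandom(16)
    h = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return "pbkdf2$" + salt.hex() + "$" + h.hex()

def modern_check(password: str, stored: str) -> bool:
    _, salt_hex, h_hex = stored.split("$")
    h = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), 200_000)
    return hmac.compare_digest(h.hex(), h_hex)

def login(user: dict, password: str) -> bool:
    stored = user["pw"]
    if stored.startswith("sha1$"):
        if not legacy_check(password, stored):
            return False
        # Transparent upgrade: we have the plaintext right now,
        # so re-hash it under the current scheme and store that.
        user["pw"] = modern_hash(password)
        return True
    return modern_check(password, stored)
```

Because the plaintext is only available at login time, the migration happens gradually: active users get upgraded on their next login, and dormant accounts can be expired or force-reset later.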
I have to laugh at the fact that a million articles about password strength and password managers appear the minute there are reports of passwords being stolen. The strength of your password, or having it locked up in Fort Knox, does not mean anything when it is stolen from the source! I agree with you that these organizations need to be securing our info better, and there are steps to be taken; to me, two-factor authentication is a step in the right direction. People need to be talking less about password strength and more about other steps, like the need to implement some form of 2FA where you can telesign into your account and know you are protected if your password were to be stolen. If someone were to try to use the “stolen” password and were not on the computer, smartphone or tablet you have designated trusted, they would still need the one-time PIN code, which is delivered to YOUR phone via SMS or Voice.
I think 2FA is way overrated.
scenario 1:
hacker has write access to the database.
changes your phone number
login
scenario 2:
use of some hardware token
manufacturer of the token gets compromised
everyone needs a new token, this can take some time
2FA is not a bad thing, but putting it on top of a weak security scheme won’t help much.
If the attacker has access to the database, the game is already over.
That said, there are attackers against two-factor auth that do require additional defenses. For example, if the attacker can social-engineer the customer support representative / IT Helpdesk admin to change the phone number in the database. That’s a slightly more realistic scenario that doesn’t rely on the attacker already having access to your internal systems.
Two factor authentication will result in the following:
The hackers figure out how to compromise each of the two factors.
Really, does anyone expect this will not occur?
It already has in some instances of 2FA.
And using a cell phone as one of the factors just shifts the security nightmare from one technology to another. Cell phones ARE COMPUTERS! They are no more secure than computers and they are used by the same end users who can be compromised in any number of ways.
If you’re going to use 2FA, make sure the second factor is a technology DESIGNED for that purpose by someone who is a security expert – and not a consumer device developer.
@BrownChicken: “People need to be talking less about password strength and more about other steps like the need to implement some form of 2FA were you can telesign into your account and know you are protected if your password were to be stolen.”
That seems exactly reversed:
Typically, only weak (i.e., short and predictable) passwords can be exposed from hashed results. Users can exploit that for themselves by using long, random passwords, with a different one for each site or equipment. That also means using a password manager, such as LastPass.
Next, 2-factor authentication is not magic. In particular, it does little to protect against a live bot, and many machines do in fact have bot infections. Bots can see any data which flows through the machine, including passwords, retina scans, digital signatures, and any other sort of authentication whatsoever. Since the problem is not the cryptography, improving that is not going to solve the problem. The problem is the bots.
The problem is that a regular proof-of-work system (http://en.wikipedia.org/wiki/Proof-of-work_system) does not work here, since all the hashing needs to be done on the server (never trust the client) while also being expensive enough to compute to stop brute-force or even dictionary attacks.
This post hits a number of important points.
A few months back, I needed to get into an old laptop that I hadn’t used in a long time, and I couldn’t remember any of the passwords. But using various tools, I was able to crack all of the passwords in a relatively short period of time. Newer versions of Windows would be somewhat harder to crack, but the principles remain the same – if you give me physical access to a machine, I can crack the passwords. If I had bothered to learn to program CUDA, I could have done it even faster.
Bcrypt sounds like a stopgap solution, but I wonder if we will reach a point in some number of years where we will be looking back at bcrypt the same way we now look back at things like MD5 or SHA1.
From an end-user perspective, people tell you to use different passwords for every website, and use complex passwords. These two together essentially make it impossible for users to remember it all (yes, you can construct things for commonly used passwords, but the ones you use infrequently are always more problematic). Personally I end up using a password vault, which helps to manage it all, but most people don’t do this, and instead re-use passwords and use simpler passwords that would be relatively easy to crack using brute-force methods. My own feeling is that passwords are essentially obsolete, but an adequate replacement has not yet emerged.
I have always liked things like RSA tokens, but the average end user doesn’t have a smartcard reader, and using USB tokens would be too expensive. Nobody wants to force the user to go out and buy something (although a bank or brokerage account might be where such a thing could start).
But you would probably need to have a different certificate for every website, and simple tokens can’t really handle that many certificates.
Two-factor authentication does not help when hackers own the server, as they can hijack the user’s TFA algorithm too. It just protects users from phishing and keyloggers…
The only working way is one-time out-of-band passwords. And if you send the OTP to cell phones rather than landlines, or do not ask for the OTP every time the user logs in, then it does not protect the user from malware, of course, but it is the best solution and the risk is really low.
I wonder if the upcoming SHA-3 will be any improvement in terms of password security, or if it will just ‘improve’ the flawed SHA-2. Will users of SHA-1 & 2 be able easily to migrate to 3, because otherwise they won’t do it.
Brian, can you or your crypto colleagues comment?
Bill, my understanding of Thomas’ comments was that upward iterations of SHA-whatever aren’t the answer, as those are cryptographic algorithms that prize speed. The idea behind a password storage hash is that it’s designed just the opposite: to take as long as possible (within reason) to hash the password.
That segment is in the middle of the above transcript, but I’ll repaste it here because it’s somewhat central to the discussion:
BK: Okay. So if the weakness isn’t with the strength of the cryptographic algorithm, and not with the lack of salt added to the hashed passwords, what’s the answer?
Ptacek: In LinkedIn’s case, and with many other sites, the problem is they’re using the wrong kind of algorithm. They use a cryptographic hash, when they need to use a password hash.
BK: I’ll bite: What’s the difference?
Ptacek: The difference between a cryptographic hash and a password storage hash is that a cryptographic hash is designed to be very, very fast. And it has to be because it’s designed to be used in things like IP-sec. On a packet-by-packet basis, every time a packet hits an Ethernet card, these are things that have to run fast enough to add no discernible latencies to traffic going through Internet routers and things like that. And so the core design goal for cryptographic hashes is to make them lightning fast.
Well, that’s the opposite of what you want with a password hash. You want a password hash to be very slow. The reason for that is a normal user logs in once or twice a day if that — maybe they mistype their password, and have to log in twice or whatever. But in most cases, there are very few interactions the normal user has with a web site with a password hash. Very little of the overhead in running a Web application comes from your password hashing. But if you think about what an attacker has to do, they have a file full of hashes, and they have to try zillions of password combinations against every one of those hashes. For them, if you make a password hash take longer, that’s murder on them.
So, if you use a modern password hash — even if you are hardware accelerated, even if you designed your own circuits to do password hashing, there are modern, secure password hashes that would take hundreds or thousands of years to test passwords on.
The problem with that last part – about algorithms which are hard to break, even with custom hardware – is that hackers don’t do that.
They use the same algorithm to generate every likely password and build a database. This is no different than running your own Web server and having a huge number of people generate password hashes.
Then they just have to capture the REAL hash database by some OTHER security flaw. They don’t have to crack ANY hashes at that point. It’s just a matching exercise to get the original password because they already KNOW the original password that produced that hash.
Only a secret salt which they do NOT know – and cannot obtain – can defeat this attack.
But that assumes the secret salt can never be compromised itself. If such a method becomes common, the secret salt will be the next target of hackers, just as two-factor authentication will be the next target if it becomes widely adopted.
A-a-a-n-n-n-d-d-d we’re back to my meme…
The salt itself doesn’t necessarily have to be secret to provide security. Just non-deterministic; ideally from a hardware random number generator.
The point of salting isn’t to protect against a targeted attack against one particular user. Salting wasn’t designed to make brute forcing a single hash any harder, just to make opportunistic attacks against the entire database infeasible.
Basically, salting ensures that if two users have the same password, they don’t have the same hash; an attacker can’t just hash “password” and try it against the hash database. The attacker would need a separate look-up/rainbow table for each and every salt used in the hash database.
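A minimal illustration of that point, using plain SHA-256 just to show the mechanics of salting (a real system would pair the salt with a slow password hash such as bcrypt):

```python
import hashlib
import os

def salted_hash(password: bytes, salt: bytes) -> str:
    # The salt is stored in the clear next to the hash; it only needs
    # to be unique per user, not secret.
    return hashlib.sha256(salt + password).hexdigest()

alice_salt, bob_salt = os.urandom(16), os.urandom(16)
alice = salted_hash(b"password", alice_salt)
bob = salted_hash(b"password", bob_salt)

# Same password, different salts: different hashes. A single precomputed
# table of hash("password") therefore matches neither stored entry.
assert alice != bob
```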
I’m surprised Mr Ptacek didn’t endorse salting more. His advice to use bcrypt really only protects against targeted brute-force attacks, while ignoring the fact that most of these database breaches are likely perpetrated to harvest email/password combos across the entire userbase. Has he heard of rainbow tables?
I’d put the following to Mr Ptacek: If Linked.in had used Bcrypt with a hash-time of 100ms, and we take your implication that salting is basically useless in providing security, how long would it take to make a look-up table of the top 100 passwords (ie. upwards of 60% of the userbase)? 10 seconds.
How about the same situation, except with a 4-byte salt added: almost 1400 years.
The assertion that simply switching from sha-1 to bcrypt would result in only tens or hundreds of passwords breached is dangerously false.
Instead of describing the security benefits of salting as a ‘misconception’, I would advocate combining salting with password storage hashes. The advice given by Mr Ptacek is, on face value, dangerous and irresponsible.
Yes, the salt doesn’t have to be secret. But again, can we be sure that storing the salt in the clear in the database will ALWAYS ensure that no attack can compromise at least one password in the database? (See my comment below)
Is such an attack feasible? Why bother asking – just make sure it won’t work no matter how much compute power can be put towards it.
The goal of a password database is to ensure both that 1) no single user’s password can be compromised; and 2) if the database itself is stolen, effective recovery of ANY password is not feasible.
An additional secret salt makes both goals much easier than hashing with a known (even random) salt.
But in the end, if you can just guess or figure out one (or more) end user’s password, much of the time all this effort is wasted. And that will never change.
The salt doesn’t need to be secret, but it should be random and unique to each user. With a random 128-bit salt, you’ll have to consider each user/hash individually and won’t be able to precompute anything.
It’s okay if you don’t want to stretch an algorithm until it takes 1/10 of a second or more; that could run into some responsiveness issues, especially for a larger site or with a DoS attack, but you can still slow it down until it takes 1/500 of a second and limit login attempts to no more than 1 per second and it would be a huge improvement over plain MD5/SHA-1.
“With a random 128-bit salt, you’ll have to consider each user/hash individually and won’t be able to precompute anything.”
This is precisely what I’m questioning as not necessarily being true.
A secret fixed salt in addition to the random salt eliminates even the possibility of such an approach.
If an attacker can guess your salt values without access to the password database, you’re doing something wrong.
Who said anything about guessing the salt value?
Obviously ANY salt you use should be a high entropy value, whether fixed or random.
You can’t precompute rainbow tables for 2**128 salt values. It’s billions of billions of billions of times beyond what is possible. It’s not a matter of having a few billion dollars or of technology catching up in a few years. It’s way beyond that.
I don’t mean to go on the attack, I understand where you’re coming from and it’s nothing personal, but in the interests of a robust debate, I need to clear up a few things. 🙂
First up, it’s not a salt if it’s fixed, it’s just a suffix.
Second, it’s already assumed that the password database is secret. No one is publishing their password database publicly (although, with the right design and algorithms, together with user education and a very strict password complexity policy, it would in theory be possible to maintain security even if you did). If the database itself is secret, then it follows that any salt inside the database is also secret.
I assume you propose storing this secret suffix somewhere else other than the password database. If an attacker could break in and steal password.db, why wouldn’t they also just steal secret_fixed_suffix.txt too? If it’s possible to make the suffix more secret than the passwords themselves, why not just also do that same thing to the passwords (and save yourself time by storing the suffixes with the passwords then)? And for increased security, you could have different suffixes for each user.. Oh wait…
“First up, it’s not a salt if it’s fixed, it’s just a suffix.”
That’s a semantics issue. I’m talking about a high-entropy random value. Suffix is how it’s used, not what it is.
“Second, it’s already assumed that the password database is secret.”
Oh, please… You have to allow for the possibility of another security flaw enabling capture of the database.
“If an attacker could break in and steal password.db, why wouldn’t they also just steal secret_fixed_suffix.txt too? ”
Because it’s not stored anywhere except in running memory on the server. While I won’t make the impossible claim that this is secure, it’s more secure than storing it anywhere.
You can also store it (in running memory) on another server altogether which means the hacker has to compromise both servers – especially hard if the other server isn’t ON the Internet but merely connected to the Web server via a restricted channel.
“If it’s possible to make the suffix more secret than the passwords themselves, why not just also do that same thing to the passwords (and save yourself time by storing the suffixes with the passwords then)?”
That’s not the point. The point is that a secret suffix makes it impossible to crack a captured password database without compromising the secret salt as well. You CAN’T make the password database “more secret” by definition – it has to be available. (You could of course put it on yet another server as well as the secret salt on still another server, which wouldn’t hurt.)
Apparently no one has bothered to read all of my posts on this issue…
“That’s a semantics issue. I’m talking about a high-entropy random value. Suffix is how it’s used, not what it is.”
A salt is a *uniquely generated* random bitstring *for each user*. What you’re proposing is fixed (ie. the same for all users). Please don’t call it a salt because it’s not. Calling it a salt is disingenuous at best.
“Because it’s not stored anywhere except in running memory on the server.”
So you’re now only storing the secret in volatile memory? And what happens if the process crashes? Or are you suggesting that it be hard-coded into the authentication code? Given that the authentication code for a website is most likely written in php, how do you propose you protect that secret in the [less secure] php code? You’re basing your security advice over the hope that the attacker doesn’t notice?
Regardless, from an information theoretic perspective, all you’ve done is modified your hash function from a standard one to a non-standard one. I fail to see how this provides any security over an existing standard hash function, given that one should never design a secure system without accounting for Kerckhoffs’ principle/Shannon’s maxim.
“You can also store it (in running memory) on another server altogether which means the hacker has to compromise both servers – especially hard if the other server isn’t ON the Internet but merely connected to the Web server via a restricted channel.”
Why don’t you just put the password database on that server if it’s so secure?
“…The point is that a secret suffix makes it impossible to crack a captured password database without compromising the secret salt as well.”
Actually, as we have all been aware of in the information theory & cryptographic community for the last 40 years, storing a salt alongside the [password storage] hash of the user password, makes it impossible to ‘crack’ the database faster than brute force anyway. As a bonus, you don’t even need to store any secrets or have any secondary offline servers to maintain and support. Just because you don’t understand how salting improves security doesn’t mean it’s insecure. If you don’t ‘trust’ salting, then by all means fly solo and roll your own password authentication system, but don’t advise others to do so.
All this talk about tacking some non-standard, secret, fixed suffix stored in running memory on some other server to the end of user passwords before you hash them on the original webserver to prevent some obscure, specific attack scenario where we assume the attacker only has the ability to read the password database file and nothing else, sounds like unnecessary complexity to fix a problem which doesn’t exist.
Fact of the matter: best practice is salting (to prevent space/time tradeoff attacks such as rainbow tables) combined with a standard (since we adhere to Kerckhoffs’ principle), computationally secure (to mitigate cryptographic attacks) password storage hash (to mitigate brute-force attacks). If Linked.in had followed best practice, this leak would be a complete non-issue.
I can’t believe I’m biting on a comment thread on this topic. Oh well.
bcrypt has embedded random salting folks. Rainbow tables are ineffective against a database (although, if you’re lucky, maybe you’ll have a table generated for that one hash that was generated for that one user when they created their password). I doubt Tom is really suggesting that salting is useless, but I suspect he’s more pointing out that most static / shared salt style implementations are of significantly less use than people think.
What a pleasant surprise to see a post with ptacek. I follow his posts on Hacker News and twitter… smart guy!
He hit the nail on the head. The only thing I disagree with is that “[T]he idea of a salt in your password is a 70s solution.” Instead I would say that a salt is an important component of a strong password hashing mechanism. For example bcrypt uses 128 bits of salt “in order to make building dictionaries of common passwords space consuming” according to OpenBSD’s crypt(3) manpage. For this reason there are zero rainbow tables available for bcrypt passwords.
A few points about bcrypt:
* The only operating system to use bcrypt by default is OpenBSD
* The maximum password length is 72 characters (many sources incorrectly state 55; this is easily disproven by comparing passwords of 55/56 chars and 72/73 chars). I’ve done this for both interactive SSH logins on OpenBSD and comparing passwords using py-bcrypt. Beyond 72 chars, additional characters will be ignored.
* bcrypt is available for python and I’d assume all languages suited for writing CGI scripts and programs.
Hi – good point. Even at 1/10 second per hash, a pretty good sized dictionary lookup table could be built in a week or so – and once built, could be published for anyone to use. So it seems like some sort of salting, not just slow hashing, is required to put this sort of attack out of range.
“there are zero rainbow tables available for bcrypt passwords.”
Are we sure this is because the space requirements are high? Even in these days of multi-terabyte NAS boxes for under $1,000? (Not to mention cloud clusters and distributed computing…)
Or could it be because so few systems are using bcrypt passwords these days?
Frankly, if Linux is a “niche OS” (except on servers), BSD is an even more niche OS. Which is unfortunate since variants of BSD are probably the most secure OS out there other than DoD Certified ones.
I don’t think you’ve proved the point that a rainbow attack is not effective just because there are no rainbow tables for bcrypt – yet.
HOWEVER…that said…
I just found this article by Ptacek from back in 2007…
Enough With The Rainbow Tables: What You Need To Know About Secure Password Schemes
http://www.securityfocus.com/blogs/262
He states in this piece that if you have a RANDOM salt, you can defeat rainbow tables because you expand the number of POSSIBLE passwords in any hacker created database to a point which is not feasible. This is presumably what Nic above was talking about.
This approach sounds good except… “the nonce [salt] is stored in the clear” in the hash database.
So you have a hash and the salt used to create the hash. How is it that this prevents creating a rainbow table? If you capture the database by some other security flaw, you can see ALL the salts used in that table and which hashes they are associated with. So you can still generate a rainbow table from likely passwords and those salts until you get a match for at least one such combination.
How long will that take? Who knows? Some mathematician can probably figure that out. I can’t. So is it a feasible attack? Again, who knows for sure? Especially given, as I said before, clusters, distributed processing and petabytes of storage.
Remember, your enemy isn’t necessarily some 17-year-old in his mom’s basement. It could be a nation state – even OUR nation state – with millions or even billions to spend on a password cracking scheme.
But a SECRET ADDITIONAL SALT totally defeats that technique – at least until the secret salt is compromised as well.
All that said, Ptacek is probably right. Bcrypt is your best choice today.
But I’d use a secret salt on top of it just to be safer.
“I don’t think you’ve proved the point that a rainbow attack is not effective just because there are no rainbow tables for bcrypt – yet.”
Well, I didn’t say a rainbow attack wouldn’t be effective against bcrypt. I said that you can’t build rainbow tables for bcrypt passwords. (so the effectiveness of a theoretical bcrypt rainbow table is a moot point.)
Technically you could build a rainbow table for bcrypt passwords, but given the 128 bits of random salt per password, the resulting table would either be:
a) vastly incomplete, so small relative to the number of total possible hashes that it wouldn’t serve a purpose, or
b) so large that all the hard drives in the world couldn’t even store it
Anyway, the point of the article is that password hashes (like bcrypt or scrypt) must replace cryptographic hashes (like sha512) regardless of salting. You can salt sha512 passwords (defeating rainbow tables), but then, GPUs and even CPUs are still fast enough to brute force the passwords.
Cryptographic hash functions (md5, sha-2) are not meant to store passwords. Cryptographic password hash functions (bcrypt, scrypt) are. Use the right tool for the job.
I’m speculating here, but I wonder if it would be possible to in a sense build a “hashed” hash database for bcrypt hashes…
In other words, instead of generating every possible bcrypt hash, and having to store every one of those, perhaps it’s possible to generate the hashes and then only store a “reference” to that hash rather than the hash itself.
That is, you compute the hash “space” but don’t store the resulting hashes themselves. This would cut down on the actual physical space required to store the overall hash – although it might not cut it down by enough – and it does nothing to solve the problem of how long it would take to generate every possible hash.
I have no idea if this is feasible. It’s just a thought.
What I do know is that someone with a lot of resources is undoubtedly trying to figure out how to compromise bcrypt hashes as we speak – and probably has been trying for several years at this point.
Right now, bcrypt looks like the answer to password compromise by brute force and rainbow tables. Adding a secret salt would make it the answer for much longer. regardless of how much CPU power or algorithms could be brought to bear.
Bcrypt adds a 128-bit salt (a unique random value) to every password so you cannot create a lookup table that would be useful in any way. You could hash a small dictionary using thousands or even millions of potential salt values if you had the resources, but the odds that you’d ever see that salt value in use are less than your odds of winning the lottery, twice.
Rainbow tables don’t exist for bcrypt because bcrypt uses 128-bit salt values so you’d need to create 2**128 tables which is impossible. Brute force is your only option.
“Impossible” is not a word I use when a national intelligence agency can throw literally billions of dollars at a password cracking project.
In any event, my point – which everyone here appears to have missed completely (and deliberately?) – is that an additional fixed secret salt added to the original salt would eliminate ANY possibility of producing a successful rainbow table by ANY means FOREVER – as long as the secret salt remains secret.
I don’t see why that is so hard to comprehend.
I don’t think you understand how big 2**128 is. Imagine that you want to create rainbow tables for all 8 character, lower-case, alphanumeric passwords. There are about 2**41 possible combinations. Assuming no wasted computation (and making rainbow tables wastes plenty), you’d need to perform 2**169 computations to make rainbow tables for all 2**128 salts.
If you wanted to perform a brute force attack against the 6.5m LinkedIn passwords using the same character set, you’d only need to perform about 2**64 computations. So the effort to create the rainbow tables would be 2**105 times greater than brute-force. If you had a supercomputer that could brute force 2**64 in 1 second (and for bcrypt that’s insane), the universe would die a heat death long before you finished.
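The arithmetic in the two comments above can be checked directly with Python's arbitrary-precision integers (the 6.5m hashes and the 8-character lower-case alphanumeric character set are taken from the comment):

```python
charset_combos = 36 ** 8                  # 8-char lower-case alphanumeric
assert 2**41 < charset_combos < 2**42     # "about 2**41 possible combinations"

# Rainbow tables: one table per possible 128-bit salt value.
table_work = charset_combos * 2**128      # ~2**169 hash computations

# Brute force against the 6.5m leaked hashes instead.
brute_force = 6_500_000 * charset_combos  # ~2**64 hash computations

# The tables cost ~2**105 times more work than brute force.
ratio = table_work // brute_force
assert ratio.bit_length() - 1 == 105
```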
Having a second fixed salt value is useful for breaking any off-the-shelf password cracking tools and will stop an attacker who is able to grab the password database via something like SQL injection but is unable (or doesn’t know how) to get access to the fixed salt.
But, the fixed salt has little additional effect on precomputation. If you’re already stretching and salting, then rainbow tables and other precomputation attacks are useless anyway.
I’m entirely willing to grant you that current technology may not be able to break bcrypt hashes. Even technology from 20 years from now might not be able to – although I would hesitate to state that as a fact.
I’m MERELY saying that to ensure that the problem NEVER arises no matter WHAT technology is available to brute force or rainbow table a hashing algorithm (aside from telepathy or magic), a secret salt is the simple answer (again, as long as the secret salt remains secret.)
I don’t see why the concept of a secret salt has generated so many “Dislikes” and such hostility on this thread. It’s a perfectly simple and easy to implement concept that provides significant additional security with little additional overhead.
A secret salt prevents someone who doesn’t know that secret salt from cracking passwords. This is a good thing. But, if the person is able to recover that value; it no longer provides any security.
Salting and stretching already make rainbow tables impossible. Impossible. No way. No how. You don’t need a secret salt to ensure this.
Rainbow tables require anywhere from hundreds of megabytes to multiple gigabytes of storage. Currently, we can store about a terabyte on a hard drive (2**40 bytes). If you needed to store a 1 gigabyte rainbow table (or set of tables) for each salt, you’d need about 2**158 bytes of storage. That means you’d need 2**118 terabyte hard drives or 2**108 one petabyte hard drives.
Rounding down, you’d need 300,000,000,000,000,000,000,000,000,000,000 petabyte hard drives.
There is no advance in engineering or physics that will make that a reality. Ever.
“A secret salt prevents someone who doesn’t know that secret salt from cracking passwords. This is a good thing. But, if the person is able to recover that value; it no longer provides any security.”
The only possible response to this is…duh… Not to mention I’ve repeatedly stated “as long as the secret salt remains secret.”
“Salting and stretching already make rainbow tables impossible. Impossible. No way. No how.”
Today… Not necessarily tomorrow…
” You don’t need a secret salt to ensure this.”
Is there something wrong with doing it anyway? When in fact it can actually make a rainbow table not just “impossible” but actually IRRELEVANT?
“If you needed to store a 1 gigabyte rainbow table (or set of tables) for each salt, you’d need about 2**158 bytes of storage. That means you’d need 2**118 terabyte hard drives or 2**108 one petabyte hard drives.”
All of which is irrelevant to my point…
“There is no advance in engineering or physics that will make that a reality. Ever.”
That’s just a ridiculous statement. Period. Granted, given your ASSUMPTIONS, it may be a correct statement vis-a-vis hard drives TODAY.
None of which, as I’ve said, is relevant to my point: that it is IMPOSSIBLE to assert with one hundred percent authority that no OTHER way of getting around the issue can be found, either today, tomorrow or X years from now (where X is a fairly small one- to two-digit number.)
This sort of dogmatism about a security technology is why we continue to have unlimited numbers of breaches – because security geeks simply can’t conceive that anyone else could get around their technology…
It’s a fundamental flaw in the entire industry. Read Peter Gutmann’s talk at Defcon 17 entitled “The Psychology of Computer Insecurity” for details on why geeks are the last people who should be designing security…
You really don’t understand the math. In order to store rainbow tables for all 2**128 salt values you would need to turn the entire Earth into a storage device that can encode several bytes of information per atom.
This isn’t about Moore’s law or the advances that we might see in the coming years. We will never be able to store that much data. Ever.
Keepass has a nice description of their hashing process.
http://keepass.info/help/base/security.html#secdictprotect
The password plus a 128-bit random salt are hashed using SHA-256; the result is then encrypted N times using AES, then hashed again using SHA-256. By default N=6000, but in the program you can set it to whatever you want. Set the encryption rounds to several million and it takes about a second on a typical PC. It’s all about introducing enough time into the process to make attack impractical.
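A sketch of that pipeline in Python. AES is not in the standard library, so the N encryption rounds are replaced here by N rounds of SHA-256; that changes the primitive but preserves the point, which is adding tunable computation time:

```python
import hashlib
import os

def stretch_key(password: bytes, salt: bytes, rounds: int = 6000) -> bytes:
    # 1) Hash the password plus a 128-bit random salt with SHA-256.
    key = hashlib.sha256(salt + password).digest()
    # 2) KeePass encrypts this value N times with AES; this sketch
    #    substitutes N rounds of SHA-256 purely to add computation time.
    for _ in range(rounds):
        key = hashlib.sha256(key).digest()
    # 3) Hash the result once more with SHA-256.
    return hashlib.sha256(key).digest()

salt = os.urandom(16)
key = stretch_key(b"hunter2", salt)
```

Raising `rounds` scales the cost linearly, for the defender once per login and for the attacker once per guess.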
How does this in any way prevent a dictionary attack? As I understood the problem, LinkedIn and others had their entire user database copied. This provides nearly infinite time to crack the passwords located in that database. A simple rainbow table (which is curiously not mentioned in this article on password security) is usually employed by crackers for this very reason. It would take a little longer to generate the rainbow tables for the new algorithm, but I fail to see how producing a slower algorithm somehow decreases the speed of such an attack. Only in situations where the website was directly attacked would this technique work.
I assumed that the passwords were sufficiently salted. At least that’s what I got out of the transcript.
It’s all about massive numbers.
Salting prevents pre-computation but if you steal the database you have in theory a near infinite amount of time so salting just slows you down a little. But if the password is sufficiently long/complex it could take many years. An attack isn’t practical, at least for most people. The NSA has massive amounts of computing power but no one knows what they can and can’t break.
If you were using a password with 128 bits of entropy (i.e. a randomly-generated 20 character password using all the available keyboard characters as the pool), and the attacker could generate and test a trillion hashes a second, it would still take 5 × 10^18 years on average to recover the password.
But most people use passwords that have a very low level of entropy, probably in the 10 to 30 bits range. (Entropy estimates are available in NIST SP800-63, Appendix D.) 2^30 is roughly a billion. Assuming there are no rounds to increase the calculation time, a quad-core PC with GPUs might be able to get up to 100 million hashes a second. Which means a 30-bit password won’t last more than a few seconds.
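The arithmetic is easy to reproduce (the guess rates are the ones assumed in the comments above):

```python
def crack_seconds(entropy_bits: int, hashes_per_second: float) -> float:
    # Worst case: the attacker tries the entire 2**entropy_bits keyspace.
    return 2 ** entropy_bits / hashes_per_second

# A 30-bit password against ~100 million fast hashes per second:
print(crack_seconds(30, 1e8))                           # ~10.7 seconds
# A 128-bit password against a trillion hashes per second, in years:
print(crack_seconds(128, 1e12) / (3600 * 24 * 365.25))  # ~1e19 (worst case)
```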
I cited the wrong Appendix. Entropy estimates are available in NIST SP800-63, Appendix A.
Password stretching cripples rainbow tables. You don’t have an infinite amount of time to build the tables. Suppose you can build tables of a certain size in 1 day for SHA-1. A lookup in the table might take 1 minute in the worst case. If a site hashes passwords by iterating SHA-1 a thousand times, it will take you about three years to build that same rainbow table and lookups will take almost seventeen hours per hash.
Because of the lookup time, rainbow tables don’t scale well anyway. As long as you only care about a few users, rainbow tables are great. But if you want to crack a maximum number of accounts for a big site like LinkedIn, brute force is much more effective.
With large salts, rainbow tables can’t be used at all and brute force no longer scales. Use salts and password stretching.
“As long as you only care about a few users, rainbow tables are great. But if you want to crack a maximum number of accounts for a big site like LinkedIn, brute force is much more effective. ”
Why distinguish? Some times all you want to do is crack a few users – the most important users. Password protection should do both, not just one or the other.
Bcrypt is the right approach for now. I wouldn’t guarantee it for another twenty years, however. With a secret additional salt it can be guaranteed forever – as long as the secret salt remains secret.
Bcrypt is tunable. You can change the parameters to make it slower. It may be replaced by solutions like scrypt that focus on memory hardness, but it won’t be easily crackable without some cryptographic breakthrough.
While most cryptographers like to place their bets on the side of “requires a cryptographic breakthrough”, I prefer to place my bet on the side of “there probably will be one sooner or later..”
Either that or the matter will remain moot as long as people use crappy passwords in the first place…
I was just reading Matt Weir and Sudhir Aggarwal’s Defcon 17 talk “Cracking Passwords”. They talk about breaking Web Hosting Talk Forum’s vBulletin passwords. vBulletin used the following algorithm:
MD5(MD5(Password).salt)
Their slides say this:
Quote
It’s a major problem. Since each user’s salt is different, we have to hash each password guess for each user.
Compare it to the PhpBB attack
– Assume you spent 1 hour attacking the PhpBB list
– It would take you 200,000 hours to run the same attack on the Web Hosting Talk list.
That Being Said:
I’ve still managed to crack 17% of the passwords just using my laptop over a 1 week period.
I just used a list of previously cracked passwords, (no word mangling rules), only made it through the words starting with ‘i’.
End Quote
Granted, this isn’t a bcrypt hash. But the point is there are dumber and smarter approaches to cracking password hashes. I will never state with authority that someone CANNOT figure out a smarter way to break a bcrypt hash tomorrow…or next year…or five years from now whether by advances in mathematics or merely advances in available computing power…or merely a smarter method of incrementally approaching the problem.
That’s a fool’s game.
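For reference, the vBulletin scheme quoted above is simple to sketch (the salt values here are made up; and note that because MD5 is fast, the per-user salt only multiplies the attacker's work by the number of users, it does not slow down individual guesses):

```python
import hashlib
import os

def vbulletin_hash(password: bytes, salt: bytes) -> str:
    # vBulletin-style: MD5( MD5(password) . salt )
    inner = hashlib.md5(password).hexdigest().encode()
    return hashlib.md5(inner + salt).hexdigest()

# Because each user's salt differs, a single guess must be re-hashed
# once per user rather than once per list:
users = {name: os.urandom(3).hex().encode() for name in ("alice", "bob")}
stored = {name: vbulletin_hash(b"letmein", salt) for name, salt in users.items()}

for name, salt in users.items():
    assert vbulletin_hash(b"letmein", salt) == stored[name]
```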
Ptacek is wrong.
Slow password hashes are an invitation for easy DDoS.
The threat of denial of service attacks can be mitigated. Slowing hashes to 1/10 of a second might be too aggressive for a big site with a lot of login activity, but you don’t have to slow it that much. Stretching until the password hash takes 1/100 of a second would drastically reduce the efficacy of password hashing attempts.
CAPTCHA would mostly prevent denial of service attacks. You could also force the client to perform some computation that is easy for the server to verify.
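The “force the client to perform some computation” idea is essentially a hashcash-style proof-of-work. A minimal sketch; the function names and difficulty parameter are illustrative:

```python
import hashlib
import os

def make_challenge() -> bytes:
    return os.urandom(8)

def solve(challenge: bytes, difficulty_bits: int = 16) -> int:
    # Client side: find a nonce whose hash over the challenge starts with
    # `difficulty_bits` zero bits. Expected work: ~2**difficulty_bits hashes.
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0:
            return nonce
        nonce += 1

def verify(challenge: bytes, nonce: int, difficulty_bits: int = 16) -> bool:
    # Server side: a single hash checks the client's work.
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0

challenge = make_challenge()
nonce = solve(challenge)
assert verify(challenge, nonce)
```

The asymmetry is the point: the client burns thousands of hashes per login attempt while the server spends one, which blunts both brute-force login attempts and hash-based denial of service.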
Where does it say that the Web site is responsible for user security? I didn’t pay LinkedIn for any service; all I wanted was to join, and they agreed to let me into their network. Now that they have all of my information, I should double-check what the user agreement says about this kind of thing. Maybe I should pay a company to look out for my security, the way I do with my VISA. I keep circling around “the password is broken” and “you’ve got to have 2FA” to be truly safe. Think about it: who is the user going to call for help, LinkedIn? Facebook might be better suited to handle security issues, but come Monday morning they will still be in business while I’m locked out of my email and bank accounts. Google could create a password cloud, or, now that LinkedIn needs help, maybe Amazon can do a B2B deal and set them straight. That might be the future: an Amazon password as a token on the LinkedIn GUI.
And then there’s THIS – which makes “hash” of ANY password algorithm! 🙂
Security flaw in MySQL, MariaDB allows access with any password—just keep submitting it
http://arstechnica.com/information-technology/2012/06/security-flaw-in-mysql-mariadb-allows-access-with-any-password-just-keep-submitting-it/
And speaking of computing power needed to break a crytpo scheme (in this case an SSL certificate collision)…
Flame’s crypto attack may have needed $200,000 worth of compute power
http://arstechnica.com/security/2012/06/flame-crypto-attack-may-have-needed-massive-compute-power/
Note: “Rather than use the Amazon service, Sotirov’s team used a cluster of 200 PlayStation 3 consoles, which over a weekend delivered an equivalent amount of computing resources.”
The figure is actually $20,000 in Amazon EC2 resources, adjusted back to 2008 when Flame was likely created.
“It’s a really common misconception — including among security people — that the problem here is using SHA-1. It would not have mattered at all if they had used SHA-512, they would be no better off at all.”
That is provably not true. It takes much longer to brute force SHA-512 because it’s a more complicated algorithm by design. As it happens, there are also no fast GPU crackers for it.
I mean, I personally wouldn’t just use SHA-512 on its own; you still need lots of rounds and salt to get it right. But the sentiment is technically incorrect. It won’t always be – but today it is. Even when there are quick GPU hashers, it will still take much longer than straight SHA-1.
Personally I believe you’re only doing it right if it takes a long time on a fast CPU to generate a hash, like say maybe 200k rounds with a salt on SHA-512 should be a ballpark minimum. The longer it takes – within reason – the better. Then just keep updating your scheme as new tech emerges, which is the fundamental mistake a lot of people have been making.
The good news is we’ve been fielding a lot more Q’s than normal on how crypt() works so at least the good thing that comes out of this might be that people wake up.
SHA-512 would be slightly better since the algorithm is a bit slower, but the improvement is very small. There was a small amount of exaggeration in the statement but he is essentially correct: just using SHA-512 or SHA-3 (when it’s chosen) is not the answer. Hash functions are fast, password hashes need to be slow.
Besides using proper password hashing methods, here’s one thing more that companies with large user databases (like LinkedIn’s) and adequate IT security budgets should do:
http://www.openwall.com/presentations/PHDays2012-Password-Security/mgp00050.html
(Move password hashing onto dedicated equipment – separate from authentication servers – and store and use a local parameter there.)
We (Openwall) would be happy to help implement that.
The rest of the slides from my PHDays talk are also relevant (focusing on password hashing per se):
http://www.openwall.com/presentations/PHDays2012-Password-Security/
Solar Designer, I agree with your recommendation and I’ve recommended the same thing over the years for many security- or performance-critical aspects of systems. Using a dedicated device gives us more than enhanced performance or an extra obstacle for them to break: it allows us to leverage more robust engineering techniques to increase the assurance of that part of the system. So, the legacy apps or platforms might require a huge mainstream OS to run on, yet the password system might only need a simplified networking stack & TCB built with security baked in.
Many designs I’ve commented on at Schneier’s blog use an approach like this. I originally learned the principle from Orange Book B3/A1 systems. We can still apply it to specialized systems (i.e. appliances) w/ minimal cost/complexity. Two existing products, Hydra Web Server & Secure64 DNS, run their specialized application on an OS/TCB built to ensure security ground up. I think we can take that principle MUCH further, esp. leveraging recent DO-178B & EAL6 platforms.
This is exactly what I said above: offload the hashing to ease CPU stress on the Web server.
Not sure what is meant by “local parameter”, but with a secret salt stored on that hashing server – either in compiled code running in memory or even in hardware – NO attempt to create a rainbow table can possibly succeed, not just for now but forever. And keeping it on a separate server not directly connected to the Internet makes it safer still, as long as the network connections between the two servers are restricted and secure.
Solar Designer, the author of the venerable John the Ripper cracking tool, presented this excellent history of password hashing last month at PHDays:
https://speakerdeck.com/u/openwall/p/password-security-past-present-future
He references a talk I also gave last month at Interop (not as an author of dsniff or OpenSSH, but for Duo Security) on modern two-factor authentication, such as Tom alludes to:
http://www.duosecurity.com/resources/modern-two-factor-authentication
Decent password hashing such as Niels Provos’ bcrypt or RFC 2898’s PBKDF2 can limit password exposure when the authentication database is stolen (although if servers are compromised directly, attackers can just trojan their way to credential theft).
But we contend that broadly deployable, out-of-band two-factor authentication can stop massive account takeover even in the face of total password compromise, such as we see today – as Mike Hearn at Google said in his RIPE 64 talk two months ago, “the age of the password is over, and never coming back”:
https://speakerdeck.com/u/duosec/p/mike-hearn-at-ripe-64-abuse-at-scale
Yes, two-factor authentication is about moving the goalposts. Real security is the art of doing so economically!
Is it as easy as adding a random delay of 1-2 seconds per login attempt, successful or not? Sure, it’s more opportunity for DDoS, but meh, that’s less embarrassing than a hack.
You can implement a delay between login attempts, use CAPTCHA, and/or force a client side computation that is easy to verify for the server. The DoS threat can be addressed.
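A minimal sketch of the per-account delay idea, assuming a simple exponential backoff (the class and parameters are hypothetical – a real deployment would persist counters in shared storage and throttle by IP as well):

```python
class LoginThrottle:
    """Per-account login delay that grows with consecutive failures."""

    def __init__(self, base_delay: float = 1.0, max_delay: float = 30.0):
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.failures = {}  # username -> consecutive failed attempts

    def delay_for(self, username: str) -> float:
        # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay.
        n = self.failures.get(username, 0)
        return min(self.base_delay * (2 ** n), self.max_delay)

    def record(self, username: str, success: bool) -> None:
        if success:
            self.failures.pop(username, None)
        else:
            self.failures[username] = self.failures.get(username, 0) + 1

throttle = LoginThrottle()
throttle.record("alice", success=False)
throttle.record("alice", success=False)
assert throttle.delay_for("alice") == 4.0  # doubles with each failure
```

The cap keeps a flood of deliberate failures from turning the throttle itself into a lockout-style DoS against legitimate users.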
Wow what a response. Authentication is and always will be a challenge. I have to admit that I learned a few things from the article and the comments.
But the question of availability vs. sensitivity needs to be recognized. This was a social networking site, not a highly sensitive data store, and it is meant to be user friendly. We can debate methods of improving authentication security, but the need for it in this instance, in my humble opinion, isn’t warranted.
To be fair, that’s the sort of sentiment that causes the problem in the first place. There’s no debate needed in the face of massive incompetence that was easy to prevent? Really?
If you can get the users data by Googling the persons email address or name, what is there to protect? The password? As a user I never use the same password for sites, because I don’t trust any site.
So the basic premise of this article is that since folks don’t really understand crypto, they aren’t engineering with the right crypto solution (b/scrypt). I would counter that this article reads like it was written by someone who hasn’t done anything BUT crypto work.
For starters, hashing passwords is a defense in depth mechanism, primarily meant to shield you if something else has gone wrong (most commonly, SQL Injection, but there are a number of other ways that the database could leak). It is NOT a primary defense mechanism, but a secondary one. That is very important context.
Richard Steven Hack chimes in with the most obvious question the business is going to ask if a security person proposes using bcrypt: “You want to intentionally delay the user’s access and make the website seem unresponsive? Are you mad?” Seriously, just from a user-experience standpoint, asking for a one-second delay in authentication is a deal breaker for a secondary defense mechanism. Very few commercial websites are going to allow it.
Worse, though, it directly enables absolutely trivial denial of service against even huge websites. All an attacker would need to do is post a large volume of requests to the login page to bring the website down. bcrypt is not just slow, it is CPU intensive. If the attacker submits just 100 requests in a second (fairly trivial to do), they can tie up the processor at 100% for almost 40 seconds. That 1:40 ratio of asymmetry is horrendous – you would be hard pressed to find a better ratio favoring DoS attackers on any modern web server. Rate limiting and clustering can feasibly mitigate this for a single source (though companies already do a terrible job of rate limiting, even with the modern prevalence of trawling attacks), but with any degree of distribution both become significantly less effective.
Does anyone think any company that makes a large chunk of revenue off of users being able to access its website is going to willingly choose to ENABLE such easy DoS? For a secondary protection? Security is often a mix of tradeoffs, and bcrypt is a terrible tradeoff.
You and RSH make it sound like it’s a deal-breaker all-around. It’s not. MANY sites implement effective authentication. There’s definitely going to be sites, especially eCommerce, where even a second kills it. Solution: those sites can pick a different method, while the rest can use the recommended working one.
The thing that makes this method attractive is it’s quite simple & cheap to implement. Groups like OWASP make it easier for non-security developers to add effective security to parts of apps like this. There are other methods for authentication that don’t take as much time. However, they have tradeoffs.
It’s awesome how you completely ignore my bigger point. A second matters for a lot of businesses, true, but if you notice, I spent quite a bit more space explaining how it trivially ENABLES DoS. Advice like this demonstrates the worst sort of security behavior – it fixates on one positive aspect and disregards all of the negatives. It shows such a complete divorce from actual development realities as to be worthless. It would be akin to a medical researcher advocating injecting bleach into the bloodstream because it kills MRSA. Technically correct, but absolutely useless.
In the REAL WORLD, where we actually are responsible for securing revenue-generating websites, performance matters. bcrypt doesn’t add just one or two orders of magnitude to resource consumption on the server. Let’s talk about where that 1 second matters – yes, it is a second of the user’s time, but far more importantly it is a full second of the server’s time, simply to compute a hash. That is absurd – drunken-stupid absurd. The performance implications are huge: they turn even a mundane spike in user activity into a very serious concern, and make actual malicious attacks trivial.
When it comes down to it the best advice isn’t even that we should use bcrypt, but rather that we should first create an HMAC using a server salt (not stored in the database) with the password, and then hash the HMAC with a user salt. If you don’t do that, then weak crappy passwords will still be discovered even with bcrypt. If you do create an HMAC I have yet to see an argument that SHA isn’t good enough (yes, bcrypt would make it even harder to crack, but we don’t need perfect security, just good enough security)
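A sketch of the scheme described above – HMAC the password with a server-side secret first, then apply a salted, slow hash – using Python’s standard library (names and parameters are illustrative; PBKDF2 stands in for the salted second stage):

```python
import hashlib
import hmac
import os

# Hypothetical server-side secret ("server salt") -- kept out of the
# database, e.g. in application config or on a separate hashing server.
SERVER_SECRET = os.urandom(32)

def hash_password(password: str, user_salt: bytes, rounds: int = 100_000) -> bytes:
    # Step 1: HMAC the password with the server secret.
    mac = hmac.new(SERVER_SECRET, password.encode("utf-8"), hashlib.sha256).digest()
    # Step 2: salted, iterated hash of the HMAC (PBKDF2 here).
    return hashlib.pbkdf2_hmac("sha256", mac, user_salt, rounds)

def verify(password: str, user_salt: bytes, stored: bytes) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(hash_password(password, user_salt), stored)

user_salt = os.urandom(16)   # per-user salt, stored alongside the hash
stored = hash_password("correct horse", user_salt)
assert verify("correct horse", user_salt, stored)
assert not verify("wrong guess", user_salt, stored)
```

Without SERVER_SECRET, an attacker holding only the database dump cannot even begin an offline dictionary attack, which is the point of keeping the server salt out of the database.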
Perhaps the next app for Cloudflare to offer — carry out password authentication on a separate server before permitting visitors to access the website?
To be correct, I don’t say it’s NECESSARILY a “deal breaker all around.”
But it WILL be a concern, rightly or wrongly, to many large Web sites.
We all know most sites heavily undersubscribe their servers and bandwidth and heavily oversubscribe their users. Anything which negatively impacts their response time is going to be argued against.
Offloading the hashing onto a separate server is certainly an approach which could be used to mitigate this problem. It also improves security.
Storing my secret-salt concept on the other server improves security further. A hacker would have to compromise BOTH servers (since the second server is not attached to the Internet or ANY network at all, except to the Web server via a restricted-access channel) to have ANY chance of retrieving the secret salt – and without the secret salt, ANY method of offline OR online cracking is IMPOSSIBLE (unless of course the password is captured from the user, which will always remain THE problem).
Swapping out an insecure method of storing passwords for a secure one is pretty simple, can be done in half a day or less: http://schneems.com/post/24678036532/zomg-my-passwords-are-insecure-now-what
To eliminate use of hashed password tables, one could use random per-password salt and store it alongside the hash, in addition to using somewhat slower, but not too slow hash.
To generate my passwords, I use the first 10 letters of the base64 encoding of the SHA-256 hash of a constant passphrase (which is quite long and secure) combined with the website name. This way I can use the same passphrase for any site without risking compromise when the site gets hacked, and without relying on a potentially insecure password manager (I just use a Python script to calculate my passwords; I remember the code and can rewrite it from scratch if I need to). I could loop SHA-256 a couple thousand times if I were to make this solution for use by others.
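For illustration, the scheme described above might look roughly like this in Python (how the passphrase and site name are combined is my assumption, since the comment doesn’t specify; simple concatenation is used here):

```python
import base64
import hashlib

def site_password(passphrase: str, site: str) -> str:
    # First 10 characters of the base64-encoded SHA-256 of the master
    # passphrase combined with the site name (concatenation is an
    # assumption -- the comment doesn't say how they are combined).
    digest = hashlib.sha256((passphrase + site).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")[:10]

master = "a long and quite secure master phrase"
assert site_password(master, "linkedin.com") == site_password(master, "linkedin.com")
assert site_password(master, "linkedin.com") != site_password(master, "last.fm")
assert len(site_password(master, "linkedin.com")) == 10
```

A breach at one site reveals only that site’s derived password; recovering the master passphrase from it would require inverting SHA-256 or brute-forcing the (long) passphrase.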
F-Secure weighs in…
No, Heavy Salting of Passwords is Not Enough, Use CUDA Accelerated PBKDF2
http://www.f-secure.com/weblog/archives/00002384.html
Basically they suggest offloading your password hashing to NVidia CUDA-capable video cards.
They also reference this article by Francois Pesce:
Lessons Learned from Cracking 2 Million LinkedIn Passwords
https://community.qualys.com/blogs/securitylabs/2012/06/08/lessons-learned-from-cracking-2-million-linkedin-passwords
People seem to forget this. If I have access to your databases, and figure out what hash function you use, I can change the passwords myself to gain access to the accounts.
And, er, if I have DBMS access, you are essentially f*cked anyway, as it is a short hop from there to actual system access.
As for Linkedin, SQLi. Blind SQLi. It was obvious from the dump itself. They should not have had that flaw. They were careless.
True: an attacker may be able to compromise accounts without cracking passwords. But, there are other reasons for using a strong password hashing scheme:
People reuse passwords. They shouldn’t, but they do. Strong password hashing limits an attacker’s ability to crack passwords and minimizes any related account compromises.
Cracking passwords allows an attacker to get into customer accounts even if the original vulnerability is later fixed.
The initial attacker may not be the person who ultimately uses the information. He may want to provide the login/password info to someone else who will then use it for theft, fraud, etc. The cracked passwords are then way more valuable to him than system/DBMS access.
The attacker may not want to lock people out by changing their hashes since this might be quickly noticed. It’s much more stealthy to crack passwords offline so that he (or whoever he passes the info to) can login normally.
“That’s actually another misconception, the idea that the problem is that the passwords were unsalted.”
The idea that salt isn’t the problem here is COMPLETELY false and extremely misleading. This part of the article should be removed or rewritten, else naive developers will start to think “I’m using bCrypt, I don’t need salt!”
Scenario: An attacker has just acquired N > 1 unsalted “fast” hashes from the database of a website. The attacker would like to crack them all as fast as possible.
Proof:
If we make the hashes X times slower but don’t add salt, the attacker has to do X times the work per guess, but he can still build a lookup table from his password dictionary of W words: X*W time to hash every entry, plus W*lg(W) time to sort it on the hash. He can then crack each of the N hashes in O(lg W) time without computing a single hash, so the total is on the order of X*W + (W+N)*lg(W).
If we add salt, the attacker can no longer use a lookup table and must crack each hash individually, so an attack of the same “quality” takes him O(W*N) time. For fixed X, W*N grows strictly faster than X*W + (W+N)*lg(W), so adding salt impedes the attacker more than making the hash function slower does.
I’m pretty sure he didn’t mean to suggest using bcrypt without salt, but it should be made VERY clear that salting is the #1 problem, and we should focus on getting everyone to do salt right before confusing them further with slow hash functions.
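The lookup-table argument above can be demonstrated in a few lines of Python (SHA-1 is used only to mirror the LinkedIn case; the word list is a toy example):

```python
import hashlib
import os

words = ["password", "hunter2", "letmein", "123456"]  # toy dictionary

# Unsalted: one precomputed table cracks every leaked hash by lookup.
table = {hashlib.sha1(w.encode()).hexdigest(): w for w in words}
leaked = hashlib.sha1(b"hunter2").hexdigest()
assert table[leaked] == "hunter2"       # no hashing at crack time

# Salted: the table is useless; each hash must be attacked separately.
salt = os.urandom(8)
salted_leak = hashlib.sha1(salt + b"hunter2").hexdigest()
assert salted_leak not in table         # precomputation defeated
cracked = next(w for w in words
               if hashlib.sha1(salt + w.encode()).hexdigest() == salted_leak)
assert cracked == "hunter2"             # O(W) work per hash, times N hashes
```

With a unique salt per user, the per-hash dictionary sweep has to be repeated for every one of the N leaked hashes, which is exactly the W*N cost in the argument above.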
[quote]I’m pretty sure he didn’t mean to suggest using bcrypt without salt, but it should be made VERY clear that salting is the #1 problem, and we should focus on getting everyone to do salt right before confusing them further with slow hash functions.[/quote]
You can handle salting and stretching at the same time, and leaving either one of them out is a big mistake. Stretching alone will stop precomputation (you can’t make a rainbow table of sufficient size with naive brute force), but it doesn’t change the fact that brute-force password guessing, however slow, will still scale over all users. Salting alone also doesn’t solve the problem: it prevents precomputation/scaling, but brute force against individual users is still very fast. Use both: no precomputation, no scaling, and it’s slow.
Also: bcrypt uses a 128-bit salt. Nobody should be designing their own solution. Just use bcrypt, scrypt, or PBKDF2 and follow the directions.
BTW: I agree that his statement about salts is misleading/incorrect.
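Following the “just use a standard construction” advice, here is a minimal example with scrypt from Python’s standard library (the cost parameters are illustrative, not a tuned recommendation):

```python
import hashlib
import os

def scrypt_hash(password: str, salt: bytes) -> bytes:
    # n = CPU/memory cost (a power of two), r = block size,
    # p = parallelization; these illustrative values use ~16 MB of RAM.
    return hashlib.scrypt(password.encode("utf-8"), salt=salt,
                          n=2**14, r=8, p=1)

salt = os.urandom(16)        # random per-user salt, stored with the hash
stored = scrypt_hash("hunter2", salt)
assert scrypt_hash("hunter2", salt) == stored
assert scrypt_hash("hunter3", salt) != stored
```

scrypt’s memory-hardness is what distinguishes it from PBKDF2 and bcrypt: a GPU or ASIC cracker has to pay for the RAM as well as the computation.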