Passwords, Entropy, and Good Password Practices

Right now for most of us the key to any security in our online life is the degree of entropy in our passwords. So what is entropy, and how does it affect our passwords?

Entropy is, in general, the degree of randomness or disorder in a system. Sometimes it is very easy to assess, as with a password of “1234”, which all too many people use. Because it is a simple sequence, there is no real randomness in it at all, and it would be quickly guessed. And as we saw in the last tutorial, such passwords are quickly discovered in a dictionary attack. There are things you can do to make it less likely that your password will be cracked and used against you.

Password Safety

The thing to keep in mind as we discuss password safety is that the objective is not to make your password ultimately uncrackable. That may be impossible in any case. If you are a “person of interest” to a determined government agency, they can probably devote enough computing power to getting your password that their odds of success are pretty good. Cracking a password is a simpler problem than cracking a good PGP encryption key, which right now is considered computationally infeasible even for the NSA and GCHQ. So the threat you should really be targeting is a criminal organization that wants to get your password and use it to take your money. This is a threat you can significantly reduce by following sound practices.

Don’t use the same password on many sites

The reason for this is that if you use the same password on many sites, a hacker can crack the database at one site that does not follow best practices, and then they have your password. It can then be tried at other sites, and no matter how good the other sites’ security is, they cannot stop someone who already knows your password. And hackers really do try this kind of attack. So don’t do it!

Now, it might be reasonable to assess just how important security is on a site-by-site basis. A reasonable compromise is to pick sites where you don’t particularly care (for me, that would include Twitter and most online forums) and use the same insecure password for all of them. Recognize that you are accepting the risk that someone can easily get in there, and when they get in they can do whatever you can do. Then for PayPal, for your bank, and for other sites where it really matters, use a highly secure password that is unique to each such site. This gets you most of the security you need without unduly taxing you. If some site requires a 17-character password that includes upper case and lower case letters, numbers, and Sanskrit hieroglyphs just so you can post a customer support question on their forums, they are idiots, but I don’t see any problem with having a standard password you use for all such sites.

Add to the Entropy

But for sites that are important, Entropy is a good thing in choosing passwords. Entropy is essentially randomness, and it means choosing passwords that are very unlikely to appear in a hacker’s dictionary. A password like “password” will be in every dictionary. So will “1234”, “qwerty”, and “letmein”. And any word found in a real dictionary (for some reason “monkey” is very popular) is fair game. For your amusement, there are lists of the 25 worst passwords in use. Using something from such a list is the equivalent of not using any password at all. And remember that a password does not need to be on a top 25 list to be a “no-no”. Pretty much every name and every dictionary word is in the hacker’s dictionary as well.

So if someone has the hash of your password, and they run it against their dictionary and fail to get a match, are you home free? Not necessarily, but you made it through the first round at least. Remember that this is an arms race and that Moore’s Law works for the bad guys as well. How many things can they try?

Well, one thing is to try every possible variation. If you have a password of 6 letters, all lower case, all they need to do is try every possible six-letter password in order: aaaaaa, aaaaab, aaaaac,…zzzzzy, zzzzzz. So how hard is that? With six letters, and 26 letters in the standard English alphabet (if you use a different alphabet, adjust as necessary), it is a simple calculation. The first letter can be any of 26 choices, and for each of those the second letter can be any of 26 choices, and so on, so the total space in which the attacker needs to search is 26 to the sixth power, or 26×26×26×26×26×26. With a spreadsheet or other calculator you can quickly find that it equals 308,915,776. Certainly a large number, but against that we have to see how many hashes per second an attacker can calculate. And here we discover that this problem is trivial. Not only is computer power increasing, but calculating hashes is precisely how bitcoin mining works, so a lot of ingenious folks have been finding ways to boost this number. It is now trivial to calculate billions of hashes per second. So how can we improve the situation in our favor?
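Here is a minimal Python sketch of that exhaustive search. It generates candidates in the same order described above and counts the six-letter space. The function names and the rate of one billion hashes per second are my own illustrative assumptions, not measurements of any real cracking rig.

```python
import itertools
import string

ALPHABET = string.ascii_lowercase  # the 26 lower case letters


def candidates(length, alphabet=ALPHABET):
    """Yield every password of the given length in order: aaa..., aab..., and so on."""
    for combo in itertools.product(alphabet, repeat=length):
        yield "".join(combo)


def search_space(length, alphabet_size=len(ALPHABET)):
    """Number of distinct passwords of exactly this length."""
    return alphabet_size ** length


# Look at the first few candidates without generating all of them.
first_three = list(itertools.islice(candidates(6), 3))
print(first_three)      # ['aaaaaa', 'aaaaab', 'aaaaac']

total = search_space(6)
print(f"{total:,}")     # 308,915,776

# Assumed rate, for illustration only.
ASSUMED_HASHES_PER_SECOND = 1_000_000_000
print(f"{total / ASSUMED_HASHES_PER_SECOND:.2f} seconds to try them all")
```

At that assumed rate the whole six-letter space falls in well under a second, which is what "trivial" means here.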

Go back to our calculation of the total number of passwords in the password space. It had two numbers, the base and the exponent. The base was 26, because we could choose from among 26 lower case letters to construct our password. The exponent was 6 because we had 6 letters in our password. So how do we use these two numbers to improve things?

First, with the base, we can increase the range of characters. If we add upper case letters, that gets us 52, and 52 to the sixth gets us to 19,770,609,664. Well, nearly 20 billion is better than 300 million, but not enough better. Add in numbers, and you have 62 possible characters and that gets us to nearly 57 billion (from now on I am going to round the numbers), which is again better, but when an attacker can calculate billions of hashes per second (I have seen reports of bitcoin rigs that can calculate 800 billion per second) this just isn’t getting us there. Throw in the special characters, and you are up to 95 possible characters, but that only gives you about 735 billion possible passwords.
So our conclusion is that a six-character password created with maximum entropy can be cracked in an offline attack (i.e. where the attacker has copied the database and can run his scripts at will against the copy) in about a second.
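The arithmetic for each character set can be checked with a few lines of Python. This is only a sketch: the labels are mine, and the rate is the 800 billion hashes per second figure quoted above from reports, taken here as an assumption.

```python
# Alphabet sizes discussed in the text.
CHARACTER_SETS = {
    "lower case": 26,
    "lower + upper": 52,
    "letters + digits": 62,
    "all 95 characters": 95,
}

# Assumed attacker speed (hashes per second), the reported bitcoin-rig figure.
ASSUMED_RATE = 800_000_000_000


def seconds_to_exhaust(alphabet_size, length, rate=ASSUMED_RATE):
    """Worst-case time to try every password of exactly this length."""
    return alphabet_size ** length / rate


for name, size in CHARACTER_SETS.items():
    space = size ** 6
    print(f"{name:18} {space:>16,} passwords  {seconds_to_exhaust(size, 6):.4f} s")
```

Even the largest of the four spaces is exhausted in under one second at that rate, so widening the alphabet alone does not rescue a six-character password.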

Password Haystacks Approach

The above result led Steve Gibson, host of Security Now, to advocate a different approach altogether, one that focuses purely on length. He calls this the Password Haystacks approach, in that if you are looking for a needle in a haystack, the bigger the haystack the harder it gets. He says that to get security we need very long passwords, but if they also have high entropy they are almost impossible to remember, so forget the entropy, just go for length. He argues that a password like ...........pass............... (i.e. 11 periods, pass, then 15 periods) is actually secure, since the attacker has to try every possible password of every length up to 30, using all 95 characters, before discovering your password. So the calculation is 95 + 95-squared + 95-cubed + 95 to the fourth + … + 95 to the 29th + 95 to the 30th. And if I did the math correctly, this comes out to 2 x 10 to the 59th. And this is a seriously large number. Let’s assume for the sake of argument that the attacker can check a trillion passwords per second. That amounts to 10 to the 12th power. So to check these will require (2×10^59)/(10^12) seconds. That is equal to 2×10^47 seconds. And since there are 3×10^7 seconds in a year, that is (2×10^47)/(3×10^7) years, which is 6×10^39 years. The universe has been around for 1.3×10^10 years, so call this a gazillion times the age of the universe. In this kind of attack, length of password seems to trump everything. BTW, if you ever wondered what the term computationally infeasible means, you just saw it.
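If you want to check the haystack arithmetic yourself, Python's arbitrary-size integers make it easy. This is a sketch using the same rounded figures as the text (a trillion guesses per second, 3×10^7 seconds per year, 1.3×10^10 years for the age of the universe); the variable names are mine.

```python
ALPHABET_SIZE = 95
MAX_LENGTH = 30
GUESSES_PER_SECOND = 10 ** 12     # the trillion-per-second assumption
SECONDS_PER_YEAR = 3 * 10 ** 7    # rounded, as in the text
AGE_OF_UNIVERSE_YEARS = 1.3e10    # rounded, as in the text

# 95 + 95^2 + ... + 95^30: every password of length 1 through 30.
haystack = sum(ALPHABET_SIZE ** n for n in range(1, MAX_LENGTH + 1))

seconds = haystack // GUESSES_PER_SECOND
years = seconds // SECONDS_PER_YEAR

print(f"passwords to try:  {haystack:.1e}")
print(f"years required:    {years:.1e}")
print(f"ages of universe:  {years / AGE_OF_UNIVERSE_YEARS:.1e}")
```

Carrying the unrounded sum through gives a slightly larger figure than the hand calculation (closer to 7×10^39 years than 6×10^39), because the text rounds the sum down to 2×10^59 first. The conclusion does not change.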

However, we need to remember that this is an arms race, and that attackers and defenders are constantly adjusting to what the other does. If everyone adopted the Password Haystacks approach, could attackers come up with a different way of checking passwords that would make this feasible? I am not smart enough to definitively answer that question, but I know enough about the history of cryptography to know that unless you can prove it is mathematically impossible, there is an excellent chance that some smart person somewhere will come up with an ingenious solution to the problem. So I am not willing to completely rely on Password Haystacks. Nevertheless, it does reveal a profound truth that we can take advantage of. Length is definitely the best possible way to improve your password security, and that simply falls out of the math. But I think Entropy still has a role to play.
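To see how length "falls out of the math", compare a short password drawn from all 95 characters with a longer one drawn from lower case letters only. The sketch below also expresses each search space in bits (the base-2 logarithm of its size), a common way to state entropy that I am adding for illustration. Note the assumption: the comparison only holds when every character is chosen at random, which is exactly what a padded password like Gibson's example does not do.

```python
import math


def search_space(alphabet_size, length):
    """Number of passwords of exactly this length."""
    return alphabet_size ** length


def entropy_bits(alphabet_size, length):
    """log2 of the search space, valid for uniformly random passwords."""
    return length * math.log2(alphabet_size)


short_complex = search_space(95, 6)    # 6 characters, full 95-character set
long_simple = search_space(26, 12)     # 12 characters, lower case only

print(long_simple > short_complex)     # True
print(f"{entropy_bits(95, 6):.1f} bits vs {entropy_bits(26, 12):.1f} bits")

# Each extra character multiplies the space by the alphabet size.
print(search_space(26, 13) // search_space(26, 12))   # 26
```

Doubling the length with the smallest alphabet beats the largest alphabet at the original length by a wide margin, which is why length is the first thing to reach for.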

Password Vaults

The problem can be stated as follows:

  • You should use unique passwords for at least the important sites, even if you don’t care about some sites.
  • Long passwords are absolutely the best protection.
  • Length alone may not be enough going forward, so Entropy is good as well.
  • Long, high-entropy passwords are just about impossible for most people to remember.

So, what is the solution? My personal belief is that password vaults are the best protection. I actually use two of them in combination to allow me to use good passwords and still have a sane life. First, there is LastPass. This program integrates with your Web browser (it is available for most browsers, including Chrome, Firefox, Opera, Safari, and Internet Explorer), it integrates with other products like YubiKey and Duo Security for two-factor authentication, and it will automatically fill in your login name and password for any site you have saved. The data is saved in the cloud, but it is encrypted first, locally, using AES 256-bit encryption. You can therefore use it on any computer, but you first need to provide your own password to unlock the data. So you do need to memorize one good strong password. But then LastPass will remember all of the others, and if you wish it will create strong, high-entropy passwords for you. LastPass is a commercial product, but it offers a useful service, and I have opted to purchase the Premium
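To give a feel for what "create strong, high-entropy passwords for you" involves, here is a minimal sketch using Python's standard secrets module, which draws from a cryptographically secure random source. This is not how LastPass itself is implemented; the function name, the default length of 20, and the choice of alphabet are all my own assumptions.

```python
import secrets
import string

# Letters, digits, and punctuation: 94 characters
# (the 95 printable ASCII characters minus the space).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=20, alphabet=ALPHABET):
    """Build a password by picking each character independently at random."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


password = generate_password()
print(password)        # different on every run
print(len(password))   # 20
```

A password like this is exactly the kind nobody can memorize, which is why it only makes sense together with a vault that remembers it for you.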

The only downside to this approach is that you have to be connected to the Internet to access your passwords. In most cases you are looking for Web site passwords, so you need to be online to even need the passwords, but some things you need locally (like the password to your wifi router, perhaps?), plus I am kind of a belt-and-suspenders type, so I also use KeePass(x), which stores the data in a local database. That also means that if anything happens to LastPass I can still get my passwords. It means an extra step, since every time I create a new online account I not only have to add it to LastPass (which is virtually automatic) but also to KeePass(x), which is not at all automatic. KeePass(x) is cross-platform, so I can use it on both Linux and Windows, and it stores its data in a password-protected database. And unlike LastPass, KeePass(x) is completely open source. Both programs are available for Android as well.

The Science Fiction writer Robert Heinlein once said “Keep all of your eggs in one basket, but WATCH THAT BASKET”, and he was quoting either Mark Twain or Andrew Carnegie. That is the essence of the password vault approach, and I think it is the best overall solution to providing good password security for real human beings, at least for the next few years. I suspect that biometrics will take over at some point; indeed, they are starting to now.

Listen to the audio version of this post on Hacker Public Radio!