Not so long ago it was common to see passwords stored in databases in plaintext. Even major corporations were guilty of this (do a Google search for “web hosting stored passwords in plaintext” to see what I mean). Thankfully, it appears that developers have finally realized it’s better to code something the safe way the first time rather than suffer the repercussions of shortcuts. As a result, instances of this are quickly dwindling out of existence.
These days
there’s a phenomenon that’s arguably just as dangerous: hashing unsalted
passwords.
Let’s
imagine a scenario where a malicious individual has managed to acquire a copy of user
records from our web application’s database. Three records have been chosen
from our “user” table:
ID | Name | UserName | Password
198 | Martha Robinson | mrobinson | 0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91
237 | John Davidson | jdoe | 0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91
599 | Sarah Smith | ssmith | ca74e5fe75654735d3b8d04a7bdf5dcdd06f1c6c2a215171a24e5a9dcb28e7a2
All
passwords in the table are hashed once using SHA-256.
You might be
thinking, these passwords are hashed, so what’s the problem? One of the
defining traits of modern day hashing algorithms is that the slightest change
in the source string will drastically change the generated hash from that
string. For instance, take a look at the hashes for ‘Football123’ and ‘Footballs123’ (respectively):
0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91
f77211f25d1f3bafd92ca23085d583f230c9febe6fc4e45a2e5ef0592c68f2e5
See that? Just by changing “Football” to the plural “Footballs” in our plaintext password, we get two hashes that look nothing alike. The flip side is that the same input always produces the same hash.
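The effect described above is easy to reproduce. Here is a minimal sketch using Python's standard hashlib module; the helper name sha256_hex is made up for illustration, and the input strings are assumed to be UTF-8 encoded before hashing.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of text (UTF-8 encoded) as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = sha256_hex("Football123")
second = sha256_hex("Footballs123")

# Count how many of the 64 hex positions differ between the two digests.
differing = sum(a != b for a, b in zip(first, second))

print(first)
print(second)
print(f"{differing} of {len(first)} hex characters differ")
```

Running it twice gives the same two digests each time, which is exactly the property an attacker relies on when comparing stored hashes.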
The first thing the hacker is going to do is try to crack Martha’s and John’s passwords. Why? Their two hashes are identical, so the attacker can deduce that both were calculated from the exact same password.
Now the hacker knows that Martha and John have identical passwords. This is a big problem: what are the chances they both happened to think of the same extremely complex and secure passphrase? Slim. A password shared by two users is far more likely to be a common, weak one.
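To show how little effort this takes, here is a rough sketch of what the attacker might run against the stolen records. The rows are the three from the example table; the function name shared_hashes and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

# Rows from the example "user" table as (id, username, password_hash).
leaked_rows = [
    (198, "mrobinson", "0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91"),
    (237, "jdoe", "0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91"),
    (599, "ssmith", "ca74e5fe75654735d3b8d04a7bdf5dcdd06f1c6c2a215171a24e5a9dcb28e7a2"),
]


def shared_hashes(rows):
    """Map each hash used by more than one account to the usernames sharing it."""
    by_hash = defaultdict(list)
    for _user_id, username, password_hash in rows:
        by_hash[password_hash].append(username)
    return {h: names for h, names in by_hash.items() if len(names) > 1}


for password_hash, usernames in shared_hashes(leaked_rows).items():
    print(f"{', '.join(usernames)} share hash {password_hash[:12]}...")
```

Only the accounts with a repeated hash are reported, so the attacker gets a ready-made shortlist of likely weak passwords.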
What we need is a way to ensure that even simple passwords, even ones like Martha’s and John’s, will at least appear unique in our database. This is where salting comes into play. What we do is generate a “random” string of characters and mix it in with the plaintext password before hashing.
So in our above example, instead of hashing ‘Football123’ which results in:
0ce8d8fccd5e6071fdab1a4edec504872ad338dcf34920f5b5baedfb9a74da91
We can take a random string like ab9FlmNO!!, concatenate it to the end of our ‘Football123’ text and hash that instead, which yields:
da2c1d31cb17f8a7fc13a7e49f392c99dcee1f6bd863ba266abf28ed1aa0e9f9
Or maybe we have Football123Lc87nHNm!? Which becomes:
b2cf5569f2fd345218c4670431d32b3a5ea38ec3a02a719973f73b5d9d25a130
And now even if someone else comes along with the same password, they’ll still end up with a unique hash, as long as each user gets their own salt.
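The salting step above might look something like the following sketch. The function names make_salt and salted_hash are hypothetical, the salt is appended to the end of the password as in the example, and Python's secrets module is assumed as the source of randomness.

```python
import hashlib
import secrets


def make_salt(num_bytes: int = 16) -> str:
    """Return a random salt as hex text, drawn from the OS's secure random source."""
    return secrets.token_hex(num_bytes)


def salted_hash(password: str, salt: str) -> str:
    """Append the salt to the password and hash the result with SHA-256."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


# Two users pick the same password but receive different salts.
martha_salt = make_salt()
john_salt = make_salt()

martha_hash = salted_hash("Football123", martha_salt)
john_hash = salted_hash("Football123", john_salt)

# With independent random salts the two digests will not match.
print(martha_hash == john_hash)
```

Generating the salt with a cryptographically secure source rather than a general-purpose random number generator keeps salts from being predictable or repeated across users.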
The position of the salt doesn’t really matter. You could place it at the end, at the beginning, in the middle of the string, wherever. As long as your back-end code knows how to apply the salt to the given plaintext password prior to hashing, you’re good. The main point is that adding extra characters to each password makes identical passwords produce different hashes.
In terms of storing those salts in the database, we could simply add another column to our user table to contain the salt. The next time a user tries to log in, our system looks up the corresponding salt, adds it to the user-provided password, hashes them together, and if that hash matches the stored hash in the database, then the password is correct.
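The login flow just described can be sketched as follows. This is only an illustration under a few assumptions: a dictionary stands in for the user table with its extra salt column, the names register and verify_login are invented for the example, and a single SHA-256 pass is kept to mirror the example table.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-in for the "user" table:
# username -> (salt, password_hash)
users = {}


def hash_password(password: str, salt: str) -> str:
    """Hash the password with the salt appended, as in the examples above."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Create a fresh salt for this user and store it next to the hash."""
    salt = secrets.token_hex(16)
    users[username] = (salt, hash_password(password, salt))


def verify_login(username: str, password: str) -> bool:
    """Look up the user's salt, re-hash the supplied password, compare to the stored hash."""
    record = users.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(candidate, stored_hash)


register("mrobinson", "Football123")
print(verify_login("mrobinson", "Football123"))
print(verify_login("mrobinson", "Footballs123"))
```

The plaintext password is never stored; only the salt and the resulting hash are kept, and both are needed to check a login attempt.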
If the salt is blatantly exposed in the database as well, then doesn’t that render this security measure useless? No. It still makes it considerably more difficult for our attacker to identify those who have weak passwords simply by visual inspection. The hacker can easily see Martha’s and John’s salts, but won’t be able to quickly discern that they have the same password, and each hash has to be attacked separately using its own salt. Sure, it would be much better if the salt wasn’t exposed to begin with, but that isn’t feasible, as our application needs some way of knowing what that salt actually is.
We could also have a global salt that would be used by every user, which could be encapsulated within the application and unknown to the database. This would present its own set of drawbacks as well. There's no 100% "safe" solution. We're just making this as secure as we possibly can.
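One drawback of relying on a global salt alone is easy to demonstrate. In the sketch below, GLOBAL_SALT and hash_with_global_salt are hypothetical names, and the value is hard-coded only to keep the example self-contained; an application would presumably load it from configuration kept outside the database.

```python
import hashlib

# Hypothetical application-level secret, hard-coded here for illustration only.
GLOBAL_SALT = "example-global-salt"


def hash_with_global_salt(password: str, global_salt: str = GLOBAL_SALT) -> str:
    """Append the application-wide salt to the password and hash with SHA-256."""
    return hashlib.sha256((password + global_salt).encode("utf-8")).hexdigest()


martha_hash = hash_with_global_salt("Football123")
john_hash = hash_with_global_salt("Football123")
unsalted_hash = hashlib.sha256("Football123".encode("utf-8")).hexdigest()

# Same password and same global salt give the same digest for both users.
print(martha_hash == john_hash)  # True
# The digest still differs from the plain, unsalted SHA-256 of the password.
print(martha_hash == unsalted_hash)
```

Because every user shares the one salt, Martha's and John's rows would match again, so a global salt does not by itself hide duplicate passwords the way per-user salts do.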
Although salting is a powerful safeguard against precomputed dictionary attacks, and makes brute-forcing an entire table far more expensive, it’s far from being the only tool you’ll want in your arsenal of security mechanisms if you expect your application and its data to withstand modern-day exploits. That said, it’s an invaluable technique that builds a strong foundation for proper password storage. Salting is a critical step that must not be overlooked during the development process.