Foreword
Starting with your table definition:
- UserID
- Fname
- Lname
- Email
- Password
- IV
Here are the changes:
- The fields
Fname
, Lname
and Email
will be encrypted using a symmetric cipher, provided by OpenSSL,
- The
IV
field will store the initialisation vector used for encryption. The storage requirements depend on the cipher and mode used; more about this later.
- The
Password
field will be hashed using a one-way password hash,
Encryption
Cipher and mode
Choosing the best encryption cipher and mode is beyond the scope of this answer, but the final choice affects the size of both the encryption key and initialisation vector; for this post we will be using AES-256-CBC which has a fixed block size of 16 bytes and a key size of either 16, 24 or 32 bytes.
Encryption key
A good encryption key is a binary blob that's generated from a reliable random number generator. The following example would be recommended (>= 5.3):
$key_size = 32; // 256 bits
$encryption_key = openssl_random_pseudo_bytes($key_size, $strong);
// $strong will be true if the key is crypto safe
This can be done once or multiple times (if you wish to create a chain of encryption keys). Keep these as private as possible.
IV
The initialisation vector adds randomness to the encryption and required for CBC mode. These values should be ideally be used only once (technically once per encryption key), so an update to any part of a row should regenerate it.
A function is provided to help you generate the IV:
$iv_size = 16; // 128 bits
$iv = openssl_random_pseudo_bytes($iv_size, $strong);
Example
Let's encrypt the name field, using the earlier $encryption_key
and $iv
; to do this, we have to pad our data to the block size:
function pkcs7_pad($data, $size)
{
$length = $size - strlen($data) % $size;
return $data . str_repeat(chr($length), $length);
}
$name = 'Jack';
$enc_name = openssl_encrypt(
pkcs7_pad($name, 16), // padded data
'AES-256-CBC', // cipher and mode
$encryption_key, // secret key
0, // options (not used)
$iv // initialisation vector
);
Storage requirements
The encrypted output, like the IV, is binary; storing these values in a database can be accomplished by using designated column types such as BINARY
or VARBINARY
.
The output value, like the IV, is binary; to store those values in MySQL, consider using BINARY
or VARBINARY
columns. If this is not an option, you can also convert the binary data into a textual representation using base64_encode()
or bin2hex()
, doing so requires between 33% to 100% more storage space.
Decryption
Decryption of the stored values is similar:
function pkcs7_unpad($data)
{
return substr($data, 0, -ord($data[strlen($data) - 1]));
}
$row = $result->fetch(PDO::FETCH_ASSOC); // read from database result
// $enc_name = base64_decode($row['Name']);
// $enc_name = hex2bin($row['Name']);
$enc_name = $row['Name'];
// $iv = base64_decode($row['IV']);
// $iv = hex2bin($row['IV']);
$iv = $row['IV'];
$name = pkcs7_unpad(openssl_decrypt(
$enc_name,
'AES-256-CBC',
$encryption_key,
0,
$iv
));
Authenticated encryption
You can further improve the integrity of the generated cipher text by appending a signature that's generated from a secret key (different from the encryption key) and the cipher text. Before the cipher text is decrypted, the signature is first verified (preferably with a constant-time comparison method).
Example
// generate once, keep safe
$auth_key = openssl_random_pseudo_bytes(32, $strong);
// authentication
$auth = hash_hmac('sha256', $enc_name, $auth_key, true);
$auth_enc_name = $auth . $enc_name;
// verification
$auth = substr($auth_enc_name, 0, 32);
$enc_name = substr($auth_enc_name, 32);
$actual_auth = hash_hmac('sha256', $enc_name, $auth_key, true);
if (hash_equals($auth, $actual_auth)) {
// perform decryption
}
See also: hash_equals()
Hashing
Storing a reversible password in your database must be avoided as much as possible; you only wish to verify the password rather than knowing its contents. If a user loses their password, it's better to allow them to reset it rather than sending them their original one (make sure that password reset can only be done for a limited time).
Applying a hash function is a one-way operation; afterwards it can be safely used for verification without revealing the original data; for passwords, a brute force method is a feasible approach to uncover it due to its relatively short length and poor password choices of many people.
Hashing algorithms such as MD5 or SHA1 were made to verify file contents against a known hash value. They're greatly optimized to make this verification as fast as possible while still being accurate. Given their relatively limited output space it was easy to build a database with known passwords and their respective hash outputs, the rainbow tables.
Adding a salt to the password before hashing it would render a rainbow table useless, but recent hardware advancements made brute force lookups a viable approach. That's why you need a hashing algorithm that's deliberately slow and simply impossible to optimize. It should also be able to increase the load for faster hardware without affecting the ability to verify existing password hashes to make it future proof.
Currently there are two popular choices available:
- PBKDF2 (Password Based Key Derivation Function v2)
- bcrypt (aka Blowfish)
This answer will use an example with bcrypt.
Generation
A password hash can be generated like this:
$password = 'my password';
$random = openssl_random_pseudo_bytes(18);
$salt = sprintf('$2y$%02d$%s',
13, // 2^n cost factor
substr(strtr(base64_encode($random), '+', '.'), 0, 22)
);
$hash = crypt($password, $salt);
The salt is generated with openssl_random_pseudo_bytes()
to form a random blob of data which is then run through base64_encode()
and strtr()
to match the required alphabet of [A-Za-z0-9/.]
.
The crypt()
function performs the hashing based on the algorithm ($2y$
for Blowfish), the cost factor (a factor of 13 takes roughly 0.40s on a 3GHz machine) and the salt of 22 characters.
Validation
Once you have fetched the row containing the user information, you validate the password in this manner:
$given_password = $_POST['password']; // the submitted password
$db_hash = $row['Password']; // field with the password hash
$given_hash = crypt($given_password, $db_hash);
if (isEqual($given_hash, $db_hash)) {
// user password verified
}
// constant time string compare
function isEqual($str1, $str2)
{
$n1 = strlen($str1);
if (strlen($str2) != $n1) {
return false;
}
for ($i = 0, $diff = 0; $i != $n1; ++$i) {
$diff |= ord($str1[$i]) ^ ord($str2[$i]);
}
return !$diff;
}
To verify a password, you call crypt()
again but you pass the previously calculated hash as the salt value. The return value yields the same hash if the given password matches the hash. To verify the hash, it's often recommended to use a constant-time comparison function to avoid timing attacks.
Password hashing with PHP 5.5
PHP 5.5 introduced the password hashing functions that you can use to simplify the above method of hashing:
$hash = password_hash($password, PASSWORD_BCRYPT, ['cost' => 13]);
And verifying:
if (password_verify($given_password, $db_hash)) {
// password valid
}
See also: password_hash()
, password_verify()