Password Hashing

From wiki.visual-prolog.com

Revision as of 14:08, 13 February 2019 by Thomas Linder Puls (talk | contribs) (initial version)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

 


The basic problem is that user have a password, which the "system" must be able to validate, but which should be kept secret for everybody else.

Password Hash

The idea is to create a cryptographic hash of the password and save that in the "user dictionary".

class predicates
    password_hash : (string Password) -> string Hash.
clauses
    password_hash(Password) = cryptography::<someSuitableHashFunction>(Password).

To validate the password you hash again and compare the new hash to the stored hash:

class predicates
    password_validate (string Password, string Hash) determ.
clauses
    password_validate(Password, Hash) :-
        Hash = password_hash(PassWord).

A cryptographic hash is a hash with the property that it is "hard" to come from the hash value back to the password or to find something else that gives the same hash value.

Rainbow table

However, this approach has the problem that you can create a so called rainbow table: you take a huge amount of frequently used passwords and hash them with someSuitableHashFunction and then it is easy to come backwards for those frequently used passwords.

Salt

To avoid this you can make sure that the users password is not frequently use. That is done by applying salt to the users passwords.

If the users password is "horse" then you actually hash "<salt>horse", where <salt> is something strange, e.g. "Qa12#5ASGhorse" (this is the idea behind salting; in practice it is done in a slightly different way).

class predicates
    password_hash : (string Password, string Salt) -> string Hash.
clauses
    password_hash(Password, Salt) = cryptography::<someSuitableHashFunction>(Hash, Salt).

the same Salt should be used when validating the password:

class predicates
    password_validate (string Password, string Salt, string Hash) determ.
clauses
    password_validate(Password, Salt, Hash) :-
        Hash = password_hash(PassWord, Salt).

In principle you can use the same salt each time, but if a hacker gets knowledge about the salt she can make a new rainbow table with that salt, and then we are back to square one.

Subsequently, it is recommended to use a new long random salt each time. This salt does however not need to be kept secret, but can be saved in the user dictionary together with the password hash.

Here the combination of hash and salt is called a hashToken:

domains
    hashToken = hashToken(string Salt, string Hash).
 
class predicates
    password_hash : (string Password) -> hashToken HashToken.
clauses
    password_hash(Password) = hashToken{Salt, Hash} :-
        Salt = generateRandomSalt(),
        cryptography::<someSuitableHashFunction>(Hash, Salt).
 
class predicates
    password_validate (string Password, hashToken HashToken) determ.
clauses
    password_validate(Password, hashToken(Salt, Hash)) :-
        Hash = cryptography::<someSuitableHashFunction>(PassWord, Salt).

Algorithm

The last issue is the choice of someSuitableHashFunction. As already mentioned it must be difficult to come from hash to password. A consequence of this is (somewhat oddly) that it must take long time to calculate the hash of a password. Because if it is fast then it is also fast to create rainbow tables and the like. Notice however that the calculation must take long time for the hacker; it does not help that you have a hopeless implementation, if the hacker has a good implementation.

A problem in that context is that what is a suitably slow algorithm today may be too fast in some years.

As a consequence of this and probably also for many other reasons you will have to expect that you will have to change or adjust the algorithm as time passes.

Currently, various places on the net state that PBKDF2+SHA1+10.000 is a good choice.

PBKDF2 is an algorithm that takes another algorithm hash H argument and makes N iterations using H in some way.

PBKDF2+SHA1+10.000 means that H is the hash algorithm SHA1 that N is 10.000 (i.e. that the algorithm makes 10.000 iterations).

The number 10.000 is assumed to be increased as computers becomes faster.

All in all, we must expect that we will have to adjust the algorithm as time goes by.

Solution

JWT (JSON Web Tokens) does not deal with this problem, but we have used some of the ideas in this context.

A Hash Token consists of three components:

  1. A description of the algorithm and its parameters
  2. the salt
  3. the hash

Like in JWT the algorithm and its parameters are described as a JSON object:

   { "alg" : "PBKDF2", "hash" : "SHA1", "p2c" : 10000 }

The three components are Base64 for URL encoded and combined with ".".

So a HashToken will look like this (the token does not have any line shifts, they are just inserted here for readability):

   eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMjU2IiwicDJjIjoxMDAwMH0

.1xVFoIMO9C-u3O81CYoUxkV3KB_vAxUmMvrvTG9hHaM .v9cZjPwtji27vjeCFj8SLwzkqWUDe9OmT9YC3sLOMps

  1. eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMjU2IiwicDJjIjoxMDAwMH0 is the JSON object that describes the algorithm and its arguments (utf8 + base64 URL encoded)
  2. 1xVFoIMO9C-u3O81CYoUxkV3KB_vAxUmMvrvTG9hHaM is the salt (base64 URL encoded)
  3. v9cZjPwtji27vjeCFj8SLwzkqWUDe9OmT9YC3sLOMps is the hash (base64 URL encoded)

The key point is that a HashToken contains all the information that is necessary to validate a password, even though you use different algorithms, parameters and random salt.

So a single predicate can validate passwords, even when algorithms and parameters changes, and different algorithms/parameters can also be mixed in a single user dictionary.

predicates
    password_validate : (string Password, string HashToken) determ.

For now PFC supports PBKDF2 and main algorithm:

predicates
    password_pbkdf2 : (string Password, integer64 IterationCount = 10000, string HashAlgId = bcrypt_sha256_algorithm) 
	    -> string HashToken.

The actual hashing is performed by the default cryptography provider on the computer, and it has been validated that SHA1, SHA256, SHA384 and SHA512 can be used as hash algorithm.

Example

This little example:

implement main
    open core, stdio, bCrypt_native
 
clauses
    run() :-
        Password = "MySweetLittlePony",
        FalsePassword = "MySweetLittleCat",
        foreach HashAlgId in [bcrypt_sha1_algorithm, bcrypt_sha256_algorithm] do
            writef("HashAlgId = %\n", HashAlgId),
            test(Password, FalsePassword, HashAlgId),
            test(Password, FalsePassword, HashAlgId)
        end foreach.
 
class predicates
    test : (string Password, string FalsePassword, string HashAlgId).
clauses
    test(Password, FalsePassword, HashAlgId) :-
        HashToken = cryptography::password_pbkdf2(Password, :HashAlgId = HashAlgId),
        writef("%\n", HashToken),
        validate(Password, HashToken),
        validate(FalsePassword, HashToken),
        nl.
 
class predicates
    validate : (string Password, string HashToken).
clauses
    validate(Password, HashToken) :-
        Valid = if cryptography::password_validate(Password, HashToken) then "valid" else "invalid" end if,
        writef("% is %\n", Password, Valid).
 
end implement main

will produce output like this:

    HashAlgId = SHA1
    eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMSIsInAyYyI6MTAwMDB9.AquRQqikdFLq3S70WbREu5MAqZY.KEbkIURoD_qiylq-TzH013RuW9U
    MySweetLittlePony is valid
    MySweetLittleCat is invalid

    eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMSIsInAyYyI6MTAwMDB9._uLmvdiJWJXfoz15NTicjf4MjLU.7PeEqE9j6k9RF4bmLlLmFxjGaPE
    MySweetLittlePony is valid
    MySweetLittleCat is invalid

    HashAlgId = SHA256
    eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMjU2IiwicDJjIjoxMDAwMH0.9qs1jevgKywfXnaxW6rjlrgU1jTNQv8Kf5uAT0bWvzI.fGL9de1MaOImZDnDVSQk7eI0ORajAiDTwKBAt8pbHto
    MySweetLittlePony is valid
    MySweetLittleCat is invalid

    eyJhbGciOiJQQktERjIiLCJoYXNoIjoiU0hBMjU2IiwicDJjIjoxMDAwMH0.skxJNq9rEWdCL88WATlHCMGs6aj4RXucLL0phLv6zNs.hG4xObaK4RnxrF6QX1_dABndhcnel4AuKEp32lqIvg4
    MySweetLittlePony is valid
    MySweetLittleCat is invalid

The output is only "like" this because a random salt is generated each time password_pbkdf2 is called.

The first part is the same when the algorithm (and parameters) is the same, and the length of the salt and the hash depends on the chosen algorithm.