Thanks to the Matasano crypto challenges, I have had the opportunity to look at DSA more closely. In this post I am going to present the algorithm, demonstrate it works, show an implementation in Python and the vulnerabilities I have discovered through the challenges.
DSA is a standard for digital signatures.
The first step in using DSA is the parameter generation. The DSA parameters are q (a N-bits prime), p (the modulus) and g (the generator).
As I have not implemented this part, I am not going to describe it. The official standard will have recommendation for the length in bits of the numbers, and how to compute them. You can also use OpenSSL dsaparam command to generate and manipulate parameters for DSA.
The second step is generating a key pair from the parameters. The private key is chosen as a random number between 1 and q. Let’s call it x. The public key is then computed as
y = g^x mod p
Modular exponentiation is a common operation in public-key cryptography. Due to the large numbers usually involved, using the normal expressions available in the language may lead to your algorithm becoming very slow. It is therefore recommended to use an efficient implementation. You can see one such implementation in my solutions to the challenges (example).
The strength of modular exponentiation is that it is easy to compute even with enormous numbers, but the inverse operation, the discrete logarithm, i.e. finding the exponent x given the rest of the values, is believed to be computationally hard. If that were not the case, we would be able to find the private key of the pair knowing the public key and the DSA parameters (both of which are public), and there would be no security.
Now let’s move on to signing a message.
DSA signs integers, so a message needs to be hashed first; SHA-1 was initially the only choice provided, but recent additions allow for the (stronger) SHA-2. Let’s call the resulting hash H.
- Generate a random, per-message value k such that 0 < k < q. This is called the nonce; using a predictable k, or repeating the k twice for two different messages has strong security implications, as we will see later.
r = (g^k mod p) mod q. If r is 0, repeat again with a different k.
s = 1/k (H + x*r) mod q= invmod(k, q) * (H + x*r). If s is 0, repeat with a different k.
- The signature is the pair (r, s).
Note that when computing s you can use a invmod function (inverse modular); this can be computed efficiently using the extended Euclidean algorithm.
Finally, let’s verify the signature of a message against the public key.
- Check that both r and s are between 0 and q (excluded).
w = invmod(s, q).
- u1 = (H * w) mod q.
- u2 = (r * w) mod q.
- u1 = g^u1 mod p.
- u2 = y^u2 mod p.
- v = u1 * u2 mod q.
- If v == r, the message has been verified successfully.
Remember that we need to prove that for a valid key pair and signature, v is equal to r.
r = g^k mod p mod q v = g^u1 * y^u2 mod p mod q
We can start by expanding k:
w = invmod(s, q) = 1/s mod q k = H/s mod q + xr/s mod q = (Hw + xrw) mod q
Then we expand g^k
g^k = g^Hw * g^xrw mod q
y = g^x mod q we derive:
g^k = g^Hw ^ y^rw mod q = g^u1 * y^u2
which proves our point. It follows that if the keypair is not valid, or the signature is tampered with, the verification will not work.
Vulnerability 1: guessing k
If k is predictable or can be guessed easily with an exhaustive search, the private key can be recovered. Remember that when signing the message we computed
s = 1/k (H + x*r) mod q. From this you can derive:
x * r = ((s * k) - H) mod q
and therefore the private key by dividing by x. For Challenge 43, an unknown value of k between 0 and 2^16 is recoverable in a few seconds on a standard machine:
def BreakDSA(p, g, q, r, s): publicKey = Y # given in the challenge text # The message that was signed. H = dsa.HashMessage(b'For those that envy a MC it can be hazardous to your health\n' b'So be friendly, a matter of life and death, just like a etch-a-sketch\n') k = 0 brokenKey = None for k in range(1, 2 ** 16 + 1): top = (s * k) - H bottom = invmod(r, q) privateKey = (top * bottom) % q # This follows from the generation of the key pair. testPub = dsa.modexp(g, privateKey, p) if testPub == publicKey: brokenKey = privateKey break
Vulnerability 2: repeating k
Even if your k is very secure, you should never use it to sign two different messages. If an attacker knows of two messages signed with the same k, the value of k itself can be recovered. You start from
s1 = 1/k (H1 + x*r) mod q s2 = 1/k (H2 + x*r) mod q
Note that when k is the same for both signatures, also r is the same; this follows from the definition. Only s changes. You can combine the two parts:
s1 - s2 = 1/k (H1 + x*r) mod q - 1/k (H2 + x*r) mod q s1 - s2 = (1/k (H1 + x*r) - 1/k (H2 + x*r)) mod q s1 - s2 = 1/k (H1 - H2) mod q k = ((H1 - H2) / (s1 - s2)) mod q
As we know, we can then recover the private key x from k. Note that it is sometimes possible to get a negative k from this computation, which common DSA implementations do not accept. If that happens, you can simply invert the roles of the first and second messages in the computation, since it does not matter for the result.
Vulnerability 3: parameter tampering
The initial choice of parameters influences the security of the algorithm. An interesting case occurs when G is equal to P + 1. Let’s see what happens with the keypair and signature generation:
y = (p+1)^x mod p = 1 r = ((p+1)^x mod p) mod q = 1 mod q = 1
so y and r are always 1, no matter what the private key is; s is still good because it depends from other factors beside r. Now when I try to verify a random message:
v = ((p+1)^u1 * 1^u2) mod p = (p+1)^u1 mod p = 1
so v == r trivially, because they are both equal to 1 (I did not compute u1 and u2 since they are irrelevant). So the public key will verify any signature, no matter what.
ECDSA is a modification of the traditional DSA that uses elliptic curve cryptography. I have not reached that part yet, but I plan to do soon, and I will post about ECDSA to compare it with its traditional counterpart.