RSA Encryption

Encryption is all about encoding information so as to limit (meaningful) access to it. Usually when we’re talking about encryption/decryption, it’s in the context of an outgoing/incoming message. You use encryption dozens or hundreds of times a day without ever having to think about it. Every credit card swipe, every iMessage, every HTTPS visit, and every packet sent to your wireless headphones is encrypted. Anyone intercepting your data while it’s in motion is only reading garbage (so long as they can’t decrypt it).

RSA is an encryption algorithm developed at MIT in the 70s[1] by researchers Rivest, Shamir, and Adleman (hence, RSA). It’s simple enough to understand, but it’s very secure. Despite its age, RSA remains one of the most widely used public-key encryption schemes. “Public-key” isn’t some throwaway phrase; it’s integral to the description of RSA, and it describes a very significant class of ciphersystems in cryptography. To build context for this concept and for RSA, we’ll first look at a more traditional class of systems.

A Bit About Modular Arithmetic

The modulus operator returns the remainder of an integer division. For example, $5 \text{ mod } 3 = 2$. Equivalently, we say $5 \equiv 2 \,(\text{mod } 3)$, “five is congruent to two, mod three”. $\mathbb{Z}$ denotes the set of integers, and $\mathbb{Z}_n$ denotes the set of integers modulo $n$, i.e. $\{0, 1, 2, \dots, n-1\}$.

Traditional Cipher Systems

A cipher system $(P, C, K, e, d)$ is defined with the following components:

$P =$ a set of plaintexts
$C =$ a set of ciphertexts
$K =$ a set of keys
$e:\, P \to C =$ an encryption function
$d:\, C \to P =$ a decryption function

The role of all these pieces will be apparent with a simple example:

The Shift Cipher

$P = C = K = \mathbb{Z}_{26}$
$e_k(x) = x + k \,(\text{mod } 26)$
$d_k(y) = y - k \,(\text{mod } 26)$

What this means is we’re working with a $26$ character alphabet (i.e. $A, \dots, Z$), where each character is mapped to a number (i.e. $A = 0, \dots, Z = 25$). Then if we choose the key to be $3$, we have $e_3(A) = D$, $e_3(B) = E$, ..., $e_3(Z) = C$. So all we’ve done is shifted (rotated) each character in our message by $3$ characters. $k$ can be any integer from $0$ through $25$ (since $26 \equiv 0 \,(\text{mod } 26)$). When $k = 3$ specifically, this is called the Caesar cipher.
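As a quick sanity check, the shift cipher fits in a few lines of Python. This is a minimal sketch (the function names are my own, and it handles only uppercase A–Z):

```python
# A minimal sketch of the shift cipher over Z_26 (uppercase A-Z only;
# function names are illustrative, not from the post).

def shift_encrypt(plaintext: str, k: int) -> str:
    """e_k(x) = x + k (mod 26), applied letter by letter."""
    return "".join(chr((ord(c) - ord("A") + k) % 26 + ord("A")) for c in plaintext)

def shift_decrypt(ciphertext: str, k: int) -> str:
    """d_k(y) = y - k (mod 26)."""
    return "".join(chr((ord(c) - ord("A") - k) % 26 + ord("A")) for c in ciphertext)

print(shift_encrypt("ABZ", 3))   # → DEC
print(shift_decrypt("DEC", 3))   # → ABZ
```

Note how Z wraps around to C, exactly the rotation described above.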


RSA is a public-key ciphersystem, which means that everyone can know how to encrypt a message, but the encryption key is not sufficient to decrypt the message. So the encryption function is still invertible (otherwise there would be no way to decrypt), but it is hard to find the inverse of the encryption function without some extra, secret information. A function like this is called a one-way function[2], or more precisely a trapdoor function: one-way, unless you know the secret “trapdoor” (in RSA’s case, the private key). By “hard” to decrypt, we mean something like “a modern computer could run until the universe burns out and still have almost no chance of decrypting”.

RSA leverages such an asymmetry to create its one-way function: it is relatively easy to find very large prime numbers, but very difficult to find the prime factors of a large number.

Let $p$ and $q$ be two large primes, and $n = p \cdot q$. Then take $e$ and $d$ in $\mathbb{Z}_{(p-1)(q-1)}$, such that $e \cdot d \equiv 1 \,(\text{mod } (p-1)(q-1))$. That is, $e$ and $d$ are inverses in $\mathbb{Z}_{(p-1)(q-1)}$. At this point we can throw out everything but $e$, $d$, and $n$: we no longer need $p$ and $q$, and their discovery would break the security of our scheme. Now we can share $(e, n)$ publicly, and must keep $d$ secret. The encryption for a message $M$ is $M^e \,(\text{mod } n)$, and the decryption is $(M^e)^d \,(\text{mod } n)$. It’s as simple as that.
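The whole scheme is short enough to walk through in code. Here is a toy sketch with tiny primes chosen purely for illustration (real deployments use ~2048-bit $n$ and padded messages); it relies on Python 3.8+’s `pow(e, -1, phi)` for the modular inverse:

```python
# Toy RSA with tiny illustrative primes (real RSA uses ~2048-bit n and
# message padding). Requires Python 3.8+ for pow(e, -1, phi).

p, q = 61, 53
n = p * q                      # public modulus, n = 3233
phi = (p - 1) * (q - 1)        # (p-1)(q-1) = 3120

e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent: e*d ≡ 1 (mod phi)

M = 42                         # message encoded as an integer < n
C = pow(M, e, n)               # encrypt: M^e mod n
assert pow(C, d, n) == M       # decrypt: (M^e)^d ≡ M (mod n)
print(f"n={n}, e={e}, d={d}, ciphertext={C}")
```

After computing $d$, the pair $(p, q)$ plays no further role, which is exactly why it can (and must) be discarded.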

It’s not obvious at all that $M^{e \cdot d} \equiv M \,(\text{mod } p \cdot q)$ when $e \cdot d \equiv 1 \,(\text{mod } (p-1)(q-1))$, but this fact is essential for RSA to work as it does. I’ll prove this statement below in case you’re interested.

I’ve already mentioned that RSA is widely used today, but in particular many RSA systems use a $2048$-bit ($\sim 600$-digit) $n$ value, and an $e$ value of precisely $65{,}537$. If you like your powers of two, you may recognize this as $2^{16}+1$[3]. For a sense of scale, the number of atoms in the universe has about $80$ digits. The numbers we’re working with here are truly massive, so this may raise a couple of concerns for you.

First, isn’t it difficult to find prime numbers that large to use for pp and qq?

Prime numbers aren’t as scarce in our ballpark as you may think. By the Prime Number Theorem, there are approximately $\frac{n}{\ln(n)}$ primes among the first $n$ natural numbers, so a random number near $n$ is prime with probability roughly $\frac{1}{\ln(n)}$. So in practice, it’s reasonable to iteratively pick numbers in some ballpark and run a primality test on each. Naively, you could check whether any integer between $2$ and $\sqrt{n}$ divides $n$, but for numbers this size that’s hopelessly slow. The key is that fast probabilistic primality tests (such as Miller-Rabin) can tell you *whether* a number is prime without ever finding its factors, so testing primality is much faster than factoring in general (which we said earlier was hard).
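For illustration, here is a sketch of that standard approach: the Miller-Rabin probabilistic test, plus a loop that samples random candidates of a given bit length until one passes. (This is my own minimal implementation, not production code; real libraries add better candidate sieving and side-channel hardening.)

```python
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin probabilistic primality test (minimal sketch)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):
        if n % small == 0:
            return n == small
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)               # fast modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False               # 'a' witnesses that n is composite
    return True                        # probably prime (error < 4^-rounds)

def random_prime(bits: int) -> int:
    """Sample odd candidates with the top bit set until one tests prime."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate
```

Each Miller-Rabin round costs one modular exponentiation, so even 2048-bit candidates are checked in milliseconds; by the density estimate above, only a few hundred candidates need testing on average.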

Second, isn’t it difficult to reduce such huge exponents to encrypt and decrypt messages?

Not really! Since we’re doing modular exponentiation, we can be super efficient. Let’s do an example. Say we want to encrypt the letter “X” (which maps to $23$) with the key $e = 25$ in $\mathbb{Z}_{26}$. We want to know $23^{25} \,(\text{mod } 26)$. Computing directly gives $23^{25} = 11{,}045{,}767{,}571{,}919{,}545{,}466{,}173{,}812{,}409{,}689{,}943$. But that took a ton of sub-operations, and we have to do a ton more to find this value mod $26$. By leveraging the fact that we’re working in $\mathbb{Z}_{26}$, we can reduce this to $O(\log(n))$ multiplications. Repeatedly splitting the exponent, $23^{25} = (23^{12})^2 \cdot 23 = ((23^6)^2)^2 \cdot 23 = (((23^3)^2)^2)^2 \cdot 23 = (((23 \cdot 23^2)^2)^2)^2 \cdot 23$. Now reduce as we go: $23^2 = 529 \equiv 9 \,(\text{mod } 26)$, and $23 \cdot 9 = 207 \equiv 25 \,(\text{mod } 26)$, so the innermost term becomes $25$ and we have $(((25)^2)^2)^2 \cdot 23$. Almost there! $25^2 = 625 \equiv 1$, and of course $1$ to any power is $1$, leaving just $1 \cdot 23$. So $23^{25} \equiv 23 \,(\text{mod } 26)$. TLDR: by finding modular equivalences in intermediate steps, we’re able to keep the size of our multiplications in a reasonable range.
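This trick is known as square-and-multiply, and it’s only a few lines of code. A minimal sketch (Python’s built-in three-argument `pow` does the same thing natively):

```python
def power_mod(base: int, exp: int, mod: int) -> int:
    """Right-to-left square-and-multiply: O(log exp) multiplications,
    reducing mod `mod` at every step so intermediates stay small."""
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:                  # low bit set: fold in the current square
            result = result * base % mod
        base = base * base % mod     # square for the next bit
        exp >>= 1
    return result

print(power_mod(23, 25, 26))         # → 23
```

Every intermediate value stays below $\text{mod}^2$, so even $2048$-bit exponents take only a few thousand small multiplications.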

Closing Remarks

So that is RSA, and why it works. This is truly an awesome system. There’s some interesting history here with RSA and the Diffie-Hellman Key Exchange and their treatment by certain global superpowers, but I will save the culture for another post.

Proof that $\forall x \in \mathbb{Z}_{pq},\, x^{ed} \equiv x$

Fermat’s Little Theorem[4]: if $p$ is prime, then $x^p \equiv x \,(\text{mod } p)$.
By construction, $ed \equiv 1 \,(\text{mod } (p-1)(q-1))$, so $(ed-1)$ is a multiple of $(p-1)(q-1)$, and in particular a multiple of $(p-1)$.
That is, $\exists k \in \mathbb{Z}:\, ed-1 = k(p-1)$.
Then $(x^e)^d = x^{ed} = x \cdot x^{ed-1} = x \cdot x^{k(p-1)} = x \cdot (x^p)^k \cdot x^{-k}$.
By Fermat’s Little Theorem, $x^p \equiv x \,(\text{mod } p)$, so $x \cdot (x^p)^k \cdot x^{-k} \equiv x \cdot x^k \cdot x^{-k} \equiv x \,(\text{mod } p)$.
By symmetry, $(x^e)^d \equiv x \,(\text{mod } q)$.
Since $p$ and $q$ are distinct primes, the Chinese Remainder Theorem combines the two congruences, and we have $\forall x \in \mathbb{Z}_{pq},\, x^{ed} \equiv x \,(\text{mod } pq)$.
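If you’d rather see the statement checked than proved, a brute-force test over a small $\mathbb{Z}_{pq}$ is quick (the primes below are arbitrary small choices of mine):

```python
# Brute-force check: for every x in Z_pq, x^(ed) ≡ x (mod pq)
# whenever ed ≡ 1 (mod (p-1)(q-1)). Small primes chosen arbitrarily.
p, q = 5, 11
n, phi = p * q, (p - 1) * (q - 1)    # n = 55, phi = 40
e = 3                                # any e coprime to phi works
d = pow(e, -1, phi)                  # d = 27, since 3 * 27 = 81 ≡ 1 (mod 40)
assert e * d % phi == 1
assert all(pow(x, e * d, n) == x for x in range(n))
print("verified for all", n, "residues")
```

Note the check passes even for $x$ sharing a factor with $n$, which the Chinese Remainder Theorem step above guarantees.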


  1. An equivalent system was developed secretly and used by the British government years before Rivest, Shamir, and Adleman devised RSA!

  2. The existence of one-way functions is actually only a conjecture! To prove OWFs exist is to prove $P \neq NP$. That is to say, RSA’s security depends on a very big (but probably safe) assumption.

  3. Numberphile has several RSA-related videos. This one explains why $2^{16}+1$ is a pretty good choice of $e$.

  4. The second coolest theorem with a name of the form “Fermat’s L___ Theorem”.
