Cryptography is the study and practice of techniques for secure communication in the presence of adversaries. It involves transforming information in such a way that only authorized parties can access it, while preventing unauthorized access.
In this lecture, we'll explore how number theory provides the mathematical foundation for modern cryptography, particularly through the RSA encryption system. This is a really neat application of modular arithmetic and some of the structure we have uncovered there!
Acknowledgement: The material on this page has been inspired by the notes Introduction to Cryptography by Shishir Agrawal, and some of the code presented here is from those notes.
In cryptography, we want to send messages, and classically this means messages conisting of text, like "Bet all on red". In modern days, we may want to send text, or other information like bank account information etc, in a secure matter. For mathematical cryptography, we are mostly concerned with sending numbers, but if you can do this securely, then you can also send text or anything else you want by just turning your information into numbers. We call this process integrification, since we are turning information into integers. This section runs through one possible way of doing that.
The benefit of integrification is that it allows us to:
The basic idea is to map each letter to a number. A common approach is:
For longer messages, we can represent entire words or sentences as a single large integer using a base-26 representation. For example, if we have the letters with values $a_0, a_1, a_2, \ldots, a_{n-1}$ (where each $a_i$ is between 0 and 25), we can create the integer:
$$M = a_0 \cdot 26^{n-1} + a_1 \cdot 26^{n-2} + \cdots + a_{n-1} \cdot 26^0$$This converts any text string into a unique integer, which we can then encrypt using mathematical operations.
Of course, you may want to include other characters, like spaces, punctuation, numbers,..., and maybe you care about the difference between upper case and lower case letters. We won't worry about any of this, and in the following we will only care about letters (not spaces or any other characters) which are not case sensitive.
The RSA algorithm, named after its inventors Rivest, Shamir, and Adleman (1977), is one of the first public-key cryptosystems and is widely used for secure data transmission. Its security is based on the computational difficulty of factoring large integers.
RSA relies on the following mathematical concepts from number theory:
To set up RSA encryption, we perform the following steps:
The public key is the pair $(n, e)$, which can be shared openly.
The private key is the pair $(n, d)$, which must be kept secret. The primes $p$ and $q$ should also be kept secret (or destroyed after generating the keys).
Once the keys are generated:
Encryption: To encrypt a message $M$ (represented as an integer with $0 \leq M < n$), compute: $$C = M^e \bmod n$$ where $C$ is the ciphertext.
Decryption: To decrypt the ciphertext $C$, compute: $$M = C^d \bmod n$$Why this works: Since $ed \equiv 1 \pmod{\phi(n)}$, we have $ed = 1 + k\phi(n)$ for some integer $k$. Therefore: $$C^d \equiv (M^e)^d = M^{ed} = M^{1+k\phi(n)} = M \cdot (M^{\phi(n)})^k \equiv M \cdot 1^k = M \pmod{n}$$ by Euler's theorem. If you are astute, you've noticed that we used Euler's theorem here but we didn't check that $M$ satisfies the hypothesis that $\gcd(M, n) = 1$. In fact, this won't always be the case, but miraculously $M^{ed} \equiv M \mod n$ even when $\gcd(M,N) \neq 1$!
Exercise: Prove this!
Examples of RSA: Write simple code that shows how to encrypt and decrypt with RSA. Check that the decryption gives the original message.
The security of RSA depends on the difficulty of factoring the modulus $n$. If an adversary can factor $n = pq$, they can compute $\phi(n) = (p-1)(q-1)$ and then calculate the private key $d$ from the public exponent $e$.
However, for sufficiently large primes (typically 1024 bits or larger in practice), factoring is computationally infeasible with current algorithms and computers. This makes RSA secure for practical purposes.
Code-breaking approach: If you know $n$ and $e$ (the public key) and can factor $n$ into $p$ and $q$, then you can:
Use the tool below to generate RSA keys. Specify the number of bits for the primes $p$ and $q$:
RSA encryption demonstrates the beautiful connection between pure number theory and practical applications. The security of RSA relies on:
In practice, RSA is used with key sizes of 2048 bits or larger to ensure security against modern factoring algorithms and computational power. While RSA encryption of long messages can be slow, it's often used to encrypt symmetric keys, which are then used for faster symmetric encryption of the actual message data (a hybrid cryptosystem).
Try breaking RSA with small keys. Factor $n$, compute $\phi(n)$, find $d$, and decrypt the message:
Let's see an example of how RSA can be broken if the modulus $n$ is too small. Suppose Alice uses the following public key:
Bob sends Alice an encrypted message: $C = 26$
An eavesdropper Eve can break this by:
So the original message was $M = 2$, which corresponds to the letter 'C' in our encoding (B in 0-indexed).
This demonstrates why RSA requires large primes—small values can be easily factored!