Cryptography and RSA

Cryptography is the study and practice of techniques for secure communication in the presence of adversaries. It involves transforming information in such a way that only authorized parties can access it, while preventing unauthorized access.

In this lecture, we'll explore how number theory provides the mathematical foundation for modern cryptography, particularly through the RSA encryption system. This is a really neat application of modular arithmetic and some of the structure we have uncovered there!

Acknowledgement: The material on this page has been inspired by the notes Introduction to Cryptography by Shishir Agrawal, and some of the code presented here is from those notes.

Converting Text to Numbers

In cryptography, we want to send messages, and classically this means messages conisting of text, like "Bet all on red". In modern days, we may want to send text, or other information like bank account information etc, in a secure matter. For mathematical cryptography, we are mostly concerned with sending numbers, but if you can do this securely, then you can also send text or anything else you want by just turning your information into numbers. We call this process integrification, since we are turning information into integers. This section runs through one possible way of doing that.

Integrification

The benefit of integrification is that it allows us to:

Apply mathematical transformations to messages
Use modular arithmetic for encryption
Leverage properties of prime numbers and number theory

The basic idea is to map each letter to a number. A common approach is:

A → 0, B → 1, C → 2, ..., Z → 25

For longer messages, we can represent entire words or sentences as a single large integer using a base-26 representation. For example, if we have the letters with values $a_0, a_1, a_2, \ldots, a_{n-1}$ (where each $a_i$ is between 0 and 25), we can create the integer:

$$M = a_0 \cdot 26^{n-1} + a_1 \cdot 26^{n-2} + \cdots + a_{n-1} \cdot 26^0$$

This converts any text string into a unique integer, which we can then encrypt using mathematical operations.

Of course, you may want to include other characters, like spaces, punctuation, numbers,..., and maybe you care about the difference between upper case and lower case letters. We won't worry about any of this, and in the following we will only care about letters (not spaces or any other characters) which are not case sensitive.

The RSA Cryptosystem

The RSA algorithm, named after its inventors Rivest, Shamir, and Adleman (1977), is one of the first public-key cryptosystems and is widely used for secure data transmission. Its security is based on the computational difficulty of factoring large integers.

Mathematical Setup

RSA relies on the following mathematical concepts from number theory:

Euler's totient function: For a positive integer $n$, $\phi(n)$ counts the number of integers from 1 to $n$ that are coprime to $n$. If $n = pq$ where $p$ and $q$ are distinct primes, then $\phi(n) = (p-1)(q-1)$.

Euler's theorem: If $\gcd(a, n) = 1$, then $a^{\phi(n)} \equiv 1 \pmod{n}$.

Modular multiplicative inverses: If $\gcd(e, \phi(n)) = 1$, there exists a unique $d$ modulo $\phi(n)$ such that $ed \equiv 1 \pmod{\phi(n)}$.

RSA Key Generation

To set up RSA encryption, we perform the following steps:

Choose two large prime numbers $p$ and $q$.
Compute $n = pq$. This will be the modulus for both encryption and decryption.
Compute $\phi(n) = (p-1)(q-1)$.
Choose an encryption exponent $e$ such that $1 < e < \phi(n)$ and $\gcd(e, \phi(n)) = 1$.
Compute the decryption exponent $d$ such that $de \equiv 1 \pmod{\phi(n)}$. In other words, $d = e^{-1} \bmod \phi(n)$.

The public key is the pair $(n, e)$, which can be shared openly.

The private key is the pair $(n, d)$, which must be kept secret. The primes $p$ and $q$ should also be kept secret (or destroyed after generating the keys).

RSA Encryption and Decryption

Once the keys are generated:

Encryption: To encrypt a message $M$ (represented as an integer with $0 \leq M < n$), compute: $$C = M^e \bmod n$$ where $C$ is the ciphertext.

Decryption: To decrypt the ciphertext $C$, compute: $$M = C^d \bmod n$$

Why this works: Since $ed \equiv 1 \pmod{\phi(n)}$, we have $ed = 1 + k\phi(n)$ for some integer $k$. Therefore: $$C^d \equiv (M^e)^d = M^{ed} = M^{1+k\phi(n)} = M \cdot (M^{\phi(n)})^k \equiv M \cdot 1^k = M \pmod{n}$$ by Euler's theorem. If you are astute, you've noticed that we used Euler's theorem here but we didn't check that $M$ satisfies the hypothesis that $\gcd(M, n) = 1$. In fact, this won't always be the case, but miraculously $M^{ed} \equiv M \mod n$ even when $\gcd(M,N) \neq 1$!

Exercise: Prove this!

Theorem (RSA Correctness)

Let $n = pq$ where $p$ and $q$ are distinct primes. Let $e$ and $d$ be positive integers such that $ed \equiv 1 \pmod{\phi(n)}$. Then for any integer $M$ with $0 \leq M < n$: $$M^{ed} \equiv M \pmod{n}$$

Examples of RSA: Write simple code that shows how to encrypt and decrypt with RSA. Check that the decryption gives the original message.

Security of RSA

The security of RSA depends on the difficulty of factoring the modulus $n$. If an adversary can factor $n = pq$, they can compute $\phi(n) = (p-1)(q-1)$ and then calculate the private key $d$ from the public exponent $e$.

However, for sufficiently large primes (typically 1024 bits or larger in practice), factoring is computationally infeasible with current algorithms and computers. This makes RSA secure for practical purposes.

Code-breaking approach: If you know $n$ and $e$ (the public key) and can factor $n$ into $p$ and $q$, then you can:

Compute $\phi(n) = (p-1)(q-1)$
Compute $d = e^{-1} \bmod \phi(n)$
Decrypt any ciphertext using $M = C^d \bmod n$

Interactive RSA Key Generation

Use the tool below to generate RSA keys. Specify the number of bits for the primes $p$ and $q$:

Conclusion

RSA encryption demonstrates the beautiful connection between pure number theory and practical applications. The security of RSA relies on:

The difficulty of factoring large numbers
Properties of Euler's totient function
Modular exponentiation and multiplicative inverses

In practice, RSA is used with key sizes of 2048 bits or larger to ensure security against modern factoring algorithms and computational power. While RSA encryption of long messages can be slow, it's often used to encrypt symmetric keys, which are then used for faster symmetric encryption of the actual message data (a hybrid cryptosystem).

Interactive Code-Breaking Example

Try breaking RSA with small keys. Factor $n$, compute $\phi(n)$, find $d$, and decrypt the message:

Example: Breaking RSA with Small Keys

Let's see an example of how RSA can be broken if the modulus $n$ is too small. Suppose Alice uses the following public key:

$n = 33$
$e = 7$

Bob sends Alice an encrypted message: $C = 26$

An eavesdropper Eve can break this by:

Factoring $n = 33 = 3 \times 11$
Computing $\phi(n) = (3-1)(11-1) = 2 \times 10 = 20$
Finding $d$ such that $ed \equiv 1 \pmod{20}$, i.e., $7d \equiv 1 \pmod{20}$
Solving: $d = 3$ (since $7 \times 3 = 21 \equiv 1 \pmod{20}$)
Decrypting: $M = 26^3 \bmod 33 = 17576 \bmod 33 = 2$

So the original message was $M = 2$, which corresponds to the letter 'C' in our encoding (B in 0-indexed).

This demonstrates why RSA requires large primes—small values can be easily factored!