## 9.4   Going Farther: RSA

One of the best known applications of number theory is to what is known as a public key cryptosystem. The idea behind a public key cryptosystem is that one person, let's call her Alice, can tell anyone and everyone enough information to encode a message to be sent to her, but only Alice can decode the message. This certainly runs counter to intuition. If one has enough information to encode a message, why can't they run the process in reverse to decode it?

Several different methods for implementing public key cryptosystems have been proposed. One of the first, and certainly the best known, was developed by Rivest, Shamir, and Adelman, and is known as RSA. Before describing the specific mechanics involved in implementing RSA, we first discuss a general scheme known as text blocking that is incorporated into most cryptosystems.

### 9.4.1   Text Blocking

The crytosystems we have introduced thus far have been based on the idea of encoding messages a single letter at a time. Most cryptographic schemes of this type are not useful in practice because they will be vulnerable to frequency analysis. Recall that frequency analysis proceeds by counting the number of times each character occurs in an encoded message, and then making educated guesses about how to decode each character.

A simple way to work around frequency analysis is to organize the message text into blocks. Instead of representing each single character as a number, the text is taken in blocks of a certain length. Thus far, we have worked with a 95-character alphabet. We will now encode blocks of several characters of text as single numbers. For example, suppose we are going to convert the text of the quote

To be or not to be, that is the question. -- Hamlet

into numbers. If we take the text in blocks of four characters each, then the first four characters "To b" would be converted into a single number, the next four characters "e or" would be converted to a single number, and so on. The first step in converting each 4-character block to a number is to convert each individual character in the block to a number. Use the applet below to convert "To b" to a list of numbers:

Your browser does not support java.

We then combine these numbers together as shown:

52·950 + 79·951 + 0·952 + 66·953 = 56594307.

The number 56594307 is then encoded and transmitted. More generally, if we take blocks of size n, and if (a1, a1, . . . , an) are the numerical equivalents for each letter in a block, then the block number is computed using the formula

a1·950 + a2·951 + a3·952 + ··· + an·95n–1.

By blocking in this manner, we have effectively raised the size of our "alphabet" from 95 to 95n, which makes frequency analysis much more difficult, if not completely infeasible. For example, if n = 10, then the size of our "alphabet" is 9510 = 59873693923837890625.

To make life a little easier for you down the road, the applet below is provided. This applet will take a message, break it into blocks, and then compute the numerical equivalent for each block. It also will reverse the process, converting blocks back into text.

Your browser does not support java.

### 9.4.2   Implementing RSA

The first step in implementing the RSA cryptosystem is to select two distinct primes p and q, and then set M = pq. If our messages are blocked into groups of n letters, then p and q must be large enough so that M > 95n. Taking logarithms (base 95), this translates to n < log95(M). We will pick the block length n by this formula.

Once p and q have been selected, the next step is to select a positive integer e such that gcd(e, (p–1)(q–1)) = 1. The integer e is called the encoding exponent, for reasons that will be clear shortly. Once suitable choices of M and e have been made, then we encode a text block number x to a new number c using the formula

c = xe % M.

The number c is then transmitted to the recipient. Note that the only information required to encode a message are the integers e and M. In particular, it is not necessary to know p and q in order to send a message.

To decode a message, we require a decoding exponent d, which is given by the formula

d = e–1 % ((p–1)(q–1)).

Once d has been computed, the original number x may be recovered using the formula

x = cd % M.

Let's look at a specific example to see how this works. Here's our quote again:

To be or not to be, that is the question. -- Hamlet

We begin by selecting two primes, say p = 12373 and q = 32491, and then setting M = pq = 402011143. The block length will then be

Your browser does not support java.

Recalling that the applet rounds down when in "Integers" mode, we will end up using a block length of 4. The next step is to select an encoding exponent. A common choice is e = 17, but this is suitable only if gcd(e, (p–1)(q–1)) = 1. Here's a check:

Your browser does not support java.

Now we're ready to encode. The applet below takes the original message, the encoding exponent e, and the modulus M as input, and returns a list of encoded blocks. It selects the block size by the formula we derived above. Here's the set of encoded blocks for our message.

Your browser does not support java.

In order to decode, we first need to compute the decoding exponent d. Recall that the formula for d is given by

d = e–1 % ((p–1)(q–1)).

We can use the applet below to compute d as shown.

Your browser does not support java.

Once we have d, we can use the "RSA Decode" applet to decode the set of encoded blocks.

Your browser does not support java.

#### Exercise 3

In this exercise, we prove that the method given above for decoding messages using RSA will work. Suppose that M = pq where p and q are distinct primes, 0 < x < M, c = xe % M, and

d = e–1 % ((p–1)(q–1)).

Show that

x = cd % M.

#### Exercise 4

Bill wants to send a message to Alice using RSA, so she tells him that her encoding exponent is e = 17 and her encoding modulus is M = 12195377863. Show Bill how to encode the message

Hey Alice, isn't RSA cool!? --Bill

#### Exercise 5

Alice has received the encoded blocks shown below. Use the fact that M = 104729·116447 to decode the message.

Your browser does not support java.

### 9.4.3   Is RSA secure?

It's clear from the decoding procedure that if one can compute d then one can decode a message. Recall that the formula for computing d is given by

d = e–1 % ((p–1)(q–1)).

Thus computing d using this formula requires knowledge of the values of p and q. On the other hand, the only information known to anyone encoding messages are the values of e and M. Of course, we know that M = p·q, so that even if we just know e and M, we can find p and q by factoring M. For example, in Exercises 4 and 5 above, Alice used the encoding modulus M = 12195377863. In practice, messages sent to her would not be secure because it is easy to factor this value of M. Below is an applet to do just that.

Your browser does not support java.

#### Exercise 6

Bill has decided he wants to have people send him messages encoded using RSA. He tells everyone that his encoding exponent is e = 17 and his encoding modulus is M = 13776856397. He has received his first set of encoded blocks, which is given below. Decode the message.

Your browser does not support java.

At this point, you may be asking yourself "Why would anyone use this cryptosystem? It doesn't seem to be secure." Indeed, the examples that we have seen so far are not secure, because it has been possible to factor the modulus M.

The key to making RSA secure is to select primes p and q that are much larger, say on the order of 100 digits each. In this case, M has about 200 digits and would be impossible to factor without knowledge of p and q. (One has to be a bit careful when making the choice of p and q, but we won't worry about these details here.) Without knowing p and q, there is no known way to compute the decoding exponent d.