A hands-on intro to RSA
A simple framework to explain and demonstrate how the most widely used encryption protocol works ... and how to break it!
by Dr. Tomás Silveira Salles
9 minute read
RSA is one of the most common encryption protocols today, and everyone in the world of IT should have at least a basic understanding of it. However, explaining only the mathematics involved makes the experience quite demotivating for a lot of people, so we decided to go for a hands-on approach and wrote a small framework called TextbookRSA that implements all the crucial steps, and that you can read and play around with.
TextbookRSA is written in swift and uses only its base standard library: Foundation. This means you can try it out on any Mac or Linux machine (as long as swift is installed).
As a bonus, we also included a working implementation of how quantum computers can be used to break RSA encryption, to fulfil a promise made in a previous article.
TextbookRSA was never meant to be safe for real-world applications, but rather to be easy to read and understand. The most obvious weakness is the fact that the public keys in this framework are at most 32 bits long (on 64-bit machines, and at most 16 bits long on 32-bit machines). This is because we didn't want to implement "big-integer" types and arithmetic functions. We see big-integer arithmetics as a separate problem that is independent of the RSA protocol, and it would distract the reader/user from the key concepts that we wanted to convey. Other weaknesses are for example the lack of protection against timing attacks, the lack of a check against weak prime pairs, and using electronic codebook as the block cypher mode.
On the other hand, the framework fulfils its purpose of demonstrating the main concepts, it's very easy to use and the code is quite clean and simple.
Note: If you're already familiar with RSA encryption, you can probably understand the project more quickly by looking at the README file and browsing through the sources.
To begin a message exchange through RSA, the receiver, usually named Bob in textbooks, chooses two prime numbers and . Their product is often called the RSA modulo, and is part of the public key, but the two prime numbers are private and Bob never shares them with anyone. In TextbookRSA the interface for this triple of numbers is defined in the protocol RSAKeysProtocol
and we have only one implementation, named UIntRSAKeys
(which uses UInt
as the integer type). In the code, you can see that we use our method UInt.randomPrimeInRange(...)
to choose and , which picks random numbers in a given interval and tests whether they are prime. In real-world applications, where and must be hundreds or even thousands of bits long, it is unrealistic to check this with absolute certainty, but there are tests that can quickly tell with high probability whether the chosen numbers are prime or not.
The next step is to choose an exponent to be used for encryption, and which is the second and last part of the public key. It is important that not have any common factors with , but that is really the only condition. As far as is known, it does not matter if the same exponent is used many times or by many different people. Many implementations of RSA simply hard-code , for example, because it is the smallest (and therefore the most efficient) exponent you can choose. An even more common value is , which has some mathematical properties that can be used to make encryption more efficient and is a prime number, which makes it unlikely to share factors with . In our library we stick to the original idea and choose at random. This is done by calling keys.generateEncryptionParameters()
, which essentially returns the pair .
Since the pair is public, Bob can transmit it to the message sender, usually named Alice in textbooks, through any public channel. To facilitate these transmissions, the relevant types in TextbookRSA implement the protocol Codable
, which is a new and extremely helpful Swift 4 feature. This means you can easily convert these types into well known formats such as JSON and back.
➕➖ Encoding of the RSA public key
There are several different official formats to encode RSA public keys, depending on the toolkit you use (for example OpenSSH or OpenSSL) but the essential content is always the same: the RSA modulo and the encryption exponent. Additionally, many formats include an identification string to document which algorithm is being used, so you don't get your keys mixed up. These formats are usually not in readable text form (like JSON) but in data form (binary), and usually the result gets converted into text through base-64 encoding. This encoding turns each group of 6 bits into a character, even though the data is organised into groups of 8 bits and most of them represent numbers and not letters. The string that comes out has nothing to do with a natural language description of the keys for humans. This is why the file that is automatically generated by the toolkit usually looks like this:
-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAvUYHKDzF277x/IF81JBT
ApD4zgczNrUBvTWbLBNA8uEby7i/5RRf1PdwUojGjANK5gpV3+5M7dkRBerWHBNM
GWb2zho1tyO+ujcMlQFfuY2z7cju5IMmrbt7IlO84OCWSS5WfsNdGcWTjLHMm0Dn
NlViSBODxNFFuPUTnApUJtVeFXZlAiZwwICYs+QzsgaJFDhcQoKuK4vP5A101Kv0
muws4toR0qd119qMQuR+wp7EpAzP6FbmRdr2a7JBkj4bXcHWrEaq5K4lFkILokCy
TCXZv9hBIo6AMrvLzNtFsdIwT09bS0wqxdIIa/THX+Z7b4afeN1JM5qwa5N98VqK
SwIDAQAB
-----END PUBLIC KEY-----
It looks as if the keys were encrypted within the file (which doesn't make sense because they are public!) but in reality they are just encrypted in a very efficient but human-unfriendly format.
Now Alice can encrypt her message by calling Encrypter.encrypt(message, parameters: parms)
. You will find the implementation of two overloads of this method in EncrypterProtocol+DecrypterProtocol.swift and Encrypter+Decrypter.swift. One of them expects message
to be of type Data
(so you can encrypt arbitrary files) and the other takes message
of type String
(which is easier to use for quick tests, for example on a Swift Playground).
The reason why Encrypter
is defined in the directory ECB is because RSA really only encrypts individual integers, and only integers smaller than . This would be very boring for a beginner trying to learn something by using our framework, so we added ECB (short for electronic codebook) as a way of chopping longer messages into small blocks so that these blocks can be individually encrypted via RSA. Again, this is not a good idea in real-world applications, because when you use ECB, equal blocks are transformed into other equal blocks, which reveals patterns in the data. In actual encryption systems RSA would be used either for signing (where it is applied only to a short hash of the message) or to encrypt the private keys for a different protocol (such as AES-GCM), which would then be used to encrypt the actual message.
The RSA part of the encryption process is the transformation of each individual block. A block (i.e. a number) is transformed into the number . This happens in the method UIntRSA.transform(...)
and uses the class Math.PowerOracle
, which is probably the most interesting piece of mathematics in the project. The numbers involved in this exponentiation can be very large, so we have to worry about two things: Computing the result efficiently and avoiding an overflow of our integer type. To do this, we use the classic approach of breaking the exponent into a sum of powers of 2 and calculating the exponentiation in several steps.
➕➖ Easier done than said: Efficient modular exponentiation
Let's say we want to compute . First we write 10 as a sum of powers of 2: . This means that is the same as . Then we compute . Now, we know that , so we substitute for and get . Similarly, we know that , so we substitute for and get . Finally, to compute we substitute for and for and get .
This may seem like a long detour to compute something simple, but in practice it is very efficient. The key point is that after each step we apply the modulo again, so we're always working with small numbers. This way we don't have to worry about overflows, and the process is also faster because small numbers are easier to multiply.
When Bob receives Alice's message, he can create a Decrypter
object using his keys and call decrypter.decrypt(encryptedMessage)
(if the original message was of type Data
) or decrypter.decryptText(encryptedMessage)
(if the original message was of type String
). Once again, you'll find the code in the ECB directory because the decryption is applied to each block individually and then the blocks are put together to get the original message. The RSA protocol only defines what happens to a single block. Recall that for a block , Alice sends Bob the number . To decrypt, Bob first computes the inverse of in the number system. That means, Bob finds a number such that . Then he computes , and a well-known mathematical theorem guarantees that this number is exactly the original block .
Finding the number is easy if you know the primes and . This is done by calling keys.generateDecryptionParameters(...)
, which internally uses Euclides's Greatest Common Divisor algorithm (yes, the one you learned in school). But as far as we know it is very very hard if you don't know and , and this is the assumption RSA is built on.
Earlier this year we published an article about quantum computers and mentioned that they can be used to break RSA encryption using Shor's algorithm. We also mentioned that this algorithm does not find the prime factors of the RSA modulo, which left a lot of people confused and waiting for a well-deserved explanation.
To be clear, quantum computers can factor large integers quickly, and Shor's algorithm can be used as part of the process. What we want to show is that Shor's algorithm alone is not a factoring routine, and that you don't have to factor the RSA modulo to decrypt messages.
The famous quantum computing algorithm written by Peter Shor is neither a factorization nor a decryption algorithm. It computes the period of a specific function. The function is of the form where and are fixed and is the variable. As you increment , this function takes on different values for a while, but eventually starts to repeat a pattern. The number of distinct values before the function starts to repeat itself is what is called its period, and is the output of Shor's algorithm. For example, if and , the values will be (starting at ): 1, 2, 4, 3, 1, 2, 4, 3, ... so we can see that the period is 4. When is large, finding the period is extremely difficult, but a quantum computer can do it.
In our framework, due to the current lack of a quantum computer here at the office, we compute periods by trial and error, also called the brute force method, and therefore we can only do it for very small . The class that takes care of this is the PeriodOracle
. We also added an extension to UIntRSAKeys
with the static method small(maxPrime:)
(obviously not a part of the RSA protocol), which creates RSA keys with small prime numbers so that you can actually try out the PeriodOracle
(check out Period/SmallKeys.swift).
Going back to our usual example, we have the original block , the RSA modulo and the exponent and from these Alice computed the encrypted block . Notice that , and are all public, so an attacker, usually named Eve in textbooks, would know them. If Eve used a quantum computer with Shor's algorithm, she could compute the period of the function . Similarly to what Bob did for decryption, she would then compute the inverse of in the number system. In other words, Eve would find a number such that and then compute , which is exactly the original block .
➕➖ Gimme more math. I can take it!
First, notice that . Since , there is some integer such that , so .
It is easy to see that the period with base is the same as the period with base , so . In particular, . Eve would have thus decrypted Alice's message and never knew or needed Bob's secret prime factors and .
This whole process is implemented by our class PeriodDecrypter
, which has the same interface as Decrypter
(namely the DecrypterProtocol
), except you don't need the private keys to initialize it, but only the RSA modulo instead.
The home page of the project on GitHub gives a more detailed description of the public API, without going into implementation details or the theory of RSA encryption. It also shows how to install the framework using CocoaPods (Mac) or the Swift Package Manager (Mac and Linux). Finally, if you want to start testing right away, the repository also contains a Swift Playground where the entire public API is used.
Want to try it out?
Go to GitHub