View on GitHub

Quantum Key Distribution (QKD) in OpenSSL

A report describing how we implemented QKD for OpenSSL during the RIPE Quantum Internet Hackathon 2019

This is part 1 in a multi-part report describing how we implemented Quantum Key Distribution (QKD) in OpenSSL as part of the pan-European quantum Internet hackathon in Delft on 5 and 6 November 2019. See the main page of this report for the other parts. – Bruno Rijsman

Security in the classical Internet.

Security challenges in the Internet: authentication, confidentiality, and integrity.

Imagine that you use your web browser to connect to your bank’s website to transfer some money. Some of the main network security challenges in this transaction are:

  1. Authentication: You want to validate (i.e. make sure) that the website that you are connected to is really the bank’s website and not some fake website that looks exactly like your bank’s website but was created by a criminal to steal your username and password.

  2. Confidentiality: You want all the traffic between your browser and the bank’s website to be encrypted so that it cannot be seen by some malicious snooper in the middle. For example you don’t want a thief who is on the same public Wifi to steal your username and password. Frankly, you don’t even want your Internet service provider to know what terms you are searching for on Google.

  3. Integrity: You want to make sure that the traffic that is exchanged between the browser and the website is not modified by some malicious actor in the middle. For example, you don’t want that thief who is on the same public Wifi to change the bank account in the transfer to her own account.

These challenges don’t only apply to browsing on the Internet, but also to other applications such as e-mail, messaging, file transfers, machine-to-machine communications, etc.

There are other security challenges such as non-repudiation or anonymity, but in this report we focus on the three that we listed.

The Transport Layer Security (TLS) protocol.

The Transport Layer Security (TLS) protocol is one of the most widely used protocols on the Internet for providing authentication, confidentiality, integrity, and other security functions.

TLS was standardized by the Internet Engineering Task Force (IETF), which is the main standardization organization for Internet protocols.

There are multiple versions of TLS, each described in its own standards document which is called a Request For Comment (RFC) in IETF jargon:

The predecessor for TLS was the Secure Sockets Layer (SSL) protocol, which is now considered obsolete. However, many people still use the older term SSL when they are actually referring to TLS. For example, one of the main open source implementations of TLS is still called OpenSSL.

Symmetric versus asymmetric encryption.

One of the main functions of the TLS protocols is to encrypt all traffic between the two communicating end-points.

Generally speaking, there are two types of encryption:

Symmetric encryption.

In symmetric encryption the two communicating end-points use one and the same key.

The sender encrypts the traffic with a particular key. The receiver decrypts the traffic with that same key. The same key is used in both directions: for client-to-server and server-to-client traffic.

The key must remain secret. Only the two end-points are allowed to know the key. If some malicious attacker also discovers the key, she will be able to tap and decrypt the traffic as it flows from sender to receiver. She can even decrypt the traffic, change it, and re-encrypt the changed traffic, without the two end-points discovering that the message was tampered with.

There are several algorithms (also known as ciphers) for symmetric encryption, such as for example the Advanced Encryption Standard (AES), the International Data Encryption Algorithm (IDEA), and Blowfish.

Each algorithm are multiple variations (so-called modes) and supports multiple key sizes (for example 128 bits, 192 bits, 256 bits etc.)

Symmetric encryption is very fast and can be implemented in hardware. As a random example, the Juniper PTX10K-LC1105 line card has 30 MacSec Ethernet ports, where each port can do 100 Gbps AES256 encryption at wire-speed, for a total of 3 Tbps of encryption and decryption per card.

Asymmetric encryption.

In asymmetric encryption the two communicating end-points use different keys:

For obvious reasons, asymmetric encryption is also known as public key encryption or private-public key encryption.

The best known asymmetric encryption algorithm is called Rivest Shamir Adleman (RSA) after the names of inventors.

The problem with asymmetric encryption is that it is slow. Specifically, it is not fast enough for line rate encryption of large volumes of traffic. For that reason, TLS always uses symmetric encryption for encrypting and decrypting the application traffic.

Still, the same mathematical principles of asymmetric encryption are used to address some other problems in network security, including key agreement and secure signing. We will discuss the former in more detail when we talk about the key distribution problem below.

The security of classical encryption.

As the power of classical computers increases over time and as new classical attack vectors are discovered on encryption algorithms, some protocols become obsolete over time because they are not considered to be secure anymore. Or at least, the required key sizes increase over time. For example, the Data Encryption Standard (DES) was widely used for many years but is not considered to be secure anymore and has been replaced by AES.

The security of asymmetric encryption protocols is based on the fact that it is infeasible to factor large numbers in to prime factors. At least, it is infeasible for classical computers to do the factorization. In the 1990s an efficient quantum algorithm was discovered that quantum computer to factor large numbers. As a result, most existing asymmetric encryption protocols are not secure if the adversary has access to a sufficiently large and reliable quantum computer (no such computer is known to exist at this time.)

We will go into much more detail in part 2 (Quantum computing breaks and fixes classical security) of this report. We bring it up here to make an important point: only asymmetric encryption is known to be vulnerable to a quantum attack; as far as we know (but this is not formally proven) symmetric encryption is not vulnerable to quantum attack.

The key distribution problem

We already mentioned that for performance reasons symmetric encryption (as opposed to asymmetric encryption) is used to encrypt high-speed traffic. TLS, for example, always uses symmetric encryption for application traffic.

Using symmetric encryption introduces a problem: how can the two end-points of the connection (for example the browser in your home and the website in some data center on the other side of the world) agree on the encryption key that is used to encrypt and decrypt the traffic? This is called the key distribution problem or the key agreement problem or the key exchange problem.

One obvious method is to agree on the key to be used in advance, before the communication starts, and to share the key using some secure out-of-band mechanism. This approach is called using Pre-Shared Keys (PSK). Imagine a spy and her handler physically exchanging papers with keys (so-called one time pads) before the spy leaves to infiltrate the enemy. This is clearly not feasible or the Internet: a website cannot possibly have a set of pre-shared keys for every browser that could potentially visit the website.

Another method for implementing key agreement is based on the same mathematical principles as asymmetric encryption.

The basic idea is very simple to understand:

There are some additional technicalities that allow S to verify that the public key is actually owned by R. This involves concepts such as public key certificates, Certificate Authorities (CA), digital signatures etc. that are beyond the scope of this report.

This above protocol is known as RSA Key Transport, but for various reasons this protocol is now deprecated.

The actual protocol that is used in the Internet today for key agreement is known as the Diffie-Hellman protocol. It does not use asymmetric encryption such as RSA directly, but it is based on the same mathematical principles.

The Diffie-Hellman key exchange protocol.

It turns out that there is a way for two communicating parties to dynamically agree on a new symmetric encryption key (the session key) for each communication session.

The browser and the website and exchange a series of messages, following a pre-determined protocol, and at the end of that message exchange, the browser and the website will have agreed on a so-called shared secret that is only known to the browser and to the website and not to anyone else. The browser and the website can use this shared secret as the symmetric encryption key.

The amazing thing is that this statement (“the shared secret is only know to the browser and to the website and not to anyone else”) is true even if:

(a) the key agreement protocol messages are sent in the clear (i.e. not encrypted), and (b) some malicious attacker is wiretapping the connection and observing all the messages that are exchanged during the key agreement protocol.

When I first read about this it blew my mind. Think about it: two random strangers can meet for the first time. They talk a little bit, and at the end of the conversation they both agree on some secret number. I am standing right next to them, and I can hear everything they are saying, but I cannot figure out what the secret number is. And these are random strangers who have never met before and how don’t know anything about each other (so they can’t say “the secret number is my birth year” for example). How is that even possible?

The most widely used algorithm for dynamically agreeing on a shared secret is called the Diffie-Hellman (DH) algorithm. The mathematical details of how it works are surprisingly simple (see below for details).

There are actually a number of variations on the Diffie-Hellman algorithm:

When your browser connects to a secure website, in most cases some variation of Diffie-Hellman (usually ECDHE) is used to implement key agreement. And in many cases, the OpenSSL library is used to implement Diffie-Hellman.

The Diffie-Hellman algorithm details.

The basic Diffie-Hellman algorithm (as opposed to the elliptic curve Diffie-Hellman algorithm ECDH) works as follows.

First you have to understand the concept of modular arithmetic. In modulo N math, there is only a finite set of numbers from zero to N-1. When you add one number to another number, and the result is greater or equal to N, then the result “rolls over”.

For example, in modulo 7 there are only 7 numbers, namely 0 through 6:

    +---+---+---+---+---+---+---+
    | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+

If you start with the number 3 and you add 2, the result is 5:

                    +1  +2
                   +-+ +-+
                   | | | |  
                   | v | v 
    +---+---+---+---+---+---+---+
    | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
    +---+---+---+---+---+---+---+

    (3 + 2) mod 7 = 5

But if you start with the number 3 and you add 5, the result is 1 (we “rolled over”):

    +4  +5          +1  +2  +3    
 +---+ +-+         +-+ +-+ +-+ +---+
 |   | | |         | | | | | | |   |
 |   v | v         | v | v | v |   |
 |  +---+---+---+---+---+---+---+  |
 |  | 0 | 1 | 2 | 3 | 4 | 5 | 6 |  |
 |  +---+---+---+---+---+---+---+  |
 |                                 |
 +---------------------------------+

    (3 + 5) mod 7 = 1

Once we have defined modular addition, it is straightforward to define modular subtraction (the inverse of addition), multiplication (repeated addition), division (the inverse of multiplication), etc.

Now that we are clear on modular math, we can explain the Diffie-Hellman algorithm:

Example packet trace showing a Diffie-Hellman exchange.

Let’s have a look at the Diffie-Hellman algorithm in action in the real world. I use a browser (Safari in this example) to visit a secure website (https://www.google.com/ in this example). In this scenario, the browser acts as the TLS client, and the website acts as the TLS server. I use the WireShark protocol analyzer to capture and analyze the TLS traffic.

The following screenshot shows the TLS traffic between my browser and the Google website (I have filtered the traffic to show only a single TLS session).

In this case the client and server negotiate the use of Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) whose parameters are a little bit different than the g and p of regular Diffie-Hellman, the the flow of the messages is the same.

We only show a subset of the exchanged TLS messages, namely the ones that illustrate how the Diffie-Hellman exhange works.

Session Overview

HTTPS WireShark Entire Session

Let’s zoom in on some specific packets to see the details of the Diffie-Hellman exchange.

TLS Client Hello

In the following screenshot we see the TLS Client Hello message that the client sends to the server.

It contains a list, sorted in order of preference, of proposed Cipher Suites that the client supports from which the server can choose. Each cipher suite proposed, amongst other things, a specific key exchange protocol such as for example Elliptic Curve Diffie Hellman-Ephemeral (ECDHE).

HTTPS WireShark Client Hello

TLS Server Hello

In the following screenshot we see the TLS Server Hello message that the server sends to the client.

It contains the Cipher Suite that the server has chosen.

HTTPS WireShark Server Hello

TLS Server Certificate Status

In the following screenshot we see the TLS Certificate Status message that the server sends to the client.

The Server Key Exchange field contains the public Elliptic Curve (EC) Diffie-Hellman Parameters as well as the Public Key chosen by the server.

HTTPS WireShark Certificate Status

TLS Server Client Key Exchange

In the following screenshot we see the TLS Client Key Exchange message that the client sends to the server.

It contains the Diffie-Hellman Public Key chosen by the client.

HTTPS WireShark Client Key Exchange

Encrypted Application Data

At this point both the server and the client have all the information that they need to each compute the shared secret (which never appears on the wire).

From here on out, both side use the shared secret to encrypt the application traffic.

First we see a message with encrypted application data from the client to the server. This actually contains the encrypted HTTP GET request (we know that, but we cannot see that in the encrypted packet).

HTTPS WireShark Client Encrypted Application Data

Then we see a message with encrypted data going back from the server to the client. This actually contains the encrypted HTTP GET response (we know that, but we cannot see that in the encrypted packet).

HTTPS WireShark Server Encrypted Application Data

What is OpenSSL and how does it fit into the picture?

The open source OpenSSL library is widely used to provide security on the Internet. One of the main functions of the OpenSSL library is to implement the Transport Layer Security (TLS) protocol, which forms the basis for the Secure Hypertext Transfer Protocol (HTTPS), which in turn enables secure and private browsing on the Internet.

The name OpenSSL comes from the Secure Sockets Layer (SSL) protocol, a now outdated predecessor for the TLS protocol.