Noise is a framework that can be used to construct secure channel protocols. Noise takes a fairly basic set of cryptographic operations and allows them to be combined in ways that provide various security properties. Noise is not a protocol itself: it is a protocol framework, so by "filling in the blanks" you get a concrete protocol that has essentially no knobs to twist. We’ll use the term “Noise protocol” to refer to a concrete protocol, and “Noise framework” to refer to the framework overall.
Every Noise protocol begins with a handshake that follows a particular pattern. The end result of a Noise handshake is an encrypted channel that provides various forms of confidentiality, integrity, and authenticity guarantees. Which of these guarantees you get depends on which handshake pattern is used, but a collection of standard handshakes with known security properties is provided. The Noise framework is fully agnostic to what is actually transmitted via the encrypted channel established with a handshake. You could transmit messages, video files, or anything else.
Noise is fundamentally based around Diffie-Hellman key agreement. There are many constructions that make use of DH, including perhaps the simplest one: agreeing on a key that is then used directly for symmetric encryption. Noise has several advantages over building your own DH-based protocol. Some of the primary benefits are (1) that the structured nature of the Noise framework allows us to build protocols with exactly the properties we need, as well as analyze whether those properties are present, and (2) that “advanced” properties not provided by a simple DH construction (like message authentication) can be built into Noise protocols with combinations of Diffie-Hellman and the behavior of the Noise state machine. Noise Explorer is a tool that automatically analyzes handshake patterns and graphically shows the security guarantees present at each step of the handshake. I refer to Noise Explorer often when trying to understand new handshake patterns.
The rigidity of a Noise protocol is one of its biggest assets. A web browser, using TLS in lieu of a Noise protocol, might have to connect to a wide variety of servers, each supporting different combinations of cryptographic algorithms. This flexibility on the part of the web browser means that sometimes the browser might use less secure cryptography than it is capable of, or that bugs may be introduced by the logic that handles protocol negotiation. On the other hand, a Noise protocol uses a defined set of cryptographic algorithms and handshake messages that are chosen ahead of time. Noise fits well in homogeneous environments where negotiation is not generally required because both parties run software controlled by the same entity.
I will try to explain a bit more about the Noise framework and why it's neat, but I should mention that the Noise spec is very readable.
There are several reasons why I think the Noise framework is useful:
In short, Noise allows developers to build secure protocols that do not have a lot of surprising behavior.
* Noise supports fallback patterns, which allow for some negotiation in circumstances that cause an initial handshake to fail, such as when a long-term static key has changed. This is very limited compared to, say, TLS.
A Noise protocol begins with two parties exchanging handshake messages. During this handshake phase the parties exchange DH public keys and perform a sequence of DH operations, hashing the DH results into a shared secret key. After the handshake phase each party can use this shared key to send encrypted transport messages.
The Noise framework supports handshakes where each party has a long-term static key pair and/or an ephemeral key pair. A Noise handshake is described by a simple language. This language consists of tokens which are arranged into message patterns. Message patterns are arranged into handshake patterns.
A message pattern is a sequence of tokens that specifies the DH public keys that comprise a handshake message, and the DH operations that are performed when sending or receiving that message. A handshake pattern specifies the sequential exchange of messages that comprise a handshake.
A handshake pattern can be instantiated by DH functions, cipher functions, and hash functions to give a concrete Noise protocol.
A handshake involves two parties, the initiator and the responder. Once a Noise handshake is completed, the result is an AEAD-protected transport channel, but it's also important to note that arbitrary message payloads can be transmitted during the handshake phase, before the full handshake is complete. This allows immediate transmission of protocol messages without the full round trip delay of the handshake. Payloads transmitted alongside handshake messages are partially protected, and will have different security guarantees depending on which handshake message they are attached to.
Whenever encrypted information is transmitted during a handshake (after keying material has been established, usually after the first Diffie-Hellman), the hash of the handshake transcript so far is included as the "associated data" in AEAD. This helps ensure that both parties have the same view of the handshake, even if the encrypted payload is empty.
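To make the "same view of the handshake" idea concrete, here is a minimal sketch of a running transcript hash. It follows the Noise spec's MixHash rule, h = HASH(h || data), with SHA-256 assumed as the hash. The starting value and the message contents are placeholders invented for this illustration, and no actual AEAD encryption is performed.

```python
import hashlib
from typing import Iterable


def mix_hash(h: bytes, data: bytes) -> bytes:
    """Fold data into the running handshake hash: h = SHA-256(h || data)."""
    return hashlib.sha256(h + data).digest()


def transcript_hash(initial_h: bytes, messages: Iterable[bytes]) -> bytes:
    """Hash every handshake message, in order, into a single value."""
    h = initial_h
    for message in messages:
        h = mix_hash(h, message)
    return h


# Placeholder starting value and messages (assumptions for this sketch).
initial_h = bytes(32)
honest = [b"initiator ephemeral pubkey", b"responder ephemeral pubkey"]
tampered = [b"attacker ephemeral pubkey", b"responder ephemeral pubkey"]

initiator_h = transcript_hash(initial_h, honest)
responder_h = transcript_hash(initial_h, honest)
mitm_h = transcript_hash(initial_h, tampered)

# Same transcript gives the same hash; any change gives a different one.
print(initiator_h == responder_h, initiator_h == mitm_h)
```

Because this value is passed as associated data, a party whose transcript differs would fail to authenticate the ciphertext, even when the payload itself is empty.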
The quote above mentions that the initiator and responder can each have a long-term static key pair and/or an ephemeral key pair. Noise handshake patterns are named after the state of these long-term static keys: NK, IK, XN, etc. The first letter indicates the status of the initiator's long-term static key, and the second letter indicates the status of the responder's long-term static key. All Noise handshakes involve some combination of transmitting public keys and performing Diffie-Hellman operations. Static keys are used to provide long-term participant identity, so you can confirm that the party you’re talking to today is the same party you were talking to yesterday.
All the standard handshake patterns require an exchange of ephemeral keys: this is done to provide forward secrecy, so that a later compromise of long-term static keys would not reveal the plaintext contents of previous communications. Noise has this property in common with TLS 1.3, which also requires the exchange of ephemeral keys, an upgrade from previous versions of TLS where it was optional. Some Noise protocols also offer identity hiding properties, depending on when the static keys are transmitted.
Letter | Meaning |
---|---|
N | No long-term static key is present |
K | The long-term static key is Known to the other party before the handshake |
X | The long-term static key is transmitted (Xmitted) to the other party during the handshake |
I | The long-term static key (for the initiator) is Immediately transmitted to the responder, despite absent/reduced identity hiding |
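As a small illustration of the naming scheme, the table can be turned into a lookup that describes a two-letter pattern name. This is only a sketch: the function name `describe_pattern` and the wording of the descriptions are made up for this example, and it ignores modifiers such as PSK suffixes.

```python
# Meaning of each letter, paraphrased from the table above.
KEY_STATUS = {
    "N": "no static key",
    "K": "static key known to the other party before the handshake",
    "X": "static key transmitted during the handshake",
    "I": "static key transmitted immediately",
}


def describe_pattern(name: str) -> dict:
    """Describe a two-letter handshake pattern name such as 'NK'."""
    if len(name) != 2:
        raise ValueError("expected a two-letter pattern name such as 'NK'")
    initiator, responder = name[0], name[1]
    if initiator not in KEY_STATUS or responder not in KEY_STATUS:
        raise ValueError(f"unknown letter in pattern name {name!r}")
    if responder == "I":
        # 'I' only describes the initiator's static key.
        raise ValueError("'I' is only valid as the first letter")
    return {
        "initiator": KEY_STATUS[initiator],
        "responder": KEY_STATUS[responder],
    }


print(describe_pattern("NK"))
print(describe_pattern("IK"))
```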
Handshakes are represented textually using a standard format: an arrow signifying the direction of communication followed by a sequence of tokens that describe state machine operations. You will see this "ASCII art" format whenever handshake patterns are described in the Noise specification or elsewhere.
During a handshake, each party transmits its ephemeral and/or static public keys, and performs DH operations between the ephemeral and/or static public keys of both parties. In fact, there are only six possible tokens (barring PSKs, which we will discuss later): e, s, ee, es, se, and ss.
Commas separate each token in the same step of the handshake and indicate that the associated action occurs before the next token is processed.
Here is the NN Noise handshake pattern. NN means that neither party has a long-term static key, so the handshake is based entirely on ephemeral keys. The handshake pattern is:
-> e
<- e, ee
This pattern represents an unauthenticated DH handshake.
The first thing to notice is that e and ee are not messages, per se -- they are tokens processed by the state machines of both parties. Some tokens (e and s), but not all (e.g. ee, es), lead to messages being sent.
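The textual format is simple enough to parse in a few lines. The sketch below splits one line of a handshake pattern into its direction and tokens, and separates the tokens that put a public key on the wire (e and s) from the DH tokens that only update local state. The token names come from the Noise spec; the function names are invented for this example, and PSK tokens are deliberately left out.

```python
from typing import List, Tuple

KEY_TOKENS = {"e", "s"}                # a public key is written into the message
DH_TOKENS = {"ee", "es", "se", "ss"}   # a DH is performed, nothing is sent


def parse_message_pattern(line: str) -> Tuple[str, List[str]]:
    """Parse a line such as '-> e, es' into (sender, tokens)."""
    arrow, _, rest = line.strip().partition(" ")
    if arrow not in ("->", "<-"):
        raise ValueError(f"expected '->' or '<-', got {arrow!r}")
    sender = "initiator" if arrow == "->" else "responder"
    tokens = [token.strip() for token in rest.split(",")]
    for token in tokens:
        if token not in KEY_TOKENS | DH_TOKENS:
            raise ValueError(f"unknown token {token!r}")
    return sender, tokens


def transmitted_keys(tokens: List[str]) -> List[str]:
    """Return only the tokens that cause a public key to be transmitted."""
    return [token for token in tokens if token in KEY_TOKENS]


sender, tokens = parse_message_pattern("<- e, ee")
print(sender, tokens, transmitted_keys(tokens))
```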
Let's look at what each party does during this handshake.
First, the -> arrow indicates that the transmission will be from the initiator to the responder. The e token specifies that the initiator generates an ephemeral keypair and transmits the public key to the responder. The responder receives and stores the initiator's public key. Both parties hash this key into their handshake hash, which will be included as authenticated data in AEAD ciphertext (ensuring that both parties have the same view of the handshake transcript) as soon as a symmetric key is established and the parties begin encrypting messages. The initiator also has the option to transmit a payload alongside this handshake message. If the initiator were to include a payload, it would be sent in the clear, with no authentication.
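Next, the <- arrow indicates that the transmission will be from the responder to the initiator. The e token specifies that the responder generates its own ephemeral keypair and transmits the public key to the initiator. The ee token specifies that both parties perform a Diffie-Hellman between the two ephemeral keys and mix the result into the chaining key. The responder can include a message payload alongside this handshake message. This message would be encrypted, providing message secrecy and some forward secrecy.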
See the analysis of the NN handshake in Noise Explorer for some more information.
At the termination of the handshake, both parties will have a shared symmetric state (technically, two shared symmetric states) that can be used to send encrypted messages back and forth. These transport messages (post-handshake) will benefit from message secrecy and some forward secrecy. Because neither party's key is authenticated, by out-of-band means or otherwise, this scheme is not resistant to an active attacker.
* The chaining key is used as an input to HKDF, which outputs the actual k used for encryption. Each update to the chaining key also results in a new k.
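The chaining key update can be sketched directly from the Noise spec's definitions of HKDF and MixKey, here assuming SHA-256 as the hash function. The DH outputs below are placeholder byte strings rather than real Diffie-Hellman results, and the all-zero starting chaining key is also just an assumption for the demo.

```python
import hashlib
import hmac
from typing import Tuple


def hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def hkdf2(chaining_key: bytes, input_key_material: bytes) -> Tuple[bytes, bytes]:
    """Two-output HKDF as defined in the Noise spec."""
    temp_key = hmac_sha256(chaining_key, input_key_material)
    output1 = hmac_sha256(temp_key, b"\x01")
    output2 = hmac_sha256(temp_key, output1 + b"\x02")
    return output1, output2


def mix_key(ck: bytes, dh_output: bytes) -> Tuple[bytes, bytes]:
    """Return (new chaining key, new encryption key k)."""
    new_ck, k = hkdf2(ck, dh_output)
    return new_ck, k


ck0 = bytes(32)  # placeholder starting chaining key
ck1, k1 = mix_key(ck0, b"placeholder output of first DH")
ck2, k2 = mix_key(ck1, b"placeholder output of second DH")

# Every DH result rolled into the chaining key yields a fresh k.
print(k1.hex())
print(k2.hex())
```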
Let's consider now the NK pattern. The initiator here still has no long-term static identity key, but the responder has a long-term static identity that is known to the initiator (transmitted out of band, or during a previous handshake).
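The handshake pattern is as follows:
<- s
...
-> e, es
<- e, ee
The first step of the handshake pattern is a "pre-message," which just serves to identify that the contents were somehow transmitted before the handshake began. In this case, the pre-message <- s indicates that the responder's static public key was already known to the initiator before the handshake started.
-> e, es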
The initiator generates an ephemeral keypair and transmits the public key to the responder. Transmitted / received messages are always hashed into the handshake hash. Next, both parties perform a Diffie-Hellman between the initiator's ephemeral key and the responder's static key, which is (as always) used to update the chaining key.
Because our chaining key is now based off the responder's long-term static key, which was transmitted out-of-band, any message payload attached to this handshake message benefits from some message secrecy (i.e. given a full transcript of this handshake, the message contents could only be decrypted by an attacker with access to the responder's long-term private key).
<- e, ee
The responder now generates an ephemeral keypair and transmits its public key to the initiator. This handshake message (containing the responder's ephemeral pubkey) benefits from sender authentication since the responder's long-term static identity was used in a Diffie-Hellman. This handshake message also benefits from some message secrecy, since the former DH was used to establish a symmetric key.
Both parties perform a Diffie-Hellman between the initiator's ephemeral key and the responder's ephemeral key, rolling the result into the chaining key and enabling forward secrecy, should the responder’s long-term static key ever be compromised.
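The whole NK exchange can be simulated in one process to check that both sides end up with the same chaining key. This is a toy sketch only. It swaps in finite-field Diffie-Hellman over a small Mersenne prime instead of Curve25519 so that it runs on the standard library alone, it starts from an all-zero chaining key, and it skips the handshake hash and all encryption. None of the parameters here are safe for real use.

```python
import hashlib
import hmac
import secrets

# Toy DH group (assumption for this sketch, NOT a secure choice).
P = 2**127 - 1
G = 3


def generate_keypair():
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def dh(private: int, public: int) -> bytes:
    return pow(public, private, P).to_bytes(16, "big")


def mix_key(ck: bytes, dh_output: bytes) -> bytes:
    """Chaining key update: first HKDF output as defined in the Noise spec."""
    temp_key = hmac.new(ck, dh_output, hashlib.sha256).digest()
    return hmac.new(temp_key, b"\x01", hashlib.sha256).digest()


# Pre-message "<- s": the initiator already knows the responder's static key.
resp_s_priv, resp_s_pub = generate_keypair()
ck_initiator = ck_responder = bytes(32)  # placeholder starting value

# "-> e, es": initiator sends its ephemeral public key, then both sides
# compute DH(initiator ephemeral, responder static).
init_e_priv, init_e_pub = generate_keypair()
ck_initiator = mix_key(ck_initiator, dh(init_e_priv, resp_s_pub))
ck_responder = mix_key(ck_responder, dh(resp_s_priv, init_e_pub))

# "<- e, ee": responder sends its ephemeral public key, then both sides
# compute DH(initiator ephemeral, responder ephemeral).
resp_e_priv, resp_e_pub = generate_keypair()
ck_responder = mix_key(ck_responder, dh(resp_e_priv, init_e_pub))
ck_initiator = mix_key(ck_initiator, dh(init_e_priv, resp_e_pub))

print(ck_initiator == ck_responder)
```

Note that the initiator never uses a static private key of its own, which matches the "N" in NK: only the responder is authenticated.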
During a Noise handshake, each party keeps track of the following variables:
As each token is processed, these variables are updated. The functions supported by the state machine are defined in the Processing Rules section of the Noise specification.
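For orientation, here is one way the per-party state could be laid out in code. The variable names (s, e, rs, re, ck, h, k, n) are the ones the Noise specification uses for its HandshakeState, SymmetricState, and CipherState objects. The class layout, the `KeyPair` alias, and the all-zero starting values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# (private key, public key); this tuple layout is an assumption.
KeyPair = Tuple[bytes, bytes]


@dataclass
class CipherState:
    k: Optional[bytes] = None  # cipher key, empty until the first key is mixed in
    n: int = 0                 # nonce counter


@dataclass
class SymmetricState:
    ck: bytes                  # chaining key
    h: bytes                   # handshake hash
    cipher: CipherState = field(default_factory=CipherState)


@dataclass
class HandshakeState:
    symmetric: SymmetricState
    initiator: bool                # True for the initiator, False for the responder
    s: Optional[KeyPair] = None    # local static key pair
    e: Optional[KeyPair] = None    # local ephemeral key pair
    rs: Optional[bytes] = None     # remote party's static public key
    re: Optional[bytes] = None     # remote party's ephemeral public key
    # The spec's HandshakeState also tracks the remaining message patterns.


state = HandshakeState(
    symmetric=SymmetricState(ck=bytes(32), h=bytes(32)),
    initiator=True,
)
print(state)
```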
Because the handshake pattern is set ahead of time, each state of the state machine has exactly one valid transition to the next state. You can view the possible state transitions as a simple, single-directional chain: there is no input that causes cyclical behavior.
During the handshake phase, the two parties share a single symmetric cipher state. Once a Noise handshake is completed, this state is split into two cipher states, one for each direction of communication. Each of the newly-created ciphers uses a key derived from an HKDF with the chaining key as input.
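The split step can be sketched from the Noise spec's Split() definition, which runs HKDF over the final chaining key with zero-length input and uses the first output for the initiator-to-responder direction and the second for the reverse. SHA-256 is assumed, and the final chaining key below is a placeholder value.

```python
import hashlib
import hmac
from typing import Tuple


def hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def split(ck: bytes) -> Tuple[bytes, bytes]:
    """Derive two transport keys from the final chaining key."""
    temp_key = hmac_sha256(ck, b"")  # zero-length input key material
    k1 = hmac_sha256(temp_key, b"\x01")
    k2 = hmac_sha256(temp_key, k1 + b"\x02")
    return k1, k2


final_ck = hashlib.sha256(b"placeholder final chaining key").digest()

# Initiator: first key encrypts outgoing messages, second decrypts incoming.
initiator_send, initiator_recv = split(final_ck)
# Responder: same derivation, opposite roles.
responder_recv, responder_send = split(final_ck)

print(initiator_send == responder_recv, initiator_recv == responder_send)
```

Using a separate key per direction means each side can keep its own nonce counter without coordinating with the other.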
At this point, the handshake is complete and there is nothing Noise-specific about communicating over the encrypted channels produced by the handshake. Noise does specify a rekey operation that could be triggered by an application-specific message to rotate keys any time after the handshake has been completed.
Noise supports several other features outside of the handshake patterns that we haven't yet talked about.
Prologues can be used to ensure that both parties have identical views of data -- to ensure that a MITM attack hasn't occurred between the two users before the handshake commences, for example. Prologues will cause the handshake to fail if both parties do not have the same prologue data, but prologues are not considered to be secret data and are not mixed into encryption keys.
Noise also supports pre-shared keys. PSKs can be used to provide message secrecy (and some form of message authentication) before any other handshake operations have occurred. Noise patterns that use PSKs are named by appending "pskZ" to the name of the handshake, where "Z" is a number indicating where the psk token is inserted into the handshake.
Let's take NNpsk0 for example. Remember that the original NN handshake is:
-> e
<- e, ee
NNpsk0 is NN with the PSK token included at the beginning of the first handshake message. The suffixes 1, 2, etc. place the PSK token at the end of the first, second, etc. messages respectively. The NNpsk0 handshake pattern is:
-> psk, e
<- e, ee
As a PSK is pre-shared by definition, the psk token doesn't actually cause either party to transmit anything to the other. The psk token is processed by both parties mixing the PSK into their cipher state.
In particular, this token is processed by each party calling MixKeyAndHash(psk) (defined in the Noise spec), which updates both the chaining key and the handshake hash. To ensure forward secrecy and avoid catastrophic reuse of cipher keys, the Noise protocol framework does not allow for the transmission of encrypted data after just processing the psk token.
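A minimal sketch of MixKeyAndHash, following the Noise spec's definition (a three-output HKDF whose outputs become the new chaining key, a value hashed into h, and a new cipher key), might look like the following. SHA-256 is assumed, and the starting ck and h and the PSK values are placeholders.

```python
import hashlib
import hmac
from typing import Tuple


def hmac_sha256(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def hkdf3(ck: bytes, input_key_material: bytes) -> Tuple[bytes, bytes, bytes]:
    """Three-output HKDF as defined in the Noise spec."""
    temp_key = hmac_sha256(ck, input_key_material)
    output1 = hmac_sha256(temp_key, b"\x01")
    output2 = hmac_sha256(temp_key, output1 + b"\x02")
    output3 = hmac_sha256(temp_key, output2 + b"\x03")
    return output1, output2, output3


def mix_key_and_hash(ck: bytes, h: bytes, psk: bytes) -> Tuple[bytes, bytes, bytes]:
    """Return (new ck, new h, new k) after processing a psk token."""
    new_ck, temp_h, temp_k = hkdf3(ck, psk)
    new_h = hashlib.sha256(h + temp_h).digest()  # MixHash(temp_h)
    return new_ck, new_h, temp_k


ck0 = h0 = bytes(32)           # placeholder starting state
psk = b"\x11" * 32             # placeholder 32-byte pre-shared key

initiator = mix_key_and_hash(ck0, h0, psk)
responder = mix_key_and_hash(ck0, h0, psk)
wrong_psk = mix_key_and_hash(ck0, h0, b"\x22" * 32)

print(initiator == responder, initiator == wrong_psk)
```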
When an e token is processed in a PSK handshake, the ephemeral public key is mixed into the handshake hash (as usual) and the chaining key (which is specific to PSK handshakes). This mixing randomizes the symmetric key so that it is not based solely on the PSK. In fact, an e token must be present in a PSK-based handshake, either before or after the psk token.
When we use Noise to build a protocol, we "fill in the blanks" by providing a handshake pattern, an AEAD construction, a hash function, and a DH scheme. Noise prescribes a naming convention for a specified protocol, as follows:
Noise_NK_25519_AESGCM_SHA256
This protocol name contains all the information required for Noise clients to participate in a concrete run of this protocol, giving us a nice human-readable way to specify a protocol. The initial chaining key within the handshake state machine is actually based on the full protocol name, further ensuring that both parties have the same internal model of the protocol they are running.
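As a rough sketch, the name can be split into its parts and turned into the starting handshake hash. The initialization rule comes from the Noise spec: a name no longer than the hash output is padded with zero bytes, a longer one is hashed, and the chaining key starts out equal to h. The function names are invented for this example, SHA-256 is assumed, and names with extra modifiers or multiple algorithms are not handled.

```python
import hashlib


def parse_protocol_name(name: str) -> dict:
    """Split a name like 'Noise_NK_25519_AESGCM_SHA256' into its parts."""
    parts = name.split("_")
    if len(parts) != 5 or parts[0] != "Noise":
        raise ValueError(f"unexpected protocol name {name!r}")
    return {
        "pattern": parts[1],
        "dh": parts[2],
        "cipher": parts[3],
        "hash": parts[4],
    }


def initial_handshake_hash(name: str, hashlen: int = 32) -> bytes:
    """Derive the starting h from the protocol name."""
    raw = name.encode("ascii")
    if len(raw) <= hashlen:
        return raw.ljust(hashlen, b"\x00")  # pad with zero bytes
    return hashlib.sha256(raw).digest()


name = "Noise_NK_25519_AESGCM_SHA256"
fields = parse_protocol_name(name)
h = initial_handshake_hash(name)
ck = h  # the chaining key starts out equal to h

print(fields)
print(h)
```

Two parties that disagree on any part of the name start from different values of h and ck, so their handshakes diverge from the first encrypted message onward.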
Noise is used today in several high-profile projects: