Even though the ping-ponging keys are vulnerable to future quantum attacks, they are broadly believed to be secure against today’s attacks from classical computers. The Signal Protocol developers didn’t want to remove them or the battle-tested code that produces them. That led to their decision to add quantum resistance by adding a third ratchet. This one uses a quantum-safe KEM to produce new secrets much like the Diffie-Hellman ratchet did before, ensuring quantum-safe, post-compromise security.
The technical challenges were anything but easy. Elliptic curve keys generated in the X25519 implementation are about 32 bytes long, small enough to be added to each message without creating a burden on already constrained bandwidths or computing resources. A ML-KEM 768 key, by contrast, is 1,000 bytes. Additionally, Signal’s design requires sending both an encryption key and a ciphertext, making the total size 2,272 bytes.
And then there were three
To handle the 71x increase, Signal developers considered a variety of options. One was to send the 2,272-byte KEM key less often—say every 50th message or once every week—rather than every message. That idea was nixed because it doesn’t work well in asynchronous or adversarial messaging environments. Signal Protocol developers Graeme Connell and Rolfe Schmidt explained:
Consider the case of “send a key if you haven’t sent one in a week”. If Bob has been offline for 2 weeks, what does Alice do when she wants to send a message? What happens if we can lose messages, and we lose the one in fifty that contains a new key? Or, what happens if there’s an attacker in the middle that wants to stop us from generating new secrets, and can look for messages that are [many] bytes larger than the others and drop them, only allowing keyless messages through?
Another option Signal engineers considered was breaking the 2,272-byte key into smaller chunks, say 71 of them that are 32 bytes each. Breaking up the KEM key into smaller chunks and putting one in each message sounds like a viable approach at first, but once again, the asynchronous environment of messaging made it unworkable. What happens, for example, when data loss causes one of the chunks to be dropped? The protocol could deal with this scenario by just repeat-sending chunks again after sending all 71 previously. But then an adversary monitoring the traffic could simply cause packet 3 to be dropped each time, preventing Alice and Bob from completing the key exchange.
