cr.yp.to: 2014.02.05: Entropy Attacks!

Newer (Access-K): 2014.02.13: A subfield-logarithm attack against ideal lattices: Computational algebraic number theory tackles lattice-based cryptography.

Table of contents (Access-I for index page)

2025.04.23: McEliece standardization: Looking at what's happening, and analyzing rationales. #nist #iso #deployment #performance #security

2025.01.18: As expensive as a plane flight: Looking at some claims that quantum computers won't work. #quantum #energy #variables #errors #rsa #secrecy

2024.10.28: The sins of the 90s: Questioning a puzzling claim about mass surveillance. #attackers #governments #corporations #surveillance #cryptowars

2024.08.03: Clang vs. Clang: You're making Clang angry. You wouldn't like Clang when it's angry. #compilers #optimization #bugs #timing #security #codescans

2024.06.12: Bibliography keys: It's as easy as [1], [2], [3]. #bibliographies #citations #bibtex #votemanipulation #paperwriting

2024.01.02: Double encryption: Analyzing the NSA/GCHQ arguments against hybrids. #nsa #quantification #risks #complexity #costs

2023.11.25: Another way to botch the security analysis of Kyber-512: Responding to a recent blog post. #nist #uncertainty #errorbars #quantification

2023.10.23: Reducing "gate" counts for Kyber-512: Two algorithm analyses, from first principles, contradicting NIST's calculation. #xor #popcount #gates #memory #clumping

2023.10.03: The inability to count correctly: Debunking NIST's calculation of the Kyber-512 security level. #nist #addition #multiplication #ntru #kyber #fiasco

2023.06.09: Turbo Boost: How to perpetuate security problems. #overclocking #performancehype #power #timing #hertzbleed #riskmanagement #environment

2022.08.05: NSA, NIST, and post-quantum cryptography: Announcing my second lawsuit against the U.S. government. #nsa #nist #des #dsa #dualec #sigintenablingproject #nistpqc #foia

2022.01.29: Plagiarism as a patent amplifier: Understanding the delayed rollout of post-quantum cryptography. #pqcrypto #patents #ntru #lpr #ding #peikert #newhope

2020.12.06: Optimizing for the wrong metric, part 1: Microsoft Word: Review of "An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development" by Knauff and Nejasmic. #latex #word #efficiency #metrics

2019.10.24: Why EdDSA held up better than ECDSA against Minerva: Cryptosystem designers successfully predicting, and protecting against, implementation failures. #ecdsa #eddsa #hnp #lwe #bleichenbacher #bkw

2019.04.30: An introduction to vectorization: Understanding one of the most important changes in the high-speed-software ecosystem. #vectorization #sse #avx #avx512 #antivectors

2017.11.05: Reconstructing ROCA: A case study of how quickly an attack can be developed from a limited disclosure. #infineon #roca #rsa

2017.10.17: Quantum algorithms to find collisions: Analysis of several algorithms for the collision problem, and for the related multi-target preimage problem. #collision #preimage #pqcrypto

2017.07.23: Fast-key-erasure random-number generators: An effort to clean up several messes simultaneously. #rng #forwardsecrecy #urandom #cascade #hmac #rekeying #proofs

2017.07.19: Benchmarking post-quantum cryptography: News regarding the SUPERCOP benchmarking system, and more recommendations to NIST. #benchmarking #supercop #nist #pqcrypto

2016.10.30: Some challenges in post-quantum standardization: My comments to NIST on the first draft of their call for submissions. #standardization #nist #pqcrypto

2016.06.07: The death of due process: A few notes on technology-fueled normalization of lynch mobs targeting both the accuser and the accused. #ethics #crime #punishment

2016.05.16: Security fraud in Europe's "Quantum Manifesto": How quantum cryptographers are stealing a quarter of a billion Euros from the European Commission. #qkd #quantumcrypto #quantummanifesto

2016.03.15: Thomas Jefferson and Apple versus the FBI: Can the government censor how-to books? What if some of the readers are criminals? What if the books can be understood by a computer? An introduction to freedom of speech for software publishers. #censorship #firstamendment #instructions #software #encryption

2015.11.20: Break a dozen secret keys, get a million more for free: Batch attacks are often much more cost-effective than single-target attacks. #batching #economics #keysizes #aes #ecc #rsa #dh #logjam

2015.03.14: The death of optimizing compilers: Abstract of my tutorial at ETAPS 2015. #etaps #compilers #cpuevolution #hotspots #optimization #domainspecific #returnofthejedi

2015.02.18: Follow-You Printing: How Equitrac's marketing department misrepresents and interferes with your work. #equitrac #followyouprinting #dilbert #officespaceprinter

2014.06.02: The Saber cluster: How we built a cluster capable of computing 3000000000000000000000 multiplications per year for just 50000 EUR. #nvidia #linux #howto

2014.05.17: Some small suggestions for the Intel instruction set: Low-cost changes to CPU architecture would make cryptography much safer and much faster. #constanttimecommitment #vmul53 #vcarry #pipelinedocumentation

2014.04.11: NIST's cryptographic standardization process: The first step towards improvement is to admit previous failures. #standardization #nist #des #dsa #dualec #nsa

2014.03.23: How to design an elliptic-curve signature system: There are many choices of elliptic-curve signature systems. The standard choice, ECDSA, is reasonable if you don't care about simplicity, speed, and security. #signatures #ecc #elgamal #schnorr #ecdsa #eddsa #ed25519

2014.02.13: A subfield-logarithm attack against ideal lattices: Computational algebraic number theory tackles lattice-based cryptography.

2014.02.05: Entropy Attacks! The conventional wisdom says that hash outputs can't be controlled; the conventional wisdom is simply wrong.

2014.02.05: Entropy Attacks! The conventional wisdom says that hash outputs can't be controlled; the conventional wisdom is simply wrong.

The conventional wisdom is that hashing more entropy sources can't hurt: if H is any modern cryptographic hash function then H(x,y,z) is at least as good a random number as H(x,y), no matter how awful z is. So we pile one source on top of another, hashing them all together and hoping that at least one of them is good.

But what if z comes from a malicious source that can snoop on x and y? For example, imagine a malicious "secure randomness" USB device that's actually spying on all your other randomness sources through various side channels, or—worse—imagine RDRAND microcode that's looking at the randomness pool that it's about to be hashed into. I should note that none of the attacks described below rely on tampering with x or y, or otherwise modifying data outside the malicious entropy source; you can't stop these attacks by double-checking the integrity of data.

Of course, the malicious device will also be able to see other sensitive information, not just x and y. But this doesn't mean that it's cheap for the attacker to exfiltrate this information! The attacker needs to find a communication channel out of the spying device. Randomness generation influenced by the device is a particularly attractive choice of channel, as I'll explain below.

Here's an interesting example of an attack that can be carried out by this malicious source:

This attack forces H(x,y,z) to start 0000, even if x and y were perfectly random. It's fast, taking just 16 computations of H on average.

Maybe the randomness generator doesn't actually output H(x,y,z); it uses H(x,y,z) as a seed for some generator G, and outputs G(H(x,y,z)). Okay: the attacker changes H to G(H), and again forces the output G(H(x,y,z)) to start 0000. Similarly, the attack isn't stopped by pre-hashing of the entropy source before it's mixed with other entropy sources. Every mix from the malicious entropy source lets the attacker produce another "random" number that starts 0000.

More generally, instead of producing "random" numbers that start with 0000, 0000, 0000, etc., the malicious entropy source can produce "random" numbers that start with successive 4-bit components of AES_k(0),AES_k(1),... where k is a secret key known only to the attacker. Nobody other than the attacker will be able to detect this pattern.

Recall that DSA and ECDSA require random "nonces" for signatures. It's easy to imagine someone grabbing each nonce as a new "random" number from the system's randomness generator. However, it's well known (see, e.g., https://www.isg.rhul.ac.uk/~sdg/igor-slides.pdf) that an attacker who can predict the first 4 bits of each nonce can quickly compute the user's secret key after a rather small number of signatures. Evidently hashing an extra entropy source does hurt—in the worst possible way; the attacker has the user's secret key!—contrary to the conventional wisdom stated above.

EdDSA (see https://ed25519.cr.yp.to) is different. It uses randomness once to generate a secret key and is then completely deterministic in its signature generation (following 1997 Barwood, 1997 Wigley, et al.). The malicious entropy source can still control 4 bits of the secret key, speeding up discrete-log attacks by a factor of 4, but this isn't a problem—we use curves with ample security margins. The source can increase the 4 bits by carrying out exponentially more H computations, but this has to fit into the time available after inspecting x and y and before generating a "random" number.

Of course, there are many other uses of randomness in cryptography: for example, if you want forward secrecy then you're constantly generating new ECDH keys. Controlling 4 bits of each secret key isn't nearly as damaging as controlling 4 bits of DSA/ECDSA nonces—it's the same factor of 4 mentioned above—but, as I mentioned above, the malicious entropy source can also use randomness generation as a communication channel to the attacker. For example, the source controls the low bit of each public key with an average cost of just 2 public-key generations, and uses the concatenation of these low bits across public keys to communicate an encryption of the user's long-term key. This channel is undetectable, reasonably high bandwidth, and reasonably low cost.

On the other hand, there's no actual need for this huge pile of random numbers. If you've somehow managed to generate one secure 256-bit key then from that key you can derive all the "random" numbers you'll ever need for every cryptographic protocol—and you can do this derivation in a completely deterministic, auditable, testable way, as illustrated by EdDSA. (If you haven't managed to generate one secure 256-bit key then you have much bigger problems.)

With this as-deterministic-as-possible approach, the entire influence of the malicious entropy source is limited to controlling a few "random" bits somewhere. There are at least two obvious ways to further reduce this control:

Let me emphasize that what I'm advocating here, for security reasons, is a sharp transition between

This is exactly the opposite of what people tend to do today, namely adding new entropy all the time. The reason that new entropy is a problem is that each addition of entropy is a new opportunity for a malicious entropy source to control "random" outputs—breaking DSA, leaking secret keys, etc. The conventional wisdom says that hash outputs can't be controlled; the conventional wisdom is simply wrong.

(There are some special entropy sources for which this argument doesn't apply. For example, an attacker can't exert any serious control over the content of my keystrokes while I'm logged in; I don't see how hashing this particular content into my laptop's entropy pool can allow any attacks. But I also don't see how it helps.)

Is there any serious argument that adding new entropy all the time is a good thing? The Linux /dev/urandom manual page claims that without new entropy the user is "theoretically vulnerable to a cryptographic attack", but (as I've mentioned in various venues) this is a ludicrous argument—how can anyone simultaneously believe that

There are also people asserting that it's important for RNGs to provide "prediction resistance" against attackers who, once upon a time, saw the entire RNG state. But if the attacker sees the RNG state that was used to generate your long-term SSL keys, long-term PGP keys, etc., then what exactly are we gaining by coming up with unpredictable random numbers in the future? I'm reminded of a Mark Twain quote:

We obviously need systems that can maintain confidentiality of our long-term keys. If we have such systems, how is the attacker supposed to ever see the RNG state in the first place? Maybe "prediction resistance" can be given a theoretical definition for an isolated RNG system, but I don't see how it makes any sense for a complete cryptographic system.

[Advertisement: If you're interested in these topics, you might want to join the randomness-generation mailing list; I sent this there a few days ago. To subscribe, send email to randomness-generation+subscribe@googlegroups.com. There's also a mailing list cleverly named dsfjdssdfsd for discussion of randomness in IETF protocols. Randomness is also a frequent topic on more general cryptographic mailing lists.]

[2023.03.17 update: I should note related work from 2006, namely Section 3.2 of "Analysis of a strong pseudo random number generator" by Thomas Biege. The 2006 paper and this blog post are both analyzing what a malicious entropy source can do. After that setup, the 2006 work is explaining how a linear hash function gives a malicious entropy source full control over outputs, whereas this blog post is explaining how a modern cryptographic hash function gives a malicious entropy source partial control over outputs.]

The cr.yp.to blog

2014.02.05: Entropy Attacks! The conventional wisdom says that hash outputs can't be controlled; the conventional wisdom is simply wrong.