The cr.yp.to blog



2023.06.09: Turbo Boost: How to perpetuate security problems. #overclocking #performancehype #power #timing #hertzbleed #riskmanagement #environment

The critical warnings. Here's something I tweeted on 21 June 2022:

New resource page available on timing attacks, including recommendations for action to take regarding overclocking attacks such as #HertzBleed: https://timing.attacks.cr.yp.to Don't wait for the next public overclocking attack; take proactive steps to defend your data against compromise.

Billy Bob Brumley and I had put that web page together, explaining how timing attacks work, how dangerous overclocking attacks are, and how to write constant-time software. In this blog post I'll use the abbreviation TAO to refer to https://timing.attacks.cr.yp.to/overclocking.html, the overclocking part of the timing-attacks page.

As context, Hertzbleed had been announced a week earlier, on 14 June 2022, and had been demonstrated extracting secret keys from the official software for SIKE running on various Intel and AMD CPUs.

SIKE was, at the time, a high-profile candidate for post-quantum encryption. It was one of just two candidates selected for a large-scale experiment run by Google and Cloudflare in 2019. It was backed by a "Case for SIKE" paper in 2021 advertising "A decade unscathed", and by $50000 in prize money from Microsoft for solving small SIKE challenges.

The next month, NIST selected SIKE as one of just four post-quantum encryption schemes that it would continue considering for standardization, beyond its initial selection of Kyber.

After SIKE's security was publicly smashed, various cryptographers claimed that there had been "no attack progress for ~12 years" and that the attack was "without any warning".

In fact, if you check a 2018 video of a talk to thousands of people at CCC, you'll find Tanja Lange at 48:25 saying "At this moment I think actually CSIDH has a better chance than SIKE of surviving, but who knows. Don't use it for anything yet"—evidently warning against both CSIDH and SIKE.

In 2020, I tweeted that SIKE "scares me for being too new". In 2021, I disputed the "Case for SIKE" paper: "Most important dispute is regarding risk management, [Sections] 1+8. Recent advances in torsion-point attacks have killed a huge part of the SIKE parameter space, far worse than MOV vs ECDLP."

But the situation in June 2022 was that many people were ignoring the warnings and charging ahead with SIKE deployment. SIKE was in Cloudflare's cryptographic library and Amazon's key-management service. The xx network claimed to be "quantum secured" using SIDH, the core of SIKE.

Hertzbleed extracting secret keys from the official software for SIKE was clearly newsworthy.

What did TAO say about this attack? TAO's critical conclusion was that the security problem was much broader than the SIKE demo. Here's the quote:

I don't use SIKE. Should I be worried?

Yes. The demo was for SIKE, but overclocking attacks are much more broadly applicable to any software handling secrets on these CPUs. Some secrets might be difficult to extract, but the best bet is that followup demos will extract many more secrets. Overclocking attacks are a real threat to security, even bigger than most HertzBleed reports indicate.

TAO presented justification for the prediction of broad applicability:

How does overclocking leak secret data?

A CPU's clock frequency directly affects the time taken by each operation. If the CPU is configured to overclock then the CPU's clock frequency at each moment depends on the CPU's power consumption. The CPU's power consumption depends on the data that the CPU is handling, including secret data.

To summarize, overclocking creates a roadway from secret data to visible timings. When information about secrets is disclosed to attackers, cryptographers presume that attackers can efficiently work backwards to recover the secrets, unless and until there have been years of study quantifying the difficulty of this computation and failing to find ways to speed it up.

Nothing here is specific to SIKE. The HertzBleed paper refers to various SIKE details as part of its demo working backwards from visible timings to secret data, but there are many papers demonstrating how to work backwards from power consumption to secrets in a much wider range of computations. The only safe presumption is that all information about power consumption necessary for those attacks is also leaked by overclocking.
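Here's one way to see that roadway for yourself. This is only a sketch: whether any difference is visible depends on the CPU, its configuration, and the workload, and with overclocking disabled the two timings should be essentially identical.

    /* Sketch: time the same instruction stream on two different inputs.
       On a CPU whose frequency follows its power consumption, the "dense"
       input (more bit flips, more power) may run measurably slower; with
       a fixed clock frequency the two timings should match. Nothing here
       is guaranteed to reproduce on any particular machine. */

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    static double run(uint64_t x)
    {
      struct timespec t0, t1;
      volatile uint64_t acc = x;
      clock_gettime(CLOCK_MONOTONIC, &t0);
      for (long i = 0; i < 200000000L; i++)
        acc = (acc * 0x9e3779b97f4a7c15ULL) ^ x;   /* same instructions for both inputs */
      clock_gettime(CLOCK_MONOTONIC, &t1);
      return (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec);
    }

    int main(void)
    {
      printf("all-zero data: %.3f s\n", run(0));
      printf("dense data:    %.3f s\n", run(0xffffffffffffffffULL));
      return 0;
    }

The instruction stream is identical for both inputs; any timing gap comes from the data, via power, via frequency.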

TAO's critical recommendation was to plug the leak at its source by disabling sensor-dependent frequency variations such as Turbo Boost:

I'm a user. Should I do something right now?

Yes. It is normal, although not universal, for computer manufacturers to provide configuration options that let you take action right now. What's most obviously important is to disable overclocking, but for safety you should also disable underclocking:

If some of your devices do not have obvious ways to disable overclocking, you should try asking the operating-system distributor whether there is a way to disable overclocking, and you should avoid using those devices for any data that you care about.

I'm an operating-system distributor. Should I do something right now?

Yes. By default, you should treat data from all physical monitors, including the power monitors and temperature monitors inside the CPU, as secret, and avoid copying that data to anywhere else. You should scan for OS scripts that check physical monitors, and disable those scripts by default. CPU frequencies are public, so by default you should not put the CPU into a mode where it chooses frequencies based on power consumption. In particular, you should disable overclocking by default. If it is not clearly documented that underclocking is sensor-independent then you should disable underclocking by default. If the CPU is underclocking because it reaches thermal limits then you should set it to minimum clock frequency and advise the user to fix the broken hardware (most commonly a broken fan).
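As one concrete illustration of what "disable overclocking by default" can look like, here's a sketch for a typical Linux machine. The right knob depends on the CPU-frequency driver and the firmware, so treat this as an example, not a recipe.

    /* Sketch: disable turbo/boost on Linux, run as root. The sysfs paths
       below exist on common configurations (intel_pstate, acpi-cpufreq)
       but not on every system; adapt to your hardware, or use the
       firmware/BIOS options instead. */

    #include <stdio.h>

    static int put(const char *path, const char *value)
    {
      FILE *f = fopen(path, "w");
      if (!f) return -1;
      if (fputs(value, f) < 0) { fclose(f); return -1; }
      return fclose(f);
    }

    int main(void)
    {
      /* intel_pstate driver: writing 1 to no_turbo disables Turbo Boost */
      if (put("/sys/devices/system/cpu/intel_pstate/no_turbo", "1") == 0)
        puts("intel_pstate: turbo disabled");
      /* acpi-cpufreq and some other drivers: writing 0 to boost disables boosting */
      else if (put("/sys/devices/system/cpu/cpufreq/boost", "0") == 0)
        puts("cpufreq: boost disabled");
      else
        puts("could not write (not root?) or no recognized sysfs knob; check BIOS/firmware");
      return 0;
    }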

TAO also explained why throwing an ad-hoc wrench into this particular demo was not an adequate response to the attack:

Isn't the new SIKE software supposed to stop HertzBleed?

The new SIKE software stops the HertzBleed demo. The demo uses very simple models and very simple signal processing, making it adequate for showing that something is broken but useless for showing that something is secure.

A pattern we've seen before is the following: the first papers on a particular side channel focus on giving simple demonstrations that the side channel is of interest; there are then overconfident claims of limited impact; these claims are then debunked by subsequent papers. You should expect public overclocking attacks to follow the same path, and you should expect that large-scale attackers have already developed much more advanced attacks.

After writing the page, we heard that Intel had, on 14 June 2022, announced a demo of overclocking attacks against AES hardware built into Intel CPUs. Two papers in 2023, 2H2B and Hot Pixels, then presented demos extracting secrets from implementations of ECDSA and Classic McEliece, and stealing pixels and history from Chrome and Safari. These additional demos confirmed our predictions of broad applicability.

Proactive vs. reactive. So far this makes Hertzbleed sound like a simple story:

The leak was plugged, of course, and everyone lived happily ever after, right?

Well, no, it's not that simple. There were contrary recommendations: most importantly, recommendations to not take immediate action, except for the SIKE code patch to block the initial demo.

Let's look, for example, at the 17 June 2022 version of the Hertzbleed page. For some reason https://archive.org says "This URL has been excluded from the Wayback Machine" regarding https://www.hertzbleed.com, but there are other archive sites, and in particular https://archive.is/Opq1D shows this version of the page. I've saved a PDF for future reference. The page said the following:

Should I be worried?

If you are an ordinary user and not a cryptography engineer, probably not: you don’t need to apply a patch or change any configurations right now. If you are a cryptography engineer, read on. Also, if you are running a SIKE decapsulation server, make sure to deploy the mitigation described below.

This was an inadequate response to overclocking attacks, as illustrated by the demos of ECDSA key extraction and pixel stealing and so on. OS distributors and end users who could have turned on immediate broad-spectrum protections were instead being told "you don’t need to apply a patch or change any configurations right now."

It's not clear from the above quote why the Hertzbleed team thought users "probably" shouldn't be worried. Another part of the same page said that there "might" be broader applicability:

Is my constant-time cryptographic library affected?

Affected? Likely yes. Vulnerable? Maybe.

Your constant-time cryptographic library might be vulnerable if it is susceptible to secret-dependent power leakage, and this leakage extends to enough operations to induce secret-dependent changes in CPU frequency. Future work is needed to systematically study what cryptosystems can be exploited via the new Hertzbleed side channel.

Anyone who has spent time reading the vast literature on power-analysis attacks sees that analyzing Turbo Boost variants of these attacks is going to be a massive evaluation effort. Someone whose career is built on writing attack papers says "Great! We'll have at least ten years of papers on this fascinating topic."

Someone trying to protect users instead says "Yikes! This is going to be a security disaster. Maybe there are some limits, but even years from now we won't be sure about those limits. Fortunately, most of our devices have configuration options to directly address the root cause of the leak. Let's use those options right now."

The challenge of evaluating security risks. The security level of a system is defined by the best possible attack. But what is the best possible attack?

There's always a risk that we've missed attacks. How do we quantify the probability and impact of this risk? This is not an easy question.

Let's say you've seen some examples of the terrible track record of public-key encryption systems that portray non-commutativity as a security improvement.

You then see another public-key encryption system advertising non-commutativity as a security improvement. What's the chance that there's a real security gain this time? What's the chance that there's a security loss, where the attacks were simply obscured by the complications of non-commutativity? How much did I bias the answers to these questions by using the phrase "terrible track record"?

Within the field of study called risk analysis, there's a subfield, cryptographic risk analysis, that directly addresses these questions.

So far this subfield has successfully produced approximately 0 papers from approximately 0 researchers funded by approximately 0 million dollars of grants. Cryptographers generally aren't even aware of risk analysis as an interesting topic of study. Meanwhile risk-analysis researchers tend to focus on case studies that are more approachable than cryptography: train crashes, bank failures, pandemics, etc.

Cryptanalysts have mental models that help them select attack targets, but there's very little documentation of those models. Sometimes the models turn into informal public warnings, such as the SIKE warnings quoted above, but the same warning process is completely unprotected against abuse by charlatans, such as QKD proponents pointing to the categorical risks of public-key cryptography while falsely portraying QKD as risk-free.

Decades of seat-of-the-pants cryptographic risk analysis have led to broad agreement on a few really basic risk-management principles, such as not using an unanalyzed cryptosystem. These principles are sometimes useful for making decisions; I'll apply them later in this blog post, and I hope that they're eventually backed up by scientific risk analysis. But the bigger point here is that, at least for the moment, the community has not established ways to quantify cryptographic risks.

Avoiding demonstrated insecurity is not enough. One way to avoid the difficulties of risk assessment is to misdefine "secure" as "not convincingly demonstrated to be broken".

For example, this misdefinition says that SIKE was "secure" before 2022, and then suddenly became insecure in 2022.

But that's wrong. SIKE was never secure. Looking only at attacks published before 2022 was overestimating its actual security level.

For example, the only reason that the big SIKE experiment in 2019 didn't end up giving away user data to pre-quantum attackers is that the experiment used double encryption, encrypting data with SIKE and with X25519.

One way that leaving out the X25519 layer would definitely have been a security disaster comes from users often sending data in 2019 that still needed confidentiality in 2022. Attackers who had the common sense to record the ciphertext in 2019 could certainly break it in 2022.

A different way where we don't know that it would have been a security disaster, but where that's the only safe presumption, comes from the possibility of the attacker already knowing the SIKE attack algorithm in 2019, even though the public didn't know the attack until 2022.

Concretely, for readers who already know what IDA is and who Coppersmith is: Imagine Coppersmith seeing the SIDH proposal in 2011, smelling blood in SIDH's torsion points, and discussing SIDH with IDA employee Everett Howe, who had exactly the right background to see how to exploit those torsion points.

There are many other examples of what goes wrong if "secure" is misdefined as "not convincingly demonstrated to be broken". For example, this misdefinition says that ECDSA implementations are "secure" against Hertzbleed if there's only a SIKE demo. Why take protective action beyond SIKE if everything other than SIKE is "secure"?

Today the Hertzbleed web page says the following:

Should I be worried?

If you are an ordinary user and not a cryptography engineer, probably not: you don’t need to apply a patch or change any configurations right now.

Update (May 2023):

Our follow-up work has demonstrated that Hertzbleed has wider applicability than first believed.

As a side note, you'd think that a May 2023 publication about Hertzbleed's "wider applicability" would feel obliged to credit TAO for correctly stating in June 2022 that "overclocking attacks are much more broadly applicable" and explaining why. Sure, there wasn't a demo at that point, and demos are useful for adding confidence, but demos are only one corner of the knowledge used by competent defenders.

What I find really astonishing is the next sentence on the Hertzbleed web page:

Fortunately, the risk is still limited as most web pages are not vulnerable to cross-origin iframe pixel stealing.

Great, I guess that answers the worry question! The scope of overclocking attacks is, in alphabetical order, AES, Chrome, Classic McEliece, ECDSA, Safari, and SIKE.

That's (1) the complete list and (2) obviously nothing to worry about. Clearly "you don’t need to apply a patch or change any configurations".

Or, wait, is the word "Update" withdrawing the "don't need" text? Is the page finally admitting that users should take action against the full breadth of the attack?

Let's think ahead to whichever paper presents the next demo of overclocking attacks. That paper will go beyond the limits of the current demos. If the words "security" and "risk" are misdefined by the limits of demos then that paper will decrease "security" and increase "risk". These misdefinitions say that today "the risk is still limited" to what the current demos accomplish, since today the paper "still" hasn't been published. After the paper is published, the misdefinitions will say that the "risk" goes beyond those limits.

A proper risk evaluation, starting from how overclocking attacks work and what the power-analysis literature already says, concludes much more efficiently that this is a broad threat requiring immediate action. That's what TAO already said in June 2022.

Confirmation bias. Suppose that, for whatever reason, you want to believe that a line of attacks isn't a real threat.

It's very easy to pick some limit of what has been demonstrated so far and portray that limit as a fundamental barrier, not because of any serious analysis of whether the limit can be broken and whether the limit matters, but because you want to believe that there's a barrier.

This is a very fast process. Papers on the attack demos will normally be careful to explain the limits of the demos, so you can simply pick your favorite limit from the list.

Let's look, for example, at a 15 June 2022 PCWorld article titled "Don’t panic! Intel says Hertzbleed CPU vulnerability unlikely to affect most users":

Observing CPU scaling in order to identify and then steal a cryptographic key could take “hours or days” according to Intel, even if the theoretical malware necessary to pull off this kind of attack could replicate the kind of sophisticated power monitoring demonstrated in the paper.

While it’s certainly possible that someone will use Hertzbleed to steal data in the future, the extremely specific targetting and technical prowess required means that the danger is reserved mostly for those who are already targets of sophisticated campaigns of attack. We’re talking government agencies, mega-corportations, and cryptocurrency exchanges, though more everyday employees of these entities might also be at risk for their access credentials.

Indeed, the Hertzbleed paper says "Our unoptimized version of the attack recovers the full key from these libraries in 36 and 89 hours, respectively". We all know that any real attack has to finish before the movie ends, which is at most 2 hours. Well, okay, maybe 3 like that old Costner movie, but that's really pushing it. How many people are going to focus on one thing for that long? Also, it's well known that attackers spend their tiny budgets lovingly hand-crafting a separate individualized attack for each targeted person, so it's inconceivable that "most users" could be "targets of sophisticated campaigns of attack".

Sigh.

The PCWorld article also quotes Intel's 14 June 2022 blog post: "While this issue is interesting from a research perspective, we do not believe this attack to be practical outside of a lab environment."

Well, um, sure, it's a paper from academics who configured their own "target server", but how does anyone get from this to believing that the attack wouldn't work outside the "lab"?

For years Matt Green has been claiming that various "academic attacks" have "never been used by a real attacker 'in the wild' ". His selected targets for this claim include "timing" attacks, "elliptic curve side channel" attacks, and "all the obvious error attacks".

When Yehuda Lindell asked him "Why do you think we always know if something is used in the wild?", here was Green's reply:

Because we’ve been investing in side channel attacks on crypto since the 1990s and in that time billions and trillions of dollars of real attacks have been detected against other security systems, so I assume at this point ditto for side channels.

In short, Green's argument for his claim that these "academic" attacks haven't been used is his claim that they haven't been detected, while others have. But this isn't answering Lindell's question. Why should we assume that all used attacks are detected?

Some attacks want to have effects that the victim can see: consider DoS attacks, ransomware, etc. Some further attacks are detected because the attacks are inherently noisy or because poorly trained attackers make mistakes. But why should we assume that stealthy attacks by well-trained attackers are detected, or, more to the point, that they're within Green's awareness of what has been detected?

In response to Green's "error attacks" claim, Tom Ptacek wrote "I can’t tell what you’re arguing, but if it’s that BB98 is impractical in real settings, no, like I said, BB98->RCE is a thing that has happened". Does Green think Ptacek was lying?

Does Green claim that NSA's QUANTUMINSERT attacks were detected when they were first used in 2005? That they were detected before 2013, when they were revealed by the Snowden documents?

For comparison, Fox-IT announced in 2015 that, after learning about the attacks from the Snowden documents, it had built a detector for those attacks.

If all used attacks are detected, and QUANTUMINSERT wasn't detected, then QUANTUMINSERT wasn't used? Snowden made it up?

A scientist formulating "All attacks used by real attackers are detected" as a hypothesis will search for tests to disprove the hypothesis, and, given a reasonable level of attention to the available data, will rapidly succeed at debunking the hypothesis, as the above examples illustrate.

Even given iron-clad evidence of non-detection of the attack, the scientist won't claim that the attack hasn't been used. Such a claim would be overstating what's known.

Confirmation bias works differently:

Green's narrative about real attacks is, in Green's words, "intended to question" choices of "how to devote defensive time". Green has well over 100000 Twitter followers, including journalists and people deciding how research funding is spent. The first commentator in that particular thread was Josh Baron, who since 2017 has been at DARPA allocating grant funding for cryptography.

Is it possible for a narrative to turn into an article of faith shared among researchers, funding agencies, and journalists, influencing choices of research directions and protective actions, without any of the believers scientifically evaluating whether the narrative is correct? Maybe even with the narrative being dangerously inaccurate?

In a word, yes. That's the power of confirmation bias.

Side note 1, regarding DARPA: I categorically recommend against taking military funding. But the military/non-military distinction has no evident connection to Green's narrative.

Side note 2, in case you're thinking "Hmmm, could confirmation bias be driving, e.g., the rule of public analysis of cryptosystems?": Yes. It's good to ask this question and to think about ways to scientifically collect evidence for and against. Don't let cryptographers intimidate you into not asking the question.

Let's move on to a simpler example, the following note in a 28 June 2022 Cloudflare blog post titled "Hertzbleed explained":

Notice: As of today, there is no known attack that uses Hertzbleed to target conventional and standardized cryptography, such as the encryption used in Cloudflare products and services.

Telling people that the demo is on non-"conventional" cryptography is one way of telling people that action isn't required. But why was the dividing line between "conventional" and non-"conventional" cryptography supposed to be relevant to these attacks?

Confirmation bias instantly makes up answers to this question. I've heard people claiming, for example, that SIKE was uniquely vulnerable because SIKE software is particularly slow. But this dividing line was incoherent (someone attacking a faster operation can trigger more repetitions to turn it into a slower operation), and the conclusion was wrong, as further attacks illustrated.

Let's try one more example. This example is a preemptive warning about an error that I haven't seen yet but that can easily be created by confirmation bias.

The starting point for this last example is the 2H2B paper mentioned above, which says that, for ECDSA and Classic McEliece, it was unable to saturate the CPU with a "request-per-TCP-connection server", so it configured a different type of server "for the sake of demonstration". The paper also says "We do not claim that any deployed server uses this configuration".

Say you're reading those quotes and want to believe that action isn't required. Confirmation bias will then tell you, aha, normal request-per-TCP-connection servers are a safe harbor against the attack.

But that's not what the 2H2B paper says, and Section 4.5 of Intel's AES paper already explains why it isn't true.

There is no reason that the targeted server has to be the sole source of CPU load. The attacker can instead trigger a mix of operations: for example, 25% of the load coming from the targeted server and 75% coming from other operations selected by the attacker.

Data-dependent power variations in the targeted server will then sometimes cross the line, producing frequency changes visible to the attacker. These variations have been muted by a factor of 4 compared to running the targeted server at 100% load, but this has to be compared to the distance to the edge, which can drop by much more than a factor of 4 if the background operations are tuned carefully enough.
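Here are some made-up numbers, purely to illustrate the shape of that argument:

    100% target load:  data-dependent swing ~2.0 W, margin to the limit ~1.0 W
                       => the swings cross the limit (the original demo setting)
    25% target load,   swing muted to ~0.5 W, but the attacker tunes the background
    75% attacker load: work so the margin to the limit is only ~0.2 W
                       => the muted swings still cross the limit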

Action vs. inaction. Let's recap what we've seen so far: overclocking attacks are a broad threat to secrets, not just to SIKE; most computers have configuration options that turn off overclocking, plugging the leak at its source; and the loudest contrary advice has been to do nothing beyond patching the initial demo.

My experience using those options, starting long before any security benefits were identified, is that they work great. The performance impact is minor and is outweighed by the advantages mentioned below. The options are in any case trivially reversible in situations where they turn out to be truly unaffordable.

One way to stop action here is to deny that there's a security problem; that was covered in the first part of this blog post. The rest of this blog post looks at two more ways to stop action: (1) use histrionics and hype to remove action from the Overton window; (2) pass the buck.

Turbo Boost Max Ultra Hyper Performance Extreme. The IBM PC was released in 1981 with an Intel 8088 CPU running at 4.77MHz, and rapidly became "the primary target for most microcomputer software development", in the words of Wikipedia, pushing Apple down to second place.

Video games were popular computer applications, same as today. Programmers adjusted the speed of actions inside video games to make the games fun for humans, same as today. In particular, IBM PC video-game programmers would insert N-instruction delay loops into their games, or decide that each main-loop iteration would advance the game's physics simulation by N time steps, either way tuning N to provide the best user experience. Changes of N had predictable effects on the user-visible game speed, since the CPU always ran at 4.77MHz. (More context.)

But then faster new PC compatibles appeared, such as the IBM PC/AT, which was released in 1984 with an Intel 80286 CPU. Suddenly the carefully tuned video games were running faster, often to the point of unplayability.

So the video games were all rewritten to use CPU-speed-independent timers, right? Eventually, yes, but software rewrites take time.

In the meantime there was a widely deployed stopgap to handle old software: a button that slowed down the CPU to make the original video games playable again. The circuits inside the CPU work fine at a lower clock speed.

Some marketing genius had the idea of labeling the slowdown button as a "Turbo" speedup button, a constant reminder of the new CPU being faster (except for when you slowed it down). The word "turbo" communicated speed, same as today, as illustrated by the 1983 release of the Turbo Pascal compiler.

(This meaning of "turbo" comes from "turbochargers", devices that increase engine efficiency by using turbines to compress air entering the engine. "Turbo" in Latin means "whirlwind".)

CPU technology continued to improve after that, using smaller and smaller circuits to carry out each bit operation. Because of these technology improvements, Intel was able to fit more computation inside the same power budget and the same affordable cooling solutions.

Intel increased clock frequencies from a few MHz to a few GHz. Intel added 64-bit vector instructions, then 128-bit vector instructions, etc., handling more bit operations per clock cycle. Intel also started expanding the number of cores on its CPUs.

Programmers who rewrote their software to take advantage of vector instructions and multiple cores gained more and more speed—but, again, software rewrites take time. Unoptimized non-vectorized single-core software didn't immediately disappear.

If a CPU has enough power and cooling to run vectorized multi-core software, and the CPU is merely asked to run unoptimized non-vectorized single-core software, presumably the CPU will have power and cooling to spare.

To use these sometimes-spare resources, Intel's Nehalem CPUs in 2008 introduced "Turbo Boost", which "automatically allows processor cores to run faster than the base operating frequency if the processor is operating below rated power, temperature, and current specification limits".

I imagine that inside Intel there was a discussion along the following lines:

The Turbo Boost hype, starting with the name and continuing with benchmarks that do not reflect overall system performance, brainwashes large parts of the general public into believing that of course we need Turbo Boost.

The 14 June 2022 version of the Hertzbleed page (here's a PDF) recommended against turning off Turbo Boost, and claimed that turning it off would have an "extreme system-wide performance impact". I challenged this claim:

As someone who happily runs servers and laptops at constant clock frequencies (see https://bench.cr.yp.to/supercop.html for Linux advice) rather than heat-the-hardware random frequencies, I dispute the claim in https://www.hertzbleed.com that this has an "extreme system-wide performance impact".

Using all server cores _while keeping the hardware alive for a long time_ is what gets the most computation done per dollar. My experience running >100 servers of many different types is that the best clock frequencies for this are at or below base frequency, no Turbo Boost.

Meanwhile I'm rarely waiting for my laptop, even with it running at very low speed. I'm happy with the laptop staying cool and quiet. Yes, I know there are some people using monster "laptops" where I'd use a server, but are they really getting "extreme" benefits from Turbo Boost?

It's easy to find Intel laptops where the nominal top Turbo Boost frequency is more than twice the base frequency. These laptops can't run at anywhere near that top frequency for optimized computations running on all cores. Where's the "extreme system-wide performance impact"?

What I find particularly concerning about these unquantified claims of an "extreme" impact is that, in context, these claims are trying to stop people from considering a straightforward solution to a security problem. If the costs are supposedly unacceptable, let's hear numbers.

The Hertzbleed page changed "extreme" to "significant" without issuing an erratum, without changing its recommendation, and without providing any numbers.

The Hot Pixels paper similarly says "disabling DVFS entails severe practical drawbacks", without quantifying the alleged severity.

The 2H2B paper's "mitigations" section doesn't even mention the possibility of turning off Turbo Boost. The paper's "background" section makes it sound as if this possibility doesn't exist:

Modern processors dynamically adjust their CPU frequency to reduce power consumption (during low CPU load) or to ensure that thermal parameters remain below safe limits (during high CPU load). ... Hertzbleed attacks leverage the discovery that, during high CPU loads, DVFS-induced frequency adjustments depend on the data being computed on.

It's certainly true that the Intel Core i7-10510U where I'm typing this, as configured by the OS, uses such dynamic adjustments by default. I changed the configuration in 2020 (when I installed the laptop) to run at minimum clock speed. Leaving out the words "by default" is wrong: it's hiding the configurability from the reader. This inaccuracy is directly relevant to the core of the paper: a side effect of running the laptop at minimum clock speed is that, whatever the load is, there are no DVFS-induced frequency adjustments.
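For concreteness, here's a sketch of one way to pin Linux CPUs to their minimum frequency via the standard cpufreq sysfs files; the details vary across drivers, and the supercop page mentioned above gives more specific advice.

    /* Sketch: cap every core at its minimum frequency on Linux, run as root,
       by copying cpuinfo_min_freq into scaling_max_freq. These are standard
       cpufreq sysfs files, but details vary across drivers and systems. */

    #include <stdio.h>

    int main(void)
    {
      char minpath[128], maxpath[128], freq[64];
      for (int cpu = 0; ; cpu++) {
        snprintf(minpath, sizeof minpath,
          "/sys/devices/system/cpu/cpu%d/cpufreq/cpuinfo_min_freq", cpu);
        snprintf(maxpath, sizeof maxpath,
          "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_max_freq", cpu);
        FILE *in = fopen(minpath, "r");
        if (!in) break;                      /* no cpuN directory: done */
        if (!fgets(freq, sizeof freq, in)) { fclose(in); break; }
        fclose(in);
        FILE *out = fopen(maxpath, "w");
        if (!out) { perror(maxpath); return 1; }
        fputs(freq, out);                    /* cap this core at its minimum frequency */
        fclose(out);
        printf("cpu%d capped at %s", cpu, freq);
      }
      return 0;
    }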

Leaving out "by default" isn't the only inaccuracy in the "modern processors" sentence. Consider, e.g., Intel's Pentium Gold G7400 spec sheet saying "Intel Turbo Boost Max Technology 3.0: No" and "Intel Turbo Boost Technology: No". The Pentium Gold G7400 was introduced in 2022; it's a dual-core 3.7GHz Alder Lake CPU, one of Intel's most cost-effective CPUs.

(The spec sheet also doesn't mention "Burst", which seems to be a rebranding of Turbo Boost for CPUs aimed at fanless environments, with overclocking limited more by temperature than by power.)

The 2H2B paper's "conclusions" section draws an analogy between overclocking attacks and Spectre. Overclocking attacks are, however, vastly different from Spectre in the range of protective actions available to OS distributors and end users today. All of my overclockable servers and laptops have simple end-user configuration options to turn overclocking off (and, in almost all cases, options to set even lower frequencies), whereas speculative execution is baked into CPU pipelines.

I don't use my phone much, and I haven't spent much time investigating its security. I presume it overclocks. I would guess that the manufacturer knows how to turn off overclocking today with a simple OS update, but, even if the situation is actually that overclocking is baked into all phones, there's a big difference between all phones and all "modern processors". A user who has the option of protecting confidential data by moving it from a phone to a non-overclocked laptop shouldn't be told that this option doesn't exist.

Security is not the only argument against Turbo Boost. You might be wondering why it's so common for computers to have turn-off-overclocking configuration options, and why there are recently released Intel CPUs that don't even have Turbo Boost, if people are convinced that the slowdowns are "extreme" and "severe".

Overclocking produces random heat spikes, random fan-noise spikes, and, according to the best evidence available, random early hardware death. Yes, cryptographers love randomness, but most people find these effects annoying. Meanwhile the speedups from overclocking are mostly in software that hasn't been optimized—which tends to be software that doesn't have much impact on the user experience to begin with. See TAO for further discussion.

Aleksey Shipilëv has provided data supporting another answer: overclocking is bad for the environment.

As an example, Shipilëv reported wall-socket measurements of "TR 3970X, OpenJDK build + tier1 testing": 540 kJ with 24.5-minute latency under default settings, and just 410 kJ with 26-minute latency with overclocking disabled.

(Shipilëv also reported reaching 340 kJ, 28.5-minute latency, by limiting PPT to 125W. I would expect setting a specific medium frequency without a PPT limit to have a similar effect.)

The CPU in question, the 32-core AMD Ryzen ThreadRipper 3970X, advertises a maximum boost frequency that's more than 20% above base frequency. Maximum doesn't reflect the overall user experience: for example, this many-core build-and-test process is obtaining only a 6% speedup from overclocking.
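For concreteness, the arithmetic from those measurements:

    energy:  540 kJ / 410 kJ   = 1.317...  =>  about 32% more energy with the default settings
    time:    26 min / 24.5 min = 1.061...  =>  about a 6% speedup from overclocking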

Maybe the user still thinks that a 6% speedup justifies consuming 32% more energy. Maybe somebody else is paying the power bill.

Fluffy the Polar Bear definitely does not care about the 6% speedup.

So, Marketing VP, what do you think about "Turbo Boost Murders Baby Polar Bears"? Catchy, isn't it?

Other countermeasures. Turning off Turbo Boost etc. isn't the only way to respond to overclocking attacks. Let's look at what else people are suggesting.

The aforementioned Intel blog post says "Also note that cryptographic implementations that are hardened against power side-channel attacks are not vulnerable to this issue."

Similarly, the "Mitigations" section of the 2H2B paper consists entirely of software-level power-attack countermeasures (even though the Hertzbleed web page, which is now also the 2H2B web page, correctly observes that "The root cause of Hertzbleed is dynamic frequency scaling").

As another example, here's the complete "Mitigations" section of AMD's advisory:

As the vulnerability impacts a cryptographic algorithm having power analysis-based side-channel leakages, developers can apply countermeasures on the software code of the algorithm. Either masking, hiding or key-rotation may be used to mitigate the attack.

Concretely, what does this mean?

It's not clear how software authors are supposed to follow the "key-rotation" suggestion. Users have long-term keys and many other long-term secrets. Even if it's feasible to redesign and redeploy every cryptographic protocol to erase every key in 5 minutes, and even if this is fast enough to stop these attacks, what are we supposed to do about all the other user secrets?

It's even less clear how software authors are supposed to follow the "hiding" suggestion. The literature on "hiding" explains a variety of techniques under this name, but with an emphasis on hardware modifications such as DRP logic.

Okay, okay, there are some software "hiding" techniques. But Mangard–Oswald–Popp already commented in their 2007 power-attacks book that "hiding countermeasures that are implemented in software protect cryptographic devices only to a limited degree": for example, dummy operations and shuffling "do not provide a high level of protection", and instruction selection "is usually not sufficient to provide protection against DPA attacks".

Let's focus on the "masking" suggestion. Here it's much more clear what software authors are being asked to do. To build software with, e.g., "2-share XOR masking", you store each secret bit s as a random bit r and a separate bit XOR(r,s). There are then various details of how to carry out computations on these bits, how to safely generate the necessary randomness, etc.
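Here's a minimal sketch of that bit-level bookkeeping. This is a toy, not a secure implementation: real masked software needs a proper randomness source, careful ordering of operations, and defenses against compilers and CPUs silently recombining shares.

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>   /* rand() is a placeholder; real masking needs a proper RNG */

    typedef struct { uint32_t s0, s1; } shares;   /* secret x is stored as s0 ^ s1 */

    static uint32_t fresh(void) { return (uint32_t) rand(); }   /* placeholder randomness */

    static shares mask(uint32_t x) { shares a; a.s0 = fresh(); a.s1 = a.s0 ^ x; return a; }
    static uint32_t unmask(shares a) { return a.s0 ^ a.s1; }

    static shares masked_xor(shares a, shares b)   /* XOR acts share-by-share */
    { shares c = { a.s0 ^ b.s0, a.s1 ^ b.s1 }; return c; }

    static shares masked_and(shares a, shares b)   /* 2-share AND, ISW style */
    {
      uint32_t r = fresh();                        /* fresh randomness for every AND */
      shares c;
      c.s0 = (a.s0 & b.s0) ^ r;
      /* the parentheses matter: never combine the two cross terms before
         blinding one of them with r, or an intermediate value leaks */
      c.s1 = (a.s1 & b.s1) ^ (r ^ (a.s0 & b.s1)) ^ (a.s1 & b.s0);
      return c;
    }

    int main(void)
    {
      shares a = mask(0x0f0f0f0f), b = mask(0x00ff00ff);
      printf("%08x\n", unmask(masked_and(a, b)));   /* prints 000f000f */
      printf("%08x\n", unmask(masked_xor(a, b)));   /* prints 0ff00ff0 */
      return 0;
    }

Every masked AND consumes fresh randomness and turns one AND into four ANDs and four XORs, which is part of where the slowdown of masked software comes from.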

For example, mkm4 is an implementation of Kyber for ARM Cortex-M4 CPUs, using a mix of 2-share XOR masking and 2-share "arithmetic" masking. The mkm4 paper reports a substantial slowdown from this masking, and evaluates the masked software with TVLA, a standard leakage-assessment methodology.

Wait a minute. If masking creates so much slowdown, and if people are recommending against turning off Turbo Boost because of the supposedly extreme performance impact, then how can people be recommending masking?

I'd like to imagine an answer driven by engineers measuring the overall system costs. We're talking about a slowdown in software handling secrets, so let's start by measuring the fraction of computer time spent on that software. Also, let's measure the actual effect of Turbo Boost. (And, with Fluffy in mind, let's measure energy usage.)

Occam's razor says, however, that the actual reason for recommending masking while leaving Turbo Boost on is a much simpler aspect of human behavior, namely shifting blame.

It's common for a problem with a large system to be something involving interactions between multiple components of the system. The people in charge of component X then have an incentive to say that, no, this problem should be addressed by component Y. Maybe at the same time Y is blaming Z, and Z is blaming X.

See, e.g., my recent paper on a one-time single-bit fault breaking all NTRU-HRSS ciphertexts before the fault:

The attack relies on all of these layers failing to act. Note that the fact that there are multiple layers that can act gives each layer an excuse not to act, especially when nobody is responsible for the security of the system as a whole.

In the case of overclocking attacks, the people with control over Turbo Boost, such as OS distributors, have an incentive to say that the problem should instead be addressed by people writing software handling secrets. Meanwhile the people writing software handling secrets have an incentive to say that the problem should instead be addressed by the people with control over Turbo Boost.

Even if everybody starts with a shared understanding that there's an important security problem at hand, the decomposition of responsibility can easily produce paralysis.

Users who hear about the problem and want to protect themselves are much more likely to consider all options, but let's assume the user hasn't heard. What should OS distributors be doing? What should software authors be doing?

The simplest way out of the finger-pointing logjam is to observe that turning off Turbo Boost etc. stops attacks immediately, whereas asking for masked software leaves users exposed for much longer.

The point here is that only a small corner of the current cryptographic software ecosystem includes masked software (never mind all the non-cryptographic user data that should also be kept confidential). Sure, you can find the 2-share-masked implementation of kyber768 for Cortex-M4, but where's the masked version of OpenSSL for Intel CPUs?

This gives a clear rationale for turning off Turbo Boost right now, as TAO recommends.

Audit difficulties as a risk indicator. There's also a more fundamental rationale for keeping Turbo Boost turned off for the foreseeable future, even in a world of masked software: auditability.

There is, as noted above, a standard methodology, TVLA, for assessing side-channel leakage. TVLA does not work.

This is not a controversial statement. There is one attack paper after another extracting secrets from implementations passing TVLA. Buried on page 9 of the mkm4 paper is an admission that 2-share masking "is not enough to achieve practical side-channel resistance". A followup attack paper titled "Breaking a fifth-order masked implementation of CRYSTALS-Kyber by copy-paste" demonstrates how easy it is to break not just TVLA-"verified" mkm4 but an extension of mkm4 to use more shares.

Saying that an implementation passed a week of TVLA is like saying that a cryptosystem has more than 32 key bits. Not reaching that bar is a very bad sign, but reaching that bar provides negligible security assurance.
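For readers who haven't seen TVLA: at its core it's a fixed-vs-random Welch t-test over measured traces, flagging a sample point if |t| crosses a threshold, conventionally 4.5. Here's a sketch of the statistic; the point of the examples above is that clearing this bar says very little.

    #include <stdio.h>
    #include <stddef.h>
    #include <math.h>

    /* Welch's t statistic at one sample point, comparing traces recorded
       with a fixed input against traces recorded with random inputs.
       TVLA conventionally flags |t| > 4.5 as evidence of leakage. */
    static double welch_t(const double *fixed, size_t nf,
                          const double *random, size_t nr)
    {
      double mf = 0, mr = 0, vf = 0, vr = 0;
      for (size_t i = 0; i < nf; i++) mf += fixed[i];
      for (size_t i = 0; i < nr; i++) mr += random[i];
      mf /= nf; mr /= nr;
      for (size_t i = 0; i < nf; i++) vf += (fixed[i] - mf) * (fixed[i] - mf);
      for (size_t i = 0; i < nr; i++) vr += (random[i] - mr) * (random[i] - mr);
      vf /= nf - 1; vr /= nr - 1;
      return (mf - mr) / sqrt(vf / nf + vr / nr);
    }

    int main(void)
    {
      double fixed[]  = { 1.02, 1.01, 1.03, 1.02, 1.01 };
      double random[] = { 1.00, 0.99, 1.01, 1.00, 0.98 };
      printf("t = %.2f\n", welch_t(fixed, 5, random, 5));  /* roughly 3.5: below 4.5 */
      return 0;
    }

An implementation can stay below that threshold at every sample point for a week of measurements and still fall to the attacks described above.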

Internally, the "copy-paste" attack paper, which is well worth reading, copies and pastes components of an n-share neural network to start training an (n+1)-share neural network.

As in other recent AI developments, researchers don't have anything like a complete explanation of how the AI is succeeding. They feed it enough data and observe that it works. Cool!

"Hey, Stable Diffusion, here are some power measurements. Please draw my secret key bits, suspended in the air, silhouetted against a summer sky darkened by wildfires."

If an implementation isn't instantly broken by the latest not-really-understood side-channel attack, do we declare that it's safe to rely on the security of that implementation?

Cryptography is hard. As noted above, there's always a risk that we've missed attacks. There are, as also noted above, some basic principles that we follow to try to manage this risk. I already reviewed these principles in a blog post seven years ago.

Now pick whichever masked software you like and see how it stacks up against these principles: typically the software is new, has attracted far less cryptanalytic attention than the underlying mathematics, comes with no quantitative claims about security against power attacks, and keeps losing ground to new attack papers.

So it's incredibly risky to trust masked software to provide a meaningful level of security against power attacks.

How do we quantify these factors, so that the relationship with risk can be scientifically studied? Superficial answers are the number of years the software has been available, the number of attack papers trying to break that software, and the change in security levels produced by those attack papers (assuming that, as usual, we insist on quantitative security claims).

I'm referring to these answers as superficial because they miss the cryptanalyst's difficulties of figuring out what exactly we're attacking and what's happening inside the attacks. An obvious metric for these difficulties is the human time used for a full audit of the attack surface, although one needs error bars here to account for variations during the human's career and variations from one human to another.

Turning off Turbo Boost etc. is much easier to audit. There's documentation from Intel saying what to do. There are easy double-checks finding that, yes, the clock speeds then stay consistent up to high precision. If we assume verification of implementation correctness then, without side-channel leaks, implementation security boils down to mathematical security. The latter has its own risks, but those are shared with the masked implementations.
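One such double-check, as a sketch: poll the kernel's report of the clock frequency and watch whether it moves under load.

    #include <stdio.h>
    #include <unistd.h>

    /* Sketch: print cpu0's reported frequency once a second for 30 seconds.
       On a machine pinned to a fixed frequency it should barely move, load
       or no load. scaling_cur_freq is the standard cpufreq sysfs file; this
       is just a spot check, not a full verification. */
    int main(void)
    {
      for (int i = 0; i < 30; i++) {
        char buf[64];
        FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq", "r");
        if (!f) { perror("scaling_cur_freq"); return 1; }
        if (fgets(buf, sizeof buf, f)) fputs(buf, stdout);
        fclose(f);
        sleep(1);
      }
      return 0;
    }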

It's not that turning off Turbo Boost eliminates the implementation risk; see, e.g., TAO's discussion of crystals. The point is simply that we shouldn't be skipping this defense in favor of a defense that's much harder to audit.

If systems are deployed in environments where power consumption is inherently exposed to attackers, then masking seems better than giving up. Hopefully it increases attack costs. But if we're in an environment where we can simply cut off the attacker's access to power information then of course we should do that, whether or not we have masked software. As TAO says:

If masked software is available for the computations that you want to perform on secret data, you should certainly consider the software: there's a good chance that the software doesn't cause any performance problems for you, and it's plausible that the software will slow down attacks. But you shouldn't believe any claims saying how much it slows down attacks, and you shouldn't be surprised to see attacks succeeding despite the masking. Masking is not a substitute for disabling overclocking.

There's now a high-order-masked implementation of sntrup761 decapsulation for FPGAs. The accompanying paper acknowledges me for my help. I think analogously masked software will be affordable, and I don't think the work I'm doing on verification of software correctness will have much trouble handling such software. But how is an auditor supposed to end up concluding that masking is more than a small speed bump in attacks?

Maybe someday, after enough work, the community will have a clear understanding of the limits of power attacks, and will know how to design systems beyond those limits. Or maybe not.

Either way, OS distributors today should, by default, be turning off Turbo Boost.


Version: This is version 2023.06.09 of the 20230609-turboboost.html web page.