The cr.yp.to blog



2023.10.03: The inability to count correctly: Debunking NIST's calculation of the Kyber-512 security level. #nist #addition #multiplication #ntru #kyber #fiasco

[Sidney Harris cartoon used with permission. Copyright holder: ScienceCartoonsPlus.com.]

Quick, what's 2^40 plus 2^40? It's 2^80, right?

No, obviously not. 40 plus 40 is 80, and 2^40 times 2^40 is 2^80, but 2^40 plus 2^40 is only 2^41.

Take a deep breath and relax. When cryptographers are analyzing the security of cryptographic systems, of course they don't make stupid mistakes such as multiplying numbers that should have been added.

If such an error somehow managed to appear, of course it would immediately be caught by the robust procedures that cryptographers follow to thoroughly review security analyses.

Furthermore, in the context of standardization processes such as the NIST Post-Quantum Cryptography Standardization Project (NISTPQC), of course the review procedures are even more stringent.

The only way for the security claims for modern cryptographic standards to turn out to fail would be because of some unpredictable new discovery revolutionizing the field.

Oops, wait, maybe not. In 2022, NIST announced plans to standardize a particular cryptosystem, Kyber-512. As justification, NIST issued claims regarding the security level of Kyber-512. In 2023, NIST issued a draft standard for Kyber-512.

NIST's underlying calculation of the security level was a severe and indefensible miscalculation. NIST's primary error is exposed in this blog post, and boils down to nonsensically multiplying two costs that should have been added.

How did such a serious error slip past NIST's review process? Do we dismiss this as an isolated incident? Or do we conclude that something is fundamentally broken in the procedures that NIST is following?

Discovering the secret workings of NISTPQC. I filed a FOIA request "NSA, NIST, and post-quantum cryptography" in March 2022. NIST stonewalled, in violation of the law. Civil-rights firm Loevy & Loevy filed a lawsuit on my behalf.

That lawsuit has been gradually revealing secret NIST documents, shedding some light on what was actually going on behind the scenes, including much heavier NSA involvement than indicated by NIST's public narrative. Compare, for example, the following documents:

I filed a new FOIA request in January 2023, after NIST issued its claims regarding the security level of Kyber-512. NIST again stonewalled. Loevy & Loevy has now filed a new lawsuit regarding that FOIA request.

Public material regarding Kyber-512 already shows how NIST multiplied costs that should have been added, how NIST sabotaged public review of this calculation, and how important this calculation was for NIST's narrative of Kyber outperforming NTRU, filling a critical gap left by other steps that NIST took to promote the same narrative. This blog post goes carefully through the details.

Alice and Bob paint a fence. At this point you might be thinking something like this: "Sorry, no, it's not plausible that anyone could have mixed up a formula saying 2^x + 2^y with a formula saying 2^(x+y), whatever the motivations might have been."

As a starting point for understanding what happened, think about schoolchildren in math class facing a word problem:

There is a fence to paint. Alice would take 120 minutes to paint the fence. Bob would take 240 minutes to paint the fence. How long would it take Alice and Bob to paint the fence together?

The approved answer in school says that Alice paints 1/120 of the fence per minute, and Bob paints 1/240 of the fence per minute, so together they paint 1/120 + 1/240 = 1/80 of the fence per minute, so it takes them 80 minutes to paint the fence.
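If you want to check the approved answer mechanically, here's a tiny Python sketch of the rate calculation (nothing beyond the numbers in the word problem):

    # Rates add when Alice and Bob paint different parts of the fence simultaneously.
    alice_minutes = 120
    bob_minutes = 240
    combined_rate = 1 / alice_minutes + 1 / bob_minutes   # fences per minute: 1/120 + 1/240 = 1/80
    print(1 / combined_rate)                              # 80.0 minutes for the whole fence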

The real answer could be more complicated because of second-order effects. Alice and Bob working together probably get less tired than either of them would have working alone for longer. In the opposite direction, maybe there's a slowdown because Alice and Bob enjoy each other's company and pause for a coffee.

Schoolchildren often give answers such as 240 − 120 = 120, or 120 + 240 = 360, or (120 + 240)/2 = 180. These children are just manipulating numbers, not thinking through what the numbers mean.

Two disciplines for catching errors. In later years of education, physics classes teach students a type-checking discipline of tracking units with each number.

Here are examples of calculations following this discipline:

The same discipline wouldn't let you add, for example, "1 fence" to "120 min": the units don't match.

This discipline avoids many basic errors. On the other hand, it still allows, e.g., the mistake of adding "120 min" to "240 min" to obtain "360 min".

What catches this mistake is a discipline stronger than tracking units: namely, tracking semantics.

The numbers have meanings. They're quantitative features of real objects. For example, 80 minutes is the total time for Alice and Bob to paint the fence when Alice is painting part of the fence and Bob is painting part of the fence. That's what the question asked us to calculate.

A different question would be the total time for Alice to paint the fence and then for Bob to repaint the same fence. This would be 120 minutes plus 240 minutes.

Yet another question would be the total time for Alice to paint the fence, and then for Bob to wait for the coat of paint to dry, and then for Bob to apply a second coat. Answering this would require more information, namely the waiting time.

All of these questions make sense. They pass type-checking. But their semantics are different.
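Here's a minimal Python sketch of the type-checking discipline, using a hypothetical Quantity helper (my illustration, not anything from the sources discussed below): it refuses to add mismatched units and composes units under multiplication, but it happily accepts the semantically wrong 120 min + 240 min.

    class Quantity:
        """A number tagged with unit exponents, e.g. {'min': 1} or {'bitops': 1, 'iter': -1}."""
        def __init__(self, value, units):
            self.value = value
            self.units = {u: e for u, e in units.items() if e != 0}
        def __add__(self, other):
            if self.units != other.units:
                raise TypeError(f"cannot add units {self.units} and {other.units}")
            return Quantity(self.value + other.value, self.units)
        def __mul__(self, other):
            units = dict(self.units)
            for u, e in other.units.items():
                units[u] = units.get(u, 0) + e
            return Quantity(self.value * other.value, units)
        def __repr__(self):
            return f"{self.value} {self.units}"

    minutes = Quantity(120, {'min': 1}) + Quantity(240, {'min': 1})
    print(minutes)                          # 360 {'min': 1}: passes type-checking, wrong semantics here
    # Quantity(1, {'fence': 1}) + minutes   # raises TypeError: the units don't match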

Alice and Bob tally the costs of an attack. Alice and Bob have finished painting and are now discussing the merits of different encryption systems.

They'd like to make sure that breaking whichever system they pick is at least as hard as searching for an AES-128 key. They've agreed that searching for an AES-128 key costs slightly above 2^140 bit operations.

Alice and Bob are broadcasting their discussion for anyone who's interested. Let's listen to what they're saying:

How to botch the tally of costs. Imagine a government agency that has also been looking at this particular cryptosystem, but with one critical difference: the agency is desperate to say that this cryptosystem is okay.

How does the agency deal with the XYZZY attack?

One answer is to aim for a lower security goal, hyping the cost of carrying out 2^130 bit operations. For comparison, Bitcoin mining did only about 2^111 bit operations in 2022. ("Only"!)

But let's assume that the agency has promised the world that it will reach at least the AES-128 security level.

What does the agency do?

Here's an idea. For the costs per iteration, instead of adding 2^25 for computation to 2^35 for memory access, how about multiplying 2^25 for computation by 2^35 for memory access?

The product is 2^60. Multiplying this by 2^95 iterations gives 2^155, solidly above 2^143. Problem solved!
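Here's the same tally as a few lines of Python, just to make the exponents concrete (these are the example numbers from the paragraphs above):

    from math import log2

    iterations = 2**95
    compute_per_iter = 2**25        # bit operations per iteration
    memory_per_iter = 2**35         # equivalent bit operations per iteration for memory access

    correct = iterations * (compute_per_iter + memory_per_iter)
    botched = iterations * (compute_per_iter * memory_per_iter)

    print(round(log2(correct), 2))  # about 130.0: the attack really costs roughly 2^130
    print(round(log2(botched), 2))  # 155.0: the agency's inflated claim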

How discipline catches the error. Alice and Bob are correctly tracking the semantics of each number. The agency isn't.

The total attack cost is the number of iterations times the cost per iteration. Each iteration incurs a computation cost, estimated above as 2^25 bit operations, plus a memory-access cost, estimated above as the equivalent of 2^35 bit operations, so the cost per iteration is about 2^35 bit operations.

The agency's multiplication of these two costs makes no sense, and produces a claimed per-iteration cost that's millions of times larger than the properly estimated per-iteration cost.

This multiplication is so glaringly wrong that it doesn't even pass physics-style type-checking. Specifically, multiplying "2^25 bitops/iter" by "2^35 bitops/iter" doesn't give "2^60 bitops/iter". It gives "2^60 bitops^2/iter^2". Multiplying further by "2^95 iter" doesn't give "2^155 bitops"; it gives "2^155 bitops^2/iter".
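The same type-check can be written out explicitly. Here's a short Python sketch (an illustration, not anything NIST published) that tracks only the unit exponents:

    def multiply_units(*quantities):
        """Unit exponents add when quantities are multiplied together."""
        units = {}
        for q in quantities:
            for u, e in q.items():
                units[u] = units.get(u, 0) + e
        return {u: e for u, e in units.items() if e != 0}

    bitops_per_iter = {'bitops': 1, 'iter': -1}
    iters = {'iter': 1}

    # Agency tally: multiply the two per-iteration costs, then multiply by iterations.
    print(multiply_units(bitops_per_iter, bitops_per_iter, iters))   # {'bitops': 2, 'iter': -1}: not bit operations

    # Correct tally: per-iteration costs are added (same units), then multiplied by iterations.
    print(multiply_units(bitops_per_iter, iters))                    # {'bitops': 1}: bit operations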

Agency desperation strikes back. How can the agency phrase this nonsensical calculation of a severely inflated security estimate in a way that will pass superficial review?

The goal here is for the 155 to sound as if it's simply putting together numbers from existing sources. For example:

The deception here is in the third step, the step that leaps from cost 2^25 per iteration to cost 2^60 per iteration.

How many readers are going to check the third source and see that it was actually estimating cost 2^35 per iteration?

Streamlining the marketing. The wrong calculation sounds even simpler if there's a previous source that has already put the 2^95 and the 2^25 together:

At this point the agency has completely suppressed any mention of iterations, despite the central role of iterations in the attack and in any competent analysis of the attack.

How many readers are going to check both sources, see that the second source estimates cost 2^35 per iteration, and see that the iteration count in the first source is far below 2^120?

Kyber's limited selection of security levels. You might be thinking something like this: "Okay, sure, I see how it would be possible for a desperate agency to replace cost addition with a nonsensical multiplication, replacing 2^130 with a fake 2^155, while at the same time making this hard for people to see. But why would anyone have wanted to play this risky game? If Kyber-512 was around 2^130, and the target was a little above 2^140, why didn't they just bump up the parameters to 10% higher security, something like Kyber-576?"

This is an obvious question given that RSA and ECC and (to take some post-quantum examples) McEliece and NTRU naturally support whatever size you want.

A long, long time ago, I wrote fast software for the NSA/NIST P-224 elliptic curve, and then found a better curve at that security level, namely y^2 = x^3 + 7530x^2 + x mod 2^226−5. But then I decided that bumping the size up to 2^255−19 would be much more comfortable, so I did.

Kyber is different. You can't just bump up Kyber's parameters to 10% higher security:

One of the "unique advantages of Kyber" specifically advertised in the official Kyber documentation is that implementing a "dimension-256 NTT" handles "all parameter sets" for Kyber (emphasis in original). This "NTT" isn't something optional for Kyber implementors; it's baked into the structure of Kyber's public keys and ciphertexts. Using dimensions that aren't multiples of 256 would require changing the core Kyber design.

The same Kyber "advantage" also means that going beyond 1024 would lead to performance issues and, more importantly, security issues surrounding occasional "decryption failures" forced by the prime baked into the NTT. Avoiding this would again require changing the core Kyber design.

For comparison, NTRU options targeting higher security levels—including simple proofs that there are no decryption failures—are readily available. For example, one of the NTRU Prime options is sntrup1277.

But let's assume that NIST doesn't care about Kyber's limitations at the high end. Let's instead focus on the low end, specifically on applications that have limited sizes for public keys and/or ciphertexts and thus can't use the highest available security levels.

An application limited to 1KB can't use Kyber-768 (1184-byte public keys, 1088-byte ciphertexts). The highest-security Kyber option for that application is Kyber-512 (800-byte keys, 768-byte ciphertexts).

The same application obtains higher security with NTRU, according to a security-estimation mechanism called "Core-SVP". For example, the application can use sntrup653, with Core-SVP 2^129, or NTRU-677, with Core-SVP 2^145,

while the current version of Kyber-512, starting with the round-3 version from 2020, has Core-SVP just 2^118.

Is this "Core-SVP" something I made up to make Kyber look bad? Absolutely not:

This blog post has bigger fish to fry, so let's blindly accept Kyber's claim that the new definition is better, meaning that Kyber-512 has Core-SVP 2^118. That's still clearly worse than the 2^129 for sntrup653 and the 2^145 for NTRU-677.

It's not that Kyber's competitors always beat Kyber in size-security tradeoffs. For example, if an application instead has a limit of 1184 bytes, then it can use Kyber-768, which has Core-SVP 2^181, while ntruhps needs 1230 bytes to reach Core-SVP 2^179.

But Kyber's competitors often beat Kyber in size-security tradeoffs. Throwing away Kyber-512, leaving just Kyber-768 and Kyber-1024, means that Kyber has nothing as small as the 931 bytes for NTRU-677.

The normal way for scientists to present quantitative tradeoffs is with scatterplots, such as Figure 3.5 in my 2019 paper "Visualizing size-security tradeoffs for lattice-based encryption". The particular scatterplot shown here is Figure 7.3 in the 2021 paper "Risks of lattice KEMs" from the NTRU Prime Risk-Management Team. The vertical axis is the Core-SVP security estimate, and the horizontal axis is ciphertext bytes.

The scatterplot shows that Kyber has a higher Core-SVP than NTRU for applications with a size limit of, e.g., 768 bytes or 1088 bytes. But NTRU has a higher Core-SVP than Kyber for applications with a size limit of, e.g., 700 bytes or 1024 bytes or 2048 bytes. Kyber has nothing as small as the 699-byte option for NTRU. Kyber also has nothing as strong as the 1842-byte option for NTRU. NTRU is also trivially capable of adding further options between and beyond what's shown in the graph, whereas for Kyber this is more problematic.

Official evaluation criteria for the competition. NIST had issued an official call for post-quantum proposals in 2016. One of the evaluation criteria in the call was as follows:

Assuming good overall security and performance, schemes with greater flexibility will meet the needs of more users than less flexible schemes, and therefore, are preferable.

One of the official examples given for "flexibility" was that it is "straightforward to customize the scheme's parameters to meet a range of security targets and performance goals".

The call proposed five broad security "categories", and said that submitters could specify even more than five parameter sets to demonstrate flexibility:

Submitters may also provide more than one parameter set in the same category, in order to demonstrate how parameters can be tuned to offer better performance or higher security margins.

In 2020, NIST eliminated NewHope. One of the reasons stated in the aforementioned round-2 report was that "KYBER naturally supports a category 3 security strength parameter set, whereas NewHope does not". NewHope offered only NewHope-512 and NewHope-1024.

Imagine Kyber similarly offering only Kyber-768 and Kyber-1024, acknowledging that Kyber-512 doesn't meet the minimum security level specified by NIST. It's then very easy to see how limited Kyber's flexibility is compared to NTRU's broader, denser spectrum of security levels. How, then, would NIST argue that Kyber is the best option?

One answer is that the evaluation criteria say more flexibility is preferable only assuming "good overall security and performance". But how would NIST argue that NTRU doesn't have "good overall security and performance"?

Regarding the security of Kyber and NTRU, NIST's official 2022 selection report says that NIST is "confident in the security that each provides". The report describes MLWE, the problem inside Kyber, as "marginally more convincing" than the problem inside NTRU. There's much more that could and should have been said about the security comparison between Kyber and NTRU:

The situation isn't that NTRU avoids every security risk of Kyber. A careful comparison finds mathematical security risks in both directions. Maybe there's a way to argue that the mathematical security risks for NTRU should be given higher weight than the mathematical security risks for Kyber. But the immediate choice that NIST was facing in 2021 between NTRU and Kyber, assuming that the attackers currently recording user data will have quantum computers in the future, was between

The call for submissions said "NIST believes it is critical that this process leads to cryptographic standards that can be freely implemented in security technologies and products". Nothing else in the call was labeled as "critical". How could NIST ignore the damage that it was doing in not going ahead with NTRU? NIST knew it didn't have a patent license signed for Kyber yet, let alone an activated patent license.

Anyway, let's get back to the question of how NIST might be able to argue that NTRU doesn't have "good overall security and performance". A report saying that NIST is "confident in the security that each provides" is obviously not claiming that NTRU doesn't have "good overall security". What about performance?

The same selection report admits that "the overall performance of any of these KEMs would be acceptable for general-use applications". If the objective is to use performance differences as a deciding factor between two acceptable options, let's see how Kyber would stack up without Kyber-512:

This is a solid case for eliminating Kyber in favor of NTRU, given NIST's declaration that there can be only one.

(If NIST thought that performance differences at this scale matter, and if the best performance comes from Kyber at some security levels and NTRU at other security levels, then why wasn't NIST allowing both? Answer: The movie says there can be only one! STOP ASKING QUESTIONS!)

Tilting the competition, part 1: ignoring NTRU's extra flexibility. Keeping Kyber-512 changes the competition. Having three options, Kyber-512 and Kyber-768 and Kyber-1024, looks a lot better than having just two.

There are four NTRU circles in the first scatterplot above, namely NTRU-509 and NTRU-677 and NTRU-821 and NTRU-1229. But NTRU-821 isn't a winner, and earlier in NISTPQC there wasn't an NTRU-1229.

Wait a minute. The NTRU literature has always made clear that NTRU supports many more options. For example, here's a scatterplot from John Schanck's 2018 paper "A comparison of NTRU variants". There are a huge number of dots; each dot is showing another NTRU option.

One of the bizarre twists in NISTPQC was the following announcement from NIST in 2020: "NIST believes that too many parameter sets make evaluation and analysis more difficult." I asked various questions about this, starting as follows:

How many is "too many"? How did flexibility, which was portrayed as purely positive in the call for proposals, turn into a bad thing for NIST? The call for proposals explicitly allowed multiple parameter sets per category, never suggesting that this would be penalized!

NIST's latest report complains about NewHope's lack of flexibility to use dimensions strictly between 512 and 1024. If a submission team is thinking "Aha, Kyber similarly suffers from its lack of flexibility to target security levels strictly between maybe-2^128 and maybe-2^192, and we can clearly show this to NIST by selecting parameter sets at several intermediate security levels", then isn't this something NIST should be interested in, rather than discouraging by making submitters worry that this is "too many parameter sets"?

NIST never replied.

Think about what this is like for submitters trying to figure out what to do:

NTRU Prime moved up to selecting six sntrup parameter sets (plus six ntrulpr parameter sets, which, compared to sntrup, have larger ciphertexts but smaller public keys), enough that the flexibility advantage over Kyber should have been impossible to ignore. NIST ignored it.

Tilting the competition, part 2: exaggerating and hyping key-generation costs. For Intel's recent Golden Cove microarchitecture (the "performance" cores in Alder Lake CPUs), https://bench.cr.yp.to reports that

The total cycle count for handling a ciphertext, the total of encapsulation and decapsulation, is 13% smaller for NTRU-509 than for Kyber-512.

NTRU-509 also beats Kyber-512 in ciphertext size. NTRU-509 is the leftmost dot in the first scatterplot above, meaning smallest ciphertexts.

On the other hand, NTRU-509 takes 112866 cycles for key generation while Kyber-512 takes only 17777 cycles. The total of key generation plus encapsulation plus decapsulation is more than twice as large for NTRU-509 as for Kyber-512.

When some factors favor one option and some factors favor another option, someone objectively searching for the best option will think about what weight to put on each factor. Here are three reasons that a careful performance analysis will put very low weight on Kyber's key-generation speedup:

In other words: If an average key is used for just 100 ciphertexts, then Kyber-512 saving 95089 Golden Cove cycles in key generation amounts to under 1000 cycles per ciphertext, which is small next to the per-ciphertext costs of encapsulation, decapsulation, and, especially, communication.
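As a rough Python sketch of that amortization (using the cycle counts above, the 100-ciphertext example, and the roughly-1000-cycles-per-byte communication estimate discussed below):

    keygen_saving = 112866 - 17777      # Golden Cove cycles: NTRU-509 keygen minus Kyber-512 keygen
    ciphertexts_per_key = 100           # example: an average key reused for 100 ciphertexts

    print(keygen_saving / ciphertexts_per_key)   # about 951 cycles saved per ciphertext

    # For comparison, communicating a 768-byte Kyber-512 ciphertext at roughly
    # 1000 cycles per byte is on the order of 768000 cycle-equivalents.
    print(768 * 1000)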

With this in mind, let's look at the "Kyber vs NTRU vs Saber" slide from NIST's March 2022 talk "The beginning of the end: the first NIST PQC standards".

The eye is immediately drawn to the larger red bars on the right. NTRU appears in two of the groups of bars, in both cases with clearly larger bars, meaning worse performance.

The main message NIST is communicating here is that NTRU costs strikingly more than Kyber and Saber. Only a small part of the audience will go to the effort of checking the numbers and seeing how NIST manipulated the choices in its presentation to favor Kyber over NTRU:

Regarding the last point: 1000 is just a rule of thumb. NIST could have posted a rationale for a proposal to use 500 and asked for public comments. But it didn't.

NIST's secret October 2021 Kyber-SABER-NTRU comparison claimed, without citation, that I had said 1000*(PK+CT) was reasonable. Compare this to what I had actually written in 2019 about the costs of sending and receiving a ciphertext, after various NTRU Prime documents had given examples backing up the first sentence:

Typically sending or receiving a byte costs at least three orders of magnitude more than a clock cycle. Taking bytes+cycles/1000 for sntrup4591761 gives 1047+45 = 1092 for the sender, 1047+94 = 1141 for the receiver, which is better than 1248 no matter how few cycles you use.

The numbers here account for Alice sending a 1047-byte sntrup4591761 ciphertext and Bob receiving a 1047-byte ciphertext, on top of about 45000 Haswell cycles for Alice's enc and about 94000 cycles for Bob's dec (which was later sped up a lot, but this barely matters next to the ciphertext sizes). See also the more detailed NTRU examples in Section 6.6 of "Risks of lattice KEMs", filed before NIST's deadline for input at the end of October 2021.
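For concreteness, here's the same bytes+cycles/1000 tally as a few lines of Python (the numbers are the ones in the quote):

    ciphertext_bytes = 1047                  # sntrup4591761 ciphertext size
    print(ciphertext_bytes + 45000 / 1000)   # 1092.0 for the sender (enc: about 45000 Haswell cycles)
    print(ciphertext_bytes + 94000 / 1000)   # 1141.0 for the receiver (dec: about 94000 Haswell cycles)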

NIST's secret comparison continued by saying "David suggests 2000?", apparently referring to a secret performance comparison in 2020 where NIST used "bandwidth cost of 2000 cycles/byte". Evidently NIST was considering multiple options for this number. Maybe more FOIA results will shed more light on how exactly NIST ended up with a NIST-fabricated option that—quelle surprise!—is better for Kyber.

As for key reuse, NIST might try to defend itself by saying, look, there's a separate PK+CT bandwidth graph on the left, which for these KEMs is visually close to a 2000*CT+enc+dec graph. However:

As for the failure to account for Montgomery's trick, NIST might try to defend itself by saying that the OpenSSLNTRU software focused on NTRU Prime, so NIST's only choice was to presume that there's no speedup for NTRU beyond NTRU Prime. In fact, the OpenSSLNTRU paper had already explained why there will be about a 2x speedup.

Tilting the competition, part 3: concealing the fact that NTRU offers the highest security levels. The official call for submissions in 2016 recommended focusing on "categories 1, 2 and/or 3". See below for a full quote.

The call also recommended that submitters "specify some other level of security that demonstrates the ability of their cryptosystem to scale up beyond category 3". NTRU (and NTRU Prime) did this, specifying parameters across a wide range of security levels. See, e.g., the 2018 scatterplot shown above.

In the aforementioned round-2 report from 2020, NIST suddenly

This wasn't following the official evaluation criteria. NIST was retroactively changing "recommend" to "strongly encourage", was retroactively changing "beyond category 3" to "category 5", and was ignoring all of the existing documentation of NTRU's flexibility.

Submissions that provided "category 4", or provided higher security within "category 3", were fully meeting the recommendation in the official evaluation criteria:

Submitters may also provide more than one parameter set in the same category, in order to demonstrate how parameters can be tuned to offer better performance or higher security margins.

NIST recommends that submitters primarily focus on parameters meeting the requirements for categories 1, 2 and/or 3, since these are likely to provide sufficient security for the foreseeable future. To hedge against future breakthroughs in cryptanalysis or computing technology, NIST also recommends that submitters provide at least one parameter set that provides a substantially higher level of security, above category 3. Submitters can try to meet the requirements of categories 4 or 5, or they can specify some other level of security that demonstrates the ability of their cryptosystem to scale up beyond category 3.

But, in 2020, NIST wasn't even trying to follow the official evaluation criteria. It was inventing new evaluation criteria, with no warning, and retroactively applying those criteria to criticize the NTRU and NTRU Prime submissions.

Unsurprisingly, those submissions responded with software for higher security levels:

After insisting on higher security levels (and adopting Core-SVP) in its 2020 round-2 report, NIST praised NTRU for responding with higher security levels (as measured by Core-SVP) than Kyber, right?

Of course not. NIST concealed the fact that NTRU was offering higher security levels than Kyber:

Readers looking at NIST's graphs are left with the impression that NTRU is less flexible than Kyber and, in particular, has more trouble reaching high security levels. This is exactly the opposite of the facts.

Tilting the competition, part 4: throwing away the highest-performance option. NTRU-1229 and NTRU-HRSS-1373 aren't the only options that NIST excluded from its big graph. Let's again look at the low end, the top-performance end, where NIST chose to exclude NTRU-509.

Optimized NTRU-509 software was already available. If NIST had included NTRU-509 in the big graph then that graph would have shown NTRU-509 as the best performer, better than Kyber-512.

Accounting for key reuse would have further favored NTRU-509. Accounting for Montgomery's trick would have further favored NTRU-509. Upgrading from Haswell would have further favored NTRU-509. Counting 1000 cycles per byte for Alice and for Bob would have further favored NTRU-509.

But NIST simply removed NTRU-509 from the big graph, making NTRU look strictly worse than Kyber in that graph.

NIST went even further in its subsequent report selecting Kyber for standardization: the report didn't show NTRU-509 in any of the figures or tables. The report's descriptions of Kyber's performance were visibly more positive than its descriptions of NTRU's performance, as illustrated by NIST's claim that NTRU has "somewhat larger public keys and ciphertexts" than Kyber.

How does NIST stop people from quickly spotting the errors in this "somewhat larger public keys and ciphertexts" claim?

A discretization attack easily hides the fact that NTRU has smaller sizes than Kyber at intermediate security levels, but it doesn't hide NTRU-509 being smaller than Kyber-512. NIST's narrative also relied on kicking out NTRU-509.

How can NIST justify throwing NTRU-509 away?

The only possible answer is claiming that NTRU-509 doesn't reach the minimum allowed NISTPQC security level, the security level of AES-128. But, at the same time, NIST is including Kyber-512, so NIST is claiming that Kyber-512 does reach the security level of AES-128.

NTRU-509 has Core-SVP 2^106, just 6 bits below Kyber-512's original Core-SVP (2^112) or 12 bits below Kyber-512's revised Core-SVP (2^118). Evidently NIST is claiming that AES-128 is inside this narrow margin: in other words, that NTRU-509 has slightly lower security than AES-128 while Kyber-512 has slightly higher security than AES-128.

Let's take a moment to admire how spectacularly fragile this is:

See how important it is for Kyber-512 to reach the AES-128 security level? Without that, Kyber is in big trouble: NTRU provides the highest level of security and the highest level of performance and the best flexibility.

The chaos beyond Core-SVP. How is Kyber-512 supposed to reach the AES-128 security level if AES-128 needs more than 2^140 bit operations to break while the Core-SVP security estimate for Kyber-512 is only 2^118?

This question was briefly addressed in the round-1 Kyber submission in 2017. That submission said that the 2017 version of Kyber-512 had Core-SVP 2^112, falling short of the target by 30 bits, but gave a five-line list of reasons that "it seems clear" that Kyber-512 has at least 30 bits more security than Core-SVP indicates.

The round-2 Kyber submission in 2019 made the same claim regarding the 2019 version of Kyber-512. In 2020, I disproved the stated rationale. To summarize:

To try to rescue Kyber-512, the round-3 Kyber submission

The memory argument again wasn't relevant, given the official evaluation criteria asking for 2^143 "classical gates". Kyber-512 wasn't claiming to require 2^143 "classical gates" to break; it was claiming some undetermined number between 2^135 and 2^167.

Various papers then appeared claiming to cut further bits out of lattice security in various ways, such as a 2022 paper reporting an order-of-magnitude speedup from tweaking the "BKZ" layer inside attacks. Many of the papers made the analyses of lattice security even more complicated and even less stable than before. For example, for one line of "dual attacks":

And this is just one small piece of a giant unholy mess that some cryptographers say we should trust.

How, back in 2022, did NIST end up concluding that Kyber-512 is as hard to break as AES-128? Time to look at some quotes. I'll go through the quotes in two parts: first, looking at what NIST said its notion of hardness was; second, going line by line through what NIST said about Kyber-512's security level.

NIST rescuing Kyber-512, part 1: manipulating the qualification criteria. In the call for submissions, it was crystal clear that cryptosystems had to be at least as hard to break as AES-128 in every "potentially relevant" cost metric:

Each category will be defined by a comparatively easy-to-analyze reference primitive, whose security will serve as a floor for a wide variety of metrics that NIST deems potentially relevant to practical security. ...

In order for a cryptosystem to satisfy one of the above security requirements, any attack must require computational resources comparable to or greater than the stated threshold, with respect to all metrics that NIST deems to be potentially relevant to practical security.

(Emphasis in original.)

The call commented on the "classical gates" to break AES-128 etc. Obviously "classical gates" were a "potentially relevant" cost metric.

What exactly is this metric? The literature defines many different gate sets. NIST dodged years of requests to define exactly which gates it was including as "classical gates". NIST's 2022 selection report finally pinned down one part of this, allowing "each one-bit memory read or write" as a cost-1 gate.

Here's an illustration of how important definitions of cost metrics are:

Keep this in mind if you hear people claiming that the costs of lattice attacks have been thoroughly analyzed.

Anyway, being able to access arbitrarily large amounts of memory for cost 1 isn't realistic: the actual costs of data communication grow with distance. But NIST said in 2020 that anyone proposing a replacement metric "must at minimum convince NIST that the metric meets the following criteria", which "seems to us like a fairly tall order":

There have been no announcements on the NISTPQC mailing list of anyone claiming to have met these minimum criteria, never mind the question of whether such a claim could survive public scrutiny.

Recall that NIST excluded NTRU-509 from the figures and tables in its selection report, the report announcing the selection of Kyber over NTRU. If you look for the report's explanation of why NIST excluded NTRU-509, you'll find the following quote:

The submission specification uses both local and non-local cost models for determining the security category of their parameter sets. For a more direct comparison with the other KEM finalists, the assignment of security categories according to the non-local cost model is appropriate. This is what NIST used for NTRU in the figures and tables in this report.

The underlying definition of "local" accounts for long-distance communication costs, whereas "non-local" allows accessing arbitrarily large amounts of memory for free.

Everything I've been describing from NIST above, and more, sounds consistent with the official call for submissions asking for 2^143 "classical gates", not counting the costs of memory access:

With this in mind, consider the fact that NIST was including Kyber-512 in its figures and tables in the same report. This must mean that NIST was claiming that breaking Kyber-512 takes at least 2^143 bit operations, without accounting for memory-access costs, right?

Nope. NIST doesn't ask Kyber to meet the same criteria as other submissions.

In November 2022, NIST announced a list of parameter sets that it was "planning" to standardize, including Kyber-512. NIST's announcement avoided claiming that Kyber requires as many "classical gates" to break as AES-128. The announcement specifically acknowledged the possibility of Kyber being "a few bits" below (while omitting the possibility of Kyber being many more bits below):

It is clear that in the gate-count metric it is a very close call and that in this metric the pre-quantum security of Kyber-512 may be a few bits below the one of AES-128.

Instead the announcement relied on accounting for "realistic memory access costs" to claim that Kyber-512 qualified for "category 1":

... the best known attacks against Kyber-512 require huge amounts of memory and the real attack cost will need to take the cost of (access to) memory into account. This cost is not easy to calculate, as it depends on the memory access patterns of the lattice algorithms used for cryptanalysis, as well as the physical properties of the memory hardware. Nonetheless, barring major improvements in cryptanalysis, it seems unlikely that the cost of memory access will ever become small enough to cause Kyber-512 to fall below category 1 security, in realistic models of security that take these costs into account. We acknowledge there can be different views on our current view to include Kyber-512.

As a point of clarification: in this email, we refer to parameter sets based on the claimed security strength category where those parameter sets are most recently specified, irrespective of whether those parameter sets actually meet their claimed security level. That said, our current assessment is that, when realistic memory access costs of known attacks are taken into account, all the parameter sets we plan to standardize do, in fact, meet their claimed security strength categories.

(Emphasis added.)

So NIST used a "non-local" free-memory metric to kick out NTRU-509, but used a memory-access-is-expensive metric to claim that Kyber-512 qualifies for "category 1". Can anyone tell me how these two things make sense together?

(As a side note, NIST subsequently stated that its 2022 selection report was merely reflecting "the submitters' claimed security categories" and that the report was making no "assertions about whether or not the parameter sets actually provided the claimed level of security". How does NIST reconcile this with the report kicking out NTRU-509 while keeping Kyber-512? Both of those submissions were claiming to achieve "category 1" given memory-access costs.)

For anyone who cares about reviewability of security analyses, NIST's sudden switch to accounting for Kyber's memory-access costs should be setting off alarm bells.

None of the official Kyber security analyses had tried to quantify the effects of memory on security levels. The Kyber documentation had merely pointed at memory as supposedly saving the day in case there weren't enough "gates".

In the absence of an analysis, how exactly was NIST concluding that memory-access costs were enough to close the gap?

NIST rescuing Kyber-512, part 2: NIST's botched security analysis. In early December 2022, I asked how NIST was arriving at its conclusion that Kyber-512 was as hard to break as AES-128.

NIST followed up with an explanation on 7 December 2022. I'll refer to this explanation as "NIST's botched security analysis of Kyber-512"; for brevity, "NISTBS".

One of the complications in NISTBS is that it considers a large space of scenarios, with analysis steps mixed into comments on the likelihood of each scenario. Even worse, NISTBS doesn't give any confirming end-to-end examples of the tallies obtained in each particular scenario. So a security reviewer has to trace carefully through each step of NISTBS.

Here's one example of a scenario from within the space that NISTBS specifies. I'll call this "scenario X" for future reference. Scenario X makes the following three assumptions: the attack considered is the one from the most recent attack paper taken into account (Matzov); Matzov's 2^137 "gate" estimate is accurate, with no further speedups from the "known unknowns"; and the RAM-cost model in the NTRU Prime documentation is accurate.

Obviously the quantitative conclusions of NISTBS vary depending on the exact assumptions. Considering scenario X is simpler than considering the full space of scenarios. I'll use scenario X as an example below.

Without further ado, here's every line of NISTBS, NIST's botched security analysis of Kyber-512.

We can elaborate a little bit further on our reasoning leading to our current assessment that Kyber512 likely meets NIST category I (similar considerations apply to the other parameter sets we plan to standardize for lattice-based schemes.)

This is a preliminary statement regarding the importance of the calculations at hand. See below for the calculations.

That said, beyond this message, we don’t think further elaboration of our current position will be helpful. While we did consult among ourselves and with the Kyber team,

I filed a formal complaint in December 2022 regarding NIST's lack of transparency for its investigation of Kyber-512's security level. As noted above, I filed a new FOIA request in January 2023.

it’s basically just our considered opinion based on the same publicly available information everyone else has access to.

This is not true. NISTBS starts from, e.g., the Matzov paper's 2^137 estimate for "gates", but then goes beyond this in quantifying the impact of memory costs, something the Matzov paper definitely did not do. What we'll see later is how NISTBS botches its own calculations starting from the Matzov number.

The point of this thread is to seek a broader range of perspectives on whether our current plan to standardize Kyber512 is a good one, and a long back and forth between us and a single researcher does not serve that purpose.

Public review of NIST's security evaluations requires transparency and clarity regarding those evaluations. It is not appropriate for NIST to be asking for a range of perspectives while concealing information. An open and transparent process would involve less "back and forth" than the process that NIST chose.

Here's how we see the situation: In April this year, “Report on the Security of LWE” was published by MATZOV (https://zenodo.org/record/6412487#.Y4-V53bMKUk), describing an attack, assessed in the RAM model to bring some parameter sets, including Kyber512, slightly below their claimed security strength categories.

This is the most recent attack paper mentioned in NISTBS. That's why my definition of scenario X says "the most recent attack paper taken into account (Matzov)".

It's surprising that NISTBS doesn't mention any of the newer attack papers. NIST had hypothesized that there are no "major improvements in cryptanalysis" (see full quote above), but this doesn't justify ignoring the improvements that have already been published!

Anyway, given that NISTBS is calculating security levels starting from the Matzov paper, let's look carefully at those calculations.

"Assessed in the RAM model" appears to be referring to the Matzov paper counting the number of "gates". As a side note, "the" RAM model is ambiguous; the literature defines many different RAM models, and many different sets of "gates", as noted above.

In particular, the report estimates the cost of attacking Kyber512 using a classical lattice attack to be 2^137 bit operations, which is less than the approximately 2^143 bit operations required to classically attack AES-128.

NISTBS takes this 137 as the foundation of various calculations below.

This doesn't mean NISTBS is saying Kyber-512 is broken in 2^137 "gates". NISTBS is saying that Matzov estimated 137, and then NISTBS is calculating various consequences of the 137. If the 137 is inaccurate then the details of the NISTBS calculations (see below) go up or down accordingly.

For purposes of putting together the sources available, the simplest case to consider is that 2^137 accurately counts the number of "gates". Scenario X explicitly assumes this.

However, like previous lattice attacks, the MATZOV attack is based on sieving techniques, which require a large amount of (apparently unstructured) access to a very large memory.

In announcing its plans to standardize Kyber-512, NIST had said that "the best known attacks against Kyber-512 require huge amounts of memory"; here NISTBS is reiterating this.

The RAM model ignores the cost of this memory access,

Indeed, the "gate" counts in question ignore the cost of memory access.

and while the science of comparing the cost of memory access to other costs involved in a large cryptanalytic attack is not as mature as we would like, it seems overwhelmingly likely that, in any realistic accounting of memory access costs, these will significantly exceed the costs that are assessed by the RAM model for lattice sieving.

Here are three obvious examples of quantitative questions raised by this part of NISTBS. Quantification is essential for cryptographic security review.

First, what exactly does "significantly" mean in this context?

Second, how does NISTBS reach its "overwhelmingly likely ... significantly exceed" conclusion?

Third, how does NISTBS get from "significantly exceed" to its conclusion that having Kyber-512 fall short of AES-128 is "unlikely"? (Assuming no "major improvements in cryptanalysis".)

NISTBS does eventually get to some quantified calculations; see below.

The largest practical implementation of sieving techniques we know of, described in detail in “Advanced Lattice Sieving on GPUs, with Tensor Cores” by Ducas, Stevens, and van Woerden (https://eprint.iacr.org/2021/141), was forced by memory access limitations, to adopt settings for bucket size, that would be suboptimal according to the RAM model.

Something else unclear from this part of NISTBS is whether "bucket size ... suboptimal" is supposed to imply NIST's "significantly" claim regarding "costs", and, from there, NIST's claim that it's "unlikely" for Kyber-512 to be easier to break than AES-128.

It should be noted that, increasing the scale of the instances being attacked to near cryptographic scale would probably require extensive hardware optimization, e.g. by using special purpose ASICs, and these techniques, being generally acknowledged to be less effective against memory-intensive tasks, would likely make memory access even more of a bottleneck.

Qualitatively, this is a reasonable summary of what the literature on point is saying. However, at this point the reader still doesn't know how NISTBS gets from this to the claim that Kyber-512 is "unlikely" to be below the AES-128 security level.

Additionally,

This is where NISTBS transitions into quantification.

While the Kyber, Dilithium, and Falcon teams did not give a quantitative assessment of the practical cost of memory access during sieving against cryptographic parameters, assessments by the NTRU and NTRUprime teams gave estimates that would suggest the cost of sieving against category 1 parameters, in models that account for the cost of memory access, is something like 20 to 40 bits of security more than would be suggested by the RAM model.

Finally some numbers to work with! See below for how NISTBS uses these numbers.

As a side note, NIST seems to have very low confidence in the numbers it's citing, saying not just "estimates" but also "suggest" and "something like". But the question I want to focus on here is not how confident NIST is in the sources that it cites. The question is simply what security level NISTBS is calculating for Kyber-512 starting from the sources it cites.

Scenario X explicitly assumes accuracy of one of the two sources that NISTBS cites, specifically NTRU Prime. In context, this choice of source is favorable to Kyber: NISTBS points to NTRU Prime as giving Kyber a 40-bit bonus, and points to NTRU as giving Kyber only a 20-bit bonus.

(For NTRU’s estimates see section 6.3 of the round 3 specification document available at https://ntru.org/index.shtml . For NTRUprime’s estimates see section 6.11 of https://ntruprime.cr.yp.to/nist/ntruprime-20201007.pdf .

Scenario X specifically assumes "accuracy of the RAM-cost model in the NTRU Prime documentation", one of the two sources that NISTBS relies upon for its quantification. See below for the numbers that NISTBS obtains from this source.

The Kyber spec (available at https://pq-crystals.org/kyber/data/kyber-specification-round3-20210804.pdf) discusses, but does not quantify, memory access costs in section 5.3 (Q6))

Indeed, what's cited here doesn't quantify this. So let's keep going with the numbers that NISTBS obtains from other sources.

Taking Matzov's estimates of the attack cost to be accurate,

This is exactly what scenario X is assuming. Of course, NISTBS also considers other possibilities, but, as an illustrative example, let's follow through what NISTBS obtains from this assumption.

only 6 bits of security from memory access costs are required for Kyber512 to meet category 1,

Indeed, 137 is "only" 6 bits short of the 143 goal. NIST wants to find 6 bits of security that it can credit to Kyber-512, plus so much security margin that it can claim not to be worried about the "known unknowns" etc. The point of NISTBS is to argue that the costs of memory do the job.

so in this case Kyber512 would meet category 1 even if the NTRU and NTRUprime submission significantly overestimate the cost of memory access in lattice sieving algorithms.

Here NIST is finding more than its desired 6 bits of security, by giving Kyber the aforementioned "20 to 40 bits" coming from "assessments by the NTRU and NTRUprime teams" of the extra costs coming from memory access.

For example, if NTRU says 20 and if this is accurate, then NISTBS is calculating a security level of 137+20 = 157, safely above 143. (Again, this is explicitly assuming accuracy of the 137 in the first place.)

As another example, if NTRU Prime says 40 and if this is accurate, then NISTBS is calculating a security level of 137+40 = 177, even farther above 143. (Once again assuming accuracy of the 137.)

See how simple this calculation is? NISTBS points to its sources as saying that there are actually "20 to 40 bits of security more than would be suggested by the RAM model" (in NIST's words). So NISTBS adds 20 or 40 to Matzov's 137, giving 157 or 177.

NIST says that even if those sources have "significantly" overestimated the memory-access cost then Kyber-512 is still okay. To figure out what NIST means by "significant" here, simply work backwards from NIST's desired conclusion: if "20 bits" is overestimated by as many as 14 bits, then that still leaves 20−14 bits, covering the desired 6 bits. Anyway, Scenario X simply assumes accuracy of the NTRU Prime RAM-cost model.
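Here's the arithmetic that NISTBS appears to be relying on in this scenario, written out as a few lines of Python; the numbers are the ones quoted above, and the structure is this post's reading of NISTBS, not anything NIST has published as code:

    aes128_target = 143          # bits: approximate "classical gate" cost of breaking AES-128
    matzov_gates = 137           # bits: Matzov's estimate for Kyber-512, ignoring memory access

    ntru_memory_bonus = 20       # bits: what NISTBS attributes to the NTRU submission
    ntruprime_memory_bonus = 40  # bits: what NISTBS attributes to the NTRU Prime submission

    print(aes128_target - matzov_gates)           # 6: the bits NIST needs to find
    print(matzov_gates + ntru_memory_bonus)       # 157: "safely above 143"
    print(matzov_gates + ntruprime_memory_bonus)  # 177: "even farther above 143"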

Further, since about 5 of the 14 claimed bits of security by Matzov involved speedups to local computations in AllPairSearch (as described by section 6 of the MATZOV paper), it is likely that Kyber512 would not be brought below category 1 by the MATZOV attack, as long as state of the art lattice cryptanalyses prior to the MATZOV paper were bottlenecked by memory at all.

It's of course correct that if there's a bottleneck then speeding up computations outside the bottleneck has little impact. See below for how NIST seems to be using this to claim even more security.

However, we acknowledge there is some additional uncertainty in the exact complexity of the MATZOV attack (and all other sieving-based lattice attacks) due to the known-unknowns Dan alludes to (described with quantitative estimates in section 5.3 of the Kyber spec.)

Three reasons that it might be possible to beat Matzov's 2^137 "gates" are (1) inaccuracies in Matzov's analysis (of course, these could also point the other way), (2) missing optimizations covered by the "known unknowns", and (3) missing optimizations beyond the "known unknowns".

Here NIST is pointing to #2. As a side note, it's disturbing to not see NIST accounting for #1 and #3. NIST explicitly assumed that there are no "major" improvements in cryptanalysis; but some of its scenarios have Kyber with very few bits of security margin, and closing those wouldn't require "major" improvements.

Scenario X skips this complication: it explicitly assumes that the 137 is accurate, and that there are no improvements from the "known unknowns".

Nonetheless, even taking the most paranoid values for these known-unknowns (16 bits of security loss),

This is what the Kyber documentation says is the worst case, yes.

the cost of memory access and/or algorithmically making memory access local, would still need to be less than what both the NTRU and NTRUPrime submissions assume.

I found this puzzling when I first saw it: if we take 137, and then subtract a hypothesized 16, then we need to find 22 bits, which is less than the 40 that NISTBS mentioned but not less than the 20. What's going on?

The best explanation I could come up with is that NIST thinks the 16 overlap the 5 bits that NISTBS mentioned above from Matzov, so NIST is actually taking 137−16+5, meaning that NIST has to find only 17 bits, and then the 20 that NISTBS attributes to NTRU is enough (at least if we disregard the uncertainties conveyed by "estimate" and "suggest" and "something like").

Again, Scenario X simply assumes that the 137 is accurate, with no speedups from the "known unknowns", so this complication doesn't arise for that scenario.
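And here's the corresponding reconstruction for the most-paranoid known-unknowns case, again just restating the additions and subtractions described above (this post's reading of NISTBS):

    aes128_target = 143
    matzov_gates = 137
    known_unknowns = 16   # bits: the "most paranoid" loss cited from the Kyber documentation
    local_speedups = 5    # bits: Matzov's speedups to local computation, per NISTBS

    print(aes128_target - (matzov_gates - known_unknowns))                   # 22 bits needed: above NTRU's 20
    print(aes128_target - (matzov_gates - known_unknowns + local_speedups))  # 17 bits needed: below 20 and 40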

The low end estimate of approximately 20 bits (from the NTRU submission) is based on a conjecture by Ducas that a fully local implementation of the BGJ1 sieving algorithm is possible.

Here NIST is pointing to a reason to ask whether the NTRU model is too low. Scenario X explicitly takes the NTRU Prime model, which doesn't trigger this particular issue.

So, in the case that all known-unknowns take on the most paranoid values, this would either require a sieving algorithm with local memory access that is much better than any such published algorithm, and in fact better than any that has been conjectured (at least as far as we are aware),

This is summarizing the NISTBS calculations from the perspective of what algorithmic improvements would be required to break NIST's conclusions. This isn't relevant to scenario X.

or it would require the approximately 40 bits of additional security quoted as the "real cost of memory access" by the NTRUprime submission to be a massive overestimate.

This is summarizing the NISTBS calculations from the perspective of what modeling errors would be required to break NIST's conclusions.

It's concerning to observe deviations between what NISTBS attributes to its source here and what the source actually says. For example, the source says that it's estimating the cost of memory access, whereas NIST incorrectly makes it sound as if the source is mislabeling an estimate as a fact. Furthermore, contrary to what NISTBS's "quoted as" claim leads readers to believe, the "40 bits" that NISTBS claims as memory overhead is not a quote from what the source says on this topic.

Presumably NIST obtained 40 in the following easy way: look at the security-level table on page 103 of the source; observe that pre-quantum sieving for sntrup653 at the top is listed as 169 and 129 for "real" and "free" respectively; subtract the 129 from the 169.

In any event, a lot of things would have to go wrong simultaneously to push the real-world classical cost of known attacks against Kyber512 below category 1, which is why we don't think it's terribly likely.

This is going beyond the per-scenario calculations into an overall probability conclusion.

As a final note, known quantum speedups for lattice sieving are much less effective than Grover’s algorithm for brute force key search, so in the likely scenario where the limiting attack on AES128 is Grover’s algorithm, this would further increase the security margin of Kyber512 over AES128 in practice.

This is yet another complication, and one with several unquantified steps. It's also blatantly inconsistent with earlier comments from NIST on the impact of Grover's algorithm.

For example, in email dated 11 Sep 2017 13:48:59 +0000 to pqc-forum@nist.gov (before the list moved to Google), NIST wrote that "even if we assume the sort of quantum technology often suggested to be possible in 15 years (e.g. ~1GW power requirement and a few hours to factor a 2048 bit number), current technology can still do brute force search cheaper than Grover’s algorithm". Where are the numbers backing up NIST's new claim that Grover's algorithm is "likely" the top threat?

Surely NIST agrees that pre-quantum metrics are at least "potentially" relevant to the practical security of Kyber-512. Consequently, under the official evaluation criteria, NIST can't use post-quantum metrics as a way to rescue Kyber-512 if Kyber-512 is easier to break than AES-128 in the pre-quantum metrics.

I'll focus below on how NISTBS botched its calculation of the pre-quantum Kyber-512 security level.

What the underlying numbers actually mean. Core-SVP is a rough estimate for the number of iterations in a particular type of lattice attack. Each iteration involves large-scale memory access and computation.

Let's look at how the latest versions of the documentation for two submissions, NTRU Prime and Kyber, convert their estimates for the number of iterations into larger security-level estimates. (Note that both of the documents in question are from 2020, so the numbers don't include subsequent attack improvements.)

NTRU Prime focuses on the cost of memory access. In particular, for the important task of sorting N small items, a two-dimensional circuit of area essentially N needs time essentially N^(1/2), whereas a circuit of the same area running for the same time can carry out essentially N^(3/2) bit operations.

To put these two types of costs on the same scale, the NTRU Prime documentation estimates "the cost of each access to a bit within N bits of memory as the cost of N^0.5/2^5 bit operations", and explains how the 2^5 comes from analyzing energy numbers reported by Intel.
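To get a feel for the scale of this model, here's a small Python sketch evaluating it for a few hypothetical memory sizes (the sizes are illustrative; they aren't taken from any particular attack analysis):

    from math import log2

    def access_cost_bitops(n_bits):
        """NTRU Prime's model: one access to a bit within N bits of memory costs
        about N^0.5 / 2^5 bit operations."""
        return n_bits**0.5 / 2**5

    for log_n in (60, 80, 100):                  # hypothetical memory sizes: 2^60, 2^80, 2^100 bits
        cost = access_cost_bitops(2.0**log_n)
        print(log_n, round(log2(cost), 1))       # 25.0, 35.0, 45.0 bits per access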

As a concrete example: sntrup653 has Core-SVP 2^129, i.e., roughly 2^129 iterations, and the NTRU Prime documentation estimates the memory accesses within each iteration as the equivalent of roughly 2^40 bit operations, for a total around 2^169 bit operations.

For comparison, recall that Kyber-512 has Core-SVP 2^118. A rough estimate for the cost of memory accesses in this Kyber-512 attack is the equivalent of 2^154 bit operations.

This might sound similar to the Kyber documentation estimating 2^151 bit operations ("gates"). But the 2^151 estimate in the Kyber documentation isn't an estimate of the bit-operation equivalent of memory access. It's ignoring memory access. It's instead considering the number of bit operations used inside the attack's computations, and estimating that this number is somewhere between 2^135 and 2^167, given the "known unknowns".

Agency desperation, revisited. With the meaning of the numbers in mind, let's briefly summarize how NISTBS tries to use computations and memory to push up the claimed security level of Kyber-512:

Indeed, Core-SVP estimates 2^118 iterations, at least with the round-3 Kyber redefinition of Core-SVP.

Yes, the Kyber-512 documentation has a preliminary estimate of 2^151 bit operations.

Will it? Quantification needed.

This is where NISTBS goes horribly wrong.

The calculation here doesn't even pass basic type-checking. Yes, there's a 2^40 in NTRU Prime for sntrup653, but that's 2^40 bitops/iter. Multiplying this by Matzov's bitops, and portraying the result as bitops, is nonsense from NIST.

Whatever the cost is for computation per iteration, you have to add that to the cost for memory access per iteration. Multiplying is wrong.

In the typical case of both numbers being considerably above 1, multiplying the numbers—which is exactly what NISTBS is doing when it says "40 bits of security more than would be suggested by the RAM model" and "40 bits of additional security"—gives an embarrassing, indefensible overestimate of attack costs.
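To see the size of the error concretely, here's a rough log2 sketch in Python using the numbers quoted in this post: Core-SVP 2^118 iterations for Kyber-512, Matzov's 2^137 total bit operations, and memory access equivalent to roughly 2^154 bit operations for the whole attack. Treating these totals as iterations times per-iteration costs is only an approximation, but it's enough to show the shape of the problem.

    from math import log2

    iterations_bits = 118       # Core-SVP estimate for Kyber-512 (round-3 definition)
    compute_total_bits = 137    # Matzov's estimate, ignoring memory access
    memory_total_bits = 154     # rough bit-operation equivalent of the attack's memory accesses

    compute_per_iter = 2.0**(compute_total_bits - iterations_bits)   # roughly 2^19 bitops/iter
    memory_per_iter = 2.0**(memory_total_bits - iterations_bits)     # roughly 2^36 bitops/iter

    # Correct structure: add the per-iteration costs, then multiply by the iteration count.
    print(round(iterations_bits + log2(compute_per_iter + memory_per_iter), 2))   # about 154.0

    # NISTBS structure: multiply a total bit-operation count by a per-iteration memory cost.
    print(compute_total_bits + 40)   # 177, using sntrup653's 2^40 bitops/iter from the NTRU Prime model

Under these rough numbers, the multiplication inflates the estimate by more than 20 bits.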

To finish this NISTBS recap, let's briefly summarize the happy conclusions that NISTBS draws:

Yeah, sounds great, except that it's all based on a botched calculation.

How easy it is to catch the error. This blog post is aimed at people who want to understand the whole picture of what's going on here. But imagine that you're looking at NISTBS without knowing any of this. How quickly can you see that NISTBS is wrong?

I think the fastest answer is the following simple sanity check. If

how can NIST end up estimating that breaking Kyber-512 costs 2^177 bit operations?

This doesn't tell you where NIST went wrong, but there's a more basic trick that works for that. See where NISTBS is claiming that the NTRU Prime documentation estimates "40 bits of security more than would be suggested by the RAM model" (NIST's words), without giving a full quote from the NTRU Prime documentation?

I'm one of the NTRU Prime submitters. I already knew that this NISTBS claim was false: it's misattributing NIST's wishful thinking to the NTRU Prime documentation. But say you're reading this claim without knowing in advance that it's false. How do you figure out that it's false?

Here's a hard answer and an easy answer:

As a followup, let's imagine that NIST responds by saying "We calculated the 40 by subtracting 129 from 169 on the top row of Table 2". NIST is then implicitly claiming that the 129 is an example of calculating security in "the RAM model". How do you figure out that this implicit claim is false?

This followup similarly has a hard answer and an easy answer:

Asking questions is the normal scientific process for rapidly reaching clarity—and rapidly fixing errors. For the particular error at hand, it takes very few rounds to pinpoint the discrepancy: the 2^129 in the source document for sntrup653 is Core-SVP, not a gate count in what NIST calls "the RAM model".

Of course, this clarification process doesn't work when an agency decides to dodge clarification questions, for example because it doesn't want errors to be fixed.

The research that would be needed for a correct calculation. To fix NIST's calculation, one needs to carefully distinguish two different effects:

Quantifying these effects requires tracing carefully through hundreds of pages of papers on state-of-the-art lattice attacks (not just rewriting the Asiacrypt 2020 paper) to see what would happen if costs of memory access were included.

What makes this really tough is that a change of cost metric also forces reoptimization of the entire stack of attack subroutines, along with all applicable parameters.

Consider, as one of many examples, the choice between low-memory "enumeration" and high-memory "sieving" as a subroutine inside BKZ. The Kyber documentation uses cost metrics that ignore the cost of memory access to conclude that enumeration is less efficient than sieving. If NIST is suddenly saying that memory access makes sieving slower, then obviously there's a gap in the Kyber analysis. Where's the recalculation that accounts for the cost of memory access, and for the large recent improvements in enumeration?

Shortly after Matzov's attack appeared in April 2022, I had sent a message to the NISTPQC mailing list summarizing the complicated analysis that needed to be done. I took, as an example, a less Kyber-favorable scenario in which the "known unknowns" reduce 137 to 121, and I said that simply multiplying the bit-operation count by 2^40 would be wrong:

Does accounting for real RAM costs close the gap between 2^121.5 and 2^143? One might think that, sure, this is covered by the 2^40 mentioned above: Kyber-512 previously had security 2^40 * 2^135.5 = 2^175.5, so a 32.5-bit security margin, and the new paper is reducing this to an 18.5-bit security margin: i.e., the new paper is merely cutting out 40% of the Kyber security margin, rather than breaking Kyber outright.

But let's look more closely at the numbers. As a preliminary point, round-3 Kyber-512 is starting from Core-SVP just 2^112 and revised-Core-SVP just 2^118, with exponent 87% and 91% of 129 respectively, so the obvious estimate is about 2^36 instead of 2^40.

Furthermore, this 2^36 is accounting for the energy cost of accesses to a giant RAM array, while it's clear that many of the bits of security beyond Core-SVP claimed in the round-3 Kyber security analysis are coming from accounting for the cost of local bit operations. These effects don't multiply; they add!

Internally, Core-SVP is starting from estimates of the number of "operations" inside sieving. It makes sense to say that the attacker needs to pay for the large-scale memory access inside each "operation". It also makes sense to say that the attacker needs to pay for all the bit operations inside each "operation". But the local bit operations are an asymptotically irrelevant extra cost on top of the memory access, and the best bet is that they don't make much difference for Kyber-512. The real cost of this type of algorithm is, at a large scale, driven primarily by data motion, not by local computation. ...

So I don't see how current knowledge can justify suggesting that the costs of RAM rescue Kyber-512 from the new attack. It seems entirely possible that the real costs of this Kyber-512 attack are considerably below the costs of a brute-force AES-128 attack. Deciding this one way or the other will require much more serious analysis of attack costs.
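Before returning to the narrative: the same back-of-the-envelope model sketched earlier reproduces this 40-versus-36 adjustment. The sieving exponents 0.292 and 0.2075 are, again, my assumption for illustration, not numbers taken from the quoted message.

    # Rough per-iteration memory-access overhead, in bits, as a function of
    # the Core-SVP exponent, under the assumed model: one access costing
    # N^0.5/2^5 bit operations with N = 2^(0.2075*beta) and beta = core_svp/0.292.
    def mem_overhead_bits(core_svp):
        return 0.2075 * (core_svp / 0.292) / 2 - 5

    print(mem_overhead_bits(129))  # about 40.8: the "40" for sntrup653
    print(mem_overhead_bits(118))  # about 36.9: roughly the 36 for revised-Core-SVP Kyber-512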

An agency desperate to rescue Kyber-512 will take note of the first part of what I had written: great, memory-access costs bump Kyber's security level up by 40 bits, giving us a healthy security margin!

The agency won't listen to the subsequent part saying that, no, this calculation is garbage.

The agency won't even listen to the preliminary adjustment of 40 to 36: we have a healthy security margin, why worry about a few bits here and there?

Meanwhile, if there's something that sounds like a few bits favoring Kyber-512, then the desperate agency happily takes note of that, as the following example illustrates.

The fact that the cost of memory access in each iteration adds to the cost of computation in each iteration, rather than multiplying, has a silver lining for defenders: in the common situation of memory access being dominant, improvements in the cost of computation per iteration make little difference in total cost. I mentioned this in my April 2022 message regarding the Matzov paper:

The new paper seems to have some local speedups to the sieving inner loop, which similarly should be presumed to make little difference next to the memory-access bottleneck, but my understanding is that this is under half of the bits of security loss that the paper is reporting.

Now look at this from the perspective of the desperate agency. Aha, some bits of the Matzov speedup are computation speedups that won't matter next to memory access! As long as we're willing to switch to counting memory access, this effect downgrades the Matzov speedup, which sounds good for Kyber-512!

Sure enough, NISTBS says that "about 5 of the 14 claimed bits of security by Matzov involved speedups to local computations", and portrays this as a "further" reason for confidence in Kyber-512, beyond the "40 bits of additional security" supposedly produced by memory access.

This is double-counting the silver lining. Multiplying the 2^40 cost of memory access per iteration by Matzov's 2^137 bit operations is already assuming (implicitly and incorrectly) that every bit operation has its own iteration, giving 2^137 iterations. This leaves no room for multiplying by a "further" 2^5. The estimated 2^5 is actually on a completely different axis: it's an estimate for the Matzov-vs.-previous speedup ratio in one metric divided by the Matzov-vs.-previous speedup ratio in another metric.
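Here is a toy Python illustration of why the local-computation bits mostly vanish once per-iteration costs are added, and why there is no separate factor of 2^5 left over. The per-iteration memory cost of about 2^37 bit-operation equivalents and the 9/5 split of the 14-bit improvement are my hypothetical numbers for illustration, with the 5 chosen to match the "about 5 of the 14" quoted above.

    # The "silver lining": local speedups barely move a memory-dominated total.
    import math

    mem = 2**37   # hypothetical bit-op equivalent of memory access per iteration (held fixed)

    def total_exponent(iteration_bits, local_per_iter):
        return iteration_bits + math.log2(local_per_iter + mem)

    # Hypothetical split of the 14-bit improvement in the gate metric:
    # 9 bits from fewer iterations, 5 bits from cheaper local computation.
    before = total_exponent(118, 2**33)   # 118 + 33 = 151 gates when memory is ignored
    after = total_exponent(109, 2**28)    # 109 + 28 = 137 gates when memory is ignored

    print(before, after, before - after)  # about 155.1, 146.0, 9.1: a 9-bit drop, not 14

In this toy model the 14-bit gate-count improvement shrinks to about 9 bits once memory access is priced in. That shrinkage is the silver lining, and it is already captured by the addition, leaving nothing for a "further" 2^5 on top of a memory-inclusive total.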

NIST rescuing Kyber-512, part 3: dodging clarification requests. When NISTBS appeared in December 2022, I looked through and saw that NISTBS was multiplying, rather than adding, the cost of memory access per iteration and the cost of computation per iteration, despite my having already pointed out in April 2022 that this was wrong.

But, hmmm, NIST didn't write NISTBS in a verification-friendly way. In particular, as noted above, NIST didn't include any examples of confirming tallies.

It seemed perfectly clear that NIST was adding "40 bits of additional security" to 137 in scenario X. But NIST didn't bother saying, yes, the security level is 177 in that scenario. NIST also didn't make clear where exactly it was getting the 40 from.

When I find mistakes in security analyses, the authors usually say "Thanks for catching the mistake!"—except in lattice-based cryptography, where the authors usually claim that they meant something different from what they had written. This continual evasion is a serious disincentive to security review. If there was any way that I could have misunderstood what NISTBS was saying, then I wanted to know that at the outset, before doing the work of writing up an explanation of the error.

So I posted a short clarification question. Specifically, I spelled out scenario X and asked whether, in that scenario, I was "correctly gathering that you're calculating the Kyber-512 security level as 2^177 (i.e., 34 bits of security margin compared to 2^143 for AES-128), where this 177 comes from the above 137 plus 40, where 40 comes from 169 minus 129 on page 103 of the NTRU Prime documentation, specifically 'real' minus 'free' for pre-quantum sieving for sntrup653".

I was expecting a prompt answer saying "Yes, for that specific scenario we're calculating 177 bits of security, and we're getting the 40 from the 169 and 129 that you mentioned."

What actually happened is that NIST didn't reply.

Seriously? NIST picks a risky, bleeding-edge cryptosystem to standardize for users worldwide, and then doesn't even bother answering clarification questions about what NIST claims the security level is?

I mentioned above that I filed a formal complaint regarding the lack of transparency. Here's what the complaint said:

NIST has publicly claimed that Kyber-512 is as difficult to break as AES-128 (see, e.g., page 8 and Figure 1 of NISTIR 8413 claiming that Kyber-512 is "category 1"), at least by known attacks. As you know, this is the minimum security level allowed by the official evaluation criteria for the NIST Post-Quantum Cryptography Standardization Project.

However, NIST has concealed many details of the investigation that led to this claim. NIST admits that "we did consult among ourselves and with the Kyber team"; NIST still has not published those communications.

I have been trying to review the details of NIST's work on this topic. NIST's lack of transparency makes this review process unnecessarily difficult.

Some information was released by Dr. Moody and Dr. Perlner in response to my requests, but this information is (1) incomplete and (2) unclear. My email dated 8 Dec 2022 03:10:06 +0100 consisted of an "am I correctly gathering" clarification question that could have been immediately answered with a simple "Yes, that's correct" if my understanding of NIST's calculations was correct; but there was no reply, so presumably NIST actually meant something else. Surely the communications that NIST is concealing shed light on how NIST actually reached the above claim.

I am writing to file a formal complaint regarding NIST's failure to promptly and publicly disclose full details of its investigation of the security of Kyber-512. This investigation should have been carried out transparently from the outset, allowing prompt correction of any errors that NIST failed to detect. The fact that NIST was still concealing the details in July 2022 prevented the public from seeing how NIST arrived at NISTIR 8413's claims on the topic. The fact that NIST is continuing to conceal the details today seems inexplicable except as part of NIST trying to limit public review of NIST's security evaluations.

Please acknowledge receipt of this message, and please publish full details of NIST's investigation of the security of Kyber-512.

I escalated the complaint to NIST's Matthew Scholl on 20 January 2023. Scholl didn't reply. The public still hasn't seen the details of NIST's consultations "among ourselves and with the Kyber team" regarding Kyber-512.

Maybe Scholl was sending internal email: "Why is djb asking about this? Did we screw something up again?" Maybe NIST looked again at my April 2022 message, realized how badly it had botched its Kyber-512 security analysis, and then decided that it could get away with being obstructionist rather than admitting the error.

Or maybe NIST, still struggling to catch up on post-quantum cryptography, simply hasn't had time to figure out the meaning of the numbers that it's multiplying to obtain its claims regarding Kyber-512. But this doesn't explain what happened next, namely NIST spending more time dodging clarification questions than it would have spent simply answering the questions.

The same day that I escalated my non-transparency complaint to Scholl, I publicly noted NIST's non-responsiveness, and asked if anyone saw another way to interpret NIST's calculations:

In the absence of such clarity, reviewers have to worry that putting NIST's stated components together in what seems to be the obvious way, and then doing the work to disprove what NIST appears to be claiming about the security margin, will lead to a response claiming that, no, NIST meant something else. It's natural to ask for clarification.

... I've again gone through NIST's 7 December email, and again concluded that for this scenario NIST is claiming 34 bits in the way spelled out below. Is there any way I could be missing something here? Does anyone see another way to interpret NIST's calculations?

NIST dodged, replying that NIST's email "speaks for itself".

Well, yes, I think NISTBS speaks for itself, and is very clearly adding the "40 bits of additional security" to the 137 postulated in scenario X, obtaining 177 in that scenario, i.e., 34 bits more than NIST's 143 target. I was simply asking for NIST to confirm that, yes, in that scenario you take the 137 from Matzov, and add the "40 bits of additional security", giving 177 bits of security.

NIST also tried to shift attention to the question of "whether or not our current plan to standardize Kyber512 is a good one", while downplaying the question of whether NIST had correctly calculated the Kyber-512 security level:

While reviewers are free, as a fun exercise, to attempt to "disprove what NIST appears to be claiming about the security margin," the results of this exercise would not be particularly useful to the standardization process.

Seriously? NIST

I replied, starting with again asking for clarification:

I think I understand what NIST is claiming in that message regarding the quantitative Kyber security level.

I think that my clarification question (focusing on one example, much shorter than NIST's message) is identifying the obvious interpretation.

But then why hasn't NIST simply said "Yes, that's correct" in response?

If the interpretation I've identified differs from what NIST meant, can NIST please simply say what the difference is, so that security reviewers don't have to spend time on the quantitative security claims that NIST currently seems to be making?

I also commented on the notion that this wouldn't be useful input:

If Kyber-512 doesn't meet the minimum security level allowed by the official call for submissions to the NIST Post-Quantum Cryptography Standardization Project then Kyber-512 should not be standardized.

NIST's evaluation of the Kyber-512 security level---after various attack advances newer than the latest version of the Kyber submission---depends explicitly on NIST's calculations of the impact of memory costs.

With all due respect, is it so hard to imagine that NIST has botched those calculations? If NIST is so sure that it got the whole sequence of calculations right, why is it so resistant to clarification questions that will help reviewers check and confirm that NIST got this right? If NIST isn't sure, doesn't that make public review even more important?

In any case, there's a strong public interest in having NIST's security evaluations clearly and promptly explained, to maximize the chance of having errors corrected before bad decisions are set into stone.

NIST dodged again:

It would be helpful to redirect discussion to

1) The question of whether Kyber512 is as hard to break as AES128, (which is a scientific question that cannot be settled by NIST pronouncements)

2) The related question of whether Kyber512 should be standardized, (which is a question where NIST will ultimately need to make a definitive decision, but thus far we have only signaled we are leaning towards yes.)

With this in mind, I would like to note that the technical point on which Dan has asked for clarification is effectively "how much additional security does Kyber512 get on account of memory access costs, according to the NTRUprime submission's memory cost model?" Surely Dan, being on the NTRUPrime team, is in a better position to answer this question than us.

Seriously? NIST

Even if NIST didn't understand by this point that it had screwed up, it certainly knew that

Those conclusions were the result of calculations announced by NIST. It's completely inappropriate for NIST to be trying to deflect clarification questions about those calculations.

Chris Peikert had entered the discussion in the meantime to issue blanket denials that NIST was claiming any particular number of bits of security. Of course, Peikert didn't propose an alternative interpretation of NIST's words "40 bits of additional security".

I posted a line-by-line dissection of NISTBS, very similar to the line-by-line dissection shown above, and asked if anyone could see any alternative interpretation:

If anyone sees any way that I could be misunderstanding the details of NIST's posting, please pinpoint which step is at issue and what the alternative interpretation of NIST's calculation is supposed to be.

There was no reply.

Perhaps NIST will now claim that, when it wrote "40 bits of additional security", it actually meant something different from, um, 40 bits of additional security. But then why didn't NIST promptly answer my first question by saying that, no, they didn't mean 40 bits of additional security, and here's what they did mean?

I went far beyond the call of duty in informing NIST of my understanding of NISTBS, asking for confirmation, and giving them ample time to reply. By dodging, NIST successfully delayed having NISTBS publicly debunked.

At some point one has to draw a line and say that this has gone too far. NIST's miscalculation of Kyber-512's security level is still sitting there misinforming people, and it has to be corrected.

NIST rescuing Kyber-512, part 4: standards making unreviewable security claims. In August 2023, NIST released a draft of its Kyber standard ("ML-KEM"), in particular saying "it is claimed that the computational resources needed to break ML-KEM are greater than or equal to the computational resources needed to break the block cipher ... ML-KEM-512 is claimed to be in security category 1, ML-KEM-768 is claimed to be in security category 3, and ML-KEM-1024 is claimed to be in security category 5".

Impressive use of the passive voice. Is NIST claiming these categories? Are the designers claiming these categories? Is someone else claiming these categories?

Citation needed. Or, really, responsibility needed.

Appendix A of the draft again says that these "categories" are defined as matching or surpassing AES-128, AES-192, and AES-256 respectively in every "potentially relevant" cost metric:

Each category is defined by a comparatively easy-to-analyze reference primitive, whose security will serve as a floor for a wide variety of metrics that NIST deems potentially relevant to practical security. ...

In order for a cryptosystem to satisfy one of the above security requirements, any attack must require computational resources comparable to or greater than the stated threshold, with respect to all metrics that NIST deems to be potentially relevant to practical security.

The latest Kyber documentation says that the Kyber-512 attack cost could be as low as 2^135 "classical gates". That's below NIST's estimate of 2^143 "classical gates" for AES-128, never mind subsequent attack developments. Where exactly is the justification for claiming that Kyber-512 reaches the AES-128 floor in all potentially relevant metrics?

Is NIST now officially declaring that "classical gates" aren't "potentially relevant to practical security"?

If so, how does NIST reconcile this with NIST's 2022 selection report, which used gate counts ("the non-local cost model") as an excuse to kick out the most efficient lattice KEM that NIST was considering, namely NTRU-509?

What exactly are the metrics that NIST is now using for the claim that Kyber-512 is as hard to break as AES-128? When and where were the definitions of those metrics published? (NISTBS doesn't even pass basic type-checking, let alone refer to a clearly defined metric.)

Where's the analysis of Kyber-512's security level in NIST's metrics?

For comparison, where's the analysis of the AES-128 security level in NIST's metrics?

The Kyber documentation concentrates on Kyber-512 for its concrete cost analysis, but the subexponential "dimensions for free" speedup (and subsequent improvements) should do more damage to security at larger sizes. Where are the analyses of the Kyber-768, AES-192, Kyber-1024, and AES-256 security levels in NIST's metrics?

NIST's call for submissions said the following:

All submitters are advised to be somewhat conservative in assigning parameters to a given category, but submitters of algorithms where the complexity of the best known attack has recently decreased significantly, or is otherwise poorly understood, should be especially conservative.

How exactly is this being handled for the latest "category" claims? Are the claims accounting for the 32-bit range of "known unknowns" in the latest Kyber documentation? A wider range given the "unknowns" appearing in newer papers? An even wider range to protect against the likelihood of further attack speedups?

Readers understand the word "claim" to be asserting that something is true, not to be merely saying "we don't think it's terribly likely that this is false". Why does this draft standard conceal NIST's assessment of the probability of failure?

The official NISTPQC call for submissions said "NIST will perform a thorough analysis of the submitted algorithms in a manner that is open and transparent to the public". Scholl said "We operate transparently. We've shown all our work". But the reality is that security reviewers aren't even being given a clear statement of what exactly is being claimed about Kyber's security, let alone what the justification for that claim is supposed to be.

Next steps. Given how unstable and poorly understood the lattice attack surface is, standardizing Kyber-512 (or NTRU-509) would be reckless.

The poor understanding is a sign of danger. Contrary to NISTBS, it's entirely possible that Kyber-512 is substantially easier to break than AES-128 with attacks that have already been published, even considering the costs of memory access. The opposite is also possible. Figuring out the actual status of this bleeding-edge proposal would be a tough research project.

The instability is another sign of danger. How are we supposed to manage the risks of better attacks wiping out many more bits of security?

("Bad news: It's broken. Good news: Comparing it to AES-128 has become much easier.")

AES-128 isn't some stratospheric security level. For example, multi-target attacks against AES-128 take only 2^88 computations to break one of a trillion keys (a trillion targets is about 2^40 keys, and 128 minus 40 is 88). That amount of computation is already feasible for large-scale attackers today. Even if you think this is too expensive to worry about, what happens if a cryptosystem actually loses 10 or 20 or 30 bits compared to that?

A paper at ACM CCS 2021 claimed to be able to show that one-out-of-many-ciphertext attacks against Kyber are as hard as single-ciphertext attacks. But I have a paper "Multi-ciphertext security degradation for lattices" that

The main theorem of my paper isn't easy but now has a proof fully verified by HOL Light. "Asymptotically" refers to what happens when sizes grow to infinity; more research is required to quantify the impact of these multi-ciphertext attacks—and whatever improved attacks people find—upon Kyber's limited range of sizes. This is just one of many unexplored parts of the attack surface.

Some attack avenues have clear quantitative limits: for example, 2^40-target attacks can't eliminate more than 40 bits of security. Replacing Kyber-512 with Kyber-1024 clearly reduces risks (which is not to say that it eliminates all risks: look at what happened to SIKE). There are many previous examples in cryptography of attacks that would have been stopped if cryptographic parameters had been chosen just twice as large as what people had thought was necessary.

Standardizing Kyber-512 means that Kyber-512 will be deployed in many applications that would easily have been able to afford Kyber-1024 or NTRU-1229 or something even larger. This is true even if the standard has Kyber-1024 (or Kyber-768) as an option, even the recommended option. It's easier for a manager to take the fastest option than to investigate whether the fastest option is actually needed. Why exactly won't a manager take the fastest option if NIST has declared it to be a standard option?

Security is supposed to be job #1. So I recommend eliminating Kyber-512. I also recommend that NIST be honest with the public about what happened here:

After everything that has happened, I'm skeptical that we're going to suddenly see Honest NIST, but hope springs eternal.

[2023.10.04 edit: "NIST-509" -> "NTRU-509". 2023.10.05 edits: "than obviously" -> "then obviously"; "40 bits of NTRU Prime" -> "40 bits from NTRU Prime". 2023.10.30 edit: Add Huff reference.]


Version: This is version 2023.10.30 of the 20231003-countcorrectly.html web page.