cr.yp.to: 2020.12.06: Optimizing for the wrong metric, part 1: Microsoft Word

2020.12.06: Optimizing for the wrong metric, part 1: Microsoft Word: Review of "An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development" by Knauff and Nejasmic. #latex #word #efficiency #metrics

The boss needed item 3 inserted into a numbered list of hundreds of items. The intern used a mouse to select the original 3 on the screen, then typed 4, then selected the original 4, then typed 5, then scrolled down, then selected the original 5, then typed 6, and so on. Another intern sat watching the screen to make sure there were no mistakes.

I happened to be in the room for other reasons. I remember the horror of watching the beginning of this barbaric editing process. Those poor interns!

When I enter a list of items into the computer, what I'm typing doesn't look like "1. ... 2. ... 3. ..."; it looks like "* ... * ... * ...".

Each asterisk is a special command to the computer, telling the computer to automatically display the next number for the reader. The reader eventually sees "1. ... 2. ... 3. ...", but that isn't what I typed. This small difference produces a tremendous saving of time whenever I insert an item, or delete an item, or move an item.
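
For readers who want to see the idea concretely, here is a minimal sketch in LaTeX. The post doesn't say which markup its asterisks belong to, so treating LaTeX's \item as the counterpart of the asterisk is an assumption, and the item text is placeholder.

```latex
\documentclass{article}
\begin{document}

% Each \item asks LaTeX to supply the next number; no number is typed by hand.
\begin{enumerate}
  \item First placeholder item.
  \item Second placeholder item.
  % A new \item inserted here would renumber everything after it automatically.
  \item Third placeholder item.
\end{enumerate}

\end{document}
```

Inserting, deleting, or moving an \item line is the whole edit; the displayed numbers are recomputed the next time the document is compiled.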

If I decide later to skip the numbers and use bullets instead, I tell the computer to introduce each list item with a bullet. This is one command covering the whole list. There's also a command that does the same thing for the whole document. There isn't separate work for each item. It's no problem if a coauthor later wants to change bullets back to numbers.
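
As a sketch of what such a command can look like, assuming LaTeX and its standard enumerate environment (the post does not name the exact command it has in mind), a single preamble line can change how every first-level numbered list in the document is labeled:

```latex
\documentclass{article}

% One line covering the whole document: every first-level enumerate list
% is now introduced by a bullet instead of a number.
% Deleting this line brings the numbers back.
\renewcommand{\labelenumi}{\textbullet}

\begin{document}

\begin{enumerate}
  \item First placeholder item.
  \item Second placeholder item.
\end{enumerate}

\end{document}
```

To change just one list, it is enough to rename that list's environment from enumerate to itemize (or back). Either way, the individual items are left untouched.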

The interns, I suppose, would be manually changing "1." and "2." and "3." and so on to "•" and "•" and "•" and so on. Or maybe they would be trying to figure out how some search-and-replace feature could do the same thing; let's hope the document doesn't have a sentence somewhere that talks about something that happened in the year 2001. Or maybe the interns would be quitting and finding a better job.

[Note added 2020.12.07: I was expecting that many of my readers would already be accustomed to relying on the computer for automatic numbering. I was surprised, however, to see some comments along the lines of "Inconceivable!" from readers unable to imagine how the interns could have been in a different situation, going through such a shockingly inefficient revision process. Here's a hint: Each item in the list looked like a flush-left paragraph, like the paragraphs in this blog post, adjacent to the left margin. The text being selected by the mouse, for example to change "3" to "4", was to the right of the margin, like the rest of the text in each item.]

Abstraction as a time-saver for authors. This use of asterisks is just one example of how I'm often typing something more abstract than what's seen by the ultimate reader. I don't type "Figure 12" or "see [41]", for example; I type things like "Figure \ref{network-measurements}" and "see \cite{multiplication-survey}", and I let the computer automatically convert "\ref{network-measurements}" and "\cite{multiplication-survey}" into numbers to display for the reader.
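
A minimal LaTeX sketch of this, using the two keys quoted above: the figure body and the bibliography entry are placeholders invented for illustration, and the file needs to be compiled twice so that the numbers are resolved.

```latex
\documentclass{article}
\begin{document}

% The author types names; LaTeX substitutes the current numbers.
Figure~\ref{network-measurements} shows the data;
see \cite{multiplication-survey}.

\begin{figure}
  \centering
  \fbox{placeholder for the actual plot}
  \caption{Placeholder caption.}
  \label{network-measurements}  % the name used by \ref above
\end{figure}

\begin{thebibliography}{9}
  \bibitem{multiplication-survey}  % the name used by \cite above
  Placeholder bibliography entry, for illustration only.
\end{thebibliography}

\end{document}
```

If another figure or reference is later added ahead of these, the displayed numbers change without any edit to the sentence that mentions them.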

With one extra command, covering the entire document, I can tell the computer to include section numbers as part of all figure numbers in the document, so that the figures are easier for the reader to find: e.g., Figures 3.1 and 3.2 and 3.3 are in Section 3. With another command, again covering the entire document, I can tell the computer to cite all authors by name rather than by number.
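
One way to get this effect in LaTeX, offered as a sketch rather than as the command the post has in mind, is \numberwithin from the amsmath package. The commented-out natbib line is likewise an assumption about how author-name citations might be switched on; it additionally needs an author-year bibliography style such as plainnat.

```latex
\documentclass{article}
\usepackage{amsmath}            % provides \numberwithin

% One line covering the whole document: figure numbers now include
% the section number and restart in each section.
\numberwithin{figure}{section}

% Possible document-wide switch to author-name citations (assumption):
% \usepackage[authoryear]{natbib}

\begin{document}

\section{Measurements}

\begin{figure}
  \centering
  \fbox{placeholder for the actual plot}
  \caption{Placeholder caption.}
  \label{fig:placeholder}       % hypothetical label name
\end{figure}

Figure~\ref{fig:placeholder} is numbered together with its section.

\end{document}
```

After two compilation runs the reference reads "Figure 1.1"; no figure or reference in the body has to be edited by hand.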

As another example, I was recently editing a mathematical paper, and I decided that a particular concept would be easier for the reader to remember if I changed the notation that I was using for the concept. The notation was all over the paper, but this change took just a few seconds of editing. I had given a name to the concept, had told the computer once to display this name as a particular notation, and had then typed this name throughout the paper, so there was only one place where I had to change the notation.
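
The same pattern in a minimal LaTeX sketch. The macro name \errorbound and the symbols are hypothetical, chosen only for illustration; the post does not say which concept or notation was involved.

```latex
\documentclass{article}

% The concept is named once. Changing the notation everywhere means
% editing only this line, e.g. replacing \varepsilon with \delta.
\newcommand{\errorbound}{\varepsilon}

\begin{document}

Assume that $\errorbound > 0$. Then each step loses at most
$\errorbound/2$, so two steps lose at most $\errorbound$.

\end{document}
```

The body of the paper only ever mentions the name, which is why a notation change is a one-line edit no matter how often the concept appears.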

Of course one can't, and shouldn't try to, prepare in advance for every possible change to a document. But it's not hard to prepare for the most likely changes. This small initial effort saves a tremendous amount of time later. When I say "small", I'm including the effort to select a document-creation system that's designed to make this sort of thing easy.

(As a side note, programmers will recognize this strategy as an example of the information hiding strategy introduced by Parnas, and will recognize that modern program-creation systems are designed to make this easy.)

Microsoft Word isn't completely missing abstractions, but these abstractions are competing for user-interface resources against features encouraging the user to work at lower abstraction layers. The extra effort to use the abstractions ends up pushing users into doing something simpler, something that just works now, and paying heavily for this choice later when the document is being revised.

Have I done a scientific study proving that Microsoft Word is less efficient than LaTeX? No. I'd love to see a careful study of this topic. Short-term, this would help guide new authors to make sensible choices. Longer-term, insights from this sort of study could be the basis for further improving our document-creation systems. I certainly don't think that the existing systems are perfect. (Example.)

Imagine, however, that a study looks only at the time for someone looking at a printout to create a document matching this printout. This would be blind to the time for subsequent edits. This would be blind to the suffering of those interns. This would incorrectly conclude that typing "1. ... 2. ... 3. ..." and "see [41]" is more efficient than typing "* ... * ... * ..." and "see \cite{multiplication-survey}". It is slightly more efficient in this limited metric, but it is much less efficient in the metric that matters, namely the total time spent by the user.
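
To make the difference between the two metrics explicit, here is a rough formalization. All symbols are illustrative assumptions, not notation from the post or from the study, and \text requires the amsmath package.

```latex
% T_create : time to enter the document for the first time
% r        : number of later revisions
% T_rev,i  : time spent on revision i
\[
  T_{\text{total}} \;=\; T_{\text{create}} \;+\; \sum_{i=1}^{r} T_{\text{rev},i}
\]

% A manually numbered list with n items: inserting a new item at
% position p forces the old items p, p+1, ..., n to be renumbered.
\[
  \text{manual renumberings} \;=\; n - p + 1,
  \qquad
  \text{with automatic numbering: } 0 .
\]
```

A study that stops the clock after the first entry of the text measures only T_create. For the list in the opening story, taking n = 300 purely as an assumed stand-in for "hundreds" and p = 3, a single insertion costs 298 manual renumberings, none of which would show up in that measurement.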

Participants in the Knauff and Nejasmic (KN) study were given a page of text and a limited time to type the page into the computer. There were three different types of text: simple prose, a table, and formulas.

Participants were scored on the basis of how much text they typed and how accurately they typed it. The time limit was so tight that a significant fraction of participants didn't finish typing the whole page, even for the case of simple prose.

The study considered two document-creation systems: LaTeX and Microsoft Word, in each case with "all tools, editors, plug-ins, and add-ons" that participants were "accustomed to using". Of course different "add-ons" could have different efficiency, and of course there are other document-creation systems, but these are topics for another blog post.

The study produced many pages of results, which I'll summarize by saying that Word did slightly better on the prose and much better on the table, while LaTeX did better on the formulas. The study authors made no effort to measure any subsequent document-editing step.

Slithering from one metric to another. The fundamental mistake in the KN paper is the change of cost metric.

The original question was how efficiently authors are creating documents: in particular, how efficiently authors are creating academic research papers. KN claimed in their title to be comparing "efficiency" of "document preparation systems used in academic research". But they then quietly changed this metric in three ways:

Did KN use the honest title "A comparison of the unreliability of rushed retyping of a page using document preparation systems that are also used for academic research and development"? No. Would you expect a journal to accept a paper with such a title?

Instead they used a title claiming, without justification, to measure something else: "An efficiency comparison of document preparation systems used in academic research and development". So they were advertising metric X, the efficiency of academic document preparation, while actually studying metric Y, the unreliability of rushed retyping of an existing page. Anyone who simply asks "Could a Y comparison mispredict an X comparison?" will immediately come up with all sorts of reasons that the answer is yes.

Fake science, piled higher and deeper. I'll close this review by commenting on some quotes from KN:

Where is the justification for the claim that this was "highly realistic"? Is it "highly realistic" to have researchers starting from an existing page of text, rushing to type the page into the computer, and then not spending any time revising the text? Perhaps the authors of this study produce their own papers this way, but they don't say this, and they also don't justify extrapolating from anecdotal evidence.

Let me suggest a followup study of the following hypothesis. Compared to researchers who use LaTeX, researchers who use Microsoft Word produce papers that are significantly worse, not just in appearance but also in content. One of the reasons for this is that researchers who use Microsoft Word need much more time for revisions than researchers who use LaTeX, and as a result are systematically deterred from making revisions that would significantly improve the content of their papers.

If scientists claim that researchers "should reflect on" something, aren't they under an obligation to cite at least a small sample of the previous literature doing exactly this? Of course there were also various responses to this study, and the responses generally sound like things that people had already thought through.

It's certainly striking to see the contrast between (1) LaTeX users being more satisfied than Microsoft Word users and (2) KN claiming that Microsoft Word is more efficient. It's even more striking to see how KN explained this: basically, LaTeX users are emotionally unable to handle the thought that they might be making a mistake in using LaTeX, and thus bury this thought under an artificial feeling of satisfaction. Using LaTeX is a happiness drug! Hey, buddy, want to give LaTeX a try? It'll make you feel great! First page is free!

A much more obvious explanation is that KN screwed up their entire study by choosing an unrealistic efficiency metric. Nowhere did KN acknowledge this explanation. From a psychological perspective, this surprising blindness on the part of KN may be related to motivational factors, such as the tendency to reduce cognitive dissonance. Authors who have put work into a study have a bias towards being satisfied with their own work, and this interferes with them rationally considering the possibility that their study was fundamentally flawed.

Wow. We're supposed to be making decisions about how public money is used on the basis of an "efficiency" study that incorrectly equates different efficiency metrics and displays no understanding of the importance of the selection of a metric?

Did KN carry out a scientific study of the efficiency of preparing academic research papers? No, they didn't. What they actually measured was something different. This change of metric undermined their "facts", their "control", their "systematic methods", their "careful measurement", their "connecting causes and effects", and their "rational evidence-based decisions". Their title and main conclusions were, and are, speculation posing as science.

I'm not saying that quantitative efficiency studies are a bad thing in general. Again, I would love to see a properly designed study of the total cost of creating a document. But this would take serious effort, first to monitor the entire lifecycle of various documents and then to see how efficiently these lifecycles can be reproduced in different document-creation systems.

The authors of this particular study didn't want to bother spending so much time, so they switched to another metric that was easier to measure. The fundamental problem is that this metric says very little about the user's actual costs.
