→ See also our original findings: On the Promises of ‘High-Assurance’ Cryptography, and our paper: The Verification Theater: When Formal Methods Create False Assurance in Cryptographic Libraries
→ Nadim Kobeissi will be presenting these findings in a free talk on February 25: High Assurance Cryptography and the Ethics of Disclosure
On February 5, we published five new vulnerabilities in Cryspen’s libcrux, hpke-rs, and libcrux-psq cryptographic libraries, along with a paper analyzing the structural pattern behind them. Cryspen’s response was to block our GitHub account, close our pull requests containing working fixes, and accuse us of acting in bad faith.
One week later, on February 12, Cryspen published a 1,570-word response titled “The strengths and limits of formal verification.” In it, they write:
So far, no bugs have been found in the verified code, although we would be very interested to know if any of our code generation tools inadvertently introduced a bug, and even more interested if the bug should have been caught by formal verification.
We are happy to oblige. We forked libcrux and found three bugs in the verified code: two in the ML-KEM mathematical specification itself, and one in a correctness proof that Cryspen’s documentation claims is complete.
What Cryspen’s Response Left Out
Before we get to the bugs, it is worth noting what Cryspen chose to address and what they chose to ignore. Their 1,570-word response acknowledged—obliquely—two of the six vulnerabilities (one of which was reported in November 2025, by someone other than us):
- V1 (platform-dependent cryptographic output failure): described as “a fallback implementation for a platform-specific intrinsic.”
- V5 (Ed25519 double clamping): described as “one in an unverified wrapper around HACL* code.”
What they did not mention, in 1,570 words ostensibly about the strengths and limits of their verification methodology:
- V3: Nonce reuse via integer overflow. A `u32` sequence counter in hpke-rs that silently wraps to zero in release builds, reusing AEAD nonces. For AES-GCM, nonce reuse leaks the GHASH authentication key, enabling universal forgeries. For ChaCha20-Poly1305, it enables direct plaintext recovery. Not one word.
- V6: Denial of service via `.unwrap()`. A single malformed ciphertext crashes the process in the PSQ protocol’s AES-GCM decryption path. Breaks IND-CCA security for 9 of 18 supported PSQ ciphersuites. The correct error handling code was present but commented out. Not one word.
- V2: Missing X25519 all-zero validation. An attacker can supply a low-order public key and force the HPKE shared secret to zero, making session keys deterministic and predictable. Forward secrecy completely broken. Not one word.
- V4: ECDSA signature malleability. The ECDSA P-256 implementation lacks low-S normalization, a standard requirement for over a decade. Not one word.
The word “hpke-rs” does not appear anywhere in Cryspen’s response. This is the library that implements RFC 9180, the one depended upon by Signal’s signal-crypto crate and by OpenMLS. Its two most severe vulnerabilities—nonce reuse and broken forward secrecy—were simply not mentioned.
Cryspen wrote 1,570 words about “the strengths and limits of formal verification” without acknowledging four of the six vulnerabilities, including the two most dangerous ones.
Four days after blocking our account and closing our pull requests, Cryspen’s CEO manually copied our fixes and merged them under his own name. The fixes that were characterized as “not a good-faith attempt to improve the project’s security posture” were adopted verbatim.
Bugs in the Verified Code
With that context, let us return to Cryspen’s invitation. They said they would be “very interested” if someone found a bug that should have been caught by formal verification. We forked libcrux and looked.
We found three.
Bug 1: Wrong Specification for ML-KEM Decompression
The F* specification function decompress_d in Spec.MLKEM.Math.fst (line 238) uses the wrong rounding constant. This is not a proof-level issue. This is the mathematical specification itself—the correctness target for the entire ML-KEM decompression pipeline—computing the wrong function.
```fstar
let decompress_d (d: dT {d <> 12}) (x: field_element_d d): field_element
  = let r = (x * v v_FIELD_MODULUS + 1664) / pow2 d in
    r        -- ^^^^ should be: pow2 (d - 1)
```
When rounding integer division by $N$, the correct procedure is to add $N/2$ before dividing. For decompress_d, the denominator is $2^d$, so the rounding constant should be $2^{d-1}$. The value $1664 = (q-1)/2$ is the correct rounding constant for the compress_d function directly above it, which divides by $q = 3329$. The author used 1664 for both functions.
Compare with compress_d (line 225), where 1664 is correct:
```fstar
let compress_d (d: dT {d <> 12}) (x: field_element): field_element_d d
  = let r = (pow2 d * x + 1664) / v v_FIELD_MODULUS in
    ...      -- ^^^^ correct here: dividing by q, so add q/2 = 1664
```
The pattern is clear. For compress_d, $N = q = 3329$, so $N/2 = 1664$. For decompress_d, $N = 2^d$, so $N/2 = 2^{d-1}$. One constant was copied to both.
Concrete Counterexamples
FIPS 203 defines $\text{Decompress}_d(y) = \text{round}\!\left(\frac{q}{2^d} \cdot y\right)$, implemented as $\left\lfloor\frac{q \cdot y + 2^{d-1}}{2^d}\right\rfloor$.
| d | x | Spec (wrong: 1664) | Correct ($2^{d-1}$) | FIPS 203 |
|---|---|---|---|---|
| 1 | 0 | 832 | 0 | 0 |
| 1 | 1 | 2496 | 1665 | 1665 |
| 4 | 0 | 104 | 0 | 0 |
| 4 | 1 | 312 | 208 | 208 |
| 10 | 0 | 1 | 0 | 0 |
| 10 | 1 | 4 | 3 | 3 |
The spec gives decompress_d(d, 0) != 0 for $d < 12$. Decompressing zero should always yield zero. The spec and FIPS 203 disagree on every input for $d \in \{1, 4, 5, 10, 11\}$.
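The counterexample table can be reproduced mechanically. The sketch below is our own plain-integer evaluation of both rounding constants (it is not Cryspen's F* or Rust code), following FIPS 203's $\lfloor(q \cdot y + 2^{d-1})/2^d\rfloor$:

```rust
// Evaluates decompression with the spec's copied constant (1664) versus
// the correct constant 2^(d-1), per FIPS 203's
// Decompress_d(y) = floor((q*y + 2^(d-1)) / 2^d).
const Q: i64 = 3329; // the ML-KEM field modulus

fn decompress_spec_wrong(d: u32, x: i64) -> i64 {
    (x * Q + 1664) / (1i64 << d) // the constant copied from compress_d
}

fn decompress_correct(d: u32, x: i64) -> i64 {
    (x * Q + (1i64 << (d - 1))) / (1i64 << d) // add N/2 = 2^(d-1) before dividing by 2^d
}

fn main() {
    // Decompressing zero should always yield zero; the spec version never does.
    for d in [1, 4, 5, 10, 11] {
        assert_eq!(decompress_correct(d, 0), 0);
        assert_ne!(decompress_spec_wrong(d, 0), 0);
    }
    // Rows from the table above.
    assert_eq!(decompress_spec_wrong(1, 0), 832);
    assert_eq!(decompress_correct(1, 1), 1665);
    assert_eq!(decompress_spec_wrong(4, 1), 312);
    assert_eq!(decompress_correct(10, 1), 3);
    println!("all counterexamples reproduced");
}
```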
The Implementation Is Correct; the Spec Is Wrong
Libcrux’s own Rust implementation (libcrux-ml-kem/src/vector/portable/compress.rs, line 317) uses the correct formula:
```rust
let mut decompressed = a.elements[i].as_i32() * FIELD_MODULUS.classify().as_i32();
decompressed = (decompressed << 1) + (1i32 << COEFFICIENT_BITS);
decompressed = decompressed >> (COEFFICIENT_BITS + 1);
```
This computes $(x \cdot q + 2^{d-1}) / 2^d$—the correct rounding. The CRYSTALS-Kyber reference implementation in C uses the same correct formula. The implementation is right. The specification is wrong. They disagree.
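The equivalence between the implementation's shift sequence and the textbook rounding formula can be checked exhaustively. This is our own sketch over plain integers, with the modulus and the compression widths used by ML-KEM; it is not libcrux code:

```rust
// Checks that ((2*x*q + 2^d) >> (d+1)) -- the implementation's shift
// sequence -- equals (x*q + 2^(d-1)) / 2^d, the correct FIPS 203 rounding,
// for every d-bit input and every d that ML-KEM uses.
const FIELD_MODULUS: i64 = 3329;

fn decompress_via_shifts(d: u32, x: i64) -> i64 {
    let mut t = x * FIELD_MODULUS;
    t = (t << 1) + (1i64 << d); // 2*x*q + 2^d
    t >> (d + 1)                // dividing by 2^(d+1) gives (x*q + 2^(d-1)) / 2^d
}

fn decompress_via_rounding(d: u32, x: i64) -> i64 {
    (x * FIELD_MODULUS + (1i64 << (d - 1))) / (1i64 << d)
}

fn main() {
    for d in [1u32, 4, 5, 10, 11] {
        for x in 0..(1i64 << d) {
            assert_eq!(decompress_via_shifts(d, x), decompress_via_rounding(d, x));
        }
    }
    println!("shift formula matches rounding formula on all inputs");
}
```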
Why It Goes Undetected
The spec function decompress_d is the correctness target for the entire decompression pipeline. It is called via byte_decode_then_decompress (line 270), which feeds into decode_then_decompress_message ($d=1$), decode_then_decompress_u ($d=d_u$), and decode_then_decompress_v ($d=d_v$)—every decompression path in ML-KEM encryption and decryption.
The implementation’s ensures clauses in ind_cpa.rs reference these spec functions. But Libcrux_ml_kem.Ind_cpa.fst is in ADMIT_MODULES (line 4 of the extraction Makefile), so the proof that the implementation matches the spec is admitted—the mismatch is never checked. Meanwhile, verification_status.md claims ind_cpa is yes/yes/yes.
Provenance and Impact
Introduced in commit c1935b9cf8 on August 12, 2024, by Karthikeyan Bhargavan (Cryspen’s Chief Research Scientist and corresponding author of their response post), with the commit message “spec”. This was the very first version of the ML-KEM specification file. The bug has been present since inception.
This is a specification-level bug. The formal specification—the mathematical reference that correctness proofs are verified against—computes a different function than FIPS 203. Even if all proofs were completed and fully verified, they would prove the implementation correct against the wrong mathematical definition. The overall claim “ML-KEM implementation is correct with respect to FIPS 203” is undermined at the foundation.
The fix is to change 1664 to pow2 (d - 1) on line 238:
```diff
 let decompress_d (d: dT {d <> 12}) (x: field_element_d d): field_element
-  = let r = (x * v v_FIELD_MODULUS + 1664) / pow2 d in
+  = let r = (x * v v_FIELD_MODULUS + pow2 (d - 1)) / pow2 d in
     r
```
Bug 2: Missing Inverse NTT in ML-KEM Encryption Specification
The F* specification function ind_cpa_encrypt_unpacked in Spec.MLKEM.fst (line 277) computes the ciphertext component v by adding an NTT-domain polynomial directly to coefficient-domain polynomials, without first applying the inverse NTT. This is a domain mismatch—the mathematical equivalent of adding a frequency-domain signal to a time-domain signal. The Rust implementation correctly applies invert_ntt_montgomery before the addition.
FIPS 203, Algorithm 14 (K-PKE.Encrypt), specifies two structurally identical computations:
$$\text{Step 19:} \quad \mathbf{u} \leftarrow \text{NTT}^{-1}(\hat{\mathbf{A}}^T \circ \hat{\mathbf{y}}) + \mathbf{e}_1$$

$$\text{Step 21:} \quad \mathbf{v} \leftarrow \text{NTT}^{-1}(\hat{\mathbf{t}}^T \circ \hat{\mathbf{y}}) + \mathbf{e}_2 + \mu$$

Both require $\text{NTT}^{-1}$ on the NTT-domain product before adding coefficient-domain error terms. The specification implements step 19 correctly but omits $\text{NTT}^{-1}$ from step 21:
```fstar
-- Line 275 (u, correct):
let u = vector_add (vector_inv_ntt (matrix_vector_mul_ntt ...)) error_1
                   -- ^^^^^^^^^^^^^^ NTT⁻¹ applied

-- Line 277 (v, BUG):
let v = poly_add (poly_add (vector_dot_product_ntt t_as_ntt r_as_ntt) error_2) mu
                           -- ^^^^ returns NTT-domain; missing: poly_inv_ntt
```
The operand domains confirm the mismatch. vector_dot_product_ntt computes $\hat{\mathbf{t}}^T \circ \hat{\mathbf{y}}$ via pointwise NTT-domain multiplication and summation (vector_sum (vector_mul_ntt a b), line 185), returning an NTT-domain polynomial. Meanwhile, error_2 (from sample_poly_cbd2, line 165—CBD sampling with no NTT) and mu (from decode_then_decompress_message, line 212—byte decode and decompress) are both in the coefficient domain. The spec adds them directly.
The decryption specification, thirty lines below, performs the same operation correctly:
```fstar
-- Line 307 (decryption, correct):
let w = poly_sub v (poly_inv_ntt (vector_dot_product_ntt secret_as_ntt (vector_ntt u)))
                   -- ^^^^^^^^^^^^ NTT⁻¹ applied
```
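The cost of mixing domains is easy to demonstrate on a toy ring. The sketch below is not Kyber's negacyclic transform; it is a minimal NTT over $\mathbb{Z}_{17}[x]/(x^8 - 1)$, with 2 as a primitive 8th root of unity, which is enough to show that adding a coefficient-domain error term to an NTT-domain product yields a different polynomial than applying the inverse NTT first:

```rust
// Toy NTT over Z_q[x]/(x^n - 1) with q = 17, n = 8, root w = 2 (2^8 ≡ 1 mod 17).
// An analogy only; ML-KEM's actual transform is negacyclic with q = 3329.
const Q: i64 = 17;
const N: usize = 8;
const W: i64 = 2;      // primitive 8th root of unity mod 17
const N_INV: i64 = 15; // 8 * 15 = 120 ≡ 1 mod 17

fn pow_mod(mut b: i64, mut e: i64) -> i64 {
    let mut r = 1;
    while e > 0 {
        if e & 1 == 1 { r = r * b % Q; }
        b = b * b % Q;
        e >>= 1;
    }
    r
}

fn ntt(a: &[i64; N]) -> [i64; N] {
    let mut out = [0i64; N];
    for k in 0..N {
        for j in 0..N {
            out[k] = (out[k] + a[j] * pow_mod(W, (j * k) as i64)) % Q;
        }
    }
    out
}

fn inv_ntt(a: &[i64; N]) -> [i64; N] {
    let w_inv = pow_mod(W, Q - 2); // inverse of w via Fermat's little theorem
    let mut out = [0i64; N];
    for j in 0..N {
        for k in 0..N {
            out[j] = (out[j] + a[k] * pow_mod(w_inv, (j * k) as i64)) % Q;
        }
        out[j] = out[j] * N_INV % Q;
    }
    out
}

fn poly_add(a: &[i64; N], b: &[i64; N]) -> [i64; N] {
    let mut out = [0i64; N];
    for i in 0..N { out[i] = (a[i] + b[i]) % Q; }
    out
}

fn main() {
    let t = [3, 1, 4, 1, 5, 9, 2, 6];
    let e = [1, 0, 2, 0, 1, 0, 2, 0];
    let t_hat = ntt(&t);
    assert_eq!(inv_ntt(&t_hat), t); // roundtrip sanity check

    // FIPS 203 shape: NTT⁻¹ first, then add the coefficient-domain error.
    let correct = poly_add(&inv_ntt(&t_hat), &e);
    // The spec's bug shape: add e directly to the NTT-domain polynomial.
    let mismatched = poly_add(&t_hat, &e);
    assert_ne!(correct, mismatched);
    println!("domain mismatch produces a different polynomial");
}
```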
The Implementation Is Correct; the Spec Is Wrong
The Rust implementation in libcrux-ml-kem/src/matrix.rs (line 77) has a correct English comment:
```rust
/// Compute InverseNTT(tᵀ ◦ r̂) + e₂ + message
```
And the code (lines 99–105) correctly applies the inverse NTT before the addition:
```rust
for i in 0..K {
    let product = t_as_ntt[i].ntt_multiply(&r_as_ntt[i]);
    result.add_to_ring_element::<K>(&product);
}
invert_ntt_montgomery::<K, Vector>(&mut result); // NTT⁻¹ applied correctly
result = error_2.add_message_error_reduce(message, result);
```
But its F* postcondition (line 88) matches the buggy spec, not the correct implementation:
```fstar
res_spec == Spec.MLKEM.(poly_add (poly_add (vector_dot_product_ntt ...) e2_spec) m_spec)
--                                ^^^^ missing poly_inv_ntt
```
The programmer knew what the code should do. The English is right, the implementation is right, but the formal annotation is wrong.
Why It Goes Undetected
The function compute_ring_element_v is marked verification_status(lax) (line 79 of matrix.rs)—its postcondition is only lax-checked, not proven. And Libcrux_ml_kem.Ind_cpa.fst is in ADMIT_MODULES, so the linking proof between the implementation and the specification is fully admitted.
Provenance and Impact
This bug was introduced in the same commit as Bug 1: c1935b9cf8 on August 12, 2024, the initial creation of the ML-KEM specification file by Karthikeyan Bhargavan.
This is a specification-level bug in ML-KEM encryption. The NTT representation of a polynomial is a completely different sequence of 256 field elements than its coefficient representation. Adding them directly produces a result that is neither the correct NTT-domain nor coefficient-domain value—the spec computes a ciphertext v that differs from what FIPS 203 prescribes for virtually all inputs. Both the u computation (line 275) and the decryption (line 307) correctly apply the inverse NTT, making this an inconsistency within the specification itself.
The fix is to add poly_inv_ntt on line 277:
```diff
-let v = poly_add (poly_add (vector_dot_product_ntt t_as_ntt r_as_ntt) error_2) mu in
+let v = poly_add (poly_add (poly_inv_ntt (vector_dot_product_ntt t_as_ntt r_as_ntt)) error_2) mu in
```
Bug 3: False Proof in ML-KEM Serialization
The F* lemma deserialize_12_bit_vec_lemma_bounded in libcrux-ml-kem/src/vector/portable/serialize.rs (line 691) claims that all elements produced by deserialize_12 are bounded by 1—that is, they fit in a single bit. This is mathematically false. The deserialize_12 function extracts 12-bit values from a byte array; its outputs range from 0 to 4095. The bound should be 12.
```fstar
let deserialize_12_bit_vec_lemma_bounded (v: t_Array u8 (sz 24))
  : squash (
      let result = ${deserialize_12} v in
      (forall (i: nat {i < 16}).
        Rust_primitives.bounded (Seq.index result.f_elements i) 1))
        --                                                      ^ should be 12, not 1
  = _ by (Tactics.GetBit.prove_bit_vector_equality' ())
```
This is not a subtle semantic question. The lemma asserts that a function producing 12-bit values only ever produces values that fit in 1 bit. It is a false mathematical statement embedded in a proof that Cryspen’s own documentation claims is complete.
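To see why a 1-bit bound is absurd here, consider what 12-bit deserialization does: it maps 24 bytes to 16 values in the range 0 to 4095. The sketch below is our own illustration of little-endian 12-bit unpacking (the exact bit layout of libcrux's deserialize_12 is an assumption; only the 24-byte-to-16-value shape comes from the source):

```rust
// Unpacks 24 bytes into 16 12-bit values (two values per 3-byte chunk,
// little-endian). Our own sketch of the shape of ML-KEM's 12-bit
// deserialization; the precise bit layout is assumed, not libcrux's.
fn deserialize_12_sketch(bytes: &[u8; 24]) -> [u16; 16] {
    let mut out = [0u16; 16];
    for (i, chunk) in bytes.chunks_exact(3).enumerate() {
        let (b0, b1, b2) = (chunk[0] as u16, chunk[1] as u16, chunk[2] as u16);
        out[2 * i] = b0 | ((b1 & 0x0f) << 8); // low 12 bits of the chunk
        out[2 * i + 1] = (b1 >> 4) | (b2 << 4); // high 12 bits of the chunk
    }
    out
}

fn main() {
    let values = deserialize_12_sketch(&[0xffu8; 24]);
    // Every output fits in 12 bits -- and saturates them, far outside 1 bit.
    assert!(values.iter().all(|&v| v < (1 << 12)));
    assert!(values.iter().all(|&v| v == 0x0fff));
    println!("outputs are 12-bit values, not 1-bit values");
}
```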
The interface-level lemma deserialize_12_lemma (line 706) correctly states the bound as 12. But it calls deserialize_12_bit_vec_lemma_bounded to discharge its proof obligation, so the entire correctness proof for deserialize_12 rests on a false statement:
```fstar
let deserialize_12_lemma inputs =
  deserialize_12_bit_vec_lemma inputs;
  deserialize_12_bit_vec_lemma_bounded inputs; -- <-- false lemma
  BitVecEq.bit_vec_equal_intro ...
```
A Copy-Paste Bug
Every other deserializer in the same file has the correct bound. The 1 was copied from deserialize_1 and never updated:
| Lemma | Bound | Correct? |
|---|---|---|
| `deserialize_1_bit_vec_lemma` (line 108) | 1 | Yes |
| `deserialize_4_bit_vec_lemma_bounded` (line 274) | 4 | Yes |
| `deserialize_10_bit_vec_lemma_bounded` (line 478) | 10 | Yes |
| `deserialize_12_bit_vec_lemma_bounded` (line 691) | 1 | No |
The bug was introduced in commit 5f6e3c2083 on March 25, 2025, by a Cryspen developer, with the commit message: “fix(ml-kem): deserialize_10/12: remove admit.” The commit replaced admit() calls with dedicated bounded lemmas. The deserialize_10 version was added correctly with bound 10. The deserialize_12 version was copy-pasted from deserialize_1 with the bound left as 1. The irony of a commit titled “remove admit” introducing a false proof should not be lost.
Three Layers of Silent Failure
The false lemma goes undetected because three independent mechanisms conspire to prevent it from ever being checked:
1. “Slow modules” are admitted by default. In fstar-helpers/Makefile.base:
```makefile
VERIFY_SLOW_MODULES ?= no
ifeq (${VERIFY_SLOW_MODULES},no)
ADMIT_MODULES += ${SLOW_MODULES}
```
The portable serialize module is listed in SLOW_MODULES. When VERIFY_SLOW_MODULES is no—the default—the module is added to ADMIT_MODULES and verified with --admit_smt_queries true, which accepts all SMT proofs without checking them.
2. The proof tactic has a lax-mode escape hatch. In fstar-helpers/fstar-bitvec/Tactics.GetBit.fst:
```fstar
let prove_bit_vector_equality' (): Tac unit =
  if lax_on ()
  then iterAll tadmit
  else prove_bit_vector_equality'' ()
```
When the module is lax-checked, the tactic calls tadmit on all goals unconditionally, bypassing the actual bit-vector equality prover entirely. The false lemma is never subjected to any mathematical scrutiny.
3. Full verification is never triggered. A search of the entire libcrux repository finds that VERIFY_SLOW_MODULES=yes is never set in any CI configuration, Makefile target, or script. The proof is never actually checked in any automated pipeline.
The Documentation Says “Yes”
Meanwhile, Cryspen’s own verification_status.md claims the portable serialize module is fully verified:
| Category | File | Lax Checking | Runtime Safety | Correctness |
|---|---|---|---|---|
| Portable | serialize | yes | yes | yes |
Three “yes” entries. Full verification claimed for lax checking, runtime safety, and correctness.
But the file’s own header comment says:
```rust
//! Verification status: Lax
```
And the build system admits the module by default. So we have three sources that contradict each other:
| Source | Claimed status |
|---|---|
| `verification_status.md` | Fully verified (yes / yes / yes) |
| File header comment | Lax |
| Actual build configuration | Admitted by default |
The code that Cryspen documents as formally verified for correctness contains a false mathematical statement that is silently accepted because the proof is never actually checked.
The fix is to change 1 to 12 on line 691:
```diff
 let deserialize_12_bit_vec_lemma_bounded (v: t_Array u8 (sz 24))
   : squash (
       let result = ${deserialize_12} v in
-      (forall (i: nat {i < 16}). Rust_primitives.bounded (Seq.index result.f_elements i) 1)
+      (forall (i: nat {i < 16}). Rust_primitives.bounded (Seq.index result.f_elements i) 12)
     ) =
     _ by (Tactics.GetBit.prove_bit_vector_equality' ())
```
What This Means
Cryspen invited scrutiny of their verified code. We accepted the invitation, and it took one fork and a careful read to find:
- A wrong mathematical specification for ML-KEM decompression—the correctness target that the entire verification effort is built on top of—that disagrees with FIPS 203 on every input and has been wrong since the spec file was first created, authored by Cryspen’s Chief Research Scientist himself.
- A missing inverse NTT in the ML-KEM encryption specification that adds an NTT-domain polynomial to coefficient-domain polynomials—mathematically nonsensical—introduced in the same commit by the same author.
- A false proof annotation in ML-KEM’s portable serialization that claims 12-bit values fit in 1 bit, silently accepted by a build system that defaults to not checking proofs.
All three bugs survive for the same structural reason: the modules that would expose them are admitted by default. The proofs are never checked. The documentation says “yes” when the answer is “no.”
Cryspen’s response to our original findings was to block us, close our fixes, and then quietly adopt them. Their public response acknowledged two of six vulnerabilities and ignored the four most severe. Their verification status documentation claims that code is verified when it is not. Their proof infrastructure silently admits false mathematical statements rather than checking them. And their mathematical specification—the foundation on which every correctness claim rests—computes a different function than the standard it claims to formalize, in two independent places.
The formal verification methodology that Cryspen markets as providing “the highest level of assurance” contains a build system that defaults to not checking proofs, a tactic library that silently accepts all claims in lax mode, specifications that disagree with the standards they model, and documentation that says “yes” when the answer is “no.” The result is a formally verified cryptographic library where wrong specifications and false proofs survive indefinitely because no one—and no automated system—ever actually verifies them.
We continue to believe that formal verification is a valuable technique when applied honestly and communicated precisely. What we object to is the gap between what is claimed and what is delivered. That gap is not a limitation of formal methods. It is a choice.