A couple of weeks after releasing the Rust rewrite, we’ve shipped a deep architectural redesign of Verifpal’s analysis engine in version 0.50.0. The observable behavior hasn’t changed, but the engine is now faster (3x on the full test suite), formally cleaner, and significantly easier to reason about. This post explains what changed and why, starting from the basics.
What Verifpal Does, and Why the Engine Matters
Verifpal analyzes cryptographic protocols. You describe a protocol — who talks to whom, what gets encrypted, what gets signed — and Verifpal checks whether an attacker can break its security properties: steal secrets, forge messages, impersonate participants.
The engine that does this analysis has three jobs:
- Deduction: figure out everything the attacker can learn from observing (and tampering with) the protocol.
- Search: systematically try different attacker strategies — replacing values on the wire, injecting forged messages.
- Query evaluation: check whether any security property has been violated.
The old engine did all three jobs correctly, but the way it organized its work made it hard to see why it was correct. The redesign doesn’t change what the engine computes. It changes how the computation is structured, so that correctness becomes obvious rather than subtle.
Change 1: One Place for All the Rules
The Problem
A protocol verifier is only as good as its equational theory — the formal rules that define what cryptographic operations do. In Verifpal, these rules say things like:
```
DEC(k, ENC(k, m)) → m                      [given: k]
AEAD_DEC(k, AEAD_ENC(k, m, ad), ad) → m    [given: k, ad]
SIGNVERIF(G^sk, msg, SIGN(sk, msg)) → nil  [verification check]
G^a^b = G^b^a                              [DH commutativity]
```
The first rule says: if you have the key k and you see ENC(k, m) — a value encrypted with that key — you can recover the plaintext m. The Diffie-Hellman commutativity rule says: G^a^b and G^b^a are the same value, which is the mathematical property that makes key agreement work.
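As a toy illustration of how such a rule applies to symbolic terms — using hypothetical types, not Verifpal's actual internals — a single rewrite step can be sketched in a few lines of Rust:

```rust
// Minimal sketch of the rule DEC(k, ENC(k, m)) -> m over symbolic terms.
// These types are illustrative; they are not Verifpal's real AST.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Const(String),             // an atomic value like k or m
    Enc(Box<Term>, Box<Term>), // ENC(key, plaintext)
    Dec(Box<Term>, Box<Term>), // DEC(key, ciphertext)
}

/// Rewrite DEC(k, ENC(k, m)) to m when the two keys match symbolically;
/// otherwise leave the term unchanged.
fn rewrite(t: &Term) -> Term {
    match t {
        Term::Dec(k, c) => match c.as_ref() {
            Term::Enc(k2, m) if k == k2 => m.as_ref().clone(),
            _ => t.clone(),
        },
        _ => t.clone(),
    }
}
```

The real rules additionally carry side conditions like `[given: k]`, which are checks against the attacker's knowledge set rather than purely structural matches.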
Before this redesign, these rules were scattered across five files. The declarative specifications lived in primitive/spec.rs. The code that interpreted those specs and checked “can the attacker actually do this?” lived in possible.rs. The code that applied the rules during protocol execution lived in rewrite.rs. DH commutativity was handled in equivalence.rs. And the filtering of constructible values for the attacker lived in inject.rs.
This meant that the answer to “what can the attacker do?” didn’t have a single location in the code. You had to hold all five files in your head simultaneously.
The Solution
A new module, theory.rs, now centralizes every derivation operation. The complete equational theory — all 14 rewrite rules for 21 cryptographic primitives — is listed in a single doc comment at the top of the file:
```
DEC(k, ENC(k, m)) → m                                    [given: k]
AEAD_DEC(k, AEAD_ENC(k, m, ad), ad) → m                  [given: k, ad]
AEAD_ENC(k, m, ad) → ad                                  [passive]
ENC(k, m) → m                                            [given: k]
BLIND(factor, m) → m                                     [given: factor]
PKE_DEC(sk, PKE_ENC(G^sk, m)) → m                        [given: sk]
PKE_ENC(G^sk, m) → m                                     [given: sk (via G^sk)]
SIGNVERIF(G^sk, msg, SIGN(sk, msg)) → nil                [rewrite: def check]
RINGSIGNVERIF(G^sk, ..., RINGSIGN(sk, ...)) → nil        [rewrite: def check]
UNBLIND(factor, SIGN(sk, BLIND(factor, m)), m) → SIGN(sk, m)  [rewrite]
SHAMIR_SPLIT(s)[i] + SHAMIR_SPLIT(s)[j] → s              [recompose: 2-of-3]
SHAMIR_JOIN(SHAMIR_SPLIT(s)[i], SHAMIR_SPLIT(s)[j]) → s  [rebuild]
CONCAT(a, b, ...) → a, b, ...                            [passive: reveals args]
G^a^b = G^b^a                                            [DH commutativity]
```
The eight operations the attacker can perform — decompose, passive-decompose, recompose, reconstruct (primitive), reconstruct (equation), rewrite, rebuild, and password extraction — are all implemented in this one file. The old possible.rs became a one-line re-export shim. No callers needed to change their imports.
The practical benefit: to understand what the attacker can do, read one file. To add a new cryptographic primitive, edit one file (spec.rs). The engine picks up the new rules automatically.
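To give a feel for what "declarative specification" means here, this is a hypothetical sketch of what a spec entry could look like — the field names and layout are illustrative, not the actual contents of spec.rs:

```rust
// Hypothetical shape of a declarative primitive spec. The engine
// interprets entries like this instead of hard-coding each rule, which
// is why adding a primitive is a one-file edit.
struct PrimitiveSpec {
    name: &'static str,
    arity: usize,
    /// Index of the argument revealed to the attacker...
    reveals: Option<usize>,
    /// ...when the attacker already holds these argument indices.
    given: &'static [usize],
}

// ENC(k, m) -> m  [given: k], expressed as data rather than code.
const ENC_SPEC: PrimitiveSpec = PrimitiveSpec {
    name: "ENC",
    arity: 2,
    reveals: Some(1),
    given: &[0],
};
```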
Change 2: The Three-Phase Pipeline
The Problem
When analyzing a protocol, the engine needs to do three things for each principal (participant): resolve all symbolic values, figure out what the attacker learns, and check security queries. The old engine interleaved these steps:
```
for each principal:
    resolve values
    loop:
        check queries          ← interleaved!
        try one deduction step
        if no progress: stop
    check queries again
```
The query check inside the deduction loop meant that the engine could exit early — if all queries happened to resolve mid-deduction, why keep going? This was a performance optimization, and it worked. But it had a subtle cost: the deduction loop was no longer a clean mathematical object. Its behavior depended on when queries resolved, which depended on what order values were processed, which was an implementation detail rather than a formal property.
The Solution: Separation of Concerns
The pipeline is now three explicit phases:
```rust
pub fn verify_standard_run(ctx, km, principal_states, stage) {
    for ps in principal_states {
        // Phase 1: Trace generation
        let ps_resolved = generate_trace(ctx, km, ps, &attacker)?;
        // Phase 2: Knowledge closure (monotone fixed-point)
        compute_knowledge_closure(ctx, km, &ps_resolved, stage)?;
        // Phase 3: Query evaluation
        verify_resolve_queries(ctx, km, &ps_resolved)?;
    }
}
```
Phase 1 (generate_trace) does all the mechanical work: resolve symbolic constant references to concrete values, record which slots differ from the original protocol trace, inject structural templates into the attacker’s knowledge, apply cryptographic rewrite rules (like DEC(k, ENC(k, m)) → m), and check that nothing went wrong.
Phase 2 (compute_knowledge_closure) is a pure deduction loop. It runs the attacker’s derivation rules until nothing new can be learned — no query checks, no early exits, no side effects beyond expanding the attacker’s knowledge set. This is now a textbook monotone fixed-point computation, which gives us a clean correctness argument.
Phase 3 (verify_resolve_queries) checks all security queries against the final, complete attacker knowledge.
Why Fixed Points Matter
This might sound like an academic distinction, but it has real consequences. A monotone fixed point is a concept from order theory. Informally: if you have a function F that only ever adds to a set (never removes), and the set has a maximum size, then repeatedly applying F will eventually reach a stable state where applying F doesn’t change anything. That stable state is the least fixed point.
The attacker’s deduction rules are monotone: the attacker only ever learns new values, never forgets them. The set of derivable values is finite (bounded by the protocol model). So the deduction loop is computing a least fixed point, and the Knaster-Tarski theorem guarantees it converges. Every value in the result is genuinely derivable. No value is missed. These properties hold regardless of evaluation order.
The old interleaved loop approximated this, but the early exits and query checks made the argument indirect. Now it’s immediate: compute_knowledge_closure is a Kleene iteration over a monotone function, and the Knaster-Tarski theorem does the rest.
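Stripped of the cryptographic details, the closure loop has this shape — a generic sketch over integers for brevity; the real knowledge set holds symbolic values:

```rust
use std::collections::HashSet;

/// Kleene iteration toward a least fixed point: apply a monotone step
/// function until it adds nothing new. `step` must only ever propose
/// additions (never removals), and the universe must be finite, so
/// termination is guaranteed.
fn fixed_point<F>(mut known: HashSet<u32>, step: F) -> HashSet<u32>
where
    F: Fn(&HashSet<u32>) -> HashSet<u32>,
{
    loop {
        let derived = step(&known);
        let before = known.len();
        known.extend(derived);
        if known.len() == before {
            return known; // stable: applying step changes nothing
        }
    }
}
```

Because the step function is monotone and the universe is finite, the loop reaches the same closure regardless of the order in which elements are derived — exactly the evaluation-order independence claimed above.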
Change 3: Provenance-Tagged Values
The Problem
Verifpal’s analysis is fundamentally about perspective. When an attacker intercepts and modifies a message on the wire, the sender and receiver see different things. Alice computes k = HASH(secret) and sends it to Bob. The attacker replaces k with HASH(attacker_value). Alice still “sees” her original k. Bob sees the attacker’s version. If the engine gets this wrong — if Alice somehow “sees” the tampered value in her own subsequent computations — authentication queries produce false positives.
The old engine tracked this with three separate copies of every value:
```rust
// Old design
pub struct SlotValues {
    pub assigned: Value,       // current value (after mutation + rewrite)
    pub before_rewrite: Value, // after mutation, before rewrite
    pub before_mutate: Value,  // original (before attacker tampering)
    pub mutated: bool,
    pub creator: PrincipalId,
    pub sender: PrincipalId,
    pub rewritten: bool,
}
```
The decision about which copy a principal “sees” was made by a function called should_use_before_mutate:
```rust
// Old logic
pub fn should_use_before_mutate(&self, i: usize) -> bool {
    self.values[i].creator == self.id
        || !self.meta[i].known
        || !self.meta[i].wire.contains(&self.id)
        || !self.values[i].mutated
}
```
This worked, but the meaning was indirect. You had to know that creator == self.id means “I made this myself, so I see my version.” You had to know that !mutated means “the attacker didn’t touch it, so there’s only one version.” The fields creator, sender, and mutated were scattered alongside the value copies, with no explicit connection between them.
The Solution: Provenance
The redesign introduces an explicit Provenance record:
```rust
pub struct Provenance {
    pub creator: PrincipalId,
    pub sender: PrincipalId,
    pub attacker_tainted: bool,
}

pub struct SlotValues {
    pub value: Value,       // current canonical value
    pub original: Value,    // the principal's own computation
    pub pre_rewrite: Value, // forensic snapshot (for error messages)
    pub rewritten: bool,
    pub provenance: Provenance,
}
```
The visibility decision is now a direct provenance check:
```rust
pub fn should_use_original(&self, i: usize) -> bool {
    !self.values[i].provenance.attacker_tainted
        || self.values[i].provenance.creator == self.id
        || !self.meta[i].known
        || !self.meta[i].wire.contains(&self.id)
}
```
The semantic difference is subtle but important. The old code asked: “was this value mutated, and who created it?” The new code asks: “is this value tainted by the attacker?” The answer is the same — the conditions haven’t changed — but the question is phrased in terms of what happened to the value rather than what fields happen to be set.
This also cleaned up the naming throughout the codebase: assigned → value, before_mutate → original, before_rewrite → pre_rewrite, mutated → provenance.attacker_tainted, set_assigned() → set_value(). These renamings touched every module, but each change made the code read more naturally.
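The Alice/Bob scenario from above can be replayed against a stripped-down version of this check — a hypothetical simplification that keeps only the taint and creator conditions, omitting the `known` and `wire` metadata the real method also consults:

```rust
// Simplified, hypothetical replay of the visibility decision: a
// principal sees its own original value if it created the value itself
// or if the attacker never touched it.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct PrincipalId(u8);

struct Provenance {
    creator: PrincipalId,
    attacker_tainted: bool,
}

fn sees_original(viewer: PrincipalId, p: &Provenance) -> bool {
    !p.attacker_tainted || p.creator == viewer
}
```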
Change 4: Bounded-Depth Search
The Problem
Verifpal’s active attacker analysis works by trying mutations: replacing wire values with attacker-controlled alternatives, then checking whether the attacker can derive new secrets. The old engine used a ten-stage search controlled by seven empirical constants:
```rust
// Old design
const BUDGET: StageBudget = StageBudget {
    exhaustion_threshold: 6,
    max_stage: 10,
    max_subset_weight: 3,
    max_subsets_per_weight: 150,
    max_weight1_mutations: 150,
    max_mutations_per_subset: 50000,
    max_full_product: 50000,
    max_scan_budget: 80000,
};
```
These numbers worked — Verifpal found attacks in Signal, Scuttlebutt, TLS 1.3, and across 94 test models. But several independent dimensions were conflated into a single “stage” counter. The stage number simultaneously controlled:
- Which values to mutate: stages 1–3 only offered minimal replacements (nil, G^nil); stage 4+ offered all known values
- How deeply to nest injected values: stage 5+ enabled recursive injection
- Which primitives to consider: stages 0–1 blocked primitive injection; stage 2 blocked “explosive” primitives (HASH, HKDF, CONCAT)
- When to give up: exhaustion after stage 6
This meant you couldn’t answer a simple question: “what class of attacks does Verifpal guarantee to find?” The answer depended on stage progression, budget exhaustion, and the interaction of four separate thresholds.
On top of this, a special-case function called verify_active_equation_bypass ran before the main search. It directly tried replacing Diffie-Hellman public keys with the attacker’s own key — the canonical man-in-the-middle attack. This was necessary because the staged search might not try this specific combination within its budget. But it was a symptom: if the general search needed a hand-coded shortcut for the most common attack class, the general search had blind spots.
The Solution: One Parameter
The new search is controlled by a single depth parameter:
```rust
struct SearchConfig {
    depth: usize, // Default: 3
    max_subsets_per_weight: usize,
    max_weight1_mutations: usize,
    max_mutations_per_subset: usize,
    max_full_product: usize,
    max_scan_budget: u32,
}
```
At depth d, the engine explores all k-subsets for k ≤ d, with all attacker-known values as potential replacements, and recursive injection nesting bounded by max(0, d − 1). The coverage guarantee is precise:
At depth d, Verifpal explores all attacker strategies involving at most d simultaneous value substitutions, with injected values of nesting depth at most d.
The ten stages collapsed into three depth levels. The STAGE_MUTATION_EXPANSION constant was deleted — all mutations are now available at every depth. The STAGE_RECURSIVE_INJECTION constant and inject_primitive_stage_restricted function were deleted — injection eligibility is controlled solely by the depth parameter.
The equation bypass was deleted entirely. Replacing a received G^sk with G^nil is a natural first-order mutation at depth 1: G^nil is always in the attacker’s knowledge set (it’s the attacker’s own DH public key), and equation-valued slots accept equation replacements. No special case needed. The fact that the old engine required one tells you the staged search was leaving holes that were patched ad hoc.
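The subset enumeration underlying the coverage guarantee is easy to sketch. This is illustrative only — the real engine additionally pairs each subset with candidate replacement values and applies the budget caps:

```rust
/// Enumerate all subsets of `slots` of size 1..=depth: the search
/// frontier that the coverage guarantee quantifies over. Uses bitmasks,
/// so it is only suitable for small slot counts, which is fine for a
/// demonstration.
fn k_subsets(slots: &[usize], depth: usize) -> Vec<Vec<usize>> {
    let n = slots.len();
    let mut out = Vec::new();
    for mask in 1u32..(1 << n) {
        let k = mask.count_ones() as usize;
        if k <= depth {
            out.push(
                (0..n)
                    .filter(|&i| mask & (1 << i) != 0)
                    .map(|i| slots[i])
                    .collect(),
            );
        }
    }
    out
}
```

At depth 1 this degenerates to single-slot mutations — which is exactly where the `G^sk → G^nil` man-in-the-middle replacement now lives, with no special case.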
The Result
The numbers speak for themselves:
| | Old (10-stage) | New (depth-3) |
|---|---|---|
| Parameters | 7 empirical constants | 1 depth + 5 budget caps |
| Stages/depths | 10 | 3 |
| Equation bypass | Special-case function | Natural depth-1 mutation |
| Test suite time | ~11s | ~3.5s |
| Coverage guarantee | Implicit | Explicit (k ≤ d substitutions, nesting ≤ d) |
| Attacks found | All 147 tests | All 147 tests (identical) |
3x faster, clearer coverage semantics, less code. The default depth of 3 finds every attack that the 10-stage system found, in a third of the time.
How It All Fits Together
These four changes are not independent. They build on each other:
- The unified theory gives the engine a single, auditable rule set. When the deduction loop asks “what can the attacker derive?”, the answer comes from one module with one doc comment listing every rule.
- The three-phase pipeline separates what the attacker learns from what security properties hold. The deduction loop runs to completion as a clean fixed point; query evaluation happens after, on the final knowledge set.
- Provenance tagging makes the visibility decision — which value a principal perceives — an explicit metadata check rather than an indirect inference from scattered boolean fields.
- The bounded-depth search eliminates the staged complexity and the equation bypass special case, giving a clear coverage guarantee parameterized by a single number.
Each change made the next one easier. The unified theory simplified the deduction loop. The clean deduction loop enabled the phase separation. The phase separation made the provenance changes straightforward to verify. And the provenance model made the search redesign cleaner because mutation injection has simpler semantics when taint is explicit.
Updated Tooling
The engine redesign was a good occasion to bring the rest of the Verifpal ecosystem up to date.
VSCode extension. The VSCode extension has been updated to work with the Rust rewrite, restoring hover documentation, document formatting, attacker analysis with inline result decorations, and protocol diagram rendering. The interface between the extension and the Rust binary was redesigned to be simpler than the original — pre-formatted display strings replace the old JSON round-trip, cutting two subcommands and eliminating the need for JSON deserialization on the Rust side.
Neovim extension. There’s a new Neovim extension with syntax highlighting, inline diagnostics that mark passing and failing queries directly in the sign column, document formatting via verifpal pretty, and hover documentation for all 21 primitives, all query types, and all keywords. It uses Neovim’s native diagnostic API — no LSP server required — and installs with a single line in lazy.nvim or packer.
Verifpal Manual. Print 18 of the Verifpal Manual covers all of the changes described in this post — the unified theory, the three-phase pipeline, provenance tagging, and the bounded-depth search — alongside the existing reference material.
Verifpal Workbench. The browser-based Workbench has been updated with the latest WASM build, incorporating the engine redesign and all new primitives.