Constant-Time Implementation and Optimization of SM2 Finite Field Inversion
The article analyzes constant‑time computation of the multiplicative inverse in SM2’s prime field, compares the variable‑time Extended Euclidean Algorithm with a constant‑time Fermat‑based square‑and‑multiply exponentiation, optimizes the fixed exponent using add‑chain generation, and shows this reduces multiplications from ~187 to ~41, making inversion the dominant cost in secure SM2 signing.
In this article we examine a technical aspect of the SM2 signature algorithm: the constant‑time computation of the multiplicative inverse in the underlying prime field. We first introduce the SM2 signing and verification steps, point out potential side‑channel vulnerabilities, and then discuss the mathematical background required for safe implementation.
The SM2 algorithm operates over a prime field \(F_p\) where \(p\) is a large prime (the curve parameter \(n\)). Computing the signature component \(s = (k - r·d_A) / (1 + d_A) \bmod n\) requires the inverse of \(1 + d_A\) modulo \(n\). This inverse is obtained by finding the multiplicative inverse of a non‑zero element in the finite field.
Finite fields \(F_p\) are defined by the usual addition and multiplication rules together with a reduction modulo \(p\). Division is replaced by multiplication with the inverse: for a non‑zero element \(a\), its inverse \(a^{-1}\) satisfies \(a·a^{-1} \equiv 1 \pmod p\). The inverse always exists because \(a\) and \(p\) are coprime.
Two classic methods to compute this inverse are presented:
Extended Euclidean Algorithm (EEA), which finds integers \(u, v\) such that \(u·p + v·b = \gcd(p,b) = 1\). The coefficient \(v\) is the desired inverse of \(b\) modulo \(p\). While fast and readily available in big‑integer libraries, the EEA’s execution time depends on the secret operand and can leak information.
Fermat’s Little Theorem, which states that for prime \(p\) and \(a \not\equiv 0\pmod p\), \(a^{p-1} \equiv 1 \pmod p\). Hence \(a^{-1} \equiv a^{p-2} \pmod p\). This method can be implemented with a constant‑time square‑and‑multiply exponentiation.
The square‑and‑multiply routine used in the article is shown below:
func squareAndMultiply(x, power, mod *big.Int) *big.Int {
out := new(big.Int).SetUint64(1)
for _, bit := range power.bits { //从高位到低位遍历
out.Square(out)
out.Mod(out, mod)
if bit == 1 {
out.Multiply(out, x)
out.Mod(out, mod)
}
}
return out
}
//modInverse(1+dA, n-2, n)给出(1 + dA)^-1To further reduce the number of multiplications, the article mentions using the add‑chain tool to generate an optimized addition chain for the fixed exponent \(n-2\). An example of an add‑chain search for the exponent 211 is shown:
bash-3.2$ addchain search 211
addchain: expr: "211"
addchain: hex: d3
addchain: dec: 211
addchain: best: opt(dictionary(sliding_window(4),heuristic(use_first(halving,delta_largest))))
addchain: cost: 10
_10 = 2*1
_11 = 1 + _10
_1100 = _11 << 2
_1101 = 1 + _1100
_11010000 = _1101 << 4
return _11 + _11010000Using such optimized chains, the exponentiation for SM2’s fixed modulus can be reduced from roughly 187 multiplications (standard square‑and‑multiply) to about 41 multiplications, cutting the computational cost by roughly four‑fold.
Performance measurements on a MacBook Pro M1 show that the constant‑time inversion implementation consumes about 10 µs per signature (≈18 % of total SM2 signing time), while the library EEA implementation takes only ~3 µs but is not constant‑time. The article concludes that constant‑time field inversion remains the dominant cost in a constant‑time SM2 implementation, and future work will address constant‑time implementations of other Chinese cryptographic primitives such as SM4.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.