Quick Intro
LeetArxiv is Leetcode for implementing Arxiv and other research papers.
*Code along in C and Python. Here is 12 months of Perplexity Pro on us.
There’s free GPU credits hidden somewhere below :)
This is part of our series on Practical Index Calculus for Computer Programmers.
Part 1: Discrete Logarithms and the Index Calculus Solution.
Part 2: Solving Pell Equations with Index Calculus and Algebraic Numbers.
Part 3: Solving Index Calculus Equations over Integers and Finite Fields.
Part 4 (we are here): Semaev’s Naive Index Calculus for Elliptic Curves using Summation Polynomials.

Context
We’ve tackled index calculus over the integers, finite fields and algebraic number rings. Now it’s index calculus over elliptic curve points lol.
1.0 Introduction
Let E be an elliptic curve defined over a finite field 𝔽p and let S, T ∈ E(𝔽p). The Elliptic Curve Discrete Logarithm Problem (ECDLP) is the challenge of finding an integer m satisfying (Silverman, 2007):
T = mS
The intractability of the discrete logarithm problem provides a basis for the security of many public-key cryptosystems, and a survey of various algorithms for attacking the DLP is presented in (Howell, 1998).
The index calculus is a fast lifting algorithm designed to solve the classical Discrete Logarithm Problem (DLP) (Silverman, 2009). It is built upon one idea: lift the DLP from 𝔽p to ℤ, solve the problem in ℤ, and then reduce the solution modulo p.
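To make the problem statement concrete, here is a minimal Python sketch that brute-forces m on a tiny curve, assuming the relation is written T = mS. The curve y^2 = x^3 + 2x + 3 over 𝔽97, the point S = (0, 10), the scalar 7 and every function name are our own toy assumptions, not the parameters used in the C code.

```python
# Hypothetical toy curve y^2 = x^3 + A*x + B over F_p (illustration only).
P_MOD, A, B = 97, 2, 3

def ec_add(P1, P2, a=A, p=P_MOD):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                      # P + (-P) = infinity
    if P1 == P2:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P):
    """Double-and-add scalar multiplication k*P."""
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def brute_force_ecdlp(S, T, p=P_MOD):
    """Return the smallest m >= 1 with m*S == T, or None if nothing matches."""
    R = None
    # Hasse bound: the group has at most p + 1 + 2*sqrt(p) points.
    for m in range(1, p + 2 + 2 * int(p ** 0.5) + 1):
        R = ec_add(R, S)
        if R == T:
            return m
    return None

S = (0, 10)                # on the curve because 10^2 = 100 = 3 (mod 97)
T = ec_mul(7, S)           # pretend the scalar 7 is unknown
m = brute_force_ecdlp(S, T)
print("recovered m =", m)
```

The loop costs one point addition per candidate, which is exactly why brute force is hopeless at cryptographic sizes and why faster attacks are interesting.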
However, index calculus lifting techniques tend to fail on elliptic curves.
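For contrast, here is a small Python sketch of the lifting step in the classical setting: take g^k mod p, view it as an integer in ℤ, and keep it only if it factors completely over a factor base of small primes. The prime 1019, the base 2 and the function names are hypothetical choices for illustration.

```python
def factor_over_base(n, factor_base):
    """Return the exponent vector of n over factor_base, or None if n is not smooth."""
    exponents = []
    for q in factor_base:
        e = 0
        while n % q == 0:
            n //= q
            e += 1
        exponents.append(e)
    return exponents if n == 1 else None

def find_smooth_relations(g, p, factor_base, wanted):
    """Collect (k, exponents) pairs where g^k mod p is smooth over factor_base."""
    relations = []
    k = 1
    while len(relations) < wanted and k < p:
        lifted = pow(g, k, p)            # an element of F_p, lifted to an integer
        exps = factor_over_base(lifted, factor_base)
        if exps is not None:
            relations.append((k, exps))
        k += 1
    return relations

p, g = 1019, 2                           # toy parameters (assumed)
factor_base = [2, 3, 5, 7]
relations = find_smooth_relations(g, p, factor_base, wanted=5)
for k, exps in relations:
    print(f"{g}^{k} mod {p} has exponents {exps} over {factor_base}")
```

Each relation says k ≡ Σ eᵢ·log(qᵢ) (mod p - 1). Elliptic curve points have no obvious analogue of "factors into small primes", which is the gap the rest of this article works around.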
1.1 Why Index Calculus Fails on Elliptic Curves
Informally, index calculus fails on elliptic curves because prime numbers (or irreducible polynomials) are a hand-wavy concept in elliptic curve groups. That is, there is no straightforward notion of smoothness in a curve’s group structure (Neves, 2013).
(Petit & Quisquater, 2012) attempt to identify irreducible polynomials, aka primes, in binary elliptic curves, while (Gaudry, 2009) and (Diem, 2011) relate integers in ℤ to their corresponding binary polynomial elliptic curves in the extension field GF(2^n).
As summarized in (Silverman, 2009), these index calculus techniques fail to generalize because:
They are difficult to compute.
They do not preserve relations.
The curves where they work are insecure for cryptography.
2.0 Semaev’s Naive Index Calculus Technique
The paper Summation Polynomials and the Discrete Logarithm Problem on Elliptic Curves (Semaev, 2004) introduced the idea of index calculus on an elliptic curve over a prime finite field.
Index calculus over elliptic curves is structured similarly to index calculus over the integers and algebraic rings:
Construct a factor base.
Pick random x values modulo the field prime p and test if you can find corresponding y coordinates. These points are our factor base.
Collect relations using Semaev polynomials.
We are searching for sums of points in our factor base that equal sums of multiples of our generator P and target point Q.
Semaev polynomials are a shortcut to finding these points.
Use linear algebra to solve a system of equations.
Find the Reduced Row Echelon form of the system like in Part 3.
If you use Semaev3 then we provide a quick script to solve simultaneous equations over a finite field here.
Below we go into the fine implementation details of each step.
2.1 Constructing an Elliptic Curve Factor Base
An elliptic curve factor base is defined in an algebraic rather than an arithmetic way since smoothness does not exist (Diem, 2011).
Our example implementation uses the C library linked here with the secp curve parameters :
In practice, one selects random x-coordinates and then attempts to find a corresponding y value. In our code we just generate all possible points and then select random points.
*In practice you wouldn’t know which scalar multiple of the generator gives each factor base element.
In C, this resembles:
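The "pick a random x, look for a y" approach can also be sketched in Python. This is an illustration under our own assumptions: a hypothetical toy curve y^2 = x^3 + 2x + 3 over 𝔽10007 (not the secp parameters above), with the prime chosen so that p ≡ 3 (mod 4) and a square root is a single modular exponentiation.

```python
import random

# Hypothetical toy curve y^2 = x^3 + A*x + B over F_p with p % 4 == 3.
P_MOD, A, B = 10007, 2, 3

def lift_x(x, a=A, b=B, p=P_MOD):
    """Return a point (x, y) on the curve if x^3 + a*x + b is a square mod p, else None."""
    rhs = (x * x * x + a * x + b) % p
    y = pow(rhs, (p + 1) // 4, p)        # square-root candidate, valid because p % 4 == 3
    return (x, y) if (y * y) % p == rhs else None

def build_factor_base(size, rng):
    """Draw random x-coordinates until `size` of them lift to curve points."""
    base, seen = [], set()
    while len(base) < size:
        x = rng.randrange(P_MOD)
        if x in seen:
            continue
        seen.add(x)
        point = lift_x(x)
        if point is not None:            # roughly half of all x values succeed
            base.append(point)
    return base

factor_base = build_factor_base(5, random.Random(1))
for point in factor_base:
    print(point)
```

Unlike the "generate all points" shortcut, this version never learns the discrete log of a factor base element, which matches the situation an attacker is really in.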
2.2 Using Semaev Summation Polynomials to Find Relations
TL;DR: Semaev polynomials are a computational shortcut to find points that sum to infinity. Each elliptic curve has its own set of Semaev polynomials. You evaluate Semaev polynomials modulo the field prime p (the field characteristic), not modulo the generator order.
Informally, a Semaev summation polynomial is a shortcut to test if some points sum to zero just by looking at their x-coordinates.
More formally, the n-th Semaev summation polynomial fn is a symmetric polynomial, built recursively from the curve equation, with the property that fn(x1, x2, ..., xn) = 0 if and only if there exist y-coordinates y1, ..., yn (in the algebraic closure of 𝔽p) such that every point (xi, yi) lies on E and (x1, y1) + (x2, y2) + ... + (xn, yn) = P∞ (Semaev, 2004).
Note: P∞ is the point at infinity
2.2.1 Constructing Semaev Polynomials
Semaev polynomials are constructed via recursion. It is difficult to construct these summation polynomials after 7 recursion steps (Kazuhiro, 2020).
The first three Semaev polynomials are:
*Correction: the second polynomial is S2 = x1 - x2.
Finding the fourth polynomial involves the resultant* of two copies of the third Semaev polynomial that share a variable X:
S4(x1, x2, x3, x4) = Res_X(S3(x1, x2, X), S3(x3, x4, X))
*The resultant of two polynomials is the determinant of their Sylvester matrix and is used to compute the intersection of two algebraic curves (Meliot, 2020).
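To see what "determinant of the Sylvester matrix" means in code, here is a small pure-Python sketch for two univariate polynomials with integer coefficients. It is a toy: the real S4 needs a resultant of multivariate polynomials, which is better left to a computer algebra system. The function names are ours.

```python
from fractions import Fraction

def sylvester_matrix(f, g):
    """f, g are coefficient lists, highest degree first. Returns the Sylvester matrix."""
    m, n = len(f) - 1, len(g) - 1        # degrees of f and g
    size = m + n
    rows = []
    for i in range(n):                   # n shifted copies of f
        rows.append([0] * i + f + [0] * (size - m - 1 - i))
    for i in range(m):                   # m shifted copies of g
        rows.append([0] * i + g + [0] * (size - n - 1 - i))
    return rows

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    mat = [[Fraction(v) for v in row] for row in matrix]
    size = len(mat)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if mat[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            mat[col], mat[pivot] = mat[pivot], mat[col]
            det = -det                   # a row swap flips the sign
        det *= mat[col][col]
        for r in range(col + 1, size):
            factor = mat[r][col] / mat[col][col]
            for c in range(col, size):
                mat[r][c] -= factor * mat[col][c]
    return det

def resultant(f, g):
    return determinant(sylvester_matrix(f, g))

# x^2 - 1 and x - 1 share the root x = 1, so their resultant vanishes.
print(resultant([1, 0, -1], [1, -1]))
# x^2 - 1 and x - 2 share no root, so their resultant is nonzero.
print(resultant([1, 0, -1], [1, -2]))
```

The resultant is zero exactly when the two polynomials have a common root. That is why eliminating the shared variable X from two copies of S3 leaves a condition on x1, x2, x3, x4 alone.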
In C, we can write the first three Semaev polynomials:
In practice, overflow is highly likely so we’d implement modulo ops pretty early:
*Semaev polynomials are evaluated modulo the field prime p (the field characteristic), not the generator order. The field characteristic is not the same thing as the number of points on the curve.
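For readers following along in Python, here is a sketch of S2 and S3 using the formulas from (Semaev, 2004) for a curve y^2 = x^3 + Ax + B, reduced modulo p at every step. The toy curve over 𝔽97, the two sample points and the function names are our own assumptions. Python integers never overflow, so the early reductions only mirror what a C version needs.

```python
# Hypothetical toy curve y^2 = x^3 + A*x + B over F_p (illustration only).
P_MOD, A, B = 97, 2, 3

def ec_add(P1, P2, a=A, p=P_MOD):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P1 == P2:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def semaev2(x1, x2, p=P_MOD):
    """S2 vanishes when the two points are equal up to sign."""
    return (x1 - x2) % p

def semaev3(x1, x2, x3, a=A, b=B, p=P_MOD):
    """S3 = (x1-x2)^2*x3^2 - 2*((x1+x2)*(x1*x2+a) + 2*b)*x3 + (x1*x2-a)^2 - 4*b*(x1+x2)."""
    t2 = (x1 - x2) ** 2 % p
    t1 = 2 * ((x1 + x2) * (x1 * x2 + a) + 2 * b) % p
    t0 = ((x1 * x2 - a) ** 2 - 4 * b * (x1 + x2)) % p
    return (t2 * x3 * x3 - t1 * x3 + t0) % p

P1, P2 = (0, 10), (3, 6)                 # both lie on the toy curve
R = ec_add(P1, P2)                       # so P1 + P2 + (-R) = infinity
print("x(P1 + P2) =", R[0])
print("S3 on a true relation:", semaev3(P1[0], P2[0], R[0]))
print("S3 on a wrong x:      ", semaev3(P1[0], P2[0], 1))
```

Note that S3 only ever sees x-coordinates, so it cannot tell R from -R. Whenever it vanishes, the signs still have to be confirmed with real point arithmetic.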
2.2.2 Using Semaev Polynomials to Find Relations
We find relations like this:
Multiply generator P and target point Q by random scalars a and b, then find the sum aP + bQ.
Search for points in our factor base that equal the sum in Step 1 (up to sign).
In practice, we pass the x-coordinates of the multiples of P and Q to our Semaev functions and search our factor base for a point whose x-coordinate makes the polynomial vanish.
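The two steps above can be sketched in Python as follows. Everything here is a toy assumption: the curve over 𝔽97, the generator G = (0, 10), the seeded random secret, and a factor base built from known multiples of G so the result can be checked. A relation (a, b, i) means aP + bQ + Fᵢ = P∞.

```python
import random

P_MOD, A, B = 97, 2, 3                   # hypothetical toy curve y^2 = x^3 + A*x + B
G = (0, 10)                              # toy generator: 10^2 = 100 = 3 (mod 97)

def ec_add(P1, P2):
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                      # point at infinity
    if P1 == P2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, P):
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def semaev3(x1, x2, x3):
    t2 = (x1 - x2) ** 2 % P_MOD
    t1 = 2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B) % P_MOD
    t0 = ((x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)) % P_MOD
    return (t2 * x3 * x3 - t1 * x3 + t0) % P_MOD

def point_order(P):
    """Count additions until we hit the point at infinity."""
    n, R = 1, P
    while R is not None:
        R = ec_add(R, P)
        n += 1
    return n

def find_relations(Q, factor_base, order, wanted, rng):
    relations = []
    while len(relations) < wanted:
        a, b = rng.randrange(1, order), rng.randrange(1, order)
        aP, bQ = ec_mul(a, G), ec_mul(b, Q)
        if bQ is None:                   # possible when the order is composite
            continue
        for i, F in enumerate(factor_base):
            if semaev3(aP[0], bQ[0], F[0]) != 0:
                continue                 # x-coordinates cannot sum to infinity
            # S3 ignores signs, so confirm with real point addition.
            if ec_add(ec_add(aP, bQ), F) is None:
                relations.append((a, b, i))
                break
    return relations

rng = random.Random(2024)
order = point_order(G)
secret = rng.randrange(1, order)         # the scalar we pretend not to know
Q = ec_mul(secret, G)
fb_logs = rng.sample(range(1, order), k=min(4, order - 1))
factor_base = [ec_mul(k, G) for k in fb_logs]
relations = find_relations(Q, factor_base, order, 3, rng)
for a, b, i in relations:
    print(f"{a}*P + {b}*Q + F{i} = infinity")
```

Each hit translates to a + b·x + Lᵢ ≡ 0 modulo the generator order, which is the shape of the congruences solved in the next section.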
3.0 Solving Linear System of Relations
Finding relations is the hardest part. Now we can solve a linear system modulo the generator point’s order (not the field characteristic).
Say we found these relations:
The next steps are exactly like index calculus from before:
Assign a variable L to each element in the factor base to indicate this is the logarithm to the generator base P:
(3258, 19385) → log_P(3258, 19385) = L0
(17857, 1358) → log_P(17857, 1358) = L1
(16530, 11658) → log_P(16530, 11658) = L2
(16562, 7608) → log_P(16562, 7608) = L3
(4656, 2303) → log_P(4656, 2303) = L4
Each of our relations represents:
a*1 + b*x + 1*Lᵢ ≡ 0 (mod generatorOrder)
So we solve this system of congruences:
11602*x + L3 ≡ -14664 (mod 20947)
10579*x + L0 ≡ -8235 (mod 20947)
18271*x + L0 ≡ -12820 (mod 20947)
3182*x + L2 ≡ -11886 (mod 20947)
14810*x + L3 ≡ -1141 (mod 20947)
2084*x + L2 ≡ -15300 (mod 20947)
19034*x + L3 ≡ -6894 (mod 20947)
7773*x + L4 ≡ -2307 (mod 20947)
16991*x + L1 ≡ -14868 (mod 20947)
4568*x + L2 ≡ -14101 (mod 20947)
12079*x + L2 ≡ -820 (mod 20947)
13793*x + L3 ≡ -1556 (mod 20947)
In practice one would use the technique from Part 3 to find the RREF of the matrix. In our example, we construct systems of simultaneous equations like:
11602*x + L3 ≡ -14664 (mod 20947)
19034*x + L3 ≡ -6894 (mod 20947)
This is a simultaneous equation involving only two variables so we can write a Python script to solve it fast:
def solve_two_variables_system(coeffs1, coeffs2, n):
    """
    Solve system:
    a1*x + b1*L ≡ c1 (mod n)
    a2*x + b2*L ≡ c2 (mod n)
    """
    a1, b1, c1 = coeffs1
    a2, b2, c2 = coeffs2
    # Solve using elimination
    # Multiply first eq by b2, second eq by b1
    # Then subtract to eliminate L
    coeff_x = (a1 * b2 - a2 * b1) % n
    rhs = (c1 * b2 - c2 * b1) % n
    try:
        inv_coeff_x = pow(coeff_x, -1, n)
        x = (rhs * inv_coeff_x) % n
        # Now solve for L using first equation
        # b1*L ≡ c1 - a1*x (mod n)
        rhs_L = (c1 - a1 * x) % n
        inv_b1 = pow(b1, -1, n)
        L = (rhs_L * inv_b1) % n
        # Verify both equations
        eq1 = (a1 * x + b1 * L) % n
        eq2 = (a2 * x + b2 * L) % n
        print("Solution:")
        print(f"x = {x}")
        print(f"L = {L}")
        print("Verification:")
        print(f"Eq1: {a1}*{x} + {b1}*{L} = {eq1} (should be {c1})")
        print(f"Eq2: {a2}*{x} + {b2}*{L} = {eq2} (should be {c2})")
        return x, L
    except ValueError:
        print(f"No solution: coefficient {coeff_x} has no inverse mod {n}")
        return None, None

# Solving:
# 11602*x + 1*L ≡ -14664 (mod 20947)
# 19034*x + 1*L ≡ -6894 (mod 20947)
n = 20947
coeffs1 = [11602, 1, (-14664) % n]  # [a1, b1, c1]
coeffs2 = [19034, 1, (-6894) % n]  # [a2, b2, c2]
x, L = solve_two_variables_system(coeffs1, coeffs2, n)

x is our target. From the C code, our target point indeed had a value of 19691. It worked!
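For more than two unknowns, the RREF route mentioned earlier can be sketched as below. This is our own minimal Gauss-Jordan elimination modulo a prime, not the Part 3 script. It is fed four of the congruences listed above (the two involving L0 and two involving L3), with the unknowns ordered as [x, L0, L3].

```python
def rref_mod_prime(rows, n):
    """Reduce an augmented matrix to reduced row echelon form modulo the prime n."""
    mat = [[v % n for v in row] for row in rows]
    n_rows, n_cols = len(mat), len(mat[0])
    pivot_row = 0
    for col in range(n_cols - 1):        # the last column is the right-hand side
        pivot = next((r for r in range(pivot_row, n_rows) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[pivot_row], mat[pivot] = mat[pivot], mat[pivot_row]
        inv = pow(mat[pivot_row][col], -1, n)
        mat[pivot_row] = [(v * inv) % n for v in mat[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and mat[r][col] != 0:
                factor = mat[r][col]
                mat[r] = [(vr - factor * vp) % n
                          for vr, vp in zip(mat[r], mat[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return mat

ORDER = 20947                            # generator order from the congruences above
# Columns: x, L0, L3 | right-hand side
system = [
    [11602, 0, 1, -14664],
    [19034, 0, 1, -6894],
    [10579, 1, 0, -8235],
    [18271, 1, 0, -12820],
]
reduced = rref_mod_prime(system, ORDER)
x_val, l0_val, l3_val = (row[-1] for row in reduced[:3])
print("x  =", x_val)
print("L0 =", l0_val)
print("L3 =", l3_val)
```

The fourth row reduces to all zeros because the four congruences agree with each other. An inconsistent relation would instead leave a row of the form 0 = nonzero, a handy sanity check on the relation-collection step.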
4.0 Complexity Bounds
The complexity bounds of Semaev’s naive index calculus method are described in (Kazuhiro et al, 2020). The authors arrive at the conclusion that Semaev’s index calculus algorithm cannot be more efficient than Pollard’s rho or Pollard’s kangaroo.
5.0 Known Improvements
In GF(2^n), (Shantz & Teske, 2013) test the improvements to Semaev’s technique proposed by (Gaudry, 2009), and also combine summation polynomials with Weil descent and Gröbner basis methods.
These improvements will not work on the Bitcoin curve because we’re working modulo a prime number, not over a binary field.
Here are some free gpu credits if you made it this far lol.
Sources
Silverman, J. H. (2007). The Four Faces of Lifting for the Elliptic Curve Discrete Logarithm Problem. 11th Workshop on Elliptic Curve Cryptography.
Howell, S. J. (1998). The Index Calculus Algorithm for Discrete Logarithms. Master's thesis, Clemson University.
Silverman, J. H. (2009). Lifting and Elliptic Curve Discrete Logarithms. In: Avanzi, R. M., Keliher, L., Sica, F. (eds) Selected Areas in Cryptography. SAC 2008. Lecture Notes in Computer Science, vol 5381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04159-4_6
Neves, S. (2013). Answer to Question: Trying to better understand the failure of the Index Calculus for ECDLP. Cryptography Stack Exchange.
Petit, C., & Quisquater, J.-J. (2012). On polynomial systems arising from a Weil descent. Cryptology ePrint Archive, Paper 2012/146. https://eprint.iacr.org/2012/146
Gaudry, P. (2009). Index calculus for abelian varieties of small dimension and the elliptic curve discrete logarithm problem. Journal of Symbolic Computation, 44(12), 1690–1702. https://doi.org/10.1016/j.jsc.2008.08.005
Diem, C. (2011). On the discrete logarithm problem in elliptic curves. Compositio Mathematica, 147(1), 75–104. https://doi.org/10.1112/S0010437X10005075
Semaev, I. (2004). Summation polynomials and the discrete logarithm problem on elliptic curves. IACR Cryptology ePrint Archive, Paper 2004/031.
Kazuhiro, K., Yasuda, M., Takahashi, Y., & Kogure, J. (2020). Complexity bounds on Semaev’s naive index calculus method for ECDLP. Journal of Mathematical Cryptology, 14(1), 460–485. https://doi.org/10.1515/jmc-2019-0029
Shantz, M., & Teske, E. (2013). Solving the Elliptic Curve Discrete Logarithm Problem Using Semaev Polynomials, Weil Descent and Gröbner Basis Methods – an Experimental Study. IACR Preprint.

























