
Semaev's Naive Index Calculus for Elliptic Curves

Coding Semaev's 2004 Paper to refine our Satoshi Wallet Searcher
Quick Intro
LeetArxiv is Leetcode for implementing Arxiv and other research papers.
*Code along in C and Python. Here is 12 months of Perplexity Pro on us.
There’s free GPU credits hidden somewhere below :)


This is part of our series on Practical Index Calculus for Computer Programmers.

Part 1: Discrete Logarithms and the Index Calculus Solution.

Part 2: Solving Pell Equations with Index Calculus and Algebraic Numbers.

Part 3: Solving Index Calculus Equations over Integers and Finite Fields.

Part 4 (we are here): Semaev’s Naive Index Calculus for Elliptic Curves using Summation Polynomials.

Abstract for the 2004 paper Summation Polynomials and the Discrete Logarithm Problem on Elliptic Curves by Igor Semaev

Context

We’ve tackled index calculus over integer fields and algebraic rings. Now it’s index calculus over elliptic curve points lol.

1.0 Introduction

Let E be an elliptic curve defined over a finite field 𝔽p and let S, T ∈ E(𝔽p). The Elliptic Curve Discrete Logarithm Problem (ECDLP) is the challenge of finding an integer m satisfying (Silverman, 2007)1:

    T = mS

Elliptic curve discrete logarithm problem taken from (Silverman, 2007)
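
To make the definition concrete, here is a minimal Python sketch that brute-forces m on a toy curve. The curve y^2 = x^3 + 2x + 3 over 𝔽p with p = 10007, the helper names, and the secret 1234 are all assumptions picked for illustration. They are not taken from the article's C code.

```python
# Toy curve (an assumption for illustration): y^2 = x^3 + 2x + 3 over F_p, p = 10007.
P_MOD, A, B = 10007, 2, 3

def ec_add(P1, P2):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # P + (-P) is the point at infinity
    if P1 == P2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)  # chord slope
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, P):
    """Compute k*P with double-and-add (k >= 0)."""
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def first_point():
    """Scan x = 1, 2, ... for a point with y != 0 (square root is easy since p % 4 == 3)."""
    for x in range(1, P_MOD):
        rhs = (x**3 + A * x + B) % P_MOD
        y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
        if y != 0 and y * y % P_MOD == rhs:
            return (x, y)

def brute_force_dlog(S, T):
    """Return the smallest m >= 1 with m*S == T by walking S, 2S, 3S, ..."""
    R, m = S, 1
    while R != T:
        R = ec_add(R, S)
        m += 1
        if m > 2 * P_MOD:  # larger than any group size allowed by the Hasse bound
            raise ValueError("T is not a multiple of S")
    return m

S = first_point()
T = ec_mul(1234, S)          # 1234 plays the role of the secret m
m = brute_force_dlog(S, T)
print(m, ec_mul(m, S) == T)
```

The walk takes time proportional to the group size, which is exactly what becomes hopeless at 256 bits and why faster attacks are interesting.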

The intractability of the discrete logarithm problem provides a basis for the security of many public-key cryptosystems, and a survey of various algorithms for attacking the DLP is presented in (Howell, 1998)2.

The index calculus is a fast lifting algorithm designed to solve the classical Discrete Logarithm Problem (DLP) (Silverman, 2009)3. It is built upon one idea: lift the DLP from 𝔽p to ℤ, solve the problem in ℤ, and then reduce the solution modulo p.

However, index calculus lifting techniques tend to fail on elliptic curves.

1.1 Why Index Calculus Fails on Elliptic Curves

Informally, index calculus fails on elliptic curves because prime numbers (or irreducible polynomials) are a hand-wavy concept in elliptic curve groups. That is, there is no straightforward notion of smoothness in a curve’s group structure (Neves, 2013)4.

(Petit & Quisquater, 2012)5 attempt to identify irreducible polynomials, aka primes, in binary elliptic curves, while (Gaudry, 2009)6 and (Diem, 2011)7 relate integers in ℤ to their corresponding binary polynomial elliptic curves in the extension field GF(2^n).

As summarized in (Silverman, 2009), these index calculus techniques fail to generalize because:

  1. They are difficult to compute.

  2. They do not preserve relations.

  3. The curves where they work are insecure for cryptography.

2.0 Semaev’s Naive Index Calculus Technique

The paper Summation Polynomials and the Discrete Logarithm Problem on Elliptic Curves (Semaev, 2004)8 introduced the idea of index calculus on an elliptic curve over a prime finite field.

Outline of elliptic curve index calculus method taken from (Kazuhiro et al., 2020)

Index calculus over elliptic curves is structured similarly to index calculus over integer fields and algebraic rings:

  1. Construct a factor base.

    • Pick random x values modulo the field prime p and test whether a corresponding y-coordinate exists on the curve. These points are our factor base.

      Example factor base
  2. Collect relations using Semaev polynomials.

    Relation equation
    • We are searching for sums of points in our factor base that equal sums of multiples of our generator P and target point Q.

    • Semaev polynomials are a shortcut to finding these points.

      Factor base with 5 elements (green) and 12 relations (red)
  3. Use linear algebra to solve a system of equations.

    • Find the Reduced Row Echelon form of the system like in Part 3.

    • If you use Semaev3 then we provide a quick script to solve simultaneous equations over a finite field here.

Below we go into the fine implementation details of each step.

2.1 Constructing an Elliptic Curve Factor Base

An elliptic curve factor base is defined in an algebraic rather than an arithmetic way since smoothness does not exist (Diem, 2011).

Our example implementation uses the C library linked here with the secp curve parameters:

Secp256k1 Curve parameters used in our example

In practice, one selects random x-coordinates and then attempts to find a corresponding y value. In our code we just generate all possible points, then select random points.

*In practice you wouldn’t know what scalar was multiplied to find the factor base element

In C, this resembles:

C code to find a random factor base
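
For readers following along in Python, here is a rough sketch of the "pick a random x, then look for a y" approach. It runs on an assumed toy curve (y^2 = x^3 + 2x + 3 over 𝔽p with p = 10007), not on the article's parameters, and the names lift_x and random_factor_base are hypothetical. The square-root shortcut only works because this p satisfies p % 4 == 3.

```python
import random

# Toy curve (an assumption for illustration): y^2 = x^3 + 2x + 3 over F_p, p = 10007.
P_MOD, A, B = 10007, 2, 3

def lift_x(x):
    """Return a point (x, y) on the curve if x^3 + A*x + B is a square mod p, else None."""
    rhs = (x**3 + A * x + B) % P_MOD
    y = pow(rhs, (P_MOD + 1) // 4, P_MOD)  # square-root candidate, valid because p % 4 == 3
    return (x, y) if y * y % P_MOD == rhs else None

def random_factor_base(size, rng):
    """Draw random x-coordinates in [0, p) and keep those that lift to a curve point."""
    base, seen = [], set()
    while len(base) < size:
        x = rng.randrange(P_MOD)
        if x in seen:
            continue  # avoid storing the same x twice
        seen.add(x)
        point = lift_x(x)
        if point is not None:
            base.append(point)
    return base

factor_base = random_factor_base(5, random.Random(0))
print(factor_base)
```

Roughly half of all x values lift to a point, so the loop rarely needs more than a handful of draws per factor base element.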

2.2 Using Semaev Summation Polynomials to Find Relations

TL;DR: Semaev polynomials are a computational shortcut for finding points that sum to infinity. Each elliptic curve has its own set of Semaev polynomials. You evaluate Semaev polynomials modulo the field characteristic p (the prime that defines 𝔽p), not modulo the generator order.
Definition of Semaev summation polynomials taken from (Kazuhiro et al., 2020)

Informally, a Semaev summation polynomial is a shortcut to test if some points sum to zero just by looking at their x-coordinates.

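More formally, a Semaev summation polynomial fn is a symmetric polynomial, built recursively from the curve equation, with the property that fn(x1, x2, ..., xn) = 0 if and only if there exist y-coordinates y1, ..., yn (possibly in the algebraic closure of 𝔽p) such that every (xi, yi) lies on E and (x1,y1) + (x2,y2) + ... + (xn,yn) = P∞ (Semaev, 2004).

Note: P∞ is the point at infinity, not the generator P used later.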

2.2.1 Constructing Semaev Polynomials

Semaev polynomials are constructed via recursion. It is difficult to construct these summation polynomials after 7 recursion steps (Kazuhiro et al., 2020).

Third Semaev summation polynomial. Taken from (Semaev, 2004)

The first three Semaev polynomials are:

First three Semaev polynomials
*I made a mistake lol. S2 = x1-x2

Finding the fourth polynomial involves taking the resultant* of two copies of the third Semaev polynomial that share one variable:

*The resultant of two polynomials is the determinant of their Sylvester matrix and is used to compute the intersection of two algebraic curves (Meliot, 2020)9.
Finding the fourth Semaev polynomial using the resultant
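
Here is a minimal Python sketch of that resultant step, evaluated numerically instead of symbolically. It assumes the closed form of the third polynomial from (Semaev, 2004) for a curve y^2 = x^3 + Ax + B, views S3(x1, x2, X) and S3(x3, x4, X) as quadratics in the shared variable X, and uses the standard resultant formula for two quadratics. The toy curve (p = 10007, A = 2, B = 3) and every function name are assumptions for illustration.

```python
# Toy curve (an assumption for illustration): y^2 = x^3 + 2x + 3 over F_p, p = 10007.
P_MOD, A, B = 10007, 2, 3

def ec_add(P1, P2):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P1 == P2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def lift_x(x):
    """Return a curve point with this x-coordinate, or None (uses p % 4 == 3)."""
    rhs = (x**3 + A * x + B) % P_MOD
    y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
    return (x, y) if y * y % P_MOD == rhs else None

def semaev3_coeffs(x1, x2):
    """Coefficients (c2, c1, c0) of S3(x1, x2, X) viewed as a quadratic in X."""
    c2 = (x1 - x2) ** 2
    c1 = -2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B)
    c0 = (x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)
    return c2 % P_MOD, c1 % P_MOD, c0 % P_MOD

def resultant_of_quadratics(f, g):
    """Resultant of a2*X^2 + a1*X + a0 and b2*X^2 + b1*X + b0, reduced mod p."""
    a2, a1, a0 = f
    b2, b1, b0 = g
    return ((a2 * b0 - a0 * b2) ** 2 - (a2 * b1 - a1 * b2) * (a1 * b0 - a0 * b1)) % P_MOD

def semaev4(x1, x2, x3, x4):
    """S4 evaluated at a point, as the resultant of S3(x1, x2, X) and S3(x3, x4, X)."""
    return resultant_of_quadratics(semaev3_coeffs(x1, x2), semaev3_coeffs(x3, x4))

# Build four points that sum to infinity: P4 = -(P1 + P2 + P3).
points = [pt for x in range(1, 200) if (pt := lift_x(x)) is not None]
P1, P2 = points[0], points[1]
P3 = next(pt for pt in points[2:] if ec_add(ec_add(P1, P2), pt) is not None)
x4, y4 = ec_add(ec_add(P1, P2), P3)
P4 = (x4, -y4 % P_MOD)
s4 = semaev4(P1[0], P2[0], P3[0], P4[0])
print(s4)  # the four x-coordinates satisfy S4, so this is 0
```

The shared root X is the x-coordinate of P1 + P2, which is also the x-coordinate of P3 + P4 when the four points cancel. A symbolic version would need a computer algebra system, which is where the recursion gets expensive.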

In C, we can write the first three Semaev polynomials:

First three Semaev polynomials

In practice, overflow is highly likely so we’d implement modulo ops pretty early:

Semaev3 with modulos to prevent integer overflow
*Semaev polynomials are evaluated modulo the field characteristic p (the prime that defines 𝔽p), not the generator order
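
A Python version of S2 and S3 can look like the sketch below. It uses the closed form of the third polynomial from (Semaev, 2004) for y^2 = x^3 + Ax + B, reduced modulo the field prime. Python integers do not overflow, so the early reductions matter less here than in C. The toy curve (p = 10007, A = 2, B = 3) and the helper names are assumptions for illustration.

```python
# Toy curve (an assumption for illustration): y^2 = x^3 + 2x + 3 over F_p, p = 10007.
P_MOD, A, B = 10007, 2, 3

def semaev2(x1, x2):
    """S2 vanishes exactly when the two x-coordinates agree, i.e. P1 = +-P2."""
    return (x1 - x2) % P_MOD

def semaev3(x1, x2, x3):
    """S3 vanishes when some choice of signs gives +-P1 +- P2 +- P3 = infinity."""
    c2 = (x1 - x2) ** 2 % P_MOD
    c1 = -2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B) % P_MOD
    c0 = ((x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)) % P_MOD
    return (c2 * x3 * x3 + c1 * x3 + c0) % P_MOD

def ec_add(P1, P2):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P1 == P2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_neg(P):
    """Negate a point by flipping its y-coordinate."""
    return None if P is None else (P[0], -P[1] % P_MOD)

def lift_x(x):
    """Return a curve point with this x-coordinate, or None (uses p % 4 == 3)."""
    rhs = (x**3 + A * x + B) % P_MOD
    y = pow(rhs, (P_MOD + 1) // 4, P_MOD)
    return (x, y) if y * y % P_MOD == rhs else None

# Three points that sum to infinity: P3 = -(P1 + P2).
points = [pt for x in range(1, 200) if (pt := lift_x(x)) is not None]
P1, P2 = points[0], points[1]
P3 = ec_neg(ec_add(P1, P2))
relation_value = semaev3(P1[0], P2[0], P3[0])
print(relation_value)  # 0, because P1 + P2 + P3 is the point at infinity
```

Note that S3 only sees x-coordinates, so it cannot tell P3 from -P3. Any relation it flags still needs a sign check with real point arithmetic.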

2.2.2 Using Semaev Polynomials to Find Relations

We find relations like this:

  1. Multiply generator P and target point Q by random scalars, then find their sum.

  2. Search for points in our factor base that equal the sum in Step 1.

In practice, we pass the x-coordinates of the multiples of P and Q, together with the x-coordinate of each candidate from our factor base, to our Semaev functions and keep the candidates that evaluate to zero.

Example search loop
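
The search loop can be sketched in Python as follows. This is an illustration under assumptions, not the article's C loop: the toy curve (p = 10007, A = 2, B = 3), the secret 4321, the factor base built from 20 random multiples of G, and the function find_relations are all hypothetical. S3 flags candidate triples from x-coordinates alone, then explicit point addition settles the signs.

```python
import random

# Toy curve (an assumption for illustration): y^2 = x^3 + 2x + 3 over F_p, p = 10007.
P_MOD, A, B = 10007, 2, 3

def ec_add(P1, P2):
    """Add two affine points; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P1 == P2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def ec_mul(k, P):
    """Compute k*P with double-and-add; a negative k negates the point first."""
    if k < 0:
        return ec_mul(-k, None if P is None else (P[0], -P[1] % P_MOD))
    R = None
    while k > 0:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def semaev3(x1, x2, x3):
    """Third Semaev polynomial for y^2 = x^3 + A*x + B, reduced mod p."""
    c2 = (x1 - x2) ** 2
    c1 = -2 * ((x1 + x2) * (x1 * x2 + A) + 2 * B)
    c0 = (x1 * x2 - A) ** 2 - 4 * B * (x1 + x2)
    return (c2 * x3 * x3 + c1 * x3 + c0) % P_MOD

def find_relations(G, Q, factor_base, wanted, rng, max_tries=50000):
    """Collect tuples (a, b, s, i) with a*G + b*Q + s*F_i = infinity."""
    relations = []
    for _ in range(max_tries):
        a, b = rng.randrange(1, P_MOD), rng.randrange(1, P_MOD)
        aG, bQ = ec_mul(a, G), ec_mul(b, Q)
        if aG is None or bQ is None:
            continue
        for i, F in enumerate(factor_base):
            if semaev3(aG[0], bQ[0], F[0]) != 0:
                continue  # no choice of signs makes the three points cancel
            for sb in (1, -1):  # S3 only sees x-coordinates, so recover the signs
                for s in (1, -1):
                    if ec_add(ec_add(aG, ec_mul(sb, bQ)), ec_mul(s, F)) is None:
                        relations.append((a, sb * b, s, i))
        if len(relations) >= wanted:
            break
    return relations

rng = random.Random(1)
G = next((x, y) for x in range(1, P_MOD)
         if (y := pow((x**3 + A * x + B) % P_MOD, (P_MOD + 1) // 4, P_MOD))
         and y * y % P_MOD == (x**3 + A * x + B) % P_MOD)
Q = ec_mul(4321, G)  # 4321 is the secret we pretend not to know
factor_base = [F for F in (ec_mul(rng.randrange(1, P_MOD), G) for _ in range(20)) if F]
relations = find_relations(G, Q, factor_base, wanted=5, rng=rng)
print(relations)
```

Each tuple (a, b, s, i) corresponds to a congruence a + b*x + s*Lᵢ ≡ 0 modulo the order of G, where x = log_G(Q). The article's relations are the case s = 1.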

3.0 Solving Linear System of Relations

Finding relations is the hardest part. Now we can solve a linear system modulo the generator point’s order (not the field characteristic).

Say we found these relations:

Factor base alongside found relations

The next steps are exactly like index calculus from before:

  1. Assign a variable L to each element in the factor base to indicate this is the logarithm to the generator base P:

    (3258, 19385)   → log_P(3258, 19385) = L0
    (17857, 1358)   → log_P(17857, 1358) = L1
    (16530, 11658)  → log_P(16530, 11658) = L2
    (16562, 7608)   → log_P(16562, 7608) = L3
    (4656, 2303)    → log_P(4656, 2303) = L4
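  2. Each of our relations a*P + b*Q + Fᵢ = infinity, with Fᵢ a factor base element, translates into a congruence between logarithms, where x = log_P(Q) is the unknown we are after:

    a*1 + b*x + 1*Lᵢ ≡ 0 (mod generatorOrder)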
  3. So we solve this system of congruences

    11602*x + L3 ≡ -14664 (mod 20947)
    10579*x + L0 ≡ -8235  (mod 20947) 
    18271*x + L0 ≡ -12820 (mod 20947)
    3182*x  + L2 ≡ -11886 (mod 20947)
    14810*x + L3 ≡ -1141  (mod 20947)
    2084*x  + L2 ≡ -15300 (mod 20947)
    19034*x + L3 ≡ -6894  (mod 20947)
    7773*x  + L4 ≡ -2307  (mod 20947)
    16991*x + L1 ≡ -14868 (mod 20947)
    4568*x  + L2 ≡ -14101 (mod 20947)
    12079*x + L2 ≡ -820   (mod 20947)
    13793*x + L3 ≡ -1556  (mod 20947)

In practice one would use the technique from Part 3 to find the RREF of the matrix. In our example, we construct systems of simultaneous equations like:

11602*x + L3 ≡ -14664 (mod 20947)
19034*x + L3 ≡ -6894  (mod 20947)

This is a pair of simultaneous equations in only two variables (x and L3), so we can write a Python script to solve it fast:

def solve_two_variables_system(coeffs1, coeffs2, n):
    """
    Solve system:
    a1*x + b1*L ≡ c1 (mod n)
    a2*x + b2*L ≡ c2 (mod n)
    """
    a1, b1, c1 = coeffs1
    a2, b2, c2 = coeffs2
    
    # Solve using elimination
    # Multiply first eq by b2, second eq by b1
    # Then subtract to eliminate L
    coeff_x = (a1 * b2 - a2 * b1) % n
    rhs = (c1 * b2 - c2 * b1) % n
    
    try:
        inv_coeff_x = pow(coeff_x, -1, n)
        x = (rhs * inv_coeff_x) % n
        
        # Now solve for L using first equation
        # b1*L ≡ c1 - a1*x (mod n)
        rhs_L = (c1 - a1 * x) % n
        inv_b1 = pow(b1, -1, n)
        L = (rhs_L * inv_b1) % n
        
        # Verify both equations
        eq1 = (a1 * x + b1 * L) % n
        eq2 = (a2 * x + b2 * L) % n
        
        print("Solution:")
        print(f"x = {x}")
        print(f"L = {L}")
        print("Verification:")
        print(f"Eq1: {a1}*{x} + {b1}*{L} = {eq1} (should be {c1})")
        print(f"Eq2: {a2}*{x} + {b2}*{L} = {eq2} (should be {c2})")
        
        return x, L
        
    except ValueError:
        # pow(..., -1, n) raises ValueError when the value has no inverse mod n
        print(f"No solution: a coefficient (x coefficient {coeff_x} or b1 = {b1}) has no inverse mod {n}")
        return None, None

# Solving:
# 11602*x + 1*L ≡ -14664 (mod 20947)
# 19034*x + 1*L ≡ -6894  (mod 20947)
n = 20947
coeffs1 = [11602, 1, (-14664) % n]  # [a1, b1, c1]
coeffs2 = [19034, 1, (-6894) % n]   # [a2, b2, c2]

x, L = solve_two_variables_system(coeffs1, coeffs2, n)
Solving the system provides x and L3

x is our target. From the C code, our target point indeed had a value of 19691. It worked!

Original value was 19691
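
The two-variable script covers one pair of relations. For more unknowns at once, a generic reduced row echelon form modulo a prime does the same job. The sketch below is one way to write it and is not the Part 3 code. The function names are hypothetical, it assumes the modulus is prime (20947 is) and that the solution is unique. The four rows are the L0 and L3 relations from the list above, with unknowns ordered as (x, L0, L3).

```python
def rref_mod_prime(rows, n):
    """Return the reduced row echelon form of an augmented matrix modulo a prime n."""
    m = [[v % n for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0]) - 1):  # the last column is the right-hand side
        hit = next((r for r in range(pivot_row, len(m)) if m[r][col]), None)
        if hit is None:
            continue  # no pivot available in this column
        m[pivot_row], m[hit] = m[hit], m[pivot_row]
        inv = pow(m[pivot_row][col], -1, n)
        m[pivot_row] = [v * inv % n for v in m[pivot_row]]  # scale the pivot to 1
        for r in range(len(m)):
            if r != pivot_row and m[r][col]:
                f = m[r][col]
                m[r] = [(vr - f * vp) % n for vr, vp in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == len(m):
            break
    return m

def read_solution(reduced, names):
    """Read variable values off an RREF matrix, assuming the solution is unique."""
    solution = {}
    for row in reduced:
        coeffs, rhs = row[:-1], row[-1]
        if not any(coeffs):
            if rhs:
                raise ValueError("inconsistent relations")
            continue  # an all-zero row carries no information
        solution[names[coeffs.index(1)]] = rhs
    return solution

n = 20947
names = ["x", "L0", "L3"]
# Columns: coefficient of x, of L0, of L3, then the right-hand side.
relation_rows = [
    [11602, 0, 1, -14664],
    [10579, 1, 0, -8235],
    [18271, 1, 0, -12820],
    [19034, 0, 1, -6894],
]
solution = read_solution(rref_mod_prime(relation_rows, n), names)
print(solution)  # solution["x"] is 19691, matching the value above
```

Redundant relations reduce to all-zero rows, so extra relations act as a free consistency check on the ones you collected.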

4.0 Complexity Bounds

The complexity bounds of Semaev’s naive index calculus method are described in (Kazuhiro et al., 2020)10. The authors arrive at the conclusion that Semaev’s index calculus algorithm cannot be more efficient than Pollard Rho or Pollard Kangaroo.

5.0 Known Improvements

Over GF(2^n), (Shantz & Teske, 2013)11 test improvements on Semaev’s technique, both those proposed by (Gaudry, 2009) and those obtained by combining summation polynomials with Weil descent and Gröbner basis methods.

These improvements will not work on the Bitcoin curve because it is defined over a prime field, not a binary field.

Here are some free gpu credits if you made it this far lol.

Sources

1

Silverman, J. H. (2007). The Four Faces of Lifting for the Elliptic Curve Discrete Logarithm Problem. 11th Workshop on Elliptic Curve Cryptography. Link.

2

Howell, S. J. (1998). The Index Calculus Algorithm for Discrete Logarithms. Clemson University Masters Thesis.

3

Silverman, J. H. (2009). Lifting and Elliptic Curve Discrete Logarithms. In: Avanzi, R.M., Keliher, L., Sica, F. (eds) Selected Areas in Cryptography. SAC 2008. Lecture Notes in Computer Science, vol 5381. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04159-4_6

4

Neves, S. (2013). Answer to Question: Trying to better understand the failure of the Index Calculus for ECDLP. CryptographyStackExchange.

5

Petit, C., & Quisquater, J.-J. (2012). On polynomial systems arising from a Weil descent. Cryptology ePrint Archive, Paper 2012/146. https://eprint.iacr.org/2012/146

6

Gaudry, P. (2009). Index calculus for abelian varieties of small dimension and the elliptic curve discrete logarithm problem. Journal of Symbolic Computation, 44(12), 1690–1702. https://doi.org/10.1016/j.jsc.2008.08.005

7

Diem, C. (2011). On the discrete logarithm problem in elliptic curves. Compositio Mathematica, 147(1), 75–104. doi:10.1112/S0010437X10005075

8

Semaev, I. (2004). Summation polynomials and the discrete logarithm problem on elliptic curves. IACR Cryptology ePrint Archive, Paper 2004/031.

9

Meliot, P. (2020). The Resultant of Two Polynomials. PDF Link.

10

Kazuhiro, K., Yasuda, M., Takahashi, Y. & Kogure, J. (2020). Complexity bounds on Semaev’s naive index calculus method for ECDLP. Journal of Mathematical Cryptology, 14(1), 460-485. https://doi.org/10.1515/jmc-2019-0029

11

Shantz, M., & Teske, E. (2013). Solving the Elliptic Curve Discrete Logarithm Problem Using Semaev Polynomials, Weil Descent and Gröbner Basis Methods – an Experimental Study. IACR Preprint.
