Sinkhorn Knopp Algorithm

Getting Softmax-like Probabilities but for Optimal Transport Problems

Murage Kibicho

Oct 21, 2025

Quick intro

LeetArxiv is Leetcode for implementing Arxiv and other research papers.

We code this paper in C and Python.

This is part of our What Every Programmer Needs to Know about Optimal Transport.

Available Chapters

Chapter 1 : Introduction to Optimal Transport.
Chapter 2 (we are here) : Sinkhorn-Knopp Algorithm for Solving Optimal Transport Problems.
Chapter 3 : Sinkhorn Solves Sudoku

**Frontmatter for the 1967 paper ‘Concerning Nonnegative Matrices and Doubly Stochastic Matrices’ by Richard Sinkhorn and Paul Knopp**

We provide C code here and Python code here. The article focuses on the C implementation.

Summary

Sinkhorn-Knopp is an algorithm that rescales the rows and columns of a non-negative square matrix until every row and every column sums to 1, like in a probability distribution.

1.0 Paper Introduction

The paper Concerning nonnegative matrices and doubly stochastic matrices (Sinkhorn & Knopp, 1967) introduces an iterative algorithm for balancing a nonnegative matrix into a doubly stochastic matrix.

A doubly stochastic matrix is a matrix with non-negative elements whose rows each sum to 1 and whose columns each sum to 1 (Moon, Gunther & Kupin, 2009).
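The definition above can be turned into a small check. This is a minimal sketch, not code from the paper or from our repository: the function name IsDoublyStochastic, the row-major layout and the tolerance parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

// Returns true when every entry of the n x n row-major matrix is non-negative
// and every row sum and column sum is within `tolerance` of 1.
// (Hypothetical helper, written for illustration.)
bool IsDoublyStochastic(const double *matrix, int n, double tolerance)
{
	assert(n > 0);
	for(int i = 0; i < n; i++)
	{
		double rowSum = 0.0;
		double colSum = 0.0;
		for(int j = 0; j < n; j++)
		{
			//Every entry is visited once as matrix[i * n + j]
			if(matrix[i * n + j] < 0.0){return false;}
			rowSum += matrix[i * n + j]; //Sum of row i
			colSum += matrix[j * n + i]; //Sum of column i
		}
		if(fabs(rowSum - 1.0) > tolerance){return false;}
		if(fabs(colSum - 1.0) > tolerance){return false;}
	}
	return true;
}
```

For example, the 2 x 2 matrix {0.25, 0.75, 0.75, 0.25} passes, while {0.5, 0.5, 0.9, 0.1} fails because its rows sum to 1 but its columns sum to 1.4 and 0.6.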

Matrix balancing is finding a doubly stochastic diagonal scaling of a square nonnegative matrix (Knight & Ruiz, 2013).
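Written out, balancing a nonnegative n x n matrix A means finding two positive diagonal matrices such that the scaled matrix is doubly stochastic. The symbols P, D_1, D_2, r_i and c_j below are our own notation for this sketch; in the C code later in this article, r_i plays the role of rowScaling[i] and c_j the role of colScaling[j].

```latex
\[
P = D_1 A D_2, \qquad
D_1 = \operatorname{diag}(r_1, \dots, r_n), \quad
D_2 = \operatorname{diag}(c_1, \dots, c_n), \quad
r_i, c_j > 0,
\]
\[
\sum_{j=1}^{n} r_i A_{ij} c_j = 1 \quad \text{for every row } i,
\qquad
\sum_{i=1}^{n} r_i A_{ij} c_j = 1 \quad \text{for every column } j.
\]
```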

Definitions taken from page 1 (Sinkhorn & Knopp, 1967)

If A is a non-negative square matrix, A is said to have total support if every positive element of A lies on a positive diagonal.

A thorough description of total support is given in (Gnarls, 2024).

Summary of total support taken from (Gnarls, 2024)

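The easiest way to test if a matrix A has support is by testing for invertibility, because every invertible matrix has support (Grossmann, 2020). The converse does not hold: a singular matrix can still have support, so a zero determinant is inconclusive.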
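A small worked example helps separate support from total support. The matrix below is our own illustration, not one taken from the cited sources. It is invertible, so it has support, but one of its positive entries lies on no positive diagonal, so it does not have total support.

```latex
\[
A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},
\qquad \det A = 1 \neq 0 .
\]
\[
\text{Diagonals of } A: \quad
(a_{11}, a_{22}) = (1, 1) \ \text{(positive)},
\qquad
(a_{12}, a_{21}) = (1, 0) \ \text{(not positive)} .
\]
```

The positive diagonal (a_11, a_22) gives A support. The entry a_12 = 1 only appears on the diagonal (a_12, a_21), which contains a zero, so A fails the total support definition.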

2.0 Sinkhorn-Knopp Algorithm

This section goes into the fine details of implementing Sinkhorn-Knopp in C without libraries.

2.1 Testing for Total Support

Sinkhorn-Knopp is proven to converge when a matrix has total support. Testing for total support by brute force grows factorially because we need to examine all permutations of the matrix columns.

We offer this quick heuristic test instead for a matrix A:

 1. Check if A is a square matrix.
- Yes, proceed to step 2. No, A failed, stop here.
 2. Check if all the entries of A are greater than 0.
- Yes, A has total support, stop here. No, proceed to step 3.
 3. Test if A is invertible. (A quick test is checking that the determinant is not equal to 0, computed from the LU decomposition.)
- Yes, A has support, stop here. (Invertibility guarantees a positive diagonal, which is support. It does not by itself guarantee total support.) No, proceed to step 4.
 4. Check for zero rows or columns. (If any row or column is entirely zero then it is isolated in the bipartite graph of A, i.e. A has no total support.)
- Yes, some rows/cols are entirely 0, A failed, stop here. No, proceed to step 5.
 5. Check if every row and column sum is greater than 0.
- Yes, proceed to step 6. No, A failed, stop here.
 6. Check for a perfect matching in the bipartite graph of A.
- Support is equivalent to the bipartite graph having a perfect matching. Total support additionally requires every positive entry to lie on some perfect matching.


  [1]: https://leetarxiv.substack.com/p/sinkhorn-solves-sudoku

In C, we implement these tests like this:

In our implementation, we use the LU Decomposition to find the determinant when testing for invertibility.
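The checks can be sketched roughly as follows. This is a hedged illustration rather than the exact code in our repository: the function names (Determinant, HasPerfectMatching, PassesSupportHeuristic), the 1e-12 pivot threshold and the row-major layout are all assumptions. The matrix is taken to be square (n x n) by construction, and the zero row/column check is folded into the matching step, since a zero row can never be matched. Note that a pass here establishes support (a positive diagonal exists), which is weaker than total support.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Determinant of an n x n row-major matrix via LU decomposition with partial pivoting.
double Determinant(const double *matrix, int n)
{
	assert(n > 0);
	double *lu = malloc((size_t)n * n * sizeof(double));
	assert(lu != NULL);
	memcpy(lu, matrix, (size_t)n * n * sizeof(double));
	double det = 1.0;
	for(int k = 0; k < n; k++)
	{
		//Pick the largest pivot in column k
		int pivot = k;
		for(int i = k + 1; i < n; i++)
		{
			if(fabs(lu[i * n + k]) > fabs(lu[pivot * n + k])){pivot = i;}
		}
		//Assumed threshold: treat a tiny pivot as singular
		if(fabs(lu[pivot * n + k]) < 1e-12){det = 0.0; break;}
		if(pivot != k)
		{
			for(int j = 0; j < n; j++)
			{
				double temp = lu[k * n + j];
				lu[k * n + j] = lu[pivot * n + j];
				lu[pivot * n + j] = temp;
			}
			det = -det; //A row swap flips the sign
		}
		det *= lu[k * n + k];
		for(int i = k + 1; i < n; i++)
		{
			double factor = lu[i * n + k] / lu[k * n + k];
			for(int j = k; j < n; j++){lu[i * n + j] -= factor * lu[k * n + j];}
		}
	}
	free(lu);
	return det;
}

// Augmenting-path search: try to match `row` to a column holding a positive entry.
static bool TryMatch(const double *matrix, int n, int row, bool *visited, int *matchOfCol)
{
	for(int j = 0; j < n; j++)
	{
		if(matrix[row * n + j] > 0.0 && !visited[j])
		{
			visited[j] = true;
			if(matchOfCol[j] < 0 || TryMatch(matrix, n, matchOfCol[j], visited, matchOfCol))
			{
				matchOfCol[j] = row;
				return true;
			}
		}
	}
	return false;
}

// True when the bipartite graph (rows vs columns, edge where entry > 0) has a perfect matching.
bool HasPerfectMatching(const double *matrix, int n)
{
	int *matchOfCol = malloc((size_t)n * sizeof(int));
	bool *visited = malloc((size_t)n * sizeof(bool));
	assert(matchOfCol != NULL && visited != NULL);
	for(int j = 0; j < n; j++){matchOfCol[j] = -1;}
	bool perfect = true;
	for(int i = 0; i < n && perfect; i++)
	{
		memset(visited, 0, (size_t)n * sizeof(bool));
		perfect = TryMatch(matrix, n, i, visited, matchOfCol);
	}
	free(matchOfCol);
	free(visited);
	return perfect;
}

// Heuristic from the list above: true when A is strictly positive or has support.
bool PassesSupportHeuristic(const double *matrix, int n)
{
	bool allPositive = true;
	for(int i = 0; i < n * n; i++)
	{
		if(matrix[i] < 0.0){return false;} //Not a non-negative matrix
		if(matrix[i] == 0.0){allPositive = false;}
	}
	if(allPositive){return true;} //Strictly positive: total support
	if(fabs(Determinant(matrix, n)) > 1e-12){return true;} //Invertible: support
	return HasPerfectMatching(matrix, n); //A perfect matching is a positive diagonal
}
```

The matching step is what rescues singular matrices that still have support. For instance, the 3 x 3 matrix {1,1,0, 1,1,0, 0,0,1} has determinant 0 but still has a positive diagonal, so it passes.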

2.2 Implementing Sinkhorn-Knopp Scaling

For a square, nonnegative matrix with total support, we follow the pseudocode at the beginning of the section:

Step 1: Initialize scaling vectors to 1
//Initialize Scaling Vectors to 1
for(int i = 0; i < n; i++)
{
	rowScaling[i] = 1.0;
	colScaling[i] = 1.0;
}

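Step 2: Update the row scaling using the row sums of the column-scaled matrix
//Update Row Scaling
for(int i = 0; i < n; i++)
{
	double rowSum = 0.0;
	for(int j = 0; j < n; j++)
	{
		rowSum += matrix[i * n + j] * colScaling[j];
	}
	//Error of the currently scaled row sum, measured before the update
	maxError = fmax(maxError, fabs(rowScaling[i] * rowSum - 1.0));
	//Skip near-zero sums to avoid dividing by zero
	if(rowSum > 1e-12)
	{
		rowScaling[i] = 1.0 / rowSum;
	}
}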
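
Step 3: Update the column scaling using the column sums of the row-scaled matrix
//Update Column Scaling
for(int j = 0; j < n; j++)
{
	double colSum = 0.0;
	for(int i = 0; i < n; i++)
	{
		colSum += rowScaling[i] * matrix[i * n + j];
	}
	//Error of the currently scaled column sum, measured before the update
	maxError = fmax(maxError, fabs(colScaling[j] * colSum - 1.0));
	//Skip near-zero sums to avoid dividing by zero
	if(colSum > 1e-12)
	{
		colScaling[j] = 1.0 / colSum;
	}
}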

Step 4: Apply scaling to our result (we store the result in matrixCopy)
//Apply Scaling
for(int i = 0; i < n; i++)
{
	for(int j = 0; j < n; j++)
	{
		matrixCopy[i * n + j] = rowScaling[i] * matrix[i * n + j] * colScaling[j];
	}
}

In C, the code resembles:

3.0 Results

We expect our row sums and column sums to add up to 1, like a probability distribution.

Testing our matrix yields:

Results of our Sinkhorn-Knopp implementation

We observe that the row sums added up to 1.

The column sums are pretty close to 1. It worked lol!

References

Sinkhorn, R., & Knopp, P. (1967). Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 21(2), 343–348. DOI: 10.2140/pjm.1967.21.343.

Moon, T., Gunther, J., & Kupin, J. (2009). Sinkhorn Solves Sudoku. IEEE Transactions on Information Theory, 55(4), 1741–1746. DOI: 10.1109/TIT.2009.2013004.

Knight, P., & Ruiz, D. (2013). A fast algorithm for matrix balancing. IMA Journal of Numerical Analysis, 33(3), 1029–1047. ISSN 0272-4979. Link.

Gnarls. (2024). What does it mean for a matrix to have total support? Mathematics Stack Exchange. Link.

Grossmann, B. (2020). Check if a matrix has support. Mathematics Stack Exchange. Link.