Code Poetry
and Text Adventures

by catid posted (>30 days ago) 3:14pm Thu. Mar 15th 2012 PDT
Now that the matrix parameters are somewhat tuned, I can start benchmarking the performance of my FEC codec, Wirehair, available here: http://wirehairfec.com

At the heart of the codec is a sparse matrix solver that operates in linear time.  About SQRT(N) of the matrix rows are dense, but they are built using Shuffle-2 Codes (see previous posts) so the MultiplyDenseValues step takes 2.5 * N row operations.

I simulated the codec for different file sizes, with a block size of 1500 bytes (around the size of an Internet datagram), as if it were being used to protect a file transfer over the Internet:

PeelDiagonal used 1 row ops = 0.5*N
InitializeColumnValues used 10 row ops = 5*N
MultiplyDenseValues used 3 row ops = 1.5*N
AddSubdiagonalValues used 0 row ops = 0*N and 20 heavy ops
BackSubstituteAboveDiagonal used 6 row ops = 3*N and 12 heavy ops
Substitute used 2 row ops = 1*N
>> OKAY! N=2(0.003 MB)

PeelDiagonal used 190 row ops = 2.96875*N
InitializeColumnValues used 53 row ops = 0.828125*N
MultiplyDenseValues used 164 row ops = 2.5625*N
AddSubdiagonalValues used 92 row ops = 1.4375*N and 86 heavy ops
BackSubstituteAboveDiagonal used 164 row ops = 2.5625*N and 22 heavy ops
Substitute used 307 row ops = 4.79688*N
>> OKAY! N=64(0.096 MB)

PeelDiagonal used 3942 row ops = 3.84961*N
InitializeColumnValues used 121 row ops = 0.118164*N
MultiplyDenseValues used 2569 row ops = 2.50879*N
AddSubdiagonalValues used 692 row ops = 0.675781*N and 85 heavy ops
BackSubstituteAboveDiagonal used 902 row ops = 0.880859*N and 22 heavy ops
Substitute used 5713 row ops = 5.5791*N
>> OKAY! N=1024(1.536 MB)

PeelDiagonal used 8855 row ops = 4.375*N
InitializeColumnValues used 343 row ops = 0.169466*N
MultiplyDenseValues used 5047 row ops = 2.49358*N
AddSubdiagonalValues used 1283 row ops = 0.633893*N and 85 heavy ops
BackSubstituteAboveDiagonal used 1571 row ops = 0.776186*N and 23 heavy ops
Substitute used 12397 row ops = 6.125*N
>> OKAY! N=2024(3.036 MB)

PeelDiagonal used 17300 row ops = 4.22363*N
InitializeColumnValues used 676 row ops = 0.165039*N
MultiplyDenseValues used 10242 row ops = 2.50049*N
AddSubdiagonalValues used 2560 row ops = 0.625*N and 87 heavy ops
BackSubstituteAboveDiagonal used 3171 row ops = 0.77417*N and 21 heavy ops
Substitute used 24391 row ops = 5.95483*N
>> OKAY! N=4096(6.144 MB)

PeelDiagonal used 37274 row ops = 4.54949*N
InitializeColumnValues used 2445 row ops = 0.298425*N
MultiplyDenseValues used 20441 row ops = 2.49493*N
AddSubdiagonalValues used 5152 row ops = 0.628829*N and 84 heavy ops
BackSubstituteAboveDiagonal used 5991 row ops = 0.731234*N and 24 heavy ops
Substitute used 51616 row ops = 6.30001*N
>> OKAY! N=8193(12.2895 MB)

PeelDiagonal used 73492 row ops = 4.59325*N
InitializeColumnValues used 958 row ops = 0.059875*N
MultiplyDenseValues used 39875 row ops = 2.49219*N
AddSubdiagonalValues used 9220 row ops = 0.57625*N and 84 heavy ops
BackSubstituteAboveDiagonal used 12026 row ops = 0.751625*N and 22 heavy ops
Substitute used 101056 row ops = 6.316*N
>> OKAY! N=16000(24 MB)

PeelDiagonal used 149466 row ops = 4.67081*N
InitializeColumnValues used 4210 row ops = 0.131563*N
MultiplyDenseValues used 80037 row ops = 2.50116*N
AddSubdiagonalValues used 24807 row ops = 0.775219*N and 82 heavy ops
BackSubstituteAboveDiagonal used 29286 row ops = 0.915188*N and 24 heavy ops
Substitute used 204418 row ops = 6.38806*N
>> OKAY! N=32000(48 MB)

PeelDiagonal used 232902 row ops = 4.65804*N
InitializeColumnValues used 3733 row ops = 0.07466*N
MultiplyDenseValues used 125120 row ops = 2.5024*N
AddSubdiagonalValues used 46677 row ops = 0.93354*N and 82 heavy ops
BackSubstituteAboveDiagonal used 68097 row ops = 1.36194*N and 25 heavy ops
Substitute used 318827 row ops = 6.37654*N
>> OKAY! N=50000(75 MB)

PeelDiagonal used 294344 row ops = 4.59912*N
InitializeColumnValues used 8312 row ops = 0.129875*N
MultiplyDenseValues used 159905 row ops = 2.49852*N
AddSubdiagonalValues used 36679 row ops = 0.573109*N and 86 heavy ops
BackSubstituteAboveDiagonal used 54880 row ops = 0.8575*N and 21 heavy ops
Substitute used 403807 row ops = 6.30948*N
>> OKAY! N=64000(96 MB)

A short summary of what each of the steps are for:

+ PeelDiagonal : Diagonalize the peeling matrix after applying peeling as far as possible.
+ InitializeColumnValues : Set column values for GE submatrix after solving the GE submatrix.
+ MultiplyDenseValues : Solve for row values for GE submatrix for dense rows.
+ AddSubdiagonalValues : Eliminate below the diagonal of the GE submatrix.
+ BackSubstituteAboveDiagonal : Eliminate above the diagonal of the GE submatrix.
+ Substitute : Solve all of the remaining columns of the matrix in order of peeling.

Edited: This data was updated to include the results from the latest software, which also uses a window technique for AddSubdiagonalValues.  The result overall is just a few more MB/s throughput but in terms of row operations the advantage is clear.  Now all steps are linear algorithms.
last edit by catid edited (>30 days ago) 9:34pm Sun. Mar 18th 2012 PDT