TL;DR Check out the code here: https://github.com/catid/brook/blob/master/Redundancy.cpp

Motivation

Streaming data over a channel in packets (like the Internet or a deep-space link) is incredibly well-studied and modeled. The major variables are:

- Average/maximum transmission delay.
- Ambient random packet loss probability.
- Maximum "normal" burst loss duration, in milliseconds.

There are other minor considerations like re-ordering and corruption.

With rate-less forward error correction (FEC), each of these variables can be considered *separately*.

FEC basics

Original data packets are supplemented with redundant data packets generated by an FEC encoder. The encoder needs to see all of the original data before it can start generating redundant data packets. There are N original data packets. There are R redundant data packets. So overall there are M = N + R packets that get sent across the channel.

If at least N of M packets arrive, there is a very good chance (97%) that all the original data will be recovered.

One set of M packets is called a "code group."

Transmission delay

FEC does not care about the transmission delay, but delay sets a floor: data cannot be delivered faster than the channel's base transit time.

Burst duration

The redundancy R should be large enough to "eat" burst losses: if a burst loses more than R packets from a code group, recovery is impossible. This makes interleaving packets from two separate code groups a good idea, since each code group then sees only about half of any burst and can get away with a smaller R.
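As an illustration (a minimal sketch with my own hypothetical Packet type, not code from the post), round-robin interleaving of two code groups halves the burst length each group sees:

```cpp
#include <vector>
#include <cstddef>

// Hypothetical packet record: group id plus sequence number stands in for payload.
struct Packet { int groupId; int seq; };

// Round-robin interleave of two code groups onto the wire. A burst of B
// consecutive wire losses then removes only about B/2 packets per group,
// so each group can use a smaller R.
std::vector<Packet> Interleave(const std::vector<Packet>& a,
                               const std::vector<Packet>& b) {
    std::vector<Packet> wire;
    wire.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() || j < b.size()) {
        if (i < a.size()) wire.push_back(a[i++]);
        if (j < b.size()) wire.push_back(b[j++]);
    }
    return wire;
}
```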

The maximum burst duration is exactly how much buffering should be added to the channel before a packet is considered lost in a real-time streaming application.

For large buffer applications, N can be chosen large, and bursts can be ignored entirely.

Ambient loss

While the ambient random packet loss probability on the normal Internet is about 3%, when a loss event occurs, the probability that the next packet is also lost is conditionally much higher (around 50%). This is why, for example, a lot of streaming protocols randomize the order of data to spread losses out. The overall ambient loss is still 3%.

When FEC is applied, you can *decide* what the packet loss rate of the channel will be with a dial, in exchange for extra bandwidth. Want a lower probability that messages get lost? Dial up the redundancy: 3% becomes 0.0001%.

In the rest of this post I want to talk about how to choose a value of R to go from 3% loss to some target loss rate. I'll describe an algorithm and its mathematical basis.

Assuming that the code group size is large enough to eat burst losses, packet losses are roughly independent and identically distributed, each with probability p.
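This model is easy to sanity-check numerically. The sketch below (my own code, not from the post) simulates code groups under i.i.d. loss with probability p and counts how often more than r of the n+r packets are lost, i.e. how often recovery fails:

```cpp
#include <random>

// Monte Carlo estimate of the perceived loss rate: the fraction of code
// groups that lose more than r of their n+r packets under i.i.d. loss p.
double SimulatePerceivedLoss(int n, int r, double p, int trials, unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution lost(p);
    int failures = 0;
    for (int t = 0; t < trials; ++t) {
        int losses = 0;
        for (int i = 0; i < n + r; ++i) {
            if (lost(rng)) ++losses;
        }
        if (losses > r) ++failures;  // more than r lost: unrecoverable
    }
    return double(failures) / trials;
}
```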

Deriving perceived packet loss rate after FEC is applied:

let:

p = probability of packet loss
n = original packet count
r = redundant packet count
(n, k) = "n choose k", the binomial coefficient = n! / ( k! * (n-k)! )
p(l) = p^l * (1 - p)^(n+r-l) = probability of one particular pattern of l lost packets out of n+r

Perceived loss q = sum( (n+r, l) * p(l) ; l = r+1..n+r ), i.e. the probability of losing r+1 or more packets (too many for recovery).
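This sum can be evaluated directly. As a sketch (my own code, not the post's implementation), building the binomial coefficient incrementally in log space avoids factorial overflow:

```cpp
#include <cmath>

// Exact perceived loss q = sum over l = r+1 .. n+r of C(n+r, l) * p^l * (1-p)^(n+r-l).
// Precondition: 0 < p < 1.
double ExactPerceivedLoss(int n, int r, double p) {
    const int m = n + r;
    double q = 0.;
    double logC = 0.;  // log C(m, 0) = 0
    for (int l = 1; l <= m; ++l) {
        // C(m, l) = C(m, l-1) * (m - l + 1) / l, accumulated in log space
        logC += std::log(double(m - l + 1)) - std::log(double(l));
        if (l > r) {
            q += std::exp(logC + l * std::log(p) + (m - l) * std::log(1. - p));
        }
    }
    return q;
}
```

For example, with r = 0 this reduces to 1 - (1-p)^n, the chance that any packet at all is lost.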

Note that the number of lost packets is a sum of independent Bernoulli trials (a binomial random variable), which makes it amenable to approximation via the Central Limit Theorem.

Normal Approximation:

We want P(X > r), where X ~ B(n+r, p).

Recall: E[X] = (n+r)*p, SD(X) = sqrt((n+r)*p*(1-p))

X is approximated by Y ~ N(mu, sigma), where mu = E[X], sigma = SD(X).

With a continuity correction: P(X > r) ~= P(Y >= r + 0.5)

For this to be reasonably accurate, the usual rule of thumb requires np >= 10 and n(1-p) >= 10.

This is a problem, since we also want a solution that works in all cases. So when the approximation breaks down, we evaluate the binomial CDF directly (see code).

The approximation tends to be close to the exact solution, typically off by less than 1 packet in my experiments.

Here is some C++ code that implements it:

#include <cmath>

using namespace std;

#define INVSQRT2 0.70710678118655

// Precondition: r > 0
// Returns probability of loss for given configuration
double NormalApproximation(int n, int r, double p) {
    const int m = n + r;
    double u = m * p;               // mean of B(m, p)
    double s = sqrt(u * (1. - p));  // standard deviation of B(m, p)

    return 0.5 * erfc(INVSQRT2 * (r - u - 0.5) / s);
}

int CalculateApproximate(double p, int n, double Qtarget) {
    int r = 0;
    double q;

    do {
        ++r;
        q = NormalApproximation(n, r, p);
    } while (q > Qtarget);

    ++r;

    // Add one extra symbol to fix error of approximation
    if (n * p < 10. || n * (1 - p) < 10.) {
        ++r;
    }

    return r;
}

It tries successive values of r until the predicted loss is acceptable. A faster approach would estimate the rate of convergence and skip intermediate values of r, since the loop can run over 300 times in interesting cases.
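One way to do that, sketched below with my own code (the repository may take a different approach): the approximation decreases monotonically in r, so an exponential search for an upper bound followed by a binary search replaces the linear scan. The extra safety increments from the code above can still be applied to the result.

```cpp
#include <cmath>

// Same normal approximation as the code above.
double NormalApproximationFast(int n, int r, double p) {
    const double invSqrt2 = 0.70710678118655;
    const double u = (n + r) * p;
    const double s = std::sqrt(u * (1. - p));
    return 0.5 * std::erfc(invSqrt2 * (r - u - 0.5) / s);
}

// Smallest r with predicted loss <= Qtarget, found in O(log r) evaluations
// instead of r evaluations, using monotonicity of the approximation in r.
int CalculateApproximateFast(double p, int n, double Qtarget) {
    int lo = 1, hi = 2;
    // Exponential search: double hi until it satisfies the target.
    while (NormalApproximationFast(n, hi, p) > Qtarget) hi *= 2;
    // Binary search for the first satisfying r in [lo, hi].
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (NormalApproximationFast(n, mid, p) > Qtarget) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}
```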

Discussion

Interestingly, adding just a few more redundant packets tends to reduce the perceived loss by a LOT. Most of the redundancy is spent going from 3% to 0.01%; going from 0.01% to 0.001% only requires bumping the redundancy by another 10 packets out of 1000 or so. This means that when FEC is used, it is economical to target very, very low loss rates.

A larger N is also more efficient. Increasing N from 100 to 1000 cuts the bandwidth required for redundant data by more than half for the same target:

N = 100, p = 3%, target = 0.0001% => R = 15 packets (15% more bandwidth)

N = 1000, p = 3%, target = 0.0001% => R = 60 packets (6% more bandwidth)

N = 64000, p = 3%, target = 0.0001% => R = 2196 packets (3.4% more bandwidth)

A larger N is more efficient for other reasons, too: compression works better on larger data chunks, and operating on bulk data is more efficient. But the FEC encoder is limited to N < 64,000.

An ideal operating region for FEC encoders seems to be around 1024 symbols. I've noticed that the encoder's throughput (MB/s) is good there, and it's hard to get much better bandwidth efficiency.

last edited by catid, 6:42pm Sat. Sep 21st 2013 PDT