MacWilliams Identities

INTRODUCTION
The aim of the study of error-correcting codes is to transmit a given message in some encoded form such that the maximum possible error can be tolerated during transmission with minimum redundancy. The big question in error-correcting codes is: "How much redundancy is needed to correct a given amount of errors?" Phrased in terms of the rate and relative distance of a family of codes, which capture the redundancy and the error-correcting capability respectively [1], the question becomes: "Given that the relative distance of the code is $ \delta $, what is the best rate $ R $ that can be achieved?" [2]. In this article, the MacWilliams identities are used to derive a linear programming bound (the MRRW bound) on the rate of a family of codes. The linear programming (LP) bound grew out of Jessie MacWilliams's research, stated as the MacWilliams identities, which relate the weight enumerator function of a linear code to that of its dual. For binary codes, the LP bound gives the best known upper bound on $ R $ as a function of $ \delta $.


___________________________________________________________________________________________________________________

1. STATEMENT
The MacWilliams identities relate the weight enumerator $ A(z) $ of a linear code to the weight enumerator $ B(z) $ of its dual code:

$ B(z) = 2^{-k}(1+z)^n A\left(\dfrac{1-z}{1+z}\right) $ where $ k $ is the number of bits in the original message and $ n $ is the number of bits in the encoded message.

1.1. Dual of a linear code :

A linear code $ C $ is specified by a generator matrix $ G $ of dimension $ k \times n $. The parity check matrix $ H $ of $ C $ is of order $ (n-k) \times n $ and satisfies $ GH^T = 0 $. The dual of the linear code $ C $, denoted $ C^\perp $, is the code generated by $ H $.

1.2 Properties of duals:
Proposition 1 :
1) If $ x \in C^\perp $ then for every codeword $ c \in C $, $ \langle c,x \rangle = 0 $.

2) If $ x \notin C^\perp $ then for any $ \alpha, \beta \in \mathbb{F}_q $ the two sets $ \{ c \in C \mid \langle c,x \rangle = \alpha \} $ and $ \{ c \in C \mid \langle c,x \rangle = \beta \} $ have the same number of elements.

Proof:

The first part follows from the fact that every codeword of $ C $ is a linear combination of rows of the generator matrix $ G $, every element of the dual is a linear combination of rows of the parity check matrix $ H $, and $ GH^T = 0 $.

For the second part, it suffices to show that $ |S_\alpha| = |S_0| $ for every $ \alpha \neq 0 $, where $ S_\alpha = \{ c \in C \mid \langle c,x \rangle = \alpha \} $ and $ S_0 = \{ c \in C \mid \langle c,x \rangle = 0 \} $. We first note that both sets are nonempty. Since $ 0 \in S_0 $, the set $ S_0 $ is nonempty. As $ x $ is not a member of the dual, there is a codeword $ c \in C $ such that $ \langle c,x \rangle = \alpha' \neq 0 $. Then $ b = (\alpha')^{-1} \alpha c $ is a codeword of $ C $ with $ \langle b,x \rangle = \alpha $, so $ b \in S_\alpha $, which implies that $ S_\alpha $ is also nonempty. Next, we prove that both sets have the same number of elements from the following observations:

1) If $ a \in S_0 $ then $ \langle b+a, x \rangle = \langle b,x \rangle + \langle a,x \rangle = \alpha + 0 = \alpha $, where $ b \in S_\alpha $ is the codeword found above. Thus $ b + S_0 \subseteq S_\alpha $.

2) $ |b + S_0| = |S_0| $, since $ b + a = b + a' $ implies $ a = a' $.

3) Since $ b + S_0 $ is a subset of $ S_\alpha $ and $ |b + S_0| = |S_0| $, we infer that $ |S_0| \leq |S_\alpha| $.

4) Similarly, we can prove that $ |S_\alpha| \leq |S_0| $. Thus both sets have the same cardinality.
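As a sanity check, the following Python sketch enumerates a tiny code over $ \mathbb{F}_2 $ (the choice of generator matrix and of the vector $ x $ is purely illustrative) and verifies both parts of Proposition 1 by brute force:

```python
from itertools import product

def dot(u, v):
    """Inner product over F_2."""
    return sum(a * b for a, b in zip(u, v)) % 2

# Code generated by G = [110, 011] over F_2: C = {000, 110, 011, 101}.
G = [(1, 1, 0), (0, 1, 1)]
C = {tuple(sum(m[i] * G[i][j] for i in range(2)) % 2 for j in range(3))
     for m in product([0, 1], repeat=2)}

# Dual code: all vectors orthogonal to every codeword (part 1 of Proposition 1).
dual = {x for x in product([0, 1], repeat=3) if all(dot(c, x) == 0 for c in C)}

x = (1, 0, 0)                       # a vector outside the dual
assert x not in dual
S0 = {c for c in C if dot(c, x) == 0}
S1 = {c for c in C if dot(c, x) == 1}
print(len(S0), len(S1))             # the two sets have equal size (part 2)
```

Here the dual turns out to be $ \{000, 111\} $, and for $ x = 100 $ the sets $ S_0 $ and $ S_1 $ each contain two codewords.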

1.3 Weight Enumerator, Weight Distribution and Weight Generating Function :

If we have a linear code $ C $ with parameters $ [n,k,d]_q $ and an index $ i \in \{0,1,2,\dots,n\} $, then the $ i $th weight enumerator of $ C $ is the number of codewords in $ C $ having Hamming weight $ i $.

The weight distribution of a linear code $ C $ is the sequence $ \{ A_0, A_1, A_2, \dots, A_n \} $ where $ A_i $ is the $ i $th weight enumerator of $ C $.

The weight generating function of a linear code is defined as $ A_C(z) = \sum_{i=0}^{n} A_i z^i $ where $ \{ A_0, A_1, A_2, \dots, A_n \} $ is the weight distribution of $ C $.
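The definitions above can be made concrete with a short Python sketch; the generator matrix below is one standard choice for the $ [7,4,3] $ binary Hamming code:

```python
from itertools import product

# Generator matrix of the [7,4,3] binary Hamming code (one standard choice).
G = [(1, 0, 0, 0, 0, 1, 1),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1, 0),
     (0, 0, 0, 1, 1, 1, 1)]
n, k = 7, 4

def encode(m):
    """Encode a k-bit message as m * G over F_2."""
    return tuple(sum(m[i] * G[i][j] for i in range(k)) % 2 for j in range(n))

codewords = [encode(m) for m in product([0, 1], repeat=k)]

# Weight distribution: A[i] = number of codewords of Hamming weight i.
A = [0] * (n + 1)
for c in codewords:
    A[sum(c)] += 1
print(A)   # weight generating function A_C(z) = 1 + 7 z^3 + 7 z^4 + z^7
```

Running this yields the weight distribution $ \{1, 0, 0, 7, 7, 0, 0, 1\} $, i.e., seven codewords of weight 3, seven of weight 4, plus the all-zeros and all-ones words.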

1.4 Extended Generating Function :
The extended generating function carries all the information about a linear code, unlike the weight generating function, which records only how many codewords there are of each weight. The extended generating function is defined for a bit, for a word and for a code.

1.4.1 Extended weight generating function of a bit :
For a bit $ b \in \{0,1\} $ the extended weight generating function $ W_b(x,y) $ is a polynomial in $ x $ and $ y $ defined as
$ W_b(x,y) = x $ if $ b = 0 $
$ W_b(x,y) = y $ if $ b = 1 $

1.4.2 Extended weight generating function of a word :
For a vector $ b = \langle b_1, b_2, \dots, b_n \rangle \in \{0,1\}^n $, the extended weight generating function $ W_b(x,y) $ is a polynomial in the $ 2n $ variables $ x = \langle x_1, x_2, \dots, x_n \rangle $ and $ y = \langle y_1, y_2, \dots, y_n \rangle $, defined as $ W_b(x,y) = \prod_{i=1}^{n} W_{b_i}(x_i, y_i) $
Example: $ b = 1000101010111 $
$ W_b(x,y) = y_1 x_2 x_3 x_4 y_5 x_6 y_7 x_8 y_9 x_{10} y_{11} y_{12} y_{13} $
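Since the extended weight generating function of a word is a single monomial, it can be built mechanically; the small sketch below represents the monomial as a string (the string representation is an illustrative choice):

```python
def word_ewgf(b):
    """Monomial W_b as a string: x_i for b_i = 0, y_i for b_i = 1 (1-indexed)."""
    return " ".join(("y" if bit == "1" else "x") + str(i + 1)
                    for i, bit in enumerate(b))

print(word_ewgf("1000101010111"))
# y1 x2 x3 x4 y5 x6 y7 x8 y9 x10 y11 y12 y13
```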

1.4.3 Extended weight generating function of a code :
For a code $ C \subseteq \{0,1\}^n $, the extended weight generating function $ W_C(x,y) $ is again a polynomial in the $ 2n $ variables $ x = \langle x_1, x_2, \dots, x_n \rangle $ and $ y = \langle y_1, y_2, \dots, y_n \rangle $, defined as $ W_C(x,y) = \sum_{b \in C} W_b(x,y) $
Example: Let $ C = \{010, 011, 001, 101, 000\} $
$ W_C(x,y) = x_1 y_2 x_3 + x_1 y_2 y_3 + x_1 x_2 y_3 + y_1 x_2 y_3 + x_1 x_2 x_3 $

These extended generating functions carry all the information about the code (the codewords of each weight $ i $, where $ i $ ranges up to the length of the code, the number of codewords, and so on) and are very useful when studying the combinatorics of the code, though they are not computationally efficient. If we have the extended generating function of a code, then we can compute the extended generating function of its dual: since $ W_C(x,y) $ contains all the information about $ C $, we can recover $ C $ from it, derive the dual $ C^\perp $, and in turn obtain the extended generating function of $ C^\perp $.

The core of the MacWilliams identities is the discrete Fourier transform (DFT), or in particular the Hadamard transform. Applied to the extended weight generating function of a word, this transform acts according to the following lemma.

Lemma 1.4.4:

$ W_b(x+y, x-y) = \sum_{c \in \{0,1\}^n} (-1)^{\langle b,c \rangle} W_c(x,y) $, where $ x+y $ and $ x-y $ denote the componentwise vectors $ \langle x_1+y_1, \dots, x_n+y_n \rangle $ and $ \langle x_1-y_1, \dots, x_n-y_n \rangle $.

Proof:

From the definition of $ W_b $ we have
$ W_b(x+y, x-y) = \prod_{i=1}^n W_{b_i}(x_i+y_i,\, x_i-y_i) = \prod_{i=1}^n \left( x_i + (-1)^{b_i} y_i \right) $

Expanding the product yields $ 2^n $ terms of the form $ \pm \prod_{i=1}^n z_i $ where $ z_i \in \{x_i, y_i\} $. Index each term by the vector $ c \in \{0,1\}^n $ with $ c_i = 1 $ exactly when $ z_i = y_i $; then $ z_i = W_{c_i}(x_i, y_i) $, and the sign of the term is negative exactly when the number of positions $ i $ with $ b_i = c_i = 1 $ is odd, i.e., when $ \langle b,c \rangle = 1 $. Now,

$ \prod_{i=1}^n \left( x_i + (-1)^{b_i} y_i \right) = \sum_{c \in \{0,1\}^n} (-1)^{\langle b,c \rangle} \prod_{i=1}^n W_{c_i}(x_i, y_i) = \sum_{c \in \{0,1\}^n} (-1)^{\langle b,c \rangle} W_c(x,y) $

This lemma is used to prove the following theorem.

Theorem 1.4.5 (Generalized MacWilliams Identity):
The extended weight generating functions of a code $ C $ and its dual $ C^\perp $ satisfy the equation

$ W_{C^\perp}(x,y)=\dfrac{1}{|C|}W_C(x+y,x-y) $
Proof :
From Lemma 1.4.4,

$ W_C(x+y, x-y) = \sum_{b \in C} W_b(x+y, x-y) $
$ = \sum_{b \in C} \sum_{c \in \{0,1\}^n} (-1)^{\langle b,c \rangle} W_c(x,y) $ (by Lemma 1.4.4)
$ = \sum_{c \in C^\perp} \sum_{b \in C} (-1)^{\langle b,c \rangle} W_c(x,y) + \sum_{c \notin C^\perp} \sum_{b \in C} (-1)^{\langle b,c \rangle} W_c(x,y) $
For $ c \in C^\perp $ we have $ \langle b,c \rangle = 0 $ for every $ b \in C $ (Proposition 1, part 1), so the first sum equals $ |C| \sum_{c \in C^\perp} W_c(x,y) = |C|\,W_{C^\perp}(x,y) $. For $ c \notin C^\perp $, exactly half of the codewords $ b \in C $ satisfy $ \langle b,c \rangle = 0 $ and half satisfy $ \langle b,c \rangle = 1 $ (Proposition 1, part 2), so the inner sum vanishes. Hence $ W_C(x+y, x-y) = |C|\,W_{C^\perp}(x,y) $.

1.5 Corollary :
Every linear code $ C $ satisfies $ x^n A_C(y/x) = W_C(x, \dots, x, y, \dots, y) $.
Proof:
Setting $ x_i = x $ and $ y_i = y $ for all $ i $ in $ W_C $ we get:
$ W_C(x, \dots, x, y, \dots, y) = \sum_{b \in C} x^{n - wt(b)} y^{wt(b)} $
$ = x^n \sum_{b \in C} (y/x)^{wt(b)} = x^n \sum_{i=0}^{n} \sum_{b \in C,\, wt(b) = i} (y/x)^i $
$ = x^n \sum_{i=0}^{n} A_i (y/x)^i = x^n A_C(y/x) $

Proposition 2:
$ A_{C^\perp}(y) = W_{C^\perp}(1, \dots, 1, y, \dots, y) $
Proof :

Applying Corollary 1.5 to the code $ C^\perp $ with $ x = 1 $:
$ W_{C^\perp}(1, \dots, 1, y, \dots, y) = \sum_{b \in C^\perp} 1^{n - wt(b)} y^{wt(b)} = \sum_{i=0}^{n} B_i y^i = A_{C^\perp}(y) $
where $ \{B_0, B_1, \dots, B_n\} $ is the weight distribution of $ C^\perp $.

___________________________________________________________________________________________________________________


2. THE MACWILLIAMS IDENTITIES

2.1 Binary Case :
$ A_{C^\perp}(y) = \dfrac{1}{|C|}(1+y)^n A_C\left(\dfrac{1-y}{1+y}\right) $
Proof :
From Proposition 2 we have $ A_{C^\perp}(y) = W_{C^\perp}(1, \dots, 1, y, \dots, y) $
From Theorem 1.4.5 we have
$ W_{C^\perp}(1, \dots, 1, y, \dots, y) = \dfrac{1}{|C|} W_C(1+y, \dots, 1+y, 1-y, \dots, 1-y) $

Again applying Corollary 1.5, we get

$ W_C(1+y, \dots, 1+y, 1-y, \dots, 1-y) = (1+y)^n A_C\left(\dfrac{1-y}{1+y}\right) $

From above two equations, we have

$  A_{C^\perp}(y)=(1+y)^n \dfrac{1}{|C|} A_C \left( \dfrac{1-y}{1+y} \right) $
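The binary identity can be checked numerically. The sketch below takes the $ [3,1] $ repetition code, computes the dual's weight distribution by brute force, and compares it with the right-hand side $ \frac{1}{|C|}\sum_j A_j (1+y)^{n-j}(1-y)^j $ of the identity:

```python
from itertools import product

def weight_dist(code, n):
    """Weight distribution A_0, ..., A_n of a set of binary words."""
    A = [0] * (n + 1)
    for c in code:
        A[sum(c)] += 1
    return A

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_pow(p, e):
    r = [1]
    for _ in range(e):
        r = poly_mul(r, p)
    return r

n = 3
C = [(0, 0, 0), (1, 1, 1)]                   # [3,1] repetition code
dual = [x for x in product([0, 1], repeat=n)
        if all(sum(a * b for a, b in zip(c, x)) % 2 == 0 for c in C)]

A = weight_dist(C, n)
B = weight_dist(dual, n)                     # dual distribution by brute force

# Right-hand side of the identity: (1/|C|) sum_j A_j (1+y)^(n-j) (1-y)^j
rhs = [0] * (n + 1)
for j, Aj in enumerate(A):
    term = poly_mul(poly_pow([1, 1], n - j), poly_pow([1, -1], j))
    rhs = [r + Aj * t for r, t in zip(rhs, term)]
rhs = [r // len(C) for r in rhs]

print(B, rhs)   # both equal [1, 0, 3, 0]
```

The dual of the repetition code is the even-weight code $ \{000, 110, 101, 011\} $, so both sides give $ 1 + 3y^2 $.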

2.2. General q-ary case :

If the alphabet over which $ C $ is defined has size $ q $, then the relation between the weight enumerator of $ C $ and that of its dual $ C^\perp $ is given by

$  A_{C^\perp }(y) = \dfrac{ (1 + (q-1)y)^n}{|C|}A_C \left( \dfrac{1-y}{1+(q-1)y} \right) $

For a complete proof of the $ q $-ary case, refer to the IEEE paper "A simple derivation of the MacWilliams identity for linear codes".

___________________________________________________________________________________________________________________

3. WEIGHT DISTRIBUTION OF DUAL CODE AS LINEAR FUNCTION OF WEIGHT DISTRIBUTION OF PRIMAL CODE
Having obtained the MacWilliams identities for binary and $ q $-ary codes, we now determine the exact relationship between the weight distribution of a code and that of its dual. The following corollary establishes it:

3.1 Corollary
For a given linear code $ C $ with weight distribution vector $ a = \langle A_0, \dots, A_n \rangle $, let $ b = \langle B_0, \dots, B_n \rangle $ be the weight distribution of its dual $ C^\perp $. Then $ b = \dfrac{1}{|C|}\, a M $, where $ M $ is an $ (n+1) \times (n+1) $ matrix whose entry in row $ j $, column $ i $ is
$ m_{ji} = \sum_{l=0}^{i} \dbinom{n-j}{i-l} \dbinom{j}{l} (q-1)^{i-l} (-1)^l $
(these are the values $ K_i(j) $ of the Krawtchouk polynomials).
Proof
We can see that $ B_i $ is the coefficient of $ y^i $ in the expression for $ A_{C^\perp}(y) $ derived as the MacWilliams identity, i.e.,
$ A_{C^\perp}(y) = \dfrac{(1 + (q-1)y)^n}{|C|} A_C\left(\dfrac{1-y}{1+(q-1)y}\right) $

This can be rewritten as:
$ \sum_{i=0}^{n} B_i y^i = \dfrac{1}{|C|} \sum_{j=0}^{n} A_j (1+(q-1)y)^{n-j} (1-y)^{j} $

The coefficient of $ y^i $ on the right-hand side is obtained by summing over $ l $ the product of the coefficient of $ y^{i-l} $ in $ (1+(q-1)y)^{n-j} $ and the coefficient of $ y^l $ in $ (1-y)^{j} $; expanding both factors by the binomial theorem gives

$ B_i = \dfrac{1}{|C|} \sum_{j=0}^{n} A_j \sum_{l=0}^{i} \dbinom{n-j}{i-l} \dbinom{j}{l} (q-1)^{i-l} (-1)^l = \dfrac{1}{|C|} \sum_{j=0}^{n} A_j m_{ji} $

Thus we have expressed each $ B_i $ as a linear function of the weight distribution of $ C $. Using the MacWilliams identity we can therefore derive an upper bound on the rate of a code (the argument actually applies to linear and nonlinear codes alike, but we restrict our discussion to linear codes). This forms the basis of the following section, where the rate of a code is bounded by a linear program.
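The corollary can be tested on a small example. The following sketch builds the matrix of Krawtchouk values from the binomial formula above and recovers the dual weight distribution of the $ [3,1] $ repetition code:

```python
from math import comb
from itertools import product

def krawtchouk_matrix(n, q=2):
    """Matrix M with entry in row j, column i equal to
    K_i(j) = sum_l C(n-j, i-l) C(j, l) (q-1)^(i-l) (-1)^l."""
    return [[sum(comb(n - j, i - l) * comb(j, l) * (q - 1) ** (i - l) * (-1) ** l
                 for l in range(i + 1))
             for i in range(n + 1)]
            for j in range(n + 1)]

n = 3
a = [1, 0, 0, 1]          # weight distribution of the [3,1] repetition code
M = krawtchouk_matrix(n)
size = sum(a)             # |C| = sum of the weight distribution = 2
b = [sum(a[j] * M[j][i] for j in range(n + 1)) // size for i in range(n + 1)]
print(b)                  # weight distribution of the dual (even-weight) code
```

The output is $ [1, 0, 3, 0] $, the weight distribution of the even-weight code, matching the brute-force computation of the dual.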


___________________________________________________________________________________________________________________

4. LINEAR PROGRAMMING BOUND

In 1973, Philippe Delsarte proposed a bound on the rate of a code with prescribed minimum distance using the linear programming method. The LP bound is one of the most powerful applications of the MacWilliams identities and yields some of the best known upper bounds on the rate of binary codes.

Let $ A_0, A_1, \dots, A_n $ be the weight distribution of an $ [n,k,d]_q $ linear code of length $ n $, rank $ k $ and distance $ d $. The number of codewords is $ \sum_{i} A_i $. We have $ A_0 = 1 $; all weights below the minimum distance $ d $ are absent, i.e., $ A_1 = A_2 = \dots = A_{d-1} = 0 $; and $ A_d, \dots, A_n \geq 0 $.
The rate of a code is the ratio of $ k $ (the length of the original message) to $ n $; since the number of codewords is $ q^k = \sum_i A_i $, the rate is
$ R = \dfrac{\log_q \sum_{i} A_i}{n} $.

To derive an upper bound on the rate $ R $, it suffices to maximize the linear function $ \sum_{i} A_i $ subject to the following constraints:
a) $ A_0 = 1 $
b) $ A_1 = A_2 = \dots = A_{d-1} = 0 $
c) $ A_d, \dots, A_n \geq 0 $
d) $ aM \geq 0 $ componentwise (the entries of $ \frac{1}{|C|} aM $ form the weight distribution of the dual code and hence cannot be negative)
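The constraints can be checked mechanically for a known code. The sketch below verifies that the weight distribution of the $ [7,4,3] $ Hamming code is a feasible point of this linear program; an LP solver (e.g. scipy.optimize.linprog) could then maximize $ \sum_i A_i $ over the same constraints, but here we only check feasibility:

```python
from math import comb

n, d, q = 7, 3, 2
a = [1, 0, 0, 7, 7, 0, 0, 1]     # weight distribution of the [7,4,3] Hamming code

# Constraints (a)-(c).
assert a[0] == 1
assert all(a[i] == 0 for i in range(1, d))
assert all(a[i] >= 0 for i in range(d, n + 1))

# Constraint (d): the dual distribution (1/|C|) aM must be nonnegative.
def K(i, j):
    """Krawtchouk value: entry of M in row j, column i."""
    return sum(comb(n - j, i - l) * comb(j, l) * (q - 1) ** (i - l) * (-1) ** l
               for l in range(i + 1))

size = sum(a)                                   # |C| = 16
b = [sum(a[j] * K(i, j) for j in range(n + 1)) // size for i in range(n + 1)]
assert all(x >= 0 for x in b)
print(b)   # the dual of the Hamming code: the [7,3,4] simplex code
```

The dual distribution comes out as $ [1, 0, 0, 0, 7, 0, 0, 0] $: the simplex code, whose seven nonzero codewords all have weight 4.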

___________________________________________________________________________________________________________________

5. MRRW BOUNDS
A breakthrough in the asymptotic analysis of the LP bound was made by McEliece, Rodemich, Rumsey and Welch, who derived two upper bounds from the linear program described above. These are formally known as the MRRW bounds. The first bound is the best known asymptotic upper bound for binary codes roughly in the range $ \delta \in (0.273, 1/2) $.

We state the bounds for binary and $ q $-ary linear codes in their simplest forms in the following sections.
For the complete proof of the $ q $-ary case, see "New Upper Bounds on the Rate of a Code, via the Delsarte–MacWilliams Inequalities" (IEEE Transactions on Information Theory).

5.1 First MRRW Bound for $ q $-ary Codes
The rate of a family of $ q $-ary codes of relative distance $ \delta $, where $ 0 < \delta < 1 - \dfrac{1}{q} $, satisfies
$ R \leq h_q\left(\dfrac{1}{q}\left(q - 1 - (q-2)\delta - 2\sqrt{(q-1)\delta(1-\delta)}\right)\right) + o(1) $

where $ h_q $ is the $ q $-ary entropy function defined by
$ h_q(x) = x\log_q(q-1) - x\log_q x - (1-x)\log_q(1-x) $
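Ignoring the asymptotically vanishing additive term, the bound is easy to evaluate. The following sketch implements the $ q $-ary entropy function and the first MRRW bound as stated above:

```python
from math import log, sqrt

def h_q(x, q=2):
    """q-ary entropy function; h_q(0) = 0 by convention."""
    if x == 0:
        return 0.0
    lq = lambda t: log(t, q)
    return x * lq(q - 1) - x * lq(x) - (1 - x) * lq(1 - x)

def mrrw1(delta, q=2):
    """First MRRW bound on the rate, ignoring the vanishing additive term."""
    arg = (q - 1 - (q - 2) * delta - 2 * sqrt((q - 1) * delta * (1 - delta))) / q
    return h_q(arg, q)

print(round(mrrw1(0.3), 4))
```

For binary codes ($ q = 2 $) the argument simplifies to $ \frac{1}{2} - \sqrt{\delta(1-\delta)} $, and the bound vanishes as $ \delta \to 1/2 $, as expected.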

5.2 First MRRW Bound for binary Codes
The MRRW bound for binary codes is obtained by setting $ q=2 $ in the $ q $-ary MRRW bound and is stated as follows:
The rate of a family of binary codes of relative distance $ \delta $ satisfies
$ R \leq h\left(\dfrac{1}{2} - \sqrt{\delta(1-\delta)}\right) + o(1) $
where $ h(x) = h_2(x) = -x\log_2 x - (1-x)\log_2(1-x) $.

Proof
The above bound is proved on the basis of following theorem stated.

5.2.1 Theorem : Dual codes have a small "essential covering radius"
Let $ C $ be a linear code with minimum distance $ d $, length $ n $ and dual $ C^\perp $. Then, for a suitable radius $ r $:
$ \left|\bigcup_{z \in C^\perp} B(z,r)\right| \geq \dfrac{2^n}{n} $
where $ B(z,r) $ is the Hamming ball of radius $ r $ around $ z $:
$ B(z,r) = \{ x \in \{0,1\}^n \mid \Delta(z,x) \leq r \} $
and $ \Delta(z,x) $ denotes the Hamming distance between $ z $ and $ x $.
Proof
By inspection, $ B(z,r) = z + B(0,r) $. We proceed by considering two propositions.
(i) If a set $ B \subseteq \{0,1\}^n $ has a sufficiently large maximum eigenvalue, then $ \bigcup_{z \in C^\perp} (z+B) $ covers a significant (precisely, a $ \dfrac{1}{n} $) fraction of the whole space:
For $ B \subseteq \{0,1\}^n $ with $ \Lambda_B \geq n - 2d + 1 $,
$ \left|\bigcup_{z \in C^\perp} (z+B)\right| \geq \dfrac{2^n}{n} $
where $ \Lambda_B $ is the maximal eigenvalue of the adjacency matrix of the subgraph of the hypercube $ \{0,1\}^n $ induced by the vertices of $ B $.
(ii) The ball $ B = B(0,r) $ achieves the required maximum eigenvalue even when $ r $ is not too large:
$ \Lambda_{B(0,r)} \geq 2\sqrt{r(n-r)} - o(n) $

We pick a radius $ r $ such that $ \Lambda_{B(0,r)} \geq n - 2d + 1 $.
This is satisfied if:
$ 2\sqrt{r(n-r)} - o(n) \geq n - 2d + 1 $
i.e., if $ 4r(n-r) \geq (n-2d)^2 + o(n^2) $, which holds for the choice
$ r = \dfrac{n}{2} - \sqrt{d(n-d)} + o(n) $

The proofs of the propositions in 5.2.1 can be found in Venkatesan Guruswami's lecture notes.

Having stated the propositions we are ready to prove the First MRRW Bound on Binary linear Codes.

Consider the following expression:
$ |C^\perp| \cdot \mathrm{Vol}(n,r) \geq \left|\bigcup_{z \in C^\perp} B(z,r)\right| \geq \dfrac{2^n}{n} $
The first inequality follows since the volume of each ball $ B(z,r) $ is exactly $ \mathrm{Vol}(n,r) $; the second is Theorem 5.2.1. The sizes of a code $ C $ and its dual are related by $ |C||C^\perp| = 2^n $.
Combining the above two relations, $ |C| = \dfrac{2^n}{|C^\perp|} \leq n \cdot \mathrm{Vol}(n,r) $.
Substituting the choice of $ r $ from the earlier theorem, setting $ d = \delta n $, and using the volume estimate $ \mathrm{Vol}(n,r) \leq 2^{h(r/n)n} $, we get
$ |C| \leq n \cdot \mathrm{Vol}(n,r) \leq n \cdot 2^{h\left(\frac{1}{2} - \sqrt{\delta(1-\delta)}\right)n + o(n)} $
i.e., $ R(\delta) \leq h\left(\dfrac{1}{2} - \sqrt{\delta(1-\delta)}\right) + o(1) $

The graph below compares the first MRRW bound with the Gilbert–Varshamov bound and the Elias–Bassalygo bound, plotting the rate $ R $ (on the $ y $ axis) against the relative distance $ \delta $ (on the $ x $ axis).
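The values in such a comparison can be reproduced numerically. The sketch below evaluates the Gilbert–Varshamov lower bound and the Elias–Bassalygo and first MRRW upper bounds at a few values of $ \delta $:

```python
from math import log2, sqrt

def h(x):
    """Binary entropy, with h(0) = h(1) = 0."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def gv(delta):                       # Gilbert–Varshamov lower bound
    return 1 - h(delta)

def mrrw1(delta):                    # first MRRW (LP) upper bound
    return h(0.5 - sqrt(delta * (1 - delta)))

def elias_bassalygo(delta):          # Elias–Bassalygo upper bound
    return 1 - h((1 - sqrt(1 - 2 * delta)) / 2)

for delta in (0.1, 0.2, 0.3, 0.4):
    print(f"{delta:.1f}  GV={gv(delta):.4f}  MRRW1={mrrw1(delta):.4f}  "
          f"EB={elias_bassalygo(delta):.4f}")
```

At $ \delta = 0.3 $, for instance, the achievable GV rate lies below the MRRW bound, which in turn lies below the Elias–Bassalygo bound; all three curves meet $ R = 0 $ at $ \delta = 1/2 $.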

5.3 Second MRRW Bound for binary Codes
For $ 0 < \delta < \dfrac{1}{2} $, the rate of a family of binary codes of relative distance $ \delta $ satisfies
$ R \leq MRRW^{(2)}(\delta) + o(1) $ where
$ MRRW^{(2)}(\delta) = \min_{\delta/2 \leq \epsilon \leq 1/2} \left( 1 - h(\epsilon) + R_{cw}(\epsilon, \delta) \right) $
$ R_{cw}(\epsilon, \delta) = h\left(\dfrac{1}{2}\left(1 - \sqrt{1 - \left(\sqrt{4\epsilon(1-\epsilon) - 2\delta + \delta^2} - \delta\right)^2}\right)\right) $ if $ \delta \leq 2\epsilon(1-\epsilon) $, and $ R_{cw}(\epsilon, \delta) = 0 $ otherwise.

The second MRRW bound is stated for binary codes and does not extend naturally to larger alphabets.
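The second bound can be evaluated by a simple grid search over $ \epsilon $; the sketch below implements the formulas as stated above (the grid resolution is an arbitrary choice). Note that choosing $ \epsilon = 1/2 $ makes the $ R_{cw} $ term collapse to the first bound, so $ MRRW^{(2)}(\delta) \leq MRRW^{(1)}(\delta) $ always:

```python
from math import log2, sqrt

def h(x):
    """Binary entropy, with h(0) = h(1) = 0."""
    return 0.0 if x in (0.0, 1.0) else -x * log2(x) - (1 - x) * log2(1 - x)

def mrrw1(delta):
    return h(0.5 - sqrt(delta * (1 - delta)))

def r_cw(eps, delta):
    if delta > 2 * eps * (1 - eps):
        return 0.0
    inner = sqrt(4 * eps * (1 - eps) - 2 * delta + delta ** 2) - delta
    return h(0.5 * (1 - sqrt(1 - inner ** 2)))

def mrrw2(delta, steps=2000):
    """Grid-search the minimum over eps in [delta/2, 1/2]."""
    lo, hi = delta / 2, 0.5
    return min(1 - h(eps) + r_cw(eps, delta)
               for eps in (lo + (hi - lo) * t / steps for t in range(steps + 1)))

print(round(mrrw2(0.3), 4), round(mrrw1(0.3), 4))
```

For $ \delta $ below roughly $ 0.273 $ the grid search returns a strictly smaller value than the first bound, while above that threshold the minimum is attained at $ \epsilon = 1/2 $ and the two bounds coincide.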


___________________________________________________________________________________________________________________

See Also:
1. Hamming Bound
2. Singleton Bound
3. Gilbert Varshamov Bound
4. Plotkin Bound


___________________________________________________________________________________________________________________

REFERENCES:

1. ^ Chapter 2: A Look at Some Nicely Behaved Codes: Linear Codes.
2. ^ Chapter 4: What Can and What Cannot Be Done

1. Madhu Sudan, Algorithmic Introduction to Coding Theory (lecture notes).
2. Venkatesan Guruswami, Introduction to Coding Theory, lecture notes on the Fourier transform, MacWilliams identities and the LP bound.
3. Venkatesan Guruswami, Introduction to Coding Theory, lecture notes on the proof of the MRRW bound.
4. Michael Navon and Alex Samorodnitsky, Linear Programming Bounds for Codes via a Covering Argument.
5. Venkatesan Guruswami, Atri Rudra and Madhu Sudan, Essential Coding Theory, Chapter 1: The Fundamental Question.