Performance Analysis of Turbo Codes


Introduction
The basic elements of turbo codes are convolutional codes [1,2] and decoding algorithms that use soft input and soft output [3,4,5,6]. The input bit sequence is encoded by two encoders separated by an interleaver, which ensures that the two encoded sequences are mutually independent. RSC (Recursive Systematic Convolutional) encoders [1,2] are often used. Each RSC encoder produces a systematic output, which is identical to the input information, and parity bits. Both parity sequences can be punctured before they are transmitted together with the systematic bits. Puncturing makes it possible to reduce the number of parity bits by half and thereby increase the information rate from 1/3 to 1/2.
Decoding requires special algorithms that use soft input and soft output [4,5,6]. These soft inputs and outputs indicate not only whether the decoded bit has the logical value 0 or 1 but also, in the form of a likelihood ratio, how reliable that decision is. The turbo decoder operates iteratively. In the first iteration the first decoder gives an estimate of the original data sequence based on the soft channel output. It also provides an extrinsic output. The extrinsic output for a given bit does not depend on the channel value received for this bit but on the information about the surrounding bits. This extrinsic output from the first decoder is used as a-priori information for the second decoder, together with the input information from the channel.
The second decoder in turn produces extrinsic information and a soft output. In the second iteration the extrinsic information produced by the second decoder in the first iteration is used as a-priori information for the first decoder. The decoder thus obtains a more accurate estimate of the decoded bits than in the first iteration. This cycle is repeated. In each iteration both decoders calculate a soft output and extrinsic information from the input sequence and from the a-priori information obtained from the extrinsic information of the previous decoder. With each iteration the BER (Bit Error Rate) decreases.
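The relation between a soft value and a bit probability can be illustrated with a short Python sketch. It assumes the common convention L(u) = ln(P(u = +1) / P(u = -1)) with bits mapped to +1 and -1; the function names are illustrative and do not come from the original article.

```python
import math


def llr_to_prob_plus(llr: float) -> float:
    """Return P(u = +1) for an LLR defined as ln(P(u=+1) / P(u=-1))."""
    # Two algebraically equal forms are used so that exp() never overflows.
    if llr >= 0:
        return 1.0 / (1.0 + math.exp(-llr))
    e = math.exp(llr)
    return e / (1.0 + e)


def hard_decision(llr: float) -> int:
    """The sign of the LLR gives the bit; its magnitude gives the reliability."""
    return 1 if llr >= 0 else -1
```

An LLR of 0 maps to a probability of 0.5, i.e. no knowledge about the bit, which is the a-priori value a decoder starts from when nothing is known yet.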

Turbo encoder
The block diagram of the turbo encoder [5,7] is shown in Fig. 1. Two identical encoders, usually RSC, are separated by an interleaver. It is possible to use a structure with more than two encoders, but in this article we will deal with the classical structure of two RSC encoders [1,2].
The input bit sequence is fed to the first encoder, whose output consists of systematic and parity bits. The input bits for the second encoder are first interleaved and then encoded, so that the input sequence of the second encoder becomes independent of the input sequence of the first encoder. Typically, a pseudo-random or block interleaver is used. The second encoder produces parity bits only. The outputs of the two encoders are punctured and then multiplexed. Usually both RSC encoders have an information rate of 1/2 and give one systematic and one parity bit for each input bit. This means that the turbo encoder output sequence contains one systematic and two parity bits for each input bit, i.e. $y'_{1s}, y'^{1}_{1l}, y'^{2}_{1l}, y'_{2s}, y'^{1}_{2l}, y'^{2}_{2l}, \ldots, y'_{ks}, y'^{1}_{kl}, y'^{2}_{kl}$. For this output sequence the turbo encoder has an information rate of 1/3. For the total information rate to be 1/2, the output bits of the turbo encoder must be punctured. The output sequence is punctured so that all the systematic bits are preserved and only the parity bits are punctured, because puncturing the systematic bits would degrade the code performance. After puncturing and multiplexing, with the parity bits taken alternately from the two encoders, the turbo encoder output sequence $x_{kl}$ would be $y'_{1s}, y'^{1}_{1l}, y'_{2s}, y'^{2}_{2l}, \ldots, y'_{ks}, y'^{1}_{kl}, y'_{(k+1)s}, y'^{2}_{(k+1)l}$.
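The encoder structure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the encoder used in the simulations: the default generator polynomials (feedback 7, feedforward 5 in octal, K = 3), the alternating puncturing pattern, the absence of trellis termination and all function names are choices made for the example.

```python
import random
from typing import List, Sequence


def rsc_parity(bits: Sequence[int], g_fb: int = 0o7, g_ff: int = 0o5, K: int = 3) -> List[int]:
    """Parity bits of a rate-1/2 RSC encoder; the systematic output is `bits` itself.

    Polynomials are K-bit masks whose most significant bit is the D^0 tap.
    """
    reg = 0  # K-1 register bits, most recent value in the most significant position
    low_mask = (1 << (K - 1)) - 1
    parity = []
    for u in bits:
        # Recursive part: input XOR feedback taps on the register contents.
        a = u ^ (bin(reg & g_fb & low_mask).count("1") & 1)
        full = (a << (K - 1)) | reg
        # Feedforward taps produce the parity bit.
        parity.append(bin(full & g_ff).count("1") & 1)
        reg = full >> 1
    return parity


def turbo_encode(bits: Sequence[int], perm: Sequence[int], puncture: bool = True) -> List[int]:
    """Multiplex systematic bits with parity bits of two RSC encoders (no termination)."""
    p1 = rsc_parity(bits)
    p2 = rsc_parity([bits[i] for i in perm])  # second encoder sees interleaved bits
    out = []
    for k, s in enumerate(bits):
        out.append(s)  # systematic bits are never punctured
        if not puncture:
            out += [p1[k], p2[k]]          # rate 1/3
        elif k % 2 == 0:
            out.append(p1[k])              # assumed pattern: encoder 1 on even k
        else:
            out.append(p2[k])              # encoder 2 on odd k, giving rate 1/2
    return out


def random_interleaver(n: int, seed: int = 0) -> List[int]:
    """Pseudo-random permutation of range(n); the seed makes it reproducible."""
    return random.Random(seed).sample(range(n), n)
```

For a frame of N input bits, `turbo_encode` returns 3N bits without puncturing and 2N bits with puncturing, which corresponds to the information rates of 1/3 and 1/2.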

A. Soft output Viterbi algorithm
The turbo decoder uses a modified Viterbi algorithm referred to as SOVA (Soft-Output Viterbi Algorithm) [3,4,5]. For decoding turbo codes, the Viterbi algorithm is modified in two ways. The first modification adapts the path metric so that it takes a-priori information into account when selecting the maximum likelihood path in the trellis diagram. The second modification provides a soft output in the form of the a-posteriori LLR (Log Likelihood Ratio) $L(u_k \mid \underline{y})$ for each decoded bit.
The first modification considers the state sequence $\underline{s}_k^s$, which gives the states along the surviving path at the state $S_k = s$ at stage $k$ in the trellis diagram. The metric should be easy to compute recursively when going from stage $k-1$ to stage $k$ in the trellis diagram. A suitable metric for the path $\underline{s}_k^s$ is defined as [4,5]:
$$M(\underline{s}_k^s) = M(\underline{s}_{k-1}^{s'}) + \frac{1}{2} u_k L(u_k) + \frac{L_c}{2} \sum_{l=1}^{n} y_{kl} x_{kl}, \tag{1}$$
where $M(\underline{s}_{k-1}^{s'})$ is the metric of the surviving path through the state $S_{k-1} = s'$ at stage $k-1$ in the trellis diagram, $u_k$ is the encoder input bit, $L(u_k)$ is its a-priori LLR, $n$ is the number of coded bits per transition, $x_{kl}$ is the transmitted channel sequence (output from the encoder) associated with a given transition, and $y_{kl}$ is the sequence received from the transmission channel for that transition. For a transmission channel with BPSK (Binary Phase Shift Keying) modulation and AWGN (Additive White Gaussian Noise), the channel reliability $L_c$ is defined as follows [5]:
$$L_c = 4 \alpha \frac{E_b}{2 \sigma^2}, \tag{2}$$
where $E_b$ is the transmitted energy per bit, $\alpha$ is the fading amplitude ($\alpha = 1$ for a non-fading AWGN channel), and $\sigma^2$ is the noise variance.
Now we will discuss the second modification of the algorithm, which is the soft output. In a binary trellis diagram there are two paths reaching the state $S_k = s$ at stage $k$. The modified Viterbi algorithm takes the a-priori information, calculates the metrics of these two paths according to Equation (1) and discards the path with the lower metric. When the two paths $\underline{s}_k^s$ and $\underline{\hat{s}}_k^s$ reaching state $S_k = s$ have the metrics $M(\underline{s}_k^s)$ and $M(\underline{\hat{s}}_k^s)$, respectively, and the path $\underline{s}_k^s$ with the higher metric is selected as the surviving path, we define the metric difference $\Delta_k^s$ of these paths as [4,5]:
$$\Delta_k^s = M(\underline{s}_k^s) - M(\underline{\hat{s}}_k^s) \geq 0, \tag{3}$$
where $M(\underline{s}_k^s)$ is the metric of the surviving path, and $M(\underline{\hat{s}}_k^s)$ is the metric of the discarded path.
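The metric recursion and the channel reliability can be illustrated with a small Python sketch. It assumes the form commonly given in the SOVA literature, M_k = M_(k-1) + u_k L(u_k)/2 + (L_c/2) Σ_l y_kl x_kl with L_c = 4αE_b/(2σ²), bits and code symbols mapped to +1 and -1; the function and parameter names are illustrative only.

```python
from typing import Sequence


def channel_reliability(eb: float, sigma2: float, alpha: float = 1.0) -> float:
    """L_c for BPSK over AWGN; alpha = 1 corresponds to a non-fading channel."""
    return 4.0 * alpha * eb / (2.0 * sigma2)


def path_metric(prev_metric: float, u_k: int, l_apriori: float, l_c: float,
                x: Sequence[int], y: Sequence[float]) -> float:
    """Extend a path metric by one trellis transition.

    u_k is the +/-1 input bit of the transition, x its +/-1 code symbols,
    y the corresponding received channel values.
    """
    correlation = sum(xi * yi for xi, yi in zip(x, y))
    return prev_metric + 0.5 * u_k * l_apriori + 0.5 * l_c * correlation


def metric_difference(surviving: float, discarded: float) -> float:
    """Difference between the metrics of the surviving and the discarded path."""
    return surviving - discarded
```

The a-priori term rewards transitions whose input bit agrees in sign with the a-priori LLR, and the correlation term rewards transitions whose code symbols agree with the received values.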
When we reach the end of the trellis diagram and find the ML (Maximum Likelihood) path, it is necessary to find the LLR, which gives the reliability of the bit decisions along the ML path. Observation of the Viterbi algorithm shows that all the surviving paths at a given stage in the trellis diagram normally come from the same path some steps before this stage. This common point is assumed to lie at most $\delta$ transitions back, where $\delta$ is usually set to five times the constraint length of the convolutional code. Therefore, the bit value $u_k$ associated with the transition from the state $S_{k-1} = s'$ to the state $S_k = s$ on the ML path might have been different if, instead of the ML path, the Viterbi algorithm had selected one of the paths that merged with the ML path up to $\delta$ transitions later, i.e. up to stage $k + \delta$ in the trellis diagram. If the algorithm had selected a path that merged with the ML path after that stage, the value $u_k$ would not be affected, because such a path diverges from the ML path only after the transition from $S_{k-1} = s'$ to $S_k = s$. When calculating the LLR for the bit $u_k$, SOVA therefore has to take into account the probability that the paths merging with the ML path from stage $k$ to stage $k + \delta$ were discarded incorrectly. This is done by comparing the metric differences $\Delta_i^{s_i}$ for all states $s_i$ along the ML path from $i = k$ to $i = k + \delta$. The LLR can be approximated as [4,5]:
$$L(u_k \mid \underline{y}) \approx u_k \min_{\substack{i = k, \ldots, k+\delta \\ u_k \neq u_k^i}} \Delta_i^{s_i}, \tag{4}$$
where $u_k$ is the bit value on the ML path, and $u_k^i$ is the value of this bit on the path that merged with the ML path and was discarded at stage $i$. The minimization in Equation (4) is carried out only over those paths merging with the ML path that would have given a different value for the bit $u_k$ if they had been selected as the surviving path. The paths that give the same value $u_k$ as the ML path do not affect the reliability of the decision.
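The minimization over the merging paths can be sketched in Python as follows. The data layout is an assumption made for the example: `deltas[i]` is the metric difference stored at stage i on the ML path, and `update_seqs[i][j]` is 1 when the path discarded at stage i would have given a different value for bit u_(i-j). The names are illustrative.

```python
import math
from typing import List, Sequence


def sova_llr(ml_bits: Sequence[int], deltas: Sequence[float],
             update_seqs: Sequence[Sequence[int]], delta_win: int) -> List[float]:
    """Approximate a-posteriori LLRs along the ML path.

    ml_bits holds the +/-1 decisions on the ML path. For each bit k the
    smallest metric difference is taken over stages k..k+delta_win whose
    discarded path disagrees with the ML path in bit k.
    """
    n = len(ml_bits)
    llrs = []
    for k in range(n):
        reliability = math.inf  # stays infinite if no competitor disagrees
        for i in range(k, min(k + delta_win, n - 1) + 1):
            j = i - k
            if j < len(update_seqs[i]) and update_seqs[i][j]:
                reliability = min(reliability, deltas[i])
        llrs.append(ml_bits[k] * reliability)
    return llrs
```

A practical decoder would normally clip the infinite values to a finite maximum; that step is left out here to keep the sketch close to the minimization itself.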
B. Implementation of the SOVA
SOVA is implemented as follows. In every state at every stage in the trellis diagram the metric $M(\underline{s}_k^s)$ is calculated for the two paths merging into the state using Equation (1). The path with the higher metric is chosen as the surviving path for this state and is recorded, as in the standard Viterbi algorithm. However, in order to provide the reliability of the decoded bits, the information needed to calculate $L(u_k \mid \underline{y})$ using Equation (4) is also stored. The metric difference between the surviving and the discarded path is stored together with a binary vector of $\delta + 1$ bits, which indicates whether the discarded path would have given the same bits $u_l$ as the surviving path for $l = k$ back to $l = k - \delta$. This series of bits is called the update sequence and is given by the modulo-2 sum of the previous $\delta + 1$ decoded bits along the surviving and the discarded paths. When SOVA identifies the ML path, the update sequences and metric differences stored along this path are used to calculate the values of $L(u_k \mid \underline{y})$.
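The update sequence is a bitwise XOR of two short bit histories. A minimal Python sketch, assuming the decoded bits along each path are kept as lists of 0 and 1 with the most recent bit last (the function name and this layout are illustrative):

```python
from typing import List, Sequence


def update_sequence(survivor_bits: Sequence[int], discarded_bits: Sequence[int],
                    delta_win: int) -> List[int]:
    """Modulo-2 sum of the last delta_win + 1 bits of both paths.

    Entry j of the result refers to bit u_(k-j): it is 1 where the
    discarded path disagrees with the surviving path and 0 where it agrees.
    """
    recent_survivor = list(survivor_bits[-(delta_win + 1):])[::-1]
    recent_discarded = list(discarded_bits[-(delta_win + 1):])[::-1]
    return [a ^ b for a, b in zip(recent_survivor, recent_discarded)]
```

Only the positions marked 1 can lower the reliability of a bit, because only there would choosing the discarded path have changed the decision.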
C. Iterative decoding
Now we will describe how iterative decoding works. Fig. 2 shows the schematic of the turbo decoder with the inputs and outputs of the individual blocks.
In the first iteration the first decoder receives the sequence $L_c \underline{y}^{(1)}$ from the transmission channel, which includes the systematic bits $L_c y_{ks}$ and the parity bits $L_c y_{kl}$ from the first encoder. Usually only half of the parity bits are received because the others have been punctured in the transmitter. The decoder inserts zeros at the punctured positions in the soft channel output $L_c y_{kl}$. The first decoder then processes the soft input from the channel. The output of the first decoder is the conditional LLR $L_{11}(u_k \mid \underline{y})$ of the data bits $u_k$, where $k = 1, 2, \ldots, N$. The subscript of the symbol $L_{11}(u_k \mid \underline{y})$ denotes the a-posteriori LLR in the first iteration from the first decoder. In the first iteration the first decoder has no a-priori information about the bits, therefore $L(u_k) = 0$, which corresponds to an a-priori probability of 0.5. Now the second decoder begins to operate. It receives the sequence $L_c \underline{y}^{(2)}$, which contains the interleaved systematic bits and the parity bits from the second encoder. Furthermore, it receives the a-priori LLRs $L(u_k)$, which are generated from the conditional LLR $L_{11}(u_k \mid \underline{y})$ of the first decoder. As can be seen from the figure, the extrinsic information $L_s(u_k)$ from the first decoder is reordered by the interleaver to match the sequence of input bits entering the second decoder. The second decoder uses this information and the received interleaved sequence $L_c \underline{y}^{(2)}$ to calculate the a-posteriori LLR $L_{12}(u_k \mid \underline{y})$. Then, according to the equation [4,5]:
$$L_s(u_k) = L(u_k \mid \underline{y}) - L_c y_{ks} - L(u_k), \tag{5}$$
the systematic soft input $L_c y_{ks}$ and the a-priori information $L(u_k)$ from the previous decoder are subtracted from the decoder output $L(u_k \mid \underline{y})$. The calculated value is the extrinsic information $L_s(u_k)$, and it is used as a-priori information for the first decoder in the second iteration. This ends the first iteration for both decoders.
In the second iteration the first decoder processes the received sequence $L_c \underline{y}^{(1)}$ again, but now it has a-priori information available, namely the de-interleaved extrinsic information $L_s(u_k)$ calculated by the second decoder in the first iteration from the a-posteriori LLR $L_{12}(u_k \mid \underline{y})$. The first decoder can now calculate a more accurate a-posteriori LLR $L_{21}(u_k \mid \underline{y})$. The second iteration continues in the second decoder. From the more accurate a-posteriori LLR $L_{21}(u_k \mid \underline{y})$ of the first decoder, more accurate a-priori information $L(u_k)$ is calculated using Equation (5). This information is used together with the received sequence $L_c \underline{y}^{(2)}$ to calculate $L_{22}(u_k \mid \underline{y})$, from which $L_s(u_k)$ is then calculated for the following (first) decoder.
When the series of iterations is completed, the turbo decoder output is given by de-interleaving the a-posteriori LLR $L_{i2}(u_k \mid \underline{y})$ of the second decoder, where $i$ is the number of iterations used. The signs of the a-posteriori LLRs give the hard decision output, that is $+1$ or $-1$.
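The exchange of extrinsic information between the two decoders can be summarized in a Python skeleton. It is only a sketch of the control flow: the component decoders are passed in as functions (`siso1`, `siso2`, hypothetical names) that take the scaled systematic values, the scaled parity values with zeros at punctured positions, and the a-priori LLRs, and return a-posteriori LLRs. A real SOVA component decoder is not included.

```python
from typing import Callable, List, Sequence, Tuple

Siso = Callable[[Sequence[float], Sequence[float], Sequence[float]], List[float]]


def turbo_decode(lc_ys: Sequence[float], lc_yp1: Sequence[float], lc_yp2: Sequence[float],
                 perm: Sequence[int], siso1: Siso, siso2: Siso,
                 iterations: int = 8) -> Tuple[List[int], List[float]]:
    """Iterative decoding loop; perm[j] is the source index of interleaved position j."""
    if iterations < 1:
        raise ValueError("at least one iteration is required")
    n = len(lc_ys)
    inv = [0] * n
    for pos, src in enumerate(perm):
        inv[src] = pos                          # inverse permutation for de-interleaving
    lc_ys_int = [lc_ys[i] for i in perm]        # interleaved systematic values
    apriori1 = [0.0] * n                        # no a-priori knowledge at the start
    post2: List[float] = []
    for _ in range(iterations):
        post1 = siso1(lc_ys, lc_yp1, apriori1)
        # Extrinsic = a-posteriori minus systematic input minus a-priori.
        ext1 = [post1[k] - lc_ys[k] - apriori1[k] for k in range(n)]
        apriori2 = [ext1[i] for i in perm]      # interleave for the second decoder
        post2 = siso2(lc_ys_int, lc_yp2, apriori2)
        ext2 = [post2[k] - lc_ys_int[k] - apriori2[k] for k in range(n)]
        apriori1 = [ext2[inv[k]] for k in range(n)]  # de-interleave for the first decoder
    post = [post2[inv[k]] for k in range(n)]    # de-interleaved output of decoder 2
    hard = [1 if value >= 0 else -1 for value in post]
    return hard, post
```

Subtracting the systematic and a-priori terms before passing information on prevents a decoder from receiving its own earlier output back as if it were new evidence.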

Performance analysis of turbo codes
In this section we present simulations that show the effect of selected parameters on the performance of turbo codes. The parameters used in the simulations are shown in Table 1. The turbo encoder uses two parallel concatenated encoders. The component code is an RSC code with generator polynomials $G_0 = 37$, $G_1 = 21$ (octal) and constraint length $K = 5$. The interleaver is a pseudo-random interleaver with length $L = 2048$ bits. Unless specified otherwise, the parity bits are punctured to one half, which increases the information rate to $R = 1/2$. The decoder uses the SOVA algorithm, usually with 8 decoding iterations. The AWGN transmission channel with BPSK modulation is used in the simulations.

Fig. 2 Turbo decoder schematic
The performance of turbo codes can be influenced by many parameters. Some of these parameters are:
- The number of decoding iterations used.
- The use of puncturing in encoding.
- The generator polynomials of the codes.
- The frame lengths of input data.
Fig. 3 shows the performance of turbo codes depending on the number of decoder iterations. The uncoded BER is shown for comparison. The performance after the first iteration of the turbo decoder should theoretically be comparable with the performance of the convolutional code [5]. As the number of iterations increases, the performance of the decoder increases too. For example, the improvement between the first and second iterations is about 1.2 dB at a BER of $10^{-4}$. The performance continues to improve up to the eighth iteration. The additional code gain between the eighth and the fourteenth iteration is only 0.1 dB at a BER of $10^{-4}$. From the figure it can be concluded that increasing the number of iterations increases not only the performance of the code but also the computational complexity of decoding; it is therefore recommended to use between 4 and 14 decoder iterations. As a compromise, 8 decoder iterations are used in the following simulations.
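The uncoded reference curve for BPSK over AWGN has a closed form, BER = 0.5 erfc(sqrt(E_b/N_0)). The Python sketch below evaluates it and also shows how a noise standard deviation is commonly derived from E_b/N_0 for a code of rate R; unit symbol energy and the function names are assumptions made for the example.

```python
import math


def uncoded_bpsk_ber(ebn0_db: float) -> float:
    """Theoretical bit error rate of uncoded BPSK over AWGN."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def noise_sigma(ebn0_db: float, rate: float, es: float = 1.0) -> float:
    """Noise standard deviation per received symbol for a code of the given rate.

    With E_b = es / rate and sigma^2 = N_0 / 2, sigma^2 = es / (2 * rate * E_b/N_0).
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return math.sqrt(es / (2.0 * rate * ebn0))
```

At the same E_b/N_0, a lower code rate leaves less energy per transmitted symbol, so `noise_sigma` is larger for R = 1/3 than for R = 1/2; the coding gain has to outweigh this loss.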

B. Effect of puncturing
As already described, the turbo encoder uses two or more encoders which produce parity bits. In these simulations two RSC encoders are used. This is the most common arrangement, and it gives an information rate of 1/3. In order to achieve an information rate of 1/2, every second parity bit from each encoder must be punctured. It is also possible to use the code without puncturing and thus keep the information rate at 1/3. The performance of the unpunctured code is shown in Fig. 4. The encoders use the same parameters as in the previous simulation (Fig. 3). At a BER of $10^{-4}$, the unpunctured turbo code has a code gain that is 0.5 dB better than that of the punctured turbo code. Very similar gains may also be achieved for different …
Table 1 Parameters of turbo encoder and decoder. Channel: AWGN
Fig. 5 shows the dependence of the performance of the turbo code on the generator polynomials. The first code selected was the RSC code with generator polynomials $G_0 = 7$, $G_1 = 5$ and constraint length $K = 3$. The second code selected had $K = 4$, $G_0 = 17$, $G_1 = 15$. This code achieves higher performance than the code with constraint length $K = 3$ at a BER of $10^{-4}$. The third selected code, which was used for all other simulations, has constraint length $K = 5$ and generator polynomials $G_0 = 37$, $G_1 = 21$. Compared with the first code ($K = 3$), its performance is about 0.3 dB higher at a BER of $10^{-4}$; compared with the code with $K = 4$, its performance is higher by about 0.125 dB. With increasing constraint length and larger generator polynomials the performance of turbo codes increases, but so does the size of the trellis diagram and thus the computational complexity of decoding.
Fig. 6 shows the performance of turbo codes depending on the frame length. For many applications, such as those using real-time transmission, a large frame length is unacceptable.
Frames with a length of 256 bits are useful for voice transmission and frames of 1024 to 2048 bits for video transmission. Systems with larger frame lengths can be used for data transfer and for applications that do not require real-time transmission. The best result in the simulation was reached by the turbo code with a frame length of 65536 bits. At a BER of $10^{-4}$ it has a code gain of 0.35 dB compared with the turbo code with a frame length of 2048 bits and 0.6 dB compared with the turbo code with a frame length of 1024 bits. With growing frame length the performance of turbo codes increases, but so does the delay; shorter frames give a lower delay.
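Two quantities behind these trade-offs are easy to compute: the number of trellis states of a component code, 2^(K-1), and the time needed to collect one frame before encoding can start. The Python sketch below is illustrative only; the function names and the bit rate of 8 kbit/s mentioned afterwards are assumptions, not values from the simulations.

```python
def trellis_states(constraint_length: int) -> int:
    """Number of states in the trellis of a code with constraint length K."""
    return 2 ** (constraint_length - 1)


def frame_fill_delay_ms(frame_len_bits: int, bit_rate_bps: float) -> float:
    """Time in milliseconds needed to collect one frame at the given source bit rate."""
    return 1000.0 * frame_len_bits / bit_rate_bps
```

For K = 3, 4 and 5 the trellis has 4, 8 and 16 states, so the decoding effort per bit roughly doubles with each step. At an assumed source rate of 8 kbit/s, a 256-bit frame is filled in 32 ms, whereas a 65536-bit frame takes more than 8 s, which illustrates why long frames suit only applications without real-time requirements.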

Conclusion
This article deals with turbo codes. It describes the basic structure of a turbo encoder using two identical RSC codes and of a turbo decoder based on the Viterbi algorithm. It also presents the basic equations of the SOVA decoding algorithm and describes iterative decoding. Simulations were performed for different parameters of turbo codes. Based on these simulations, it is possible to conclude that the performance of turbo codes decreases when puncturing is used. In contrast, the performance of turbo codes increases with an increasing number of decoding iterations, with an appropriate choice of the code (generator polynomials) and with increasing frame length. With a suitable choice of these parameters it is possible to implement a high-performance codec.