Posts Tagged “vhdl”

So for various reasons, it turns out that I don’t really need Reed-Solomon codes for what I’ll be doing, but I still have more than a passing interest in the decoding side of things, if only for hobby reasons.

You can make the case that the fundamental building block for a Reed-Solomon decoder is the finite field multiplier. Addition in GF(2m) is easy — just xor — but multiplication is another story entirely. The multiplier itself is dependent upon the field primitive polynomial, so that’s why you don’t see embedded FFM blocks on an FPGA, or even a general purpose processor for that matter.

That may change in the future when Intel introduces AVX with its cryptography accelerating instructions, but even then, from what little I can understand, it doesn’t sound like a single instruction does FFM on a register; you may still have to do the modulo reduction yourself.

Anyway, I’m here today to talk about hardware FFM’s, specifically on 8-bit operands. Google has a tough time finding information on how to design a FFM, although I suppose it all depends on how you ask it. You can find some implementations, like on OpenCores, but I’m not that enthusiastic about having to register for a once totally open site (you can try your luck with BugMeNot though), plus implementations rarely ever convey how they were created.

I finally found a fairly friendly method, in a paper by A. Reyhani-Masoleh and M. A. Hasan. See Low Complexity Bit Parallel Polynomial Basis Multiplication over GF(2m).

Note that “FFM”, “finite field”, or “Galois” are nowhere to be found in the title. I suppose I am unfamiliar with the nomenclature, which is what made searching for information more difficult.

I think it should also be noted that IEEE papers are generally not freely available, so if it weren’t for Dr. Reyhani-Masoleh’s generosity, you’d have to access the paper through your own, or university’s, subscription to IEEE Xplore.

For the ASIC layout folks, the paper details a very regular structure for the multiplier, so bonus.

From a design standpoint, there are three variables. Two of them, the d and e vectors, depend only on the bit-width so they are constant regardless of the primitive polynomial. How they are combined is determined by the transpose of the Q matrix, and to get at Q you need to solve equation (5).

The elements of Q are constrained to be either 1 or 0, so multiplication is really an on/off operation, and addition is xor. At least for small values of m (the m in GF(2m)) you can brute force the solution. Also, when working out d and e from the Toeplitz matrices, multiplication is the and operation.

All vectors are indexed starting from 0, i.e. e = [e0, e1, ... , em-2]. Bit-level multiplication is accomplished with an and gate.

The algorithm lends itself well to code generation, so I wrote my own Python functions to generate and verify the basic bit-level equations. With a bit of find/replace, it’s easy to just port those over to something like VHDL. Here’s a link for a 0x1D multiplier in GF(256), i.e. P(x) = x8 + x4 + x3 + x2 + 1. I think it should synthesize, but without running it through Quartus or ISE, I can’t say for sure. Until I get my hands on an FPGA development board, I’m not really motivated to install either of those packages.

In Part 2, I’ll post my Python code, which I’m more confident in. For now, enjoy the multiplier, whose polynomial is used in ATSC, WiMAX, and DVB Reed-Solomon codes.

Tags: , , , , ,

Comments 2 Comments »