Many cryptographic schemes rely on modular arithmetic. Modular arithmetic is a system of arithmetic operators on integers where the numbers “wrap around” back to zero when they reach a value called the modulus. For example, if our modulus is 17, then (12 + 14) mod 17 = 9. When we implement modular arithmetic, a key problem is to reduce a value that is larger than the modulus, q, down to a number in the range 0 to q-1. This can be achieved by dividing the large number by q, and finding the remainder. However, hardware implementation of division usually requires a large (and often slow) division circuit. This is a problem in hardware implementations of cryptography algorithms, where the area of the hardware circuit is important, and we may need several modular reduction units. More efficient solutions are possible, particularly where the modulus in a given cryptography algorithm is a constant. For example, in the Kyber post-quantum cryptography scheme, the modulus is 3329.
The goal of this project is to build a generator program, that generates different versions of hardware circuits to perform modular reduction for a given modulus. There are a number of existing schemes for computing the modular reduction for a given modulus. These include Barret reduction, LUT-reduction (on FPGAs), K-reduction, and other schemes. Your program will take as input (1) the modulus and (2) the set of values that need to be reduced. The set of values will often be a range, such as all 20-bit unsigned integers. Your program should be capable of generating a variety of different hardware reduction units for those inputs in a hardware description language such as VHDL, Verilog, or another language. These hardware units should then be evaluated by synthesising the generated units using hardware tools such as Vivado from Xilinx/AMD.