Sorting Networks with Vector Instructions [Available]

A sorting network is a notional model of a hardware circuit that can sort a fixed number of inputs. Sorting networks are made up of hardware comparator units connected together by a fixed network of wires so that the outputs of one comparator can become an input to another. Sorting networks can be implemented in hardware or software. Sorting networks can be a particularly useful abstraction for modelling sorting routines that are implemented by vector instructions. The deterministic sequence of comparisons and conditional swaps used in sorting networks typically map well to branchless sequences of vector instructions.

A common strategy for creating sorting routines with vector instructions is to first design the sorting network, and then to implement the sorting network using vector instructions. This approach works pretty well, but it could be better. The problem is that vector instructions operate on L adjacent lanes (for typical values of L such as 4, 8 or 16). Ideally, the sequence of comparisons in the sorting network can be mapped directly to these vector instructions. However, in many cases the sequence of comparisons expressed by the sorting network does not involve simple sequences of elements to be compared. To implement these parts of the sorting network, additional swizzle operations are needed, to shuffle, permute and blend the vectors to align the values correctly in the vectors.

The goal of this project is to develop software that can automatically generate sorting networks that are optimized for vector instructions. There are existing sorting network generator programs that search for sorting networks that use fewer comparators, or fewer levels of parallel comparators. Examples include SorterHunter (https://github.com/bertdobbelaere/SorterHunter). The goal of this project is to generate sorting networks that map efficiently to vector instructions.