# R – Fast Euclidean division in C

bit-manipulationc++micro-optimization

I am interested in getting the remainder of the Euclidean division, that is, for a pair of integers (i, n), find r such as:

``````i = k * n + r, 0 <= r < |k|
``````

the simple solution is:

``````int euc(int i, int n)
{
int r;

r = i % n;
if ( r < 0) {
r += n;
}
return r;
}
``````

But since I need to execute this tens of million of times (it is used inside an iterator for multidimensional arrays), I would like to avoid the branching if possible. Requirements:

• Branching but faster is also desirable.
• A solution which works only for positive n is acceptable (but it has to work for negative i).
• n is not known in advance, and can be any value > 0 and < MAX_INT

# Edit

It is actually quite easy to get the result wrong, so here is an example of the expected results:

• euc(0, 3) = 0
• euc(1, 3) = 1
• euc(2, 3) = 2
• euc(3, 3) = 0
• euc(-1, 3) = 2
• euc(-2, 3) = 1
• euc(-3, 3) = 0

Some people also worry that it does not make sense to optimize this. I need this for an multi-dimensional iterator where out of bounds items are replaced by items in a 'virtual array' which repeats the original array. So if my array x is [1, 2, 3, 4], the virtual array is […., 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4], and for example, x[-2] is x1, etc…

For a nd array of dimension d, I need d Euclidean division for every point. If I need to do a correlation between a n^d array with a m^d kernel, I need n^d * m^d * d euclidean divisions. For a 3d image of 100x100x100 points and a kernel of 5*5*5 points, that's already ~ 400 million Euclidean divisions.

#### Best Solution

Edit: No multiplication or branches woot.

``````int euc(int i, int n)
{
int r;

r = i % n;
r += n & (-(r < 0));

return r;
}
``````

Here is the generated code. According to the MSVC++ instrumenting profiler (my testing) and the OP's testing, they perform nearly the same.

``````; Original post
00401000  cdq
00401001  idiv        eax,ecx
00401003  mov         eax,edx
00401005  test        eax,eax
00401007  jge         euc+0Bh (40100Bh)
0040100B  ret

; Mine
00401020  cdq
00401021  idiv        eax,ecx
00401023  xor         eax,eax
00401025  test        edx,edx
00401027  setl        al
0040102A  neg         eax
0040102C  and         eax,ecx