Okay, so you want to calculate a^b mod m
. First we'll take a naive approach and then see how we can refine it.
First, reduce a mod m
. That means, find a number a1
so that 0 <= a1 < m
and a = a1 mod m
. Then repeatedly in a loop multiply by a1
and reduce again mod m
. Thus, in pseudocode:
a1 = a reduced mod m
p = 1
for(int i = 1; i <= b; i++) {
p *= a1
p = p reduced mod m
}
By doing this, we avoid numbers larger than m^2
. This is the key. The reason we avoid numbers larger than m^2
is because at every step 0 <= p < m
and 0 <= a1 < m
.
As an example, let's compute 5^55 mod 221
. First, 5
is already reduced mod 221
.
1 * 5 = 5 mod 221
5 * 5 = 25 mod 221
25 * 5 = 125 mod 221
125 * 5 = 183 mod 221
183 * 5 = 31 mod 221
31 * 5 = 155 mod 221
155 * 5 = 112 mod 221
112 * 5 = 118 mod 221
118 * 5 = 148 mod 221
148 * 5 = 77 mod 221
77 * 5 = 164 mod 221
164 * 5 = 157 mod 221
157 * 5 = 122 mod 221
122 * 5 = 168 mod 221
168 * 5 = 177 mod 221
177 * 5 = 1 mod 221
1 * 5 = 5 mod 221
5 * 5 = 25 mod 221
25 * 5 = 125 mod 221
125 * 5 = 183 mod 221
183 * 5 = 31 mod 221
31 * 5 = 155 mod 221
155 * 5 = 112 mod 221
112 * 5 = 118 mod 221
118 * 5 = 148 mod 221
148 * 5 = 77 mod 221
77 * 5 = 164 mod 221
164 * 5 = 157 mod 221
157 * 5 = 122 mod 221
122 * 5 = 168 mod 221
168 * 5 = 177 mod 221
177 * 5 = 1 mod 221
1 * 5 = 5 mod 221
5 * 5 = 25 mod 221
25 * 5 = 125 mod 221
125 * 5 = 183 mod 221
183 * 5 = 31 mod 221
31 * 5 = 155 mod 221
155 * 5 = 112 mod 221
112 * 5 = 118 mod 221
118 * 5 = 148 mod 221
148 * 5 = 77 mod 221
77 * 5 = 164 mod 221
164 * 5 = 157 mod 221
157 * 5 = 122 mod 221
122 * 5 = 168 mod 221
168 * 5 = 177 mod 221
177 * 5 = 1 mod 221
1 * 5 = 5 mod 221
5 * 5 = 25 mod 221
25 * 5 = 125 mod 221
125 * 5 = 183 mod 221
183 * 5 = 31 mod 221
31 * 5 = 155 mod 221
155 * 5 = 112 mod 221
Therefore, 5^55 = 112 mod 221
.
Now, we can improve this by using exponentiation by squaring; this is the famous trick wherein we reduce exponentiation to requiring only log b
multiplications instead of b
. Note that with the algorithm that I described above, the exponentiation by squaring improvement, you end up with the right-to-left binary method.
a1 = a reduced mod m
p = 1
while (b > 0) {
if (b is odd) {
p *= a1
p = p reduced mod m
}
b /= 2
a1 = (a1 * a1) reduced mod m
}
Thus, since 55 = 110111 in binary
1 * (5^1 mod 221) = 5 mod 221
5 * (5^2 mod 221) = 125 mod 221
125 * (5^4 mod 221) = 112 mod 221
112 * (5^16 mod 221) = 112 mod 221
112 * (5^32 mod 221) = 112 mod 221
Therefore the answer is 5^55 = 112 mod 221
. The reason this works is because
55 = 1 + 2 + 4 + 16 + 32
so that
5^55 = 5^(1 + 2 + 4 + 16 + 32) mod 221
= 5^1 * 5^2 * 5^4 * 5^16 * 5^32 mod 221
= 5 * 25 * 183 * 1 * 1 mod 221
= 22875 mod 221
= 112 mod 221
In the step where we calculate 5^1 mod 221
, 5^2 mod 221
, etc. we note that 5^(2^k)
= 5^(2^(k-1)) * 5^(2^(k-1))
because 2^k = 2^(k-1) + 2^(k-1)
so that we can first compute 5^1
and reduce mod 221
, then square this and reduce mod 221
to obtain 5^2 mod 221
, etc.
The above algorithm formalizes this idea.
Best Solution
The modulus/modulo operation is usually understood as the integer equivalent of the remainder operation - a side effect or counterpart to division.
Except for some degenerate cases (where the divisor is a power of the operating base - i.e. a power of 2 for most number formats) this is just as expensive as integer division!
So the question is really, why is integer division so expensive?
I don't have the time or expertise to analyze this mathematically, so I'm going to appeal to grade school maths:
Consider the number of lines of working out in the notebook (not including the inputs) required for:
So in simple terms, this should give you a feel for why division and hence modulo is slower: computers still have to do long division in the same stepwise fashion tha you did in grade school.
If this makes no sense to you; you may have been brought up on school math a little more modern than mine (30+ years ago).
The Order/Big O notation used above as O(something) expresses the complexity of a computation in terms of the size of its inputs, and expresses a fact about its execution time. http://en.m.wikipedia.org/wiki/Big_O_notation
O(1) executes in constant (but possibly large) time. O(N) takes as much time as the size of its data-so if the data is 32 bits it takes 32 times the O(1) time of the step to calculate one of its N steps, and O(N^2) takes N times N (N squared) the time of its N steps (or possibly N times MN for some constant M). Etc.
In the above working I have used O(N) rather than O(N^2) for addition since the 32 or 64 bits of the first input are calculated in parallel by the CPU. In a hypothetical 1 bit machine a 32 bit addition operation would be O(32^2) and change. The same order reduction applies to the other operations too.