06-25-2018, 04:31 PM
Finally a solution for Java that might be accepted:
Map Reduce
Code:
import java.math.BigInteger;
import java.util.stream.IntStream;

public class SumModPow {

    private static final BigInteger M = BigInteger.valueOf(10).pow(10);

    public static void main(String[] args) {
        System.out.println(
            IntStream.rangeClosed(1, 1000)
                .mapToObj(n -> {
                    BigInteger m = BigInteger.valueOf(n);
                    return m.modPow(m, M);
                })
                .reduce(BigInteger.ZERO, BigInteger::add)
                .mod(M)
        );
    }
}
Parallel Streams
With this small change we can run the calculation in parallel:
Code:
import java.math.BigInteger;
import java.util.stream.IntStream;

public class SumModPow {

    private static final BigInteger M = BigInteger.valueOf(10).pow(10);

    public static void main(String[] args) {
        System.out.println(
            IntStream.rangeClosed(1, 1000)
                .parallel() // add this line
                .mapToObj(n -> {
                    BigInteger m = BigInteger.valueOf(n);
                    return m.modPow(m, M);
                })
                .reduce(BigInteger.ZERO, BigInteger::add)
                .mod(M)
        );
    }
}
This would not be possible using a classic for-loop.
Will it be faster on a MacBook with 4 cores?
Guess what: it makes it even a bit slower.
These are the results for the real time of 20 runs:
Sequential
Code:
0.2100 #
0.2200 ####
0.2300 #####
0.2400 #######
0.2500 #
0.2600 ##
---------------+---------+
0 10
min = 0.2130
max = 0.2690
mean = 0.2388
sigma = 0.0130
mode = 0.2400
median = 0.2385
Parallel
Code:
0.2300 #
0.2400 #########
0.2500 ########
0.2600 ##
---------------+---------+
0 10
min = 0.2360
max = 0.2620
mean = 0.2498
sigma = 0.0074
mode = 0.2400
median = 0.2490
We have to increase the upper limit to 10000000 to notice the difference:
Sequential
Code:
real 0m35.134s
user 0m36.712s
sys 0m0.298s
Parallel
Code:
real 0m17.546s
user 0m56.692s
sys 0m0.420s
During this run, this program used up to 350% of the CPU:
Code:
PID USER PRI NI VIRT RES S CPU% MEM% TIME+ Command
32002 tom 0 0 9.6G 253M ? 341. 1.5 0:44.54 /usr/bin/java -cp parallel-10000000 SumModPow
That's why the real time is smaller than the user time.
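The ratio of user time to real time in the large run gives a rough measure of the parallelism actually achieved, and it is consistent with the ~350% CPU usage reported above:

```python
# user time is CPU time summed over all cores; real time is wall-clock time.
# Their ratio approximates the effective number of cores kept busy.
user = 56.692  # seconds, from the parallel run above
real = 17.546  # seconds, from the parallel run above
print(f"{user / real:.2f}")  # roughly 3.2 cores' worth of work
```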
Premature Optimization
Quote:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
-- Donald Knuth
My Python program above runs in about 0.045s.
It's not[1] optimized, but doing so wouldn't gain that much.
It took me much longer to write the Java program.
And today the biggest cost is our time: not only the time you spend implementing a piece of code, but also the time you or others must spend debugging and maintaining it.
Contrary to what others might think, I consider this a reasonable interview question, because it can lead the discussion in different directions.
[1]: OK, it's a bit optimized in that, instead of a list comprehension (using []), the corresponding generator expression (using ()) is used. Thus the whole list doesn't have to be kept in memory; instead, summation and producing the arguments can be interleaved. But even this doesn't change much.
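The Python program itself isn't shown in this post, but based on the footnote it presumably looks something like this sketch (the variable names are my own):

```python
# last ten digits of 1^1 + 2^2 + ... + 1000^1000
M = 10 ** 10

# generator expression (round parentheses): terms are produced lazily and
# consumed by sum() one at a time, so the full list of 1000 big integers
# never has to be held in memory at once; pow(n, n, M) does the modular
# exponentiation without ever building the huge intermediate n**n
print(sum(pow(n, n, M) for n in range(1, 1001)) % M)
```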