Evaluation of Fibonacci-Number Benchmark
Use of a parallel doubly recursive Fibonacci-number benchmark is
said to be a good way to compare the thread-creation and
synchronization overheads of different parallel work-stealing
schedulers.
So what is the problem, anyway?
- Disposition of generally useful batching optimizations.
If you are going to incur the overhead required to steal a
work item, why not steal a sufficiently large batch to make the
theft worthwhile?
But this is a two-edged sword.
On the one hand, if such batching is permitted, the
benchmark cannot be said to be fairly evaluating
task-creation and synchronization overhead.
On the other hand, if such batching is prohibited, the benchmark
is suppressing a critically important optimization.
- Handling of synchronization-elision optimizations.
It does not take too much imagination to conceive of a
scheduler that optimizes away task-creation operations,
substituting much-cheaper function-call operations.
This is also a two-edged sword: aggressive application of
this optimization yields hundred-million-fold speedups,
but totally eliminates the synchronization operations that
are ostensibly being measured.
- Obscuring generally useful higher-level optimizations.
If there really was some real-world reason to parallelize
Fibonacci-number generation, such parallelization should
start with an
efficient algorithm, whose computational complexity is
O(log(n)).
For large n, the focus should then be parallelizing the
large-integer multiplies — but this generally useful
optimization is completely obscured by this benchmark.
- Any developers who foolishly attempt to learn parallel programming
from the code implementing this benchmark would be sadly misled.
The road to high performance and excellent scalability is through
increasing the granularity of parallelism, not by
minimizing it.
Let's evaluate this benchmark in each of the roles benchmarks play:
- Providing a fair framework for comparing competing implementations.
Given the potential for “gaming” this benchmark
via any of the optimizations called out above, the
Fibonacci-number benchmark cannot be said to play this role.
- Focusing competitive energy on improving implementations in ways
that matter to users.
Given the number of useful optimizations that this benchmark
suppresses, it cannot be said to play this role particularly well.
- Serving as example uses of the implementations being benchmarked.
This benchmark can at best be held out as an example use to
avoid like the plague.
In short, my advice to you is to ignore any comparisons based on the
doubly recursive Fibonacci-number benchmark, and instead run your
own benchmark using code sequences that are important to your application.
Or, better yet, implement and popularize a set of good benchmarks for
these schedulers.