Comparison with other libraries / Credit
Many efforts have already been made in this area. The point here is to summarize their properties.
"STAPL: Standard Template Adaptive Parallel Library"
- Requires specific containers.
- Is able to use several multi-threading libraries, although one cannot add one's own.
PSTL
Does not seem to be supported anymore.
"Range Partition Adaptors : A Mechanism for Parallelizing STL" of Matthew H. Austern, Ross A. Towle and Alexander A. Stepanov
http://www.stepanovpapers.com
We go in the same direction, with some differences:
- The type 'subrange' is not explicitly shown.
- We do not use pragmas.
- We aim at parallelizing not only random access iterators but all types of iterators. This may sound ambitious, but we have no choice, because most iterators in real-life programs are forward iterators.
- We also aim at parallelizing general-purpose algorithms such as 'find', whatever the container is, because they are very commonly used for their convenience and flexibility.
OPENMP
One of the possible approaches to C++ parallelization is based on
the use of preprocessor pragmas (OpenMP). It has the advantage
that the source program can still be compiled and tested in sequential
mode without any code change. It has, however, some severe drawbacks:
- A specific compiler or preprocessor must be developed for each
target machine.
RPA uses any C++ compiler.
- No debugger is able to show what happens 'behind' the pragmas:
a debugger is of no help because the generated code has nothing to do
with the C++ code written by the programmer.
RPA is pure C++; no preprocessor is needed.
RPA is totally open: the whole code can be instrumented with debug statements.
- The developer has absolutely no control over the system
primitives which are used: threads, mutexes, etc. cannot be chosen.
This is why it is preferable to have programs which do exactly what
they are meant to do, with any C++ compiler. For the moment, parallelized
containers must be vectors, which makes the application dependent on
another architecture.
On the other hand, RPA lets you control any function or system call, and any class or structure used.
You can know what your threads are doing at any moment. You can have your own thread pool, or none.
- OpenMP is able to parallelize the code automatically.
This is true if you are manipulating arrays. With std::map or std::list, OpenMP may not help:
some parallelization may be possible, but it is very complicated.
OpenMP sets up a specific semantic level for expressing parallelism.
First having a bug-free sequential program, and later parallelizing it with pragmas, is a brilliant idea.
On the other hand, it may not suit intrinsically parallel problems, for example those involving pipelines.
RPA tries to provide C++ structures which are intrinsically tailored for parallelism.
MPTL: http://spc.unige.ch/mptl
- The algorithm 'remove' will never be implemented (RPA does not implement rpa::remove yet, but it is designed and planned).
- The only thread model is POSIX pthreads: there is no way to use another thread model.
- There is no interleaved scheduling.
- Output iterators are not usable. (On the other hand, not only are output
iterators usable with RPA, but input iterators will be too, because the
design is already there.)