Doc. No.: | WG14/N 1805 |
---|---|
Date: | 2014-03-17 |
Reply to: | Hans-J. Boehm |
Email: | [email protected] |
This is a quick attempt to summarize discussions in WG21/SG1 (the C++ concurrency study group) that may be of interest to WG14:
WG21/SG1 and WG14/CPLEX are both discussing support for task parallelism and vector/SIMD parallelism. The two groups, for excellent reasons, seem to have significantly different perspectives on the issues. CPLEX's emphasis has been on standardizing something close to existing OpenMP and Cilk practice. All other things being equal, SG1 would clearly prefer to exploit C++'s existing abstraction mechanisms and provide these facilities through a (mostly?) standard library interface.
Although there may be a good justification for diverging APIs, there is also a huge cost to independently designing completely incompatible C and C++ interfaces, particularly if they do not interoperate and cannot share the same underlying run-time facilities. There appeared to be a general consensus that we need to be careful to avoid that outcome. It is unclear whether overlapping CPLEX and SG1 membership is enough to do so.
The C++ treatment of atomics used in signal handlers has long been recognized as broken. We finally voted in a fix (N3910). I believe the C wording situation is significantly different, but I'm reasonably sure it does not correctly address all the issues addressed in N3910. I believe N3910 reflects what WG21 intended to do in C++11.
We originally set out to fix some egregious wording problems introduced into C++11 when we added support for atomics in signal handlers. But the discussion raised numerous additional issues that nobody had thought through. These included questions about whether races involving signal handlers should be treated as thread data races or as unsequenced expression evaluations, and whether it matters. (It matters, and it should be the former, at least for C++.) It also raised profound questions about the role of "volatile sig_atomic_t".
The new C++ wording allows atomics to be used to communicate between a thread and a signal handler in a different thread, but this does not apply to "volatile sig_atomic_t". We concluded that "volatile sig_atomic_t" does not behave like an atomic variable, since it usually does not enforce visibility ordering with respect to other threads. Thus this seemed to be the safest solution that is backwards compatible with the pre-2011 state. There was idle speculation about eventually deprecating "volatile sig_atomic_t". My personal feeling is that this is probably the right long-term plan, but we have not officially considered it yet. Atomics seem like a much better alternative, now that we have them. Deprecation may come up for C++17.
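As a rough illustration of the atomics-based alternative, the following C++11 sketch has a signal handler update a lock-free atomic counter that ordinary code later reads. The names `g_signal_count`, `on_signal`, and `raise_and_count` are hypothetical, and for simplicity the signal is raised in the calling thread rather than delivered to a handler running in a different thread. Nothing here is taken from the N3910 wording itself.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Assumption for this sketch: atomic<int> is always lock-free on the target,
// so its operations are usable inside a signal handler.
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "this sketch assumes std::atomic<int> is always lock-free");

// Hypothetical counter shared between the handler and ordinary code.
std::atomic<int> g_signal_count{0};

// The handler performs only a lock-free atomic operation.
extern "C" void on_signal(int /*signum*/) {
    g_signal_count.fetch_add(1, std::memory_order_relaxed);
}

// Raises SIGINT n times in the calling thread and returns how many times the
// handler ran, or -1 if the handler could not be installed.
int raise_and_count(int n) {
    assert(n >= 0);
    g_signal_count.store(0, std::memory_order_relaxed);
    void (*previous)(int) = std::signal(SIGINT, on_signal);
    if (previous == SIG_ERR) return -1;
    for (int i = 0; i < n; ++i) {
        // Re-install each time: some implementations reset the disposition
        // to SIG_DFL when a handler is entered.
        std::signal(SIGINT, on_signal);
        std::raise(SIGINT);  // returns only after the handler has run
    }
    std::signal(SIGINT, previous);  // restore the earlier disposition
    return g_signal_count.load(std::memory_order_relaxed);
}
```

Calling `raise_and_count(3)` should yield 3 on such a platform. Relaxed ordering suffices here because the handler runs in the same thread; passing other data from a handler to a different thread would presumably need release/acquire ordering on the atomic.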
We added a (suitably weasel-worded) definition of what "lock-free" means (N3927). You may want something similar.
SG1 decided to replace the "visible sequence of side-effects" wording in the memory model. This is mostly a simplification. See N3914, issue 1466. Again, you may want to do something similar. (This was based on an observation by Mark Batty from a few years ago.)
I raised the question from N3710 about adding support for non-racing accesses to atomics. That seemed to generate SG1 interest. One of my goals is to reduce reliance on memory_order_relaxed, which is exceedingly difficult to specify well. It would also address the fact that memory_order_relaxed is actually not quite as cheap to implement on some architectures as we had hoped. (This might become even more true if we specified it precisely but more conservatively, e.g. as suggested in N3710.) I don't know whether WG14 might eventually be interested in changes like this. They would still take a while to materialize.
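To make the motivation concrete, here is a minimal C++11 sketch of the kind of code that currently leans on memory_order_relaxed. The function name `count_events` and its parameters are hypothetical. The final read happens after every worker has been joined, so it cannot race with anything, yet it still has to be written as an atomic load; this is one plausible reading of the "non-racing access" case, and it does not show any interface proposed in N3710.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads workers each increment a shared counter
// per_thread times; the total is returned once all workers have finished.
long count_events(int num_threads, int per_thread) {
    assert(num_threads >= 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(num_threads));
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Concurrent increments: atomicity is needed, ordering is not.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    // Every worker has been joined, so no other thread touches the counter.
    // The access is non-racing, but today it is still spelled as an atomic
    // load.
    return counter.load(std::memory_order_relaxed);
}
```

For example, `count_events(4, 1000)` returns 4000, since no increment is lost.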
SG1 is developing additional technical specifications, one for "concurrency", and one for "parallelism". So far both focus on additional C++-specific libraries.
The "concurrency" TS is expected to include at least extensions to C++11/14 "future"s, and support for "executors", a framework for specifying where/how something is to be executed. Additional library facilities (e.g. barrier synchronization, efficient counters, concurrent queues and hash tables) are also under discussion and may find their way into this TS.
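For readers less familiar with the C++ side, the sketch below shows only the existing C++11 baseline that such a TS would build on, not any TS interface. The function name `sum_in_background` is hypothetical. The launch policy passed to `std::async` is the main standard way in C++11 to influence where or how the work runs, which is presumably the gap an executor framework aims to fill.

```cpp
#include <cassert>
#include <future>

// Hypothetical helper: computes a + b on another thread using only C++11
// facilities, then blocks until the result is available.
int sum_in_background(int a, int b) {
    // std::launch::async requests execution as if on a new thread.
    std::future<int> result =
        std::async(std::launch::async, [a, b] { return a + b; });
    // get() blocks the caller until the asynchronous task has finished.
    return result.get();
}
```

For example, `sum_in_background(2, 3)` returns 5.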
The "parallel" TS is expected to include at least a parallel/SIMD generic algorithms library. It may also include more CPLEX-like facilities, but so far there is very limited consensus in that area.