Comparing Atomics in C Proposal N1473 and C++ FCD N3092

ISO/IEC JTC1 SC22 WG14 N1489 - 2010-05-29

Lawrence Crowl

Introduction
Operators
    Loads and Stores
    Compound Assignment
    Ternary Compare-Exchange Operator
    Lost Non-Volatile Optimization
Type Names
    Over-Constrained Size and Alignment
    Prohibiting Internal Locks
    Qualification Inconsistency
    Nested Atomicity
    Recommendation
Summary

Introduction

The C++ standards committee, with liaison support from the C committee, developed a facility for atomic types and operations with the goal of making it possible to write code using atomics for inclusion and compilation in both C and C++. That is, there should be a common subset of the atomics facility that means the same in both languages. The resulting facility is embodied in the C++ Final Committee Draft N3092 and in the C Working Draft N1425. Code using the common subset has been successfully compiled and executed in both languages.

This common facility requires that all atomic operations be written as function calls. In contrast, programmers writing in pure C++ can use the natural operators. Furthermore, the common subset defines a fixed set of types; it is not extensible. Pure C++ programs can obtain new atomic types using the atomic class template. To enable C programmers to have extensible types and use operators, Blaine Garst has proposed a new facility for C, with the latest description in N1473.
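As a rough illustration of the two styles, the sketch below performs the same increment once with function calls, as the common subset requires, and once with C++ operators. The function names bump_with_functions and bump_with_operators are hypothetical and chosen only for this example.

```cpp
#include <atomic>

// Common-subset style: every atomic operation is a function call.
int bump_with_functions(std::atomic_int& counter) {
    std::atomic_fetch_add(&counter, 1);   // returns the prior value, ignored here
    return std::atomic_load(&counter);    // explicit atomic load
}

// Pure C++ style: the same operations written with operators.
int bump_with_operators(std::atomic_int& counter) {
    counter += 1;                         // atomic read-modify-write
    return counter;                       // implicit atomic load
}
```

Both functions leave the counter one higher than they found it; only the first form is expected to compile unchanged as C once the std:: qualifications are supplied by the shared header.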

This paper compares the full C++ facility to the proposed C facility, and makes some recommendations that improve the C proposal.

Operators

Both facilities provide for sequentially consistent operations when using the operator syntax.

Loads and Stores

Both facilities provide implicit lvalue-to-rvalue conversion for atomic loads and assignment operator syntax for atomic stores.
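On the C++ side, the operator forms can be read as shorthand for sequentially consistent member calls. The following sketch pairs the two spellings; the function names are hypothetical.

```cpp
#include <atomic>

// Store through assignment syntax, then load through implicit conversion.
int store_then_load_with_operators(std::atomic<int>& a, int value) {
    a = value;      // atomic store, sequentially consistent
    int seen = a;   // atomic load, sequentially consistent
    return seen;
}

// The same two operations spelled out as member calls.
int store_then_load_with_calls(std::atomic<int>& a, int value) {
    a.store(value, std::memory_order_seq_cst);
    return a.load(std::memory_order_seq_cst);
}
```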

Most modern processors cannot perform a read of one location and a write to another location as a single atomic action. As a consequence, an assignment from one atomic to another must be implemented as two separate atomic actions. C++ prohibits assignment from one atomic to another to avoid the illusion of full atomicity when it is not present.


atomic_int a = { 1 }, b = { 2 };
a = 3; // okay, a single atomic write
a = b; // error, an implicit two-location atomic
a = (int)b; // okay, clearly a two-step process

Recommendation: Make assignment from one atomic to another ill-formed.
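For comparison, here is a minimal sketch of how the C++ rule shows up in practice: the copy assignment is deleted, so the two steps have to be written out. The helper name copy_value is hypothetical.

```cpp
#include <atomic>
#include <type_traits>

// In C++, copy assignment between atomics is deleted, so `a = b` does not compile.
static_assert(!std::is_copy_assignable<std::atomic<int>>::value,
              "atomic-to-atomic assignment is ill-formed");

// The copy is written as one atomic load followed by one atomic store.
// Another thread may modify src or dst between the two steps.
int copy_value(std::atomic<int>& dst, const std::atomic<int>& src) {
    int snapshot = src.load();   // step 1: atomic read of src
    dst.store(snapshot);         // step 2: atomic write of dst
    return snapshot;
}
```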

Compound Assignment

Both facilities provide atomic compound assignment, unlike Java.

Both facilities provide conventional return values for the compound assignment operators, i.e. the result of the operation and not the prior values as with the fetch_op functions.
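The difference in return values can be seen in a small C++ sketch. The function name compare_return_values is hypothetical; fetch_add and operator+= are the standard members.

```cpp
#include <atomic>
#include <utility>

// Returns {value returned by fetch_add, value returned by +=},
// each applied to its own atomic initialized to `start`.
std::pair<int, int> compare_return_values(int start, int delta) {
    std::atomic<int> x(start);
    std::atomic<int> y(start);
    int prior  = x.fetch_add(delta);   // the value before the addition
    int result = (y += delta);         // the value after the addition
    return std::make_pair(prior, result);
}
```

Starting from 10 and adding 5, the first member of the pair is 10 and the second is 15.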

The C++ facility provides only the compound assignment operators '+=', '-=', '&=', '|=', and '^=', and only for the integer and pointer types. The C facility provides the same set of compound operators as the underlying types. C++ programmers can add the additional operations in user code by defining a non-member operator function.


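#include <atomic>
using namespace std;

// User-defined atomic '+=' for float, built on a compare-exchange loop.
float operator+=( atomic<float>& object, float value ) {
    float expected = object.load(memory_order_relaxed);
    float desired;
    // On failure, compare_exchange_weak refreshes 'expected', so the sum is recomputed.
    do desired = expected + value;
    while ( !object.compare_exchange_weak(expected, desired) );
    return desired; // the new value, as for a conventional '+='
}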

The compound operators that are in the C proposal but not in the C++ proposal are under-specified. C++ carefully defined '+=' to use two's complement arithmetic to prevent any undefined behavior, redundant representations, or trap values in the operation. Without additional specification, a floating-point '+=' would suffer from these problems; signaling NaNs are a particular concern. These problems are significant because programmers have no opportunity to check whether or not an upcoming operation will stray into undefined behavior.

Recommendation: Either explicitly remove the extended operations or explicitly specify them to have semantics "as if" the above code.
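To show what "as if" semantics might look like for one of the extended operators, the sketch below applies the same compare-exchange pattern to '*=' on an unsigned integer. The helper atomic_multiply is hypothetical, and unsigned was chosen deliberately: unsigned arithmetic wraps, so the multiplication itself cannot introduce undefined behavior.

```cpp
#include <atomic>

// Hypothetical helper: an atomic '*=' for unsigned int, which std::atomic
// does not provide. Unsigned multiplication wraps modulo 2^N.
unsigned atomic_multiply(std::atomic<unsigned>& object, unsigned value) {
    unsigned expected = object.load(std::memory_order_relaxed);
    unsigned desired;
    do {
        desired = expected * value;
        // On failure, compare_exchange_weak reloads `expected` with the
        // value currently stored, and the product is recomputed.
    } while (!object.compare_exchange_weak(expected, desired));
    return desired;   // conventional compound-assignment result: the new value
}
```

A signed or floating-point version would need the additional specification discussed above before its behavior in edge cases could be relied upon.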

Ternary Compare-Exchange Operator

The C proposal provides a new ternary compare-exchange operator. The closest equivalents in C++ are the compare_exchange_* member functions.


bool done = object ?= expected : desired;
bool done = object.compare_exchange_weak(expected, desired);

The return value is a boolean indicating whether or not the assignment occurred. This behavior is consistent with the existing C/C++ compare-exchange functions.
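The behavior of the existing C++ function on a single attempt can be sketched as follows. The strong form is used here so that the outcome does not depend on spurious failure; the struct ExchangeResult and the function try_exchange are hypothetical names for this example.

```cpp
#include <atomic>

struct ExchangeResult {
    bool done;      // whether the store happened
    int expected;   // the `expected` variable after the call
    int stored;     // the object's value after the call
};

// Performs one compare-exchange on a fresh atomic holding `initial`.
ExchangeResult try_exchange(int initial, int expected, int desired) {
    std::atomic<int> object(initial);
    bool done = object.compare_exchange_strong(expected, desired);
    return ExchangeResult{done, expected, object.load()};
}
```

When the comparison succeeds the object takes the desired value; when it fails the object is unchanged and the C++ function stores the value it found into expected.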

The 'expected' argument is an r-value. However, the corresponding argument in the C/C++ compare-exchange functions is a pointer or a reference. The reason is two-fold. First, processors tend to pick one of three compare-exchange signatures: return only the boolean, return only the value found, or both. The C/C++ compare-exchange functions map efficiently to all three signatures. Second, the functions write back through the 'expected' argument when the comparison fails, which makes 'expected' ready for recomputation. The resulting loop is both simpler to write and more efficient.

Recommendation: Change the second operand of the operator to an l-value rather than an r-value.
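The benefit of an l-value 'expected' is easiest to see in a retry loop. The sketch below computes an atomic maximum; atomic_max is a hypothetical helper, not part of either facility.

```cpp
#include <atomic>

// Raises `object` to at least `candidate`. Returns the larger of `candidate`
// and the value last observed in `object`.
int atomic_max(std::atomic<int>& object, int candidate) {
    int expected = object.load(std::memory_order_relaxed);
    while (expected < candidate) {
        if (object.compare_exchange_weak(expected, candidate)) {
            return candidate;
        }
        // Failure: `expected` now holds the value actually found, so the
        // loop condition is re-tested without a separate reload.
    }
    return expected;
}
```

With an r-value 'expected', each failed iteration would need an explicit extra load before retrying.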

The existing C/C++ facility provides separate functions for the weak and strong semantics of compare-exchange. The C proposal does not specify which of these semantics it provides. If only one is available, the weak semantics are the better choice. (The reasons are efficiency on some platforms and robustness to padding and redundant representations.)

Recommendation: Define the operator to have weak semantics.

The proposal presents spinlocks as an example use of the new compare-exchange operator. However, the compare-exchange operation is significantly stronger than necessary for spinlocks. Furthermore, spinlocks can have pathological behavior on multi-programmed systems.

Recommendation: Use a better example, perhaps from wait-free data structures.
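One candidate, offered only as a sketch, is the push operation of a linked stack. Note that this is lock-free rather than wait-free, a weaker guarantee than the recommendation asks for, because an individual push may retry an unbounded number of times. The types Node and PushOnlyStack are hypothetical; removal is limited to detaching the whole list so that the example avoids the ABA problem of a per-node pop.

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

// Hypothetical minimal stack: concurrent push, plus detaching the whole list.
class PushOnlyStack {
public:
    void push(Node* node) {
        node->next = head_.load(std::memory_order_relaxed);
        // On failure, node->next is refreshed with the current head and
        // the exchange is retried.
        while (!head_.compare_exchange_weak(node->next, node)) {
        }
    }

    // Detaches and returns every node pushed so far, most recent first.
    Node* take_all() { return head_.exchange(nullptr); }

private:
    std::atomic<Node*> head_{nullptr};
};
```

Unlike the spinlock, this use genuinely needs compare-exchange: the new head is only valid if the old head is still the one the node points to.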

Lost Non-Volatile Optimization

The C proposal states that the __atomic qualifier implies the volatile qualifier. This implication is unfortunate because it prevents a broad class of optimizations. For example, given a non-volatile atomic object a, the operation sequence a+=1, a+=2 could be optimized to a+=3. This optimization is valid because it is possible that no other thread observes the state between those two operations. Given that atomic operations may take a hundred cycles, such optimizations could be valuable.
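The two forms in question are sketched below in C++. Whether any particular compiler actually performs this fusion is not claimed here; the point is only that the second function is a permissible replacement for the first when the atomic is not volatile. The function names are hypothetical.

```cpp
#include <atomic>

// Two separate atomic read-modify-write operations, as written by the programmer.
int add_in_two_steps(std::atomic<int>& a) {
    a += 1;
    a += 2;
    return a.load();
}

// The single operation that could be substituted for a non-volatile atomic.
// With a volatile atomic, both operations above would have to be performed.
int add_in_one_step(std::atomic<int>& a) {
    a += 3;
    return a.load();
}
```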

The C++ committee also anticipated potential future compilers that would completely remove threads of execution, and in the process turn variables that formerly needed to be atomic into simple sequential variables. Such an optimization would not be possible if all atomic variables were implicitly volatile.

Recommendation: Keep volatile and atomic as distinct and orthogonal concepts.

Type Names

The C proposal constructs atomic types by qualifying the base type with a new __atomic type qualifier. The C++ FCD constructs atomic types via a class template taking the base type as an argument. The qualifier approach has several problems.

Over-Constrained Size and Alignment

Because qualification implies a differing interpretation over a given base type, changes in alignment and size are not possible. Changing the alignment and size is important to efficiency. For example, some IA32 ABIs require only 16-bit alignment for 32-bit integers. However, the atomic operations require 32-bit alignment. One would want to impose additional alignment constraints on the atomic versions of many types. Likewise, structures that are of odd sizes would become more efficient when extended to sizes that are a multiple of the word size. C++ avoids these problems because atomic types are separate types with separate sizes and alignments, not qualified types.
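Because the C++ atomic is a separate type, its size and alignment can be inspected independently of the base type. The sketch below only reports the four numbers and makes no claim about what a given implementation chooses; Rgb, Layout, and layout_of are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>

struct Rgb { unsigned char r, g, b; };   // hypothetical odd-sized struct

struct Layout {
    std::size_t plain_size;
    std::size_t atomic_size;
    std::size_t plain_align;
    std::size_t atomic_align;
};

// Collects the size and alignment of T and of std::atomic<T>.
template <typename T>
Layout layout_of() {
    return Layout{sizeof(T), sizeof(std::atomic<T>),
                  alignof(T), alignof(std::atomic<T>)};
}
```

An implementation is free to make the atomic version larger or more strictly aligned than the base type; under the qualifier approach the two would have to agree.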

Prohibiting Internal Locks

The representation constraints of the qualifier approach prevent the use of locks internal to the atomic. While we expect most programmers would not be happy with such a representation, the existing C/C++ specification has been carefully crafted to enable such an implementation when unavoidable, without penalizing implementations that do not use internal locks.
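In the C++ facility, a program can ask whether a given atomic object is implemented without a lock. A minimal sketch, with the hypothetical wrapper name uses_no_internal_lock:

```cpp
#include <atomic>

// Reports whether operations on this particular object avoid an internal lock.
template <typename T>
bool uses_no_internal_lock(const std::atomic<T>& object) {
    return object.is_lock_free();
}
```

The answer is implementation-defined for most types, which is what leaves room for a lock-based representation where the hardware offers nothing better.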

Qualification Inconsistency

Qualifiers are often added or removed when passing pointers, which is very likely to produce programs in which some operations on a memory location are atomic and some are not. Such programs may silently fail in their value computations or silently become sequentially inconsistent.
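The contrast with a distinct type can be checked at compile time in C++. In this sketch the first assertion shows a qualifier being added implicitly to a pointer, while the other two show that a pointer to an atomic and a pointer to its base type do not convert into one another implicitly. The function increment is a hypothetical example of an interface that can only be reached with an atomic.

```cpp
#include <atomic>
#include <type_traits>

// A qualifier can be added silently when a pointer is passed.
static_assert(std::is_convertible<int*, const int*>::value,
              "qualifiers are added implicitly");

// A distinct atomic type cannot be silently viewed as its base type, or vice versa.
static_assert(!std::is_convertible<std::atomic<int>*, int*>::value,
              "no implicit atomic-to-plain pointer conversion");
static_assert(!std::is_convertible<int*, std::atomic<int>*>::value,
              "no implicit plain-to-atomic pointer conversion");

// Every access made through this pointer is an atomic operation.
int increment(std::atomic<int>* counter) { return ++*counter; }
```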

Nested Atomicity

The C proposal allows both a struct and its members to be __atomic qualified. This is problematic, as one thread could access an atomic member while another accesses the struct as a whole. If both accesses are intended to be atomic, which is not clear from the proposal, then the accesses require a substantially more expensive implementation, one that is reminiscent of transactional memory, but weaker. As Hans Boehm noted, a hierarchy of lock acquisitions is sufficient to implement the nested atomicity.

The C proposal further allows non-atomic access to members of __atomic qualified structs. If such accesses are not atomic with respect to full-struct accesses, then the specification provides the weak atomicity of transactional memory. If such accesses are atomic, then the specification provides the strong atomicity of transactional memory. Strong atomicity is not presently supported by hardware.

C++ prevents such problems in two ways. First, the argument to the atomic template class must be trivially copyable. Atomic types are not themselves trivially copyable, and hence are invalid as arguments. Second, all operations on atomic objects are value-in/value-out. The types and operations provide no references to any internal state.
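A short sketch of the value-in/value-out discipline follows. The struct Point and the function move_right are hypothetical; the point is that a member can only be changed by loading the whole struct, editing a private copy, and exchanging the whole struct back.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

struct Point { std::int16_t x; std::int16_t y; };   // hypothetical base type

static_assert(std::is_trivially_copyable<Point>::value,
              "Point is a valid argument to the atomic template");
// std::atomic deletes its copy operations, so it cannot itself be
// loaded or stored by value.
static_assert(!std::is_copy_constructible<std::atomic<int>>::value,
              "atomics are not copyable");

// Updates one member by replacing the whole struct atomically.
Point move_right(std::atomic<Point>& shared, std::int16_t dx) {
    Point expected = shared.load();
    Point desired;
    do {
        desired = expected;   // private copy, no reference into `shared`
        desired.x = static_cast<std::int16_t>(expected.x + dx);
    } while (!shared.compare_exchange_weak(expected, desired));
    return desired;
}
```

No operation hands out a pointer or reference to x or y inside the atomic, so the mixed member and whole-struct accesses described above cannot be expressed.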

Recommendation

C++ adopted the approach of distinct constructed types rather than qualified types to prevent use of both atomic and non-atomic operations on the same type, which provides for increased program safety and increased opportunity for optimization.

Given the problems with the qualification approach in contrast to the constructed type approach, I must recommend against the qualifier approach in favor of the constructed type approach.

There still remains the issue of another syntax for constructing atomic types. Compatibility and interoperability with C++ would be enhanced if that syntax resembles C++ syntax, e.g. a new syntax rule:

type-specifier:
_Atomic < type-name >

Note that support of such a syntax does not imply the introduction of templates into the C language.

Other type construction syntax is certainly possible, at the cost of more complexity in code that must be compiled by both languages.

Summary

The C proposal N1473 provides syntactically convenient access to atomics. Much of its semantics is already consistent with C++. However, it has several problems that are a consequence of an excess of generality. Most of these problems can be solved with well-targeted restrictions, which also bring the semantics in line with those of C++. However, the type qualifier approach is inconsistent with the state of the art in systems implementation. Instead, the proposal should use more direct type construction, and therefore should propose a different syntax.