Doc No:

<N 1875>

Date:

2014-09-29

Reply to:

Max Abramson <[email protected]>



Adding classes to C

Summary

This paper proposes adding C++-style classes to the C programming language. Compared with the same programs written in C++, programs that need the object-oriented features familiar to nearly all modern programmers would gain shorter compile times, smaller executables, a smaller memory footprint, and better interoperability with existing C code and libraries. Only static features that are compatible with existing C++ compilers (and, as much as possible, Objective-C compilers) are used. The emphasis on speed and small size, as compared to dynamic languages, aligns with the growing use of the C language where performance, size, and real-time behavior are desired.

The intent of this proposal (as with the accompanying papers proposing access specifiers and single inheritance) is to take the most conservative approach possible. I propose adding features that are already implemented by C++ compilers (and, wherever possible, are not incompatible with Objective-C) and that are unlikely to create problems for future expansions of the C language, to increase compile time or the difficulty of implementing dedicated C compilers, or to expand the size of binaries generated for existing projects. The features proposed here may be added independently of, or together with, the features proposed in “Access specifiers” and “Single Inheritance.”

Initializers and Deleters

Members of a class can be initialized using a C-style initialization list:

Date stValentinesDay = {2, 14};

or by calling a custom initializer method that follows the proposed initClassName and deleteClassName naming convention:

Date thanksgiving;

thanksgiving.initDate(11, 22);
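As a rough illustration of the two forms above, the following sketch uses C++, whose rules this proposal adopts. The Date type shown here is hypothetical: the member names m_month and m_day and the body of initDate are assumptions made for the example, not part of the proposal.

```cpp
#include <cassert>

// Hypothetical Date type; the member names are assumptions for illustration.
struct Date {
    int m_month;
    int m_day;

    // Explicit initializer following the proposed initClassName convention.
    void initDate(int month, int day) {
        m_month = month;
        m_day = day;
    }
};

// Exercises both forms of initialization.
void dateExample() {
    Date stValentinesDay = {2, 14};   // C-style initialization list
    assert(stValentinesDay.m_month == 2 && stValentinesDay.m_day == 14);

    Date thanksgiving;                // declared first, then initialized explicitly
    thanksgiving.initDate(11, 22);
    assert(thanksgiving.m_month == 11 && thanksgiving.m_day == 22);
}
```

Because Date declares no constructors, it remains an aggregate, so the initialization list keeps working alongside the explicit initializer.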

At present, some compilers will not allow initialization lists to be used where a structure contains any objects, such as strings, nor can data members be skipped over:

Date birthday = {"February", , 1996};

This author considers this to be a major shortcoming of simple initialization, and support for both cases should be added to the standard. While automatic constructors and destructors are some of the most popular features in C++, this paper proposes a more conservative naming convention: the programmer manually writes initClassName() and deleteClassName() methods, which are called explicitly and only when needed. One downside is that an initializer method must explicitly call the initializer of its base class if any parent class contains instance variables.

class Ostrich : public Animal {
    double m_speedEst;
    double* m_guessPtr; //allocated by initOstrich(double), released by deleteOstrich(double)

    initOstrich(){/**/} //no-argument "{/**/}": proposed convention for an unfinished initializer

    initOstrich(double speedGuess) { //overloaded initializer taking one double
        initAnimal(arg1, arg2); //explicit call to the base-class initializer; arg1 and arg2 are placeholders
        m_guessPtr = malloc(sizeof(double));
        m_speedEst = speedGuess ? speedGuess : SPEED_LOOKUP;
        ++numRatites; //static int numRatites, previously declared globally
    }

    deleteOstrich() {
        --numRatites;
    }

    deleteOstrich(double speedGuess) { //overloaded deleter paired with initOstrich(double)
        --numRatites;
        free(m_guessPtr); //free() takes the pointer returned by malloc()
        deleteAnimal(); //explicit call to the base-class deleter
    }
};
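The convention above can be tried out today in C++. The following is a minimal sketch under stated assumptions: the Animal members, the initAnimal(int) signature, and the choice to store the guess on the heap are all hypothetical, and the initializers are given a void return type so that the code compiles as C++.

```cpp
#include <cassert>
#include <cstdlib>

static int numRatites = 0;               // global count of live ratites
static const double SPEED_LOOKUP = 55;   // fallback value, as in the Runner class of this paper

// Hypothetical base class; its members are assumptions for illustration.
struct Animal {
    int m_legs;
    void initAnimal(int legs) { m_legs = legs; }
    void deleteAnimal() { m_legs = 0; }
};

struct Ostrich : public Animal {
    double  m_speedEst;
    double* m_guessPtr;   // heap storage owned by this object

    void initOstrich(double speedGuess) {
        initAnimal(2);    // the base-class initializer is called explicitly
        m_guessPtr = static_cast<double*>(std::malloc(sizeof(double)));
        if (m_guessPtr) *m_guessPtr = speedGuess;
        m_speedEst = (speedGuess != 0.0) ? speedGuess : SPEED_LOOKUP;
        ++numRatites;
    }

    void deleteOstrich() {
        --numRatites;
        std::free(m_guessPtr);   // release what initOstrich allocated
        m_guessPtr = nullptr;
        deleteAnimal();          // base-class cleanup is explicit as well
    }
};
```

Nothing runs automatically here: an Ostrich that is never passed through initOstrich holds indeterminate values, which is the trade-off the convention accepts in exchange for predictable cost.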

For reference, C++ maintains backward compatibility with C programs by implementing a structure as a class whose data members and methods are public by default, while the members and methods of a C++ class are private by default. C++ resolves calls to non-virtual methods at compile time; only virtual methods are typically dispatched through a table of pointers to those methods.

C++ also allows the use of virtual (dynamic) functions, run-time type information, namespaces, and friend classes. The first two are dynamic, and the mere existence of those features increases both compile time and the size of the resulting binaries in C++ programs. All methods in Objective-C are both virtual and public. This author hopes for the future implementation of all static, simple features commonly found in the C++ specification, but leaves such questions and features for future discussion.
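The size cost of a dynamic feature can be observed directly in C++. In the sketch below, the type names PlainRunner and VirtualRunner are hypothetical. On common implementations the virtual version carries a hidden per-object pointer to a function table, but the exact sizes are implementation-defined, so the comparison is an observation rather than a guarantee.

```cpp
#include <cassert>

// Static methods only: calls are resolved at compile time.
struct PlainRunner {
    double m_speedEst;
    double speed() const { return m_speedEst; }
};

// One virtual method: calls may be dispatched at run time.
struct VirtualRunner {
    double m_speedEst;
    virtual double speed() const { return m_speedEst; }
    virtual ~VirtualRunner() {}
};

// Reports whether the virtual version needs more storage per object.
bool virtualCostsSpace() {
    return sizeof(VirtualRunner) > sizeof(PlainRunner);
}
```

Both types return the same value from speed(); only the dispatch mechanism and the object layout differ.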

In keeping with the conservative philosophy proposed, and adding only public methods (as well as access specifiers and single inheritance in the two accompanying papers), the compiler would view the hierarchy of classes thus:

This paper proposes standard initializers and deleters that adhere to a predictable naming convention. Though C++-style constructors and destructors are among the most popular features in the language, there may be a loss in performance and an expansion of code size when they are used.

Nested classes can be given methods that apply only to objects instantiated from them, providing composition and aggregation for useful design patterns in object-oriented programming. This also allows useful code to be written that would otherwise rely on inherited data members and methods in OO languages like C++, Java, Objective-C, or Python.

Initializers and deleters can be overloaded, allowing different versions to be called when a structure or class is instantiated. The standard practice in C has been to let programmers discover innate functionality in the language, putting a limited set of features to more novel uses over time.

The proposal more fully supports Resource Acquisition Is Initialization (RAII) in the form of CUYOM (Clean Up Your Own Mess), both for performance and for maintainability of code. The segment of a program that opens a file, creates an object, or allocates memory should also be responsible for cleanup.

The first two reasons motivate the expansion of the language and would help to avoid name collisions and other problems that develop in very large programs. The second two are happy byproducts of these features, as is the improved ability to add popular libraries (like a version of the Standard Template Library and the Boost libraries) in ways that are more predictable for users of Java, C++, and other high-level languages.
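The CUYOM point can be sketched in C++ using the proposed naming convention. Everything named here is an assumption made for illustration: the Buffer type, its initBuffer and deleteBuffer methods, the liveBuffers counter, and the measure function.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

static int liveBuffers = 0;   // hypothetical counter of buffers not yet released

// Hypothetical resource-owning type with an explicit initializer and deleter.
struct Buffer {
    char*       m_data;
    std::size_t m_size;

    bool initBuffer(std::size_t size) {
        m_data = static_cast<char*>(std::malloc(size));
        m_size = m_data ? size : 0;
        if (m_data) ++liveBuffers;
        return m_data != nullptr;
    }

    void deleteBuffer() {
        if (m_data) {
            std::free(m_data);
            m_data = nullptr;
            m_size = 0;
            --liveBuffers;
        }
    }
};

// Copies text into a scratch buffer and returns its length.
// The function that acquires the buffer is also the one that releases it.
std::size_t measure(const char* text) {
    Buffer scratch;
    if (!scratch.initBuffer(std::strlen(text) + 1)) return 0;
    std::strcpy(scratch.m_data, text);
    std::size_t length = std::strlen(scratch.m_data);
    scratch.deleteBuffer();   // clean up in the same segment that allocated
    return length;
}
```

After any call to measure, liveBuffers is back at its previous value, which gives a simple way to check that the caller cleaned up its own mess.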

Example

For reference, C++ treats structures as classes for backward compatibility, but with the default access specifier changed to public. Again, this paper proposes strict conformance to the C++ standard (WG21/N3337) in handling structures and classes for C. By default, new C code must either operate in exactly the same way under C++ or fail to compile.

class Runner {
    const double SPEED_LOOKUP = 55;
    double m_speedEst;
    double estRunningSpeed(double hipHeight, double strideLength){...}
};

class Ratites {
    int m_numRatites;

    Ostrich bob; //this is the simplest way to nest classes
    Runner runner; //no constructor/initializer is needed, no instance variables
public:
    void setNumRatites(int num) {
        m_numRatites = num;
    }
    int getNumRatites() {
        return m_numRatites;
    }
    void print() {/**/}
};
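Composition of this kind can be exercised in C++ as follows. The sketch deliberately uses its own hypothetical types (Egg and Nest, with initEgg, initNest, grams, and totalGrams) rather than completing the classes above, whose method bodies are left unspecified in this paper.

```cpp
#include <cassert>

// Hypothetical member type with an explicit initializer.
class Egg {
    int m_grams;
public:
    void initEgg(int grams) { m_grams = grams; }
    int grams() const { return m_grams; }
};

// Hypothetical outer class that owns two Egg objects by value (composition).
class Nest {
    Egg m_first;
    Egg m_second;
public:
    // The outer initializer explicitly initializes each nested object.
    void initNest(int firstGrams, int secondGrams) {
        m_first.initEgg(firstGrams);
        m_second.initEgg(secondGrams);
    }
    // The outer class reaches nested data only through public methods.
    int totalGrams() const { return m_first.grams() + m_second.grams(); }
};
```

Nest reuses Egg without inheriting from it, which is the substitute for inheritance that nested objects are meant to provide.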



Method calls

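Method calls are handled the same way that C++ handles them. Once an object has been instantiated, one of its available methods is called with the member-access operator: "." when working with the object itself, and "->" when working through a pointer to the object:

runner.method(speedVal, slowVal);

runnerPtr->method(speedVal, slowVal);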
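To make the call forms concrete, here is a small C++ sketch. The Runner type below is a cut-down stand-in, and the method scaled and the free function Runner_scaled are hypothetical names. The free function shows, roughly, what a statically dispatched method call amounts to: an ordinary function that receives the object's address as an explicit argument.

```cpp
#include <cassert>

// Cut-down stand-in for the Runner class; scaled() is a hypothetical method.
struct Runner {
    double m_speedEst;
    double scaled(double factor) const { return m_speedEst * factor; }
};

// Hypothetical free-function equivalent with the object pointer made explicit.
double Runner_scaled(const Runner* self, double factor) {
    return self->m_speedEst * factor;
}

// Calls the same method on an object and through a pointer to it.
void callExample() {
    Runner runner = {10.0};
    Runner* runnerPtr = &runner;
    assert(runner.scaled(2.0) == 20.0);       // call on an object
    assert(runnerPtr->scaled(2.0) == 20.0);   // call through a pointer
    assert(Runner_scaled(&runner, 2.0) == 20.0);
}
```

All three calls compute the same value; the method syntax only hides the pointer that the free function takes explicitly.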



Wording

The proposed wording draws heavily on the wording in WG21/N3337. Note: It is the intent of this proposal that access specifiers and other features conform to the specifications given therein. Wording additions take the form of new keywords imported from C++, with meaning and syntax identical to those defined in N3337.

6.4.1 Keywords

Add this, class, and public to the list of reserved keywords.

Add new sections 6.7.11 and 6.7.13:

6.7.11 Member Access

1 A data member of a structure can be

private; that is, its name can be used only by members of the structure in which it is declared. This is the default access level for classes.

protected; that is, its name can be used only by members of the structure in which it is declared or by structures derived from that structure.

public; that is, its name can be used anywhere without access restriction. This is the default access level for structs. All methods are public.

[This section applies only if the “Access specifiers” proposal is also approved.]
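The three access levels behave as follows in C++, which this wording follows. The sketch assumes that both the “Access specifiers” and “Single Inheritance” proposals are adopted; the types Bird and Emu and all of their members are hypothetical.

```cpp
#include <cassert>

class Bird {
    int m_secret;        // private: the default access level for a class
protected:
    int m_wingspan;      // usable by Bird and by structures derived from it
public:
    int m_legs;          // usable anywhere
    void initBird(int secret, int wingspan) {
        m_secret = secret;
        m_wingspan = wingspan;
        m_legs = 2;
    }
    int secret() const { return m_secret; }   // members of Bird may use m_secret
};

struct Emu : public Bird {
    // A derived structure may use the protected name.
    int doubledWingspan() const { return m_wingspan * 2; }
    // int leak() const { return m_secret; }   // would not compile: m_secret is private to Bird
};
```

Code outside both types can read m_legs directly but must go through secret() or doubledWingspan() for the other two members.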

2 A data member of a class can also access all the names to which the class itself has access. A local class of a member function may access the same names that the member function itself may access. Access permissions are thus transitive and cumulative to nested and local classes.

3 Data members of a structure defined with the keywords struct or union are public by default.

4 Access control is applied uniformly to all names, whether the names are referred to from declarations or expressions. In the case of overloaded function names, access control is applied to the function selected by overload resolution. Because access control applies to names, access control is applied to a typedef name--not the entity referred to by the typedef.

5 It should be noted that it is access to members and base classes that is controlled, not their visibility. Names of members are still visible, and implicit conversions to base classes are still considered, when those members and base classes are inaccessible. The interpretation of a given construct is established without regard to access control. If the interpretation established makes use of inaccessible member names or base classes, the construct is ill-formed. [Note: Base structures are not yet a concern for this specification if the Single Inheritance paper is not adopted as part of the specification for C, but they are included here for future reference.]

6 All access controls in this section affect the ability to access a struct or class member name from the declaration of a particular entity, including parts of the declaration preceding the name of the entity being declared and, if the entity is a structure, the definitions of members of the struct or class appearing outside the structure’s member specification. [Note: this access also applies to implicit references to constructors, conversion functions, and destructors; the note does not apply if constructors are not implemented.]

7 Here, all the uses of A::I are well-formed because A::f, A::x, and A::Q are members of struct A. This implies, for example, that access checking on the first use of A::I must be deferred until it is determined that this use of A::I is as the return type of a member of struct A. Similarly, the use of A::B as a base-specifier is well-formed because D is derived from A, so checking of base-specifiers must be deferred until the entire base-specifier-list has been seen.
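The uses of A::I that paragraph 7 describes can be sketched in C++ as follows. The sketch is modeled on the corresponding example in N3337 and leaves out its template, friend, and base-specifier parts; the public helpers callF and readX are assumptions added only so that the private members can be exercised.

```cpp
#include <cassert>

class A {
    typedef int I;   // private member type
    I f();           // private member function returning I
    static I x;      // private static data member of type I
public:
    // Hypothetical helpers, not part of the N3337 example.
    int callF() { return f(); }
    static int readX() { return x; }
};

// A::I appears before the compiler knows that a member of A is being defined.
// The access check is deferred until then, so both definitions are accepted.
A::I A::f() { return 0; }
A::I A::x = 0;

// A::I outside = 0;   // would not compile: A::I is private and this is not a member of A
```

The two out-of-class definitions compile even though A::I is private, while the commented-out line shows the same name being rejected in a declaration that is not a member of A.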

8 The names in a default argument (8.3.6) are bound at the point of declaration, and access is checked at that point rather than at any points of use of the default argument.



6.7.13 Nested structures and classes

1 A nested structure is a member and as such has the same access rights as any other member. The members of an enclosing structure or class have no special access to members of a nested structure; the usual access rules for private, protected, and public shall be obeyed [if the accompanying proposal “Access specifiers” is adopted].

typedef struct Eagle {
    int x; //public, by default
    struct Bird { };
    struct Golden {
        void f(Eagle* ptrE, int i) { //methods must be public
            ptrE->x = i; //OK: Eagle::Golden can access Eagle::x
        }
    };
} Eagle;
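Both halves of this rule can be checked in C++. In the sketch below, Eagle::x is made private (an assumption, to make the point visible), and the members m_hidden, initGolden, initEagle, and getX are hypothetical additions.

```cpp
#include <cassert>

class Eagle {
    int x;   // private here, unlike the struct above, to show the access rule
public:
    struct Golden {
        // OK: a nested structure is a member of Eagle, so it may use Eagle::x.
        void f(Eagle* ptrE, int i) { ptrE->x = i; }
        void initGolden() { m_hidden = 0; }
    private:
        int m_hidden;   // Eagle's own members get no special access to this
    };

    void initEagle() { x = 0; }
    int getX() const { return x; }
    // int peek(Golden* g) { return g->m_hidden; }   // would not compile: m_hidden is private to Golden
};
```

Access runs in one direction only: Golden may reach into Eagle because it is one of Eagle's members, but Eagle is an outsider with respect to Golden's private data.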



The this pointer [structName.this]

1 In the body of a non-static member function, the keyword this is a prvalue expression whose value is the address of the object for which the function is called. The type of this in a member function of a class X is X*. If the member function is declared const, the type of this is const X*, if the member function is declared volatile, the type of this is volatile X*, and if the member function is declared const volatile, the type of this is const volatile X*.

2 In a const member function, the object for which the function is called is accessed through a const access path; therefore, a const member function shall not modify the object and its non-static data members.

Example:

struct s {
    int a;
    int f() const;
    int g() { return a++; }
    int h() const { return a++; } // error
};

int s::f() const { return a; }

The a++ in the body of s::h is ill-formed because it tries to modify (a part of) the object for which s::h() is called. This is not allowed in a const member function because this is a pointer to const; that is, *this has const type.

3 Similarly, volatile semantics apply in volatile member functions when accessing the object and its non-static data members.

4 A cv-qualified member function can be called on an object-expression (5.2.5) only if the object-expression is as cv-qualified or less-cv-qualified than the member function. Example:

void k(s* x, const s* y) {
    x->f();
    x->g();
    y->f();
    y->g(); // error: *y is const and s::g() is a non-const member function
}
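The rules in this section can be verified at compile time in C++. The sketch below uses a hypothetical Counter type and a hypothetical readOnly function; the static_assert lines check the type of this described in paragraph 1.

```cpp
#include <cassert>
#include <type_traits>

struct Counter {
    int a;

    int get() const {
        // In a const member function, this has type const Counter*.
        static_assert(std::is_same<decltype(this), const Counter*>::value,
                      "this should be const Counter* here");
        return a;
    }

    int bump() {
        // In a non-const member function, this has type Counter*.
        static_assert(std::is_same<decltype(this), Counter*>::value,
                      "this should be Counter* here");
        return a++;
    }
};

// Only const member functions may be called through a pointer to const.
int readOnly(const Counter* c) {
    // c->bump();   // would not compile: bump() is a non-const member function
    return c->get();
}
```

A non-const object may call both get() and bump(), while the pointer to const in readOnly is limited to get(), matching paragraph 4.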