A C++ source file contains a series of zero or more declarations. A declaration can be a function, type, object, namespace, template instantiation or specialization, or a number of similar entities. This chapter mainly discusses namespaces, objects, and types. Functions, classes, templates, and related declarations each get their own chapters: Chapter 6 for functions; Chapter 7 for classes and friends; Chapter 8 for template declarations, specializations, and instantiations.
A declaration is the all-encompassing term for anything that tells the compiler about an identifier. In order to use an identifier, the compiler must know what it means: is it a type name, a variable name, a function name, or something else? Therefore, a source file must declare every name it uses.
A definition provides the body or contents for a declaration, such as a function body or the initial value of a static variable. The difference between a declaration and a definition, briefly, is that a declaration tells the compiler how to use a name, and the definition provides additional information that the program needs to back up that name.
In a single source file, there can be at most one definition of an entity. In an entire program, there must be exactly one definition of each function or object used in the program, except for inline functions, which must be defined in every source file that uses the function.
A program can have more than one definition of certain entities (classes, enumerations, inline functions, and templates), provided the definitions are the same.
Note that only declarations are required to use an entity (functions, classes, etc.). In any non-trivial program, the declarations and definitions are kept in separate files. Place declarations in a header file (whose name typically ends with .h or .hpp), so other source files can #include the header file. Place the definitions in a source file (whose name typically ends with .cpp, .cc, or .C). Any source file that needs to use one of the declarations must #include the necessary header files. Compile and link the source files to produce the program.
In this chapter and subsequent chapters, the description of each entity clearly states whether the entity (type, variable, class, function, etc.) has separate definitions and declarations, when definitions are required, and any other rules pertaining to declarations and definitions.
Some language constructs can look like a declaration or an expression. Such ambiguities are always resolved in favor of declarations. Other declarations can look like function declarations or object declarations. A consequence of the first rule is that these declarations are interpreted as function declarations.
A related rule is that a type specifier followed by a name and empty parentheses is a declaration of a function that takes no arguments, not a declaration of an object with an empty initializer. Example 3-1 shows some examples of how declarations are interpreted.
Example 3-1: Disambiguating declarations.
#include <iostream>
#include <ostream>

class T {
public:
  T() { std::cout << "T()\n"; }
  T(int) { std::cout << "T(int)\n"; }
};

int a, x;

int main()
{
  T(a);       // Variable named a of type T,
              // not an invocation of the T(int) constructor.
  T b();      // Function named b of no arguments,
              // not a variable named b of type T.
  T c(T(x));  // Declaration of a function named c,
              // with one argument of type T.
}
The last item deserves further explanation. The function parameter T(x) could be interpreted as an expression (constructing an instance of T with the parameter x), or it can be interpreted as a declaration of a function parameter of type T named x, with a redundant set of parentheses around the parameter name. According to the disambiguation rule, it must be a declaration, not an expression. That means the entire declaration cannot be the declaration of an object named c, whose initializer is the expression T(x). Instead, it must be the declaration of a function named c, whose parameter is of type T, named x.
If your intention is to declare an object, not a function, the simplest way to do this is not to use the function-call style of type cast. Instead, use one of the new type cast expressions, such as static_cast<>, for example,
T c(static_cast<T>(x));  // Declares an object named c
                         // whose initial value is x, cast
                         // to type T.
Every declaration has a well-defined scope, a region of source code where any use of the unqualified name (that is, as a plain identifier) refers to that entity. Scope regions can be nested, and the same name can be declared within an outer and an inner scope region. Within the inner scope region, the unqualified name refers to the inner entity. Outside of the inner region, the name refers to the outer entity. Example 3-2 illustrates simple scope rules.
Example 3-2: Names in inner scopes can hide names in outer scopes.
#include <iostream>
#include <ostream>

int main()
{
  for (int i = 0; i < 100; ++i)
  {
    int x = 42;
    if (x < i)
    {
      double x = 3.14;
      std::cout << x;  // prints 3.14
    }
    std::cout << x;    // prints 42
  }
  std::cout << x;      // error: no x declared in this scope
}
At the same scope level, you cannot have multiple declarations for the same name, unless every declaration of that name is for an overloaded function or function template, or the declarations are identical typedef declarations.
A name can be hidden by a different declaration in an inner scope level. Also, a class name or enumeration type name can be hidden by an object, function, or enumerator at the same scope level. For example,
enum x { a, b, c };
const int x = 42;    // okay: hides enum x
const int a = 10;    // error: int cannot hide enumerator a
{
  const int a = 10;  // okay: inner scope can hide outer a
}
Different entities (functions, statements, classes, namespaces, etc.) establish scopes in different ways. The description for each entity (in this and subsequent chapters) includes scope details. The general rule of thumb is that curly braces delimit a scope region. The outermost scope region is outside of all curly braces and is called the global scope. Read more about the global scope under the Namespaces section, later in this chapter.
When the compiler reads an identifier, it must look up the identifier to determine which declaration it comes from. In most cases, you can readily tell which identifier is which, but it is not always so simple. To understand name lookup fully, you must first understand namespaces (covered later in this chapter), functions (Chapter 6), classes (Chapter 7), and templates (Chapter 8).
Name lookup takes place before overload resolution and access-level checking. If a name is found in an inner scope, the compiler uses that declaration, even if a better declaration would be found in an outer scope, as illustrated in Example 3-3. Refer to Chapter 6 for more information about overloaded functions and to Chapter 7 for information about access levels in a class declaration.
Example 3-3: Name lookup trumps overload resolution.
#include <iostream>
#include <ostream>

void func(int i)
{
  std::cout << "int: " << i << '\n';
}

namespace N {
  void func(double d)
  {
    std::cout << "double: " << std::showpoint << d << '\n';
  }

  void call_func()
  {
    // Even though func(int) is a better match, the compiler
    // finds N::func(double) first.
    func(3);
  }
}

int main()
{
  N::call_func();  // prints "double: 3.00000"
  using N::func;
  using ::func;
  // Now all overloaded func()s are at the same scope level
  func(4);         // prints "int: 4"
}
A name that follows the global scope operator (unary ::) is looked up in the global scope of the source file where the operator is used. The name must be declared in the global scope, not in a nested scope, or the name must have been introduced into the global scope by a using directive (see Namespaces, later in this chapter, for information about using directives).
Use the global scope operator to access names that have been hidden by an inner scope. Example 3-4 shows the use of the global scope operator.
Example 3-4: Using the global scope operator.
#include <iostream>
#include <ostream>

int x = 42;

int main()
{
  double x = 3.1415;         // hides the global x
  std::cout << x << '\n';    // prints 3.1415
  std::cout << ::x << '\n';  // prints 42
}
The binary scope resolution operator (::) requires a namespace or class name as its left-hand operand, and an identifier as its right-hand operand. The identifier is looked up in the scope of the left-hand namespace or class. Example 3-5 shows the use of the scope resolution operator. Notice how the local variable n hides the namespace n, so the simple name n refers to the int variable. The left-hand operand to ::, however, must be a class or namespace, so the local n does not hide the namespace n in the expression n::counter::n.
Example 3-5: Using the scope resolution operator.
#include <iostream>
#include <ostream>

namespace n {
  struct counter {
    static int n;
  };
  double n = 2.71828;
}

int n::counter::n = 42;

int main()
{
  int counter = 0;             // unrelated to n::counter
  int n = 10;                  // hides namespace n
  ::n::counter x;              // refer to outer n

  std::cout << n::counter::n;  // prints 42
  std::cout << n::n;           // prints 2.71828
  std::cout << x.n;            // prints 42
  std::cout << n;              // prints 10
  std::cout << counter;        // prints 0
}
The compiler looks up an unqualified name, that is, a bare identifier or operator symbol, in a series of scopes in order to find its declaration. The simple rule is that the innermost scope is searched first, and succeeding outer scopes are searched until the declaration is found. An associated simple rule is that the name must be declared before it is used, reading the source file from top to bottom.
In a class declaration, the class and its ancestor classes are searched. If the class is local to a function, the block that contains the class is searched, then enclosing blocks. Finally the namespaces that contain the class declarations are searched.
In the definition of a member function (as a parameter type, default argument value, or in the function body, but not as the return type), a name can be declared anywhere in the declaration of the class. (This is the only exception to the declare-before-use rule.) The name is looked up first in the block that contains its use, then in enclosing blocks, then in the enclosing class and ancestor classes, then nested classes and their ancestors. If the class is local to a function, the enclosing blocks within the function are searched, and finally the enclosing namespaces are searched.
If a namespace contains a using directive (described later in this chapter), the referenced namespace is searched if all other searches fail to find the name.
If a friend function is defined within the body of the friend class, the lookup rules are the same as for a member function. If the function is defined outside the class, the rules are the same as for an ordinary function. Example 3-6 shows how the two different lookup rules can make a difference.
Example 3-6: Defining a friend function.
class foo {
public:
  friend void bar1(foo& f) {
    f.x = y; ++y;  // okay: y refers to foo::y
  }
  friend void bar2(foo& f);
private:
  static int y;
  int x;
};

void bar2(foo& f) {
  f.x = y; ++y;    // error: y not in scope
}
If the friend is a member function, the function and other names in its declaration are looked up first in the friend class, and if not found, in the class that contains the friend declaration, as you can see in Example 3-7.
Example 3-7: Declaring friend member functions.
class test {
public:
  typedef int T1;
  typedef float T2;
  void f(T1);
  void f(T2);
};

class other {
  typedef char T2;
  friend void test::f(T1);  // look up f and T1 in test
  friend void test::f(T2);  // look up f and T2 in test before
                            // looking it up in other
};
In the definition of a class member (function or static data) outside of the class declaration, lookup includes the class members and ancestor classes, but only after the class name appears in the declarator. Thus, the type specifier for the function return type or for the static data member are not looked up in the class declaration unless the type name is explicitly qualified. (Declarators and type specifiers are defined later in this chapter.) Example 3-8 shows the consequences of this rule.
Example 3-8: Defining members outside of a class declaration.
class node {
public:
  enum color { red, black };
  node(color x);
  color toggle(color c);
private:
  color c;
  static color root_color;
};

// Must qualify node::color and node::root_color, but
// initializer is in the scope of node, so it doesn't need
// to be qualified.
node::color node::root_color = red;

// Similarly, return type must be qualified, but parameter
// type need not be.
node::color node::toggle(color c)
{
  return static_cast<color>(1 - c);
}
Argument-dependent name lookup is also known as Koenig lookup, named after Andrew Koenig who devised this lookup rule. The short version of the rule is that the compiler looks up a function name in the usual places, and also in the namespaces that contain the argument types for each user-defined type (classes and enumerations).
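The slightly longer version is that the compiler first searches all the usual places. Unless that search finds a class member or a declaration that is not a function, the compiler also searches an additional list of classes and namespaces, and the functions found by both searches take part in overload resolution. The additional list depends on the function's argument types.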
Example 3-9 shows a typical case where argument-dependent name lookup is needed. The operator<< function is declared in the std namespace in the <string> header. It is not a member function of ostream, and the only way the compiler can find the operator is to search the std namespace. The fact that its arguments are in the std namespace tells the compiler to look there.
Example 3-9: Using argument-dependent name lookup.
#include <string>
#include <iostream>
#include <ostream>

int main()
{
  std::string message("Howdy!\n");
  std::cout << message;
}
Another way to look at argument-dependent lookup is to consider a simple class declaration, say, for rational numbers. In order to support the customary and usual arithmetic operators, you can choose to declare them as member functions, e.g.,
namespace numeric {
  class rational {
    ...
    rational operator+(const rational& other);
    rational operator+(int i);
  };
}
Expressions of the form r1 + 42 will compile successfully because the + operator is found as a member of rational (assuming r1 is an instance of rational). But 42 + r1 does not work. You must declare an operator at the namespace level, as follows,
namespace numeric {
  class rational { ... };
  rational operator+(int i, const rational& r);
  ...
}
In order to compile the expression 42 + r1, the compiler needs to find operator+ in namespace numeric. Without requiring the user to write a using directive, the only clue that tells the compiler to search namespace numeric is that the type of the second argument is declared in numeric. Thus, argument-dependent lookup enables the everyday, expected use of overloaded operators.
A namespace is a named scope. By grouping related declarations in a namespace, you can avoid name collisions with declarations in other namespaces. For example, suppose you are writing a word processor. Suppose also that you make use of packages that others have written, including a screen layout package, an equation typesetting package, and an exact-arithmetic package for computing printed positions to high accuracy with fixed-point numbers.
Suppose further that the equation package has a class called fraction, which represents built-up fractions in an equation; the arithmetic package also has a class called fraction, for computing with exact rational numbers; and the layout package has a class called fraction for laying out fractional regions of a page. Without namespaces, all three names would collide and you could not use more than one package in a single source file.
With namespaces, each class can reside in a separate namespace, say, layout::fraction, eqn::fraction, and math::fraction.
C++ namespaces are similar to Java packages.
Declare a namespace with the namespace keyword, followed by an optional identifier (the namespace name), followed by zero or more declarations in curly braces. Namespace declarations can be discontiguous, even in separate source files or headers. The namespace scope is the accumulation of all declarations for the same namespace that the compiler has seen at the time it looks up a name in the namespace. Namespaces can be nested. Example 3-10 shows a sample namespace definition.
Example 3-10: Defining a namespace.
#include <iostream>
#include <ostream>

namespace numeric {
  class rational { ... };
  template<typename charT, typename traits>
  std::basic_ostream<charT,traits>& operator<<(
    std::basic_ostream<charT,traits>& out, const rational& r);
}

namespace numeric {
  rational operator+(const rational&, const rational&);
}

numeric::rational numeric::operator+(const rational& r1,
                                     const rational& r2)
{
  ...
}

int main()
{
  using numeric::rational;
  rational a, b;
  std::cout << a + b << '\n';
}
You can also declare a namespace without a name, in which case the compiler uses a unique, internal name. Thus, each source file's unnamed namespace is separate from the unnamed namespace in every other source file.
You can define an unnamed namespace nested within a named namespace (and vice versa). Each unnamed namespace has a unique name. As with a named namespace, you can use multiple declarations to compose an unnamed namespace, as shown in Example 3-11.
Example 3-11: Using unnamed namespaces.
#include <iostream>
#include <ostream>

namespace {
  int i = 10;
}

namespace {
  int j;         // same unnamed namespace
  namespace X {
    int i = 20;  // hides i in outer, unnamed namespace
  }
  namespace Y = X;
  int f() { return i; }
}

namespace X {
  int i = 30;
  // X::unnamed is different namespace from ::unnamed
  namespace {
    int i = 40;  // hides ::X::i, but is inaccessible
                 // outside the unnamed namespace
    int f() { return i; }
  }
}

int main()
{
  int i = X::i;                   // ambiguous: unnamed::X or ::X?
  std::cout << ::X::f() << '\n';  // prints 40
  std::cout << Y::i << '\n';      // prints 20
  std::cout << f() << '\n';       // prints 10
}
The advantage of using an unnamed namespace is that you are guaranteed that all names declared in it can never clash with names in other source files. The disadvantage is that you cannot use the scope operator (::) to qualify identifiers in the unnamed namespace, so you must take care to avoid name collisions.
C programmers are accustomed to using global static declarations for names that are private to a source file. You can do the same in C++, but it is better to use an unnamed namespace because a namespace can contain any kind of declaration (including classes, enumerations, and templates), whereas static declarations are limited to functions and objects.
Declarations outside of all namespaces, functions, and classes are implicitly declared in the global namespace. A program has a single global namespace, which is shared by all source files that are compiled and linked into the program. Declarations in the global namespace are typically referred to as global declarations. Global names can be accessed directly using the global scope operator (unary ::).
A namespace alias is a synonym for an existing namespace. You can use the alias name to qualify names (with the :: operator) and in using declarations and directives, but not in namespace definitions. Example 3-12 shows some alias examples.
Example 3-12: Using namespace aliases.
namespace original {
  int f();
}
namespace abbrev = original;    // alias
int abbrev::f() { return 42; }  // okay
using abbrev::f;                // okay
int g() { return f(); }
namespace abbrev {              // error: cannot use alias here
  int h();
}
A using declaration brings a name into a different namespace. The new name is a synonym for the original name. Only the declared name is added to the new namespace, which means using an enumerated type does not use all the enumerated literals. If you want to use all the literals, each one requires its own using declaration.
Because the name is added to the current namespace, it might hide names in outer scopes, and you might not be able to declare any other objects with that name in the same scope.
Example 3-13 shows some examples of using declarations.
Example 3-13: Using the using declaration.
namespace numeric {
  class fraction { ... };
  fraction operator+(int, const fraction&);
  fraction operator+(const fraction&, int);
  fraction operator+(const fraction&, const fraction&);
}

namespace eqn {
  class fraction { ... };
  fraction operator+(int, const fraction&);
  fraction operator+(const fraction&, int);
  fraction operator+(const fraction&, const fraction&);
}

int main()
{
  numeric::fraction nf;
  eqn::fraction qf;

  nf = nf + 1;            // okay: calls numeric::operator+
  qf = 1 + qf;            // okay: calls eqn::operator+
  nf = nf + qf;           // error: no operator+

  using numeric::fraction;
  fraction f;
  f = nf + 2;             // okay
  f = qf;                 // error: type mismatch
  using eqn::fraction;    // error: like trying to declare
                          // fraction twice in the same scope
  if (f > 0) {
    using eqn::fraction;  // okay: hides outer fraction
    fraction f;           // okay: hides outer f
    f = qf;               // okay: same types
    f = nf;               // error: type mismatch
  }
  int fraction;           // error: name fraction in use
}
You can also copy names from one namespace to another with a using declaration. Suppose you refactor your program, and you realize that the numeric::fraction class has all the functionality you need in the equation package. You decide to use numeric::fraction instead of eqn::fraction, but you want to keep the eqn interface the same. So you insert "using numeric::fraction;" in the eqn namespace.
Incorporating a name into a namespace with a using declaration is not quite the same as declaring the name normally. The new name is just a synonym for the original name in its original namespace. When the compiler searches namespaces under argument-dependent name lookup, it searches the original namespace. Example 3-14 shows how the results can be surprising if you are not aware of the using declaration.
Example 3-14: Creating synonym declarations with using declarations.
namespace eqn {
  using numeric::fraction;
  // Big, ugly declaration for ostream << fraction
  template<typename charT, typename traits>
  std::basic_ostream<charT,traits>& operator<<(
    std::basic_ostream<charT,traits>& out, const fraction& f)
  {
    out << f.numerator() << '/' << f.denominator();
    return out;
  }
}

int main()
{
  eqn::fraction qf;
  numeric::fraction nf;
  nf + qf;          // okay because the types are the same
  std::cout << qf;  // error: numeric namespace is searched
                    // for operator<<, but not eqn namespace
}
The using declaration can also be used within a class. You can add names to a derived class from a base class, possibly changing the accessibility. For example, a derived class can promote a protected member to public visibility. Another use for using declarations is for private inheritance, promoting specific members to protected or public visibility. For example, the standard container classes are not designed for public inheritance. Nonetheless, in a few cases, it is possible to derive from them successfully. Example 3-15 shows a crude way to implement a container type to represent a fixed-size array. The array class template derives from std::vector using private inheritance. A series of using declarations make selected members public, keeping private members that are meaningless for a fixed-size container, such as insert.
Example 3-15: Using declarations in classes.
#include <cstddef>
#include <vector>

template<typename T>
class array: private std::vector<T>
{
public:
  typedef T value_type;
  // typename tells the compiler that each dependent name is a type
  using typename std::vector<T>::size_type;
  using typename std::vector<T>::difference_type;
  using typename std::vector<T>::iterator;
  using typename std::vector<T>::const_iterator;
  using typename std::vector<T>::reverse_iterator;
  using typename std::vector<T>::const_reverse_iterator;

  array(std::size_t n, const T& x = T()) : std::vector<T>(n, x) {}
  using std::vector<T>::at;
  using std::vector<T>::back;
  using std::vector<T>::begin;
  using std::vector<T>::empty;
  using std::vector<T>::end;
  using std::vector<T>::front;
  using std::vector<T>::operator[];
  using std::vector<T>::rbegin;
  using std::vector<T>::rend;
  using std::vector<T>::size;
};
A using directive adds a namespace to the list of scopes to use when the compiler searches for a name's declaration. Unlike a using declaration, no names are added to the current namespace. Instead, the used namespace is added to the list of namespaces to search, right after the innermost namespace that contains the current and used namespaces. The using directive is transitive, so if namespace A uses namespace B, and namespace C uses namespace A, a name search in C also searches B. Example 3-16 illustrates the using directive.
Example 3-16: Using the using directive.
#include <iostream>
#include <ostream>

namespace A {
  int x = 10;
}
namespace B {
  int y = 20;
}
namespace C {
  int z = 30;
  using namespace B;
}
namespace D {
  int z = 40;
  using namespace B;  // harmless but pointless because
  int y = 50;         // D::y hides B::y
}

int main()
{
  int x = 60;
  using namespace A;     // does not introduce names,
                         // so there is no conflict with x
  using namespace C;
  using namespace std;   // to save typing std::cout
                         // repeatedly
  cout << x << '\n';     // prints 60 (local x)
  cout << y << '\n';     // prints 20
  cout << C::y << '\n';  // prints 20
  cout << D::y << '\n';  // prints 50

  using namespace D;
  cout << y << '\n';     // error: y is ambiguous: it can be
                         // found in D::y and C's use of B::y.
}
Namespaces have no runtime cost. Don't be afraid to use them, especially in large projects where many people contribute code and might accidentally devise conflicting names. This section presents some additional tips and suggestions for using namespaces.
Never place a using directive in a header. It can create name collisions with any user of the header.
Keep using directives local to functions to save typing and enhance clarity.
Use using namespace std outside functions only for tiny programs or for backward compatibility in legacy projects.
An object (variable or constant) declaration has two parts: specifiers and a list of declarators. Each declarator has a name and an optional initializer.
Each declaration begins with a list of specifiers. The specifiers can contain a storage class, const and volatile qualifiers, and the type, in any order.
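The storage class is optional. For function parameters and local variables in a function, the default is auto. For declarations at namespace scope, the default is an object with static lifetime and external linkage (or internal linkage if the object is const). (There is no explicit storage class for such a declaration.) If you use a storage class specifier, you must choose only one of the following:
auto
  Denotes an automatic variable, that is, one whose lifetime ends when the enclosing block exits. The auto specifier is the default for function parameters and local variables, so it is rarely used explicitly.
extern
  Denotes an object with external linkage, which might be defined in a different source file. A function parameter cannot be extern.
mutable
  Denotes a data member that can be modified even if the containing object is const. See Chapter 7 for more information.
register
  Denotes an automatic variable with a hint to the compiler that the variable should be kept in a register. Many compilers ignore the register storage class because the compilers are better than people at determining which variables belong in registers.
static
  Denotes an object with static lifetime. At namespace scope, it also gives the name internal linkage. A function parameter cannot be static.
The const and volatile specifiers are optional and can be used in any order. The const and volatile keywords can be used in other parts of a declaration, so they are often called qualifiers as a more general term than specifiers; for brevity, they are often referred to as cv qualifiers.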
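const
  Denotes an object that cannot be modified.
volatile
  Denotes an object whose value might change in ways the compiler cannot detect, so the compiler must not optimize away any access to an object that is volatile.
Every object must have a type, in the form of a list of type specifiers. The type specifiers can describe a fundamental type, a class type, an enumerated type, or the name of a type (class, enumeration, or typedef). Read about fundamental and enumerated types later in this chapter. Class types are covered in Chapter 7.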
You can declare a class or enumerated type in the same declaration as an object declaration, although the custom is to declare the type separately, then use the type name in a separate object declaration. For example,
enum color { red, black } node_color;
// or
enum color { red, black };
color node_color;
Specifiers can appear in any order, but the convention is to list the storage class first, followed by cv qualifiers, followed by the type specifiers. For example,
extern const long int mask;  // conventional
int extern long const mask;  // valid, but strange
The convention for types that require multiple keywords is to place the base type last, and put the modifiers in front. For example,
unsigned long int x;  // conventional
int unsigned long y;  // valid, but strange
long double a;        // conventional
double long b;        // valid, but strange
A declarator declares a single object within a declaration. A declarator contains the name being declared, additional type information (for pointers, references, and arrays), and an optional initializer. Use commas to separate multiple declarators in a declaration.
An array is declared with a constant size in square brackets. The array size is fixed for the lifetime of the object and cannot change. (For an array-like container whose size can change at runtime, see <vector> in Chapter 13.) To declare a multidimensional array, use a separate set of square brackets for each dimension. For example,
int point[2];
double matrix[3][4];  // A 3 x 4 matrix
You can omit the array size if there is an initializer; the number of initial values determines the size. In a multidimensional array, you can omit the first (left-most) size. For example,
int data[] = { 42, 10, 3, 4 };  // data[4]
int identity[][3] = { { 1,0,0 }, {0,1,0}, {0,0,1} };
char str[] = "hello";           // str[6], with trailing \0
In a multidimensional array, all elements are stored contiguously, with the rightmost index varying fastest (row major order).
When used as a function parameter, the size is ignored, and the type is actually a pointer type, which is the subject of the next section. For a multidimensional array used as a function parameter, the first dimension is ignored, so the type is a pointer to an array.
A pointer object stores the address of another object. A pointer is declared with a leading asterisk, optionally followed by cv qualifiers, and then the object name and optional initializer. For example,
int x;
int *p;                      // pointer to int
int * const cp = &x;         // const pointer to int
const int * pc;              // pointer to const int
const int * const cpc = &x;  // const pointer to const int
int **pp;                    // pointer to pointer to int
The cv qualifier that follows the asterisk applies to the pointer object. That is, a const pointer must be initialized and cannot be the target of an assignment. The cv qualifiers in the declaration's specifiers determine the type of the pointer's target. A pointer to const, for example, can change value (that is, it can be made to point to a different object), but it cannot be used to modify the object to which it points. You can have pointers to pointers, as deep as you want.
When a function parameter is declared with an array type, the actual type is a pointer, and at runtime, the address of the first element of the array is passed to the function. You can use array syntax, but the size is ignored. For example,
int sum(int data[], size_t n);       // these two declarations
int sum(int* data, size_t n);        // mean the same thing
void transpose(double matrix[][3]);  // only the first
                                     // size can be omitted
A useful convention is that parameters that are used in an array-like fashion (value does not change, dereferenced with the [] operator) within the function are declared with array syntax. Parameters that are used in a pointer-like fashion (value changes, dereferenced with the * operator) are declared with pointer syntax.
A function pointer is declared with an asterisk to denote a pointer, and the function signature (parameter types and optional names). The name and asterisk must be enclosed in parentheses, so the asterisk is not interpreted as part of the return type. An optional exception specification can follow the signature. See Chapter 6 for more information about function signatures and exception specifications.
Declaring an object with a function pointer type can be hard to read, so typically, you would declare the type separately with a typedef declaration, and then declare the object using the typedef name, as shown in Example 3-17.
Example 3-17: Simplifying declarations with typedef.
// Declare an array named fp, of 10 elements, where each
// element is a pointer to a function that returns int*
// and takes two parameters: the first of type pointer
// to function that takes an int* and returns int*,
// and the second of type int.
int* (*fp[10])(int*(*)(int*), int);

// Declare a type for pointer to int.
typedef int* int_ptr;
// Declare a function pointer type for a function that
// takes an int_ptr parameter and returns an int_ptr.
typedef int_ptr (*int_ptr_func)(int_ptr);
// Declare a function pointer type for a function that
// returns int_ptr and takes two parameters: the first
// of type int_ptr_func and the second of type int.
typedef int_ptr (*func_ptr)(int_ptr_func, int);
// Declare an array of 10 func_ptrs. It has the same type
// as fp, so it needs a different name in the same scope.
func_ptr fp2[10];
Pointers to members (data and functions) work differently from other pointers. The syntax requires a class name before the asterisk. Pointers to members can never be cast to ordinary pointers, and vice versa. You cannot declare a reference to a member. See Chapter 4 for information about expressions that dereference pointers to members. A pointer to a static data member is an ordinary pointer, not a member pointer. Example 3-18 shows some declarations of pointers to members.
Example 3-18: Declaring pointers to members.
class simple {
public:
  int data;
  static int num_instances;
  int func(int);
};
int *static_ptr = &simple::num_instances;
int simple::* p = &simple::data;
int (simple::*fp)(int) = &simple::func;
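To show the declarations in use, here is a small sketch that dereferences each kind of pointer. It assumes a version of class simple with a constructor and a body for func, both added only for the example.

```cpp
#include <cassert>

class simple {
public:
    simple() : data(0) {}
    int data;
    static int num_instances;
    int func(int x) { return data + x; }
};
int simple::num_instances = 0;

int use_member_pointers()
{
    int* static_ptr = &simple::num_instances; // ordinary pointer
    int simple::* p = &simple::data;          // pointer to data member
    int (simple::*fp)(int) = &simple::func;   // pointer to member function

    simple s;
    simple* sp = &s;
    s.*p = 40;             // sets s.data through the member pointer
    assert(s.data == 40);
    *static_ptr = 1;       // sets simple::num_instances
    return (sp->*fp)(2);   // calls sp->func(2)
}
```

A pointer to member does not refer to any particular object; the object is supplied at the point of use with the .* or ->* operator.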
A reference is a synonym for another object. A reference is declared with a leading ampersand followed by the object name and initializer. For example,
int x;
const int c = 0;
int& r = x;         // reference to int
const int& rc = c;  // reference to const int
int& const cr = x;  // error: no cv-qualified references
int &&rr;           // error: no reference to a reference
A reference, unlike a pointer, cannot be made to refer to a different object. Assignments to a reference are just like assignments to the referenced object.
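A short sketch makes the point concrete. The function name is invented for the example.

```cpp
#include <cassert>

void demo_reference_assignment()
{
    int x = 1;
    int y = 2;
    int& r = x;        // r is another name for x
    r = y;             // assigns the value of y to x; r still refers to x
    assert(x == 2);
    assert(&r == &x);  // taking the address of r yields the address of x
    y = 3;
    assert(r == 2);    // changing y has no effect on r
}
```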
References are often used as function parameters. For example, the standard library has the div function, which divides two integers and returns the quotient and remainder in a struct. Suppose you would rather have the function return the results as arguments. Example 3-19 shows one way to do this.
Example 3-19: Returning results in function arguments.
#include <cstdlib>
#include <iostream>
#include <ostream>

template<typename T>
void div(T num, T den, T& quo, T& rem)
{
  std::div_t result = std::div(num, den);
  quo = result.quot;
  rem = result.rem;
}

template<>
void div<long>(long num, long den, long& quo, long& rem)
{
  std::ldiv_t result = std::div(num, den);
  quo = result.quot;
  rem = result.rem;
}

int main()
{
  int quo, rem;
  div(42, 5, quo, rem);
  std::cout << quo << " remainder " << rem << '\n';
}
A common idiom is to use a const reference for function parameters, especially for large objects. Function arguments are passed by value in C++, which requires copying the argument. Copying can be costly for a large object, so passing a reference performs better. A function that modified the object through the reference, however, would violate the pass-by-value convention, so the reference is declared const, which prevents the function from modifying the object. In this way, pass-by-value semantics are preserved, with the improved performance of pass-by-reference. The standard library often makes use of this idiom. For example, operator<< for std::string uses a const reference to the string, to avoid making unnecessary copies of the string. (See <string> in Chapter 13 for details.)
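The idiom might look like the following sketch, in which total_length is a hypothetical function that reads a potentially large vector without copying it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pass by const reference: no copy is made, and the function
// cannot modify the caller's vector.
std::size_t total_length(const std::vector<std::string>& words)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < words.size(); ++i)
        total += words[i].size();
    return total;
}

void demo_const_reference()
{
    std::vector<std::string> words;
    words.push_back("pass");
    words.push_back("by");
    words.push_back("reference");
    assert(total_length(words) == 15);
    // A const reference can also bind to a temporary object.
    assert(total_length(std::vector<std::string>(3, "ab")) == 6);
}
```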
A reference must be initialized to refer to an object. Data members must be initialized in the constructor's initializer list; function parameters are initialized in the function call. All other definitions must have an initializer. (An extern declaration is not a definition, so it doesn't take an initializer.)
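The following sketch shows the three cases side by side. The class counter_view and the variables shared_value and shared_ref are made-up names for the example.

```cpp
#include <cassert>

class counter_view {
public:
    // A reference member must be bound in the initializer list;
    // it cannot be assigned in the constructor body.
    explicit counter_view(int& counter) : count_(counter) {}
    void increment() { ++count_; }
    int value() const { return count_; }
private:
    int& count_;
};

extern int& shared_ref;          // declaration only: no initializer
int shared_value = 10;
int& shared_ref = shared_value;  // definition: initializer required

void demo_reference_initialization()
{
    int n = 0;
    counter_view view(n);   // the parameter is bound in the call
    view.increment();
    view.increment();
    assert(n == 2 && view.value() == 2);
    shared_ref = 11;
    assert(shared_value == 11);
}
```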
You cannot declare a reference to a reference, a pointer to a reference, or an array of references. This poses an additional challenge for template authors. For example, you cannot store references in a container because a number of container functions explicitly declare their parameters as references to the container's value type. (Try using std::vector<int&> with your compiler, and see what happens. You should see a lot of error messages.)
One solution is to write a wrapper template, call it rvector<typename T>, and specialize the template (rvector<T&>) so references are stored as pointers, with all the access functions hiding the difference. This approach requires you to duplicate the entire template, which is tedious. Instead, you can encapsulate the specialization in a traits template (refer to Chapter 9 for more information about traits), as shown in Example 3-20.
Example 3-20: Encapsulating reference traits.
// REF type trait encapsulates reference type,
// and mapping to and from the type for use in a container.
template<typename T>
struct REF {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T container_type;
  static reference from_container(reference x) { return x; }
  static const_reference from_container(const_reference x) { return x; }
  static reference to_container(reference x) { return x; }
};

template<typename T>
struct REF<T&> {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T* container_type;
  static reference from_container(pointer x) { return *x; }
  static const_reference from_container(const_pointer x) { return *x; }
  static pointer to_container(reference x) { return &x; }
};

// rvector<> is similar to vector<>, but allows references
// by storing references as pointers.
template<typename T, typename A=std::allocator<T> >
class rvector {
  typedef typename REF<T>::container_type container_type;
  typedef typename std::vector<container_type> vector_type;
public:
  typedef typename REF<T>::value_type value_type;
  typedef typename REF<T>::reference reference;
  typedef typename REF<T>::const_reference const_reference;
  typedef typename vector_type::size_type size_type;
  ... // other typedefs are similar
  class iterator { ... }; // wraps a vector<>::iterator
  class const_iterator { ... };
  ... // constructors pass arguments to v
  iterator begin() { return iterator(v.begin()); }
  iterator end() { return iterator(v.end()); }
  void push_back(typename REF<T>::reference x)
    { v.push_back(REF<T>::to_container(x)); }
  reference at(size_type n)
    { return REF<T>::from_container(v.at(n)); }
  reference front()
    { return REF<T>::from_container(v.front()); }
  const_reference front() const
    { return REF<T>::from_container(v.front()); }
  ... // other members are similar
private:
  vector_type v;
};
An initializer supplies an initial value for the object being declared. When you declare a reference or a const object, you must supply an initializer for local and global variables, but not for data members, function parameters, or extern declarations.
The two forms of initializers are assignment-like and function-like. An assignment-like initializer starts with an equal sign, followed by an expression or a list of comma-separated expressions in curly braces. A function-like initializer is a list of one or more comma-separated expressions in parentheses. Note that these initializers look like assignment statements or function calls, but they are not. They are initializers. The difference is particularly important for classes (details in Chapter 7). For example,
int x = 42;                        // initializes x with the value 42
int y(42);                         // initializes y with the value 42
int z = { 42 };                    // initializes z with the value 42
int w[4] = { 1, 2, 3, 4 };         // initializes an array
std::complex<double> c(2.0, 3.0);  // calls complex constructor
When initializing a scalar value, the form is irrelevant. The initial value is converted to the desired type using the usual conversion rules (as described in Chapter 4).
Without an initializer, all non-POD class-type objects are initialized by calling their default constructors. (See Chapter 7 for more information about POD and non-POD classes.) All other static objects are initialized to zero, and local objects are left uninitialized. An uninitialized const object is an error.
You must use a function-like initializer when constructing a class whose constructor takes two or more arguments, or when calling an explicit constructor. The usual rules for resolving overloaded functions apply to the choice of overloaded constructors. (See Chapter 6 for more information.)
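The sketch below illustrates these rules for the dialect described in this chapter. The classes meters and rect are hypothetical, defined only to have an explicit constructor and a two-argument constructor to call.

```cpp
#include <cassert>

class meters {
public:
    explicit meters(double value) : value_(value) {}
    double value() const { return value_; }
private:
    double value_;
};

class rect {
public:
    rect(int width, int height) : width_(width), height_(height) {}
    int area() const { return width_ * height_; }
private:
    int width_, height_;
};

double demo_initializers()
{
    meters a(2.5);           // function-like: calls the explicit constructor
    // meters b = 2.5;       // error: explicit constructor is not used
    //                       // for an implicit conversion
    meters c = meters(1.5);  // assignment-like, with an explicit temporary
    rect r(3, 4);            // two arguments: function-like form
    assert(r.area() == 12);
    return a.value() + c.value();
}
```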
Empty parentheses cannot be used as an initializer in an object's declaration, but can be used in other initialization contexts (namely, a constructor initializer list or as a value in an expression). If the type is a class type, the default constructor is called; otherwise, the object is initialized to zero.
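A brief sketch of the contexts where empty parentheses are permitted follows. The struct holder is an invented name.

```cpp
#include <cassert>

struct holder {
    holder() : value() {}   // initializer list: value is set to zero
    int value;
};

int demo_empty_parentheses()
{
    int zero = int();   // empty parentheses as a value in an expression
    // int n();         // would declare a function named n, not an object
    holder h;
    assert(zero == 0);
    return zero + h.value;
}
```

The commented-out line is the reason for the restriction: in a declaration, empty parentheses after the name are parsed as an empty parameter list.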
In an assignment-like initializer, if the object is of class type, the value to the right of the equal sign is converted to a temporary of the desired type and the object is constructed by calling its copy constructor.
The generic term for an array or POD object is aggregate because it aggregates multiple values into a single object. To initialize an aggregate, you can supply multiple values in curly braces, as described in the following sections.
Chapter 7 has the complete definition of a POD (plain old data) class. Briefly, a POD object is one that can be copied bit-for-bit (no copy assignment operator, and no non-POD members). To initialize a POD object, you can supply an initial value for each non-static data member, separated by commas, enclosed in curly braces. For nested objects, use nested curly braces. Values are associated with members in order of declaration of the members. If there are more values than members, it is an error. If there are fewer values than members, the members without values are initialized by calling the default constructor or initializing to zero.
An initializer list can be empty, which means all members are initialized to the default. This is different from omitting the initializer entirely, which leaves all members of a local object uninitialized. For example,
struct point { double x, y, z; };
point origin = { };      // all members initialized to 0.0
point unknown;           // uninitialized, value is not known
point pt = { 1, 2, 3 };  // pt.x==1.0, pt.y==2.0, pt.z==3.0
struct line { point p1, p2; };
line vec = { { }, { 1 } }; // vec.p1 is all zero
                           // vec.p2.x==1.0, vec.p2.y==0.0, vec.p2.z==0.0
Initialize elements of an array with values, separated by commas, enclosed in curly braces. Multi-dimensional arrays can be initialized by nesting sets of curly braces. It is an error if there are more values than elements in the array; if the initializer has fewer values, the remaining elements in the array are initialized to default values (default constructors or zero). If the declarator omits the array size, the size is determined by counting the number of values in the initializer.
The initializer can be empty, to force all elements to be initialized to the default. Omitting the initializer entirely causes all elements of the array to be uninitialized.
When initializing a multi-dimensional array, you can flatten the curly braces and initialize elements of the array in row-major order (last index varies fastest).
For example,
int vector[] = { 1, 2, 3 }; // array of three elements
                            // vector[0]==1 ... vector[2]==3
int zero[4] = { };          // initialize to all zeros
// Initialize id1 and id2 to the identity matrix.
int id1[3][3] = { { 1 }, { 0, 1 }, { 0, 0, 1 } };
int id2[3][3] = { 1, 0, 0, 0, 1, 0, 0, 0, 1 };
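One way to convince yourself that the nested and flattened forms produce the same matrix is to compare the arrays element by element, as in this sketch. The names values, values_size, and same_matrix are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

int values[] = { 1, 2, 3 };   // size deduced from the initializer
int id1[3][3] = { { 1 }, { 0, 1 }, { 0, 0, 1 } };
int id2[3][3] = { 1, 0, 0, 0, 1, 0, 0, 0, 1 };

// Number of elements in values, computed at compile time.
const std::size_t values_size = sizeof(values) / sizeof(values[0]);

// Compare two matrices that have three columns each.
bool same_matrix(const int a[][3], const int b[][3], std::size_t rows)
{
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            if (a[i][j] != b[i][j])
                return false;
    return true;
}

void demo_array_initializers()
{
    assert(values_size == 3);
    assert(same_matrix(id1, id2, 3));
    assert(id1[0][1] == 0);   // omitted elements are zero
}
```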
An array of char or wchar_t is special because you can initialize it with a string literal. Remember that every string literal has an implicit null character at the end. For example,
// The following two declarations are equivalent.
char str1[] = "Hello";
char str2[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
wchar_t ws1[] = L"Hello";
wchar_t ws2[] = { L'H', L'e', L'l', L'l', L'o', L'\0' };
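The implicit null character affects the deduced array size, which the following sketch checks with sizeof. The array no_null is an invented counterexample that deliberately leaves no room for the terminator.

```cpp
#include <cassert>
#include <cstring>

char str1[] = "Hello";   // six elements: five letters plus '\0'
char no_null[5] = { 'H', 'e', 'l', 'l', 'o' };  // no '\0': not a C string

void demo_string_arrays()
{
    assert(sizeof(str1) == 6);
    assert(std::strlen(str1) == 5);
    assert(str1[5] == '\0');
    assert(sizeof(no_null) == 5);
    assert(std::memcmp(str1, no_null, 5) == 0);  // same five letters
}
```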
The last expression in an initializer list can be followed by a comma. This is convenient when you maintain software and often need to change the order of items in the initializer list, because you don't need to treat the last element differently from the other elements. For example,
const std::string keywords[] = {
  "and",
  "asm",
  ...
  "while",
  "xor",
};
Every object has linkage, which determines how the compiler and linker associate object references with the object definition. Linkage has two aspects: scope and language. Scope linkage dictates which scopes have access to an entity. Language linkage dictates an entity's properties that depend on programming language.
Scope linkage can be external, internal, or none:
External linkage: Declarations with the extern specifier have external linkage, as do entities declared at namespace scope (outside of functions and classes) that do not have internal linkage.
Internal linkage: static declarations have internal linkage, as do const declarations that are not also extern. Data members of anonymous unions have internal linkage.
No linkage: Local declarations that are not extern have no linkage.

Every entity has a language linkage, which is a simple character string. By default, the linkage is "C++". The only other standard language linkage is "C". All other language linkages and the properties associated with different language linkages are implementation-defined.
You can specify a language linkage for a single declaration or for a series of declarations in curly braces:
extern "C" {
  void cfunction(int);
  typedef void (*cfunc)(int);
}
extern "C++" cfunc cf = cfunction;
// The variable cf has C++ linkage. Its value is a pointer
// to function that has C linkage.
C does not support function overloading, so there can be at most one function with C linkage of a given name. Even if you declare the C function in two different namespaces, both declarations refer to the same function, for which there must be a single definition.
Typically, C linkage is used for external functions that are written in C (such as the C standard library), but that you want to call from a C++ program. C++ linkage is used for native C++ code. Sometimes, though, you want to write a function in C++ that can be called from C; in that case, you should declare the C++ function with C linkage.
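A function written in C++ for use from C might look like the following sketch. The function count_spaces and the typedef c_counter are hypothetical; the point is that the interface uses only types that C understands, while the body is free to use C++.

```cpp
#include <cassert>
#include <string>

// C linkage, so a C program can call this function by its plain name.
extern "C" int count_spaces(const char* text)
{
    std::string s(text);   // C++ is fine inside the body
    int count = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        if (s[i] == ' ')
            ++count;
    return count;
}

// Declare the pointer type inside extern "C" so the function type
// it points to also has C linkage.
extern "C" {
    typedef int (*c_counter)(const char*);
}
c_counter counter = count_spaces;

void demo_c_linkage()
{
    assert(count_spaces("a b c") == 2);
    assert(counter("no-spaces") == 0);
}
```

On the C side, the matching declaration would simply be int count_spaces(const char*); with no extern "C", which is not valid C syntax.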
An implementation might support other language linkages. It is up to the implementation to define the properties of each language: how parameters are passed to functions, how values are returned from functions, whether and how function names are altered, and so on. In many C++ implementations, a function with C++ linkage has a "mangled" name, that is, the external name encodes the function name and the types of all its arguments. So the function strlen(const char*) might have an external name of strlen__FCcP, which makes it hard to call the function from a C program, which does not know about C++ name mangling rules. Using C linkage, the compiler does not mangle the name, exporting the function under the plain name of strlen, which can be called easily from C.
One of the hallmarks of C++ is that you can define a type that seems just like any built-in type. Thus, if you need to, say, define a type that supports arbitrary-sized integers, call it bigint, you can do so, and programmers can use bigint objects the same way they use int objects.
The user-defined types are classes and enumerations. Enumerations are described later in this section.
You can also declare a typedef, which is a synonym for an existing type. Note that the name typedef seems to be a shorthand for "type definition" but it is actually a type declaration.
This section lists the fundamental types that are built into the C++ language. Types that require multiple keywords (e.g., unsigned long int) can mix the keywords in any order, but the order shown below is the conventional order. If the type specifier requires multiple words, one of which is int, the int can be omitted. If the type is signed, the signed keyword can be omitted (except for signed char).
bool: A Boolean (logical) type whose value is true or false.
char: A narrow character type; ordinary character literals have type char. All the character types (char, signed char, and unsigned char) share a common size and representation. By definition, char is the smallest fundamental type. A char is signed or unsigned, depending on the implementation.
double: A floating-point type with at least the range and precision of float. A floating point literal has type double unless you use the F or L suffix.
float
long double: A floating-point type with at least the range and precision of double.
signed char
signed int
signed long int: A signed integer whose range is at least as large as that of int.
signed short int: A signed integer; the range of an int is at least as large as the range of a short.
unsigned char
unsigned long int
unsigned short int
void: Represents the absence of a value. You cannot declare an object of type void, but you can declare a function that "returns" void (that is, does not return a value), or declare pointers to void.
wchar_t: A wide character type; wide character literals have type wchar_t.

The representation of the fundamental types is implementation-defined. The integral types (bool, char, wchar_t, int, etc.) require a binary representation: signed-magnitude, one's complement, or two's complement. Some types have alignment restrictions, which are implementation-defined. (Note that new expressions always return pointers that are aligned for any type.)
The signed and unsigned variants of a given type always occupy the same amount of storage and have the same alignment requirements. The non-negative values of the signed type are always a subset of the values supported by the unsigned type, and have the same bit representation.

The unsigned types always use arithmetic modulo $2^n$, where n is the number of bits in the type.
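Modular arithmetic means that unsigned values wrap around instead of overflowing, as this small sketch demonstrates. The function and variable names are chosen for the example.

```cpp
#include <cassert>
#include <limits>

void demo_unsigned_wraparound()
{
    const unsigned int largest = std::numeric_limits<unsigned int>::max();
    unsigned int u = largest;
    ++u;                     // wraps around to zero
    assert(u == 0);
    --u;                     // wraps back to the largest value
    assert(u == largest);
    assert(0u - 1u == largest);
    // The signed and unsigned variants occupy the same storage.
    assert(sizeof(unsigned int) == sizeof(int));
}
```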
See the <limits> header in Chapter 13 to determine the numerical limits of each fundamental type.
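As a small taste of that header, the following sketch queries two implementation-defined properties through std::numeric_limits. The helper function names are invented for the example.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// True if plain char is a signed type on this implementation.
bool char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// Number of value bits in an int, not counting the sign bit.
int int_value_bits()
{
    return std::numeric_limits<int>::digits;
}

void demo_limits()
{
    assert(std::numeric_limits<int>::max() == INT_MAX);
    assert(char_is_signed() == (CHAR_MIN < 0));
    assert(int_value_bits() >= 15);   // int has at least 16 bits
}
```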
An enumerated type declares an optional type name (the enumeration) and a set of zero or more identifiers (enumerators). Each enumerator is a constant whose type is the enumeration. For example,
enum logical { no, maybe, yes };
logical is_permitted = maybe;
enum color { red=1, green, blue=4 };
const color yellow = static_cast<color>(red | green);
enum zeroes { a, b = 0, c = 0 };
You can optionally specify the value of an enumerator after an equal sign (=). The value can be an integer or an enumeration. The default value of the first enumerator is zero. The default value for subsequent enumerators is one more than the value of the previous enumerator (regardless of whether that value was explicitly specified). Enumerators can have duplicate values in a single enumeration declaration.
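These rules can be checked directly, as in the sketch below. The enumeration level is a made-up example of an enumerator whose value is computed from an earlier enumerator.

```cpp
#include <cassert>

enum color { red = 1, green, blue = 4 };      // green is 2
enum level { low, medium = low + 5, high };   // 0, 5, and 6
enum zeroes { a, b = 0, c = 0 };              // duplicates are allowed

const color yellow = static_cast<color>(red | green);

void demo_enumerator_values()
{
    assert(green == 2);
    assert(low == 0 && medium == 5 && high == 6);
    assert(a == b && b == c);
    assert(yellow == 3);   // a valid value without an enumerator
}
```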
Each enumeration has an underlying integral type that can store all of the enumerator values. The actual type is implementation-defined, so the size of an enumerated type is implementation-defined.
An enumerated type is a distinct type, separate from the integer types. Enumerated values convert implicitly to integers, but integers cannot be implicitly converted to an enumerated type. Instead, you can use static_cast<> to cast an integer to an enumeration or to cast from one enumeration to a different enumeration. (See Chapter 4 for details.)
The range of values for an enumeration is defined by the smallest and largest bitfields that can hold all of its enumerators. In more precise terms, let the smallest and largest values of the enumerated type be $v_{min}$ and $v_{max}$, and let the smallest and largest enumerators be $e_{min}$ and $e_{max}$. Using two's complement representation (the most common integer format), $v_{max}$ is the smallest value of the form $2^n - 1$ such that $v_{max} \ge \max(|e_{min}| - 1, |e_{max}|)$. If $e_{min}$ is not negative, $v_{min} = 0$; otherwise $v_{min} = -(v_{max} + 1)$.
In other words, the range of values for an enumerated type can be larger than the range of enumerator values, but the exact range depends on the representation of integers on the host platform, and so is implementation-defined. All values between the largest and smallest enumerators are always valid, even if they do not have corresponding enumerators.
For example, consider the following enumerations:
enum sign { neg=-1, zero=0, pos=1 };
enum iostate { goodbit=0, failbit=1, eofbit=2, badbit=4 };
The enumeration sign has the range (in two's complement) -2 to 1. Your program might not assign any meaning to static_cast<sign>(-2), but it is semantically valid in a program.
The type iostate is designed to be a bitmask, where the enumerators can be combined using the bitwise operators. The range of values is 0 to 7. The enumeration can clearly fit in a char, but the implementation is free to use int, short, char, or the unsigned flavors of these types as the underlying type. (The standard library has an iostate type, and can implement it as this enumeration, but is free to choose a different implementation. See the <ios> section in Chapter 13 for more information.)
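The two ranges quoted above can be reproduced by coding the two's complement rule directly. The following is only a sketch of that arithmetic, not something a compiler exposes; enum_range and range_of are invented names, and overflow for very large enumerator values is ignored.

```cpp
#include <cassert>
#include <cstdlib>

struct enum_range { long vmin, vmax; };

// Compute the range of an enumeration from its smallest (emin)
// and largest (emax) enumerator values, assuming two's complement.
enum_range range_of(long emin, long emax)
{
    long a = std::labs(emin) - 1;
    long b = std::labs(emax);
    long needed = a > b ? a : b;
    long vmax = 0;                 // 2^0 - 1
    while (vmax < needed)
        vmax = 2 * vmax + 1;       // next value of the form 2^n - 1
    enum_range r;
    r.vmax = vmax;
    r.vmin = emin < 0 ? -(vmax + 1) : 0;
    return r;
}

void demo_enum_range()
{
    enum_range s = range_of(-1, 1);   // enum sign
    assert(s.vmin == -2 && s.vmax == 1);
    enum_range io = range_of(0, 4);   // enum iostate
    assert(io.vmin == 0 && io.vmax == 7);
}
```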
Because enumerations are distinct types, it is up to the programmer to decide which operations are permitted. Of course, you can use any integer operations, and cast the result back to the enumerated type, but in some cases, you might want to overload certain operators to avoid the inconvenience of type casting. For example, you might want to overload the bitwise operators, but not the arithmetic operators, for the iostate type. The sign type does not need any additional operators; the comparison operators work just fine by implicitly converting sign values to integers. Other enumerations might call for overloading the ++ and -- operators (similar to the succ and pred functions in Pascal). How you handle overflow and underflow is up to you. Example 3-21 shows how operators can be overloaded for enumerations.
Example 3-21: Overloading operators for enumerations.
// Explicitly cast to int, to avoid infinite recursion.
inline iostate operator|(iostate a, iostate b)
{
  return iostate(int(a) | int(b));
}
inline iostate& operator|=(iostate& a, iostate b)
{
  a = a | b;
  return a;
}
// repeat for &, ^, ~

int main()
{
  iostate err = goodbit;
  if (error())
    err |= badbit;
}
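For the ++ operator, one possible treatment of overflow is to wrap around to the first enumerator. The sketch below takes that approach for a hypothetical weekday enumeration; other policies, such as stopping at the last value, are equally valid.

```cpp
#include <cassert>

enum weekday { monday, tuesday, wednesday, thursday, friday };

// Prefix ++: advance to the next day, wrapping from friday to monday.
inline weekday& operator++(weekday& d)
{
    d = (d == friday) ? monday
                      : static_cast<weekday>(static_cast<int>(d) + 1);
    return d;
}

// Postfix ++: advance, but return the old value.
inline weekday operator++(weekday& d, int)
{
    weekday old = d;
    ++d;
    return old;
}

void demo_enum_increment()
{
    weekday d = thursday;
    assert(++d == friday);
    assert(d++ == friday);
    assert(d == monday);   // wrapped around
}
```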
A typedef declares a synonym for an existing type. Syntactically, typedef is a specifier in a declaration, and it must be combined with type specifiers and optional cv qualifiers (no storage class specifiers). After the specifiers comes the list of declarators.
The declarator of a typedef declaration is similar to that for an object declaration (as described earlier in this chapter), except you cannot have an initializer. Following are some examples of typedef declarations:
typedef double matrix[3][3];
typedef void (*thunk)();
typedef signed char SCHAR;
By convention, the typedef keyword appears before the type specifiers. For example,
typedef unsigned int UINT;    // conventional
long typedef unsigned ULONG;  // valid, but strange
A typedef is especially helpful with complex declarations, such as function pointers. It can also provide helpful information for the person who must read and maintain the code. Example 3-17, earlier in this chapter, has examples of how to use typedef to simplify declarations and make them easier to read.
A typedef does not create a new type, the way class and enum do. It simply declares a new name for an existing type. Therefore, function declarations where the parameters differ only as typedefs are not actually different declarations, as shown in the following example:
typedef unsigned int UINT;
UINT func(UINT);          // two declarations of the
unsigned func(unsigned);  // same function
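One consequence is that a function can be declared with the typedef name and defined with the underlying type, as in this sketch. The function twice and the two pointer variables are hypothetical names.

```cpp
#include <cassert>

typedef unsigned int UINT;

UINT twice(UINT value);          // declared with the typedef name

unsigned twice(unsigned value)   // defined with the underlying type:
{                                // the same function, not an overload
    return 2 * value;
}

// Either spelling of the type names the same function pointer type.
UINT (*by_typedef)(UINT) = twice;
unsigned (*by_keyword)(unsigned) = twice;

void demo_typedef_synonym()
{
    assert(twice(21) == 42);
    assert(by_typedef == by_keyword);
}
```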
Similarly, because you cannot overload an operator on fundamental types, you cannot overload an operator on typedef synonyms for fundamental types. For example,
int operator+(int, int);  // error
typedef int INT;
INT operator+(INT, INT);  // error
C programmers are accustomed to declaring a typedef for struct, union, and enum declarations, but such typedefs are not necessary in C++. In C, the struct, union, and enum namespaces are separate from the type namespace, but in C++, the declaration of a struct, union, class, or enum also adds the type to the type namespace. Nonetheless, such a typedef is harmless. C++ lets you define a type name as a synonym for itself. For example,
struct point { int x, y; };
typedef struct point point;  // not needed in C++, but harmless
point pt;