Draft 2002-09-16

Chapter 3

Declarations

A C++ source file contains a series of zero or more declarations. A declaration can be a function, type, object, namespace, template instantiation or specialization, or a number of similar entities. This chapter mainly discusses namespaces, objects, and types. Functions, classes, templates, and related declarations each get their own chapters: Chapter 6 for functions; Chapter 7 for classes and friends; Chapter 8 for template declarations, specializations, and instantiations.

Declarations and Definitions

A declaration is the all-encompassing term for anything that tells the compiler about an identifier. In order to use an identifier, the compiler must know what it means: is it a type name, a variable name, a function name, or something else? Therefore, a source file must declare every name it uses.

Definitions

A definition provides the body or contents for a declaration, such as a function body or the initial value of a static variable. The difference between a declaration and a definition, briefly, is that a declaration tells the compiler how to use a name, and the definition provides additional information that the program needs to back up that name.

In a single source file, there can be at most one definition of an entity. In an entire program, there must be exactly one definition of each function or object used in the program, except for inline functions, which must be defined in every source file that uses the function.

A program can have more than one definition of certain entities (classes, enumerations, inline functions, and templates), provided the definitions are the same.
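As a minimal sketch of the distinction, the following fragment declares three entities and then defines each one exactly once. The names counter, next_id, and point are hypothetical, invented only for illustration.

```cpp
// Declarations: tell the compiler how each name can be used.
extern int counter;   // object declaration (no storage is allocated here)
int next_id();        // function declaration (no body)
struct point;         // class declaration (incomplete type)

// Definitions: supply the storage, body, or contents.
int counter = 0;                     // object definition
int next_id() { return ++counter; }  // function definition
struct point { int x, y; };          // class definition
```

Each declaration could be repeated any number of times in the same source file; repeating any of the definitions would be an error.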

Note that only declarations are required to use an entity (functions, classes, etc.). In any non-trivial program, the declarations and definitions are kept in separate files. Place declarations in a header file (whose name typically ends with .h or .hpp), so other source files can #include the header file. Place the definitions in a source file (whose name typically ends with .cpp, .cc, or .C). Any source file that needs to use one of the declarations must #include the necessary header files. Compile and link the source files to produce the program.

In this chapter and subsequent chapters, the description of each entity clearly states whether the entity (type, variable, class, function, etc.) has separate definitions and declarations, when definitions are required, and any other rules pertaining to declarations and definitions.

Ambiguity

Some language constructs can look like a declaration or an expression. Such ambiguities are always resolved in favor of declarations. Other declarations can look like function declarations or object declarations. A consequence of the first rule is that these declarations are interpreted as function declarations.

A related rule is that a type specifier followed by a name and empty parentheses is a declaration of a function that takes no arguments, not a declaration of an object with an empty initializer. Example 3-1 shows some examples of how declarations are interpreted.

Example 3-1: Disambiguating declarations.

#include <iostream>
#include <ostream>

class T
{
public:
  T()    { std::cout << "T()\n"; }
  T(int) { std::cout << "T(int)\n"; }
};

int a, x;

int main()
{
  T(a);      // Variable named a of type T,
             // not an invocation of the T(int) constructor.

  T b();     // Function named b of no arguments,
             // not a variable named b of type T.

  T c(T(x)); // Declaration of a function named c,
             // with one argument of type T.
}

The last item deserves further explanation. The function parameter T(x) could be interpreted as an expression: constructing an instance of T with the parameter x, or it can be interpreted as a declaration of a function parameter of type T named x, with a redundant set of parentheses around the parameter name. According to the disambiguation rule, it must be a declaration, not an expression. That means the entire declaration cannot be the declaration of an object named c, whose initializer is the expression T(x). Instead, it must be the declaration of a function named c, whose parameter is of type T, named x.

If your intention is to declare an object, not a function, the simplest way to do this is not to use the function-call style of type cast. Instead, use one of the new type cast expressions, such as static_cast<>, for example,

T c(static_cast<T>(x)); // Declares an object named c
                        // whose initial value is x, cast
                        // to type T.
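Another widely used way to force the object interpretation is to wrap the initializer in an extra pair of parentheses, because a parameter declaration cannot be enclosed in parentheses. The following is a sketch only: it assumes a simplified T that stores its argument in a hypothetical member v, and a hypothetical helper function value_of_c.

```cpp
struct T {
  int v;
  T() : v(0) {}
  T(int i) : v(i) {}
};

int x = 7;

int value_of_c()
{
  T c((T(x)));   // object named c, initialized from the expression T(x)
  return c.v;    // the value of x, converted to T and copied into c
}
```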

Scope

Every declaration has a well-defined scope, a region of source code where any use of the unqualified name (that is, as a plain identifier) refers to that entity. Scope regions can be nested, and the same name can be declared within an outer and an inner scope region. Within the inner scope region, the unqualified name refers to the inner entity. Outside of the inner region, the name refers to the outer entity. Example 3-2 illustrates simple scope rules.

Example 3-2: Names in inner scopes can hide names in outer scopes.

#include <iostream>
#include <ostream>

int main()
{
  for (int i = 0; i < 100; ++i)
  {
    int x = 42;
    if (x < i)
    {
      double x = 3.14;
      std::cout << x; // prints 3.14
    }
    std::cout << x;   // prints 42
  }
  std::cout << x;     // error: no x declared in this scope
}

At the same scope level, you cannot have multiple declarations for the same name, unless every declaration of that name is for an overloaded function or function template, or if the declarations are identical typedef declarations.
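The two exceptions can be sketched as follows. The names length, scale, and doubled are hypothetical and serve only to show which repeated declarations the compiler accepts.

```cpp
typedef int length;
typedef int length;          // okay: identical typedef declaration

int scale(int v)       { return v * 2; }
double scale(double v) { return v * 2.5; }  // okay: overloads scale

// length is a synonym for int, so this calls scale(int).
length doubled(length v) { return scale(v); }
```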

A name can be hidden by a different declaration in an inner scope level. Also, a class name or enumeration type name can be hidden by an object, function, or enumerator at the same scope level. For example,

enum x { a, b, c };
const int x = 42; // okay: hides enum x
const int a = 10; // error: int cannot hide enumerator a
{
  const int a = 10; // okay: inner scope can hide outer a
}
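When an object hides an enumeration name at the same scope level, the type is still reachable through an elaborated type specifier (the enum keyword followed by the name). A small sketch, using the hypothetical names shade, current, and plain_name:

```cpp
enum shade { light, dark };
const int shade = 42;          // okay: hides the type name shade

enum shade current = dark;     // the enum keyword still names the type

int plain_name() { return shade; }  // the plain name is the int object
```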

Different entities (functions, statements, classes, namespaces, etc.) establish scopes in different ways. The description for each entity (in this and subsequent chapters) includes scope details. The general rule of thumb is that curly braces delimit a scope region. The outermost scope region is outside of all curly braces and is called the global scope. Read more about the global scope under the Namespaces section, later in this chapter.

Name Lookup

When the compiler reads an identifier, it must look up the identifier to determine which declaration it comes from. In most cases, you can readily tell which identifier is which, but it is not always so simple. To understand name lookup fully, you must first understand namespaces (covered later in this chapter), functions (Chapter 6), classes (Chapter 7), and templates (Chapter 8).

Name lookup takes place before overload resolution and access-level checking. If a name is found in an inner scope, the compiler uses that declaration, even if a better declaration would be found in an outer scope, as illustrated in Example 3-3. Refer to Chapter 6 for more information about overloaded functions and to Chapter 7 for information about access levels in a class declaration.

Example 3-3: Name lookup trumps overload resolution.

#include <iostream>
#include <ostream>

void func(int i)
{
  std::cout << "int: " << i << '\n';
}

namespace N {
  void func(double d)
  {
    std::cout << "double: " << std::showpoint << d << '\n';
  }

  void call_func()
  {
    // Even though func(int) is a better match, the compiler
    // finds N::func(double) first.
    func(3);
  }
}

int main()
{
  N::call_func();       // prints "double: 3.00000"
  using N::func;
  using ::func;
  // Now all overloaded func()s are at the same scope level
  func(4);              // prints "int: 4"
}

Qualified Name Lookup

A name that follows the global scope operator (unary ::) is looked up in the global scope of the source file where the operator is used. The name must be declared in the global scope, not in a nested scope, or the name must have been introduced into the global scope by a using directive (see Namespaces, later in this chapter for information about using directives).

Use the global scope operator to access names that have been hidden by an inner scope. Example 3-4 shows the use of the global scope operator.

Example 3-4: Using the global scope operator.

#include <iostream>
#include <ostream>

int x = 42;

int main()
{
  double x = 3.1415;         // hides the global x
  std::cout << x << '\n';    // prints 3.1415
  std::cout << ::x << '\n';  // prints 42
}

The binary scope resolution operator (::) requires a namespace or class name as its left-hand operand, and an identifier as its right-hand operand. The identifier is looked up in the scope of the left-hand namespace or class. Example 3-5 shows the use of the scope resolution operator. Notice how the local variable n hides the namespace n, so the simple name n refers to the int variable. The left-hand operand to ::, however, must be a class or namespace, so the local n does not hide the namespace n in the expression n::counter::n.

Example 3-5: Using the scope resolution operator.

#include <iostream>
#include <ostream>

namespace n {
  struct counter {
    static int n;
  };
  double n = 2.71828;
}

int n::counter::n = 42;

int main()
{
  int counter = 0;    // unrelated to n::counter
  int n = 10;         // hides namespace n
  ::n::counter x;     // refer to outer n

  std::cout << n::counter::n; // prints 42
  std::cout << n::n;          // prints 2.71828
  std::cout << x.n;           // prints 42
  std::cout << n;             // prints 10
  std::cout << counter;       // prints 0
}

Unqualified Name Lookup

The compiler looks up an unqualified name, that is, a bare identifier or operator symbol, in a series of scopes in order to find its declaration. The simple rule is that the innermost scope is searched first, and succeeding outer scopes are searched until the declaration is found. An associated simple rule is that the name must be declared before it is used, reading the source file from top to bottom.

In a class declaration, the class and its ancestor classes are searched. If the class is local to a function, the block that contains the class is searched, then enclosing blocks. Finally the namespaces that contain the class declarations are searched.

In the definition of a member function (as a parameter type, default argument value, or in the function body, but not as the return type), a name can be declared anywhere in the declaration of the class. (This is the only exception to the declare-before-use rule.) The name is looked up first in the block that contains its use, then in enclosing blocks, then in the enclosing class and ancestor classes, then nested classes and their ancestors. If the class is local to a function, the enclosing blocks within the function are searched, and finally the enclosing namespaces are searched.
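The exception to the declare-before-use rule can be sketched with a small class. The names account, scale, cents_, and whole_dollars are hypothetical; the point is only that the body of dollars() uses two members that are declared after it.

```cpp
class account {
public:
  explicit account(int cents) : cents_(cents) {}

  // cents_ and scale are declared later in the class, but the
  // function body can use them anyway.
  int dollars() const { return cents_ / scale; }

private:
  enum { scale = 100 };
  int cents_;
};

int whole_dollars(int cents) { return account(cents).dollars(); }
```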

If a namespace contains a using directive (described later in this chapter), the referenced namespace is searched if all other searches fail to find the name.

If a friend function is defined within the body of the friend class, the lookup rules are the same as for a member function. If the function is defined outside the class, the rules are the same as for an ordinary function. Example 3-6 shows how the two different lookup rules can make a difference.

Example 3-6: Defining a friend function.

class foo {
public:
  friend void bar1(foo& f) {
    f.x = y;
    ++y;          // okay: refers to foo::y
  }
  friend void bar2(foo& f);
private:
  static int y;
  int x;
};

void bar2(foo& f) {
  f.x = y;
  ++y;           // error: y not in scope
}

If the friend is a member function, the function and other names in its declaration are looked up first in the friend class, and if not found, in the class that contains the friend declaration, as you can see in Example 3-7.

Example 3-7: Declaring friend member functions.

class test {
public:
  typedef int T1;
  typedef float T2;
  void f(T1);
  void f(T2);
};
class other {
  typedef char T2;
  friend void test::f(T1); // look up f and T1 in test
  friend void test::f(T2); // look up f and T2 in test before
                           // looking it up in other
};

In the definition of a class member (function or static data) outside of the class declaration, lookup includes the class members and ancestor classes, but only after the class name appears in the declarator. Thus, the type specifier for the function return type or for the static data member is not looked up in the class declaration unless the type name is explicitly qualified. (Declarators and type specifiers are defined later in this chapter.) Example 3-8 shows the consequences of this rule.

Example 3-8: Defining members outside of a class declaration.

class node {
public:
  enum color { red, black };
  node(color x);
  color toggle(color c);
private:
  color c;
  static color root_color;
};

// Must qualify node::color and node::root_color, but
// initializer is in the scope of node, so it doesn't need
// to be qualified.
node::color node::root_color = red;

// Similarly, return type must be qualified, but parameter
// type need not be.
node::color node::toggle(color c)
{
  return static_cast<color>(1 - c);
}

Argument-dependent Name Lookup

Argument-dependent name lookup is also known as Koenig lookup, named after Andrew Koenig, who devised this lookup rule. The short version of the rule is that the compiler looks up a function name in the usual places, and also in the namespaces that contain the user-defined types (classes and enumerations) of the function's arguments.

The slightly longer version is that the compiler first searches all the usual places. If it does not find a declaration, it then searches an additional list of classes and namespaces. The additional list depends on the function's argument types.

Example 3-9 shows a typical case where argument-dependent name lookup is needed. The operator<< function is declared in the std namespace in the <string> header. It is not a member function of ostream, and the only way the compiler can find the operator is to search the std namespace. The fact that its arguments are in the std namespace tells the compiler to look there.

Example 3-9: Using argument-dependent name lookup.

#include <string>
#include <iostream>
#include <ostream>
int main()
{
  std::string message("Howdy!\n");
  std::cout << message;
}

Another way to look at argument-dependent lookup is to consider a simple class declaration, say, for rational numbers. In order to support the customary and usual arithmetic operators, you can choose to declare them as member functions, e.g.,

namespace numeric {
  class rational {
    ...
    rational operator+(const rational& other);
    rational operator+(int i);
  };
}

Expressions of the form r1 + 42 will compile successfully because the + operator is found as a member of rational (assuming r1 is an instance of rational). But 42 + r1 does not work. You must declare an operator at the namespace level, as follows,

namespace numeric {
  class rational { ... };
  rational operator+(int i, const rational& r);
  ...
}

In order to compile the expression 42 + r1, the compiler needs to find operator+ in namespace numeric. Without requiring the user to use a using directive, the only guideline the compiler has to know to search namespace numeric is that the second argument is declared in numeric. Thus, argument-dependent lookup enables the everyday, expected use of overloaded operators.
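The fragments above can be fleshed out into a compilable sketch. The representation of rational (two int members and the accessors num() and den()) and the helper add_one are assumptions made for illustration; only the namespace-level operator+ and the lookup behavior come from the discussion above.

```cpp
namespace numeric {
  class rational {
  public:
    rational(int num = 0, int den = 1) : num_(num), den_(den) {}
    int num() const { return num_; }
    int den() const { return den_; }
  private:
    int num_, den_;
  };

  // Non-member operator, declared at the namespace level.
  rational operator+(int i, const rational& r)
  {
    return rational(i * r.den() + r.num(), r.den());
  }
}

// No using directive or using declaration appears here. The compiler
// finds numeric::operator+ because the second argument has type
// numeric::rational.
numeric::rational add_one(const numeric::rational& r)
{
  return 1 + r;
}
```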

Namespaces

A namespace is a named scope. By grouping related declarations in a namespace, you can avoid name collisions with declarations in other namespaces. For example, suppose you are writing a word processor. Suppose also that you make use of packages that others have written, including a screen layout package, an equation typesetting package, and an exact-arithmetic package for computing printed positions to high accuracy with fixed-point numbers.

Suppose further that the equation package has a class called fraction, which represents built-up fractions in an equation; the arithmetic package also has a class called fraction, for computing with exact rational numbers; and the layout package has a class called fraction for laying out fractional regions of a page. Without namespaces, all three names would collide and you could not use more than one package in a single source file.

With namespaces, each class can reside in a separate namespace, say, layout::fraction, eqn::fraction, and math::fraction.

C++ namespaces are similar to Java packages.

Namespace Definitions

Declare a namespace with the namespace keyword, followed by an optional identifier (the namespace name), followed by zero or more declarations in curly braces. Namespace declarations can be discontiguous, even in separate source files or headers. The namespace scope is the accumulation of all declarations for the same namespace that the compiler has seen at the time it looks up a name in the namespace. Namespaces can be nested. Example 3-10 shows a sample namespace definition.

Example 3-10: Defining a namespace.

namespace numeric {
  class rational { ... };
  template<typename charT, typename traits>
  std::basic_ostream<charT,traits>& operator<<(
    std::basic_ostream<charT,traits>& out, const rational& r);
}

namespace numeric {
  rational operator+(const rational&, const rational&);
}

numeric::rational numeric::operator+(const rational& r1,
                                     const rational& r2)
{
  ...
}

int main()
{
  using numeric::rational;
  rational a, b;
  std::cout << a + b << '\n';
}

You can also declare a namespace without a name, in which case the compiler uses a unique, internal name. Thus, each source file's unnamed namespace is separate from the unnamed namespace in every other source file.

You can define an unnamed namespace nested within a named namespace (and vice versa). Each unnamed namespace has a unique name. As with a named namespace, you can use multiple declarations to compose an unnamed namespace, as shown in Example 3-11.

Example 3-11: Using unnamed namespaces.

#include <iostream>
#include <ostream>

namespace {
  int i = 10;
}

namespace {
  int j;          // same unnamed namespace
  namespace X {
    int i = 20;   // hides i in outer, unnamed namespace
  }
  namespace Y = X;
  int f() { return i; }
}

namespace X {
  int i = 30;
  // X::unnamed is different namespace from ::unnamed
  namespace {
    int i = 40;  // hides ::X::i, but is inaccessible
                 // outside the unnamed namespace
    int f() { return i; }
  }
}

int main()
{
  int i = X::i;  // ambiguous: unnamed::X or ::X?
  std::cout << ::X::f() << '\n'; // prints 40
  std::cout << Y::i << '\n';     // prints 20
  std::cout << f() << '\n';      // prints 10
}

The advantage of using an unnamed namespace is that you are guaranteed that all names declared in it can never clash with names in other source files. The disadvantage is that you cannot use the scope operator (::) to qualify identifiers in the unnamed namespace, so you must take care to avoid name collisions.

C programmers are accustomed to using global static declarations for names that are private to a source file. You can do the same in C++, but it is better to use an unnamed namespace because a namespace can contain any kind of declaration (including classes, enumerations, and templates), whereas static declarations are limited to functions and objects.
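For instance, a helper class cannot be declared static, but it can be kept private to a source file by placing it in an unnamed namespace. The following sketch uses the hypothetical names accumulator and sum_to.

```cpp
namespace {
  // Visible only in this source file. A class in another source
  // file can also be called accumulator without any conflict.
  class accumulator {
  public:
    accumulator() : total_(0) {}
    void add(int v) { total_ += v; }
    int total() const { return total_; }
  private:
    int total_;
  };
}

// An ordinary function with external linkage that uses the helper.
int sum_to(int n)
{
  accumulator acc;
  for (int i = 1; i <= n; ++i)
    acc.add(i);
  return acc.total();
}
```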

Declarations outside of all namespaces, functions, and classes are implicitly declared in the global namespace. A program has a single global namespace, which is shared by all source files that are compiled and linked into the program. Declarations in the global namespace are typically referred to as global declarations. Global names can be accessed directly using the global scope operator (unary ::).

Namespace Aliases

A namespace alias is a synonym for an existing namespace. You can use the alias to qualify names (with the :: operator) and in using declarations and directives, but not in namespace definitions. Example 3-12 shows some alias examples.

Example 3-12: Using namespace aliases.

namespace original {
  int f();
}

namespace orig = original;      // alias

int orig::f() { return 42; }    // okay
using orig::f;                  // okay

int g() { return f(); }

namespace orig { // error: cannot use alias here
  int h();
}

Using Declarations

A using declaration brings a name into a different namespace. The new name is a synonym for the original name. Only the declared name is added to the new namespace, which means using an enumerated type does not use all the enumerated literals. If you want to use all the literals, each one requires its own using declaration.
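The enumeration case can be sketched as follows. The namespace gfx, the type color, and the objects primary and secondary are hypothetical names chosen for illustration.

```cpp
namespace gfx {
  enum color { red, green, blue };
}

using gfx::color;   // brings in only the type name
using gfx::red;     // each enumerator needs its own using declaration

color primary = red;           // okay: both names were introduced
color secondary = gfx::green;  // green was not, so it must be qualified
```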

Because the name is added to the current namespace, it might hide names in outer scopes, and you might not be able to declare any other objects with that name in the same scope.

Example 3-13 shows some examples of using declarations.

Example 3-13: Using the using declaration.

namespace numeric {
  class fraction { ... };
  fraction operator+(int, const fraction&);
  fraction operator+(const fraction&, int);
  fraction operator+(const fraction&, const fraction&);
}

namespace eqn {
  class fraction { ... };
  fraction operator+(int, const fraction&);
  fraction operator+(const fraction&, int);
  fraction operator+(const fraction&, const fraction&);
}

int main()
{
  numeric::fraction nf;
  eqn::fraction qf;

  nf = nf + 1;           // okay: calls numeric::operator+
  qf = 1 + qf;           // okay: calls eqn::operator+
  nf = nf + qf;          // error: no operator+

  using numeric::fraction;
  fraction f;
  f = nf + 2;            // okay
  f = qf;                // error: type mismatch
  using eqn::fraction;   // error: like trying to declare
                         // fraction twice in the same scope
  if (f > 0) {
    using eqn::fraction; // okay: hides outer fraction
    fraction f;          // okay: hides outer f
    f = qf;              // okay: same types
    f = nf;              // error: type mismatch
  }
  int fraction;          // error: name fraction in use
}

You can also copy names from one namespace to another with a using declaration. Suppose you refactor your program, and you realize that the numeric::fraction class has all the functionality you need in the equation package. You decide to use numeric::fraction instead of eqn::fraction, but you want to keep the eqn interface the same. So you insert "using numeric::fraction;" in the eqn namespace.

Incorporating a name into a namespace with a using declaration is not quite the same as declaring the name normally. The new name is just a synonym for the original name in its original namespace. When the compiler searches namespaces under argument-dependent name lookup, it searches the original namespace. Example 3-14 shows how the results can be surprising if you are not aware of the using declaration.

Example 3-14: Creating synonym declarations with using declarations.

namespace eqn {
  using numeric::fraction;
  // Big, ugly declaration for ostream << fraction
  template<typename charT, typename traits>
  std::basic_ostream<charT,traits>& operator<<(
    std::basic_ostream<charT,traits>& out, const fraction& f)
  {
    out << f.numerator() << '/' << f.denominator();
    return out;
  }
}

int main()
{
  eqn::fraction qf;
  numeric::fraction nf;
  nf + qf;         // okay because the types are the same
  std::cout << qf; // error: numeric namespace is searched
                   // for operator<<, but not eqn namespace
}

The using declaration can also be used within a class. You can add names to a derived class from a base class, possibly changing the accessibility. For example, a derived class can promote a protected member to public visibility. Another use for using declarations is with private inheritance, promoting specific members to protected or public visibility.

For example, the standard container classes are not designed for public inheritance. Nonetheless, in a few cases, it is possible to derive from them successfully. Example 3-15 shows a crude way to implement a container type to represent a fixed-size array. The array class template derives from std::vector using private inheritance. A series of using declarations makes selected members public, while members that are meaningless for a fixed-size container, such as insert, remain private.

Example 3-15: Using declarations in classes.

#include <cstddef>
#include <vector>

template<typename T>
class array: private std::vector<T>
{
public:
  typedef T value_type;
  // The typename keyword is needed because these names depend on T.
  using typename std::vector<T>::size_type;
  using typename std::vector<T>::difference_type;
  using typename std::vector<T>::iterator;
  using typename std::vector<T>::const_iterator;
  using typename std::vector<T>::reverse_iterator;
  using typename std::vector<T>::const_reverse_iterator;

  array(std::size_t n, const T& x = T()) : std::vector<T>(n, x) {}
  using std::vector<T>::at;
  using std::vector<T>::back;
  using std::vector<T>::begin;
  using std::vector<T>::empty;
  using std::vector<T>::end;
  using std::vector<T>::front;
  using std::vector<T>::operator[];
  using std::vector<T>::rbegin;
  using std::vector<T>::rend;
  using std::vector<T>::size;
};
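The simpler case mentioned earlier, promoting a protected member to public visibility, can be sketched in a few lines. The classes sensor and open_sensor and the function read_sensor are hypothetical names used only for illustration.

```cpp
class sensor {
protected:
  int raw() const { return 42; }
};

class open_sensor : public sensor {
public:
  using sensor::raw;   // raw() is now public in open_sensor
};

int read_sensor()
{
  open_sensor s;
  return s.raw();      // calling raw() on a plain sensor would be an error
}
```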

Using Directives

A using directive adds a namespace to the list of scopes to use when the compiler searches for a name's declaration. Unlike a using declaration, no names are added to the current namespace. Instead, the used namespace is added to the list of namespaces to search, right after the innermost namespace that contains the current and used namespaces. The using directive is transitive, so if namespace A uses namespace B, and namespace C uses namespace A, a name search in C also searches B. Example 3-16 illustrates the using directive.

Example 3-16: Using the using directive.

#include <iostream>
#include <ostream>

namespace A {
  int x = 10;
}
namespace B {
  int y = 20;
}
namespace C {
  int z = 30;
  using namespace B;
}
namespace D {
  int z = 40;
  using namespace B;       // harmless but pointless because
  int y = 50;              // D::y hides B::y
}

int main()
{
  int x = 60;
  using namespace A;       // does not introduce names,
                           // so there is no conflict with x
  using namespace C;

  using namespace std;     // to save typing std::cout
                           // repeatedly

  cout << x << '\n';       // prints 60 (local x)
  cout << y << '\n';       // prints 20 
  cout << C::y << '\n';    // prints 20
  cout << D::y << '\n';    // prints 50

  using namespace D;
  cout << y << '\n'; // error: y is ambiguous: it can be
                     // found in D::y and C's use of B::y.
}

How to Use Namespaces

Namespaces have no runtime cost. Don't be afraid to use them, especially in large projects where many people contribute code and might accidentally devise conflicting names. This section presents some additional tips and suggestions for using namespaces.

Object Declarations

An object (variable or constant) declaration has two parts: specifiers and a list of declarators. Each declarator has a name and an optional initializer.

Specifiers

Each declaration begins with a list of specifiers. The specifiers can contain a storage class, const and volatile qualifiers, and the type, in any order.

Storage class specifiers

The storage class is optional. For function parameters and local variables in a function, the default is auto. For declarations at namespace scope, the default is an object with static lifetime and external linkage (or internal linkage if the object is const and is not explicitly declared extern). (There is no explicit storage class for such a declaration.) If you use a storage class specifier, you must choose only one of the following:

auto
Denotes an automatic variable, that is, a variable whose lifetime is limited to the block in which the variable is declared. The auto specifier is the default for function parameters and local variables, so it is rarely used explicitly.
extern
Denotes an object with external linkage, which might be defined in a different source file. Function parameters cannot be extern.
mutable
Denotes a class member that can be modified even if the containing class is const. See Chapter 7 for more information.
register
Denotes an automatic variable with a hint to the compiler that the variable should be stored in a fast register. Many modern compilers ignore the register storage class because the compilers are better than people at determining which variables belong in registers.
static
Denotes a variable with static lifetime. Function parameters cannot be static.
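
The following sketch illustrates three of these storage classes. All of the names (next_ticket, limit, remaining, and lookup_table) are hypothetical. The auto and register specifiers are omitted because they rarely appear in practice.

```cpp
// static: the local variable is initialized once and keeps its
// value from one call to the next.
int next_ticket()
{
  static int last = 0;
  return ++last;
}

// extern: declares an object that is defined elsewhere
// (here, later in the same source file).
extern int limit;
int remaining(int used) { return limit - used; }
int limit = 100;

// mutable: the member can change even in a const object.
class lookup_table {
public:
  lookup_table() : hits_(0) {}
  int find(int key) const { ++hits_; return key * 2; }
  int hits() const { return hits_; }
private:
  mutable int hits_;
};
```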

Cv qualifiers

The const and volatile specifiers are optional and can be used in any order. The const and volatile keywords can be used in other parts of a declaration, so they are often called qualifiers as a more general term than specifiers; for brevity, they are often referred to as cv qualifiers.

const
Denotes an object that cannot be modified.
volatile
Denotes an object whose value might change unexpectedly. The compiler is prevented from performing optimizations that depend on the value not changing. For example, a variable that maps to a hardware register or that is modified by a signal handler might be volatile.

Type specifiers

Every object must have a type, in the form of a list of type specifiers. The type specifiers can describe a fundamental type, a class type, an enumerated type, or the name of a type (class, enumeration, or typedef). Read about fundamental and enumerated types later in this chapter. Class types are covered in Chapter 7.

You can declare a class or enumerated type in the same declaration as an object declaration, although the custom is to declare the type separately, then use the type name in a separate object declaration. For example,

enum color { red, black } node_color;
// or
enum color { red, black };
color node_color;

Specifiers can appear in any order, but the convention is to list the storage class first, followed by cv qualifiers, followed by the type specifiers. For example,

extern const long int mask; // conventional
int extern long const mask; // valid, but strange

The convention for types that require multiple keywords is to place the base type last, and put the modifiers in front. For example,

unsigned long int x; // conventional
int unsigned long y; // valid, but strange
long double a;       // conventional
double long b;       // valid, but strange

Declarators

A declarator declares a single object within a declaration. A declarator contains the name being declared, additional type information (for pointers, references, and arrays), and an optional initializer. Use commas to separate multiple declarators in a declaration.

Arrays

An array is declared with a constant size in square brackets. The array size is fixed for the lifetime of the object and cannot change. (For an array-like container whose size can change at runtime, see <vector> in Chapter 13.) To declare a multidimensional array, use a separate set of square brackets for each dimension. For example,

int point[2];
double matrix[3][4]; // A 3 x 4 matrix

You can omit the array size if there is an initializer; the number of initial values determines the size. In a multidimensional array, you can omit the first (left-most) size. For example,

int data[] = { 42, 10, 3, 4 }; // data[4]
int identity[][3] = { { 1,0,0 }, {0,1,0}, {0,0,1} };
char str[] = "hello";          // str[6], with trailing \0

In a multidimensional array, all elements are stored contiguously, with the rightmost index varying fastest (row major order).

When used as a function parameter, the size is ignored, and the type is actually a pointer type, which is the subject of the next section. For a multidimensional array used as a function parameter, the first dimension is ignored, so the type is a pointer to an array.
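These rules can be checked with sizeof, as in the following sketch. The names data_count, row_bytes, matrix_bytes, and first_column_sum are hypothetical; the arrays repeat the declarations shown above.

```cpp
#include <cstddef>

int data[] = { 42, 10, 3, 4 };   // size comes from the initializer
const std::size_t data_count = sizeof(data) / sizeof(data[0]);

double matrix[3][4];
// Each row is an array of 4 doubles, and the rows are stored one
// after another, so the whole array is exactly 3 rows long.
const std::size_t row_bytes = sizeof(matrix[0]);
const std::size_t matrix_bytes = sizeof(matrix);

// As a parameter, m is really a pointer to an array of 4 doubles,
// so the number of rows must be passed separately.
double first_column_sum(double m[][4], std::size_t rows)
{
  double total = 0;
  for (std::size_t i = 0; i < rows; ++i)
    total += m[i][0];
  return total;
}
```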

Pointers

A pointer object stores the address of another object. A pointer is declared with a leading asterisk, optionally followed by cv qualifiers, and then the object name and optional initializer. For example,

int x;
int *p;                     // pointer to int
int * const cp = &x;        // const pointer to int
const int * pc;             // pointer to const int
const int * const cpc = &x; // const pointer to const int
int **pp;                   // pointer to pointer to int

A cv qualifier that follows the asterisk applies to the pointer object. That is, a const pointer must be initialized and cannot be the target of an assignment. The cv qualifiers in the declaration's specifier determine the type of the pointer's target. A pointer to const, for example, can change value, but you cannot use it to modify the object to which it points. You can have pointers to pointers, as deep as you want.
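The following sketch contrasts a pointer to const with a const pointer. The function name pointer_demo is hypothetical, and the commented-out lines are the ones a compiler would reject.

```cpp
#include <cassert>

// Returns 12 if the qualifiers behave as the comments say.
int pointer_demo()
{
  int x = 1, y = 2;

  const int* pc = &x;   // pointer to const int
  pc = &y;              // OK: the pointer itself can change
  // *pc = 5;           // error: cannot modify the target through pc
  assert(*pc == 2);

  int* const cp = &x;   // const pointer to int
  *cp = 10;             // OK: the target is not const
  // cp = &y;           // error: cp itself is const

  return *pc + x;       // 2 + 10
}
```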

When a function parameter is declared with an array type, the actual type is a pointer, and at runtime, the address of the first element of the array is passed to the function. You can use array syntax, but the size is ignored. For example,

int sum(int data[], size_t n); // these two declarations
int sum(int* data,  size_t n); // mean the same thing
void transpose(double matrix[][3]); // only the first
                                    // size can be omitted

A useful convention is that parameters that are used in an array-like fashion (value does not change, dereferenced with the [] operator) within the function are declared with array syntax. Parameters that are used in pointer-like fashion (value changes, dereferenced with the * operator) are declared with pointer syntax.
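As an illustration of this convention, here is one possible body for sum, next to a hypothetical function named length that walks a string with a pointer. Both parameters have pointer types; only the style of use differs.

```cpp
#include <cassert>
#include <cstddef>

// Array syntax: data is indexed with [] and never reassigned.
int sum(int data[], std::size_t n)
{
  int total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// Pointer syntax: p is advanced and dereferenced with *.
std::size_t length(const char* p)
{
  assert(p != 0);
  std::size_t n = 0;
  while (*p++ != '\0')
    ++n;
  return n;
}
```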

Function pointers

A function pointer is declared with an asterisk to denote a pointer, and the function signature (parameter types and optional names). The name and asterisk must be enclosed in parentheses, so the asterisk is not interpreted as part of the return type. An optional exception specification can follow the signature. See Chapter 6 for more information about function signatures and exception specifications.

Declaring an object with a function pointer type can be hard to read, so typically, you would declare the type separately with a typedef declaration, and then declare the object using the typedef name, as shown in Example 3-17.

Example 3-17: Simplifying declarations with typedef.

// Declare an array named fp, of 10 elements, where each
// element is a pointer to a function that returns int*
// and takes two parameters: the first of type pointer
// to function that takes an int* and returns int*,
// and the second of type int.
int* (*fp[10])(int*(*)(int*), int);

// Declare a type for pointer to int.
typedef int* int_ptr;
// Declare a function pointer type for a function that
// takes an int_ptr parameter and returns an int_ptr.
typedef int_ptr (*int_ptr_func)(int_ptr);
// Declare a function pointer type for a function that
// returns int_ptr and takes two parameters: the first
// of type int_ptr and the second of type int.
typedef int_ptr (*func_ptr)(int_ptr_func, int);
// Declare an array of 10 func_ptrs.
func_ptr fp[10];
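Declaring a function pointer is only half the story, so here is a smaller sketch that also calls through one. The names add, multiply, binary_op, apply, and ops are invented for the illustration.

```cpp
#include <cassert>

int add(int a, int b)      { return a + b; }
int multiply(int a, int b) { return a * b; }

// binary_op is a pointer to a function that takes two ints
// and returns int.
typedef int (*binary_op)(int, int);

// Call the function that op points to.
int apply(binary_op op, int a, int b)
{
  assert(op != 0);
  return op(a, b);
}

// A table of function pointers, declared with the typedef name.
binary_op ops[2] = { add, multiply };
```

A function name converts implicitly to a pointer to the function, so apply(add, 2, 3) needs no address-of operator, and ops[1](4, 5) calls multiply without an explicit dereference.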

Member pointers

Pointers to members (data and functions) work differently from other pointers. The syntax requires a class name before the asterisk. Pointers to members can never be cast to ordinary pointers, and vice versa. You cannot declare a reference to a member. See Chapter 4 for information about expressions that dereference pointers to members. A pointer to a static data member is an ordinary pointer, not a member pointer. Example 3-18 shows some declarations of pointers to members.

Example 3-18: Declaring pointers to members.

class simple {
public:
  int data;
  static int num_instances;
  int func(int);
};

int *static_ptr = &simple::num_instances;
int simple::* p = &simple::data;
int (simple::*fp)(int) = &simple::func;
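To show how such pointers are used, the following sketch dereferences them with the .* and ->* operators. It uses a reduced version of the simple class (with a constructor and a function body added as assumptions for the illustration), and member_pointer_demo is a hypothetical name.

```cpp
#include <cassert>

class simple {
public:
  simple() : data(0) {}
  int data;
  int func(int n) { return data + n; }
};

// Set a data member and call a member function through
// pointers to members; returns the result of the call.
int member_pointer_demo()
{
  int simple::* p = &simple::data;
  int (simple::*fp)(int) = &simple::func;

  simple s;
  simple* sp = &s;
  s.*p = 40;               // same as s.data = 40
  assert(s.data == 40);
  return (sp->*fp)(2);     // same as sp->func(2)
}
```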

References

A reference is a synonym for another object. A reference is declared with a leading ampersand followed by the object name and initializer. For example,

int x;
const int c = 0;     // a const object must be initialized
int &r = x;          // reference to int
const int& rc = c;   // reference to const int
int& const cr = x;   // error: no cv qualified references
int & &rr;           // error: no reference of reference

A reference, unlike a pointer, cannot be made to refer to a different object. Assignments to a reference are just like assignments to the referenced object.
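A small sketch makes the difference from a pointer concrete; the function name reference_demo is hypothetical.

```cpp
#include <cassert>

// Returns the final value of x.
int reference_demo()
{
  int x = 1, y = 2;
  int& r = x;         // r is another name for x
  r = y;              // assigns 2 to x; r still refers to x
  assert(&r == &x);   // the reference was not reseated
  y = 7;              // has no effect on r or x
  return r;
}
```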

References are often used as function parameters. For example, the standard library has the div function, which divides two integers and returns the quotient and remainder in a struct. Suppose you would rather have the function return the results as arguments. Example 3-19 shows one way to do this.

Example 3-19: Returning results in function arguments.

#include <cstdlib>
#include <iostream>
#include <ostream>

template<typename T>
void div(T num, T den, T& quo, T& rem)
{
  std::div_t result = std::div(num, den);
  quo = result.quot;
  rem = result.rem;
}

template<>
void div<long>(long num, long den, long& quo, long& rem)
{
  std::ldiv_t result = std::div(num, den);
  quo = result.quot;
  rem = result.rem;
}

int main()
{
  int quo, rem;
  div(42, 5, quo, rem);
  std::cout << quo << " remainder " << rem << '\n';
}

A common idiom is to use a const reference for function parameters, especially for large objects. Function arguments are passed by value in C++, which requires copying the argument. This can be costly for a large object, so passing a reference has better performance. If the function can modify the object, that would violate the pass-by-value convention, so the reference can be declared const, which prevents the function from modifying the object. In this way, pass-by-value semantics are preserved, with the improved performance of pass-by-reference. The standard library often makes use of this idiom. For example, operator<< for std::string uses a const reference to the string, to avoid making unnecessary copies of the string. (See <string> in Chapter 13 for details.)
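A minimal sketch of the idiom follows; count_spaces is a made-up function, not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The string is passed by reference, so no copy is made, and the
// const qualifier prevents the function from modifying it.
std::size_t count_spaces(const std::string& text)
{
  std::size_t n = 0;
  for (std::string::size_type i = 0; i < text.size(); ++i)
    if (text[i] == ' ')
      ++n;
  assert(n <= text.size());
  return n;
}
```

A const reference can also bind to a temporary object, so a call such as count_spaces("to be or not") works: the compiler constructs a temporary std::string from the literal.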

A reference must be initialized to refer to an object. Data members must be initialized in the constructor's initializer list; function parameters are initialized in the function call. All other definitions must have an initializer. (An extern declaration is not a definition, so it doesn't take an initializer.)

You cannot declare a reference to a reference, a pointer to a reference, or an array of references. This poses an additional challenge for template authors. For example, you cannot store references in a container because a number of container functions explicitly declare their parameters as references to the container's value type. (Try using std::vector<int&> with your compiler, and see what happens. You should see a lot of error messages.)

One solution is to write a wrapper template, call it rvector<typename T>, and specialize the template (rvector<T&>) so references are stored as pointers, but all the access functions hide the differences. This approach requires you to duplicate the entire template, which is tedious. Instead, you can encapsulate the specialization in a traits template (refer to Chapter 9 for more information about traits), as shown in Example 3-20.

Example 3-20: Encapsulating reference traits.

// REF type trait encapsulates reference type,
// and mapping to and from the type for use in a container.
template<typename T>
struct REF {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T container_type;
  static reference from_container(reference x) { return x; }
  static const_reference from_container(const_reference x)
                                               { return x; }
  static reference to_container(reference x)   { return x; }
};

template<typename T>
struct REF<T&> {
  typedef T value_type;
  typedef T& reference;
  typedef const T& const_reference;
  typedef T* pointer;
  typedef const T* const_pointer;
  typedef T* container_type;
  static reference from_container(pointer x) { return *x; }
  static const_reference from_container(const_pointer x)
                                             { return *x; }
  static pointer to_container(reference x)   { return &x; }
};

// rvector<> is similar to vector<>, but allows references
// by storing references as pointers.
template<typename T, typename A=std::allocator<T> >
class rvector {
  typedef typename REF<T>::container_type container_type;
  typedef typename std::vector<container_type> vector_type;
public:
  typedef typename REF<T>::value_type value_type;
  typedef typename REF<T>::reference reference;
  typedef typename REF<T>::const_reference const_reference;
  typedef typename vector_type::size_type size_type;
  ...  // other typedefs are similar
  class iterator { ... }; // wraps a vector<>::iterator
  class const_iterator { ... };
  ... // constructors pass arguments to v
  iterator begin()            { return iterator(v.begin()); }
  iterator end()              { return iterator(v.end()); }
  void push_back(typename REF<T>::reference x) {
     v.push_back(REF<T>::to_container(x));
  }
  reference at(size_type n)   {
     return REF<T>::from_container(v.at(n));
  }
  reference front()           {
    return REF<T>::from_container(v.front());
  }
  const_reference front() const  {
    return REF<T>::from_container(v.front());
  }
  ... // other members are similar
private:
  vector_type v;
};

Initializers

An initializer supplies an initial value for the object being declared. When you declare a reference or a const object, you must supply an initializer for local and global variables, but not for data members, function parameters, or extern declarations.

The two forms of initializers are assignment-like and function-like. An assignment-like initializer starts with an equal sign, followed by an expression or a list of comma-separated expressions in curly braces. A function-like initializer is a list of one or more comma-separated expressions in parentheses. Note that these initializers look like assignment statements or function calls, but they are not. They are initializers. The difference is particularly important for classes (details in Chapter 7). For example,

int x = 42;                // initializes x with the value 42
int y(42);                 // initializes y with the value 42
int z = { 42 };            // initializes z with the value 42
int w[4] = { 1, 2, 3, 4 }; // initializes an array
std::complex<double> c(2.0, 3.0);// calls complex constructor

When initializing a scalar value, the form is irrelevant. The initial value is converted to the desired type using the usual conversion rules (as described in Chapter 4).

Without an initializer, all non-POD class-type objects are initialized by calling their default constructors. (See Chapter 7 for more information about POD and non-POD classes.) All other static objects are initialized to zero, and local objects are left uninitialized. An uninitialized const object is an error.
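The default rules can be sketched as follows. The object names and the function check_defaults are hypothetical; the local variable is assigned before it is read because its initial value is unknown.

```cpp
#include <cassert>
#include <string>

int counter;          // static object, no initializer: zero
double ratios[3];     // every element is zero
std::string name;     // non-POD class: default constructor runs

void check_defaults()
{
  int local;          // local, no initializer: value is unknown
  local = 0;          // so assign a value before reading it

  assert(counter == local);
  assert(ratios[0] == 0.0 && ratios[2] == 0.0);
  assert(name.empty());
}
```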

Function-like initializers

You must use a function-like initializer when constructing a class whose constructor takes two or more arguments, or when calling an explicit constructor. The usual rules for resolving overloaded functions apply to the choice of overloaded constructors. (See Chapter 6 for more information.)

Empty parentheses cannot be used as an initializer in an object's declaration (the compiler would interpret the declaration as a function declaration), but can be used in other initialization contexts (namely, a constructor initializer list or as a value in an expression). If the type is a class type, the default constructor is called; otherwise, the object is initialized to zero.
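The following sketch gathers these cases in one place. The class meters and the function check_function_like are invented for the illustration; the commented-out lines show what does not work.

```cpp
#include <cassert>
#include <complex>

class meters {
public:
  explicit meters(double v) : value_(v) {}
  double value() const { return value_; }
private:
  double value_;
};

void check_function_like()
{
  meters height(1.5);                // explicit constructor
  // meters width = 2.5;             // error: constructor is explicit
  std::complex<double> c(2.0, 3.0);  // two constructor arguments
  int zero = int();                  // empty parentheses in an expression
  // int f();                        // declares a function, not an object

  assert(height.value() == 1.5);
  assert(c.real() == 2.0 && c.imag() == 3.0);
  assert(zero == 0);
}
```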

Assignment-like initializers

In an assignment-like initializer, if the object is of class type, the value to the right of the equal sign is converted to a temporary of the desired type and the object is constructed by calling its copy constructor.

The generic term for an array or POD object is aggregate because it aggregates multiple values into a single object. To initialize an aggregate, you can supply multiple values in curly braces, as described in the following sections.

Initializing POD objects

Chapter 7 has the complete definition of a POD (plain old data) class. Briefly, a POD object is one that can be copied bit-for-bit (no copy assignment operator, and no non-POD members). To initialize a POD object, you can supply an initial value for each non-static data member, separated by commas, enclosed in curly braces. For nested objects, use nested curly braces. Values are associated with members in order of declaration of the members. If there are more values than members, it is an error. If there are fewer values than members, the members without values are initialized by calling the default constructor or initializing to zero.

An initializer list can be empty, which means all members are initialized to the default, which is different from omitting the initializer entirely. The latter causes all members to be left uninitialized. For example,

struct point { double x, y, z; };
point origin = { };     // all members initialized to 0.0
point unknown;          // uninitialized, value is not known
point pt = { 1, 2, 3 }; // pt.x==1.0, pt.y==2.0, pt.z==3.0
struct line { point p1, p2; };
line vec = { { }, { 1 } }; // vec.p1 is all zero
              // vec.p2.x==1.0, vec.p2.y==0.0, vec.p2.z==0.0

Initializing arrays

Initialize elements of an array with values, separated by commas, enclosed in curly braces. Multi-dimensional arrays can be initialized by nesting sets of curly braces. It is an error if there are more values than elements in the array; if the initializer has fewer values, the remaining elements in the array are initialized to default values (default constructors or zero). If the declarator omits the array size, the size is determined by counting the number of values in the initializer.

The initializer can be empty, to force all elements to be initialized to the default. Omitting the initializer entirely causes all elements of the array to be uninitialized.

When initializing a multi-dimensional array, you can flatten the curly braces and initialize elements of the array in row-major order (last index varies fastest).

For example,

int vector[] = { 1, 2, 3 }; // array of three elements
                            // vector[0]==1 ... vector[2]==3
int zero[4] = { }; // initialize to all zeros

// Initialize id1 and id2 to the identity matrix.
int id1[3][3] = { { 1 }, { 0, 1 }, { 0, 0, 1 } };
int id2[3][3] = { 1, 0, 0, 0, 1, 0, 0, 0, 1 };

An array of char or wchar_t is special because you can initialize it with a string literal. Remember that every string literal has an implicit null character at the end. For example,

// In each pair, the two declarations are equivalent.
char str1[] = "Hello";
char str2[] = { 'H', 'e', 'l', 'l', 'o', '\0' };
wchar_t ws1[] = L"Hello";
wchar_t ws2[] = { L'H', L'e', L'l', L'l', L'o', L'\0' };

The last expression in an initializer list can be followed by a comma. This can be convenient when you maintain software and often need to change the order of items in the initializer list, because you don't need to treat the last element differently from the other elements. For example,

const std::string keywords[] = {
  "and",
  "asm",
  ...
  "while",
  "xor",
};

Linkage

Every object has linkage, which determines how the compiler and linker associate object references with the object definition. Linkage has two aspects: scope and language. Scope linkage dictates which scopes have access to an entity. Language linkage dictates an entity's properties that depend on programming language.

Scope linkage

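Scope linkage can be external, internal, or none:

External linkage
A name with external linkage can be referred to from other scopes and from other source files. Functions and objects declared at namespace scope have external linkage unless they are declared static or, for objects, const without extern.
Internal linkage
A name with internal linkage can be referred to from other scopes in the same source file, but not from other source files. Names declared static at namespace scope have internal linkage, as do const objects at namespace scope that are not declared extern.
No linkage
A name with no linkage can be referred to only in the scope where it is declared. Local variables that are not declared extern, and function parameters, have no linkage.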
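As a rough illustration of the three kinds of scope linkage, consider the following sketch. All of the names are hypothetical, and the comments state the linkage that the standard rules give each name.

```cpp
#include <cassert>

int shared_total = 0;         // external linkage
static int file_total = 0;    // internal linkage
const int file_limit = 10;    // internal linkage (const, not extern)

int next_id()                 // external linkage
{
  static int id = 0;          // no linkage, but static lifetime
  int step = 1;               // no linkage
  id += step;
  return id;
}

int record(int amount)        // external linkage
{
  assert(amount >= 0);
  if (amount > file_limit)
    amount = file_limit;
  file_total += amount;
  shared_total += amount;
  return file_total;
}
```

Another source file could declare extern int shared_total; and call next_id or record, but it has no way to name file_total, file_limit, id, or step.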

Language linkage

Every entity has a language linkage, which is a simple character string. By default, the linkage is "C++". The only other standard language linkage is "C". All other language linkages and the properties associated with different language linkages are implementation-defined.

You can specify a language linkage for a single declaration or for a series of declarations in curly braces:

extern "C" {
  void cfunction(int);
  typedef void (*cfunc)(int);
}
extern "C++" cfunc cf = cfunction;
// The variable cf has C++ linkage. Its value is a pointer
// to function that has C linkage.

C does not support function overloading, so there can be at most one function with C linkage of a given name. Even if you declare the C function in two different namespaces, both declarations refer to the same function, for which there must be a single definition.

Typically, C linkage is used for external functions that are written in C (such as the C standard library), but that you want to call from a C++ program. C++ linkage is used for native C++ code. Sometimes, though, you want to write a function in C++ that can be called from C; in that case, you should declare the C++ function with C linkage.

An implementation might support other language linkages. It is up to the implementation to define the properties of each language: how parameters are passed to functions, how values are returned from functions, whether and how function names are altered, and so on. In many C++ implementations, a function with C++ linkage has a "mangled" name, that is, the external name encodes the function name and the types of all its arguments. So the function strlen(const char*) might have an external name of strlen__FCcP, which makes it hard to call the function from a C program, which does not know about C++ name mangling rules. Using C linkage, the compiler does not mangle the name, exporting the function under the plain name of strlen, which can be called easily from C.
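Here is a minimal sketch of a function that is written in C++ but declared with C linkage, so that C code could call it. The names count_char, char_counter, and counter are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Written in C++, but given C linkage so that C code
// could call it.
extern "C" int count_char(const char* text, char ch)
{
  assert(text != 0);
  int n = 0;
  for (std::size_t i = 0, len = std::strlen(text); i < len; ++i)
    if (text[i] == ch)
      ++n;
  return n;
}

// A pointer type for functions that have C linkage.
extern "C" {
  typedef int (*char_counter)(const char*, char);
}
char_counter counter = count_char;
```

The function body can use any C++ feature; only the interface (name, parameter types, and return type) must be something that a C compiler understands.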

Type Declarations

One of the hallmarks of C++ is that you can define a type that seems just like any builtin type. Thus, if you need to, say, define a type that supports arbitrary-sized integers, call it bigint, you can do so, and programmers can use bigint objects the same way they use int objects.

The user-defined types are classes and enumerations. Enumerations are described later in this section.

You can also declare a typedef, which is a synonym for an existing type. Note that the name typedef seems to be a shorthand for "type definition" but it is actually a type declaration.

Fundamental Types

This section lists the fundamental types that are built into the C++ language. Types that require multiple keywords (e.g., unsigned long int) can mix the keywords in any order, but the order shown below is the conventional order. If the type specifier requires multiple words, one of which is int, the int can be omitted. If the type is signed, the signed keyword can be omitted (except for signed char).

bool
Represents a Boolean or logical value. It has one of two values: true or false.
char
Represents a character. Character literals have type char. All the character types (char, signed char, and unsigned char) share a common size and representation. By definition, char is the smallest fundamental type. A char is signed or unsigned, depending on the implementation.
double
Represents a double-precision floating point number. The range and precision are at least as much as those of float. A floating point literal has type double unless you use the F or L suffix.
float
Represents a single-precision floating point number.
long double
Represents an extended-precision floating point number. The range and precision are at least as much as those of double.
signed char
Represents a signed character.
signed int
Represents an integer in a size and format that is natural for the host environment.
signed long int
Represents an integer whose range is at least as large as that of int.
signed short int
Represents an integer such that the range of an int is at least as large as the range of a short.
unsigned char
Represents an unsigned character.
unsigned int
Represents an unsigned integer.
unsigned long int
Represents an unsigned long integer.
unsigned short int
Represents an unsigned short integer.
void
Represents the absence of any values. You cannot declare an object of type void, but you can declare a function that "returns" void (that is, does not return a value), or declare pointers to void.
wchar_t
Represents a wide character. Its representation must match one of the fundamental integer types. Wide character literals have type wchar_t.

The representation of the fundamental types is implementation-defined. The integral types (bool, char, wchar_t, int, etc.) require a binary representation: signed-magnitude, one's complement, or two's complement. Some types have alignment restrictions, which are implementation-defined. (Note that new expressions always return pointers that are aligned for any type.)

The signed and unsigned variants of a given type always occupy the same amount of storage and have the same alignment requirements. The non-negative values for the signed type are always a subset of the values supported by the unsigned type, and have the same bit representation.

The unsigned types always use arithmetic modulo $2^n$, where $n$ is the number of bits in the type.

See the <limits> header in Chapter 13 to determine the numerical limits of each fundamental type.
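A few of these properties can be verified directly with the <limits> and <climits> headers, as in the following sketch. The function name check_fundamental_types is hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <limits>

void check_fundamental_types()
{
  // Signed and unsigned variants occupy the same storage.
  assert(sizeof(int) == sizeof(unsigned int));
  assert(sizeof(char) == 1);

  // The range of long is at least as large as that of int.
  assert(std::numeric_limits<long>::max() >=
         std::numeric_limits<int>::max());

  // Unsigned arithmetic wraps around modulo 2^n.
  unsigned int zero = 0;
  unsigned int wrapped = zero - 1;
  assert(wrapped == std::numeric_limits<unsigned int>::max());
  assert(wrapped == UINT_MAX);

  // For unsigned char, digits is the number of bits.
  assert(std::numeric_limits<unsigned char>::digits == CHAR_BIT);
}
```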

Enumerated Types

An enumerated type declares an optional type name (the enumeration) and a set of zero or more identifiers (enumerators). Each enumerator is a constant whose type is the enumeration. For example,

enum logical { no, maybe, yes };
logical is_permitted = maybe;

enum color { red=1, green, blue=4 };
const color yellow = static_cast<color>(red | green);

enum zeroes { a, b = 0, c = 0 };

You can optionally specify the value of an enumerator after an equal sign (=). The value can be an integer or an enumeration. The default value of the first enumerator is zero. The default value for subsequent enumerators is one more than the value of the previous enumerator (regardless of whether that value was explicitly specified). Enumerators can have duplicate values in a single enumeration declaration.
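The numbering rules can be confirmed with a short sketch. The enumeration mixed and the function check_enumerators are additions made up for the illustration.

```cpp
#include <cassert>

enum logical { no, maybe, yes };          // 0, 1, 2
enum color { red = 1, green, blue = 4 };  // green is 2
enum zeroes { a, b = 0, c = 0 };          // duplicates are allowed
enum mixed { first = 5, second, third = 5, fourth };

void check_enumerators()
{
  assert(no == 0 && maybe == 1 && yes == 2);
  assert(green == red + 1);
  assert(a == 0 && b == 0 && c == 0);
  // Each default value is one more than the previous
  // enumerator, even when the previous value is a repeat.
  assert(second == 6 && fourth == 6);
}
```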

Each enumeration has an underlying integral type that can store all of the enumerator values. The actual type is implementation-defined, so the size of an enumerated type is implementation-defined.

An enumerated type is a distinct type. Enumerated values are implicitly converted to integers where needed, but integers cannot be implicitly converted to an enumerated type. Instead, you can use static_cast<> to cast an integer to an enumeration or from one enumeration to a different enumeration. (See Chapter 4 for details.)

The range of values for an enumeration is defined by the smallest and largest bitfields that can hold all of its enumerators. In more precise terms, let the smallest and largest values of the enumerated type be $v_{min}$ and $v_{max}$. The largest enumerator is $e_{max}$ and the smallest is $e_{min}$. Using two's complement representation (the most common integer format), $v_{max}$ is the smallest value of the form $2^n - 1$ such that $v_{max} \ge \max(|e_{min}| - 1, |e_{max}|)$. If $e_{min}$ is not negative, $v_{min} = 0$; otherwise, $v_{min} = -(v_{max} + 1)$.

In other words, the range of values for an enumerated type can be larger than the range of enumerator values, but the exact range depends on the representation of integers on the host platform, and so is implementation-defined. All values between the largest and smallest enumerators are always valid, even if they do not have corresponding enumerators.

For example, consider the following enumerations:

enum sign { neg=-1, zero=0, pos=1 };
enum iostate { goodbit=0, failbit=1, eofbit=2, badbit=4 };

The enumeration sign has the range (in two's complement) -2 to 1. Your program might not assign any meaning to static_cast<sign>(-2), but it is semantically valid in a program.

The type iostate is designed to be a bitmask, where the enumerators can be combined using the bitwise operators. The range of values is 0 to 7. The enumeration can clearly fit in a char, but the implementation is free to use int, short, char, or the unsigned flavors of these types as the underlying type. (The standard library has an iostate type, and can implement it as this enumeration, but is free to choose a different implementation. See the <ios> section in Chapter 13 for more information.)
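The range calculation can be written out as code. This is only a sketch under the two's complement assumption stated above; the type enum_range and the function range_of are hypothetical names, and the loop ignores overflow for enumerators near the limits of long.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct enum_range { long vmin, vmax; };

// Compute the range of an enumeration from its smallest and
// largest enumerators, assuming two's complement.
enum_range range_of(long emin, long emax)
{
  assert(emin <= emax);
  long need = std::max(std::labs(emin) - 1, std::labs(emax));

  enum_range r;
  r.vmax = 0;                    // 2^0 - 1
  while (r.vmax < need)
    r.vmax = 2 * r.vmax + 1;     // next value of the form 2^n - 1
  r.vmin = (emin < 0) ? -(r.vmax + 1) : 0;
  return r;
}
```

For the sign enumeration, range_of(-1, 1) gives -2 to 1, and for iostate, range_of(0, 4) gives 0 to 7, which matches the ranges described in the text.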

Because enumerations are distinct types, it is up to the programmer to decide which operations are permitted. Of course, you can use any integer operations, and cast the result back to the enumerated type, but in some cases, you might want to overload certain operators to avoid the inconvenience of type casting. For example, you might want to overload the bitwise operators, but not the arithmetic operators, for the iostate type. The sign type does not need any additional operators; the comparison operators work just fine by implicitly converting sign values to integers. Other enumerations might call for overloading ++ and -- operators (similar to the succ and pred functions in Pascal). How you handle overflow and underflow is up to you. Example 3-21 shows how operators can be overloaded for enumerations.

Example 3-21: Overloading operators for enumerations.

// Explicitly cast to int, to avoid infinite recursion.
inline iostate operator|(iostate a, iostate b) {
  return iostate(int(a) | int(b));
}
inline iostate& operator|=(iostate& a, iostate b) {
  a = a | b;
  return a;
}
// repeat for &, ^, ~

int main()
{
  iostate err = goodbit;
  if (error())
    err |= badbit;
}
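The ++ operator can be overloaded in a similar way. The following sketch uses a made-up weekday enumeration and handles overflow by wrapping around, which is only one of several reasonable policies.

```cpp
#include <cassert>

enum weekday { mon, tue, wed, thu, fri };

// Prefix ++ advances to the next day, wrapping from fri to mon.
inline weekday& operator++(weekday& d)
{
  d = (d == fri) ? mon : static_cast<weekday>(int(d) + 1);
  return d;
}

// Postfix ++ advances the day and returns the old value.
inline weekday operator++(weekday& d, int)
{
  weekday old = d;
  ++d;
  return old;
}

void check_weekday()
{
  weekday d = thu;
  assert(++d == fri);
  assert(d++ == fri);   // old value is returned
  assert(d == mon);     // wrapped around
}
```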

typedef Declarations

A typedef declares a synonym for an existing type. Syntactically, typedef is a specifier in a declaration, and it must be combined with type specifiers and optional cv qualifiers (no storage class specifiers). After the specifiers comes the list of declarators.

The declarator of a typedef declaration is similar to that for an object declaration (as described earlier in this chapter), except you cannot have an initializer. Following are some examples of typedef declarations:

typedef double matrix[3][3];
typedef void (*thunk)();
typedef signed char SCHAR;

By convention, the typedef keyword appears before the type specifiers. For example,

typedef unsigned int UINT;   // conventional
long typedef unsigned ULONG; // valid, but strange

A typedef is especially helpful with complex declarations, such as function pointers. A typedef can also provide helpful information for the person who must read and maintain the code. Example 3-17, earlier in this chapter, has examples of how to use typedef to simplify declarations and make them easier to read.

A typedef does not create a new type, the way class and enum do. It simply declares a new name for an existing type. Therefore, function declarations where the parameters differ only as typedefs are not actually different declarations, as shown in the following example:

typedef unsigned int UINT;
UINT func(UINT);             // two declarations of the
unsigned func(unsigned);     // same function

Similarly, because you cannot overload an operator on fundamental types, you cannot overload an operator on typedef synonyms for fundamental types. For example,

int operator+(int, int); // error
typedef int INT;
INT operator+(INT, INT); // error

C programmers are accustomed to declaring a typedef for struct, union, and enum declarations, but they are not necessary in C++. In C, the struct, union, and enum namespaces are separate from the type namespace, but in C++, the declaration of a struct, union, class, or enum also adds the type to the type namespace. Nonetheless, such a typedef is harmless. C++ lets you define a type name as a synonym for itself. For example,

struct point { int x, y; };
typedef struct point point;// not needed in C++, but harmless
point pt;