C++ has a rich I/O library, often called I/O streams because the header most often used is <iostream>
. This chapter presents an overview of the C++ I/O library. For details of individual classes and functions, see Chapter 13.
As with C and most modern languages, input and output in C++ is taken care of entirely by the library, with no language features that specifically support I/O.
The C++ I/O library is based on a set of templates, parameterized on the character type. Thus, you can read and write plain char
-type characters, wide wchar_t
characters, or some other, exotic character type that you might need to invent. (But read about character traits in Chapter 9 first.) Figure 10-1 depicts the class hierarchy. Notice that the names are of the form basic_
name. These are the template names; the specializations have the more familiar names, e.g., istream
specializes basic_istream<char>
.
Figure 10-1: The I/O stream class hierarchy.
The ios_base
class declares types and constants that are used throughout the I/O library. Formatting flags, I/O state, open modes, and seek directions are all declared in ios_base
.
The basic_istream
template declares input functions, and basic_ostream
declares output functions. The basic_iostream
template inherits input and output functions through multiple inheritance from basic_istream
and basic_ostream
.
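A small sketch can make the inheritance relationship concrete. The helper names read_word, write_word, and round_trip are made up for this illustration; the point is only that a single stringstream object, which derives from iostream, can be passed where either an istream or an ostream is expected.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Uses only the input interface (basic_istream<char>).
std::string read_word(std::istream& in)
{
    std::string word;
    in >> word;
    return word;
}

// Uses only the output interface (basic_ostream<char>).
void write_word(std::ostream& out, const std::string& word)
{
    out << word;
}

// std::stringstream derives from std::iostream, so the same
// object can be passed to both helper functions.
std::string round_trip(const std::string& word)
{
    std::stringstream stream;
    write_word(stream, word);
    return read_word(stream);
}
```

Calling round_trip("hello") writes the word through the output interface and reads it back through the input interface of the same object.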
The I/O library supports formatted and unformatted I/O. Unformatted I/O simply reads or writes characters or character strings without interpretation. The I/O streams have a number of functions for performing unformatted I/O.
Formatted input can skip over leading white space, parse text as numbers, and interpret numbers in different bases (decimal, octal, hexadecimal). Formatted output can pad fields to a desired width and write numbers as text in different bases. Formatted I/O uses the stream's locale to parse numeric input or format numeric output.
To perform formatted I/O, the I/O streams overload the shift operators: left shift (<<
) writes and right shift (>>
) reads. Think of the shift operators as arrows pointing in the direction of data flow: output flows from a variable to the stream (cout
<<
var
). Input flows from the stream to a variable (cin
>>
var
). The string, complex, and other types in the standard library overload the shift operators so you can perform I/O with these objects just as easily as you can with the primitive types.
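As a rough illustration of the library's own overloads, the following sketch reads a string and a complex number with >> and writes them back with <<. The function name echo_complex is invented for this example.

```cpp
#include <complex>
#include <sstream>
#include <string>

// Reads a label and a complex number, then writes both back out.
std::string echo_complex(const std::string& text)
{
    std::istringstream in(text);
    std::string label;
    std::complex<double> z;
    in >> label >> z;            // a word, then a value such as "(1,2)"

    std::ostringstream out;
    out << label << ' ' << z;    // complex is written as "(real,imag)"
    return out.str();
}
```

The same two operators work for the primitive types, which is why user-defined types that follow the convention fit in so naturally.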
When you define your own classes where I/O is meaningful, you should overload these operators, too. A common mistake is to overload the shift operators for istream
and ostream
, but that prevents the use of your operators for wide-character streams or streams with custom character traits. A better solution, and one that is no more difficult, is to write a function template that is as general as possible.
Other guidelines to follow when writing custom I/O functions are as follows:
Set failbit
for malformed input.
Example 10-1 shows an example of the rational
class, which represents a rational number. (See Example 9-3 for more information about rational
.)
Example 10-1: Performing I/O with the rational class template.
// Read a rational number. The numerator and denominator
// must be written as two numbers (the first can be signed)
// separated by a slash (/), e.g., "2/3", "-14/19".
template<typename T, typename charT, typename traits>
::std::basic_istream<charT, traits>&
operator>>(::std::basic_istream<charT, traits>& in, rational<T>& r)
{
  // typename is required because these are dependent types.
  typename rational<T>::numerator_type n;
  typename rational<T>::denominator_type d;
  // Use the stream's character type, so wide streams work, too.
  charT c;

  if (! (in >> n))
    return in;
  // Allow white space before and after the dividing '/'.
  if (! (in >> c))
    return in;
  if (! traits::eq(c, in.widen('/'))) {
    // Malformed input.
    in.setstate(::std::ios_base::failbit);
    return in;
  }
  if (! (in >> d))
    return in;
  r.set(n, d);
  return in;
}

// Print a rational number as two integers separated
// by a slash. Use a string stream so the two numbers
// are written without padding, and the overall
// formatted string is then padded to the desired width.
template<typename T, typename charT, typename traits>
::std::basic_ostream<charT, traits>&
operator<<(::std::basic_ostream<charT, traits>& out, const rational<T>& r)
{
  // Use the same flags, locale, etc. to write the
  // numerator and denominator to a string stream.
  ::std::basic_ostringstream<charT, traits> s;
  s.flags(out.flags());
  s.imbue(out.getloc());
  s.precision(out.precision());
  s << r.numerator() << '/' << r.denominator();

  // Write the string to out, padding the entire fraction
  // as needed.
  out << s.str();
  return out;
}
Because the C++ library includes the C library, you can use any of the C stdio functions, such as fopen
and printf
. In most cases, though, a C++ program should use C++ I/O functions rather than C functions.
The printf
and scanf
functions are noteworthy for their lack of safety. Although some compilers are now checking the format strings and comparing them with the actual arguments, many compilers do not. It is too easy to make a simple mistake. If you are lucky, your program will fail immediately. If you are unlucky, your program will appear to work on your system and fail only when it is run on your customers' systems. Following is an example of a common mistake when using printf
:
size_t s;
printf("size=%u\n", s);
The problem with the example is that the size_t
type might be unsigned
long
, which means the argument s
and the format %u
do not match. On some systems, the mismatch is harmless, but on others, wrong information will be printed, or worse.
Other functions, such as gets
and sprintf
, are unsafe because they write to character arrays with no way to limit the number of characters written. Without a way to prevent buffer overruns, these functions are practically useless.
Another limitation of the C stdio library is that it has little support for alternate locales. In C++, every stream can have a different locale. This lets you write a program, say, that reads a data file in a fixed format (such as the "C"
locale), but prints the human-readable results in the native locale. Writing such a program using the C stdio library requires changing locales between reading and writing, which is much less convenient.
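The following sketch shows two streams with different locales in one program. Because a named native locale might not be installed on every system, it uses a hypothetical facet, grouped_numpunct, as a stand-in for the user's locale; reformat is likewise an invented name.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet standing in for a user's native locale:
// groups digits in threes, separated by commas.
class grouped_numpunct : public std::numpunct<char> {
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Reads integers in the fixed "C" format and writes them
// using the grouping rules of the output stream's locale.
std::string reformat(const std::string& data)
{
    std::istringstream in(data);
    in.imbue(std::locale::classic());

    std::ostringstream out;
    out.imbue(std::locale(std::locale::classic(), new grouped_numpunct));

    long value;
    while (in >> value)
        out << value << '\n';
    return out.str();
}
```

Neither stream affects the other, and the global locale is never changed, which is the convenience that the C stdio library lacks.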
In spite of all these problems, some C++ programmers still use the C library. The printf
function, in particular, has a brevity that appeals to C and C++ programmers. Compare the following examples,
unsigned long count, mask;
...
printf("count=%-9.9ld\n"
       "mask=%#-8.8lx\n", count, mask);

cout.fill('0');
cout << "count=" << right << dec << setw(9) << count <<
        "\nmask=0x" << hex << setw(8) << mask << '\n';
I recommend using the C++ I/O streams. You might think that printf
is saving you time now, but the lack of safety can present major problems in the future. (Reread the example and imagine what happens when count
exceeds the maximum value for type long
.)
Sometimes, use of the C stdio library is necessary (perhaps legacy C code is being called from C++ code). To help in such situations, the standard C++ I/O objects are associated with their corresponding C FILE
s. The C++ cin
object is associated with the C stdin
object, cout
with stdout
, and cerr
and clog
are associated with stderr
. You can mix C and C++ I/O functions on the standard I/O objects. When mixing C and C++ I/O, set the unitbuf
flag to ensure the C++ buffer is flushed to avoid getting the C and C++ I/O buffers mixed up.
This section lists the I/O-related headers in the C++ standard library, with a brief description of each. See the corresponding sections in Chapter 13 for details.
<fstream>: Declares fstream, ifstream, ofstream, wfstream, wifstream, wofstream, and other types.
<iomanip>: Declares the manipulators that take arguments.
<ios>: Declares ios_base, basic_ios, and some common manipulators. All I/O streams derive from basic_ios, which derives from ios_base.
<iosfwd>: Declares forward declarations of the I/O stream types. Using <iosfwd> can reduce the compile-time burden in certain situations.
<iostream>: Declares the standard I/O objects: cin, cout, etc.
<istream>: Declares types for input streams (istream, wistream) and for input and output streams (iostream, wiostream).
<ostream>: Declares types for output streams (ostream, wostream).
<sstream>: Declares string streams (istringstream, ostringstream, stringstream, etc.), which read from and write to strings using the stream protocol.
<streambuf>: Declares the basic_streambuf class template.
<strstream>: Declares the deprecated character-array streams. Use <sstream> instead. This header is not covered in Chapter 13.
Formatted I/O means values are converted to text for output, and various transformations are applied to prepare the text according to the programmer's specifications. On input, formatting controls how text is converted to values, and whether white space is skipped prior to reading an input field.
To control formatting, a stream keeps track of a set of flags, a field width, and a precision. Table 13-12 (in the <ios>
section) lists all the formatting flags.
The formatted input functions are overloaded operator>>
. If the skipws
flag is set, white space characters (according to the locale imbued in the stream) are skipped, and input begins with the first non-white space character.
If reading a string
or character array, all non-space characters are read into the string, ending with the first white space character or when width
characters have been read (if width
> 0), whichever comes first. The width
is then reset to zero.
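To see the width rule in action, consider this sketch, in which read_two is a name invented for the illustration. The first read is limited to four characters by setw; the second read has no limit because the width has already been reset.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <utility>

// Reads one width-limited string, then one unlimited string.
std::pair<std::string, std::string> read_two(const std::string& text)
{
    std::istringstream in(text);
    std::string first, second;
    in >> std::setw(4) >> first;   // stops after at most 4 characters
    in >> second;                  // width is 0 again: read to white space
    return std::make_pair(first, second);
}
```

Given the input "abcdefgh ij", the first string receives "abcd" and the second picks up where the first stopped, reading "efgh".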
For all other types, the width
and precision
are not used. To read a number from a fixed-width field, read the field into a string, then use a string stream to read the number, as shown in Example 10-2.
Example 10-2: Reading a number from a fixed-width field.
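#include <istream>
#include <sstream>
#include <string>

template<typename T, typename charT, typename traits>
std::basic_istream<charT, traits>&
fixedread(std::basic_istream<charT, traits>& in, T& x)
{
  if (in.width() == 0)
    // Not fixed size, so read normally.
    in >> x;
  else {
    // Read at most width() characters into a string that uses
    // the stream's character type, then parse the string.
    std::basic_string<charT, traits> field;
    in >> field;
    std::basic_stringstream<charT, traits> stream(field);
    // Report a malformed field to the caller.
    if (! (stream >> x))
      in.setstate(std::ios_base::failbit);
  }
  return in;
}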
The only other flags that affect input are basefield
and boolalpha
.
The basefield
flags determine how integers are interpreted: in a fixed radix (oct
, hex
, or dec
), or if zero, the input determines the radix (leading 0x
or 0X
for hexadecimal, leading 0
for octal, otherwise decimal).
If the boolalpha
flag is set, a formatted read of bool
reads a string, which must match the names true
or false
(in the stream's locale, according to the numpunct
facet). If the boolalpha
flag is not set, a bool is read as a long integer, and the number is converted to bool
using the standard rules: non-zero is true
and zero is false
.
Floating point numbers are accepted in fixed or scientific format.
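The effect of the basefield and boolalpha flags on input can be sketched as follows. The parsed struct and the parse function are hypothetical names used only for this illustration, and the stream is assumed to use the classic locale, where the bool names are true and false.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical record for the values read by parse().
struct parsed {
    int hex_value;
    int oct_value;
    int dec_value;
    bool flag;
};

parsed parse(const std::string& text)
{
    std::istringstream in(text);
    parsed p = parsed();
    // Clear basefield so each number's prefix selects the radix.
    in.unsetf(std::ios_base::basefield);
    in >> p.hex_value >> p.oct_value >> p.dec_value;
    // With boolalpha set, a bool is read by name.
    in >> std::boolalpha >> p.flag;
    return p;
}
```

For the input "0x1F 017 42 true", the three integers are read as 31, 15, and 42, and the flag is read as true.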
The decimal point character is determined by the stream's locale, as is the thousands separator. Thousands separators are optional in the input stream, but if present, they must match the locale's thousands separator character and grouping rules. For example, if the thousands separator is ","
and the grouping is for every 3 characters (grouping()
returns "\3"
), following are some valid and invalid input examples:
1,234,567   // valid
1,234,56    // invalid
1234567     // valid
1234,567    // invalid
Thus, when reading data that the user types, you should imbue the input stream with the user's native locale (e.g., cin.imbue(locale(""))
). For input that is being read from files or other sources that require portable data formats, be sure to use the "C"
or classic locale (e.g., cin.imbue(locale::classic())
).
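The grouping rules can be tried out without depending on any installed locale. This sketch assumes a hypothetical facet, comma_grouping, that uses "," as the thousands separator with groups of three; parses is an invented helper that reports whether the whole read succeeded.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: comma separator, groups of three digits.
class comma_grouping : public std::numpunct<char> {
protected:
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Returns true if text can be read as a long under the
// grouping rules; stores the number in value.
bool parses(const std::string& text, long& value)
{
    std::istringstream in(text);
    in.imbue(std::locale(std::locale::classic(), new comma_grouping));
    return !(in >> value).fail();
}
```

Input with correctly placed separators, or with no separators at all, is accepted; input whose separators do not match the grouping causes the stream to set failbit.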
See the num_get
facet in the <locale>
section of Chapter 13 for details of how numeric input is parsed and interpreted.
The formatted output functions are overloaded operator<<
. They all work similarly: they use the flags and other information to format a value as a string. The string is then padded with a fill
character to achieve the desired width
. The adjustfield
flags are used to determine where the fill
characters are added (to the left
, right
, or internal
: after a sign or a leading 0x
or 0X
).
The padded string is written to the output stream, and the stream's width
is reset to zero. The width
is the only formatting parameter that is reset. The flags, precision
, and fill
character are "sticky" and persist until they are changed explicitly.
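The difference between the one-shot width and the sticky settings shows up in a short sketch such as this one (sticky_demo is an invented name):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string sticky_demo()
{
    std::ostringstream out;
    out << std::setfill('*') << std::hex;
    out << std::setw(6) << 255;         // padded to 6: "****ff"
    out << ' ' << 255;                  // width was reset: "ff"
    out << ' ' << std::setw(4) << 10;   // fill and hex persist: "***a"
    return out.str();
}
```

Only the width must be set again before each field. The fill character and the base remain in effect for the rest of the function.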
Formatting an integer depends on basefield
(hex
, oct
, or default is dec
), uppercase
(0X
for hexadecimal), showpos
(insert a +
for positive numbers), and showbase
flags (to insert a prefix of 0x
or 0X
for hexadecimal or 0
for octal). If the locale's numpunct
facet specifies thousands grouping, thousands separators are inserted at the desired positions.
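Here is a brief sketch of the integer flags at work, assuming the classic locale (so no thousands separators are inserted). The function name integer_formats is made up for the example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

std::string integer_formats()
{
    std::ostringstream out;
    out << std::showbase << std::hex << 255 << ' ';     // 0xff
    out << std::uppercase << 255 << ' ';                // 0XFF
    out << std::nouppercase << std::oct << 8 << ' ';    // 010
    out << std::dec << std::showpos << 42 << ' ';       // +42
    // internal puts the fill characters after the 0x prefix.
    out << std::noshowpos << std::hex << std::internal
        << std::setfill('0') << std::setw(8) << 255;    // 0x0000ff
    return out.str();
}
```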
Formatting a floating point number depends on the floatfield
(fixed
, scientific
, or zero for general), uppercase
(E
for exponent), showpoint
(insert decimal point even if not needed), and showpos
(insert a +
for positive numbers) flags.
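The floating-point flags can be sketched the same way. This example assumes the classic locale and a precision of 2; float_formats is an invented name.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

std::string float_formats()
{
    std::ostringstream out;
    out << std::setprecision(2);
    out << std::fixed << 3.14159 << ' ';        // 3.14
    out << std::scientific << 1234.5 << ' ';    // 1.23e+03
    // Clear floatfield to return to the general format.
    out.unsetf(std::ios_base::floatfield);
    out << std::showpoint << 2.0 << ' ';        // 2.0
    out << std::noshowpoint << std::showpos << 2.0;  // +2
    return out.str();
}
```

Note that the precision means digits after the decimal point in fixed and scientific formats, but significant digits in the general format.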
If the boolalpha
flag is set, bool
values are written as names (e.g., true or false, depending on the locale). If boolalpha
is not set, bool
values are written as integers, as described earlier.
When writing output for the user's immediate consumption, you should imbue the output stream with the user's native locale (e.g., cout.imbue(locale(""))
). For output that is being written to files or other destinations that require portable data formats, be sure to use the "C"
or classic locale (e.g., cout.imbue(locale::classic())
).
See the num_put
facet in the <locale>
section of Chapter 13 for details of how numeric output is formatted.
Unformatted I/O involves characters, character arrays, or strings, which are read or written without interpretation, padding, or other adjustment.
The unformatted read functions can read into a string or character array. The gcount()
function returns the number of characters read.
Unformatted output can write a string or character array. You can specify the exact number of characters to write from a character array, or you can write all characters up to a null character (to write a C-style null-terminated string).
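A minimal sketch of unformatted I/O follows. The function copy_prefix is a hypothetical helper that copies up to eight characters from one string stream to another, using gcount() to learn how many characters were actually read.

```cpp
#include <sstream>
#include <string>

// Copies at most 8 characters, without any interpretation.
std::string copy_prefix(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    char buffer[8];
    in.read(buffer, sizeof buffer);    // white space is not skipped
    out.write(buffer, in.gcount());    // write exactly what was read
    return out.str();
}
```

Unlike a formatted read, read() keeps leading white space, and it is not an error in this sketch for the input to be shorter than the buffer, because gcount() reports the true count.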
The I/O stream classes rely on stream buffers for the low-level input and output. Usually, you can ignore the stream buffer, and deal only with the high-level stream object. Dropping down to the stream buffer level might be called for in some circumstances, though.
For example, there are several ways to copy a file in C++. Example 10-3 shows how a C programmer might copy a stream, once that programmer has learned about templates.
Example 10-3: Copying streams one character at a time.
template<typename charT, typename traits>
void copy(std::basic_ostream<charT, traits>& out,
          std::basic_istream<charT, traits>& in)
{
  // Use the stream's character type, not plain char.
  charT c;
  while (in.get(c))
    out.put(c);
}
After measuring the performance, the intrepid programmer might decide that copying larger buffers is the right way to go. Example 10-4 shows the new approach. On my system, the new version runs roughly twice as fast as the original version. (Of course, performance measures depend highly on compiler, library, environment, and so on.)
Example 10-4: Copying streams with explicit buffers.
template<typename charT, typename traits>
void copy(std::basic_ostream<charT, traits>& out,
          std::basic_istream<charT, traits>& in)
{
  const std::size_t BUFFER_SIZE = 8192;
  // A vector owns the array and frees it correctly;
  // auto_ptr would call delete instead of delete[].
  std::vector<charT> buffer(BUFFER_SIZE);
  while (in) {
    in.read(&buffer[0], BUFFER_SIZE);
    out.write(&buffer[0], in.gcount());
  }
}
After reading more about the C++ standard library, the programmer might try to improve the performance by delegating all the work to the stream buffer, as shown in Example 10-5.
Example 10-5: Copying streams via stream buffers.
template<typename charT, typename traits>
void copy(std::basic_ostream<charT, traits>& out,
          std::basic_istream<charT, traits>& in)
{
  out << in.rdbuf();
}
The newest version runs about as fast as the previous version, but is much simpler to read and write.
Another reason to mess around with stream buffers is that you might need to write your own. Perhaps you are implementing a network I/O package. The user opens a network stream that connects to a particular port on a particular host and then performs I/O using the normal I/O streams. To implement your package, you must derive your own stream buffer class template from the basic_streambuf
class template in <streambuf>
.
Your stream buffer must manage the actual buffer (character array) and set the buffer pointers, take care of the communication with the network, and define what it means to "put back" a character, seek on the stream, and so on. Example 10-6 shows an extremely over-simplified sketch of how the basic_networkbuf
class template might work.
Example 10-6: The basic_networkbuf class template.
template<typename charT, typename traits = ::std::char_traits<charT> >
class basic_networkbuf : public ::std::basic_streambuf<charT, traits>
{
public:
  typedef charT char_type;
  typedef traits traits_type;
  typedef typename traits::int_type int_type;
  typedef typename traits::pos_type pos_type;
  typedef typename traits::off_type off_type;

  basic_networkbuf();
  virtual ~basic_networkbuf();

  bool is_connected();
  basic_networkbuf* connect(const char* hostname, int port,
                            ::std::ios_base::openmode mode);
  basic_networkbuf* disconnect();

protected:
  virtual ::std::streamsize showmanyc();
  virtual int_type underflow();
  virtual int_type overflow(int_type c = traits::eof());
  virtual pos_type seekoff(off_type offset,
                           ::std::ios_base::seekdir dir,
                           ::std::ios_base::openmode);
  virtual pos_type seekpos(pos_type sp,
                           ::std::ios_base::openmode);
  virtual basic_networkbuf* setbuf(char_type* buf,
                                   ::std::streamsize size);
  virtual int sync();

private:
  char_type* buffer_;
  ::std::streamsize size_;
  bool ownbuf_; // true means destructor must delete buffer_
  // network connectivity stuff...
};

// Constructor initializes the buffer pointers.
template<typename charT, typename traits>
basic_networkbuf<charT,traits>::basic_networkbuf()
: buffer_(new char_type[DEFAULT_BUFSIZ]),
  size_(DEFAULT_BUFSIZ), ownbuf_(true)
{
  this->setg(buffer_, buffer_ + size_, buffer_ + size_);
  // Leave room in the output buffer for one last character.
  this->setp(buffer_, buffer_ + size_ - 1);
}

// Return the number of characters available in the
// input buffer.
template<typename charT, typename traits>
::std::streamsize basic_networkbuf<charT,traits>::showmanyc()
{
  return this->egptr() - this->gptr();
}

// Fill the input buffer and set up the pointers.
template<typename charT, typename traits>
typename basic_networkbuf<charT,traits>::int_type
basic_networkbuf<charT,traits>::underflow()
{
  // Get up to size_ characters from the network,
  // storing them in buffer_. Store the actual number
  // of characters read in the local variable, count.
  ::std::streamsize count;
  count = netread(buffer_, size_);
  this->setg(buffer_, buffer_, buffer_ + count);
  if (this->egptr() == this->gptr())
    return traits::eof();
  else
    return traits::to_int_type(*this->gptr());
}

// The output buffer always has room for one more character,
// so if c is not eof(), add it to the output buffer. Then
// write the buffer to the network connection.
template<typename charT, typename traits>
typename basic_networkbuf<charT,traits>::int_type
basic_networkbuf<charT,traits>::overflow(int_type c)
{
  if (! traits::eq_int_type(c, traits::eof())) {
    *(this->pptr()) = traits::to_char_type(c);
    this->pbump(1);
  }
  netwrite(this->pbase(), this->pptr() - this->pbase());
  // The output buffer is now empty. Make sure it has
  // room for one last character.
  this->setp(buffer_, buffer_ + size_ - 1);
  return traits::not_eof(c);
}

// Force a buffer write.
template<typename charT, typename traits>
int basic_networkbuf<charT,traits>::sync()
{
  overflow(traits::eof());
  return 0;
}
A manipulator is a function object that can be used as an operand to an input or output operator to manipulate the stream (hence the name). Manipulators can send additional output to a stream, read input from a stream, set flags, and more. For example, to output a zero-padded, hexadecimal integer, you can use an ostream
's member functions or you can use manipulators. Example 10-7 shows both ways. You can decide which way you prefer.
Example 10-7: Manipulating an output stream to format a number.
using namespace std;

// Using member functions:
cout.fill('0');
cout.width(8);
cout.setf(ios_base::internal, ios_base::adjustfield);
cout.setf(ios_base::hex, ios_base::basefield);
cout << value;

// Using manipulators:
cout << setfill('0') << setw(8) << hex << internal << value;
Manipulators are declared in several different headers.
The <ios>
header declares the manipulators that set the formatting flags: boolalpha
, dec
, fixed
, hex
, internal
, left
, noboolalpha
, noshowbase
, noshowpoint
, noshowpos
, noskipws
, nouppercase
, nounitbuf
, oct
, right
, scientific
, showbase
, showpoint
, showpos
, skipws
, uppercase
, and unitbuf
.
The <istream>
header declares the input manipulator: ws
.
The <ostream>
header declares the output manipulators: endl
, ends
, and flush
.
The <iomanip>
header declares several additional manipulators: resetiosflags
, setiosflags
, setbase
, setfill
, setprecision
, setw
.
Most manipulators are declared in the same headers as the stream type they manipulate. The only time you need to #include
an additional header is when you use a manipulator that takes an argument. Those manipulators are in the <iomanip>
header.
To write your own manipulator, use the standard ones as a pattern. The easiest are manipulators that take no arguments. The manipulator is simply a function that takes a stream as an argument and returns the same stream. The standard streams overload operator<<
and operator>>
to take such a function pointer as an operand.
For example, suppose you want to write an input manipulator that skips all characters up to and including a newline. (Perhaps this manipulator is used by a command processor after reading a //
comment sequence.) Example 10-8 shows one way to write the skipline
manipulator.
Example 10-8: Skipping a line in an input stream.
template<typename charT, typename traits>
::std::basic_istream<charT,traits>&
skipline(::std::basic_istream<charT,traits>& in)
{
  charT c;
  while (in.get(c) && c != '\n')
    ;
  return in;
}
...
std::cin >> x >> skipline >> nextline;
Manipulators that take arguments are harder to write, but only slightly. You need to write some supporting infrastructure, such as additional overloaded operator>>
or operator<<
functions.
For example, suppose you want to parameterize your input skip
manipulator, so it skips everything up to a caller-supplied character. This manipulator will be defined as a class, where the constructor takes the manipulator's argument, that is, the delimiter character. You must overload operator>>
so it recognizes your manipulator as an operand, and invokes the manipulator's operator()
. You don't need to use operator()
, but that's a good choice when building a reusable infrastructure for manipulators. Example 10-9 shows the new skip
manipulator.
Example 10-9: Writing a manipulator that takes an argument.
template<typename charT, typename traits = ::std::char_traits<charT> >
class skip
{
public:
  typedef charT char_type;
  typedef ::std::basic_istream<charT,traits> stream_type;
  skip(char_type delim) : delim_(delim) {}
  void operator()(stream_type& stream) const;
private:
  char_type delim_;
};

// Read and discard characters up to and including delim_.
template<typename charT, typename traits>
void skip<charT,traits>::operator()(stream_type& stream) const
{
  char_type c;
  while (stream.get(c) && c != delim_)
    ;
}

// Let a skip object appear as the operand of >>.
template<typename charT, typename traits>
std::basic_istream<charT,traits>&
operator>>(std::basic_istream<charT,traits>& stream,
           const skip<charT,traits>& f)
{
  f(stream);
  return stream;
}
...
std::cin >> x >> skip<char>('\n') >> nextline;
By default, I/O streams do not raise exceptions for errors. Instead, each stream keeps a mask of error bits, called the iostate, which the rdstate()
function returns. A common idiom is to read from an input stream until the input stream is false:
while (cin.get(c)) cout.put(c);
The basic_ios
class overloads operator void*
to return a null pointer for failure. Similarly, it overloads operator!
to return true for failure. This latter test is often used in conditional statements, e.g.,
if (! cout) throw("write error");
The conditional tests define "failure" as when fail()
returns true, that is, when failbit or badbit
is set in the iostate. The failbit is typically set when an input operation produces no input or an output operation does not write any characters.
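The state bits can be examined individually to decide how to recover. The following is only a sketch of one possible policy, with read_report as an invented name: it reads integers, skips any word that is not a number, and stops at end of file or on a serious error.

```cpp
#include <sstream>
#include <string>

// Describes what happened to each token in text.
std::string read_report(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream report;
    int number;
    for (;;) {
        if (in >> number) {
            report << "int ";
        } else if (in.bad()) {
            report << "bad";          // serious error: give up
            break;
        } else if (in.eof()) {
            report << "eof";          // nothing left to read
            break;
        } else {
            // Only failbit is set: the input was malformed.
            in.clear();               // reset the state bits
            std::string junk;
            in >> junk;               // discard the offending word
            report << "skip ";
        }
    }
    return report.str();
}
```

The call to clear() matters: while failbit is set, every subsequent read fails immediately without consuming any input.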
Sometimes, instead of testing for failure after each I/O operation, you want to simplify your code. You can assume that every operation succeeds, and arrange for the stream to throw an exception for any failure. In addition to the iostate mask, every stream has an exception mask, where the bits of the exception mask correspond to bits in the iostate mask. When the iostate mask changes, if any bit is set in both masks, the stream throws an ios_base::failure
exception.
For example, suppose you set the exception mask to failbit
|
badbit
. During a normal program run, the input stream's iostate is initially zero. After reading the last item from the input stream, eofbit
is set in iostate. At this time, rdstate()
&
exceptions()
is still zero, so the program continues by processing the last input item. The next time it tries to read from the input stream, no characters are read (because eofbit
is set), which causes the input stream to set failbit
. Now rdstate()
&
exceptions()
returns a non-zero value, so the stream throws ios_base::failure
.
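The scenario just described can be sketched with a string stream. The function sum_all is hypothetical; it sets the exception mask, reads integers in an endless loop, and relies on the exception to end the loop when the input is exhausted.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Adds up all the integers in text. Sets threw to true if the
// loop ended because the stream threw ios_base::failure.
int sum_all(const std::string& text, bool& threw)
{
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    int sum = 0;
    threw = false;
    try {
        int value;
        for (;;) {
            in >> value;      // throws when failbit or badbit is set
            sum += value;
        }
    } catch (const std::ios_base::failure&) {
        threw = true;
    }
    return sum;
}
```

For the input "1 2 3", reading the 3 sets only eofbit, so the value is processed normally. The fourth read sets failbit, and the exception ends the loop with a sum of 6.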
A stream often relies on other objects (especially locale facets) to parse input or format output. If these other objects throw exceptions, the stream catches the exception and sets badbit
. If badbit
is set in the exceptions()
mask, the original exception is rethrown.
When testing for I/O success, be sure to test for badbit
, as a special indicator of a serious failure. A simple test for !
cin
does not distinguish between different reasons for failure: eofbit
|
failbit
might signal a normal end of file, but failbit
|
badbit
might tell you that there is something seriously wrong with the input stream (say, a disk error). One possibility, therefore, is to set badbit
in the exceptions()
mask, so normal control flow deals with the normal situation of reading an end of file, but more serious errors result in exceptions, as shown in Example 10-10.
Example 10-10: Handling serious I/O errors.
#include <algorithm>
#include <cstddef>
#include <exception>
#include <iostream>
#include <map>
#include <string>
#include <utility>

// The parameter type matches the map's value_type exactly,
// so no temporary pair is created for each element.
void print(const std::pair<const std::string, std::size_t>& count)
{
  std::cout << count.first << '\t' << count.second << '\n';
}

int main()
{
  using namespace std;

  try {
    string word;
    map<string, size_t> counts;

    // Throw only for serious errors; end of file is handled
    // by normal control flow.
    cin.exceptions(ios_base::badbit);
    cout.exceptions(ios_base::badbit);
    while (cin >> word)
      ++counts[word];
    for_each(counts.begin(), counts.end(), print);
  } catch(ios_base::failure& ex) {
    std::cerr << "I/O error: " << ex.what() << '\n';
    return 1;
  } catch(exception& ex) {
    std::cerr << "Fatal error: " << ex.what() << '\n';
    return 2;
  } catch(...) {
    std::cerr << "Total disaster.\n";
    return 3;
  }
}