
Chapter 4 - The Standard I/O Library

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Formatted Input and Output
Up to this point, this chapter has discussed methods of performing unformatted I/O. The programs in Examples 4-1, 4-2, and 4-3 simply read and write bytes, without assigning any particular meaning to them. Although this type of I/O is performed all the time, it is also necessary to be able to read or write data that is formatted in a particular way, usually to make it easier for human beings to understand and work with. The Standard I/O Library provides two sets of functions to do this: the printf functions handle writing formatted output, and the scanf functions handle reading formatted input.
The printf Functions
The printf functions allow you to print data in a wide variety of formats:
    #include <stdio.h>
    int printf(const char *format, ...);
    int fprintf(FILE *stream, const char *format, ...);
    int sprintf(char *s, const char *format, ...);
All three functions convert, format, and print their arguments according to the instructions contained in the format string. The printf function writes to the standard output, the fprintf function writes to the referenced stream, and the sprintf function stores its output, followed by a terminating null character, in the array of characters pointed to by s. The number of arguments passed to each of these functions can vary, because the contents of the format string specify unambiguously how many arguments there are. Each function returns the number of characters written (for sprintf, not counting the terminating null character) or a negative value if an error occurs. Some implementations return the constant EOF in that case, but portable code should test only for a value less than zero.
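As a minimal sketch of how the three functions differ only in where the output goes, consider the following. The helper name report_status and the message text are illustrative assumptions, not part of the original example set.

```c
#include <stdio.h>

/*
 * report_status is a hypothetical helper: it formats the same message
 * three ways and returns the character count reported by sprintf.
 * The caller must supply a buffer large enough for the message.
 */
int
report_status(char *buf, const char *name, int count)
{
    /* printf writes to the standard output. */
    printf("%s: %d records\n", name, count);

    /* fprintf writes to the stream you name, here the standard error. */
    fprintf(stderr, "%s: %d records\n", name, count);

    /*
     * sprintf stores the text in buf and returns the number of
     * characters stored, not counting the terminating null character.
     */
    return sprintf(buf, "%s: %d records\n", name, count);
}
```

Calling report_status(buf, "inventory", 12) prints the line twice and leaves the same text in buf. The return value should equal the length of the string in buf.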
The format string can contain three types of characters:
 Plain characters that are simply copied to the output
 C-language escape sequences that represent non-graphic characters (such as \n and \t)
 Conversion specifications
A conversion specification, in its simplest form, is a percent sign (%) followed by a single character that indicates the type of conversion to perform. For each conversion specification, another argument is passed to the printf function following format. The arguments are passed in the same order in which their conversion specifications appear.
There are three basic data types that can be specified in a conversion specification: integers, floating-point numbers, and characters and character strings.
Integers
The conversion specifications for integers are as follows:
%d or %i
The argument, of type int, is converted to a signed decimal number. The %i specification is specific to ANSI C.
%o
The argument, of type int, is converted to an unsigned octal number.
%u
The argument, of type int, is converted to an unsigned decimal number.
%X or %x
The argument, of type int, is converted to an unsigned hexadecimal number. The X conversion uses the letters ABCDEF, and the x conversion uses abcdef.
Example 4-4 shows some examples of how these conversion specifications are used.
Example 4-4:  printf-int
#include <stdio.h>
#include <stdlib.h>     /* for exit() */
#define N   4
int numbers[N] = { 0, -1, 3, 169 };
int
main(int argc, char **argv)
{
    int i;
    for (i = 0; i < N; i++) {
        printf("Signed decimal:       %d\n", numbers[i]);
        printf("Unsigned octal:       %o\n", numbers[i]);
        printf("Unsigned decimal:     %u\n", numbers[i]);
        printf("Unsigned hexadecimal: %x\n\n", numbers[i]);
    }
    exit(0);
}
    % printf-int
    Signed decimal:       0
    Unsigned octal:       0
    Unsigned decimal:     0
    Unsigned hexadecimal: 0
    Signed decimal:       -1
    Unsigned octal:       37777777777
    Unsigned decimal:     4294967295
    Unsigned hexadecimal: ffffffff
    Signed decimal:       3
    Unsigned octal:       3
    Unsigned decimal:     3
    Unsigned hexadecimal: 3
    Signed decimal:       169
    Unsigned octal:       251
    Unsigned decimal:     169
    Unsigned hexadecimal: a9
An optional h character can be used to indicate that the argument corresponding to one of the above conversions is a short int (such as %hd) or unsigned short int (such as %hu). Likewise, an optional l character can be used to indicate a long int or unsigned long int.
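The size modifiers can be sketched as follows. The helper name format_sizes is an assumption made for illustration, and sprintf is used only so that the result can be inspected.

```c
#include <stdio.h>

/*
 * format_sizes is a hypothetical helper showing the h and l modifiers.
 * The caller supplies a buffer large enough for the result.
 */
int
format_sizes(char *buf, short s, unsigned short us, long l, unsigned long ul)
{
    /*
     * %hd and %hu describe short values (which C promotes to int when
     * they are passed); %ld and %lu describe long values.
     */
    return sprintf(buf, "%hd %hu %ld %lu", s, us, l, ul);
}
```

With the arguments -5, 65535, 100000L, and 4000000000UL, buf should receive the text "-5 65535 100000 4000000000".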
Floating-point numbers
The conversion specifications for floating-point numbers are as follows:
%f
The argument, of type double, is converted to decimal notation in the style [-]ddd.ddd. By default, six decimal digits are output.
%E or %e
The argument, of type double, is converted to decimal notation in the style [-]d.dddE±dd, where there is always one digit before the decimal point. By default, there are six digits after the decimal point. The E conversion causes an E to be used in the output, and the e conversion causes an e to be used.
%G or %g
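The argument, of type double, is converted to decimal notation in either of the above two styles, with the precision giving the number of significant digits (six by default). The e style (or E, for the G conversion) is used only if the exponent is less than -4 or greater than or equal to the precision; otherwise the f style is used. Trailing zeros are removed from the result.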
Example 4-5 shows some examples of how these conversion specifications are used.
Example 4-5:  printf-float
#include <stdio.h>
#include <stdlib.h>     /* for exit() */
#define N   4
double numbers[N] = { 0, -1.234, 67.890, 1234567.98765 };
int
main(int argc, char **argv)
{
    int i;
    for (i = 0; i < N; i++) {
        printf("f notation: %f\n", numbers[i]);
        printf("e notation: %e\n", numbers[i]);
        printf("g notation: %g\n\n", numbers[i]);
    }
    exit(0);
}
    % printf-float
    f notation: 0.000000
    e notation: 0.000000e+00
    g notation: 0
    f notation: -1.234000
    e notation: -1.234000e+00
    g notation: -1.234
    f notation: 67.890000
    e notation: 6.789000e+01
    g notation: 67.89
    f notation: 1234567.987650
    e notation: 1.234568e+06
    g notation: 1.23457e+06
An optional L character can be used to indicate that the argument corresponding to one of the above conversions is a long double (such as %Lf).
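A short sketch of the L modifier follows. The helper name format_long_double is hypothetical, and the example assumes a C library whose printf functions support long double.

```c
#include <stdio.h>

/*
 * format_long_double is a hypothetical helper. The caller supplies a
 * buffer large enough for the result.
 */
int
format_long_double(char *buf, long double value)
{
    /* Without the L, the conversion would expect a double argument. */
    return sprintf(buf, "%Lf %Le", value, value);
}
```

Passing 2.5L should store "2.500000 2.500000e+00" in buf.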
Characters and character strings
The conversion specifications for characters and character strings are as follows:
%c
The argument, of type int, is converted to an unsigned char and printed.
%s
The argument, a pointer to a character string, is copied to the output character-by-character up to (but not including) a terminating null character.
%%
This specification allows a percent sign to be printed; no argument is converted.
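The three specifications above can be combined in one format string, as in this sketch. The helper name format_grade and its parameters are assumptions made for illustration.

```c
#include <stdio.h>

/*
 * format_grade is a hypothetical helper. The caller supplies a buffer
 * large enough for the result.
 */
int
format_grade(char *buf, const char *name, char grade, int percent)
{
    /*
     * %s copies the string, %c prints the single character, and %%
     * prints a literal percent sign without consuming an argument.
     */
    return sprintf(buf, "%s earned a %c (%d%%)", name, grade, percent);
}
```

For example, format_grade(buf, "Pat", 'B', 85) should produce "Pat earned a B (85%)".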
Field width and precision
Example 4-6 shows a small program that prints out the cost of purchasing some number of items.
Example 4-6:  cost
#include <stdio.h>
#include <stdlib.h>     /* for exit() */
#define COST_PER_ITEM   1.25
void    printCost(int);
int
main(int argc, char **argv)
{
    int i;
    for (i = 1; i < 1000; i *= 10)
        printCost(i);
    exit(0);
}
void
printCost(int n)
{
    printf("Cost of %d items at $%f each = $%f\n", n, COST_PER_ITEM,
           n * COST_PER_ITEM);
}
    % cost
    Cost of 1 items at $1.250000 each = $1.250000
    Cost of 10 items at $1.250000 each = $12.500000
    Cost of 100 items at $1.250000 each = $125.000000
There are a couple of problems with this example. First, because the numbers representing the quantity of items you want to purchase are of different sizes, the equal signs don't line up, making the total prices difficult to compare easily. Second, because you're dealing with dollars and cents, you really want only two decimal places on each of the dollar amounts.
You can solve the first of these problems by using a field width. A field width specifies how many character positions should be used by a specific output conversion. If you change the %d in the format string to %3d, then you are telling printf to print each integer in a field three characters wide:
    Cost of   1 items at $1.250000 each = $1.250000
    Cost of  10 items at $1.250000 each = $12.500000
    Cost of 100 items at $1.250000 each = $125.000000
If you specify a positive number as a field width, the output is right justified in the field. If you use a negative number, as in %-3d, the output is left justified:
    Cost of 1   items at $1.250000 each = $1.250000
    Cost of 10  items at $1.250000 each = $12.500000
    Cost of 100 items at $1.250000 each = $125.000000
If you specify a leading zero in the field width, as in %03d, the output is padded with zeros instead of spaces:
    Cost of 001 items at $1.250000 each = $1.250000
    Cost of 010 items at $1.250000 each = $12.500000
    Cost of 100 items at $1.250000 each = $125.000000
To fix the second problem (the number of decimal places), you can use a precision specification. The precision is specified with a decimal point and then a number, and it indicates:
 For the d, i, o, u, x, and X conversions, the minimum number of digits to appear (the field is padded with leading zeros)
 For the e, E, and f conversions, the number of digits to appear after the decimal point
 For the g and G conversions, the number of significant digits
 For the s conversion, the maximum number of characters to be copied from the string
So, you can fix the printing of the cost per item by changing the %f to %.2f:
    Cost of   1 items at $1.25 each = $1.250000
    Cost of  10 items at $1.25 each = $12.500000
    Cost of 100 items at $1.25 each = $125.000000
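The other meanings of the precision listed above can be sketched in the same way. The helper name show_precision and the sample values are illustrative assumptions.

```c
#include <stdio.h>

/*
 * show_precision is a hypothetical helper that applies a precision to
 * four different conversions. The caller supplies a buffer large
 * enough for the result.
 */
int
show_precision(char *buf)
{
    return sprintf(buf, "[%.5d] [%.3s] [%.3g] [%.1e]",
                   42,          /* d: at least five digits */
                   "abcdef",    /* s: at most three characters */
                   1234.5678,   /* g: three significant digits */
                   1234.5678);  /* e: one digit after the decimal point */
}
```

The result should be "[00042] [abc] [1.23e+03] [1.2e+03]".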
To fix the total cost, you need not only to print just two decimal digits but also to get the decimal points to line up. To do this, use both a field width and a precision. Since the largest number occupies six character positions, you can change the %f to %6.2f. Example 4-7 shows the final result of all of these changes.
Example 4-7:  cost-fmt
#include <stdio.h>
#include <stdlib.h>     /* for exit() */
#define COST_PER_ITEM   1.25
void    printCost(int);
int
main(int argc, char **argv)
{
    int i;
    for (i = 1; i < 1000; i *= 10)
        printCost(i);
    exit(0);
}
void
printCost(int n)
{
    printf("Cost of %3d items at $%.2f each = $%6.2f\n", n, COST_PER_ITEM,
           n * COST_PER_ITEM);
}
    % cost-fmt
    Cost of   1 items at $1.25 each = $  1.25
    Cost of  10 items at $1.25 each = $ 12.50
    Cost of 100 items at $1.25 each = $125.00
You can also specify both field widths and precisions with an asterisk character (*) instead of a number. In this case, the field width or precision is read from the next argument in the argument list. For example:
    double n = 3.14159265;      /* sample value to print */
    int fieldwidth, precision;
    fieldwidth = 10;
    precision = 4;
    printf("%*.*f\n", fieldwidth, precision, n);
Note that the field width and precision precede the value to be printed in the argument list.
Variable argument lists
Most newer versions of the Standard I/O Library offer a set of printf functions that accept varargs-style argument lists instead of explicit lists of arguments:
    #include <stdarg.h>
    #include <stdio.h>
    int vprintf(const char *format, va_list ap);
    int vfprintf(FILE *stream, const char *format, va_list ap);
    int vsprintf(char *s, const char *format, va_list ap);
These functions make it much easier to call the printf functions from routines that themselves accept a variable number of arguments. For example, to create a function error that works just like printf except that it writes to the standard error output and always prepends the name of the program to its output, you could use the following code:
#include <stdarg.h>
#include <stdio.h>
void
error(const char *format, ...)
{
    va_list ap;
    extern char *programName;
    va_start(ap, format);
    fprintf(stderr, "%s: ", programName);
    vfprintf(stderr, format, ap);
    va_end(ap);
}
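The same pattern works with vsprintf when the text is wanted in a buffer rather than on a stream. The following is a minimal sketch; the helper name format_tagged is an assumption, and the caller is assumed to supply a buffer large enough for the result.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * format_tagged is a hypothetical helper: it works like sprintf, but
 * prepends "tag: " to the formatted text. It returns the total number
 * of characters stored, or a negative value on error.
 */
int
format_tagged(char *buf, const char *tag, const char *format, ...)
{
    va_list ap;
    int n, m;

    /* Store the prefix first and remember how long it is. */
    n = sprintf(buf, "%s: ", tag);
    if (n < 0)
        return n;

    /* Hand the caller's remaining arguments to vsprintf. */
    va_start(ap, format);
    m = vsprintf(buf + n, format, ap);
    va_end(ap);

    return (m < 0) ? m : n + m;
}
```

For example, format_tagged(buf, "cost", "cannot open %s (%d)", "prices", 2) should leave "cost: cannot open prices (2)" in buf.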
The scanf Functions
The scanf functions allow data in almost any format to be read:
    #include <stdio.h>
    int scanf(const char *format, ...);
    int fscanf(FILE *stream, const char *format, ...);
    int sscanf(const char *s, const char *format, ...);
All three functions read characters, interpret them according to the instructions contained in the format string, and store the results in their arguments. The scanf function reads from the standard input, the fscanf function reads from the referenced stream, and the sscanf function reads its input from the array of characters pointed to by s. The number of arguments passed to each of these functions can vary, because the contents of the format string specify unambiguously how many arguments there are. Each function returns the number of input items successfully matched and assigned. This number can be smaller than the number of conversions requested, or even zero, if the input stops matching the format string. If end-of-file is encountered before the first matching failure or conversion is performed, the constant EOF is returned.
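Because the return value is the only indication of how much of the input was accepted, it is worth checking on every call. The following sketch uses a hypothetical helper, parse_point, that expects input such as "3,4".

```c
#include <stdio.h>

/*
 * parse_point is a hypothetical helper. It returns the raw sscanf
 * result so that the caller can tell the cases apart:
 *   2    both numbers were matched and assigned
 *   1    only the first number was matched
 *   0    the input did not start with a number
 *   EOF  the input ran out before any conversion took place
 */
int
parse_point(const char *s, int *x, int *y)
{
    return sscanf(s, "%d,%d", x, y);
}
```

A caller would normally treat any result other than 2 as an error and leave x and y unused in that case.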
The format string can contain three types of characters:
 Whitespace characters (spaces, tabs, newlines, and form feeds) that, except in two cases described below, cause input to be read up to the next non-whitespace character
 An ordinary character (not %) that must match the next input character
 Conversion specifications
A conversion specification, in its simplest form, is a percent sign (%) followed by a single character that indicates the type of conversion to be performed. For each conversion specification, another argument is passed to the scanf function following format. Each of these arguments must be a pointer to the variable that is to receive the converted value. The arguments are passed in the same order in which their conversion specifications appear.
There are three basic data types that can be specified in a conversion specification: integers, floating-point numbers, and characters and character strings.
Integers
The conversion specifications for integers are as follows:
%d
Matches an optionally signed decimal integer. The corresponding argument should be a pointer to a variable of type int.
%i
Matches an optionally signed integer, whose format is interpreted in the same fashion as strtol with a base argument of 0 (strtol is described in Chapter 2, Utility Routines). That is, numbers starting with 0 are taken to be octal, numbers starting with 0x or 0X are taken to be hexadecimal, and all others are taken to be decimal. The corresponding argument should be a pointer to a variable of type int. The %i specification is specific to ANSI C.
%o
Matches an optionally signed octal integer. The corresponding argument should be a pointer to a variable of type unsigned int.
%u
Matches an optionally signed decimal integer. The corresponding argument should be a pointer to a variable of type unsigned int.
%x
Matches an optionally signed hexadecimal integer. The corresponding argument should be a pointer to a variable of type unsigned int.
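The difference between %d and %i can be sketched as follows. The helper name parse_any_base is an assumption made for illustration.

```c
#include <stdio.h>

/*
 * parse_any_base is a hypothetical helper. With %i, a leading 0 means
 * octal and a leading 0x or 0X means hexadecimal; anything else is
 * taken to be decimal. Returns 1 if a number was stored in *value.
 */
int
parse_any_base(const char *s, int *value)
{
    return sscanf(s, "%i", value);
}
```

The strings "42", "052", and "0x2A" should all store the value 42. With %d, by contrast, "052" would be read as the decimal number 52.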
Example 4-8 shows an example of how to use the %d specification. It reads in lines telling how many quarters, dimes, and nickels you have and prints out the total amount of money.
Example 4-8:  scanf-int
#include <stdio.h>
#include <stdlib.h>     /* for exit() */
int
main(int argc, char **argv)
{
    double total;
    int n, quarters, dimes, nickels;
    for (;;) {
        printf("Enter a line like:\n");
        printf("%%d quarters, %%d dimes, %%d nickels\n");
        printf("--> ");
        n = scanf("%d quarters, %d dimes, %d nickels", &quarters, &dimes,
                  &nickels);
        if (n != 3)
            exit(0);
        total = quarters * 0.25 + dimes * 0.10 + nickels * 0.05;
        printf("You have: $ %.2f\n\n", total);
    }
}
    % scanf-int
    Enter a line like:
    %d quarters, %d dimes, %d nickels
    --> 3 quarters, 2 dimes, 1 nickels
    You have: $ 1.00
    Enter a line like:
    %d quarters, %d dimes, %d nickels
    --> 6 quarters, 0 dimes, 2 nickels
    You have: $ 1.60
    Enter a line like:
    %d quarters, %d dimes, %d nickels
    --> 0 quarters, 2 dimes, 9 nickels
    You have: $ 0.65
    Enter a line like:
    %d quarters, %d dimes, %d nickels
    --> ^D
You can use an optional h to indicate that the argument corresponding to one of the above conversions is a pointer to a short int (such as %hd) or unsigned short int (such as %hu). Likewise, you can use an optional l character to indicate a long int or unsigned long int.
Floating-point numbers
The conversion specifications for floating-point numbers are as follows:
%e or %f or %g
Matches an optionally signed floating-point number, in any of the formats produced by the corresponding printf output conversions. The corresponding argument should be a pointer to a variable of type float.
An optional l character may be used to indicate that the argument corresponding to the above conversions is a pointer to type double (e.g., %lf). Likewise, an optional L may be used to indicate a pointer to type long double.
This brings up an important difference between printf and scanf. Since all floating-point arguments to printf are passed by value, it doesn't matter whether they are of type float or type double—either way, C's argument-type promotion rules make them all doubles inside printf. However, because scanf's arguments are all passed by reference (that is, pointers are used), the type promotion rules do not apply. You must specifically tell scanf whether you're giving it a pointer to an argument of type float or an argument of type double. This is a common source of problems that you should be careful to avoid.
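A minimal sketch of the distinction follows; the helper name parse_reals is hypothetical.

```c
#include <stdio.h>

/*
 * parse_reals is a hypothetical helper that reads two numbers
 * separated by whitespace. Returns the number of values stored.
 */
int
parse_reals(const char *s, float *f, double *d)
{
    /*
     * %f stores through a float *, and %lf stores through a double *.
     * Swapping them would make scanf write the wrong number of bytes.
     */
    return sscanf(s, "%f %lf", f, d);
}
```

Given the input "1.5 2.25", the function should return 2 and store 1.5 in the float and 2.25 in the double.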
Characters and character strings
The conversion specifications for characters and character strings are as follows:
%c
Matches a sequence of characters of the number specified by the field width (see below). If no field width is specified, matches one character. The corresponding argument should be a pointer of type char * that points to an array large enough to accept the sequence. No terminating null character is added. The normal skip over whitespace is suppressed during this conversion.
%s
A character string is expected. The corresponding argument should be a pointer of type char * and should point to an array of characters large enough to hold the string and a terminating null character. The input field is terminated by a whitespace character.
%[scanlist]
Matches a nonempty sequence of characters from a set of expected characters called the scanset. The corresponding argument should be a pointer of type char * and should point to an array of characters large enough to accept the sequence and a terminating null character. The characters between the brackets, called the scanlist, comprise the scanset unless the first character after the left bracket is a circumflex (^), in which case the scanset comprises all the characters that do not appear in the scanlist. A right bracket in the scanlist must immediately follow the left bracket or the circumflex. You can specify a range of characters by separating the first and last characters in the range with a hyphen; for example, %[0-9] matches a string of digits. A hyphen character in the scanlist should be either the first or last character in the list.
%%
This specification allows a percent sign to be matched in the input; no argument assignment is performed.
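Scansets are handy for splitting input at a delimiter other than whitespace. The sketch below uses a hypothetical helper, split_setting, for lines of the form name=value. The numbers in the format are maximum field widths chosen to fit the assumed buffer sizes.

```c
#include <stdio.h>

/*
 * split_setting is a hypothetical helper. key must have room for at
 * least 16 characters and value for at least 32, including the
 * terminating null characters. Returns the number of fields stored.
 */
int
split_setting(const char *line, char *key, char *value)
{
    /*
     * %15[^=]   up to 15 characters that are not '='
     * =         a literal '=' that must appear in the input
     * %31[^\n]  up to 31 characters that are not a newline
     */
    return sscanf(line, "%15[^=]=%31[^\n]", key, value);
}
```

For the line "TERM=vt100\n", key should receive "TERM" and value should receive "vt100". Because a scanset must match at least one character, a line that begins with an equal sign yields a return value of 0.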
Field widths
As with printf, you can use a field width to tell scanf how wide an expected field should be. This is particularly useful with the %c conversion, which can be told how many characters to read in.
Note, however, that field widths used with the %s conversion do not work quite as you might expect. Many programmers expect %12s to read in the next 12 characters of the input regardless of what they are. However, this is not the case: %s skips leading whitespace and then stops at the first whitespace character it encounters, so the field width acts only as an upper limit on the number of characters stored. To read exactly 12 characters, blanks included, use %12c instead. Don't forget that %c does not add a terminating null character.
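Reading a fixed-width field with %c can be sketched as follows. The helper name read_fixed12 is an assumption, and the input is assumed to contain at least 12 characters.

```c
#include <stdio.h>

/*
 * read_fixed12 is a hypothetical helper: it copies the first 12
 * characters of s, blanks included, into field, which must have room
 * for at least 13 characters. Returns the sscanf result.
 */
int
read_fixed12(const char *s, char *field)
{
    int n;

    n = sscanf(s, "%12c", field);
    if (n == 1)
        field[12] = '\0';   /* %c does not add the terminating null */
    return n;
}
```

Given "Hello, World and more", field should end up holding "Hello, World", embedded blank and all.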
Instead of a field width, you can use an asterisk character (*). However, unlike the asterisk in printf (which indicates that the field width should be obtained from a parameter) this asterisk indicates that the field it is attached to should be skipped over in the input rather than assigned to a variable.
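The following sketch shows assignment suppression at work. The helper name parse_width_setting and the input format are illustrative assumptions.

```c
#include <stdio.h>

/*
 * parse_width_setting is a hypothetical helper for input such as
 * "columns 80". Returns 1 if the number was stored in *width.
 */
int
parse_width_setting(const char *line, int *width)
{
    /*
     * %*s matches a word but stores nothing, so no argument is passed
     * for it and it is not counted in the return value.
     */
    return sscanf(line, "%*s %d", width);
}
```

For the input "columns 80", the function should return 1 and store 80 in width.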
Porting Notes
The printf and scanf functions are generally pretty standard across all platforms, provided that you stick to the conversions described in this chapter. The only exception to this is the %i conversion, which is specific to ANSI C. There are a number of other conversion specifications and modifiers that are much less widespread; indeed, the ANSI C standard introduced a number of them itself. These are described in the manual pages for your specific version of UNIX and will not be used in this book. Although they are fine for local programs, those other conversions and modifiers should not be used if portability is an issue.
