
Chapter 4 - The Standard I/O Library

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Character-Based Input and Output
The simplest way to perform I/O is to treat a file as an unformatted stream of bytes. The simplest way to process a stream of bytes is one byte at a time. The Standard I/O Library provides several functions to do this:
    #include <stdio.h>
    int fgetc(FILE *stream);
    int getc(FILE *stream);
    int getchar(void);
    int fputc(int c, FILE *stream);
    int putc(int c, FILE *stream);
    int putchar(int c);
The getc function returns the next character (byte) from the file referenced by stream. If there are no more characters to read (end-of-file is reached) or if an error occurs, getc returns the constant EOF.
The putc function converts c to an unsigned char and writes it to stream. If it succeeds, putc returns the character written; otherwise, it returns the constant EOF.
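The return values described above can be checked in a simple copy loop. The following is a minimal sketch, not code from the library: copy_stream is a hypothetical helper name, and it uses ferror (a standard library function not otherwise covered in this section) to tell a read error apart from end-of-file, since getc returns EOF in both cases.

```c
#include <stdio.h>

/*
 * copy_stream is a hypothetical helper, not part of the library.
 * It copies every byte from in to out and returns the number of
 * bytes copied, or -1 if a read or write error occurs.
 */
long
copy_stream(FILE *in, FILE *out)
{
    int c;          /* int, not char, so EOF stays distinguishable */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        /* putc returns EOF if the character could not be written. */
        if (putc(c, out) == EOF)
            return -1;
        n++;
    }

    /*
     * getc returns EOF for both end-of-file and errors;
     * ferror reports whether an error actually occurred.
     */
    return ferror(in) ? -1 : n;
}
```

A caller might write copy_stream(stdin, stdout) to copy the standard input to the standard output and learn how many bytes were transferred.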
The getchar and putchar functions are actually just macros, defined as:
    #define getchar()     getc(stdin)
    #define putchar(c)    putc(c, stdout)
These are often used as shorthand in programs that read from the standard input, write to the standard output, or both.
The fgetc and fputc functions behave exactly like getc and putc. The difference is that getc and putc are usually implemented as preprocessor macros, while fgetc and fputc are implemented as genuine C-language functions. This means that fgetc and fputc run more slowly than getc and putc (because of the overhead incurred when making a function call), but they take up less space in the executable code because they are not expanded in-line as macros are. Because getc and putc may be macros, they may also evaluate their stream argument more than once, so that argument should never be an expression with side effects. The other advantage of fgetc and fputc is that because they are functions, they can be passed as arguments to other functions.
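As an illustration of passing one of these functions as an argument, here is a minimal sketch. The names reader_fn and count_bytes are hypothetical and exist only for this example; the sketch assumes nothing beyond the fgetc prototype shown earlier.

```c
#include <stdio.h>

/* Pointer to any function with the same shape as fgetc. */
typedef int (*reader_fn)(FILE *);

/*
 * count_bytes is a hypothetical helper. It reads stream until the
 * supplied reader function returns EOF, and returns the number of
 * bytes that were read.
 */
long
count_bytes(FILE *stream, reader_fn next)
{
    long n = 0;

    while (next(stream) != EOF)
        n++;
    return n;
}
```

A call such as count_bytes(fp, fgetc) hands the function itself to count_bytes, which is only possible because fgetc is a real function with an address.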
All of these functions use variables of type int to hold byte values, rather than type char. This is necessary to allow the functions to return the constant EOF, which is usually defined as -1. If the char type were used instead of int, then on systems where char is signed, reading a byte with decimal value 255 could erroneously cause a program to think end-of-file had been reached: the byte is stored as the char value -1, which is sign-extended to the int value -1 during comparisons. On systems where char is unsigned, the opposite problem occurs: the stored value can never compare equal to EOF, so end-of-file is never detected. For this reason, it is important to always use variables of type int when working with these functions.
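The problem can be demonstrated with two nearly identical loops. This is a sketch under stated assumptions: count_with_char and count_with_int are hypothetical names, and the early stop in the first function assumes a typical system where EOF is -1 and converting 255 to signed char yields -1.

```c
#include <stdio.h>

/*
 * Hypothetical helper showing the bug: each result of getc is
 * squeezed into a signed char before being compared with EOF.
 * Assumes EOF is -1 and that 255 converts to the char value -1.
 */
long
count_with_char(FILE *stream)
{
    signed char c;      /* WRONG: cannot hold EOF as a distinct value */
    long n = 0;

    while ((c = (signed char)getc(stream)) != EOF)
        n++;
    return n;
}

/*
 * Hypothetical helper showing the correct form: the result of getc
 * is kept in an int, so byte 255 and EOF remain different values.
 */
long
count_with_int(FILE *stream)
{
    int c;
    long n = 0;

    while ((c = getc(stream)) != EOF)
        n++;
    return n;
}
```

Given a file containing the three bytes 'a', 255, and 'b', the int version counts all three, while under the assumptions above the char version stops at the byte with value 255.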
Example 4-1 shows another version of the append program introduced in Chapter 3, Low-Level I/O Routines. The program takes two filenames as arguments. It opens the first file for reading and the second file for appending, and then appends the contents of the first file to the second file.
Example 4-1:  append-char
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
    int c;
    FILE *in, *out;
    if (argc != 3) {
        fprintf(stderr, "Usage: append-char file1 file2\n");
        exit(1);
    }
    /*
     * Open the first file for reading.
     */
    if ((in = fopen(argv[1], "r")) == NULL) {
        perror(argv[1]);
        exit(1);
    }
    /*
     * Open the second file for appending.
     */
    if ((out = fopen(argv[2], "a")) == NULL) {
        perror(argv[2]);
        exit(1);
    }
    /*
     * Copy data from the first file to the second, a character
     * at a time.
     */
    while ((c = getc(in)) != EOF)
        putc(c, out);
    fclose(out);
    fclose(in);
    exit(0);
}
    % cat a
    file a line one
    file a line two
    file a line three
    % cat b
    file b line one
    file b line two
    file b line three
    % append-char a b
    % cat b
    file b line one
    file b line two
    file b line three
    file a line one
    file a line two
    file a line three
The internal buffering provided by the Standard I/O Library means that, even though this example reads and writes one character at a time, the data is actually transferred to disk in large chunks. This is very important: it allows a program to process files one byte at a time while preserving the efficiency of reading and writing large buffers of data. If the program in the example above were converted to use the low-level I/O routines described in the previous chapter, still transferring one byte at a time, every byte would cost a system call, and the program would become too inefficient to use on all but the smallest input files.
The buffering features provided by the Standard I/O Library allow the library to provide another interesting function, ungetc:
    #include <stdio.h>
    int ungetc(int c, FILE *stream);
This function is quite literally the reverse of getc, causing the character c to be placed back onto the input stream referenced by stream. The next call to getc returns the character contained in c.
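One common use of this behavior is to look at the next character without consuming it. The following is a minimal sketch; peek_char is a hypothetical helper name, not a library function.

```c
#include <stdio.h>

/*
 * peek_char is a hypothetical helper. It returns the next character
 * on stream without consuming it, or EOF if there is nothing more
 * to read.
 */
int
peek_char(FILE *stream)
{
    int c = getc(stream);

    /* Put the character back so the next getc returns it again. */
    if (c != EOF)
        ungetc(c, stream);
    return c;
}
```

Calling peek_char twice in a row returns the same character both times, and a following getc returns it once more.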
This function is often used in programs that read from a file until they encounter a special character. When a program reads the special character, the collection of input is stopped for the current token. The special character is placed back onto the input with ungetc so that another part of the program can deal with it later. For example, consider a program that reads lists of words separated by colon (:) characters:
    while ((c = getc(fp)) != EOF) {
        if (c == ':') {
            word[nchars] = '\0';
            ungetc(c, fp);
            return;
        }
        word[nchars++] = c;
    }
As each character is read, it is checked to see if it is the colon character. If it is not, it is appended to the current word. If the colon character is read, the word is terminated, the colon is placed back on the input stream, and the subroutine returns. The next character read from the input stream is the colon character again.
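The fragment above leaves out its surrounding subroutine. One way to complete it is sketched below. The function name read_word, its parameters, and the choice to silently drop characters that do not fit in the buffer are all assumptions made for this example, not part of the original program.

```c
#include <stdio.h>

/*
 * read_word is a hypothetical completion of the fragment above.
 * It reads characters from fp into word (a buffer of size bytes)
 * until a colon or end-of-file is seen. A colon is pushed back
 * onto the stream for the caller to handle. Characters that do
 * not fit in the buffer are dropped. Returns the number of
 * characters stored, not counting the terminating null byte.
 */
size_t
read_word(FILE *fp, char *word, size_t size)
{
    size_t nchars = 0;
    int c;

    if (size == 0)
        return 0;

    while ((c = getc(fp)) != EOF) {
        if (c == ':') {
            ungetc(c, fp);      /* leave the colon for the caller */
            break;
        }
        if (nchars + 1 < size)  /* keep room for the null byte */
            word[nchars++] = (char)c;
    }

    word[nchars] = '\0';
    return nchars;
}
```

After read_word returns, the caller's next getc on the same stream yields the colon (or EOF), which it can use to decide whether another word follows.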
There is actually no requirement that the character passed to ungetc be the same character that was just read from the stream; in reality, any character can be placed onto the input. However, the library only guarantees that up to four characters can be pushed back on the input stream, and the ANSI C standard guarantees only one, so portable programs should not rely on more than a single character of pushback. It is not possible, for example, to “unread” an entire file.
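Pushing back a character other than the one just read can be used to substitute input for a later part of the program. The sketch below is purely illustrative: normalize_delimiter is a hypothetical helper, and the idea of treating a semicolon as an alternate delimiter is an assumption made for this example. It pushes back only one character, which stays within the guaranteed limit.

```c
#include <stdio.h>

/*
 * normalize_delimiter is a hypothetical helper. It reads one
 * character from fp; if that character is a semicolon, a colon is
 * pushed back in its place, otherwise the character itself is
 * pushed back. Returns the character now waiting on the stream,
 * or EOF at end-of-file or if the pushback fails.
 */
int
normalize_delimiter(FILE *fp)
{
    int c = getc(fp);

    if (c == EOF)
        return EOF;
    if (c == ';')
        c = ':';

    /* ungetc returns the character pushed back, or EOF on failure. */
    return ungetc(c, fp);
}
```

Code that runs afterward and reads from the same stream sees a colon wherever the file actually contained a semicolon.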
