
Chapter 5 - Files and Directories

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Obtaining File Attributes
Something that systems-level programs need to do quite often is obtain information about files. For example, it's important to make sure that files are owned by the right user, that they have the right permission bits, and so forth. This is discussed further in the section on writing set-user-id programs in Chapter 8, Users and Groups.
Getting Information from an I-node
As mentioned earlier, all of the information about a file, except its name, is contained in an on-disk structure called an i-node. You can use three system calls to obtain this information:
    #include <sys/types.h>
    #include <sys/stat.h>
    int stat(const char *path, struct stat *st);
    int lstat(const char *path, struct stat *st);
    int fstat(int fd, struct stat *st);
The stat function is the most commonly used of the three. It obtains the information about the file whose name is given by path and places the data into the variable pointed to by st, which should be of type struct stat. The lstat function is identical to stat, except when the last component of the pathname is a symbolic link. In that case, stat returns information about the file to which the link points, while lstat returns information about the link itself. The fstat variant, rather than taking the name of a file, takes a file descriptor to an open file and returns information about that file.
In all cases, the file being asked about does not need any special permissions; that is, it is possible to obtain information about an unreadable or unwritable file. However, the file must be accessible to the calling program: all directories along the pathname contained in path must have the appropriate search permissions set. This is discussed in more detail later in this chapter. If stat, lstat, or fstat succeeds, it returns zero. If an error occurs, it returns -1 and places an error code describing the reason for failure in the external variable errno.
The struct stat data type is declared in the include file sys/stat.h. The file sys/types.h must also be included, to get the definitions of a number of basic operating system data types. The structure includes at least the following members:
    struct stat {
        dev_t     st_dev;
        ino_t     st_ino;
        mode_t    st_mode;
        nlink_t   st_nlink;
        uid_t     st_uid;
        gid_t     st_gid;
        dev_t     st_rdev;
        off_t     st_size;
        time_t    st_atime;
        time_t    st_mtime;
        time_t    st_ctime;
        long      st_blksize;
        long      st_blocks;
    };
The elements of the structure are interpreted as:
st_dev
The major and minor device numbers of the device on which the i-node associated with this file (and therefore the file itself) is stored. You can extract the major and minor device numbers from this field by using the major and minor macros, which are defined in sys/mkdev.h in Solaris 2.x and IRIX 5.x, and in sys/sysmacros.h in HP-UX 10.x.
st_ino
The i-node number of the file. The root directory of a filesystem always has i-node number 2, and the special directory lost+found in each filesystem always has i-node number 3. For historical reasons, i-node number 1 is never used. All other files in the filesystem have i-node numbers greater than 3; they are usually allocated in a lowest-available-number fashion.
st_mode
A set of bits encoding the file's type and access permissions; see the explanation following this list for how to interpret this data.
st_nlink
The number of links (filenames) associated with the file. A just-created file has the value 1 in this field, and the field is incremented by 1 for every hard link made to it. Symbolic links to the file are not counted here (nor anywhere else).
st_uid
The user ID of the user owning the file.
st_gid
The group ID of the group owning the file.
st_rdev
If the file is a character-special or block-special device file, this field contains the major and minor device numbers of the file (as opposed to st_dev, which contains the major and minor device numbers of the device on which the file is stored). If the file is not a character-special or block-special device file, the contents of this field are meaningless.
st_size
The size of the file, in bytes.
st_atime
The last time the file was accessed for reading, or, in the case of an executable program, the last time the file was executed, stored in UNIX time format (see Chapter 7, Time of Day Operations).
st_mtime
The last time the file was modified (written).
st_ctime
The last time the i-node was changed. This time is updated whenever the file's owner, group, or permission bits are changed. It is also updated whenever the file's modification time is changed, but not when the file's access time is changed. Note that, contrary to popular belief (and contrary to many UNIX programming books), this field does not represent the time the file was created. File creation time is not recorded anywhere in the filesystem.
st_blksize
A hint to programs about the best buffer size to use for I/O operations on this file. Generally speaking, it is most efficient to perform I/O with the same block size that is used by the filesystem itself (that way, the filesystem does not have to copy data between multiple buffers); this field allows programs that care to obtain this information. This field is undefined for character- and block-special device files.
st_blocks
The total number of physical blocks, 512 bytes each, actually allocated on the disk for this file. Note that this number can be much smaller than (st_size / 512) if there are holes in the file.
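The relationship between st_size and st_blocks can be used as a rough test for holes. The sketch below assumes the 512-byte unit described for st_blocks; the function name hasHoles is hypothetical, and the test is only a heuristic, since a filesystem may allocate blocks lazily or round allocations up.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * hasHoles - hypothetical helper (not part of any library).
 * Returns non-zero if fewer bytes are allocated on disk than the
 * file's logical size, which suggests the file contains holes.
 */
int
hasHoles(const struct stat *st)
{
    /* st_blocks is counted in 512-byte units (assumed, see text). */
    long long allocated = (long long) st->st_blocks * 512;

    return(allocated < (long long) st->st_size);
}
```

A file of 3571 bytes occupying 8 blocks has 4096 bytes allocated, so it is not reported as having holes; a one-megabyte file occupying only 16 blocks would be.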
The st_mode field mentioned previously is important, because it encodes both the file's type and its permission bits. These can be extracted using a number of constants defined in sys/stat.h:
S_IFMT
This constant extracts the file type bits from the st_mode word; st_mode should be ANDed with this constant and the result compared against the following constants:
S_IFREG
Regular file
S_IFDIR
Directory
S_IFCHR
Character-special device file
S_IFBLK
Block-special device file
S_IFLNK
Symbolic link
S_IFIFO
FIFO file
S_IFSOCK
UNIX-domain socket
Newer, POSIX-compliant systems also define a set of macros that you can use to determine file type:
S_ISREG(st_mode)
If true, the file is a regular file.
S_ISDIR(st_mode)
If true, the file is a directory.
S_ISCHR(st_mode)
If true, the file is a character-special device file.
S_ISBLK(st_mode)
If true, the file is a block-special device file.
S_ISLNK(st_mode)
If true, the file is a symbolic link.
S_ISFIFO(st_mode)
If true, the file is a FIFO file.
S_ISSOCK(st_mode)
If true, the file is a UNIX-domain socket.
S_ISUID
If the result of ANDing this constant with st_mode is non-zero, the file has the set-user-id-on-execution bit set (see below).
S_ISGID
If the result of ANDing this constant with st_mode is non-zero, the file has the set-group-id-on-execution bit set (see below).
S_ISVTX
If the result of ANDing this constant with st_mode is non-zero, the file has the “sticky bit” set (see below).
S_IREAD
By ANDing this constant with st_mode, you can determine if the owner of the file has read permission. By right-shifting the constant three places (or left-shifting st_mode three places) and ANDing the two, you can determine if the group owner of the file has read permission. And by right-shifting the constant six places (or left-shifting st_mode six places) and ANDing, you can determine if the rest of the world has read permission. Newer, POSIX-compliant systems define three constants that you can use in place of shifting:
S_IRUSR
If the result of ANDing this constant with st_mode is non-zero, the owner has read permission for the file.
S_IRGRP
If the result of ANDing this constant with st_mode is non-zero, the group owner has read permission for the file.
S_IROTH
If the result of ANDing this constant with st_mode is non-zero, the world (everyone except the owner and group owner) has read permission for the file.
S_IWRITE
By ANDing this constant with st_mode, you can determine if the owner of the file has write permission. By right-shifting the constant three places (or left-shifting st_mode three places) and ANDing the two, you can determine if the group owner of the file has write permission. By right-shifting the constant six places (or left-shifting st_mode six places) and ANDing, you can determine if the rest of the world has write permission. Newer, POSIX-compliant systems define three constants that you can use in place of shifting:
S_IWUSR
If the result of ANDing this constant with st_mode is non-zero, the owner has write permission for the file.
S_IWGRP
If the result of ANDing this constant with st_mode is non-zero, the group owner has write permission for the file.
S_IWOTH
If the result of ANDing this constant with st_mode is non-zero, the world (everyone except the owner and group owner) has write permission for the file.
S_IEXEC
By ANDing this constant with st_mode, you can determine if the owner of the file has execute permission. By right-shifting the constant three places (or left-shifting st_mode three places) and ANDing the two, you can determine if the group owner of the file has execute permission. By right-shifting the constant six places (or left-shifting st_mode six places) and ANDing, you can determine if the rest of the world has execute permission. Newer, POSIX-compliant systems define three constants that you can use in place of shifting:
S_IXUSR
If the result of ANDing this constant with st_mode is non-zero, the owner has execute permission for the file.
S_IXGRP
If the result of ANDing this constant with st_mode is non-zero, the group owner has execute permission for the file.
S_IXOTH
If the result of ANDing this constant with st_mode is non-zero, the world (everyone except the owner and group owner) has execute permission for the file.
Note that the concept of execute permission only makes sense for files. For directories, this bit implies permission to search the directory. You cannot access a file unless the search (execute) bit is set on the directory that contains it. Note also that read permission on a directory only lets you obtain the contents of the directory; it does not let you access them. A file can be accessible even though its parent directory is not readable; likewise, a file can be visible but inaccessible if its parent directory is not searchable.
All of these constants can seem pretty overwhelming, and by now you're probably a little confused about just what it is you're supposed to do with them. To clarify the material presented in this section, Example 5-1 shows a program that uses lstat to obtain and print information about each file named on the command line. This example shows the “old-fashioned way,” rather than using the POSIX-defined constants described previously. The POSIX constants, while more convenient, are not portable to older systems, and any code that you are porting to SVR4 is not likely to use them.
Example 5-1:  lstat
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <string.h>     /* strcpy */
#include <time.h>       /* ctime */
char    *typeOfFile(mode_t);
char    *permOfFile(mode_t);
void     outputStatInfo(char *, struct stat *);
int
main(int argc, char **argv)
{
    char *filename;
    struct stat st;
    /*
     * For each file on the command line...
     */
    while (--argc) {
        filename = *++argv;
        /*
         * Find out about it.
         */
        if (lstat(filename, &st) < 0) {
            perror(filename);
            putchar('\n');
            continue;
        }
        /*
         * Print out the information.
         */
        outputStatInfo(filename, &st);
        putchar('\n');
    }
    exit(0);
}
/*
* outputStatInfo - print out the contents of the stat structure.
*/
void
outputStatInfo(char *filename, struct stat *st)
{
    printf("File Name:          %s\n", filename);
    printf("File Type:          %s\n", typeOfFile(st->st_mode));
    /*
     * If the file is not a device, print its size and optimal
     * i/o unit; otherwise print its major and minor device
     * numbers.
     */
    if (((st->st_mode & S_IFMT) != S_IFCHR) &&
        ((st->st_mode & S_IFMT) != S_IFBLK)) {
        printf("File Size:          %ld bytes, %ld blocks\n",
               (long) st->st_size, (long) st->st_blocks);
        printf("Optimum I/O Unit:   %ld bytes\n", (long) st->st_blksize);
    }
    else {
        printf("Device Numbers:     Major: %u   Minor: %u\n",
               major(st->st_rdev), minor(st->st_rdev));
    }
    /*
     * Print the permission bits in both "ls" format and
     * octal.
     */
    printf("Permission Bits:    %s (%04o)\n", permOfFile(st->st_mode),
           (unsigned int) (st->st_mode & 07777));
    printf("Inode Number:       %lu\n", (unsigned long) st->st_ino);
    printf("Owner User-Id:      %ld\n", (long) st->st_uid);
    printf("Owner Group-Id:     %ld\n", (long) st->st_gid);
    printf("Link Count:         %ld\n", (long) st->st_nlink);
    /*
     * Print the major and minor device numbers of the
     * file system that contains the file.
     */
    printf("File System Device: Major: %u   Minor: %u\n",
           major(st->st_dev), minor(st->st_dev));
    /*
     * Print the access, modification, and change times.
     * The ctime() function converts the time to a human-
     * readable format; it is described in Chapter 7,
     * "Time of Day Operations."
     */
    printf("Last Access:        %s", ctime(&st->st_atime));
    printf("Last Modification:  %s", ctime(&st->st_mtime));
    printf("Last I-Node Change: %s", ctime(&st->st_ctime));
}
/*
* typeOfFile - return the English description of the file type.
*/
char *
typeOfFile(mode_t mode)
{
    switch (mode & S_IFMT) {
    case S_IFREG:
        return("regular file");
    case S_IFDIR:
        return("directory");
    case S_IFCHR:
        return("character-special device");
    case S_IFBLK:
        return("block-special device");
    case S_IFLNK:
        return("symbolic link");
    case S_IFIFO:
        return("FIFO");
    case S_IFSOCK:
        return("UNIX-domain socket");
    }
    return("???");
}
/*
* permOfFile - return the file permissions in an "ls"-like string.
*/
char *
permOfFile(mode_t mode)
{
    int i;
    char *p;
    static char perms[10];
    p = perms;
    strcpy(perms, "---------");
    /*
     * The permission bits are three sets of three
     * bits: user read/write/exec, group read/write/exec,
     * other read/write/exec.  We deal with each set
     * of three bits in one pass through the loop.
     */
    for (i=0; i < 3; i++) {
        if (mode & (S_IREAD >> i*3))
            *p = 'r';
        p++;
        if (mode & (S_IWRITE >> i*3))
            *p = 'w';
        p++;
        if (mode & (S_IEXEC >> i*3))
            *p = 'x';
        p++;
    }
    /*
     * Put special codes in for set-user-id, set-group-id,
     * and the sticky bit.  (This part is incomplete; "ls"
     * uses some other letters as well for cases such as
     * set-user-id bit without execute bit, and so forth.)
     */
    if ((mode & S_ISUID) != 0)
        perms[2] = 's';
    if ((mode & S_ISGID) != 0)
        perms[5] = 's';
    if ((mode & S_ISVTX) != 0)
        perms[8] = 't';
    return(perms);
}
    % lstat lstat.c
    File Name:          lstat.c
    File Type:          regular file
    File Size:          3571 bytes, 8 blocks
    Optimum I/O Unit:   8192 bytes
    Permission Bits:    rw-r----- (0640)
    Inode Number:       21558
    Owner User-Id:      40
    Owner Group-Id:     1
    Link Count:         1
    File System Device: Major: 32   Minor: 31
    Last Access:        Sun Feb 13 13:54:18 1994
    Last Modification:  Sun Feb 13 13:54:15 1994
    Last I-Node Change: Sun Feb 13 13:54:15 1994
The results that you get from running lstat on your version of lstat.c can vary a little from the example; the i-node number, owner and group, filesystem device numbers, and of course the times can be different. You should experiment with running lstat on a number of different files on your system, to be sure you understand what it does.
Getting Information from a Symbolic Link
To find out what a symbolic link points to, use the readlink function:
    #include <unistd.h>
    int readlink(const char *path, void *buf, size_t bufsiz);
The contents of the symbolic link named by path are placed into the buffer buf, whose size is given by bufsiz. The contents are not null-terminated when they are returned. If readlink succeeds, the number of bytes placed in buf is returned; otherwise, -1 is returned and an error code is placed in the external variable errno.
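Because readlink does not null-terminate its result, callers that want a C string have to add the terminator themselves. The wrapper below is a minimal sketch of one way to do that; readLinkString is a hypothetical name, not a library function.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * readLinkString - hypothetical wrapper around readlink().
 * Places the contents of the symbolic link path into buf as a
 * null-terminated string and returns its length, or -1 on error.
 */
int
readLinkString(const char *path, char *buf, size_t bufsiz)
{
    int n;

    if (bufsiz == 0)
        return(-1);

    /* Reserve one byte for the terminator we add ourselves. */
    n = readlink(path, buf, bufsiz - 1);

    if (n < 0)
        return(-1);

    buf[n] = '\0';
    return(n);
}
```

Note that readlink silently truncates contents that do not fit, so a return value equal to bufsiz - 1 may mean the buffer was too small.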
Sometimes, it is desirable to convert a pathname that may contain symbolic links into one that is known not to contain any symbolic links. One good reason for wanting to do this is that because symbolic links can cross filesystems, the concept of the parent directory is a bit confusing. For example, on Solaris 2.x systems, /bin is a symbolic link to /usr/bin. Try executing the following commands:
    % cd /bin
    % cd ..
    % pwd
The parent directory of /bin is /, so you would expect the output from pwd to be /. However, because /bin is actually a symbolic link to /usr/bin, the parent directory is actually /usr, which is what pwd tells you.
To obtain a path that contains no symbolic links from one that may or may not contain symbolic links, SVR4 provides a function called realpath:
    #include <stdlib.h>
    char *realpath(const char *filename, char *resolvedname);
If no error occurs while processing the pathname in filename, the “real” path is placed in resolvedname and a pointer to it is returned. If an error occurs, the constant NULL is returned, and resolvedname contains the name of the pathname component that produced the error.
The realpath function is not available in HP-UX 10.x.
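A call to realpath might look like the following sketch. It assumes the caller supplies a buffer of at least PATH_MAX bytes for the resolved name; the fallback value for PATH_MAX and the helper name printRealPath are assumptions made for illustration.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

#ifndef PATH_MAX
#define PATH_MAX 4096   /* assumed fallback if limits.h lacks it */
#endif

/*
 * printRealPath - hypothetical helper (not part of any library).
 * Prints filename together with its symlink-free equivalent.
 * Returns 0 on success and -1 if realpath() fails.
 */
int
printRealPath(const char *filename)
{
    char resolved[PATH_MAX];

    if (realpath(filename, resolved) == NULL) {
        perror(filename);
        return(-1);
    }

    printf("%s -> %s\n", filename, resolved);
    return(0);
}
```

On a system where /bin is a symbolic link to /usr/bin, calling printRealPath("/bin") would be expected to report /usr/bin as the resolved path.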
Determining the Accessibility of a File
Determining the accessibility of a file can be a tricky proposition. Certainly, the stat function can tell you the permission bits on a file, but that is not the same thing as telling you whether a file can actually be read (or written, or executed) by a user. For example, consider a world-readable file (mode 0444, or r--r--r--) that is in a directory that is searchable only by its owner (mode 0700, or rwx------). The owner can read the file, but another user cannot. Even though the file has read permission for her, the directory that contains the file does not have search permission for her, so she cannot reach the file to open it. Thus, to properly test whether or not a file is accessible, you must check the complete path to the file from the root of the filesystem, one directory at a time. This requires some non-trivial programming to handle all the special cases.
Fortunately, the designers of UNIX foresaw this problem, and they created a function called access:
    #include <unistd.h>
    int access(const char *path, int amode);
The path parameter contains the pathname of the file whose access is to be checked, and amode contains some combination of the following constants, ORed together:
R_OK
Test for read permission.
W_OK
Test for write permission.
X_OK
Test for execute (search) permission.
F_OK
Test for existence of file.
If the user running the program has the access permissions in question, access returns zero. If the user does not have the proper access permissions, -1 is returned and errno is set to indicate the reason why. Note that access works properly even when called from a set-user-id or set-group-id program (see Chapter 8, Users and Groups), because it uses the real user ID and group ID to make its checks, not the effective user ID and group ID.
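As a short illustration, the constants can be combined in a single call. The following is a sketch only; canReadAndWrite is a hypothetical helper name.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * canReadAndWrite - hypothetical helper (not part of any library).
 * Returns 1 if the real user may both read and write path, and 0
 * otherwise.  A missing or unreachable file is reported on stderr.
 */
int
canReadAndWrite(const char *path)
{
    /* First see whether the file exists at all. */
    if (access(path, F_OK) < 0) {
        perror(path);
        return(0);
    }

    /* Then test both permissions at once by ORing the constants. */
    return(access(path, R_OK | W_OK) == 0);
}
```

Because access checks the whole path, this helper returns 0 for a file in a directory the user cannot search, regardless of the permission bits on the file itself.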
