Add Book to My BookshelfPurchase This Book Online

Appendix B - Accessing Filesystem Data Structures

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Reading Filesystem Data Structures
There are certain tasks for which it is preferable to access a filesystem by reading the disk directly, rather than going through the operating system kernel. The most common of these is filesystem backups. The principal reason for this type of access is speed; it is much faster to read the disk directly. It is also the only way to read a file that contains “holes” and obtain only the actual disk blocks in use.
Reading the disk directly, however, is complex. The program must understand the layout of the filesystem data structures on the disk, and must be able to interpret a number of “private” bits of information correctly. Because it bypasses all security mechanisms (file ownership and permissions bits), this operation is usually restricted to the superuser (by setting the ownership and permissions of the block and character special devices for the filesystem).
Two common on-disk filesystems have been developed over the years; the original filesystem as invented by Ken Thompson and Dennis Ritchie, and the Berkeley Fast File System, developed by Kirk McKusick, Bill Joy, Sam Leffler, and Robert Fabry. In SVR4, both filesystems are supported: the (slightly modified) original is called the “System V File System,” and the Fast File System is called the “UNIX File System.” Solaris 2.x supports only the Fast File System (“UNIX File System”); support for the “System V File System” has been removed. In this section we will only discuss the Fast File System, since that is by far the more popular of the two. The discussion applies for the most part to the older filesystem as well, although the details are different (generally, the older filesystem is somewhat simpler to implement, but it is also substantially less efficient).
 NoteSilicon Graphics uses their own filesystem format, the Extended File System (EFS). Although it is fairly similar to the UFS filesystem described in this seciton, there are some differences.
Disk Terminology
In order to understand how the filesystem is laid out on the disk, it is first necessary to understand a little bit about how a disk drive works.
A disk drive contains one or more platters, on which data is stored. Each platter is a circular piece of metal with a hole in the middle, much like a phonograph record or compact disc. The platter is coated with a substance that responds to magnetic fields, similar to the coating on a video tape. The platter(s) are mounted on a spindle, with gaps between them. Each platter has two surfaces on which data can be recorded, but the outer surfaces of the top and bottom platters are usually not used.
There is one read/write head for each platter surface in the disk drive. Usually, the heads are mounted to a common assembly so that they all move together, although this is not always the case. The heads move in and out from the edge to the center of the platters; there is no side-to-side motion. During a read/write operation, the heads are held stationary over a given section of the platters while the platters rotate at a high speed (several thousand revolutions per minute) underneath them.
The area on one side of a single platter that can be read or written without moving the head is called a track. Tracks are concentric circles, and each time a platter completes a full revolution, an entire track has passed under the read/write head. There may be anywhere from a few hundred to a few thousand tracks on each side of each platter. If each track is extended up and down to include the same track on all the other platters, this is called a cylinder. Thus, there are the same number of cylinders on the disk drive as there are tracks on a single platter. For a six-platter disk drive, there are ten tracks in each cylinder (remember, the outer surfaces of the top and bottom platters are not used).
Tracks are further subdivided into sectors. Each sector is 512 bytes in size, and is the smallest addressable unit on a disk drive. Thus, when a file that is fifteen bytes long is stored on the disk, it actually consumes 512 bytes of space. The term disk block (or just block) is often used as a synonym for sector, but this term is often ambiguous and should be avoided if possible.
Information is recorded on the tracks of a disk by writing data into one or more sectors. To perform this operation, the disk must be told the head number, track number, and sector number where the data is to be stored. When a write (or read) operation begins, the disk must first position the head assembly over the proper track. It then has to wait for the proper sector to arrive under the read/write head. Once this occurs, the data transfer can take place. There are, thus, three factors affecting the rate at which a disk can transfer data:
 seek time—the amount of time it takes to position the head assembly over the proper track
 latency time—the amount of time it takes for the right sector to arrive under the heads
 transfer rate—the speed at which the drive can transfer the data to or from the disk, given that the heads are in position
(Additional factors affecting the final transfer rate include the speed of the disk controller, the speed of the system's input/output bus, and the speed of the system's memory; but these are outside the control of the disk manufacturer.)
The Super Block
The super block is the most important part of a filesystem. It contains all of the information necessary to locate the other filesystem data structures on the disk. Without the super block to indicate where these data structures are located, the filesystem would be a meaningless collection of bits. Because the super block is so critical to the operation of the filesystem, it is replicated in several places on the disk when the filesystem is first created. Since the critical information in the super block does not change, it is not necessary to update these copies.
The super block structure is declared in the include file sys/fs/ufs_fs.h:
    struct  fs {
        struct fs *fs_link;         /* linked list of filesystems        */
        struct fs *fs_rlink;        /* used for incore super blocks      */
        daddr_t     fs_sblkno;      /* addr of super-block in filesys    */
        daddr_t     fs_cblkno;      /* offset of cyl-block in filesys    */
        daddr_t     fs_iblkno;      /* offset of inode-blocks in filesys */
        daddr_t     fs_dblkno;      /* offset of first data after CG     */
        long        fs_cgoffset;    /* cylinder group offset in cylinder */
        long        fs_cgmask;      /* used to calc mod FS_NTRAK         */
        time_t      fs_time;        /* last time written                 */
        long        fs_size;        /* number of blocks in FS            */
        long        fs_dsize;       /* number of data blocks in FS       */
        long        fs_ncg;         /* number of cylinder groups         */
        long        fs_bsize;       /* size of basic blocks in FS        */
        long        fs_fsize;       /* size of frag blocks in FS         */
        long        fs_frag;        /* number of frags in a block in FS  */
    /* these are configuration parameters */
        long        fs_minfree;     /* minimum percentage of free blocks */
        long        fs_rotdelay;    /* num of ms for optimal next block  */
        long        fs_rps;         /* disk revolutions per second       */
    /* these fields can be computed from the others */
        long        fs_bmask;       /* ``blkoff'' calc of blk offsets    */
        long        fs_fmask;       /* ``fragoff'' calc of frag offsets  */
        long        fs_bshift;      /* ``lblkno'' calc of logical blkno  */
        long        fs_fshift;      /* ``numfrags'' calc number of frags */
    /* these are configuration parameters */
        long        fs_maxcontig;   /* max number of contiguous blks     */
        long        fs_maxbpg;      /* max number of blks per cyl group  */
    /* these fields can be computed from the others */
        long        fs_fragshift;   /* block to frag shift               */
        long        fs_fsbtodb;     /* fsbtodb and dbtofsb shift constant*/
        long        fs_sbsize;      /* actual size of super block        */
        long        fs_csmask;      /* csum block offset                 */
        long        fs_csshift;     /* csum block number                 */
        long        fs_nindir;      /* value of NINDIR                   */
        long        fs_inopb;       /* value of INOPB                    */
        long        fs_nspf;        /* value of NSPF                     */
    /* yet another configuration parameter */
        long        fs_optim;       /* optimization preference, see below*/
    /* these fields are derived from the hardware */
        long        fs_npsect;      /* # sectors/track including spares  */
        long        fs_interleave;  /* hardware sector interleave        */
        long        fs_trackskew;   /* sector 0 skew, per track          */
    /*a unique id for this filesystem (currently unused and unmaintained)*/
    /* In 4.3 Tahoe this space is used by fs_headswitch and fs_trkseek   */
    /* Neither of those fields is used in the Tahoe code right now but   */
    /* there could be problems if they are.                              */
        long        fs_id[2];       /* filesystem id                    */
    /* sizes determined by number of cylinder groups and their sizes     */
        daddr_t     fs_csaddr;      /* blk addr of cyl grp summary area  */
        long        fs_cssize;      /* size of cyl grp summary area      */
        long        fs_cgsize;      /* cylinder group size               */
    /* these fields are derived from the hardware */
        long        fs_ntrak;       /* tracks per cylinder               */
        long        fs_nsect;       /* sectors per track                 */
        long        fs_spc;         /* sectors per cylinder              */
    /* this comes from the disk driver partitioning */
        long        fs_ncyl;        /* cylinders in filesystem          */
    /* these fields can be computed from the others */
        long        fs_cpg;         /* cylinders per group               */
        long        fs_ipg;        /* inodes per group                   */
        long        fs_fpg;         /* blocks per group * fs_frag        */
    /* this data must be re-computed after crashes */
        struct      csum fs_cstotal;/* cylinder summary information      */
    /* these fields are cleared at mount time */
        char        fs_fmod;        /* super block modified flag         */
        char        fs_clean;       /* filesystem state flag            */
        char        fs_ronly;       /* mounted read-only flag            */
        char        fs_flags;      /* currently unused flag              */
        char        fs_fsmnt[MAXMNTLEN];/* name mounted on               */
    /* these fields retain the current block allocation info             */
        long        fs_cgrotor;     /* last cg searched                  */
        struct csum *fs_csp[MAXCSBUFS];/* list of fs_cs info buffers     */
        long        fs_cpc;         /* cyl per cycle in postbl           */
        short       fs_opostbl[16][8]; /* old rotation block list head   */
        long        fs_sparecon[55];/* reserved for future constants     */
    #define fs_ntime fs_sparecon[54]/* INCORE only; time in nanoseconds  */
        long        fs_state;      /* filesystem state time stamp       */
        quad        fs_qbmask;      /* ~fs_bmask - for use with quad size*/
        quad        fs_qfmask;      /* ~fs_fmask - for use with quad size*/
        long        fs_postblformat;/* format of positional layout tables*/
        long        fs_nrpos;      /* number of rotaional positions      */
        long        fs_postbloff;   /* (short) rotation block list head  */
        long        fs_rotbloff;    /* (u_char) blocks for each rotation */
        long        fs_magic;       /* magic number                      */
        u_char      fs_space[1];    /* list of blocks for each rotation  */
    /* actually longer */
    };
Most of these fields are not of interest here; they are used by the kernel for implementing the filesystem, but have little meaning outside of that context. Some of the fields that are of interest are:
fs_bsize
The filesystem block size, in bytes.
The filesystem block size is some multiple of the disk sector size; it is more efficient to access the filesystem in larger units. The usual block size for a Fast File System is 8192 bytes (the old filesystem uses 512 or 1024 bytes). Since the maximum size of any individual file in the Fast File System is 231 bytes, this limits the minimum filesystem block size to 4096 bytes.
fs_fsize
The filesystem fragment block size.
The larger block sizes introduced in the Fast File System, although they make input and output more efficient, also waste more of the disk. For example, if the smallest available block size were 4096 bytes, a 1027-byte file would waste 3069 bytes on the disk. For this reason, the Fast File System allows a filesystem block to be divided into two, four, or eight fragments of equal size. A file will take up some number of full filesystem blocks, and then the last little bit of the file will be written into one or more fragments. The other fragments in the same block may be used by some other file. Thus, with a 4094-byte filesystem block size and a 1024-byte fragment size, a 5120-byte file would consume one filesystem block (4096 bytes) and one fragment (1024 bytes). The other three fragments could be used by other files.
fs_frag
The number of fragments in a filesystem block.
This is easily computed from the above two parameters, but is precomputed here for speed.
fs_size
The total number of blocks in the filesystem, in units of the fragment size.
This includes the blocks used to store other bookkeeping information as well as the blocks actually used for data storage.
fs_dsize
The number of filesystem data blocks in the filesystem that may be used for data storage, in units of the fragment size.
fs_ncg
The number of cylinder groups in the filesystem.
See the following sections for a discussion of cylinder groups.
fs_ipg
The number of i-nodes per cylinder group.
See the following sections for a discussion of cylinder groups. This number, when multiplied by fs_ncg (above), gives the maximum number of distinct files that may be stored in the filesystem.
fs_fsmnt
The name of the mount point on which this filesystem is currently mounted. If the filesystem is not currently mounted, this will contain the name of the last mount point on which it was mounted.
I-Nodes
As explained in Chapter 5, Files and Directories the i-node structure is used to store all of the important information about a file, such as its type, owner, group, mode, size, number of links, last access time, last modification time. As we shall see below, the i-node also contains the addresses of all the disk blocks used to store the contents of the file.
There is one i-node for each file in the filesystem. The i-nodes are allocated when the filesystem is created, which means that the number of files that can be created in the filesystem is static. If all the i-nodes are used up with very tiny files, it is possible to have a large quantity of free data blocks that simply cannot be used (because no more files can be created). However, it is much more common to run out of data blocks before running out of i-nodes.
There are two i-node structures; the one stored on the disk, and the one used in memory by the kernel. The in-memory one has some extra fields, used for bookkeeping purposes. The common part between the two structures is stored in a structure of type struct icommon; the on-disk i-node is called a struct dinode. These structures are defined in the include file sys/fs/ufs_inode.h:
    struct  icommon {
        o_mode_t ic_smode;   /*  0: mode and type of file                */
        short       ic_nlink; /*  2: number of links to file             */
        o_uid_t     ic_suid;  /*  4: owner's user id                     */
        o_gid_t     ic_sgid;  /*  6: owner's group id                    */
        quad        ic_size;  /*  8: number of bytes in file             */
    #ifdef _KERNEL
        struct timeval ic_atime;/* 16: time last accessed                */
        struct timeval ic_mtime;/* 24: time last modified                */
        struct timeval ic_ctime;/* 32: last time inode changed           */
    #else
        time_t      ic_atime; /* 16: time last accessed                  */
        long        ic_atspare;
        time_t      ic_mtime; /* 24: time last modified                  */
        long        ic_mtspare;
        time_t      ic_ctime; /* 32: last time inode changed             */
        long        ic_ctspare;
    #endif
        daddr_t     ic_db[NDADDR];/* 40: disk block addresses            */
        daddr_t     ic_ib[NIADDR];/* 88: indirect blocks                 */
        long        ic_flags;    /* 100: status, currently unused        */
        long        ic_blocks;   /* 104: blocks actually held            */
        long        ic_gen;      /* 108: generation number               */
        long        ic_mode_reserv;/* 112: reserved                      */
        uid_t       ic_uid;      /* 116: long EFT version of uid         */
        gid_t       ic_gid;      /* 120: long EFT version of gid         */
        ulong       ic_oeftflag; /* 124: reserved                        */
    };
    struct dinode {
        union {
            struct  icommon di_icom;
            char    di_size[128];
        } di_un;
    };
    #define di_ic           di_un.di_icom
    #define di_mode         di_ic.ic_smode
    #define di_nlink        di_ic.ic_nlink
    #define di_uid          di_ic.ic_uid
    #define di_gid          di_ic.ic_gid
    #define di_smode        di_ic.ic_smode
    #define di_suid         di_ic.ic_suid
    #define di_sgid         di_ic.ic_sgid
    #if defined(vax) || defined(i386)
    #define di_size         di_ic.ic_size.val[0]
    #endif
    #if defined(mc68000) || defined(sparc) || defined(u3b2) || defined(u3b15)
    #define di_size         di_ic.ic_size.val[1]
    #endif
    #define di_db           di_ic.ic_db
    #define di_ib           di_ic.ic_ib
    #define di_atime        di_ic.ic_atime
    #define di_mtime        di_ic.ic_mtime
    #define di_ctime        di_ic.ic_ctime
    #define di_ordev        di_ic.ic_db[0]
    #define di_blocks       di_ic.ic_blocks
    #define di_gen          di_ic.ic_gen
The di_mode, di_nlink, di_uid, di_gid, di_size, di_atime, di_mtime, and di_ctime elements of this structure have the obvious meanings. These are copied to the struct stat structure when the stat or fstat functions are called.
The di_db array stores the addresses of the first NDADDR data blocks in the file. These are called direct blocks, because their addresses are stored directly in the i-node. The value of NDADDR can vary, but is usually 12. The di_ib array stores NIADDR levels of indirect blocks. As with NDADDR, the value of NIADDR can vary, but is almost always 3.
The first element of the di_ib array contains the address of a singly-indirect block. This block is used to store the addresses of more direct blocks. Thus, for a filesystem block size of 8192, the first level of indirection allows another 2048 data blocks to be addressed.
The second element of the di_ib array contains the address of a doubly-indirect block. This block is used to store the addresses of more singly-indirect blocks. Thus, for our 8192-byte block size, the second level of indirection allows another 2048 singly-indirect blocks to be addressed, which in turn means that over four million additional data blocks can be addressed.
The third element of the di_ib array, of course, contains the address of a triply indirect block. This block is used to store the addresses of more doubly-indirect blocks. A triply-indirect block allows over eight trillion more data blocks to be addressed.
Cylinder Groups
In the original UNIX filesystem, the i-node structures were stored on the disk immediately following the super block, and then the data blocks followed the i-nodes. This is a simple layout, but results in a lot of back-and-forth head motion when accessing files. The Fast File System solves this problem by dividing the disk into several groups of cylinders called, appropriately, cylinder groups.
Each cylinder group contains a structure defining bookkeeping information for the group, a redundant copy of the super block, some i-node structures, and data blocks. The cylinder group bookkeeping information includes a list of which i-nodes in the group are in use, and which disk blocks are not in use. The cylinder group concept allows a file's data blocks to be laid out as much as possible in a contiguous fashion, minimizing the rotational latency from one block to the next.
The cylinder group information is stored in a structure of type struct cg, defined in the include file sys/fs/ufs_fs.h:
    struct  cg {
        struct      cg *cg_link;      /* linked list of cyl groups       */
        long        cg_magic;         /* magic number                    */
        time_t      cg_time;          /* time last written               */
        long        cg_cgx;           /* we are the cgx'th cylinder group*/
        short       cg_ncyl;          /* number of cyl's this cg         */
        short       cg_niblk;         /* number of inode blocks this cg  */
        long        cg_ndblk;         /* number of data blocks this cg   */
        struct      csum cg_cs;       /* cylinder summary information    */
        long        cg_rotor;         /* position of last used block     */
        long        cg_frotor;        /* position of last used frag      */
        long        cg_irotor;        /* position of last used inode     */
        long        cg_frsum[MAXFRAG];/* counts of available frags       */
        long        cg_btotoff;       /* (long) block totals per cylinder*/
        long        cg_boff;          /* (short) free block positions    */
        long        cg_iusedoff;      /* (char) used inode map           */
        long        cg_freeoff;       /* (u_char) free block map         */
        long        cg_nextfreeoff;   /* (u_char) next available space   */
        long        cg_sparecon[16];  /* reserved for future use         */
        u_char      cg_space[1];      /* space for cylinder group maps   */
    /* actually longer */
    };
Putting it All Together
Example B-2 shows a program that reads filesystem data structures directly from the disk to calculate the disk usage for each user. Running this program requires the ability to read the character-special device for the filesystem, which usually means it must be run as the superuser.
This example will not work on IRIX 5.x, which uses the EFS filesystem.
Example B-2:  diskuse
#include <sys/param.h>
#include <sys/time.h>
#include <sys/vnode.h>
#include <sys/fs/ufs_inode.h>
#include <sys/fs/ufs_fs.h>
#include <unistd.h>
#include <limits.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/vfstab.h>
#include <pwd.h>
#define sblock  sb_un.u_sblock
/*
* We need a union to hold the super block, because it takes up an
* entire disk block (the smallest unit in which you can read), but
* the structure is not actually that big.
*/
union {
    struct fs u_sblock;
    char      u_dummy[SBSIZE];
} sb_un;
/*
* Keep track of usage with this.  We need to save the uid so that
* we can sort the array by number of blocks used.
*/
struct usage {
    int     u_uid;
    size_t  u_blocks;
} usageByUid[UID_MAX];
/*
* Name of the file system defaults file.
*/
char    *vfstabFile = "/etc/vfstab";
int diskuse(char *);
int bread(int, daddr_t, char *, int);
int compare(const void *, const void *);
int
main(int argc, char **argv)
{
    int n;
    FILE *fp;
    char *fsname;
    struct passwd *pwd;
    struct vfstab vfstab;
    /*
     * Open vfstab.
     */
    if ((fp = fopen(vfstabFile, "r")) == NULL) {
        perror(vfstabFile);
        exit(1);
    }
    /*
     * For each file system...
     */
    while (--argc) {
        fsname = *++argv;
        /*
         * Rewind vfstab.
         */
        rewind(fp);
        /*
         * Look up the file system so we can get the
         * character device it's on.
         */
        if (getvfsfile(fp, &vfstab, fsname) != 0) {
            fprintf(stderr, "%s: not found in %s.\n", fsname, vfstabFile);
            continue;
        }
       
        /*
         * Zero out our counters.
         */
        memset(usageByUid, 0, UID_MAX * sizeof(struct usage));
        /*
         * Put the uids in the counters.  The array is
         * initially in uid order, but later we sort it
         * by blocks.
         */
        for (n = 0; n < UID_MAX; n++)
            usageByUid[n].u_uid = n;
        /*
         * Calculate disk usage.
         */
        if (diskuse(vfstab.vfs_fsckdev) < 0)
            continue;
        /*
         * Sort the usage array by blocks.
         */
        qsort(usageByUid, UID_MAX, sizeof(struct usage), compare);
        /*
         * Print a header.
         */
        printf("%s (%s):\n", vfstab.vfs_mountp, vfstab.vfs_fsckdev);
        /*
         * Print the usage information.
         */
        for (n = 0; n < UID_MAX; n++) {
            /*
             * Skip users with no usage.
             */
            if (usageByUid[n].u_blocks == 0)
                continue;
            /*
             * Look up the login name.  If not found,
             * use the user-id.
             */
            if ((pwd = getpwuid(usageByUid[n].u_uid)) != NULL)
                printf("\t%-10s", pwd->pw_name);
            else
                printf("\t#%-9d", usageByUid[n].u_uid);
            /*
             * Print the usage.  The number we have is in
             * 512-byte (actually DEV_BSIZE) blocks; we
             * convert this to kbytes.
             */
            printf("\t%8d\n", usageByUid[n].u_blocks / 2);
        }
        putchar('\n');
    }
    fclose(fp);
    exit(0);
}
/*
* diskuse - tabulate disk usage for the named device.
*/
int
diskuse(char *device)
{
    ino_t ino;
    daddr_t iblk;
    int i, fd, nfiles;
    struct dinode itab[MAXBSIZE / sizeof(struct dinode)];
    /*
     * Open the device for reading.
     */
    if ((fd = open(device, O_RDONLY)) < 0) {
        perror(device);
        return(-1);
    }
    /*
     * Sync everything out to disk.
     */
    (void) sync();
    /*
     * Read in the superblock.
     */
    if (bread(fd, SBLOCK, (char *) &sblock, SBSIZE) < 0) {
        (void) close(fd);
        return(-1);
    }
    /*
     * The number of files (number of inodes) is equal to
     * the number of inodes per cylinder group times the
     * number of cylinder groups.
     */
    nfiles = sblock.fs_ipg * sblock.fs_ncg;
    for (ino = 0; ino < nfiles; ) {
        /*
         * Read in the inode table for this cylinder group.  The
         * fsbtodb macro converts a file system block number to
         * a disk block number.  The itod macro converts an inode
         * number to its file system block number.
         */
        iblk = fsbtodb(&sblock, itod(&sblock, ino));
        if (bread(fd, iblk, (char *) itab, sblock.fs_bsize) < 0) {
            (void) close(fd);
            return(-1);
        }
        /*
         * For each inode...
         */
        for (i = 0; i < INOPB(&sblock) && ino < nfiles; i++, ino++) {
            /*
             * Inodes 0 and 1 are not used.
             */
            if (ino < UFSROOTINO)
                continue;
            /*
             * Skip unallocated inodes.
             */
            if ((itab[i].di_mode & IFMT) == 0)
                continue;
            /*
             * Count the blocks as used.
             */
            usageByUid[itab[i].di_uid].u_blocks += itab[i].di_blocks;
        }
    }
    return(0);
}
/*
* bread - read count bytes into buf, starting at disk block blockno.
*/
int
bread(int fd, daddr_t blockno, char *buf, int count)
{
    /*
     * Seek to the right place.
     */
    if (lseek(fd, (long) blockno * DEV_BSIZE, SEEK_SET) < 0) {
        perror("lseek");
        return(-1);
    }
    /*
     * Read in the data.
     */
    if ((count = read(fd, buf, count)) < 0) {
        perror("read");
        return(-1);
    }
    return(count);
}
/*
* compare - compare two usage structures for qsort.
*/
int
compare(const void *a, const void *b)
{
    struct usage *aa, *bb;
    aa = (struct usage *) a;
    bb = (struct usage *) b;
   
    return(bb->u_blocks - aa->u_blocks);
}
    # diskuse /usr
    /usr (/dev/rdsk/c0t3d0s5):
            root               58148
            bin                52888
            lp                  2289
            uucp                 779
            sys                    1
            adm                    1
The program begins by using the getvfsfile function to determine the character-special device for the filesystem. It then opens this device for reading. The first thing read from the disk is the super block. This is used to determine the number of i-node structures in the filesystem, which is computed by multiplying the number of cylinder groups by the number of i-nodes per cylinder group. The program then enters a loop, reading through all the groups of i-nodes. On each pass through the outer loop, a block of i-nodes is read in from the disk. The inner loop iterates over the block of i-nodes, and, for each allocated i-node, records the number of blocks used by that file.
This program does not read the data blocks associated with each file, since the information it needs is recorded in the i-node itself. To read the data blocks, it is necessary to first read the direct blocks, and then the indirect blocks. This can be done in a recursive function, as shown by the code in Example B-3.
This example will not work on IRIX 5.x, which uses the EFS filesystem.
Example B-3:  readblocks.c
#include <sys/param.h>
#include <sys/time.h>
#include <sys/vnode.h>
#include <sys/fs/ufs_inode.h>
#include <sys/fs/ufs_fs.h>
#include <unistd.h>
int bread(int, daddr_t, char *, int);
int readDataBlocks(int, struct fs *, struct dinode *, int (*)(char *, int));
int readIndirect(int, struct fs *, daddr_t, int, int *, int (*)(char *, int));
int
readDataBlocks(int fd, struct fs *sblock, struct dinode *dp,
               int (*fn)(char *, int))
{
    int i, n, count;
    char block[MAXBSIZE];
    /*
     * Read the direct blocks.  There are NDADDR of them.
     */
    count = dp->di_size;
    for (i = 0; i < NDADDR && count > 0; i++) {
        /*
         * Read in the block from disk.
         */
        n = min(count, sblock->fs_bsize);
        if (bread(fd, fsbtodb(sblock, dp->di_db[i]), block, n) < 0)
            return(-1);
        count -= n;
        /*
         * Call the user's function on the block.
         */
        (*fn)(block, n);
    }
    /*
     * Now read the indirect blocks.  There are NIADDR of them.
     * Recall that the first address is a singly indirect block,
     * the second is a doubly indirect block, and so on.
     */
    for (i = 0; i < NIADDR && count > 0; i++) {
        if (readIndirect(fd, sblock, dp->di_ib[i], i, &count, fn) < 0)
            return(-1);
    }
    return(0);
}
int
readIndirect(int fd, struct fs *sblock, daddr_t blkno, int level, int *count,
             int (*fn)(char *, int))
{
    int i, n;
    char block[MAXBSIZE];
    daddr_t idblk[MAXBSIZE / sizeof(daddr_t)];
    /*
     * Read the block in from disk.
     */
    if (blkno)
        bread(fd, fsbtodb(sblock, blkno), (char *) idblk, sblock->fs_bsize);
    else
        memset(idblk, 0, sizeof(idblk));
    /*
     * If level is zero, then this block contains disk block
     * addresses (i.e., it's singly indirect).  If level is
     * non-zero, then this block contains addresses of more
     * indirect blocks.
     */
    if (level == 0) {
        /*
         * Read the disk blocks.  There are NINDIR
         * of them.
         */
        for (i = 0; i < NINDIR(sblock) && *count > 0; i++) {
            n = min(*count, sblock->fs_bsize);
           
            if (bread(fd, fsbtodb(sblock, idblk[i]), block, n) < 0)
                return(-1);
            *count -= n;
            /*
             * Call the user's function.
             */
            (*fn)(block, n);
        }
    }
    else {
        /*
         * Decrement the level.
         */
        level--;
        /*
         * Handle the next level of indirection by calling
         * ourselves recursively with each address in this
         * block.
         */
        for (i = 0; i < NINDIR(sblock); i++) {
            n = readIndirect(fd, sblock, idblk[i], level, count, fn);
            if (n < 0)
                return(-1);
        }
    }
    return(0);
}
/*
* bread - read count bytes into buf, starting at disk block blockno.
*/
int
bread(int fd, daddr_t blockno, char *buf, int count)
{
    /*
     * Seek to the right place.
     */
    if (lseek(fd, (long) blockno * DEV_BSIZE, SEEK_SET) < 0) {
        perror("lseek");
        return(-1);
    }
    /*
     * Read in the data.
     */
    if ((count = read(fd, buf, count)) < 0) {
        perror("read");
        return(-1);
    }
    return(count);
}

Previous SectionNext Section
Books24x7.com, Inc © 2000 –  Feedback