
Chapter 2 - Utility Routines

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Parsing Command-Line Arguments
Almost every UNIX command has arguments, and most commands follow a generally accepted set of rules for how these arguments are formatted:
 1. Command names must be between two and nine characters long.
 2. Command names must include only lowercase letters and digits.
 3. Option names must be one character long.
 4. You must precede all options with a dash (-).
 5. You can group options with no arguments after a single dash (-). This means that you can use either -a -b -c or -abc.
 6. You must precede the first option argument following an option with a tab or space character. This means that you must use -a arg; -aarg is illegal.
 7. Option arguments cannot be optional. This means that you cannot allow both -a and -a arg.
 8. You must separate groups of option arguments following an option by commas or by space or tab characters and quotes. This means that you must use either -a xxx,yyy,zzz or -a "xxx yyy zzz".
 9. All options must precede operands on the command line. This means that command -a -b -c filename is legal, while command -a filename -b -c is not.
 10. You can use a double dash (--) to indicate the end of the options. This allows operands that begin with a dash.
 11. The order of the options relative to one another should not matter.
 12. The relative order of the operands can affect their significance in ways determined by the command with which they are used. This means that a command is allowed to assign meaning to the order of its operands; for example, the cp command takes its first operand as the input file and its second operand as its output file. Reversing the order of these operands produces different results.
 13. A dash (-) preceded and followed by a space character should only be used to mean standard input. This tells a program that generally reads from files, such as troff, to read from the standard input. It allows files to be read before processing the standard input.
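The last rule is usually implemented by the program itself rather than by any library routine. As a minimal sketch, a program that reads from files might treat an operand consisting of a single dash as a request for the standard input. The function name open_input is hypothetical and chosen only for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Open an operand for reading. A lone "-" is taken to mean the
 * standard input; anything else is treated as a file name.
 * Returns NULL if the file cannot be opened.
 * (open_input is a hypothetical helper, not a library routine.)
 */
FILE *
open_input(const char *name)
{
    if (strcmp(name, "-") == 0)
        return stdin;

    return fopen(name, "r");
}
```

A caller would loop over its operands, call open_input on each one, and take care not to fclose the stream when it is stdin.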
Depending on how long you've been using UNIX and how many versions you've used, most of these rules, except perhaps number 8, should look familiar. Early versions of System V provided a library routine, getopt, that enforced most of these rules and allowed a program to easily parse command lines that followed the rules. Later versions provided a shell command, getopt, which enabled shell scripts to use these rules as well.
In SVR4, the getopt command is available, as well as a newer command that is built in to the shell, called getopts. Two library routines are provided as well: getopt, which enforces the rules described previously and parses command lines that follow these rules, and getsubopt, which enforces rule number 8 and parses option arguments that follow that rule. These functions are called as follows:
      #include <stdlib.h>
      int getopt(int argc, char * const *argv, const char *optstring);
      extern char *optarg;
      extern int optind, opterr, optopt;
      int getsubopt(char **optionp, const char * const *tokens, char **valuep);
optstring contains a list of characters that are legal options for the command. If the option letter is to be followed by an option argument, then the letter should be followed by a colon (:) in optstring.
When getopt is called, it returns the next option letter in argv that matches one of the letters in optstring. If the option letter has an argument associated with it (as indicated by a colon character in optstring), getopt sets the external variable optarg to point to the option argument.
The external variable optind contains the index into argv of the next argument to process; it is initialized to 1 before the first call to getopt. When all options are processed, getopt returns -1. The special option two dashes (--) can be used to delimit the end of the options; when it is encountered, getopt skips over it and returns -1. This is used to stop option processing before encountering non-option arguments that begin with a dash.
When getopt encounters an option letter not included in optstring or cannot find an argument after an option that should have one, it prints an error message and returns a question mark (?). The character that caused the error is placed in the external variable optopt. To disable getopt's printing of the error message, the external variable opterr should be set to zero.
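The interplay of optarg, optind, opterr, and optopt can be sketched in a few lines. The fragment below is an illustration under stated assumptions, not part of any system library: the structure settings, the function parse_settings, and the options -v and -o are all hypothetical. It includes <unistd.h>, where many current systems declare getopt, and defines a feature-test macro that some compilers need; neither detail comes from the SVR4 synopsis above.

```c
#define _XOPEN_SOURCE 700   /* assumption: may be needed on some modern systems */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>         /* many current systems declare getopt here */

/* Hypothetical result structure, for illustration only. */
struct settings {
    int verbose;            /* set by -v */
    const char *outfile;    /* set by -o file, otherwise NULL */
    int first_operand;      /* index into argv of the first operand */
};

/*
 * Parse "-v" and "-o file". Returns 0 on success, or -1 after
 * reporting an unknown option or a missing option argument.
 */
int
parse_settings(int argc, char **argv, struct settings *s)
{
    int c;

    s->verbose = 0;
    s->outfile = NULL;
    s->first_operand = 1;

    opterr = 0;             /* suppress getopt's own error message */
    optind = 1;             /* its initial value; set here only so the
                               function can be called more than once */

    while ((c = getopt(argc, argv, "vo:")) != -1) {
        switch (c) {
        case 'v':
            s->verbose = 1;
            break;
        case 'o':
            s->outfile = optarg;    /* points at the option argument */
            break;
        case '?':
            /* optopt holds the character that caused the error */
            fprintf(stderr, "%s: problem with option -%c\n",
                    argv[0], optopt);
            return -1;
        }
    }

    /* getopt has returned -1; optind now indexes the first operand */
    s->first_operand = optind;
    return 0;
}
```

Because opterr is zero, the only diagnostic is the one printed by the '?' case. Given the arguments -- -v, the sketch leaves verbose unset and reports -v as the first operand, since the double dash ends option processing.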
getsubopt is used to parse the suboptions in an option argument initially parsed by getopt. These suboptions are separated by commas (unlike rule 8 above, getsubopt does not allow them to be separated by spaces), and consist either of a single token or a token-value pair, separated by an equal sign (=). Since commas delimit suboptions in the option string, they are not allowed to be part of the suboption or the value of a suboption.
When calling getsubopt, optionp is the address of a pointer to the suboption string, tokens is a pointer to an array of strings representing the possible token values the option string can contain, and valuep is the address of a character pointer that can be used to return any value following an equal sign.
getsubopt returns the index of the token (in the tokens array) that matched the suboption in the option string, or -1 if there was no match. If the suboption has a value associated with it, getsubopt sets *valuep to point at the first character of the value; otherwise, it sets *valuep to null. If the string at *optionp contains only one suboption, *optionp is updated to point to the null character at the end of the string. Otherwise, the suboption is isolated by replacing the comma character with a null character, and *optionp is updated to point to the next suboption.
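Since getsubopt edits the option string in place, it can help to see the loop in isolation. The following is a minimal sketch; the token table, the structure style, and the function parse_style are hypothetical names used only for illustration, and the feature-test macro is an assumption that some modern compilers require before they declare getsubopt.

```c
#define _XOPEN_SOURCE 700   /* assumption: may be needed on some modern systems */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical token table; the index of each string is its token number. */
enum { TOK_COLOR = 0, TOK_SOLID = 1 };
static char *const tokens[] = { "color", "solid", NULL };

/* Hypothetical result structure, for illustration only. */
struct style {
    int solid;              /* nonzero if "solid" was given */
    const char *color;      /* value of "color=...", or NULL */
    int unknown;            /* number of unrecognized suboptions */
};

/*
 * Split arg (for example "color=red,solid") into suboptions.
 * The string is modified: each comma is overwritten with a null
 * character, so arg must be writable.
 */
void
parse_style(char *arg, struct style *st)
{
    char *options = arg;
    char *value;

    st->solid = 0;
    st->color = NULL;
    st->unknown = 0;

    while (*options != '\0') {
        switch (getsubopt(&options, tokens, &value)) {
        case TOK_COLOR:
            st->color = value;      /* NULL if there was no "=value" */
            break;
        case TOK_SOLID:
            st->solid = 1;
            break;
        default:                    /* -1: no token matched */
            st->unknown++;
            break;
        }
    }
}
```

After the call, color points into the original string, so the caller should copy it if the string is going to be reused or freed.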
Although this sounds relatively complicated, Example 2-11 should make this clear. Example 2-11 shows a program that uses getopt and getsubopt to parse its command line.
Example 2-11:  parse-cmdline
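#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
 * outputLine is not defined in this listing; this minimal version,
 * which prints one line on the standard output, is an assumption.
 */
void
outputLine(char *line)
{
    puts(line);
}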
/*
* Sub-options.
*/
char    *subopts[] = {
#define COLOR   0
    "color",
#define SOLID   1
    "solid",
    NULL
};
int
main(int argc, char **argv)
{
    int c;
    char buf[1024];
    extern int optind;
    extern char *optarg;
    char *options, *value;
    /*
     * Process the arguments.
     */
    while ((c = getopt(argc, argv, "cf:o:st")) != -1) {
        switch (c) {
        case 'c':
            outputLine("circle");
            break;
        case 'f':
            strcpy(buf, "filename: ");
            strcat(buf, optarg);
            outputLine(buf);
            break;
        case 's':
            outputLine("square");
            break;
        case 't':
            outputLine("triangle");
            break;
        case '?':
            outputLine("command line error");
            break;
        case 'o':
            options = optarg;
            /*
             * Process the sub-options.
             */
            while (*options != '\0') {
                switch (getsubopt(&options, subopts, &value)) {
                case COLOR:
                    if (value != NULL) {
                        strcpy(buf, "color: ");
                        strcat(buf, value);
                    }
                    else {
                        strcpy(buf, "missing color");
                    }
                    outputLine(buf);
                    break;
                case SOLID:
                    outputLine("solid");
                    break;
                default:
                    strcpy(buf, "unknown option: ");
                    strcat(buf, value);
                    outputLine(buf);
                    break;
                }
            }
            break;
        }
    }
    /*
     * Process extra arguments.
     */
    for (; optind < argc; optind++) {
        strcpy(buf, "extra argument: ");
        strcat(buf, argv[optind]);
        outputLine(buf);
    }
    exit(0);
}
     % parse-cmdline -c -f picture.out -o solid
     circle
     filename: picture.out
     solid
     % parse-cmdline -o color=red,solid -t
     color: red
     solid
     triangle
     % parse-cmdline -s -z
     square
     parse-cmdline: illegal option -- z
     command line error
This program represents the argument-parsing section for a hypothetical graphics program that draws a circle, square, or triangle, as specified by the -c, -s, or -t arguments. The -f argument allows you to specify an output file; otherwise, the program writes to the standard output. The -o argument allows you to specify two options: solid, which indicates that the figure should be filled in instead of hollow, and color, which allows you to specify a color for the figure.
As shown in the third command invocation in the example, an illegal option (-z) produces an error message. As mentioned earlier, you can disable this message by setting the external variable opterr to zero. Note that the program also parses additional operands on the command line (for example, the command might require two additional arguments, the height and width of the figure); this is done by the last few lines of code.
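To make the height-and-width case concrete, the operands left over after option processing could be checked along the following lines. This is only a sketch: parse_dimensions is a hypothetical function, and its parameter first stands for the value that optind holds once getopt has returned -1.

```c
#include <stdlib.h>

/*
 * Expect exactly two operands, a height and a width, starting at
 * argv[first]. Returns 0 and stores both values on success, or -1
 * if the count is wrong or an operand is not a whole number.
 * (parse_dimensions is a hypothetical helper for illustration.)
 */
int
parse_dimensions(int argc, char **argv, int first, long *height, long *width)
{
    char *end;

    if (argc - first != 2)
        return -1;                  /* wrong number of operands */

    *height = strtol(argv[first], &end, 10);
    if (end == argv[first] || *end != '\0')
        return -1;                  /* empty or has trailing junk */

    *width = strtol(argv[first + 1], &end, 10);
    if (end == argv[first + 1] || *end != '\0')
        return -1;

    return 0;
}
```

In a program like the one above, this check would take the place of the final loop, with optind passed as first.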
Porting Notes
The use of getopt has never really caught on. Some people use it; others don't. One of the primary arguments against it is that the arguments to many commands simply don't fit into the set of rules that it enforces. Indeed, in SVR4, the modification of a number of commands to use getopt resulted in noticeable changes to the command lines with which most users are familiar.
Most versions of System V have some version of getopt, but getsubopt is new to SVR4 and is not very portable. Older BSD systems usually do not have either function, although a number of vendors have added one or both of them to their System V compatibility libraries. However, there are several public domain implementations of getopt available; if you really want to use it, consider obtaining one of these and distributing it with your program.
