Chapter 11 - Processes

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Process Concepts
To manipulate processes successfully, you must understand a number of basic concepts. These concepts are described below.
Process Identifiers
Each process in the system has a unique process identifier, or process ID. The process ID is a non-negative integer, usually in the range from 0 to about 32,000. Each time a new process is created, the operating system assigns it the next sequential, unused process ID. When the maximum process ID is reached, the numbers wrap around and assignment starts again from the low end of the range, skipping any IDs that are still in use. The process ID is the only well-known (i.e., accessible outside the operating system itself) identifier of a process. A process can determine its process ID by using the getpid function:
    #include <sys/types.h>
    #include <unistd.h>
    pid_t getpid(void);
The process ID is actually used as an index into an array of structures of type struct proc (see the include file sys/proc.h) called the process table. Each array element in the process table describes one process. Each struct proc structure contains all of the state information about a process, including its real and effective user- and group IDs, its signal mask, its list of pending signals, the command name, the amount of processor time used so far, pointers into the open file table, and all sorts of other information.
New processes come into being when existing processes create them. When a process creates another process, the new process is said to be a child of the existing process. Similarly, the existing process is said to be the parent of the new process. The parent process ID of a process is the process ID of the process that created it. A process can usually learn its parent's process ID by using the getppid function:
    #include <sys/types.h>
    #include <unistd.h>
    pid_t getppid(void);
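As a minimal sketch of the two calls above, the helper below prints both identifiers. The name print_process_ids is hypothetical and chosen only for illustration. The casts to long are there because the width of pid_t can differ between systems.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: print this process's ID and its parent's ID.
   Returns the number of characters written, or a negative value on error. */
int print_process_ids(void)
{
    pid_t self = getpid();
    pid_t parent = getppid();

    return printf("pid=%ld ppid=%ld\n", (long)self, (long)parent);
}
```

Calling print_process_ids() from main in two separate runs of the same program should normally show a different pid each time, while ppid stays the same as long as the same shell launches it.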
System processes
Generally, there is no direct correspondence between process IDs and programs. When a program is executed, it just gets the next available process ID. Execute the program more than once, and it will have a different process ID each time. However, there are a few, usually fewer than five, special processes that always have the same process ID. These processes are called system processes.
The process with process ID 0 is the system scheduler, usually called sched or swapper. It is responsible for allocating few-millisecond time slices to all the other processes on the system. The scheduler is not a command in the usual sense; there is no corresponding program on the disk for it. It is a part of the operating system kernel itself.
The process with process ID 1 is the init process. This program is responsible for bringing the system up after a reboot. It executes the /etc/rc files, and brings the system to a specific state (usually multiuser operation). The init process is a regular user-level process (i.e., it's a command that can be executed). After starting up the system, init stays around to perform some process-related bookkeeping tasks. If init is killed (or otherwise exits), the system will shut down.
On modern versions of UNIX that support virtual memory, the process with process ID 2 is usually the page daemon, called pagedaemon or pageout. This is a kernel process like the scheduler, and is responsible for moving unused pages of memory out to disk so that other programs may use them.
Termination Status
Eventually, most processes terminate normally, finishing whatever they're intended to do. There are three ways for a process to terminate normally; it may:
 1. Execute a return from the main function.
 2. Call the exit function (described later in this chapter). This function is defined by ANSI C; it calls any exit handlers that have been defined, and closes all Standard I/O Library streams.
 3. Call the _exit function. This function is not usually called directly, but is called by exit. It is responsible for cleaning up operating system-specific resources used by the process; since ANSI C is operating system-independent, it cannot specify these functions.
There are also two ways in which a process can terminate abnormally; it may:
 1. Call the abort function (see Chapter 16, Miscellaneous Routines).
 2. Receive a signal from itself, from another process, or from the operating system. The signal can cause the program to terminate, sometimes with an accompanying core dump.
When a program terminates, the operating system provides a termination status to the process' parent. The termination status indicates whether the process terminated normally or abnormally. If the process terminated normally, the termination status provides the parent process with an exit status for the process; the exit status is used by some programs to indicate success, failure, and other events. If the process terminated abnormally, the termination status includes information about how the program terminated (what signal it received) and whether or not a core dump was produced.
The termination status of a child process is returned to the parent process when the parent calls the wait function, or one of its derivatives. These functions are described later in the chapter. The important point to understand here is that it is up to the parent to ask for the termination status of a child—it can do this as soon as the child terminates, several minutes or hours later, or even not at all.
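The following sketch shows one way a parent might collect and decode a termination status. It assumes the POSIX waitpid function and the WIFEXITED, WEXITSTATUS, WIFSIGNALED, and WTERMSIG macros from sys/wait.h; WCOREDUMP is not available everywhere, so it is guarded. The helper names child_status and describe_status are hypothetical.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that either exits with `code`
   (when sig == 0) or sends itself signal `sig`. The raw termination
   status is stored in *status. Returns 0 on success, -1 on error. */
int child_status(int code, int sig, int *status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);      /* abnormal termination, if sig is fatal */
        _exit(code);         /* normal termination */
    }
    return waitpid(pid, status, 0) == pid ? 0 : -1;
}

/* Hypothetical helper: print what a termination status says. */
int describe_status(int status)
{
    if (WIFEXITED(status))
        return printf("normal termination, exit status %d\n",
                      WEXITSTATUS(status));
    if (WIFSIGNALED(status)) {
#ifdef WCOREDUMP
        return printf("abnormal termination, signal %d%s\n",
                      WTERMSIG(status),
                      WCOREDUMP(status) ? " (core dumped)" : "");
#else
        return printf("abnormal termination, signal %d\n",
                      WTERMSIG(status));
#endif
    }
    return printf("status 0x%x not decoded by this sketch\n",
                  (unsigned)status);
}
```

For instance, after child_status(7, 0, &st) the status describes a normal termination with exit status 7, and after child_status(0, SIGKILL, &st) it describes termination by SIGKILL.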
Zombie processes
Since it is up to the parent process to request the termination status of a child process, what happens when the child process terminates? The system can't keep the entire process around; resources such as memory, open files, process table slots (process IDs), and so forth would rapidly be exhausted. On the other hand, it can't get rid of the process entirely, either, because then the termination status would not be available to return to the parent process.
To resolve this dilemma, UNIX compromises. When a process terminates, the operating system frees up all of the resources used by the process except the process table entry. The termination status of the process is stored in the process table entry, where it can be retrieved later by the parent. When the parent process finally does issue a call to wait or a similar function, the termination status is returned and the process table slot can be freed for reuse.
During the time between when a process terminates and the parent picks up its termination status, the process is called a zombie process. All of its resources have been freed except for the process table entry, and thus it is dead, yet still walking around in the system. Zombie processes are usually labeled as “<defunct>” in the output from the ps command and have a process status of “Z.”
Orphaned processes
When a process terminates before its parent, it becomes a zombie process until the parent picks up its termination status. But what happens when the parent terminates before the child process? This is not an abnormal event; in fact, it happens all the time. Does the child process still have a parent? What happens if the child calls getppid?
UNIX handles this situation by arranging for the init process to become the new parent process of any process whose real parent terminates. When a process terminates, the operating system goes through the list of all active processes, looking for any whose parent is the terminating process. If it finds any, it sets those processes' parent process ID to 1 (the init process).
What happens when a process that has been inherited by init terminates? Since its original parent is no longer around to pick up its termination status, does it become a zombie forever? Fortunately, no. One of the functions of the init process is to call one of the wait functions each time one of its child processes terminates. In this way it picks up these orphaned processes' termination statuses (it simply discards them), and keeps the system from becoming clogged with zombie processes.
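The reparenting described above can be observed with a sketch like the following. A middle process creates a child and exits immediately; once the original process has collected the middle process's status, it tells the orphan (through a pipe) to report what getppid now returns. The helper name adopted_parent_of_orphan is hypothetical, and the sketch assumes that orphans have been handed to their new parent by the time the dead parent's status can be collected. On a classic system the result is 1; some newer systems may hand orphans to a different designated process, so the code does not assume a particular value.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create an orphan and report which process
   adopts it. Returns the orphan's new parent process ID, or -1. */
pid_t adopted_parent_of_orphan(void)
{
    int go[2], back[2], status;
    pid_t middle, result = -1;
    char token = 'x';

    if (pipe(go) < 0)
        return -1;
    if (pipe(back) < 0) {
        close(go[0]);
        close(go[1]);
        return -1;
    }

    middle = fork();
    if (middle == 0) {
        /* Middle process: create the future orphan, then exit. */
        pid_t orphan = fork();
        if (orphan == 0) {
            pid_t now;
            close(go[1]);
            close(back[0]);
            /* Block until told that the middle process is gone. */
            if (read(go[0], &token, 1) != 1)
                _exit(1);
            now = getppid();
            if (write(back[1], &now, sizeof now) != (ssize_t)sizeof now)
                _exit(1);
            _exit(0);
        }
        _exit(orphan < 0 ? 1 : 0);
    }

    close(go[0]);
    close(back[1]);
    /* Collect the middle process, then release the orphan. */
    if (middle > 0 && waitpid(middle, &status, 0) == middle &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        write(go[1], &token, 1) == 1) {
        if (read(back[0], &result, sizeof result) != (ssize_t)sizeof result)
            result = -1;
    }
    close(go[1]);
    close(back[0]);
    return result;
}
```

The original process never waits for the orphan and could not do so, since the orphan is not its child; collecting that status is left to whichever process adopted it.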
Process Groups
In addition to having a process ID, each process is also a member of a process group. A process group is a collection of one or more processes, and is identified by a unique positive integer called a process group ID. A process may obtain its process group ID by calling the getpgrp function:
    #include <sys/types.h>
    #include <unistd.h>
    pid_t getpgrp(void);
The processes in a process group are usually related in some way. Process groups were introduced in Berkeley UNIX to implement job control. Shells that perform job control, such as the C shell or the Korn shell, usually place all of the commands in a pipeline into a single process group. For example, in the command
    % eqn myreport | tbl | troff | psdit | lp
each program (eqn, tbl, troff, psdit, and lp) would be running as a separate process with a separate process ID (e.g., 123, 124, 125, 126, and 127). However, all five processes would have the same process group ID, e.g., 127. This allows the shell to treat those five processes as a single entity (a “job”) for purposes of stopping them, continuing them, and moving them between the foreground and the background.
The process group leader
Each process group starts out with a process group leader. This is the process whose process group ID is equal to its process ID. It is, of course, possible for the process group leader to terminate at any time. The process group, however, remains in existence until the last process in that process group terminates. When a process group is created as the result of a pipeline, the last process in the pipeline is usually the process group leader. There is no deep and meaningful reason for this; it is simply a side effect of the way pipelines are created.
Sessions
The POSIX standard introduced still another construct, called a session. A session is a collection of one or more process groups. The idea is that while each process group is a group of related processes (such as a pipeline), a session is a group of related process groups (such as all the jobs currently being run by the user logged in on a particular terminal). Sessions exist only for job control, serving mainly to fix some deficiencies in the Berkeley job control implementation (which only used process groups).
The session leader
When a process creates a new session, it becomes the leader of that session. The session leader has certain privileges that other members of the session do not (see below).
In the POSIX standard, there is no concept of a session ID like that of the process ID and process group ID. However, SVR4 defines such an identifier; it is equal to the process ID of the session leader. A process can be identified as a session leader if its process ID, process group ID, and session ID are all equal. To make this identification process easier, SVR4 provides the getsid function:
    #include <sys/types.h>
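    pid_t getsid(pid_t pid);
The pid argument names the process whose session ID is wanted; a value of 0 refers to the calling process. This function was not part of the original POSIX standard, although later editions added it.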
The Controlling Terminal
A controlling terminal can be associated with a session; in the case of interactive logins, the controlling terminal is usually the device on which the user is logged in. When a session is initially created, it has no controlling terminal. A controlling terminal is allocated for a session when the session leader opens a terminal device that is not already associated with a session, unless the session leader supplies the O_NOCTTY flag on the call to open (see Chapter 3, Low-Level I/O Routines). The session leader that establishes the connection to the controlling terminal is called the controlling process.
When a session has a controlling terminal associated with it, a number of things can happen. At all times, the controlling terminal is associated with a process group. When one of the session's process groups has the same process group ID as that of the controlling terminal, that process group is said to be in the foreground. If the process group's process group ID is not the same as that of the controlling terminal, the process group is said to be in the background. The foreground or background status of a process group has a number of interesting effects.
Whenever a user presses the interrupt key (usually CTRL-C) or quit key (usually CTRL-\) on the controlling terminal, a signal (either SIGINT or SIGQUIT) is delivered to all processes in the foreground process group. If job control is enabled, pressing the suspend key (usually CTRL-Z) on the controlling terminal sends a SIGTSTP signal to all processes in the foreground process group. Whenever a modem disconnect on the controlling terminal is detected by the system, the SIGHUP signal is sent to the controlling process (session leader).
When job control is enabled, only a process in the foreground process group may read from the terminal. Processes in background process groups will be stopped with a SIGTTIN signal if they attempt to read from the controlling terminal. If the TOSTOP mode is set on the controlling terminal (see Chapter 12, Terminals), only processes in the foreground process group may write to the controlling terminal. If a process in a background process group attempts to write to the controlling terminal, it will be stopped with a SIGTTOU signal.
Job control shells, such as the C shell and Korn shell, use the controlling terminal to implement job control. In order to move a job into the foreground, the shell changes the process group of the controlling terminal to the process group ID of that job and, if necessary, starts the job running again by sending the processes in that process group a SIGCONT signal. Each time a different job is placed into the foreground, the controlling terminal's process group is changed to the process group of that job.
Sometimes, a program wishes to talk to the controlling terminal, regardless of whether or not the standard input or standard output have been redirected. For example, the passwd program insists on reading a new password from the keyboard; it does not want to read it from a file (if the password is stored in a file, it is probably not secret any more). When this is necessary, the process can open the special file /dev/tty. This special filename is translated within the kernel to refer to the controlling terminal for the process. Only a process with a controlling terminal can open /dev/tty.
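As a sketch of the /dev/tty technique, the hypothetical helper below opens the controlling terminal and reports failure when the process has none (for example, when it was started by a daemon). The name open_controlling_tty is an assumption made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the controlling terminal regardless of
   how standard input and output have been redirected. `flags` is
   passed to open (e.g., O_RDWR). Returns a file descriptor, or -1
   if the process has no controlling terminal or the open fails. */
int open_controlling_tty(int flags)
{
    return open("/dev/tty", flags);
}
```

A program in the style of passwd could call open_controlling_tty(O_RDWR), write its prompt to the returned descriptor, and read the reply from it, so that neither is affected by redirection on the command line. The caller is responsible for closing the descriptor.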
Priorities
The UNIX scheduler is responsible for allocating slices of the processor's time to processes in the system. In order to do this fairly, the scheduler computes a priority for each process in the system. These priorities are recalculated frequently, based on a complex formula that takes into account such things as the amount of memory the process is using, the amount of input and output it is performing, and how long it's been since the last time the process got any processor time. The calculation varies among different versions of UNIX, but the end result is the same—an ordered list of processes, sorted by priority. Generally speaking, processes with a high priority execute more often and/or for longer time slices.
A process cannot set or change its priority; this calculation is performed by the operating system. However, the process can influence the priority calculation slightly. One of the parameters of the scheduler's priority calculation is a process' nice value. This is a number that ranges from 0 to 39, with the default value being 20. A process can lower its priority (allow other processes to take precedence) by increasing its nice value to something between 21 and 39. (This is where the name “nice” comes from: large jobs are supposed to be nice to the system by increasing their nice value.) To raise its priority (take precedence over other processes), a process decreases its nice value to something between 0 and 19. Usually, any process may increase its nice value (give itself a worse priority), but only processes with superuser privileges may lower their nice values. To change the nice value, use the nice function:
    #include <unistd.h>
    int nice(int incr);
When called, nice adds incr, which may be positive or negative, to the process' current nice value.
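One detail worth sketching is error checking. On many systems nice can legitimately return -1 as a successful result, so a careful caller clears errno first and treats -1 as a failure only if errno was set. The wrapper name adjust_nice is hypothetical, and the errno convention is an assumption about the platform's nice implementation.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: add incr to the caller's nice value.
   Returns 0 on success, -1 on failure (errno describes the error). */
int adjust_nice(int incr)
{
    errno = 0;
    if (nice(incr) == -1 && errno != 0)
        return -1;
    return 0;
}
```

A large batch job might call adjust_nice(10) near the start of main to give way to interactive work. A call such as adjust_nice(-5) would be expected to fail for a process without superuser privileges.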
