
Chapter 2 - Utility Routines

UNIX Systems Programming for SVR4
David A. Curry
 Copyright © 1996 O'Reilly & Associates, Inc.

Manipulating Byte Strings
The functions described in the previous section all operate on character strings, which are arrays of non-zero bytes terminated by a zero (null) byte. However, there are also times when you need to perform similar operations on strings in which the null byte is not a terminator, but a legal value. Because every byte value is legal, these strings, called byte strings, do not have a terminator character. Instead, they are always paired with an integer value indicating how many bytes are in the string.
The routines described in this section, for manipulating byte strings, closely resemble the character string routines described in the previous section. However, you can use these functions not only with strings of characters (which are a subset of byte strings), but also with any other arbitrary “chunk” of memory, such as a two-dimensional array, an array of pointers, an integer, an array of floating-point numbers, a structure, or an array of structures (although some of the routines don't really make sense on all these data types).
Comparing Byte Strings
To compare two byte strings (areas of memory), use the memcmp function:
   #include <string.h>
   int memcmp(const void *s1, const void *s2, size_t n);
memcmp compares the first n bytes of the areas of memory pointed to by s1 and s2. Just like strcmp, memcmp returns an integer less than, equal to, or greater than zero depending upon whether s1 is lexicographically less than, equal to, or greater than s2. Usually, this distinction is not meaningful for arbitrary “binary” data (what is the meaning of an array of floating-point numbers being lexicographically greater than another array of floating-point numbers?), and thus memcmp is usually just used to test for equivalence.
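As an illustration of memcmp used purely as an equality test, here is a minimal sketch that compares two arrays of integers. The helper name int_arrays_equal is hypothetical and not part of any library; the assert call is only a precondition check added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first count elements of a and b
 * contain identical bytes, and 0 otherwise. */
int int_arrays_equal(const int *a, const int *b, size_t count)
{
    assert(a != NULL && b != NULL);
    /* memcmp takes its length in bytes, not in array elements. */
    return memcmp(a, b, count * sizeof(int)) == 0;
}
```

Note that zero-valued elements do not stop the comparison, as they would with strcmp. This kind of test is less reliable on structures, because any padding bytes between members are compared as well.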
Copying Byte Strings
To copy one array of bytes to another, use the memcpy function:
   #include <string.h>
   void *memcpy(void *dst, const void *src, size_t n);
memcpy copies exactly n bytes from src into dst, and returns a pointer to dst.
memcpy is the preferred function for copying byte strings, but there is one case in which it does not work properly. If the areas pointed to by src and dst overlap, the behavior of memcpy is undefined, and the copied data may be corrupted. For this case, the memmove function is provided:
   #include <string.h>
   void *memmove(void *dst, const void *src, size_t n);
This function performs the same task as memcpy, but correctly handles the case where src and dst overlap. (There are two separate functions because the implementation of memcpy is more efficient than memmove on some architectures, and so you can use the faster implementation when overlap is not a concern.)
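A typical overlapping copy is sliding the contents of a buffer along by one position. The following is a minimal sketch of that case; the helper name open_gap_at_front is hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: slide the first len - 1 bytes of buf one position
 * to the right. The last byte is overwritten, and buf[0] keeps its old
 * value until the caller stores something new there. */
void open_gap_at_front(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2)
        return;
    /* The source (buf) and destination (buf + 1) overlap, so memmove is
     * needed here; memcpy would not be safe. */
    memmove(buf + 1, buf, len - 1);
}
```

When the two areas are known to be distinct, for example when copying one structure into another, memcpy remains the natural choice.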
A third function for copying one byte string to another is called memccpy:
   #include <string.h>
   void *memccpy(void *dst, const void *src, int c, size_t n);
memccpy copies bytes from src to dst, stopping after a byte with the value c is copied, or after n bytes are copied, whichever comes first. It returns a pointer to the byte in dst that follows the copy of c, or a null pointer if no byte with value c is found in the first n bytes of src. Unlike most of the functions described in this section, memccpy is not specified by the ANSI C standard.
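One use for memccpy is copying a delimited field. The sketch below copies up to and including the first ':' byte; the helper name copy_field and the choice of delimiter are assumptions made for the example, and the feature-test macro is there only to request the declaration on systems that hide it by default.

```c
#define _XOPEN_SOURCE 600   /* assumption: request the XSI declaration */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy bytes from src to dst up to and including the
 * first ':' byte, examining at most n bytes. Returns the number of bytes
 * copied when a ':' was found, or 0 if none occurred in the first n bytes
 * (in which case all n bytes have still been copied into dst). */
size_t copy_field(unsigned char *dst, const unsigned char *src, size_t n)
{
    unsigned char *end;

    assert(dst != NULL && src != NULL);
    end = memccpy(dst, src, ':', n);
    if (end == NULL)
        return 0;
    /* end points just past the copied ':' in dst. */
    return (size_t)(end - dst);
}
```

Because the returned pointer refers to dst, subtracting dst from it gives the number of bytes copied, including the delimiter.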
Searching Byte Strings
To search an array of bytes for the first occurrence of a specific value, use the memchr function:
   #include <string.h>
   void *memchr(const void *s, int c, size_t n);
memchr searches the first n bytes of s, starting from the beginning, until a byte with value c (interpreted as an unsigned char) is found. It returns a pointer to the byte, or the predefined constant NULL, if the byte is not found.
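Since memchr returns a pointer, a common idiom is to subtract the start of the buffer to obtain an offset. A minimal sketch, using a hypothetical helper named byte_offset:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first byte equal to c
 * within the first n bytes of buf, or -1 if there is no such byte. */
long byte_offset(const unsigned char *buf, size_t n, int c)
{
    const unsigned char *p;

    assert(buf != NULL);
    p = memchr(buf, c, n);
    return (p == NULL) ? -1L : (long)(p - buf);
}
```

Unlike strchr, the search does not stop at a zero byte; a zero byte is simply another value that can be searched for.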
When using integers as bit fields, where each bit is interpreted as a Boolean true/false value, it is convenient to find the first bit in the integer that is “set” (non-zero). To do this, use the ffs function:
   #include <string.h>
   int ffs(int i);
ffs finds the first bit set in the argument it is passed and returns the index of that bit. Bits are numbered starting with 1 (one) from the low order bit. A return value of zero indicates that no bits are set (i.e., the value passed was equal to zero). This function is not specified by the ANSI C standard.
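To visit every set bit rather than just the first, ffs can be called repeatedly, clearing each bit as it is reported. The following is a sketch under that approach; the helper name list_set_bits is hypothetical. It includes strings.h as well as string.h, on the assumption that the system follows POSIX, which places the declaration of ffs in strings.h.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* assumption: POSIX systems declare ffs here */

/* Hypothetical helper: store the 1-based positions of all set bits of
 * flags in out, lowest first, and return how many were stored. The out
 * array must have room for one entry per bit in an int. */
int list_set_bits(int flags, int *out)
{
    unsigned int bits = (unsigned int)flags;
    int count = 0;
    int pos;

    assert(out != NULL);
    while ((pos = ffs((int)bits)) != 0) {
        out[count++] = pos;
        bits &= ~(1u << (pos - 1));   /* clear the bit just reported */
    }
    return count;
}
```

Because ffs numbers bits from 1, the shift count used to clear a bit is one less than the value returned.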
Initializing Byte Strings
When working with arrays of data, it is frequently necessary to initialize the entire array to a known value (often zero or null). To do this, use the memset function:
   #include <string.h>
   void *memset(void *s, int c, size_t n);
memset fills the area pointed to by s with n bytes of value c and returns a pointer to s. The value in c is converted to an unsigned character before it is stored, so only values between 0 and 255 are meaningful.
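The two usual uses of memset, clearing an array of structures and filling a byte buffer with a chosen value, can be sketched as follows. The record structure and both helper names are hypothetical, introduced only for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record type used only for this illustration. */
struct record {
    int  id;
    char name[16];
};

/* Set every byte of the first count records in table to zero. */
void clear_records(struct record *table, size_t count)
{
    assert(table != NULL);
    memset(table, 0, count * sizeof(struct record));
}

/* Fill n bytes of buf with value; only the low-order eight bits of
 * value are stored in each byte. */
void fill_bytes(unsigned char *buf, size_t n, int value)
{
    assert(buf != NULL);
    memset(buf, value, n);
}
```

Keep in mind that memset works a byte at a time. Filling an array of integers with the value 1, for instance, sets every byte of each integer to 1 rather than setting each integer to 1.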
Porting Notes
The functions described in this section were first introduced in System V UNIX, and therefore exist on any System V-based system. Because most of them are a part of the ANSI C standard, they exist on most modern versions of UNIX as well, regardless of whether or not they are System V-based. However, when porting code from BSD-based systems, there are a number of things you need to consider:
 On BSD-based systems, the include file for these functions is called strings.h, rather than string.h. In fact, you can usually use the presence or absence of the string.h file to determine whether or not all of the functions described in this section are present. Some systems, such as SunOS 4.x, provide both files but their contents are not the same.
 The BSD equivalent of the memcmp function is called bcmp:
       #include <strings.h>
       int bcmp(const char *s1, const char *s2, int n);
bcmp returns 0 if the two strings are equal, and a non-zero value if they are not. Unlike memcmp, it gives no indication of which string is lexicographically greater.
 The BSD version of the memcpy and memmove functions is called bcopy:
       #include <strings.h>
       void bcopy(const char *src, char *dst, int n);
Note that the src and dst arguments are in the opposite order from that used by memcpy and memmove. bcopy is more properly replaced by memmove than by memcpy, because bcopy, like memmove, properly handles the case in which the source and destination strings overlap.
 The BSD version of the memset function is called bzero:
       #include <strings.h>
       void bzero(char *s, int n);
bzero initializes the array pointed to by s to zero; there is no choice of value as there is with memset.
 There are no BSD equivalents for memchr or memccpy.
 When porting from a BSD environment to SVR4, it is usually sufficient to add the following lines to your program:
       #define bcmp(b1, b2, n)        memcmp(b1, b2, n)
       #define bcopy(src, dst, n)     memmove(dst, src, n)
       #define bzero(b, n)            memset(b, '\0', n)
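To show the effect of these definitions, here is a sketch of a BSD-style routine that compiles unchanged once they are in place. The routine store_and_check is hypothetical, and the #undef lines are an added precaution, assuming the system headers might already supply definitions under the same names.

```c
#include <assert.h>
#include <string.h>

/* Assumption: discard any definitions the system headers already supply,
 * so that the mappings from the text are the ones in effect. */
#undef bcmp
#undef bcopy
#undef bzero

#define bcmp(b1, b2, n)        memcmp(b1, b2, n)
#define bcopy(src, dst, n)     memmove(dst, src, n)
#define bzero(b, n)            memset(b, '\0', n)

/* Hypothetical BSD-style routine: clear buf, copy msg into it, and
 * report whether the copy matches the original (1) or not (0). */
int store_and_check(char *buf, int buflen, const char *msg, int msglen)
{
    assert(buf != NULL && msg != NULL && msglen <= buflen);
    bzero(buf, buflen);          /* expands to memset(buf, '\0', buflen) */
    bcopy(msg, buf, msglen);     /* source first, as in BSD */
    return bcmp(buf, msg, msglen) == 0;
}
```

The routine tests the result of bcmp only against zero. Code that expects a particular non-zero value from bcmp needs attention when it is mapped to memcmp, which may return any negative or positive value.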
