|  | @node String and Array Utilities, Character Set Handling, Character Handling, Top | 
|  | @c %MENU% Utilities for copying and comparing strings and arrays | 
|  | @chapter String and Array Utilities | 
|  |  | 
|  | Operations on strings (null-terminated byte sequences) are an important part of | 
|  | many programs.  @Theglibc{} provides an extensive set of string | 
|  | utility functions, including functions for copying, concatenating, | 
|  | comparing, and searching strings.  Many of these functions can also | 
|  | operate on arbitrary regions of storage; for example, the @code{memcpy} | 
|  | function can be used to copy the contents of any kind of array. | 
|  |  | 
|  | It's fairly common for beginning C programmers to ``reinvent the wheel'' | 
|  | by duplicating this functionality in their own code, but it pays to | 
|  | become familiar with the library functions and to make use of them, | 
|  | since this offers benefits in maintenance, efficiency, and portability. | 
|  |  | 
|  | For instance, you could easily compare one string to another in two | 
|  | lines of C code, but if you use the built-in @code{strcmp} function, | 
|  | you're less likely to make a mistake.  And, since these library | 
|  | functions are typically highly optimized, your program may run faster | 
|  | too. | 
|  |  | 
|  | @menu | 
|  | * Representation of Strings::   Introduction to basic concepts. | 
|  | * String/Array Conventions::    Whether to use a string function or an | 
|  | arbitrary array function. | 
|  | * String Length::               Determining the length of a string. | 
|  | * Copying Strings and Arrays::  Functions to copy strings and arrays. | 
|  | * Concatenating Strings::       Functions to concatenate strings while copying. | 
|  | * Truncating Strings::          Functions to truncate strings while copying. | 
|  | * String/Array Comparison::     Functions for byte-wise and character-wise | 
|  | comparison. | 
|  | * Collation Functions::         Functions for collating strings. | 
|  | * Search Functions::            Searching for a specific element or substring. | 
|  | * Finding Tokens in a String::  Splitting a string into tokens by looking | 
|  | for delimiters. | 
|  | * strfry::                      Function for flash-cooking a string. | 
|  | * Trivial Encryption::          Obscuring data. | 
|  | * Encode Binary Data::          Encoding and Decoding of Binary Data. | 
|  | * Argz and Envz Vectors::       Null-separated string vectors. | 
|  | @end menu | 
|  |  | 
|  | @node Representation of Strings | 
|  | @section Representation of Strings | 
|  | @cindex string, representation of | 
|  |  | 
|  | This section is a quick summary of string concepts for beginning C | 
|  | programmers.  It describes how strings are represented in C | 
|  | and some common pitfalls.  If you are already familiar with this | 
|  | material, you can skip this section. | 
|  |  | 
|  | @cindex string | 
|  | A @dfn{string} is a null-terminated array of bytes of type @code{char}, | 
|  | including the terminating null byte.  String-valued | 
|  | variables are usually declared to be pointers of type @code{char *}. | 
|  | Such variables do not include space for the text of a string; that has | 
|  | to be stored somewhere else---in an array variable, a string constant, | 
|  | or dynamically allocated memory (@pxref{Memory Allocation}).  It's up to | 
|  | you to store the address of the chosen memory space into the pointer | 
|  | variable.  Alternatively you can store a @dfn{null pointer} in the | 
|  | pointer variable.  The null pointer does not point anywhere, so | 
|  | attempting to reference the string it points to gets an error. | 
|  |  | 
|  | @cindex multibyte character | 
|  | @cindex multibyte string | 
|  | @cindex wide string | 
|  | A @dfn{multibyte character} is a sequence of one or more bytes that | 
|  | represents a single character using the locale's encoding scheme; a | 
|  | null byte always represents the null character.  A @dfn{multibyte | 
|  | string} is a string that consists entirely of multibyte | 
|  | characters.  In contrast, a @dfn{wide string} is a null-terminated | 
|  | sequence of @code{wchar_t} objects.  A wide-string variable is usually | 
|  | declared to be a pointer of type @code{wchar_t *}, by analogy with | 
|  | string variables and @code{char *}.  @xref{Extended Char Intro}. | 
|  |  | 
|  | @cindex null byte | 
|  | @cindex null wide character | 
|  | By convention, the @dfn{null byte}, @code{'\0'}, | 
|  | marks the end of a string and the @dfn{null wide character}, | 
|  | @code{L'\0'}, marks the end of a wide string.  For example, in | 
|  | testing to see whether the @code{char *} variable @var{p} points to a | 
|  | null byte marking the end of a string, you can write | 
|  | @code{!*@var{p}} or @code{*@var{p} == '\0'}. | 
|  |  | 
|  | A null byte is quite different conceptually from a null pointer, | 
|  | although both are represented by the integer constant @code{0}. | 
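
The difference can be sketched in C as follows.  The helper name
@code{classify_string} is hypothetical and not part of the library; it
only illustrates that a pointer has to be compared against the null
pointer before it is dereferenced to look for a null byte.

```c
#include <stddef.h>

/* Hypothetical helper: classify P as a null pointer (0), a pointer to
   an empty string (1), or a pointer to a non-empty string (2).  */
int
classify_string (const char *p)
{
  if (p == NULL)        /* Null pointer: there is no string at all.  */
    return 0;
  if (*p == '\0')       /* Null byte first: a string of length zero.  */
    return 1;
  return 2;
}
```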
|  |  | 
|  | @cindex string literal | 
|  | A @dfn{string literal} appears in C program source as a multibyte | 
|  | string between double-quote characters (@samp{"}).  If the | 
|  | initial double-quote character is immediately preceded by a capital | 
|  | @samp{L} (ell) character (as in @code{L"foo"}), it is a wide string | 
|  | literal.  String literals can also contribute to @dfn{string | 
|  | concatenation}: @code{"a" "b"} is the same as @code{"ab"}. | 
|  | For wide strings one can use either | 
|  | @code{L"a" L"b"} or @code{L"a" "b"}.  Modification of string literals is | 
|  | not allowed by the GNU C compiler, because literals are placed in | 
|  | read-only storage. | 
|  |  | 
|  | Arrays that are declared @code{const} cannot be modified | 
|  | either.  It's generally good style to declare non-modifiable string | 
|  | pointers to be of type @code{const char *}, since this often allows the | 
|  | C compiler to detect accidental modifications as well as providing some | 
|  | amount of documentation about what your program intends to do with the | 
|  | string. | 
|  |  | 
|  | The amount of memory allocated for a byte array may extend past the null byte | 
|  | that marks the end of the string that the array contains.  In this | 
|  | document, the term @dfn{allocated size} is always used to refer to the | 
|  | total amount of memory allocated for an array, while the term | 
|  | @dfn{length} refers to the number of bytes up to (but not including) | 
|  | the terminating null byte.  Wide strings are similar, except their | 
|  | sizes and lengths count wide characters, not bytes. | 
|  | @cindex length of string | 
|  | @cindex allocation size of string | 
|  | @cindex size of string | 
|  | @cindex string length | 
|  | @cindex string allocation | 
|  |  | 
|  | A notorious source of program bugs is trying to put more bytes into a | 
|  | string than fit in its allocated size.  When writing code that extends | 
|  | strings or moves bytes into a pre-allocated array, you should be | 
|  | very careful to keep track of the length of the text and make explicit | 
|  | checks for overflowing the array.  Many of the library functions | 
|  | @emph{do not} do this for you!  Remember also that you need to allocate | 
|  | an extra byte to hold the null byte that marks the end of the | 
|  | string. | 
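
As a minimal sketch of such an explicit check, the following
hypothetical helper (@code{append_checked} is not a library function)
refuses to append when the result, including its terminating null byte,
would not fit into the allocated size.  It assumes that the buffer
already holds a null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append EXTRA to the string in BUF, whose
   allocated size is BUFSIZE bytes.  Return 0 on success, or -1
   (leaving BUF unchanged) if the result would not fit.  */
int
append_checked (char *buf, size_t bufsize, const char *extra)
{
  size_t used = strlen (buf);
  size_t add = strlen (extra);

  /* One extra byte is needed for the terminating null byte.  */
  if (used >= bufsize || add >= bufsize - used)
    return -1;

  memcpy (buf + used, extra, add + 1);   /* Copies the null byte too.  */
  return 0;
}
```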
|  |  | 
|  | @cindex single-byte string | 
|  | @cindex multibyte string | 
|  | Originally strings were sequences of bytes where each byte represented a | 
|  | single character.  This is still true today if the strings are encoded | 
|  | using a single-byte character encoding.  Things are different if the | 
|  | strings are encoded using a multibyte encoding (for more information on | 
|  | encodings see @ref{Extended Char Intro}).  There is no difference in | 
|  | the programming interface for these two kinds of strings; the programmer | 
|  | has to be aware of this and interpret the byte sequences accordingly. | 
|  |  | 
|  | But since there is no separate interface taking care of these | 
|  | differences, the byte-based string functions are sometimes hard to use. | 
|  | Since the count parameters of these functions specify bytes, a call to | 
|  | @code{memcpy} could cut a multibyte character in the middle and put an | 
|  | incomplete (and therefore unusable) byte sequence in the target buffer. | 
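
To make the problem concrete, here is a sketch that works under one
loudly stated assumption: the text is well-formed UTF-8.  The helper
@code{utf8_prefix_length} is hypothetical and not part of the library;
for other encodings the conversion functions would be needed instead.
It shrinks a byte count until the cut no longer falls inside a
multibyte character.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, valid only for well-formed UTF-8 text: return
   the largest byte count not exceeding MAX_BYTES at which the string
   S can be cut without splitting a multibyte character.  */
size_t
utf8_prefix_length (const char *s, size_t max_bytes)
{
  size_t n = strlen (s);
  if (n > max_bytes)
    n = max_bytes;

  /* In UTF-8, continuation bytes have the bit pattern 10xxxxxx.  Back
     up while the byte just after the cut is such a byte.  */
  while (n > 0 && ((unsigned char) s[n] & 0xC0) == 0x80)
    n--;
  return n;
}
```

With the three-byte string consisting of @samp{a} followed by a
two-byte character, a request for two bytes is reduced to one, so a
following @code{memcpy} of that many bytes copies only whole characters.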
|  |  | 
|  | @cindex wide string | 
|  | To avoid these problems, later versions of the @w{ISO C} standard | 
|  | introduce a second set of functions that operate on @dfn{wide | 
|  | characters} (@pxref{Extended Char Intro}).  These functions don't have | 
|  | the problems the single-byte versions have since every wide character is | 
|  | a legal, interpretable value.  This does not mean that cutting wide | 
|  | strings at arbitrary points is without problems.  It normally | 
|  | is for alphabet-based languages (except for non-normalized text) but | 
|  | languages based on syllables still have the problem that more than one | 
|  | wide character is necessary to complete a logical unit.  This is a | 
|  | higher level problem which the @w{C library} functions are not designed | 
|  | to solve.  But it is at least good that no invalid byte sequences can be | 
|  | created.  Also, higher level functions can operate much more easily | 
|  | on wide characters than on multibyte characters, so a common strategy | 
|  | is to use wide characters internally whenever text is more than simply | 
|  | copied. | 
|  |  | 
|  | The remainder of this chapter discusses the functions for handling | 
|  | wide strings in parallel with the discussion of | 
|  | strings, since there is almost always an exact equivalent | 
|  | available. | 
|  |  | 
|  | @node String/Array Conventions | 
|  | @section String and Array Conventions | 
|  |  | 
|  | This chapter describes both functions that work on arbitrary arrays or | 
|  | blocks of memory, and functions that are specific to strings and wide | 
|  | strings. | 
|  |  | 
|  | Functions that operate on arbitrary blocks of memory have names | 
|  | beginning with @samp{mem} and @samp{wmem} (such as @code{memcpy} and | 
|  | @code{wmemcpy}) and invariably take an argument which specifies the size | 
|  | (in bytes and wide characters respectively) of the block of memory to | 
|  | operate on.  The array arguments and return values for these functions | 
|  | have type @code{void *} or @code{wchar_t *}.  As a matter of style, the | 
|  | elements of the arrays used with the @samp{mem} functions are referred | 
|  | to as ``bytes''.  You can pass any kind of pointer to these functions, | 
|  | and the @code{sizeof} operator is useful in computing the value for the | 
|  | size argument.  Parameters to the @samp{wmem} functions must be of type | 
|  | @code{wchar_t *}.  These functions are not really usable with anything | 
|  | but arrays of this type. | 
|  |  | 
|  | In contrast, functions that operate specifically on strings and wide | 
|  | strings have names beginning with @samp{str} and @samp{wcs} | 
|  | respectively (such as @code{strcpy} and @code{wcscpy}) and look for a | 
|  | terminating null byte or null wide character instead of requiring an explicit | 
|  | size argument to be passed.  (Some of these functions accept a specified | 
|  | maximum length, but they also check for premature termination.) | 
|  | The array arguments and return values for these | 
|  | functions have type @code{char *} and @code{wchar_t *} respectively, and | 
|  | the array elements are referred to as ``bytes'' and ``wide | 
|  | characters''. | 
|  |  | 
|  | In many cases, there are both @samp{mem} and @samp{str}/@samp{wcs} | 
|  | versions of a function.  The one that is more appropriate to use depends | 
|  | on the exact situation.  When your program is manipulating arbitrary | 
|  | arrays or blocks of storage, then you should always use the @samp{mem} | 
|  | functions.  On the other hand, when you are manipulating | 
|  | strings it is usually more convenient to use the @samp{str}/@samp{wcs} | 
|  | functions, unless you already know the length of the string in advance. | 
|  | The @samp{wmem} functions should be used for wide character arrays with | 
|  | known size. | 
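
The following sketch illustrates the last point for strings whose
lengths have already been computed.  The helper @code{join_strings} is
hypothetical; it measures each string once and then uses @code{memcpy},
which does not have to search for the terminating null bytes again.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated string holding A
   followed by B, or a null pointer if allocation fails.  */
char *
join_strings (const char *a, const char *b)
{
  size_t alen = strlen (a);
  size_t blen = strlen (b);
  char *result = malloc (alen + blen + 1);

  if (result != NULL)
    {
      /* The lengths are already known, so the mem functions can be
         used; the second call also copies the terminating null byte.  */
      memcpy (result, a, alen);
      memcpy (result + alen, b, blen + 1);
    }
  return result;
}
```

The caller is responsible for passing the result to @code{free}.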
|  |  | 
|  | @cindex wint_t | 
|  | @cindex parameter promotion | 
|  | Some of the memory and string functions take single characters as | 
|  | arguments.  Since a value of type @code{char} is automatically promoted | 
|  | into a value of type @code{int} when used as a parameter, the functions | 
|  | are declared with @code{int} as the type of the parameter in question. | 
|  | In case of the wide character functions the situation is similar: the | 
|  | parameter type for a single wide character is @code{wint_t} and not | 
|  | @code{wchar_t}.  This would not be necessary for many implementations, | 
|  | since @code{wchar_t} is large enough not to be automatically | 
|  | promoted, but since the @w{ISO C} standard does not require such a | 
|  | choice of types, the @code{wint_t} type is used. | 
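
As a small illustration, @code{strchr} declares its second parameter as
@code{int}, yet a @code{char} value can be passed directly.  The helper
@code{count_char} below is hypothetical and only serves as an example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how often the byte C occurs in the
   string S.  The char value C is promoted to int when it is passed
   to strchr.  */
size_t
count_char (const char *s, char c)
{
  size_t n = 0;

  if (c == '\0')        /* strchr would match the terminator itself.  */
    return 0;
  for (const char *p = strchr (s, c); p != NULL; p = strchr (p + 1, c))
    n++;
  return n;
}
```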
|  |  | 
|  | @node String Length | 
|  | @section String Length | 
|  |  | 
|  | You can get the length of a string using the @code{strlen} function. | 
|  | This function is declared in the header file @file{string.h}. | 
|  | @pindex string.h | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun size_t strlen (const char *@var{s}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strlen} function returns the length of the | 
|  | string @var{s} in bytes.  (In other words, it returns the offset of the | 
|  | terminating null byte within the array.) | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strlen ("hello, world") | 
|  | @result{} 12 | 
|  | @end smallexample | 
|  |  | 
|  | When applied to an array, the @code{strlen} function returns | 
|  | the length of the string stored there, not its allocated size.  You can | 
|  | get the allocated size of the array that holds a string using | 
|  | the @code{sizeof} operator: | 
|  |  | 
|  | @smallexample | 
|  | char string[32] = "hello, world"; | 
|  | sizeof (string) | 
|  | @result{} 32 | 
|  | strlen (string) | 
|  | @result{} 12 | 
|  | @end smallexample | 
|  |  | 
|  | But beware, this will not work unless @var{string} is the | 
|  | array itself, not a pointer to it.  For example: | 
|  |  | 
|  | @smallexample | 
|  | char string[32] = "hello, world"; | 
|  | char *ptr = string; | 
|  | sizeof (string) | 
|  | @result{} 32 | 
|  | sizeof (ptr) | 
|  | @result{} 4  /* @r{(on a machine with 4 byte pointers)} */ | 
|  | @end smallexample | 
|  |  | 
|  | This is an easy mistake to make when you are working with functions that | 
|  | take string arguments; those arguments are always pointers, not arrays. | 
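
The pitfall, and one common way around it, can be sketched like this.
Both function names are hypothetical.  The second function assumes that
the caller passes the allocated size explicitly and that the buffer
holds a null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Inside a function, S is a pointer, so sizeof reports the size of
   the pointer, not the size of the caller's array.  */
size_t
size_seen_by_callee (const char *s)
{
  return sizeof (s);
}

/* Hypothetical helper: the caller supplies the allocated size BUFSIZE,
   so the function can compute how many more bytes of text fit into
   BUF while still leaving room for the terminating null byte.  */
size_t
space_left (const char *buf, size_t bufsize)
{
  return bufsize - strlen (buf) - 1;
}
```

A caller that owns the array would write
@code{space_left (string, sizeof (string))}.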
|  |  | 
|  | Note also that for multibyte-encoded strings the return | 
|  | value does not necessarily correspond to the number of characters in the | 
|  | string.  To get this value, you can convert the string to wide | 
|  | characters and use @code{wcslen}, or use something like the following | 
|  | code: | 
|  |  | 
|  | @smallexample | 
|  | /* @r{The input is in @code{string}.} | 
|  |    @r{The length is expected in @code{n}.}  */ | 
|  | @{ | 
|  |   mbstate_t t; | 
|  |   const char *scopy = string; | 
|  |   /* In initial state.  */ | 
|  |   memset (&t, '\0', sizeof (t)); | 
|  |   /* Determine number of characters.  */ | 
|  |   n = mbsrtowcs (NULL, &scopy, strlen (scopy), &t); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This is cumbersome to do, so if the number of characters (as opposed to | 
|  | bytes) is needed often, it is better to work with wide characters. | 
|  | @end deftypefun | 
|  |  | 
|  | The wide character equivalent is declared in @file{wchar.h}. | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun size_t wcslen (const wchar_t *@var{ws}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcslen} function is the wide character equivalent to | 
|  | @code{strlen}.  The return value is the number of wide characters in the | 
|  | wide string pointed to by @var{ws} (this is also the offset of | 
|  | the terminating null wide character of @var{ws}). | 
|  |  | 
|  | Since no sequence of several wide characters is needed to make up one | 
|  | wide character, the return value is not only the offset in the array, | 
|  | it is also the number of wide characters. | 
|  |  | 
|  | This function was introduced in @w{Amendment 1} to @w{ISO C90}. | 
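
Because @code{wcslen} counts wide characters rather than bytes, the
result has to be scaled by @code{sizeof (wchar_t)} when storage is
computed.  A minimal sketch, using the hypothetical helper name
@code{wide_string_bytes}:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: return the number of bytes needed to store the
   wide string WS, including its terminating null wide character.  */
size_t
wide_string_bytes (const wchar_t *ws)
{
  return (wcslen (ws) + 1) * sizeof (wchar_t);
}
```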
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun size_t strnlen (const char *@var{s}, size_t @var{maxlen}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | If the array @var{s} of size @var{maxlen} contains a null byte, | 
|  | the @code{strnlen} function returns the length of the string @var{s} in | 
|  | bytes.  Otherwise it | 
|  | returns @var{maxlen}.  Therefore this function is equivalent to | 
|  | @code{(strlen (@var{s}) < @var{maxlen} ? strlen (@var{s}) : @var{maxlen})} | 
|  | but it | 
|  | is more efficient and works even if @var{s} is not null-terminated so | 
|  | long as @var{maxlen} does not exceed the size of @var{s}'s array. | 
|  |  | 
|  | @smallexample | 
|  | char string[32] = "hello, world"; | 
|  | strnlen (string, 32) | 
|  | @result{} 12 | 
|  | strnlen (string, 5) | 
|  | @result{} 5 | 
|  | @end smallexample | 
|  |  | 
|  | This function is a GNU extension and is declared in @file{string.h}. | 
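
One situation where this matters is a fixed-width field that is padded
with null bytes but need not be null-terminated.  The sketch below uses
a hypothetical helper, @code{copy_field}, and assumes the destination
has room for the whole field plus one byte.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* Request the declaration of strnlen.  */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: FIELD is a fixed-width record field of
   FIELD_SIZE bytes that is null-padded but not necessarily
   null-terminated.  Copy its text into DST, which must have room for
   FIELD_SIZE + 1 bytes, and return the length of the text.  */
size_t
copy_field (char *dst, const char *field, size_t field_size)
{
  /* strnlen never examines more than FIELD_SIZE bytes.  */
  size_t len = strnlen (field, field_size);

  memcpy (dst, field, len);
  dst[len] = '\0';
  return len;
}
```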
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun size_t wcsnlen (const wchar_t *@var{ws}, size_t @var{maxlen}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{wcsnlen} is the wide character equivalent to @code{strnlen}.  The | 
|  | @var{maxlen} parameter specifies the maximum number of wide characters. | 
|  |  | 
|  | This function is a GNU extension and is declared in @file{wchar.h}. | 
|  | @end deftypefun | 
|  |  | 
|  | @node Copying Strings and Arrays | 
|  | @section Copying Strings and Arrays | 
|  |  | 
|  | You can use the functions described in this section to copy the contents | 
|  | of strings, wide strings, and arrays.  The @samp{str} and @samp{mem} | 
|  | functions are declared in @file{string.h} while the @samp{w} functions | 
|  | are declared in @file{wchar.h}. | 
|  | @pindex string.h | 
|  | @pindex wchar.h | 
|  | @cindex copying strings and arrays | 
|  | @cindex string copy functions | 
|  | @cindex array copy functions | 
|  | @cindex concatenating strings | 
|  | @cindex string concatenation functions | 
|  |  | 
|  | A helpful way to remember the ordering of the arguments to the functions | 
|  | in this section is that it corresponds to an assignment expression, with | 
|  | the destination array specified to the left of the source array.  Most | 
|  | of these functions return the address of the destination array; a few | 
|  | return the address of the destination's terminating null, or of just | 
|  | past the destination. | 
|  |  | 
|  | Most of these functions do not work properly if the source and | 
|  | destination arrays overlap.  For example, if the beginning of the | 
|  | destination array overlaps the end of the source array, the original | 
|  | contents of that part of the source array may get overwritten before it | 
|  | is copied.  Even worse, in the case of the string functions, the null | 
|  | byte marking the end of the string may be lost, and the copy | 
|  | function might get stuck in a loop trashing all the memory allocated to | 
|  | your program. | 
|  |  | 
|  | All functions that have problems copying between overlapping arrays are | 
|  | explicitly identified in this manual.  In addition to functions in this | 
|  | section, there are a few others like @code{sprintf} (@pxref{Formatted | 
|  | Output Functions}) and @code{scanf} (@pxref{Formatted Input | 
|  | Functions}). | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {void *} memcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{memcpy} function copies @var{size} bytes from the object | 
|  | beginning at @var{from} into the object beginning at @var{to}.  The | 
|  | behavior of this function is undefined if the two arrays @var{to} and | 
|  | @var{from} overlap; use @code{memmove} instead if overlapping is possible. | 
|  |  | 
|  | The value returned by @code{memcpy} is the value of @var{to}. | 
|  |  | 
|  | Here is an example of how you might use @code{memcpy} to copy the | 
|  | contents of an array: | 
|  |  | 
|  | @smallexample | 
|  | struct foo *oldarray, *newarray; | 
|  | int arraysize; | 
|  | @dots{} | 
|  | memcpy (newarray, oldarray, arraysize * sizeof (struct foo)); | 
|  | @end smallexample | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wmemcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wmemcpy} function copies @var{size} wide characters from the object | 
|  | beginning at @var{wfrom} into the object beginning at @var{wto}.  The | 
|  | behavior of this function is undefined if the two arrays @var{wto} and | 
|  | @var{wfrom} overlap; use @code{wmemmove} instead if overlapping is possible. | 
|  |  | 
|  | The following is a possible implementation of @code{wmemcpy} but there | 
|  | are more optimizations possible. | 
|  |  | 
|  | @smallexample | 
|  | wchar_t * | 
|  | wmemcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom, | 
|  |          size_t size) | 
|  | @{ | 
|  |   return (wchar_t *) memcpy (wto, wfrom, size * sizeof (wchar_t)); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | The value returned by @code{wmemcpy} is the value of @var{wto}. | 
|  |  | 
|  | This function was introduced in @w{Amendment 1} to @w{ISO C90}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {void *} mempcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{mempcpy} function is nearly identical to the @code{memcpy} | 
|  | function.  It copies @var{size} bytes from the object beginning at | 
|  | @var{from} into the object pointed to by @var{to}.  But instead of | 
|  | returning the value of @var{to} it returns a pointer to the byte | 
|  | following the last written byte in the object beginning at @var{to}. | 
|  | I.e., the value is @code{((void *) ((char *) @var{to} + @var{size}))}. | 
|  |  | 
|  | This function is useful in situations where a number of objects shall be | 
|  | copied to consecutive memory positions. | 
|  |  | 
|  | @smallexample | 
|  | void * | 
|  | combine (void *o1, size_t s1, void *o2, size_t s2) | 
|  | @{ | 
|  |   void *result = malloc (s1 + s2); | 
|  |   if (result != NULL) | 
|  |     mempcpy (mempcpy (result, o1, s1), o2, s2); | 
|  |   return result; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun {wchar_t *} wmempcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wmempcpy} function is nearly identical to the @code{wmemcpy} | 
|  | function.  It copies @var{size} wide characters from the object | 
|  | beginning at @var{wfrom} into the object pointed to by @var{wto}.  But | 
|  | instead of returning the value of @var{wto} it returns a pointer to the | 
|  | wide character following the last written wide character in the object | 
|  | beginning at @var{wto}.  I.e., the value is @code{@var{wto} + @var{size}}. | 
|  |  | 
|  | This function is useful in situations where a number of objects shall be | 
|  | copied to consecutive memory positions. | 
|  |  | 
|  | The following is a possible implementation of @code{wmempcpy} but there | 
|  | are more optimizations possible. | 
|  |  | 
|  | @smallexample | 
|  | wchar_t * | 
|  | wmempcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom, | 
|  |           size_t size) | 
|  | @{ | 
|  |   return (wchar_t *) mempcpy (wto, wfrom, size * sizeof (wchar_t)); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {void *} memmove (void *@var{to}, const void *@var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{memmove} copies the @var{size} bytes at @var{from} into the | 
|  | @var{size} bytes at @var{to}, even if those two blocks of space | 
|  | overlap.  In the case of overlap, @code{memmove} is careful to copy the | 
|  | original values of the bytes in the block at @var{from}, including those | 
|  | bytes which also belong to the block at @var{to}. | 
|  |  | 
|  | The value returned by @code{memmove} is the value of @var{to}. | 
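
A typical case of overlapping blocks is shifting the contents of a
buffer within itself.  The following sketch uses a hypothetical helper,
@code{drop_prefix}, to remove leading bytes from a string in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove the first N bytes of the string S in
   place (or all of them if S is shorter) and return S.  */
char *
drop_prefix (char *s, size_t n)
{
  size_t len = strlen (s);

  if (n > len)
    n = len;
  /* Source and destination overlap, so memcpy must not be used here.
     The count includes the terminating null byte.  */
  return memmove (s, s + n, len - n + 1);
}
```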
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wmemmove (wchar_t *@var{wto}, const wchar_t *@var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{wmemmove} copies the @var{size} wide characters at @var{wfrom} | 
|  | into the @var{size} wide characters at @var{wto}, even if those two | 
|  | blocks of space overlap.  In the case of overlap, @code{wmemmove} is | 
|  | careful to copy the original values of the wide characters in the block | 
|  | at @var{wfrom}, including those wide characters which also belong to the | 
|  | block at @var{wto}. | 
|  |  | 
|  | The following is a possible implementation of @code{wmemmove} but there | 
|  | are more optimizations possible. | 
|  |  | 
|  | @smallexample | 
|  | wchar_t * | 
|  | wmemmove (wchar_t *wto, const wchar_t *wfrom, size_t size) | 
|  | @{ | 
|  |   return (wchar_t *) memmove (wto, wfrom, size * sizeof (wchar_t)); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | The value returned by @code{wmemmove} is the value of @var{wto}. | 
|  |  | 
|  | This function was introduced in @w{Amendment 1} to @w{ISO C90}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment SVID | 
|  | @deftypefun {void *} memccpy (void *restrict @var{to}, const void *restrict @var{from}, int @var{c}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function copies no more than @var{size} bytes from @var{from} to | 
|  | @var{to}, stopping if a byte matching @var{c} is found.  The return | 
|  | value is a pointer into @var{to} one byte past where @var{c} was copied, | 
|  | or a null pointer if no byte matching @var{c} appeared in the first | 
|  | @var{size} bytes of @var{from}. | 
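
As a sketch of how the return value can be used, the hypothetical helper
@code{copy_through_newline} below copies one line, including its newline
byte, and reports how many bytes that was.  Note that when no newline is
found, @code{memccpy} has still copied @var{size} bytes.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* Request the declaration of memccpy.  */
#endif
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy bytes from FROM to TO up to and including
   the first newline, examining at most SIZE bytes.  Return the number
   of bytes copied if a newline was found, or 0 if none appeared in
   the first SIZE bytes (in which case SIZE bytes were still copied).  */
size_t
copy_through_newline (char *to, const char *from, size_t size)
{
  char *end = memccpy (to, from, '\n', size);

  return end != NULL ? (size_t) (end - to) : 0;
}
```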
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {void *} memset (void *@var{block}, int @var{c}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function copies the value of @var{c} (converted to an | 
|  | @code{unsigned char}) into each of the first @var{size} bytes of the | 
|  | object beginning at @var{block}.  It returns the value of @var{block}. | 
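
A common use is clearing an object, with @code{sizeof} supplying the
size argument.  The structure and function in this sketch are
hypothetical.  Keep in mind that @code{memset} stores bytes: all-zero
bytes represent zero for integer and character members, but @w{ISO C}
does not promise this for pointers or floating-point members.

```c
#include <string.h>

/* Hypothetical record type used only for this example.  */
struct record
{
  int id;
  char name[16];
};

/* Hypothetical helper: set every byte of *R to zero.  */
void
clear_record (struct record *r)
{
  memset (r, 0, sizeof (*r));
}
```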
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wmemset (wchar_t *@var{block}, wchar_t @var{wc}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function copies the value of @var{wc} into each of the first | 
|  | @var{size} wide characters of the object beginning at @var{block}.  It | 
|  | returns the value of @var{block}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strcpy (char *restrict @var{to}, const char *restrict @var{from}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This copies bytes from the string @var{from} (up to and including | 
|  | the terminating null byte) into the string @var{to}.  Like | 
|  | @code{memcpy}, this function has undefined results if the strings | 
|  | overlap.  The return value is the value of @var{to}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcscpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This copies wide characters from the wide string @var{wfrom} (up to and | 
|  | including the terminating null wide character) into the string | 
|  | @var{wto}.  Like @code{wmemcpy}, this function has undefined results if | 
|  | the strings overlap.  The return value is the value of @var{wto}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment SVID | 
|  | @deftypefun {char *} strdup (const char *@var{s}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | This function copies the string @var{s} into a newly | 
|  | allocated string.  The string is allocated using @code{malloc}; see | 
|  | @ref{Unconstrained Allocation}.  If @code{malloc} cannot allocate space | 
|  | for the new string, @code{strdup} returns a null pointer.  Otherwise it | 
|  | returns a pointer to the new string. | 
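
Since the copy is allocated with @code{malloc}, it can be modified
freely and must eventually be passed to @code{free}.  The following
sketch uses a hypothetical helper, @code{upcase_copy}, to show both the
null-pointer check and the ownership of the result.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1   /* Request the declaration of strdup.  */
#endif
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated upper-case copy of S,
   or a null pointer if the allocation fails.  The caller frees it.  */
char *
upcase_copy (const char *s)
{
  char *copy = strdup (s);

  if (copy == NULL)
    return NULL;
  for (char *p = copy; *p != '\0'; p++)
    *p = (char) toupper ((unsigned char) *p);
  return copy;
}
```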
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun {wchar_t *} wcsdup (const wchar_t *@var{ws}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | This function copies the wide string @var{ws} | 
|  | into a newly allocated string.  The string is allocated using | 
|  | @code{malloc}; see @ref{Unconstrained Allocation}.  If @code{malloc} | 
|  | cannot allocate space for the new string, @code{wcsdup} returns a null | 
|  | pointer.  Otherwise it returns a pointer to the new wide string. | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment Unknown origin | 
|  | @deftypefun {char *} stpcpy (char *restrict @var{to}, const char *restrict @var{from}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is like @code{strcpy}, except that it returns a pointer to | 
|  | the end of the string @var{to} (that is, the address of the terminating | 
|  | null byte @code{to + strlen (from)}) rather than the beginning. | 
|  |  | 
|  | For example, this program uses @code{stpcpy} to concatenate @samp{foo} | 
|  | and @samp{bar} to produce @samp{foobar}, which it then prints. | 
|  |  | 
|  | @smallexample | 
|  | @include stpcpy.c.texi | 
|  | @end smallexample | 
|  |  | 
|  | This function is not part of the ISO or POSIX standards, and is not | 
|  | customary on Unix systems, but we did not invent it either.  Perhaps it | 
|  | comes from MS-DOG. | 
|  |  | 
|  | Its behavior is undefined if the strings overlap.  The function is | 
|  | declared in @file{string.h}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun {wchar_t *} wcpcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is like @code{wcscpy}, except that it returns a pointer to | 
|  | the end of the string @var{wto} (that is, the address of the terminating | 
|  | null wide character @code{wto + wcslen (wfrom)}) rather than the beginning. | 
|  |  | 
|  | This function is not part of ISO or POSIX but was found useful while | 
|  | developing @theglibc{} itself. | 
|  |  | 
|  | The behavior of @code{wcpcpy} is undefined if the strings overlap. | 
|  |  | 
|  | @code{wcpcpy} is a GNU extension and is declared in @file{wchar.h}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefn {Macro} {char *} strdupa (const char *@var{s}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This macro is similar to @code{strdup} but allocates the new string | 
|  | using @code{alloca} instead of @code{malloc} (@pxref{Variable Size | 
|  | Automatic}).  This means of course the returned string has the same | 
|  | limitations as any block of memory allocated using @code{alloca}. | 
|  |  | 
|  | Because the storage comes from the caller's stack frame, @code{strdupa} | 
|  | is implemented only as a macro; you cannot take its address.  Despite | 
|  | this limitation it is useful.  The following code shows a situation where | 
|  | using @code{malloc} would be a lot more expensive. | 
|  |  | 
|  | @smallexample | 
|  | @include strdupa.c.texi | 
|  | @end smallexample | 
|  |  | 
|  | Please note that calling @code{strtok} using @var{path} directly is | 
|  | invalid.  It is also not allowed to call @code{strdupa} in the argument | 
|  | list of @code{strtok}, since @code{strdupa} uses @code{alloca} | 
|  | (@pxref{Variable Size Automatic}), which can interfere with the | 
|  | parameter passing. | 
|  |  | 
|  | This function is only available if GNU CC is used. | 
|  | @end deftypefn | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is a partially obsolete alternative for @code{memmove}, derived from | 
|  | BSD.  Note that it is not quite equivalent to @code{memmove}, because the | 
|  | arguments are not in the same order and there is no return value. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun void bzero (void *@var{block}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is a partially obsolete alternative for @code{memset}, derived from | 
|  | BSD.  Note that it is not as general as @code{memset}, because the only | 
|  | value it can store is zero. | 
|  | @end deftypefun | 
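
New code can use the ISO C functions directly.  The wrappers below are a sketch with hypothetical names (@code{copy_block}, @code{clear_block}); they only illustrate how the arguments of @code{bcopy} and @code{bzero} map onto @code{memmove} and @code{memset}.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical wrapper: same effect as bcopy (from, to, size).
   Note that memmove takes the destination first.  */
void
copy_block (void *to, const void *from, size_t size)
{
  assert (to != NULL && from != NULL);
  memmove (to, from, size);
}

/* Hypothetical wrapper: same effect as bzero (block, size).  */
void
clear_block (void *block, size_t size)
{
  assert (block != NULL);
  memset (block, 0, size);
}
```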
|  |  | 
|  | @node Concatenating Strings | 
|  | @section Concatenating Strings | 
|  | @pindex string.h | 
|  | @pindex wchar.h | 
|  | @cindex concatenating strings | 
|  | @cindex string concatenation functions | 
|  |  | 
|  | The functions described in this section concatenate the contents of a | 
|  | string or wide string to another.  They follow the string-copying | 
|  | functions in their conventions.  @xref{Copying Strings and Arrays}. | 
|  | @samp{strcat} is declared in the header file @file{string.h} while | 
|  | @samp{wcscat} is declared in @file{wchar.h}. | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strcat (char *restrict @var{to}, const char *restrict @var{from}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strcat} function is similar to @code{strcpy}, except that the | 
|  | bytes from @var{from} are concatenated or appended to the end of | 
|  | @var{to}, instead of overwriting it.  That is, the first byte from | 
|  | @var{from} overwrites the null byte marking the end of @var{to}. | 
|  |  | 
|  | An equivalent definition for @code{strcat} would be: | 
|  |  | 
|  | @smallexample | 
|  | char * | 
|  | strcat (char *restrict to, const char *restrict from) | 
|  | @{ | 
|  |   strcpy (to + strlen (to), from); | 
|  |   return to; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This function has undefined results if the strings overlap. | 
|  |  | 
|  | As noted below, this function has significant performance issues. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcscat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcscat} function is similar to @code{wcscpy}, except that the | 
|  | wide characters from @var{wfrom} are concatenated or appended to the end of | 
|  | @var{wto}, instead of overwriting it.  That is, the first wide character from | 
|  | @var{wfrom} overwrites the null wide character marking the end of @var{wto}. | 
|  |  | 
|  | An equivalent definition for @code{wcscat} would be: | 
|  |  | 
|  | @smallexample | 
|  | wchar_t * | 
|  | wcscat (wchar_t *wto, const wchar_t *wfrom) | 
|  | @{ | 
|  |   wcscpy (wto + wcslen (wto), wfrom); | 
|  |   return wto; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This function has undefined results if the strings overlap. | 
|  |  | 
|  | As noted below, this function has significant performance issues. | 
|  | @end deftypefun | 
|  |  | 
|  | Programmers using the @code{strcat} or @code{wcscat} functions (or the | 
|  | @code{strncat} or @code{wcsncat} functions defined in | 
|  | a later section, for that matter) | 
|  | can easily be recognized as lazy and reckless.  In almost all situations | 
|  | the lengths of the participating strings are known (they had better be, | 
|  | since how else can one ensure that the allocated size of the buffer is | 
|  | sufficient?).  Or at least, one could know them if one kept track of the | 
|  | results of the various function calls.  But then it is very inefficient | 
|  | to use @code{strcat}/@code{wcscat}.  A lot of time is wasted finding the | 
|  | end of the destination string so that the actual copying can start. | 
|  | This is a common example: | 
|  |  | 
|  | @cindex va_copy | 
|  | @smallexample | 
|  | /* @r{This function concatenates arbitrarily many strings.  The last} | 
|  | @r{parameter must be @code{NULL}.}  */ | 
|  | char * | 
|  | concat (const char *str, @dots{}) | 
|  | @{ | 
|  |   va_list ap, ap2; | 
|  |   size_t total = 1; | 
|  |   const char *s; | 
|  |   char *result; | 
|  |  | 
|  |   va_start (ap, str); | 
|  |   va_copy (ap2, ap); | 
|  |  | 
|  |   /* @r{Determine how much space we need.}  */ | 
|  |   for (s = str; s != NULL; s = va_arg (ap, const char *)) | 
|  |     total += strlen (s); | 
|  |  | 
|  |   va_end (ap); | 
|  |  | 
|  |   result = (char *) malloc (total); | 
|  |   if (result != NULL) | 
|  |     @{ | 
|  |       result[0] = '\0'; | 
|  |  | 
|  |       /* @r{Copy the strings.}  */ | 
|  |       for (s = str; s != NULL; s = va_arg (ap2, const char *)) | 
|  |         strcat (result, s); | 
|  |     @} | 
|  |  | 
|  |   va_end (ap2); | 
|  |  | 
|  |   return result; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | This looks quite simple, especially the second loop where the strings | 
|  | are actually copied.  But these innocent lines hide a major performance | 
|  | penalty.  Just imagine that ten strings of 100 bytes each have to be | 
|  | concatenated.  For the second string we search the already stored 100 | 
|  | bytes for the end of the string so that we can append the next string; | 
|  | for the third string we search 200 bytes, and so on, up to 900 bytes | 
|  | for the tenth.  For all strings in total the comparisons necessary to | 
|  | find the end of the intermediate results sum up to about 4500!  If we | 
|  | combine the copying with the length computation needed for the | 
|  | allocation, we can write this function more efficiently: | 
|  |  | 
|  | @smallexample | 
|  | char * | 
|  | concat (const char *str, @dots{}) | 
|  | @{ | 
|  |   va_list ap; | 
|  |   size_t allocated = 100; | 
|  |   char *result = (char *) malloc (allocated); | 
|  |  | 
|  |   if (result != NULL) | 
|  |     @{ | 
|  |       char *newp; | 
|  |       char *wp; | 
|  |       const char *s; | 
|  |  | 
|  |       va_start (ap, str); | 
|  |  | 
|  |       wp = result; | 
|  |       for (s = str; s != NULL; s = va_arg (ap, const char *)) | 
|  |         @{ | 
|  |           size_t len = strlen (s); | 
|  |  | 
|  |           /* @r{Resize the allocated memory if necessary.}  */ | 
|  |           if (wp + len + 1 > result + allocated) | 
|  |             @{ | 
|  |               allocated = (allocated + len) * 2; | 
|  |               newp = (char *) realloc (result, allocated); | 
|  |               if (newp == NULL) | 
|  |                 @{ | 
|  |                   /* @r{Clean up before the early return.}  */ | 
|  |                   va_end (ap); | 
|  |                   free (result); | 
|  |                   return NULL; | 
|  |                 @} | 
|  |               wp = newp + (wp - result); | 
|  |               result = newp; | 
|  |             @} | 
|  |  | 
|  |           wp = mempcpy (wp, s, len); | 
|  |         @} | 
|  |  | 
|  |       /* @r{Terminate the result string.}  */ | 
|  |       *wp++ = '\0'; | 
|  |  | 
|  |       /* @r{Resize memory to the optimal size.}  */ | 
|  |       newp = realloc (result, wp - result); | 
|  |       if (newp != NULL) | 
|  |         result = newp; | 
|  |  | 
|  |       va_end (ap); | 
|  |     @} | 
|  |  | 
|  |   return result; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | With a bit more knowledge about the input strings one could fine-tune | 
|  | the memory allocation.  The difference we are pointing to here is that | 
|  | we don't use @code{strcat} anymore.  We always keep track of the length | 
|  | of the current intermediate result so we can save ourselves the search | 
|  | for the end of the string and use @code{mempcpy}.  Please note that we | 
|  | also don't use @code{stpcpy}, which might seem more natural since we are | 
|  | handling strings.  But this is not necessary since we already know the | 
|  | length of the string and therefore can use the faster memory copying | 
|  | function.  The example would work for wide characters the same way. | 
|  |  | 
|  | Whenever a programmer feels the need to use @code{strcat}, she or he | 
|  | should think twice and check whether the code can be rewritten to take | 
|  | advantage of already calculated results.  Again: it | 
|  | is almost always unnecessary to use @code{strcat}. | 
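
For the common case of joining just two strings, the same idea fits in a few lines.  This is a minimal sketch, not a library function: the name @code{concat2} is hypothetical, and the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of A followed by
   B, or NULL if allocation fails.  Each length is computed once and
   reused, so no string is scanned twice.  */
char *
concat2 (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  size_t alen = strlen (a);
  size_t blen = strlen (b);
  char *result = malloc (alen + blen + 1);

  if (result != NULL)
    {
      memcpy (result, a, alen);
      /* Copy B together with its terminating null byte.  */
      memcpy (result + alen, b, blen + 1);
    }
  return result;
}
```

Since both lengths are already known when the copying starts, @code{memcpy} can be used and the end of the destination never has to be searched for.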
|  |  | 
|  | @node Truncating Strings | 
|  | @section Truncating Strings while Copying | 
|  | @cindex truncating strings | 
|  | @cindex string truncation | 
|  |  | 
|  | The functions described in this section copy or concatenate the | 
|  | possibly-truncated contents of a string or array to another, and | 
|  | similarly for wide strings.  They follow the string-copying functions | 
|  | in their header conventions.  @xref{Copying Strings and Arrays}.  The | 
|  | @samp{str} functions are declared in the header file @file{string.h} | 
|  | and the @samp{wc} functions are declared in the file @file{wchar.h}. | 
|  |  | 
|  | @comment string.h | 
|  | @deftypefun {char *} strncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{strcpy} but always copies exactly | 
|  | @var{size} bytes into @var{to}. | 
|  |  | 
|  | If @var{from} does not contain a null byte in its first @var{size} | 
|  | bytes, @code{strncpy} copies just the first @var{size} bytes.  In this | 
|  | case no null terminator is written into @var{to}. | 
|  |  | 
|  | Otherwise @var{from} must be a string with length less than | 
|  | @var{size}.  In this case @code{strncpy} copies all of @var{from}, | 
|  | followed by enough null bytes to add up to @var{size} bytes in all. | 
|  |  | 
|  | The behavior of @code{strncpy} is undefined if the strings overlap. | 
|  |  | 
|  | This function was designed for now-rarely-used arrays consisting of | 
|  | non-null bytes followed by zero or more null bytes.  It needs to set | 
|  | all @var{size} bytes of the destination, even when @var{size} is much | 
|  | greater than the length of @var{from}.  As noted below, this function | 
|  | is generally a poor choice for processing text. | 
|  | @end deftypefun | 
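
The two cases described for @code{strncpy} can be observed with a small wrapper.  This is a sketch only; @code{copy_field} is a hypothetical name, and the buffer is treated as a fixed-size field rather than as a string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill the SIZE-byte field TO from FROM using
   strncpy.  Returns nonzero if the field ends up containing a null
   byte, that is, if FROM was shorter than SIZE.  */
int
copy_field (char *to, size_t size, const char *from)
{
  assert (size > 0);
  strncpy (to, from, size);
  /* No terminator is written when FROM has SIZE or more bytes.  */
  return memchr (to, '\0', size) != NULL;
}
```

With a four-byte field, copying @code{"ab"} stores two null bytes after the text, while copying @code{"abcdef"} stores @code{abcd} with no terminator at all.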
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{wcscpy} but always copies exactly | 
|  | @var{size} wide characters into @var{wto}. | 
|  |  | 
|  | If @var{wfrom} does not contain a null wide character in its first | 
|  | @var{size} wide characters, then @code{wcsncpy} copies just the first | 
|  | @var{size} wide characters.  In this case no null terminator is | 
|  | written into @var{wto}. | 
|  |  | 
|  | Otherwise @var{wfrom} must be a wide string with length less than | 
|  | @var{size}.  In this case @code{wcsncpy} copies all of @var{wfrom}, | 
|  | followed by enough null wide characters to add up to @var{size} wide | 
|  | characters in all. | 
|  |  | 
|  | The behavior of @code{wcsncpy} is undefined if the strings overlap. | 
|  |  | 
|  | This function is the wide-character counterpart of @code{strncpy} and | 
|  | suffers from most of the problems that @code{strncpy} does.  For | 
|  | example, as noted below, this function is generally a poor choice for | 
|  | processing text. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} strndup (const char *@var{s}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | This function is similar to @code{strdup} but always copies at most | 
|  | @var{size} bytes into the newly allocated string. | 
|  |  | 
|  | If the length of @var{s} is more than @var{size}, then @code{strndup} | 
|  | copies just the first @var{size} bytes and adds a closing null byte. | 
|  | Otherwise all bytes are copied and the string is terminated. | 
|  |  | 
|  | This function differs from @code{strncpy} in that it always terminates | 
|  | the destination string. | 
|  |  | 
|  | As noted below, this function is generally a poor choice for | 
|  | processing text. | 
|  |  | 
|  | @code{strndup} is a GNU extension. | 
|  | @end deftypefun | 
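
One case where the truncation performed by @code{strndup} is exactly what the program logic wants is extracting a prefix whose length has already been computed.  A minimal sketch, with the hypothetical helper name @code{first_word}:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the bytes of
   S that precede the first space, or NULL if allocation fails.  The
   caller frees the result.  */
char *
first_word (const char *s)
{
  assert (s != NULL);
  size_t len = strcspn (s, " ");
  return strndup (s, len);   /* The copy is always null-terminated.  */
}
```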
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefn {Macro} {char *} strndupa (const char *@var{s}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This macro is similar to @code{strndup} but like @code{strdupa} it | 
|  | allocates the new string using @code{alloca} (@pxref{Variable Size | 
|  | Automatic}).  The same advantages and limitations of @code{strdupa} are | 
|  | valid for @code{strndupa}, too. | 
|  |  | 
|  | Like @code{strdupa}, it is implemented only as a macro and must not be | 
|  | used inside the argument list of a function call. | 
|  |  | 
|  | As noted below, this function is generally a poor choice for | 
|  | processing text. | 
|  |  | 
|  | @code{strndupa} is only available if GNU CC is used. | 
|  | @end deftypefn | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} stpncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{stpcpy} but always copies exactly | 
|  | @var{size} bytes into @var{to}. | 
|  |  | 
|  | If the length of @var{from} is at least @var{size}, then @code{stpncpy} | 
|  | copies just the first @var{size} bytes and returns a pointer to the | 
|  | byte directly following the one which was copied last.  Note that in | 
|  | this case there is no null terminator written into @var{to}. | 
|  |  | 
|  | If the length of @var{from} is less than @var{size}, then @code{stpncpy} | 
|  | copies all of @var{from}, followed by enough null bytes to add up | 
|  | to @var{size} bytes in all.  This padding is rarely useful in itself, | 
|  | but it is provided so that @code{stpncpy} can be used wherever this | 
|  | behavior of @code{strncpy} is relied upon.  @code{stpncpy} returns a pointer to the | 
|  | @emph{first} written null byte. | 
|  |  | 
|  | This function is not part of ISO or POSIX but was found useful while | 
|  | developing @theglibc{} itself. | 
|  |  | 
|  | Its behavior is undefined if the strings overlap.  The function is | 
|  | declared in @file{string.h}. | 
|  |  | 
|  | As noted below, this function is generally a poor choice for | 
|  | processing text. | 
|  | @end deftypefun | 
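
The return value of @code{stpncpy} tells the caller how many non-null bytes were stored, in both of the cases described above.  The wrapper below is a sketch with the hypothetical name @code{copy_counted}.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill the SIZE-byte field TO from FROM and
   return the number of non-null bytes stored.  */
size_t
copy_counted (char *to, size_t size, const char *from)
{
  assert (to != NULL && from != NULL);
  /* END is either the first null byte written or TO + SIZE.  */
  char *end = stpncpy (to, from, size);
  return (size_t) (end - to);
}
```

A result equal to @var{size} means that no null byte was written, so the field must not be treated as a string.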
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun {wchar_t *} wcpncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{wcpcpy} but always copies exactly | 
|  | @var{size} wide characters into @var{wto}. | 
|  |  | 
|  | If the length of @var{wfrom} is at least @var{size}, then | 
|  | @code{wcpncpy} copies just the first @var{size} wide characters and | 
|  | returns a pointer to the wide character directly following the last | 
|  | wide character copied.  Note that in this case | 
|  | there is no null terminator written into @var{wto}. | 
|  |  | 
|  | If the length of @var{wfrom} is less than @var{size}, then @code{wcpncpy} | 
|  | copies all of @var{wfrom}, followed by enough null wide characters to add up | 
|  | to @var{size} wide characters in all.  This padding is rarely useful in | 
|  | itself, but it is provided so that @code{wcpncpy} can be used wherever this | 
|  | behavior of @code{wcsncpy} is relied upon.  @code{wcpncpy} returns a pointer to the | 
|  | @emph{first} written null wide character. | 
|  |  | 
|  | This function is not part of ISO or POSIX but was found useful while | 
|  | developing @theglibc{} itself. | 
|  |  | 
|  | Its behavior is undefined if the strings overlap. | 
|  |  | 
|  | As noted below, this function is generally a poor choice for | 
|  | processing text. | 
|  |  | 
|  | @code{wcpncpy} is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strncat (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is like @code{strcat} except that not more than @var{size} | 
|  | bytes from @var{from} are appended to the end of @var{to}, and | 
|  | @var{from} need not be null-terminated.  A single null byte is also | 
|  | always appended to @var{to}, so the total | 
|  | allocated size of @var{to} must be at least @code{@var{size} + 1} bytes | 
|  | longer than its initial length. | 
|  |  | 
|  | The @code{strncat} function could be implemented like this: | 
|  |  | 
|  | @smallexample | 
|  | @group | 
|  | char * | 
|  | strncat (char *to, const char *from, size_t size) | 
|  | @{ | 
|  |   size_t len = strlen (to); | 
|  |   memcpy (to + len, from, strnlen (from, size)); | 
|  |   to[len + strnlen (from, size)] = '\0'; | 
|  |   return to; | 
|  | @} | 
|  | @end group | 
|  | @end smallexample | 
|  |  | 
|  | The behavior of @code{strncat} is undefined if the strings overlap. | 
|  |  | 
|  | As a companion to @code{strncpy}, @code{strncat} was designed for | 
|  | now-rarely-used arrays consisting of non-null bytes followed by zero | 
|  | or more null bytes.  As noted below, this function is generally a poor | 
|  | choice for processing text.  Also, this function has significant | 
|  | performance issues.  @xref{Concatenating Strings}. | 
|  | @end deftypefun | 
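
A frequent mistake is to pass the total size of the destination as @var{size}.  The limit applies to the bytes taken from @var{from}, and one more byte is needed for the terminator.  The sketch below uses a hypothetical helper, @code{append_bounded}, to show the arithmetic.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: append as much of FROM as fits into the
   CAPACITY-byte buffer TO, which must already hold a string.
   Returns nonzero if all of FROM was appended.  */
int
append_bounded (char *to, size_t capacity, const char *from)
{
  size_t used = strlen (to);
  assert (used < capacity);
  /* Reserve one byte for the null that strncat always appends.  */
  size_t room = capacity - used - 1;
  strncat (to, from, room);
  return strlen (from) <= room;
}
```

The return value lets the caller detect truncation instead of silently continuing with a shortened string.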
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcsncat (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is like @code{wcscat} except that not more than @var{size} | 
|  | wide characters from @var{wfrom} are appended to the end of @var{wto}, | 
|  | and @var{wfrom} need not be null-terminated.  A single null wide | 
|  | character is also always appended to @var{wto}, so the total allocated | 
|  | size of @var{wto} must be at least @code{wcsnlen (@var{wfrom}, | 
|  | @var{size}) + 1} wide characters longer than its initial length. | 
|  |  | 
|  | The @code{wcsncat} function could be implemented like this: | 
|  |  | 
|  | @smallexample | 
|  | @group | 
|  | wchar_t * | 
|  | wcsncat (wchar_t *restrict wto, const wchar_t *restrict wfrom, | 
|  |          size_t size) | 
|  | @{ | 
|  |   size_t len = wcslen (wto); | 
|  |   memcpy (wto + len, wfrom, wcsnlen (wfrom, size) * sizeof (wchar_t)); | 
|  |   wto[len + wcsnlen (wfrom, size)] = L'\0'; | 
|  |   return wto; | 
|  | @} | 
|  | @end group | 
|  | @end smallexample | 
|  |  | 
|  | The behavior of @code{wcsncat} is undefined if the strings overlap. | 
|  |  | 
|  | As noted below, this function is generally a poor choice for | 
|  | processing text.  Also, this function has significant performance | 
|  | issues.  @xref{Concatenating Strings}. | 
|  | @end deftypefun | 
|  |  | 
|  | Because these functions can abruptly truncate strings or wide strings, | 
|  | they are generally poor choices for processing text.  When copying or | 
|  | concatenating multibyte strings, they can truncate within a multibyte | 
|  | character so that the result is not a valid multibyte string.  When | 
|  | combining or concatenating multibyte or wide strings, they may | 
|  | truncate the output after a combining character, resulting in a | 
|  | corrupted grapheme.  They can cause bugs even when processing | 
|  | single-byte strings: for example, when calculating an ASCII-only user | 
|  | name, a truncated name can identify the wrong user. | 
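
When truncation really is required, the cut can at least be placed on a character boundary.  The following sketch assumes that the input is valid UTF-8, which nothing in this section guarantees; the helper name @code{utf8_prefix_length} is hypothetical.  It does not keep a base character together with any combining characters that follow it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, assuming S is valid UTF-8: return the largest
   length not exceeding LIMIT at which S can be cut without splitting
   a multibyte character.  */
size_t
utf8_prefix_length (const char *s, size_t limit)
{
  assert (s != NULL);
  size_t len = strlen (s);
  if (len <= limit)
    return len;
  /* Continuation bytes have the form 10xxxxxx.  Back up past them so
     that the cut falls at the start of a character.  */
  while (limit > 0 && ((unsigned char) s[limit] & 0xC0) == 0x80)
    limit--;
  return limit;
}
```

For the three-byte string consisting of @samp{a} followed by the two-byte encoding of a single accented letter, a limit of 2 yields a prefix length of 1 rather than a string that ends in half a character.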
|  |  | 
|  | Although some buffer overruns can be prevented by manually replacing | 
|  | calls to copying functions with calls to truncation functions, there | 
|  | are often easier and safer automatic techniques that cause buffer | 
|  | overruns to reliably terminate a program, such as GCC's | 
|  | @option{-fcheck-pointer-bounds} and @option{-fsanitize=address} | 
|  | options.  @xref{Debugging Options,, Options for Debugging Your Program | 
|  | or GCC, gcc.info, Using GCC}.  Because truncation functions can mask | 
|  | application bugs that would otherwise be caught by the automatic | 
|  | techniques, these functions should be used only when the application's | 
|  | underlying logic requires truncation. | 
|  |  | 
|  | @strong{Note:} GNU programs should not truncate strings or wide | 
|  | strings to fit arbitrary size limits.  @xref{Semantics, , Writing | 
|  | Robust Programs, standards, The GNU Coding Standards}.  Instead of | 
|  | string-truncation functions, it is usually better to use dynamic | 
|  | memory allocation (@pxref{Unconstrained Allocation}) and functions | 
|  | such as @code{strdup} or @code{asprintf} to construct strings. | 
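
As an illustration of constructing a string in dynamically allocated storage, the sketch below measures the result first and then formats into a buffer of exactly that size.  It uses two calls to the ISO C function @code{snprintf} so that it does not depend on any extension; @code{asprintf} combines these steps into one call.  The helper name @code{make_path} is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "DIR/NAME" in newly allocated storage of
   exactly the right size.  Returns NULL on failure; the caller frees
   the result.  */
char *
make_path (const char *dir, const char *name)
{
  assert (dir != NULL && name != NULL);
  /* First pass: measure.  snprintf returns the length needed.  */
  int len = snprintf (NULL, 0, "%s/%s", dir, name);
  if (len < 0)
    return NULL;
  char *path = malloc ((size_t) len + 1);
  if (path != NULL)
    /* Second pass: format into storage that is known to fit.  */
    snprintf (path, (size_t) len + 1, "%s/%s", dir, name);
  return path;
}
```

No size limit is built into the code, so nothing is ever truncated; the only failure mode is running out of memory, which the caller can detect.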
|  |  | 
|  | @node String/Array Comparison | 
|  | @section String/Array Comparison | 
|  | @cindex comparing strings and arrays | 
|  | @cindex string comparison functions | 
|  | @cindex array comparison functions | 
|  | @cindex predicates on strings | 
|  | @cindex predicates on arrays | 
|  |  | 
|  | You can use the functions in this section to perform comparisons on the | 
|  | contents of strings and arrays.  As well as checking for equality, these | 
|  | functions can also be used as the ordering functions for sorting | 
|  | operations.  @xref{Searching and Sorting}, for an example of this. | 
|  |  | 
|  | Unlike most comparison operations in C, the string comparison functions | 
|  | return a nonzero value if the strings are @emph{not} equivalent rather | 
|  | than if they are.  The sign of the value indicates the relative ordering | 
|  | of the first part of the strings that are not equivalent:  a | 
|  | negative value indicates that the first string is ``less'' than the | 
|  | second, while a positive value indicates that the first string is | 
|  | ``greater''. | 
|  |  | 
|  | The most common use of these functions is to check only for equality. | 
|  | This is canonically done with an expression like @w{@samp{! strcmp (s1, s2)}}. | 
|  |  | 
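|  | The functions that operate on bytes and on @code{char} strings are | 
|  | declared in the header file @file{string.h}; their wide-character | 
|  | counterparts are declared in @file{wchar.h}. | 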
|  | @pindex string.h | 
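
As a sketch of the sorting use mentioned above, @code{strcmp} can be wrapped in a comparison function for @code{qsort}.  The names @code{compare_strings} and @code{sort_strings} are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort.  Each argument points to an array
   element, that is, to a pointer to a string.  */
static int
compare_strings (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Hypothetical helper: sort COUNT strings into strcmp order.  */
void
sort_strings (const char **strings, size_t count)
{
  assert (strings != NULL);
  qsort (strings, count, sizeof *strings, compare_strings);
}
```

The wrapper is needed because @code{qsort} passes pointers to the array elements, not the elements themselves, so @code{strcmp} cannot be passed to it directly.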
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun int memcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The function @code{memcmp} compares the @var{size} bytes of memory | 
|  | beginning at @var{a1} against the @var{size} bytes of memory beginning | 
|  | at @var{a2}.  The value returned has the same sign as the difference | 
|  | between the first differing pair of bytes (interpreted as @code{unsigned | 
|  | char} objects, then promoted to @code{int}). | 
|  |  | 
|  | If the contents of the two blocks are equal, @code{memcmp} returns | 
|  | @code{0}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun int wmemcmp (const wchar_t *@var{a1}, const wchar_t *@var{a2}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The function @code{wmemcmp} compares the @var{size} wide characters | 
|  | beginning at @var{a1} against the @var{size} wide characters beginning | 
|  | at @var{a2}.  The value returned is smaller than or larger than zero | 
|  | depending on whether the first differing wide character in @var{a1} is | 
|  | smaller or larger than the corresponding wide character in @var{a2}. | 
|  |  | 
|  | If the contents of the two blocks are equal, @code{wmemcmp} returns | 
|  | @code{0}. | 
|  | @end deftypefun | 
|  |  | 
|  | On arbitrary arrays, the @code{memcmp} function is mostly useful for | 
|  | testing equality.  It usually isn't meaningful to do byte-wise ordering | 
|  | comparisons on arrays of things other than bytes.  For example, a | 
|  | byte-wise comparison on the bytes that make up floating-point numbers | 
|  | isn't likely to tell you anything about the relationship between the | 
|  | values of the floating-point numbers. | 
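
Even an equality test can be misleading for floating-point data.  The sketch below assumes IEEE 754 arithmetic, where positive and negative zero compare equal as numbers but have different bit patterns; the helper name @code{same_bytes} is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return nonzero if A and B have identical
   object representations.  */
int
same_bytes (double a, double b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}
```

Under that assumption, @code{0.0 == -0.0} is true while @code{same_bytes (0.0, -0.0)} is zero, so the byte-wise result disagrees with the numeric one.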
|  |  | 
|  | @code{wmemcmp} is really only useful to compare arrays of type | 
|  | @code{wchar_t} since the function looks at @code{sizeof (wchar_t)} bytes | 
|  | at a time and this number of bytes is system dependent. | 
|  |  | 
|  | You should also be careful about using @code{memcmp} to compare objects | 
|  | that can contain ``holes'', such as the padding inserted into structure | 
|  | objects to enforce alignment requirements, extra space at the end of | 
|  | unions, and extra bytes at the ends of strings whose length is less | 
|  | than their allocated size.  The contents of these ``holes'' are | 
|  | indeterminate and may cause strange behavior when performing byte-wise | 
|  | comparisons.  For more predictable results, perform an explicit | 
|  | component-wise comparison. | 
|  |  | 
|  | For example, given a structure type definition like: | 
|  |  | 
|  | @smallexample | 
|  | struct foo | 
|  |   @{ | 
|  |     unsigned char tag; | 
|  |     union | 
|  |       @{ | 
|  |         double f; | 
|  |         long i; | 
|  |         char *p; | 
|  |       @} value; | 
|  |   @}; | 
|  | @end smallexample | 
|  |  | 
|  | @noindent | 
|  | you are better off writing a specialized comparison function to compare | 
|  | @code{struct foo} objects instead of comparing them with @code{memcmp}. | 
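
Such a comparison function might look like the following sketch.  The meaning of @code{tag} is not defined above, so the three enumeration constants are purely hypothetical, as is the choice to compare the @code{p} member as a string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tag values selecting the active union member.  */
enum { FOO_DOUBLE, FOO_LONG, FOO_STRING };

struct foo
{
  unsigned char tag;
  union
  {
    double f;
    long i;
    char *p;
  } value;
};

/* Return nonzero if A and B are equal, looking only at the tag and
   at the union member that the tag selects.  Padding bytes and the
   unused part of the union are never examined.  */
int
foo_equal (const struct foo *a, const struct foo *b)
{
  assert (a != NULL && b != NULL);
  if (a->tag != b->tag)
    return 0;
  switch (a->tag)
    {
    case FOO_DOUBLE:
      return a->value.f == b->value.f;
    case FOO_LONG:
      return a->value.i == b->value.i;
    case FOO_STRING:
      return strcmp (a->value.p, b->value.p) == 0;
    default:
      return 0;
    }
}
```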
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun int strcmp (const char *@var{s1}, const char *@var{s2}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strcmp} function compares the string @var{s1} against | 
|  | @var{s2}, returning a value that has the same sign as the difference | 
|  | between the first differing pair of bytes (interpreted as | 
|  | @code{unsigned char} objects, then promoted to @code{int}). | 
|  |  | 
|  | If the two strings are equal, @code{strcmp} returns @code{0}. | 
|  |  | 
|  | A consequence of the ordering used by @code{strcmp} is that if @var{s1} | 
|  | is an initial substring of @var{s2}, then @var{s1} is considered to be | 
|  | ``less than'' @var{s2}. | 
|  |  | 
|  | @code{strcmp} does not take sorting conventions of the language the | 
|  | strings are written in into account.  To compare strings according to | 
|  | those conventions, use @code{strcoll} (@pxref{Collation Functions}). | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun int wcscmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  |  | 
|  | The @code{wcscmp} function compares the wide string @var{ws1} | 
|  | against @var{ws2}.  The value returned is smaller than or larger than zero | 
|  | depending on whether the first differing wide character in @var{ws1} is | 
|  | smaller or larger than the corresponding wide character in @var{ws2}. | 
|  |  | 
|  | If the two strings are equal, @code{wcscmp} returns @code{0}. | 
|  |  | 
|  | A consequence of the ordering used by @code{wcscmp} is that if @var{ws1} | 
|  | is an initial substring of @var{ws2}, then @var{ws1} is considered to be | 
|  | ``less than'' @var{ws2}. | 
|  |  | 
|  | @code{wcscmp} does not take sorting conventions of the language the | 
|  | strings are written in into account.  To compare wide strings according | 
|  | to those conventions, use @code{wcscoll} (@pxref{Collation Functions}). | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun int strcasecmp (const char *@var{s1}, const char *@var{s2}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | @c Although this calls tolower multiple times, it's a macro, and | 
|  | @c strcasecmp is optimized so that the locale pointer is read only once. | 
|  | @c There are some asm implementations too, for which the single-read | 
|  | @c from locale TLS pointers also applies. | 
|  | This function is like @code{strcmp}, except that differences in case are | 
|  | ignored, and its arguments must be multibyte strings. | 
|  | How uppercase and lowercase characters are related is | 
|  | determined by the currently selected locale.  In the standard @code{"C"} | 
|  | locale the characters @"A and @"a do not match but in a locale which | 
|  | regards these characters as parts of the alphabet they do match. | 
|  |  | 
|  | @noindent | 
|  | @code{strcasecmp} is derived from BSD. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun int wcscasecmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | @c Since towlower is not a macro, the locale object may be read multiple | 
|  | @c times. | 
|  | This function is like @code{wcscmp}, except that differences in case are | 
|  | ignored.  How uppercase and lowercase characters are related is | 
|  | determined by the currently selected locale.  In the standard @code{"C"} | 
|  | locale the characters @"A and @"a do not match but in a locale which | 
|  | regards these characters as parts of the alphabet they do match. | 
|  |  | 
|  | @noindent | 
|  | @code{wcscasecmp} is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun int strncmp (const char *@var{s1}, const char *@var{s2}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{strcmp}, except that no more than | 
|  | @var{size} bytes are compared.  In other words, if the two | 
|  | strings are the same in their first @var{size} bytes, the | 
|  | return value is zero. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun int wcsncmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function is similar to @code{wcscmp}, except that no more than | 
|  | @var{size} wide characters are compared.  In other words, if the two | 
|  | strings are the same in their first @var{size} wide characters, the | 
|  | return value is zero. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun int strncasecmp (const char *@var{s1}, const char *@var{s2}, size_t @var{n}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | This function is like @code{strncmp}, except that differences in case | 
|  | are ignored, and the compared parts of the arguments should consist of | 
|  | valid multibyte characters. | 
|  | As with @code{strcasecmp}, how uppercase and lowercase characters are | 
|  | related depends on the currently selected locale. | 
|  |  | 
|  | @noindent | 
|  | @code{strncasecmp} is derived from BSD. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun int wcsncasecmp (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}, size_t @var{n}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | This function is like @code{wcsncmp}, except that differences in case | 
|  | are ignored.  Like @code{wcscasecmp}, it is locale dependent how | 
|  | uppercase and lowercase characters are related. | 
|  |  | 
|  | @noindent | 
|  | @code{wcsncasecmp} is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | Here are some examples showing the use of @code{strcmp} and | 
|  | @code{strncmp} (equivalent examples can be constructed for the wide | 
|  | character functions).  These examples assume the use of the ASCII | 
|  | character set.  (If some other character set---say, EBCDIC---is used | 
|  | instead, then the glyphs are associated with different numeric codes, | 
|  | and the return values and ordering may differ.) | 
|  |  | 
|  | @smallexample | 
|  | strcmp ("hello", "hello") | 
|  | @result{} 0    /* @r{These two strings are the same.} */ | 
|  | strcmp ("hello", "Hello") | 
|  | @result{} 32   /* @r{Comparisons are case-sensitive.} */ | 
|  | strcmp ("hello", "world") | 
|  | @result{} -15  /* @r{The byte @code{'h'} comes before @code{'w'}.} */ | 
|  | strcmp ("hello", "hello, world") | 
|  | @result{} -44  /* @r{Comparing a null byte against a comma.} */ | 
|  | strncmp ("hello", "hello, world", 5) | 
|  | @result{} 0    /* @r{The initial 5 bytes are the same.} */ | 
|  | strncmp ("hello, world", "hello, stupid world!!!", 5) | 
|  | @result{} 0    /* @r{The initial 5 bytes are the same.} */ | 
|  | @end smallexample | 
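The exact nonzero values returned by @code{strcmp} and @code{strncmp} depend on the implementation, so portable code should rely only on the sign of the result. As a minimal sketch, @code{strncmp} can test whether one string starts with another; the helper name @code{has_prefix} is hypothetical and not part of the library.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if STRING begins with PREFIX.  */
bool
has_prefix (const char *string, const char *prefix)
{
  /* Compare only as many bytes as PREFIX contains.  If STRING is
     shorter, its terminating null byte differs from the
     corresponding byte of PREFIX and the comparison stops there.  */
  return strncmp (string, prefix, strlen (prefix)) == 0;
}
```

With this definition, @code{has_prefix ("hello, world", "hello")} is true, while @code{has_prefix ("hell", "hello")} is false.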
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun int strverscmp (const char *@var{s1}, const char *@var{s2}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | @c Calls isdigit multiple times, locale may change in between. | 
|  | The @code{strverscmp} function compares the string @var{s1} against | 
|  | @var{s2}, considering them as holding indices/version numbers.  The | 
|  | return value follows the same conventions as found in the | 
|  | @code{strcmp} function.  In fact, if @var{s1} and @var{s2} contain no | 
|  | digits, @code{strverscmp} behaves like @code{strcmp}. | 
|  |  | 
|  | Basically, we compare strings normally (byte by byte) until | 
|  | we find a digit in each string; then we enter a special comparison | 
|  | mode, where each sequence of digits is taken as a whole.  If we reach the | 
|  | end of these two parts without noticing a difference, we return to the | 
|  | standard comparison mode.  There are two types of numeric parts: | 
|  | ``integral'' and ``fractional'' (those that begin with a @samp{0}).  The types | 
|  | of the numeric parts affect the way we sort them: | 
|  |  | 
|  | @itemize @bullet | 
|  | @item | 
|  | integral/integral: we compare values as you would expect. | 
|  |  | 
|  | @item | 
|  | fractional/integral: the fractional part is less than the integral one. | 
|  | Again, no surprise. | 
|  |  | 
|  | @item | 
|  | fractional/fractional: things become a bit more complex. | 
|  | If the common prefix contains only leading zeroes, the longer part is less | 
|  | than the other one; otherwise the comparison behaves normally. | 
|  | @end itemize | 
|  |  | 
|  | @smallexample | 
|  | strverscmp ("no digit", "no digit") | 
|  | @result{} 0    /* @r{same behavior as strcmp.} */ | 
|  | strverscmp ("item#99", "item#100") | 
|  | @result{} <0   /* @r{same prefix, but 99 < 100.} */ | 
|  | strverscmp ("alpha1", "alpha001") | 
|  | @result{} >0   /* @r{fractional part inferior to integral one.} */ | 
|  | strverscmp ("part1_f012", "part1_f01") | 
|  | @result{} >0   /* @r{two fractional parts.} */ | 
|  | strverscmp ("foo.009", "foo.0") | 
|  | @result{} <0   /* @r{idem, but with leading zeroes only.} */ | 
|  | @end smallexample | 
|  |  | 
|  | This function is especially useful when dealing with filename sorting, | 
|  | because filenames frequently hold indices/version numbers. | 
|  |  | 
|  | @code{strverscmp} is a GNU extension. | 
|  | @end deftypefun | 
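The following sketch shows one way @code{strverscmp} might be combined with @code{qsort} to sort file names so that embedded numbers are ordered by value. The names @code{compare_versions} and @code{sort_by_version} are hypothetical, and the code assumes a system where the GNU extension is available.

```c
#define _GNU_SOURCE   /* strverscmp is a GNU extension.  */
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort: order two strings by version.  */
static int
compare_versions (const void *v1, const void *v2)
{
  const char *const *p1 = v1;
  const char *const *p2 = v2;

  return strverscmp (*p1, *p2);
}

/* Hypothetical helper: sort the COUNT strings in NAMES so that
   numeric parts are compared by value.  */
void
sort_by_version (const char **names, size_t count)
{
  qsort (names, count, sizeof (const char *), compare_versions);
}
```

Sorting the names @code{"file10"}, @code{"file9"} and @code{"file1"} this way places @code{"file9"} before @code{"file10"}, whereas plain @code{strcmp} would order them the other way around.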
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun int bcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is an obsolete alias for @code{memcmp}, derived from BSD. | 
|  | @end deftypefun | 
|  |  | 
|  | @node Collation Functions | 
|  | @section Collation Functions | 
|  |  | 
|  | @cindex collating strings | 
|  | @cindex string collation functions | 
|  |  | 
|  | In some locales, the conventions for lexicographic ordering differ from | 
|  | the strict numeric ordering of character codes.  For example, in Spanish | 
|  | most glyphs with diacritical marks such as accents are not considered | 
|  | distinct letters for the purposes of collation.  On the other hand, the | 
|  | two-character sequence @samp{ll} is treated as a single letter that is | 
|  | collated immediately after @samp{l}. | 
|  |  | 
|  | You can use the functions @code{strcoll} and @code{strxfrm} (declared in | 
|  | the header file @file{string.h}) and @code{wcscoll} and @code{wcsxfrm} | 
|  | (declared in the header file @file{wchar.h}) to compare strings using a | 
|  | collation ordering appropriate for the current locale.  The locale used | 
|  | by these functions in particular can be specified by setting the locale | 
|  | for the @code{LC_COLLATE} category; see @ref{Locales}. | 
|  | @pindex string.h | 
|  | @pindex wchar.h | 
|  |  | 
|  | In the standard C locale, the collation sequence for @code{strcoll} is | 
|  | the same as that for @code{strcmp}.  Similarly, @code{wcscoll} and | 
|  | @code{wcscmp} are the same in this situation. | 
|  |  | 
|  | Effectively, the way these functions work is by applying a mapping to | 
|  | transform the characters in a multibyte string to a byte | 
|  | sequence that represents | 
|  | the string's position in the collating sequence of the current locale. | 
|  | Comparing two such byte sequences in a simple fashion is equivalent to | 
|  | comparing the strings with the locale's collating sequence. | 
|  |  | 
|  | The functions @code{strcoll} and @code{wcscoll} perform this translation | 
|  | implicitly, in order to do one comparison.  By contrast, @code{strxfrm} | 
|  | and @code{wcsxfrm} perform the mapping explicitly.  If you are making | 
|  | multiple comparisons using the same string or set of strings, it is | 
|  | likely to be more efficient to use @code{strxfrm} or @code{wcsxfrm} to | 
|  | transform all the strings just once, and subsequently compare the | 
|  | transformed strings with @code{strcmp} or @code{wcscmp}. | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun int strcoll (const char *@var{s1}, const char *@var{s2}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Calls strcoll_l with the current locale, which dereferences only the | 
|  | @c LC_COLLATE data pointer. | 
|  | The @code{strcoll} function is similar to @code{strcmp} but uses the | 
|  | collating sequence of the current locale for collation (the | 
|  | @code{LC_COLLATE} locale).  The arguments are multibyte strings. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun int wcscoll (const wchar_t *@var{ws1}, const wchar_t *@var{ws2}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Same as strcoll, but calling wcscoll_l. | 
|  | The @code{wcscoll} function is similar to @code{wcscmp} but uses the | 
|  | collating sequence of the current locale for collation (the | 
|  | @code{LC_COLLATE} locale). | 
|  | @end deftypefun | 
|  |  | 
|  | Here is an example of sorting an array of strings, using @code{strcoll} | 
|  | to compare them.  The actual sort algorithm is not written here; it | 
|  | comes from @code{qsort} (@pxref{Array Sort Function}).  The job of the | 
|  | code shown here is to say how to compare the strings while sorting them. | 
|  | (Later on in this section, we will show a way to do this more | 
|  | efficiently using @code{strxfrm}.) | 
|  |  | 
|  | @smallexample | 
|  | /* @r{This is the comparison function used with @code{qsort}.} */ | 
|  |  | 
|  | int | 
|  | compare_elements (const void *v1, const void *v2) | 
|  | @{ | 
|  | char * const *p1 = v1; | 
|  | char * const *p2 = v2; | 
|  |  | 
|  | return strcoll (*p1, *p2); | 
|  | @} | 
|  |  | 
|  | /* @r{This is the entry point---the function to sort} | 
|  | @r{strings using the locale's collating sequence.} */ | 
|  |  | 
|  | void | 
|  | sort_strings (char **array, int nstrings) | 
|  | @{ | 
|  | /* @r{Sort @code{array} by comparing the strings.} */ | 
|  | qsort (array, nstrings, | 
|  | sizeof (char *), compare_elements); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | @cindex converting string to collation order | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun size_t strxfrm (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The function @code{strxfrm} transforms the multibyte string | 
|  | @var{from} using the | 
|  | collation transformation determined by the locale currently selected for | 
|  | collation, and stores the transformed string in the array @var{to}.  Up | 
|  | to @var{size} bytes (including a terminating null byte) are | 
|  | stored. | 
|  |  | 
|  | The behavior is undefined if the strings @var{to} and @var{from} | 
|  | overlap; see @ref{Copying Strings and Arrays}. | 
|  |  | 
|  | The return value is the length of the entire transformed string.  This | 
|  | value is not affected by the value of @var{size}, but if it is greater | 
|  | than or equal to @var{size}, it means that the transformed string did not | 
|  | entirely fit in the array @var{to}.  In this case, only as much of the | 
|  | string as actually fits was stored.  To get the whole transformed | 
|  | string, call @code{strxfrm} again with a bigger output array. | 
|  |  | 
|  | The transformed string may be longer than the original string, and it | 
|  | may also be shorter. | 
|  |  | 
|  | If @var{size} is zero, no bytes are stored in @var{to}.  In this | 
|  | case, @code{strxfrm} simply returns the number of bytes that would | 
|  | be the length of the transformed string.  This is useful for determining | 
|  | what size the allocated array should be.  It does not matter what | 
|  | @var{to} is if @var{size} is zero; @var{to} may even be a null pointer. | 
|  | @end deftypefun | 
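As a minimal sketch of the sizing technique described for @code{strxfrm}, the following code calls the function once with a @var{size} of zero to learn the required length and a second time to store the result. The helper name @code{transform_string} is hypothetical, and plain @code{malloc} is used so that the sketch does not depend on any other helper.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of the
   transformed form of STRING, or a null pointer if allocation
   fails.  The caller frees the result.  */
char *
transform_string (const char *string)
{
  /* With a size of zero nothing is stored, so TO may be null.  */
  size_t length = strxfrm (NULL, string, 0);
  char *transformed = malloc (length + 1);

  if (transformed != NULL)
    /* The buffer has room for the terminating null byte, so the
       whole transformed string fits.  */
    strxfrm (transformed, string, length + 1);
  return transformed;
}
```

Two strings transformed this way can then be compared with @code{strcmp}, and the sign of the result should agree with that of @code{strcoll} on the original strings in the same locale.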
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun size_t wcsxfrm (wchar_t *restrict @var{wto}, const wchar_t *@var{wfrom}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The function @code{wcsxfrm} transforms wide string @var{wfrom} | 
|  | using the collation transformation determined by the locale currently | 
|  | selected for collation, and stores the transformed string in the array | 
|  | @var{wto}.  Up to @var{size} wide characters (including a terminating null | 
|  | wide character) are stored. | 
|  |  | 
|  | The behavior is undefined if the strings @var{wto} and @var{wfrom} | 
|  | overlap; see @ref{Copying Strings and Arrays}. | 
|  |  | 
|  | The return value is the length of the entire transformed wide | 
|  | string.  This value is not affected by the value of @var{size}, but if | 
|  | it is greater than or equal to @var{size}, it means that the transformed | 
|  | wide string did not entirely fit in the array @var{wto}.  In | 
|  | this case, only as much of the wide string as actually fits | 
|  | was stored.  To get the whole transformed wide string, call | 
|  | @code{wcsxfrm} again with a bigger output array. | 
|  |  | 
|  | The transformed wide string may be longer than the original | 
|  | wide string, and it may also be shorter. | 
|  |  | 
|  | If @var{size} is zero, no wide characters are stored in @var{wto}.  In this | 
|  | case, @code{wcsxfrm} simply returns the number of wide characters that | 
|  | would be the length of the transformed wide string.  This is | 
|  | useful for determining what size the allocated array should be (remember | 
|  | to multiply by @code{sizeof (wchar_t)}).  It does not matter what | 
|  | @var{wto} is if @var{size} is zero; @var{wto} may even be a null pointer. | 
|  | @end deftypefun | 
|  |  | 
|  | Here is an example of how you can use @code{strxfrm} when | 
|  | you plan to do many comparisons.  It does the same thing as the previous | 
|  | example, but much faster, because it has to transform each string only | 
|  | once, no matter how many times it is compared with other strings.  Even | 
|  | the time needed to allocate and free storage is much less than the time | 
|  | we save, when there are many strings. | 
|  |  | 
|  | @smallexample | 
|  | struct sorter @{ char *input; char *transformed; @}; | 
|  |  | 
|  | /* @r{This is the comparison function used with @code{qsort}} | 
|  | @r{to sort an array of @code{struct sorter}.} */ | 
|  |  | 
|  | int | 
|  | compare_elements (const void *v1, const void *v2) | 
|  | @{ | 
|  | const struct sorter *p1 = v1; | 
|  | const struct sorter *p2 = v2; | 
|  |  | 
|  | return strcmp (p1->transformed, p2->transformed); | 
|  | @} | 
|  |  | 
|  | /* @r{This is the entry point---the function to sort} | 
|  | @r{strings using the locale's collating sequence.} */ | 
|  |  | 
|  | void | 
|  | sort_strings_fast (char **array, int nstrings) | 
|  | @{ | 
|  | struct sorter temp_array[nstrings]; | 
|  | int i; | 
|  |  | 
|  | /* @r{Set up @code{temp_array}.  Each element contains} | 
|  | @r{one input string and its transformed string.} */ | 
|  | for (i = 0; i < nstrings; i++) | 
|  | @{ | 
|  | size_t length = strlen (array[i]) * 2; | 
|  | char *transformed; | 
|  | size_t transformed_length; | 
|  |  | 
|  | temp_array[i].input = array[i]; | 
|  |  | 
|  | /* @r{First try a buffer perhaps big enough.}  */ | 
|  | transformed = (char *) xmalloc (length); | 
|  |  | 
|  | /* @r{Transform @code{array[i]}.}  */ | 
|  | transformed_length = strxfrm (transformed, array[i], length); | 
|  |  | 
|  | /* @r{If the buffer was not large enough, resize it} | 
|  | @r{and try again.}  */ | 
|  | if (transformed_length >= length) | 
|  | @{ | 
|  | /* @r{Allocate the needed space. +1 for terminating} | 
|  | @r{@code{'\0'} byte.}  */ | 
|  | transformed = (char *) xrealloc (transformed, | 
|  | transformed_length + 1); | 
|  |  | 
|  | /* @r{The return value is not interesting because we know} | 
|  | @r{how long the transformed string is.}  */ | 
|  | (void) strxfrm (transformed, array[i], | 
|  | transformed_length + 1); | 
|  | @} | 
|  |  | 
|  | temp_array[i].transformed = transformed; | 
|  | @} | 
|  |  | 
|  | /* @r{Sort @code{temp_array} by comparing transformed strings.} */ | 
|  | qsort (temp_array, nstrings, | 
|  | sizeof (struct sorter), compare_elements); | 
|  |  | 
|  | /* @r{Put the elements back in the permanent array} | 
|  | @r{in their sorted order.} */ | 
|  | for (i = 0; i < nstrings; i++) | 
|  | array[i] = temp_array[i].input; | 
|  |  | 
|  | /* @r{Free the strings we allocated.} */ | 
|  | for (i = 0; i < nstrings; i++) | 
|  | free (temp_array[i].transformed); | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | The interesting part of this code for the wide character version would | 
|  | look like this: | 
|  |  | 
|  | @smallexample | 
|  | void | 
|  | sort_strings_fast (wchar_t **array, int nstrings) | 
|  | @{ | 
|  | @dots{} | 
|  | /* @r{Transform @code{array[i]}.}  */ | 
|  | transformed_length = wcsxfrm (transformed, array[i], length); | 
|  |  | 
|  | /* @r{If the buffer was not large enough, resize it} | 
|  | @r{and try again.}  */ | 
|  | if (transformed_length >= length) | 
|  | @{ | 
|  | /* @r{Allocate the needed space. +1 for terminating} | 
|  | @r{@code{L'\0'} wide character.}  */ | 
|  | transformed = (wchar_t *) xrealloc (transformed, | 
|  | (transformed_length + 1) | 
|  | * sizeof (wchar_t)); | 
|  |  | 
|  | /* @r{The return value is not interesting because we know} | 
|  | @r{how long the transformed string is.}  */ | 
|  | (void) wcsxfrm (transformed, array[i], | 
|  | transformed_length + 1); | 
|  | @} | 
|  | @dots{} | 
|  | @end smallexample | 
|  |  | 
|  | @noindent | 
|  | Note the additional multiplication by @code{sizeof (wchar_t)} in the | 
|  | @code{xrealloc} call. | 
|  |  | 
|  | @strong{Compatibility Note:} The string collation functions are a new | 
|  | feature of @w{ISO C90}.  Older C dialects have no equivalent feature. | 
|  | The wide character versions were introduced in @w{Amendment 1} to @w{ISO | 
|  | C90}. | 
|  |  | 
|  | @node Search Functions | 
|  | @section Search Functions | 
|  |  | 
|  | This section describes library functions which perform various kinds | 
|  | of searching operations on strings and arrays.  These functions are | 
|  | declared in the header file @file{string.h}. | 
|  | @pindex string.h | 
|  | @cindex search functions (for strings) | 
|  | @cindex string search functions | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {void *} memchr (const void *@var{block}, int @var{c}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function finds the first occurrence of the byte @var{c} (converted | 
|  | to an @code{unsigned char}) in the initial @var{size} bytes of the | 
|  | object beginning at @var{block}.  The return value is a pointer to the | 
|  | located byte, or a null pointer if no match was found. | 
|  | @end deftypefun | 
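Because @code{memchr} takes an explicit size, it can search data that is not null-terminated or that contains embedded null bytes. The sketch below uses it to measure the first line of a buffer; the helper name @code{line_length} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the number of bytes in BUFFER that
   precede the first newline, or SIZE if there is no newline among
   the first SIZE bytes.  */
size_t
line_length (const char *buffer, size_t size)
{
  const char *newline = memchr (buffer, '\n', size);

  return newline != NULL ? (size_t) (newline - buffer) : size;
}
```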
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wmemchr (const wchar_t *@var{block}, wchar_t @var{wc}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function finds the first occurrence of the wide character @var{wc} | 
|  | in the initial @var{size} wide characters of the object beginning at | 
|  | @var{block}.  The return value is a pointer to the located wide | 
|  | character, or a null pointer if no match was found. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {void *} rawmemchr (const void *@var{block}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | Often the @code{memchr} function is used with the knowledge that the | 
|  | byte @var{c} is available in the memory block specified by the | 
|  | parameters.  But this means that the @var{size} parameter is not really | 
|  | needed and that the tests performed with it at runtime (to check whether | 
|  | the end of the block is reached) are not needed. | 
|  |  | 
|  | The @code{rawmemchr} function exists for just this situation, which is | 
|  | surprisingly frequent.  The interface is similar to @code{memchr} except | 
|  | that the @var{size} parameter is missing.  If the programmer | 
|  | was wrong to assume that the byte @var{c} is present in the block, the | 
|  | function will look beyond the end of the block pointed to by @var{block}. | 
|  | In this case the result is unspecified.  Otherwise the return value is a | 
|  | pointer to the located byte. | 
|  |  | 
|  | This function is of special interest when looking for the end of a | 
|  | string.  Since all strings are terminated by a null byte, a call like | 
|  |  | 
|  | @smallexample | 
|  | rawmemchr (str, '\0') | 
|  | @end smallexample | 
|  |  | 
|  | @noindent | 
|  | will never go beyond the end of the string. | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {void *} memrchr (const void *@var{block}, int @var{c}, size_t @var{size}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The function @code{memrchr} is like @code{memchr}, except that it searches | 
|  | backwards from the end of the block defined by @var{block} and @var{size} | 
|  | (instead of forwards from the front). | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strchr (const char *@var{string}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strchr} function finds the first occurrence of the byte | 
|  | @var{c} (converted to a @code{char}) in the string | 
|  | beginning at @var{string}.  The return value is a pointer to the located | 
|  | byte, or a null pointer if no match was found. | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strchr ("hello, world", 'l') | 
|  | @result{} "llo, world" | 
|  | strchr ("hello, world", '?') | 
|  | @result{} NULL | 
|  | @end smallexample | 
|  |  | 
|  | The terminating null byte is considered to be part of the string, | 
|  | so you can use this function to get a pointer to the end of a string by | 
|  | specifying zero as the value of the @var{c} argument. | 
|  |  | 
|  | When @code{strchr} returns a null pointer, it does not let you know | 
|  | the position of the terminating null byte it has found.  If you | 
|  | need that information, it is better (but less portable) to use | 
|  | @code{strchrnul} than to search for it a second time. | 
|  | @end deftypefun | 
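Since @code{strchr} returns a pointer into the string, a search can be resumed just past the previous match. The following sketch counts the occurrences of a byte this way; the helper name @code{count_byte} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many times the byte C occurs in
   STRING, not counting the terminating null byte.  */
size_t
count_byte (const char *string, int c)
{
  size_t count = 0;
  const char *p;

  /* Stop when there is no further match, or when the match is the
     terminating null byte (which happens only if C is zero).  */
  for (p = strchr (string, c);
       p != NULL && *p != '\0';
       p = strchr (p + 1, c))
    count++;

  return count;
}
```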
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcschr (const wchar_t *@var{wstring}, wchar_t @var{wc}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcschr} function finds the first occurrence of the wide | 
|  | character @var{wc} in the wide string | 
|  | beginning at @var{wstring}.  The return value is a pointer to the | 
|  | located wide character, or a null pointer if no match was found. | 
|  |  | 
|  | The terminating null wide character is considered to be part of the wide | 
|  | string, so you can use this function to get a pointer to the end | 
|  | of a wide string by specifying a null wide character as the | 
|  | value of the @var{wc} argument.  It would be better (but less portable) | 
|  | to use @code{wcschrnul} in this case, though. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} strchrnul (const char *@var{string}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{strchrnul} is the same as @code{strchr} except that if it does | 
|  | not find the byte, it returns a pointer to the string's terminating | 
|  | null byte rather than a null pointer. | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
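Because @code{strchrnul} never returns a null pointer, code that splits a string at a delimiter needs no special case for a missing delimiter. A minimal sketch follows; the helper name @code{first_field_length} is hypothetical, and the code assumes the GNU extension is available.

```c
#define _GNU_SOURCE   /* strchrnul is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the length of the first field of
   STRING, where fields are separated by DELIMITER.  If there is no
   delimiter, the whole string is the first field.  */
size_t
first_field_length (const char *string, int delimiter)
{
  /* No null check is needed: the result always points into STRING,
     either at the delimiter or at the terminating null byte.  */
  return (size_t) (strchrnul (string, delimiter) - string);
}
```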
|  |  | 
|  | @comment wchar.h | 
|  | @comment GNU | 
|  | @deftypefun {wchar_t *} wcschrnul (const wchar_t *@var{wstring}, wchar_t @var{wc}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{wcschrnul} is the same as @code{wcschr} except that if it does not | 
|  | find the wide character, it returns a pointer to the wide string's | 
|  | terminating null wide character rather than a null pointer. | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
|  |  | 
|  | One useful, but unusual, use of the @code{strchr} | 
|  | function is to obtain a pointer to the null byte | 
|  | terminating a string.  This is often written in this way: | 
|  |  | 
|  | @smallexample | 
|  | s += strlen (s); | 
|  | @end smallexample | 
|  |  | 
|  | @noindent | 
|  | This is almost optimal, but the addition operation duplicates a bit of | 
|  | the work already done in the @code{strlen} function.  A better solution | 
|  | is this: | 
|  |  | 
|  | @smallexample | 
|  | s = strchr (s, '\0'); | 
|  | @end smallexample | 
|  |  | 
|  | There is no restriction on the second parameter of @code{strchr}, so it | 
|  | may very well be zero.  Readers thinking very | 
|  | hard about this might now point out that the @code{strchr} function is | 
|  | more expensive than the @code{strlen} function since it has two termination | 
|  | criteria.  This is right.  But in @theglibc{} the implementation of | 
|  | @code{strchr} is optimized in a special way so that @code{strchr} | 
|  | is actually faster. | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strrchr (const char *@var{string}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The function @code{strrchr} is like @code{strchr}, except that it searches | 
|  | backwards from the end of the string @var{string} (instead of forwards | 
|  | from the front). | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strrchr ("hello, world", 'l') | 
|  | @result{} "ld" | 
|  | @end smallexample | 
|  | @end deftypefun | 
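A common use of @code{strrchr} is to find the last component of a file name. The sketch below illustrates the idea; the helper name @code{last_component} is hypothetical, and it does not try to handle every case that a full file name parser would.

```c
#include <string.h>

/* Hypothetical helper: return a pointer to the part of PATH that
   follows the last slash, or PATH itself if it contains no slash.  */
const char *
last_component (const char *path)
{
  const char *slash = strrchr (path, '/');

  return slash != NULL ? slash + 1 : path;
}
```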
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcsrchr (const wchar_t *@var{wstring}, wchar_t @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The function @code{wcsrchr} is like @code{wcschr}, except that it searches | 
|  | backwards from the end of the string @var{wstring} (instead of forwards | 
|  | from the front). | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strstr (const char *@var{haystack}, const char *@var{needle}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is like @code{strchr}, except that it searches @var{haystack} for a | 
|  | substring @var{needle} rather than just a single byte.  It | 
|  | returns a pointer into the string @var{haystack} that is the first | 
|  | byte of the substring, or a null pointer if no match was found.  If | 
|  | @var{needle} is an empty string, the function returns @var{haystack}. | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strstr ("hello, world", "l") | 
|  | @result{} "llo, world" | 
|  | strstr ("hello, world", "wo") | 
|  | @result{} "world" | 
|  | @end smallexample | 
|  | @end deftypefun | 
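As with @code{strchr}, the pointer returned by @code{strstr} can be used to continue a search. The following sketch counts non-overlapping occurrences of a substring; the helper name @code{count_substring} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the non-overlapping occurrences of
   NEEDLE in HAYSTACK.  */
size_t
count_substring (const char *haystack, const char *needle)
{
  size_t needle_length = strlen (needle);
  size_t count = 0;
  const char *p = haystack;

  /* An empty needle matches at every position; report zero rather
     than loop forever.  */
  if (needle_length == 0)
    return 0;

  while ((p = strstr (p, needle)) != NULL)
    {
      count++;
      p += needle_length;   /* Resume just past this match.  */
    }
  return count;
}
```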
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcsstr (const wchar_t *@var{haystack}, const wchar_t *@var{needle}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is like @code{wcschr}, except that it searches @var{haystack} for a | 
|  | substring @var{needle} rather than just a single wide character.  It | 
|  | returns a pointer into the string @var{haystack} that is the first wide | 
|  | character of the substring, or a null pointer if no match was found.  If | 
|  | @var{needle} is an empty string, the function returns @var{haystack}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment XPG | 
|  | @deftypefun {wchar_t *} wcswcs (const wchar_t *@var{haystack}, const wchar_t *@var{needle}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{wcswcs} is a deprecated alias for @code{wcsstr}.  This is the | 
|  | name originally used in the X/Open Portability Guide before the | 
|  | @w{Amendment 1} to @w{ISO C90} was published. | 
|  | @end deftypefun | 
|  |  | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} strcasestr (const char *@var{haystack}, const char *@var{needle}) | 
|  | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} | 
|  | @c There may be multiple calls of strncasecmp, each accessing the locale | 
|  | @c object independently. | 
|  | This is like @code{strstr}, except that it ignores case in searching for | 
|  | the substring.   Like @code{strcasecmp}, it is locale dependent how | 
|  | uppercase and lowercase characters are related, and arguments are | 
|  | multibyte strings. | 
|  |  | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strcasestr ("hello, world", "L") | 
|  | @result{} "llo, world" | 
|  | strcasestr ("hello, World", "wo") | 
|  | @result{} "World" | 
|  | @end smallexample | 
|  | @end deftypefun | 
|  |  | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {void *} memmem (const void *@var{haystack}, size_t @var{haystack-len},@*const void *@var{needle}, size_t @var{needle-len}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is like @code{strstr}, but @var{needle} and @var{haystack} are byte | 
|  | arrays rather than strings.  @var{needle-len} is the | 
|  | length of @var{needle} and @var{haystack-len} is the length of | 
|  | @var{haystack}.@refill | 
|  |  | 
|  | This function is a GNU extension. | 
|  | @end deftypefun | 
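Unlike @code{strstr}, @code{memmem} can search data that contains null bytes. Here is a minimal sketch that reports the offset of a byte pattern; the helper name @code{find_bytes} is hypothetical, and the code assumes the GNU extension is available.

```c
#define _GNU_SOURCE   /* memmem is a GNU extension.  */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of
   the PATTERN_SIZE bytes at PATTERN within the DATA_SIZE bytes at
   DATA, or -1 if there is none.  */
ptrdiff_t
find_bytes (const void *data, size_t data_size,
            const void *pattern, size_t pattern_size)
{
  const char *match = memmem (data, data_size, pattern, pattern_size);

  return match != NULL ? match - (const char *) data : -1;
}
```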
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun size_t strspn (const char *@var{string}, const char *@var{skipset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strspn} (``string span'') function returns the length of the | 
|  | initial substring of @var{string} that consists entirely of bytes that | 
|  | are members of the set specified by the string @var{skipset}.  The order | 
|  | of the bytes in @var{skipset} is not important. | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz") | 
|  | @result{} 5 | 
|  | @end smallexample | 
|  |  | 
|  | In a multibyte string, characters consisting of | 
|  | more than one byte are not treated as single entities.  Each byte is treated | 
|  | separately.  The function is not locale-dependent. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun size_t wcsspn (const wchar_t *@var{wstring}, const wchar_t *@var{skipset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcsspn} (``wide character string span'') function returns the | 
|  | length of the initial substring of @var{wstring} that consists entirely | 
|  | of wide characters that are members of the set specified by the string | 
|  | @var{skipset}.  The order of the wide characters in @var{skipset} is not | 
|  | important. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun size_t strcspn (const char *@var{string}, const char *@var{stopset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strcspn} (``string complement span'') function returns the length | 
|  | of the initial substring of @var{string} that consists entirely of bytes | 
|  | that are @emph{not} members of the set specified by the string @var{stopset}. | 
|  | (In other words, it returns the offset of the first byte in @var{string} | 
|  | that is a member of the set @var{stopset}.) | 
|  |  | 
|  | For example, | 
|  | @smallexample | 
|  | strcspn ("hello, world", " \t\n,.;!?") | 
|  | @result{} 5 | 
|  | @end smallexample | 
|  |  | 
|  | In a multibyte string, characters consisting of | 
|  | more than one byte are not treated as single entities.  Each byte is treated | 
|  | separately.  The function is not locale-dependent. | 
|  | @end deftypefun | 
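Used together, @code{strspn} and @code{strcspn} can step through a string one token at a time without modifying it. The sketch below counts words this way; the helper name @code{count_words} is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the words in STRING, where a word is a
   run of bytes that are not members of DELIMITERS.  */
size_t
count_words (const char *string, const char *delimiters)
{
  size_t count = 0;
  const char *p = string;

  for (;;)
    {
      p += strspn (p, delimiters);    /* Skip leading delimiters.  */
      if (*p == '\0')
        break;
      count++;
      p += strcspn (p, delimiters);   /* Skip the word itself.  */
    }
  return count;
}
```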
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun size_t wcscspn (const wchar_t *@var{wstring}, const wchar_t *@var{stopset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcscspn} (``wide character string complement span'') function | 
|  | returns the length of the initial substring of @var{wstring} that | 
|  | consists entirely of wide characters that are @emph{not} members of the | 
|  | set specified by the string @var{stopset}.  (In other words, it returns | 
|  | the offset of the first wide character in @var{wstring} that is a member of | 
|  | the set @var{stopset}.) | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strpbrk (const char *@var{string}, const char *@var{stopset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{strpbrk} (``string pointer break'') function is related to | 
|  | @code{strcspn}, except that it returns a pointer to the first byte | 
|  | in @var{string} that is a member of the set @var{stopset} instead of the | 
|  | length of the initial substring.  It returns a null pointer if no such | 
|  | byte from @var{stopset} is found. | 
|  |  | 
|  | @c @group  Invalid outside the example. | 
|  | For example, | 
|  |  | 
|  | @smallexample | 
|  | strpbrk ("hello, world", " \t\n,.;!?") | 
|  | @result{} ", world" | 
|  | @end smallexample | 
|  | @c @end group | 
|  |  | 
|  | In a multibyte string, characters consisting of | 
|  | more than one byte are not treated as single entities.  Each byte is treated | 
|  | separately.  The function is not locale-dependent. | 
|  | @end deftypefun | 
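
Since @code{strpbrk} returns a pointer rather than a length, it is
convenient for walking over every match in turn.  The sketch below
counts the bytes of a string that belong to a set; the function name
@code{count_members} is an assumption made for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Return how many bytes of STRING are members of STOPSET.
   `count_members' is a hypothetical helper name.  */
size_t
count_members (const char *string, const char *stopset)
{
  size_t count = 0;
  const char *p = string;

  /* Each call resumes the search just past the previous match.  */
  while ((p = strpbrk (p, stopset)) != NULL)
    {
      count++;
      p++;
    }
  return count;
}
```

With the arguments from the example above, the loop finds the comma and
the space, so the count is 2.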
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcspbrk (const wchar_t *@var{wstring}, const wchar_t *@var{stopset}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{wcspbrk} (``wide character string pointer break'') function is | 
|  | related to @code{wcscspn}, except that it returns a pointer to the first | 
|  | wide character in @var{wstring} that is a member of the set | 
|  | @var{stopset} instead of the length of the initial substring.  It | 
|  | returns a null pointer if no such wide character from @var{stopset} is found. | 
|  | @end deftypefun | 
|  |  | 
|  |  | 
|  | @subsection Compatibility String Search Functions | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun {char *} index (const char *@var{string}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{index} is another name for @code{strchr}; they are exactly the same. | 
|  | New code should always use @code{strchr}, since this name is defined in | 
|  | @w{ISO C}, while @code{index} is a BSD invention that was never available | 
|  | on @w{System V}-derived systems. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun {char *} rindex (const char *@var{string}, int @var{c}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @code{rindex} is another name for @code{strrchr}; they are exactly the same. | 
|  | New code should always use @code{strrchr}, since this name is defined in | 
|  | @w{ISO C}, while @code{rindex} is a BSD invention that was never available | 
|  | on @w{System V}-derived systems. | 
|  | @end deftypefun | 
|  |  | 
|  | @node Finding Tokens in a String | 
|  | @section Finding Tokens in a String | 
|  |  | 
|  | @cindex tokenizing strings | 
|  | @cindex breaking a string into tokens | 
|  | @cindex parsing tokens from a string | 
|  | Programs often need to do simple kinds of lexical analysis and parsing, | 
|  | such as splitting a command string up into tokens.  You can do this with | 
|  | the @code{strtok} function, declared in the header file @file{string.h}. | 
|  | @pindex string.h | 
|  |  | 
|  | @comment string.h | 
|  | @comment ISO | 
|  | @deftypefun {char *} strtok (char *restrict @var{newstring}, const char *restrict @var{delimiters}) | 
|  | @safety{@prelim{}@mtunsafe{@mtasurace{:strtok}}@asunsafe{}@acsafe{}} | 
|  | A string can be split into tokens by making a series of calls to the | 
|  | function @code{strtok}. | 
|  |  | 
|  | The string to be split up is passed as the @var{newstring} argument on | 
|  | the first call only.  The @code{strtok} function uses this to set up | 
|  | some internal state information.  Subsequent calls to get additional | 
|  | tokens from the same string are indicated by passing a null pointer as | 
|  | the @var{newstring} argument.  Calling @code{strtok} with another | 
|  | non-null @var{newstring} argument reinitializes the state information. | 
|  | It is guaranteed that no other library function ever calls @code{strtok} | 
|  | behind your back (which would mess up this internal state information). | 
|  |  | 
|  | The @var{delimiters} argument is a string that specifies a set of delimiters | 
|  | that may surround the token being extracted.  All the initial bytes | 
|  | that are members of this set are discarded.  The first byte that is | 
|  | @emph{not} a member of this set of delimiters marks the beginning of the | 
|  | next token.  The end of the token is found by looking for the next | 
|  | byte that is a member of the delimiter set.  This byte in the | 
|  | original string @var{newstring} is overwritten by a null byte, and the | 
|  | pointer to the beginning of the token in @var{newstring} is returned. | 
|  |  | 
|  | On the next call to @code{strtok}, the searching begins at the next | 
|  | byte beyond the one that marked the end of the previous token. | 
|  | Note that the set of delimiters @var{delimiters} does not have to be the | 
|  | same on every call in a series of calls to @code{strtok}. | 
|  |  | 
|  | If the end of the string @var{newstring} is reached, or if the remainder of | 
|  | the string consists only of delimiter bytes, @code{strtok} returns | 
|  | a null pointer. | 
|  |  | 
|  | In a multibyte string, characters consisting of | 
|  | more than one byte are not treated as single entities.  Each byte is treated | 
|  | separately.  The function is not locale-dependent. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment wchar.h | 
|  | @comment ISO | 
|  | @deftypefun {wchar_t *} wcstok (wchar_t *@var{newstring}, const wchar_t *@var{delimiters}, wchar_t **@var{save_ptr}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | A string can be split into tokens by making a series of calls to the | 
|  | function @code{wcstok}. | 
|  |  | 
|  | The string to be split up is passed as the @var{newstring} argument on | 
|  | the first call only.  The @code{wcstok} function uses this to set up | 
|  | some internal state information.  Subsequent calls to get additional | 
|  | tokens from the same wide string are indicated by passing a | 
|  | null pointer as the @var{newstring} argument, which causes the pointer | 
|  | previously stored in @var{save_ptr} to be used instead. | 
|  |  | 
|  | The @var{delimiters} argument is a wide string that specifies | 
|  | a set of delimiters that may surround the token being extracted.  All | 
|  | the initial wide characters that are members of this set are discarded. | 
|  | The first wide character that is @emph{not} a member of this set of | 
|  | delimiters marks the beginning of the next token.  The end of the token | 
|  | is found by looking for the next wide character that is a member of the | 
|  | delimiter set.  This wide character in the original wide | 
|  | string @var{newstring} is overwritten by a null wide character, the | 
|  | pointer past the overwritten wide character is saved in @var{save_ptr}, | 
|  | and the pointer to the beginning of the token in @var{newstring} is | 
|  | returned. | 
|  |  | 
|  | On the next call to @code{wcstok}, the searching begins at the next | 
|  | wide character beyond the one that marked the end of the previous token. | 
|  | Note that the set of delimiters @var{delimiters} does not have to be the | 
|  | same on every call in a series of calls to @code{wcstok}. | 
|  |  | 
|  | If the end of the wide string @var{newstring} is reached, or | 
|  | if the remainder of the string consists only of delimiter wide characters, | 
|  | @code{wcstok} returns a null pointer. | 
|  | @end deftypefun | 
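
The calling sequence for @code{wcstok} can be sketched as a simple
loop.  The function name @code{count_wide_tokens} is hypothetical, and
the code assumes the three-argument @w{ISO C} form of @code{wcstok}
described above.

```c
#include <stddef.h>
#include <wchar.h>

/* Count the tokens in WSTRING, which is modified in the process.
   `count_wide_tokens' is a hypothetical helper name.  */
size_t
count_wide_tokens (wchar_t *wstring, const wchar_t *delimiters)
{
  size_t count = 0;
  wchar_t *save_ptr;

  /* Pass the wide string on the first call only; later calls pass a
     null pointer and continue from the position kept in save_ptr.  */
  for (wchar_t *token = wcstok (wstring, delimiters, &save_ptr);
       token != NULL;
       token = wcstok (NULL, delimiters, &save_ptr))
    count++;
  return count;
}
```

The argument must be a writable array, because each delimiter that ends
a token is overwritten with a null wide character.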
|  |  | 
|  | @strong{Warning:} Since @code{strtok} and @code{wcstok} alter the string | 
|  | they are parsing, you should always copy the string to a temporary buffer | 
|  | before parsing it with @code{strtok}/@code{wcstok} (@pxref{Copying Strings | 
|  | and Arrays}).  If you allow @code{strtok} or @code{wcstok} to modify | 
|  | a string that came from another part of your program, you are asking for | 
|  | trouble; that string might be used for other purposes after | 
|  | @code{strtok} or @code{wcstok} has modified it, and it would not have | 
|  | the expected value. | 
|  |  | 
|  | The string that you are operating on might even be a constant.  Then | 
|  | when @code{strtok} or @code{wcstok} tries to modify it, your program | 
|  | will get a fatal signal for writing in read-only memory.  @xref{Program | 
|  | Error Signals}.  Even if the operation of @code{strtok} or @code{wcstok} | 
|  | would not require a modification of the string (e.g., if there is | 
|  | exactly one token) the string can (and in the @glibcadj{} case will) be | 
|  | modified. | 
|  |  | 
|  | This is a special case of a general principle: if a part of a program | 
|  | does not have as its purpose the modification of a certain data | 
|  | structure, then it is error-prone to modify the data structure | 
|  | temporarily. | 
|  |  | 
|  | The function @code{strtok} is not reentrant, whereas @code{wcstok} is. | 
|  | @xref{Nonreentrancy}, for a discussion of where and why reentrancy is | 
|  | important. | 
|  |  | 
|  | Here is a simple example showing the use of @code{strtok}. | 
|  |  | 
|  | @comment Yes, this example has been tested. | 
|  | @smallexample | 
|  | #include <string.h> | 
|  | #include <stddef.h> | 
|  |  | 
|  | @dots{} | 
|  |  | 
|  | const char string[] = "words separated by spaces -- and, punctuation!"; | 
|  | const char delimiters[] = " .,;:!-"; | 
|  | char *token, *cp; | 
|  |  | 
|  | @dots{} | 
|  |  | 
|  | cp = strdupa (string);                /* Make writable copy.  */ | 
|  | token = strtok (cp, delimiters);      /* token => "words" */ | 
|  | token = strtok (NULL, delimiters);    /* token => "separated" */ | 
|  | token = strtok (NULL, delimiters);    /* token => "by" */ | 
|  | token = strtok (NULL, delimiters);    /* token => "spaces" */ | 
|  | token = strtok (NULL, delimiters);    /* token => "and" */ | 
|  | token = strtok (NULL, delimiters);    /* token => "punctuation" */ | 
|  | token = strtok (NULL, delimiters);    /* token => NULL */ | 
|  | @end smallexample | 
|  |  | 
|  | @Theglibc{} contains two more functions for tokenizing a string | 
|  | which overcome the limitation of non-reentrancy.  They are not | 
|  | available for wide strings. | 
|  |  | 
|  | @comment string.h | 
|  | @comment POSIX | 
|  | @deftypefun {char *} strtok_r (char *@var{newstring}, const char *@var{delimiters}, char **@var{save_ptr}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | Just like @code{strtok}, this function splits the string into several | 
|  | tokens which can be accessed by successive calls to @code{strtok_r}. | 
|  | The difference is that, as in @code{wcstok}, the information about the | 
|  | next token is stored in the space pointed to by the third argument, | 
|  | @var{save_ptr}, which is a pointer to a string pointer.  Calling | 
|  | @code{strtok_r} with a null pointer for @var{newstring} and leaving | 
|  | @var{save_ptr} between the calls unchanged does the job without | 
|  | hindering reentrancy. | 
|  |  | 
|  | This function is defined in POSIX.1 and can be found on many systems | 
|  | which support multi-threading. | 
|  | @end deftypefun | 
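
Keeping the state in caller-supplied storage is what allows one
tokenization to run inside another, which @code{strtok} cannot do.  The
following sketch splits a string into records at @code{';'} and each
record into fields at spaces; the function name @code{count_fields} and
the choice of delimiters are assumptions made for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/* Count the space-separated fields in all ';'-separated records of
   TEXT, which is modified in the process.  `count_fields' is a
   hypothetical helper name.  */
size_t
count_fields (char *text)
{
  size_t count = 0;
  char *outer_state, *inner_state;

  /* The two loops keep their positions in separate variables, so the
     inner scan does not disturb the outer one.  */
  for (char *record = strtok_r (text, ";", &outer_state);
       record != NULL;
       record = strtok_r (NULL, ";", &outer_state))
    for (char *field = strtok_r (record, " ", &inner_state);
         field != NULL;
         field = strtok_r (NULL, " ", &inner_state))
      count++;
  return count;
}
```

Written with @code{strtok}, the inner loop would overwrite the single
internal state and the outer loop would stop after the first record.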
|  |  | 
|  | @comment string.h | 
|  | @comment BSD | 
|  | @deftypefun {char *} strsep (char **@var{string_ptr}, const char *@var{delimiter}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This function has functionality similar to that of @code{strtok_r}, with the | 
|  | @var{newstring} argument replaced by the @var{save_ptr} argument.  The | 
|  | initialization of the moving pointer has to be done by the user. | 
|  | Successive calls to @code{strsep} move the pointer along the tokens | 
|  | separated by @var{delimiter}, returning the address of the next token | 
|  | and updating @var{string_ptr} to point to the beginning of the next | 
|  | token. | 
|  |  | 
|  | One difference between @code{strsep} and @code{strtok_r} is that if the | 
|  | input string contains more than one byte from @var{delimiter} in a | 
|  | row, @code{strsep} returns an empty string for each pair of adjacent bytes | 
|  | from @var{delimiter}.  This means that a program normally should test | 
|  | for @code{strsep} returning an empty string before processing it. | 
|  |  | 
|  | This function was introduced in 4.3BSD and therefore is widely available. | 
|  | @end deftypefun | 
|  |  | 
|  | Here is how the above example looks when @code{strsep} is used. | 
|  |  | 
|  | @comment Yes, this example has been tested. | 
|  | @smallexample | 
|  | #include <string.h> | 
|  | #include <stddef.h> | 
|  |  | 
|  | @dots{} | 
|  |  | 
|  | const char string[] = "words separated by spaces -- and, punctuation!"; | 
|  | const char delimiters[] = " .,;:!-"; | 
|  | char *running; | 
|  | char *token; | 
|  |  | 
|  | @dots{} | 
|  |  | 
|  | running = strdupa (string); | 
|  | token = strsep (&running, delimiters);    /* token => "words" */ | 
|  | token = strsep (&running, delimiters);    /* token => "separated" */ | 
|  | token = strsep (&running, delimiters);    /* token => "by" */ | 
|  | token = strsep (&running, delimiters);    /* token => "spaces" */ | 
|  | token = strsep (&running, delimiters);    /* token => "" */ | 
|  | token = strsep (&running, delimiters);    /* token => "" */ | 
|  | token = strsep (&running, delimiters);    /* token => "" */ | 
|  | token = strsep (&running, delimiters);    /* token => "and" */ | 
|  | token = strsep (&running, delimiters);    /* token => "" */ | 
|  | token = strsep (&running, delimiters);    /* token => "punctuation" */ | 
|  | token = strsep (&running, delimiters);    /* token => "" */ | 
|  | token = strsep (&running, delimiters);    /* token => NULL */ | 
|  | @end smallexample | 
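
In practice the calls are usually made in a loop that discards the
empty tokens.  The following is a minimal sketch of that pattern; the
function name @code{count_nonempty_tokens} is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>

/* Count the non-empty tokens in TEXT, which is modified in the
   process.  `count_nonempty_tokens' is a hypothetical helper name.  */
size_t
count_nonempty_tokens (char *text, const char *delimiters)
{
  size_t count = 0;
  char *running = text;
  char *token;

  /* strsep returns a null pointer once the whole string is used up.  */
  while ((token = strsep (&running, delimiters)) != NULL)
    if (*token != '\0')         /* Skip the empty strings.  */
      count++;
  return count;
}
```

Applied to a writable copy of the string from the example above, this
counts the six words and ignores the five empty tokens.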
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} basename (const char *@var{filename}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The GNU version of the @code{basename} function returns the last | 
|  | component of the path in @var{filename}.  This version is preferred, | 
|  | since it does not modify the argument, @var{filename}, and | 
|  | respects trailing slashes.  The prototype for @code{basename} can be | 
|  | found in @file{string.h}.  Note that this function is overridden by the XPG | 
|  | version if @file{libgen.h} is included. | 
|  |  | 
|  | Example of using GNU @code{basename}: | 
|  |  | 
|  | @smallexample | 
|  | #define _GNU_SOURCE     /* For the GNU basename.  */ | 
|  | #include <stdio.h> | 
|  | #include <stdlib.h> | 
|  | #include <string.h> | 
|  |  | 
|  | int | 
|  | main (int argc, char *argv[]) | 
|  | @{ | 
|  |   char *prog = basename (argv[0]); | 
|  |  | 
|  |   if (argc < 2) | 
|  |     @{ | 
|  |       fprintf (stderr, "Usage %s <arg>\n", prog); | 
|  |       exit (1); | 
|  |     @} | 
|  |  | 
|  |   @dots{} | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | @strong{Portability Note:} This function may produce different results | 
|  | on different systems. | 
|  |  | 
|  | @end deftypefun | 
|  |  | 
|  | @comment libgen.h | 
|  | @comment XPG | 
|  | @deftypefun {char *} basename (char *@var{path}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | This is the standard XPG defined @code{basename}.  It is similar in | 
|  | spirit to the GNU version, but may modify the @var{path} by removing | 
|  | trailing '/' bytes.  If the @var{path} is made up entirely of '/' | 
|  | bytes, then "/" will be returned.  Also, if @var{path} is | 
|  | @code{NULL} or an empty string, then "." is returned.  The prototype for | 
|  | the XPG version can be found in @file{libgen.h}. | 
|  |  | 
|  | Example of using XPG @code{basename}: | 
|  |  | 
|  | @smallexample | 
|  | #define _GNU_SOURCE     /* For strdupa.  */ | 
|  | #include <libgen.h> | 
|  | #include <stdio.h> | 
|  | #include <stdlib.h> | 
|  | #include <string.h> | 
|  |  | 
|  | int | 
|  | main (int argc, char *argv[]) | 
|  | @{ | 
|  |   char *prog; | 
|  |   char *path = strdupa (argv[0]); | 
|  |  | 
|  |   prog = basename (path); | 
|  |  | 
|  |   if (argc < 2) | 
|  |     @{ | 
|  |       fprintf (stderr, "Usage %s <arg>\n", prog); | 
|  |       exit (1); | 
|  |     @} | 
|  |  | 
|  |   @dots{} | 
|  | @} | 
|  | @end smallexample | 
|  | @end deftypefun | 
|  |  | 
|  | @comment libgen.h | 
|  | @comment XPG | 
|  | @deftypefun {char *} dirname (char *@var{path}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{dirname} function is the complement to the XPG version of | 
|  | @code{basename}.  It returns the parent directory of the file specified | 
|  | by @var{path}.  If @var{path} is @code{NULL}, an empty string, or | 
|  | contains no '/' bytes, then "." is returned.  The prototype for this | 
|  | function can be found in @file{libgen.h}. | 
|  | @end deftypefun | 
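
Because @code{dirname} and the XPG @code{basename} may modify their
argument, and may return a pointer to storage other than the argument,
a cautious caller gives each function its own copy of the path.  The
sketch below does this with caller-supplied buffers; the function name
@code{split_path} and its interface are assumptions made for this
illustration.

```c
#include <libgen.h>
#include <stddef.h>
#include <string.h>

/* Store the directory part of PATH in DIR and its last component in
   BASE; each buffer holds SIZE bytes.  Return 0 on success, or -1 if
   the buffers are too small.  `split_path' is a hypothetical helper
   name.  */
int
split_path (const char *path, char *dir, char *base, size_t size)
{
  /* A result is never longer than PATH, except that "." may be
     returned for a short input, so two bytes are the minimum.  */
  if (size < 2 || strlen (path) >= size)
    return -1;

  /* Each function works on its own copy.  The result may lie inside
     the copy or elsewhere, so memmove puts it at the buffer start.  */
  strcpy (dir, path);
  char *d = dirname (dir);
  memmove (dir, d, strlen (d) + 1);

  strcpy (base, path);
  char *b = basename (base);
  memmove (base, b, strlen (b) + 1);
  return 0;
}
```

For @file{/usr/lib/libc.so} this yields @file{/usr/lib} and
@file{libc.so}; for a name with no slash, the directory part is
@file{.}.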
|  |  | 
|  | @node strfry | 
|  | @section strfry | 
|  |  | 
|  | The function below addresses the perennial programming quandary: ``How do | 
|  | I take good data in string form and painlessly turn it into garbage?'' | 
|  | This is actually a fairly simple task for C programmers who do not use | 
|  | @theglibc{} string functions, but for programs based on @theglibc{}, | 
|  | the @code{strfry} function is the preferred method for | 
|  | destroying string data. | 
|  |  | 
|  | The prototype for this function is in @file{string.h}. | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} strfry (char *@var{string}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | @c Calls initstate_r, time, getpid, strlen, and random_r. | 
|  |  | 
|  | @code{strfry} creates a pseudorandom anagram of a string, replacing the | 
|  | input with the anagram in place.  For each position in the string, | 
|  | @code{strfry} swaps it with a position in the string selected at random | 
|  | (from a uniform distribution).  The two positions may be the same. | 
|  |  | 
|  | The return value of @code{strfry} is always @var{string}. | 
|  |  | 
|  | @strong{Portability Note:}  This function is unique to @theglibc{}. | 
|  |  | 
|  | @end deftypefun | 
|  |  | 
|  |  | 
|  | @node Trivial Encryption | 
|  | @section Trivial Encryption | 
|  | @cindex encryption | 
|  |  | 
|  |  | 
|  | The @code{memfrob} function converts an array of data to something | 
|  | unrecognizable and back again.  It is not encryption in its usual sense | 
|  | since it is easy for someone to convert the encrypted data back to clear | 
|  | text.  The transformation is analogous to Usenet's ``Rot13'' encryption | 
|  | method for obscuring offensive jokes from sensitive eyes and such. | 
|  | Unlike Rot13, @code{memfrob} works on arbitrary binary data, not just | 
|  | text. | 
|  | @cindex Rot13 | 
|  |  | 
|  | For true encryption, see @ref{Cryptographic Functions}. | 
|  |  | 
|  | This function is declared in @file{string.h}. | 
|  | @pindex string.h | 
|  |  | 
|  | @comment string.h | 
|  | @comment GNU | 
|  | @deftypefun {void *} memfrob (void *@var{mem}, size_t @var{length}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  |  | 
|  | @code{memfrob} transforms (frobnicates) each byte of the data structure | 
|  | at @var{mem}, which is @var{length} bytes long, by bitwise exclusive | 
|  | oring it with binary 00101010.  It does the transformation in place and | 
|  | its return value is always @var{mem}. | 
|  |  | 
|  | Note that applying @code{memfrob} a second time to the same data structure | 
|  | returns it to its original state. | 
|  |  | 
|  | This is a good function for hiding information from someone who doesn't | 
|  | want to see it or doesn't want to see it very much.  To really prevent | 
|  | people from retrieving the information, use stronger encryption such as | 
|  | that described in @ref{Cryptographic Functions}. | 
|  |  | 
|  | @strong{Portability Note:}  This function is unique to @theglibc{}. | 
|  |  | 
|  | @end deftypefun | 
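
The transformation described above is simple enough to write out by
hand.  The following sketch is not the library's implementation; it is
a portable equivalent under the hypothetical name @code{frob_bytes},
shown only to make clear why a second application restores the data.

```c
#include <stddef.h>

/* XOR each of the LENGTH bytes at MEM with 42 (binary 00101010) and
   return MEM.  `frob_bytes' is a hypothetical stand-in written for
   illustration; it is not the library function.  */
void *
frob_bytes (void *mem, size_t length)
{
  unsigned char *p = mem;

  for (size_t i = 0; i < length; i++)
    p[i] ^= 42;
  return mem;
}
```

Exclusive-or with a fixed value is its own inverse, since
@code{(x ^ 42) ^ 42} equals @code{x} for every byte @code{x}.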
|  |  | 
|  | @node Encode Binary Data | 
|  | @section Encode Binary Data | 
|  |  | 
|  | To store or transfer binary data in environments which only support text, | 
|  | one has to encode the binary data by mapping the input bytes to | 
|  | bytes in the range allowed for storing or transferring.  SVID | 
|  | systems (and nowadays XPG compliant systems) provide minimal support for | 
|  | this task. | 
|  |  | 
|  | @comment stdlib.h | 
|  | @comment XPG | 
|  | @deftypefun {char *} l64a (long int @var{n}) | 
|  | @safety{@prelim{}@mtunsafe{@mtasurace{:l64a}}@asunsafe{}@acsafe{}} | 
|  | This function encodes a 32-bit input value using bytes from the | 
|  | basic character set.  It returns a pointer to a 7-byte buffer which | 
|  | contains an encoded version of @var{n}.  To encode a series of bytes the | 
|  | user must copy the returned string to a destination buffer.  It returns | 
|  | the empty string if @var{n} is zero, which is somewhat bizarre but | 
|  | mandated by the standard.@* | 
|  | @strong{Warning:} Since a static buffer is used this function should not | 
|  | be used in multi-threaded programs.  There is no thread-safe alternative | 
|  | to this function in the C library.@* | 
|  | @strong{Compatibility Note:} The XPG standard states that the return | 
|  | value of @code{l64a} is undefined if @var{n} is negative.  In the GNU | 
|  | implementation, @code{l64a} treats its argument as unsigned, so it will | 
|  | return a sensible encoding for any nonzero @var{n}; however, portable | 
|  | programs should not rely on this. | 
|  |  | 
|  | To encode a large buffer @code{l64a} must be called in a loop, once for | 
|  | each 32-bit word of the buffer.  For example, one could do something | 
|  | like this: | 
|  |  | 
|  | @smallexample | 
|  | #define _GNU_SOURCE     /* For stpcpy and mempcpy.  */ | 
|  | #include <stdlib.h> | 
|  | #include <string.h> | 
|  | #include <arpa/inet.h>  /* For htonl.  */ | 
|  |  | 
|  | char * | 
|  | encode (const void *buf, size_t len) | 
|  | @{ | 
|  |   /* @r{We know in advance how long the buffer has to be.} */ | 
|  |   const unsigned char *in = (const unsigned char *) buf; | 
|  |   char *out = malloc (6 + ((len + 3) / 4) * 6 + 1); | 
|  |   char *cp = out, *p; | 
|  |  | 
|  |   if (out == NULL) | 
|  |     return NULL; | 
|  |  | 
|  |   /* @r{Encode the length.} */ | 
|  |   /* @r{Using `htonl' is necessary so that the data can be} | 
|  |      @r{decoded even on machines with different byte order.} | 
|  |      @r{`l64a' can return a string shorter than 6 bytes, so } | 
|  |      @r{we pad it with encoding of 0 (}'.'@r{) at the end by } | 
|  |      @r{hand.} */ | 
|  |  | 
|  |   p = stpcpy (cp, l64a (htonl (len))); | 
|  |   cp = mempcpy (p, "......", 6 - (p - cp)); | 
|  |  | 
|  |   while (len > 3) | 
|  |     @{ | 
|  |       unsigned long int n = *in++; | 
|  |       n = (n << 8) | *in++; | 
|  |       n = (n << 8) | *in++; | 
|  |       n = (n << 8) | *in++; | 
|  |       len -= 4; | 
|  |       p = stpcpy (cp, l64a (htonl (n))); | 
|  |       cp = mempcpy (p, "......", 6 - (p - cp)); | 
|  |     @} | 
|  |   if (len > 0) | 
|  |     @{ | 
|  |       unsigned long int n = *in++; | 
|  |       if (--len > 0) | 
|  |         @{ | 
|  |           n = (n << 8) | *in++; | 
|  |           if (--len > 0) | 
|  |             n = (n << 8) | *in; | 
|  |         @} | 
|  |       cp = stpcpy (cp, l64a (htonl (n))); | 
|  |     @} | 
|  |   *cp = '\0'; | 
|  |   return out; | 
|  | @} | 
|  | @end smallexample | 
|  |  | 
|  | It is strange that the library does not provide the complete | 
|  | functionality needed but so be it. | 
|  |  | 
|  | @end deftypefun | 
|  |  | 
|  | To decode data produced with @code{l64a} the following function should be | 
|  | used. | 
|  |  | 
|  | @comment stdlib.h | 
|  | @comment XPG | 
|  | @deftypefun {long int} a64l (const char *@var{string}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The parameter @var{string} should contain a string which was produced by | 
|  | a call to @code{l64a}.  The function processes at most 6 bytes of | 
|  | this string, and decodes the bytes it finds according to the table | 
|  | below.  It stops decoding when it finds a byte not in the table, | 
|  | rather like @code{atoi}; if you have a buffer which has been broken into | 
|  | lines, you must be careful to skip over the end-of-line bytes. | 
|  |  | 
|  | The decoded number is returned as a @code{long int} value. | 
|  | @end deftypefun | 
|  |  | 
|  | The @code{l64a} and @code{a64l} functions use a base 64 encoding, in | 
|  | which each byte of an encoded string represents six bits of an | 
|  | input word.  These symbols are used for the base 64 digits: | 
|  |  | 
|  | @multitable {xxxxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx} {xxx} | 
|  | @item              @tab 0 @tab 1 @tab 2 @tab 3 @tab 4 @tab 5 @tab 6 @tab 7 | 
|  | @item       0      @tab @code{.} @tab @code{/} @tab @code{0} @tab @code{1} | 
|  | @tab @code{2} @tab @code{3} @tab @code{4} @tab @code{5} | 
|  | @item       8      @tab @code{6} @tab @code{7} @tab @code{8} @tab @code{9} | 
|  | @tab @code{A} @tab @code{B} @tab @code{C} @tab @code{D} | 
|  | @item       16     @tab @code{E} @tab @code{F} @tab @code{G} @tab @code{H} | 
|  | @tab @code{I} @tab @code{J} @tab @code{K} @tab @code{L} | 
|  | @item       24     @tab @code{M} @tab @code{N} @tab @code{O} @tab @code{P} | 
|  | @tab @code{Q} @tab @code{R} @tab @code{S} @tab @code{T} | 
|  | @item       32     @tab @code{U} @tab @code{V} @tab @code{W} @tab @code{X} | 
|  | @tab @code{Y} @tab @code{Z} @tab @code{a} @tab @code{b} | 
|  | @item       40     @tab @code{c} @tab @code{d} @tab @code{e} @tab @code{f} | 
|  | @tab @code{g} @tab @code{h} @tab @code{i} @tab @code{j} | 
|  | @item       48     @tab @code{k} @tab @code{l} @tab @code{m} @tab @code{n} | 
|  | @tab @code{o} @tab @code{p} @tab @code{q} @tab @code{r} | 
|  | @item       56     @tab @code{s} @tab @code{t} @tab @code{u} @tab @code{v} | 
|  | @tab @code{w} @tab @code{x} @tab @code{y} @tab @code{z} | 
|  | @end multitable | 
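
As a small illustration of the table, the sketch below encodes one
value into a fixed six-digit field and lets @code{a64l} decode it
again.  The helper name @code{encode_word} is hypothetical, and the
code assumes that @var{n} is non-negative and fits in 32 bits.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Encode N into OUT as exactly six base 64 digits followed by a null
   byte.  N is assumed to be non-negative and to fit in 32 bits.
   `encode_word' is a hypothetical helper name.  */
void
encode_word (long int n, char out[7])
{
  /* l64a returns a pointer to a static buffer holding at most six
     digits, so the result is copied out right away.  */
  const char *digits = l64a (n);
  size_t len = strlen (digits);

  memcpy (out, digits, len);
  /* The least significant digit comes first, so padding at the end
     with '.', the digit for 0, does not change the value.  */
  memset (out + len, '.', 6 - len);
  out[6] = '\0';
}
```

With this padding the value 1 becomes @code{"/....."} and 0 becomes
@code{"......"}, and passing either string to @code{a64l} returns the
original number.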
|  |  | 
|  | This encoding scheme is not standard.  There are some other encoding | 
|  | methods which are much more widely used (UU encoding, MIME encoding). | 
|  | Generally, it is better to use one of these encodings. | 
|  |  | 
|  | @node Argz and Envz Vectors | 
|  | @section Argz and Envz Vectors | 
|  |  | 
|  | @cindex argz vectors (string vectors) | 
|  | @cindex string vectors, null-byte separated | 
|  | @cindex argument vectors, null-byte separated | 
|  | @dfn{argz vectors} are vectors of strings in a contiguous block of | 
|  | memory, each element separated from its neighbors by null bytes | 
|  | (@code{'\0'}). | 
|  |  | 
|  | @cindex envz vectors (environment vectors) | 
|  | @cindex environment vectors, null-byte separated | 
|  | @dfn{Envz vectors} are an extension of argz vectors where each element is a | 
|  | name-value pair, separated by a @code{'='} byte (as in a Unix | 
|  | environment). | 
|  |  | 
|  | @menu | 
|  | * Argz Functions::              Operations on argz vectors. | 
|  | * Envz Functions::              Additional operations on environment vectors. | 
|  | @end menu | 
|  |  | 
|  | @node Argz Functions, Envz Functions, , Argz and Envz Vectors | 
|  | @subsection Argz Functions | 
|  |  | 
|  | Each argz vector is represented by a pointer to the first element, of | 
|  | type @code{char *}, and a size, of type @code{size_t}, both of which can | 
|  | be initialized to @code{0} to represent an empty argz vector.  All argz | 
|  | functions accept either a pointer and a size argument, or pointers to | 
|  | them, if they will be modified. | 
|  |  | 
|  | The argz functions use @code{malloc}/@code{realloc} to allocate/grow | 
|  | argz vectors, and so any argz vector created using these functions may | 
|  | be freed by using @code{free}; conversely, any argz function that may | 
|  | grow a string expects that string to have been allocated using | 
|  | @code{malloc} (those argz functions that only examine their arguments or | 
|  | modify them in place will work on any sort of memory). | 
|  | @xref{Unconstrained Allocation}. | 
|  |  | 
|  | All argz functions that do memory allocation have a return type of | 
|  | @code{error_t}, and return @code{0} for success, and @code{ENOMEM} if an | 
|  | allocation error occurs. | 
|  |  | 
|  | @pindex argz.h | 
|  | These functions are declared in the standard include file @file{argz.h}. | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_create (char *const @var{argv}[], char **@var{argz}, size_t *@var{argz_len}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{argz_create} function converts the Unix-style argument vector | 
|  | @var{argv} (a vector of pointers to normal C strings, terminated by | 
|  | @code{(char *)0}; @pxref{Program Arguments}) into an argz vector with | 
|  | the same elements, which is returned in @var{argz} and @var{argz_len}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_create_sep (const char *@var{string}, int @var{sep}, char **@var{argz}, size_t *@var{argz_len}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{argz_create_sep} function converts the string | 
|  | @var{string} into an argz vector (returned in @var{argz} and | 
|  | @var{argz_len}) by splitting it into elements at every occurrence of the | 
|  | byte @var{sep}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {size_t} argz_count (const char *@var{argz}, size_t @var{argz_len}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | Returns the number of elements in the argz vector @var{argz} and | 
|  | @var{argz_len}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {void} argz_extract (const char *@var{argz}, size_t @var{argz_len}, char **@var{argv}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{argz_extract} function converts the argz vector @var{argz} and | 
|  | @var{argz_len} into a Unix-style argument vector stored in @var{argv}, | 
|  | by putting pointers to every element in @var{argz} into successive | 
|  | positions in @var{argv}, followed by a terminator of @code{0}. | 
|  | The vector @var{argv} must be pre-allocated with enough space to hold all the | 
|  | elements in @var{argz} plus the terminating @code{(char *)0} | 
|  | (@code{(argz_count (@var{argz}, @var{argz_len}) + 1) * sizeof (char *)} | 
|  | bytes should be enough).  Note that the string pointers stored into | 
|  | @var{argv} point into @var{argz}---they are not copies---and so | 
|  | @var{argz} must be copied if it will be changed while @var{argv} is | 
|  | still active.  This function is useful for passing the elements in | 
|  | @var{argz} to an exec function (@pxref{Executing a File}). | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {void} argz_stringify (char *@var{argz}, size_t @var{len}, int @var{sep}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{argz_stringify} function converts @var{argz} into a normal string with | 
|  | the elements separated by the byte @var{sep}, by replacing each | 
|  | @code{'\0'} inside @var{argz} (except the last one, which terminates the | 
|  | string) with @var{sep}.  This is handy for printing @var{argz} in a | 
|  | readable manner. | 
|  | @end deftypefun | 
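
The functions described so far can be combined to split a string at one
separator and join it again with another.  The following is a minimal
sketch that assumes a system providing @file{argz.h}; the function name
@code{rejoin} and its interface are hypothetical.

```c
#define _GNU_SOURCE
#include <argz.h>
#include <stdlib.h>
#include <string.h>

/* Split STRING at OLD_SEP, store the number of elements in *COUNT,
   and return the elements joined by NEW_SEP in a string allocated
   with malloc.  Return a null pointer if allocation fails or if
   there are no elements.  `rejoin' is a hypothetical helper name.  */
char *
rejoin (const char *string, int old_sep, int new_sep, size_t *count)
{
  char *argz = NULL;
  size_t argz_len = 0;

  if (argz_create_sep (string, old_sep, &argz, &argz_len) != 0)
    return NULL;
  *count = argz_count (argz, argz_len);

  /* After this call the vector is an ordinary null-terminated string
     that the caller releases with free.  */
  argz_stringify (argz, argz_len, new_sep);
  return argz;
}
```

For example, calling @code{rejoin} on a search path with @code{':'} as
the old separator and a space as the new one gives a string that is
convenient to print.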
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_add (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Calls strlen and argz_append. | 
|  | The @code{argz_add} function adds the string @var{str} to the end of the | 
|  | argz vector @code{*@var{argz}}, and updates @code{*@var{argz}} and | 
|  | @code{*@var{argz_len}} accordingly. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_add_sep (char **@var{argz}, size_t *@var{argz_len}, const char *@var{str}, int @var{delim}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{argz_add_sep} function is similar to @code{argz_add}, but | 
|  | @var{str} is split into separate elements in the result at occurrences of | 
|  | the byte @var{delim}.  This is useful, for instance, for | 
|  | adding the components of a Unix search path to an argz vector, by using | 
|  | a value of @code{':'} for @var{delim}. | 
|  | @end deftypefun | 
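
The search-path case mentioned above could be sketched as follows.  The
helper name @code{count_path_dirs} is hypothetical; it only counts the
resulting elements and then discards the vector.

```c
#include <argz.h>
#include <stdlib.h>

/* Hypothetical helper: split the colon-separated list PATH into an
   argz vector and return the number of elements, or (size_t) -1 if
   memory runs out.  */
size_t
count_path_dirs (const char *path)
{
  char *argz = NULL;
  size_t argz_len = 0;
  size_t n;

  if (argz_add_sep (&argz, &argz_len, path, ':') != 0)
    {
      free (argz);
      return (size_t) -1;
    }

  n = argz_count (argz, argz_len);
  free (argz);
  return n;
}
```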
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_append (char **@var{argz}, size_t *@var{argz_len}, const char *@var{buf}, size_t @var{buf_len}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{argz_append} function appends @var{buf_len} bytes starting at | 
|  | @var{buf} to the argz vector @code{*@var{argz}}, reallocating | 
|  | @code{*@var{argz}} to accommodate it, and adding @var{buf_len} to | 
|  | @code{*@var{argz_len}}. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {void} argz_delete (char **@var{argz}, size_t *@var{argz_len}, char *@var{entry}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Calls free if no argument is left. | 
|  | If @var{entry} points to the beginning of one of the elements in the | 
|  | argz vector @code{*@var{argz}}, the @code{argz_delete} function will | 
|  | remove this entry and reallocate @code{*@var{argz}}, modifying | 
|  | @code{*@var{argz}} and @code{*@var{argz_len}} accordingly.  Note that as | 
|  | destructive argz functions usually reallocate their argz argument, | 
|  | pointers into argz vectors such as @var{entry} will then become invalid. | 
|  | @end deftypefun | 
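
Because pointers into the vector may become invalid, a loop that deletes
an element should not keep using its iteration pointer afterwards.  The
following sketch, with the hypothetical helper name
@code{argz_delete_string}, stops iterating as soon as it has deleted
something.

```c
#include <argz.h>
#include <string.h>

/* Hypothetical helper: remove the first element of *ARGZ that is
   equal to STR.  Returns 1 if an element was removed, 0 otherwise.  */
int
argz_delete_string (char **argz, size_t *argz_len, const char *str)
{
  char *entry = NULL;

  while ((entry = argz_next (*argz, *argz_len, entry)) != NULL)
    if (strcmp (entry, str) == 0)
      {
        /* ENTRY may be invalid once argz_delete returns, so return
           instead of passing it to argz_next again.  */
        argz_delete (argz, argz_len, entry);
        return 1;
      }

  return 0;
}
```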
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} argz_insert (char **@var{argz}, size_t *@var{argz_len}, char *@var{before}, const char *@var{entry}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Calls argz_add or realloc and memmove. | 
|  | The @code{argz_insert} function inserts the string @var{entry} into the | 
|  | argz vector @code{*@var{argz}} at a point just before the existing | 
|  | element pointed to by @var{before}, reallocating @code{*@var{argz}} and | 
|  | updating @code{*@var{argz}} and @code{*@var{argz_len}}.  If @var{before} | 
|  | is @code{0}, @var{entry} is added to the end instead (as if by | 
|  | @code{argz_add}).  Since the first element is in fact the same as | 
|  | @code{*@var{argz}}, passing in @code{*@var{argz}} as the value of | 
|  | @var{before} will result in @var{entry} being inserted at the beginning. | 
|  | @end deftypefun | 
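
The two insertion positions described above might be exercised like
this.  The function name @code{insert_demo} and the sample strings are
illustrative assumptions.

```c
#include <argz.h>
#include <stdlib.h>

/* Hypothetical demonstration: start from the elements "one" and
   "three", insert "two" between them and "zero" at the front, then
   return the vector flattened into a malloc'd string (NULL on
   allocation failure).  */
char *
insert_demo (void)
{
  char *argz = NULL;
  size_t argz_len = 0;
  char *second;

  if (argz_add (&argz, &argz_len, "one") != 0
      || argz_add (&argz, &argz_len, "three") != 0)
    goto fail;

  /* Find the element after the first one and insert before it.  */
  second = argz_next (argz, argz_len, argz);
  if (argz_insert (&argz, &argz_len, second, "two") != 0)
    goto fail;

  /* Passing the vector itself as BEFORE inserts at the beginning.  */
  if (argz_insert (&argz, &argz_len, argz, "zero") != 0)
    goto fail;

  argz_stringify (argz, argz_len, ' ');
  return argz;

fail:
  free (argz);
  return NULL;
}
```

Note that @code{second} is looked up immediately before it is used,
since each successful insertion may move the vector.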
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} argz_next (const char *@var{argz}, size_t @var{argz_len}, const char *@var{entry}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{argz_next} function provides a convenient way of iterating | 
|  | over the elements in the argz vector @var{argz}.  It returns a pointer | 
|  | to the next element in @var{argz} after the element @var{entry}, or | 
|  | @code{0} if there are no elements following @var{entry}.  If @var{entry} | 
|  | is @code{0}, the first element of @var{argz} is returned. | 
|  |  | 
|  | This behavior suggests two styles of iteration: | 
|  |  | 
|  | @smallexample | 
|  | char *entry = 0; | 
|  | while ((entry = argz_next (@var{argz}, @var{argz_len}, entry))) | 
|  |   @var{action}; | 
|  | @end smallexample | 
|  |  | 
|  | (the double parentheses suppress the warning that some C compilers | 
|  | issue for an assignment used as a @code{while}-test) and: | 
|  |  | 
|  | @smallexample | 
|  | char *entry; | 
|  | for (entry = @var{argz}; | 
|  |      entry; | 
|  |      entry = argz_next (@var{argz}, @var{argz_len}, entry)) | 
|  |   @var{action}; | 
|  | @end smallexample | 
|  |  | 
|  | Note that the latter depends on @var{argz} having a value of @code{0} if | 
|  | it is empty (rather than a pointer to an empty block of memory); this | 
|  | invariant is maintained for argz vectors created by the functions here. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment argz.h | 
|  | @comment GNU | 
|  | @deftypefun error_t argz_replace (@w{char **@var{argz}, size_t *@var{argz_len}}, @w{const char *@var{str}, const char *@var{with}}, @w{unsigned *@var{replace_count}}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{argz_replace} function replaces any occurrences of the string | 
|  | @var{str} in @code{*@var{argz}} with @var{with}, reallocating | 
|  | @code{*@var{argz}} as necessary.  If @var{replace_count} is non-zero, | 
|  | @code{*@var{replace_count}} will be incremented by the number of | 
|  | replacements performed. | 
|  | @end deftypefun | 
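
Since the counter is only incremented, a caller that wants the count for
a single call needs to clear it first.  A minimal sketch, assuming a
colon-separated input list; the helper name @code{replace_in_list} is
hypothetical.

```c
#include <argz.h>
#include <stdlib.h>

/* Hypothetical helper: split LIST at colons, replace FROM with TO in
   the resulting vector, and return it flattened into a malloc'd
   string.  *COUNT receives the counter as updated by argz_replace.
   Returns NULL on allocation failure or if LIST is empty.  */
char *
replace_in_list (const char *list, const char *from, const char *to,
                 unsigned *count)
{
  char *argz = NULL;
  size_t argz_len = 0;

  /* argz_replace only increments the counter, so start from zero.  */
  *count = 0;

  if (argz_create_sep (list, ':', &argz, &argz_len) != 0)
    return NULL;

  if (argz_replace (&argz, &argz_len, from, to, count) != 0)
    {
      free (argz);
      return NULL;
    }

  argz_stringify (argz, argz_len, ':');
  return argz;
}
```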
|  |  | 
|  | @node Envz Functions, , Argz Functions, Argz and Envz Vectors | 
|  | @subsection Envz Functions | 
|  |  | 
|  | Envz vectors are just argz vectors with additional constraints on the form | 
|  | of each element; as such, argz functions can also be used on them, where it | 
|  | makes sense. | 
|  |  | 
|  | Each element in an envz vector is a name-value pair, separated by a @code{'='} | 
|  | byte; if multiple @code{'='} bytes are present in an element, those | 
|  | after the first are considered part of the value, and treated like all other | 
|  | non-@code{'\0'} bytes. | 
|  |  | 
|  | If @emph{no} @code{'='} bytes are present in an element, that element is | 
|  | considered the name of a ``null'' entry, as distinct from an entry with an | 
|  | empty value: @code{envz_get} will return @code{0} if given the name of a null | 
|  | entry, whereas an entry with an empty value would result in a value of | 
|  | @code{""}; @code{envz_entry} will still find such entries, however.  Null | 
|  | entries can be removed with the @code{envz_strip} function. | 
|  |  | 
|  | As with argz functions, envz functions that may allocate memory (and thus | 
|  | fail) have a return type of @code{error_t}, and return either @code{0} or | 
|  | @code{ENOMEM}. | 
|  |  | 
|  | @pindex envz.h | 
|  | These functions are declared in the standard include file @file{envz.h}. | 
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} envz_entry (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{envz_entry} function finds the entry in @var{envz} with the name | 
|  | @var{name}, and returns a pointer to the whole entry---that is, the argz | 
|  | element which begins with @var{name} followed by a @code{'='} byte.  If | 
|  | there is no entry with that name, @code{0} is returned. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {char *} envz_get (const char *@var{envz}, size_t @var{envz_len}, const char *@var{name}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{envz_get} function finds the entry in @var{envz} with the name | 
|  | @var{name} (like @code{envz_entry}), and returns a pointer to the value | 
|  | portion of that entry (following the @code{'='}).  If there is no entry with | 
|  | that name (or only a null entry), @code{0} is returned. | 
|  | @end deftypefun | 
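
The difference between a missing entry, a null entry, and an entry with
an empty value can be observed by combining @code{envz_entry} and
@code{envz_get}.  The following is a sketch only; the helper name
@code{envz_classify} and its return codes are assumptions.

```c
#include <envz.h>

/* Hypothetical helper: return 0 if ENVZ has no entry called NAME,
   1 if it has a null entry (no '=' byte), and 2 if the entry has a
   value, which may be the empty string.  */
int
envz_classify (const char *envz, size_t envz_len, const char *name)
{
  if (envz_entry (envz, envz_len, name) == NULL)
    return 0;
  if (envz_get (envz, envz_len, name) == NULL)
    return 1;
  return 2;
}
```

For a vector holding the elements @code{HOME=/root}, @code{EMPTY=} and
@code{FLAG}, this classifies @code{HOME} and @code{EMPTY} as having a
value and @code{FLAG} as a null entry.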
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} envz_add (char **@var{envz}, size_t *@var{envz_len}, const char *@var{name}, const char *@var{value}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | @c Calls envz_remove, which calls envz_entry and argz_delete, and then | 
|  | @c argz_add or equivalent code that reallocs and appends name=value. | 
|  | The @code{envz_add} function adds an entry to @code{*@var{envz}} | 
|  | (updating @code{*@var{envz}} and @code{*@var{envz_len}}) with the name | 
|  | @var{name}, and value @var{value}.  If an entry with the same name | 
|  | already exists in @var{envz}, it is removed first.  If @var{value} is | 
|  | @code{0}, then the new entry will be the special null type of entry | 
|  | (mentioned above). | 
|  | @end deftypefun | 
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {error_t} envz_merge (char **@var{envz}, size_t *@var{envz_len}, const char *@var{envz2}, size_t @var{envz2_len}, int @var{override}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{envz_merge} function adds each entry in @var{envz2} to | 
|  | @code{*@var{envz}}, as if with @code{envz_add}, updating @code{*@var{envz}} and | 
|  | @code{*@var{envz_len}}.  If @var{override} is true, then values in @var{envz2} | 
|  | will supersede those with the same name in @code{*@var{envz}}; otherwise the | 
|  | existing values are kept. | 
|  |  | 
|  | Null entries are treated just like other entries in this respect, so a null | 
|  | entry in @var{envz} can prevent an entry of the same name in @var{envz2} from | 
|  | being added to @var{envz}, if @var{override} is false. | 
|  | @end deftypefun | 
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {void} envz_strip (char **@var{envz}, size_t *@var{envz_len}) | 
|  | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} | 
|  | The @code{envz_strip} function removes any null entries from @var{envz}, | 
|  | updating @code{*@var{envz}} and @code{*@var{envz_len}}. | 
|  | @end deftypefun | 
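
A short sketch of @code{envz_strip} in use, assuming a sample vector
with one ordinary entry, one null entry, and one entry with an empty
value.  The function name @code{strip_demo} is hypothetical.

```c
#include <envz.h>
#include <stdlib.h>

/* Hypothetical demonstration: return the number of entries left after
   stripping null entries from a three-entry sample vector, or
   (size_t) -1 if memory runs out.  */
size_t
strip_demo (void)
{
  char *envz = NULL;
  size_t envz_len = 0;
  size_t n = (size_t) -1;

  if (envz_add (&envz, &envz_len, "A", "1") == 0
      && envz_add (&envz, &envz_len, "B", NULL) == 0  /* null entry */
      && envz_add (&envz, &envz_len, "C", "") == 0)   /* empty value */
    {
      /* Only "B" lacks a '=' byte, so only it is removed.  */
      envz_strip (&envz, &envz_len);
      n = argz_count (envz, envz_len);
    }

  free (envz);
  return n;
}
```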
|  |  | 
|  | @comment envz.h | 
|  | @comment GNU | 
|  | @deftypefun {void} envz_remove (char **@var{envz}, size_t *@var{envz_len}, const char *@var{name}) | 
|  | @safety{@prelim{}@mtsafe{}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} | 
|  | The @code{envz_remove} function removes an entry named @var{name} from | 
|  | @var{envz}, updating @code{*@var{envz}} and @code{*@var{envz_len}}. | 
|  | @end deftypefun | 
|  |  | 
|  | @c FIXME these are undocumented: | 
|  | @c strcasecmp_l @safety{@mtsafe{}@assafe{}@acsafe{}} see strcasecmp |