@node Arithmetic, Date and Time, Mathematics, Top
| 2 | @c %MENU% Low level arithmetic functions |
| 3 | @chapter Arithmetic Functions |
| 4 | |
| 5 | This chapter contains information about functions for doing basic |
| 6 | arithmetic operations, such as splitting a float into its integer and |
| 7 | fractional parts or retrieving the imaginary part of a complex value. |
| 8 | These functions are declared in the header files @file{math.h} and |
| 9 | @file{complex.h}. |
| 10 | |
| 11 | @menu |
| 12 | * Integers:: Basic integer types and concepts |
| 13 | * Integer Division:: Integer division with guaranteed rounding. |
| 14 | * Floating Point Numbers:: Basic concepts. IEEE 754. |
| 15 | * Floating Point Classes:: The five kinds of floating-point number. |
| 16 | * Floating Point Errors:: When something goes wrong in a calculation. |
| 17 | * Rounding:: Controlling how results are rounded. |
| 18 | * Control Functions:: Saving and restoring the FPU's state. |
| 19 | * Arithmetic Functions:: Fundamental operations provided by the library. |
| 20 | * Complex Numbers:: The types. Writing complex constants. |
| 21 | * Operations on Complex:: Projection, conjugation, decomposition. |
| 22 | * Parsing of Numbers:: Converting strings to numbers. |
| 23 | * System V Number Conversion:: An archaic way to convert numbers to strings. |
| 24 | @end menu |
| 25 | |
| 26 | @node Integers |
| 27 | @section Integers |
| 28 | @cindex integer |
| 29 | |
| 30 | The C language defines several integer data types: integer, short integer, |
| 31 | long integer, and character, all in both signed and unsigned varieties. |
| 32 | The GNU C compiler extends the language to contain long long integers |
| 33 | as well. |
| 34 | @cindex signedness |
| 35 | |
| 36 | The C integer types were intended to allow code to be portable among |
| 37 | machines with different inherent data sizes (word sizes), so each type |
| 38 | may have different ranges on different machines. The problem with |
| 39 | this is that a program often needs to be written for a particular range |
| 40 | of integers, and sometimes must be written for a particular size of |
| 41 | storage, regardless of what machine the program runs on. |
| 42 | |
| 43 | To address this problem, @theglibc{} contains C type definitions |
| 44 | you can use to declare integers that meet your exact needs. Because the |
| 45 | @glibcadj{} header files are customized to a specific machine, your |
| 46 | program source code doesn't have to be. |
| 47 | |
| 48 | These @code{typedef}s are in @file{stdint.h}. |
| 49 | @pindex stdint.h |
| 50 | |
| 51 | If you require that an integer be represented in exactly N bits, use one |
| 52 | of the following types, with the obvious mapping to bit size and signedness: |
| 53 | |
| 54 | @itemize @bullet |
| 55 | @item int8_t |
| 56 | @item int16_t |
| 57 | @item int32_t |
| 58 | @item int64_t |
| 59 | @item uint8_t |
| 60 | @item uint16_t |
| 61 | @item uint32_t |
| 62 | @item uint64_t |
| 63 | @end itemize |
| 64 | |
If your C compiler and target machine do not allow integers of a certain
size, the corresponding type above does not exist.
| 67 | |
| 68 | If you don't need a specific storage size, but want the smallest data |
| 69 | structure with @emph{at least} N bits, use one of these: |
| 70 | |
| 71 | @itemize @bullet |
| 72 | @item int_least8_t |
| 73 | @item int_least16_t |
| 74 | @item int_least32_t |
| 75 | @item int_least64_t |
| 76 | @item uint_least8_t |
| 77 | @item uint_least16_t |
| 78 | @item uint_least32_t |
| 79 | @item uint_least64_t |
| 80 | @end itemize |
| 81 | |
| 82 | If you don't need a specific storage size, but want the data structure |
| 83 | that allows the fastest access while having at least N bits (and |
| 84 | among data structures with the same access speed, the smallest one), use |
| 85 | one of these: |
| 86 | |
| 87 | @itemize @bullet |
| 88 | @item int_fast8_t |
| 89 | @item int_fast16_t |
| 90 | @item int_fast32_t |
| 91 | @item int_fast64_t |
| 92 | @item uint_fast8_t |
| 93 | @item uint_fast16_t |
| 94 | @item uint_fast32_t |
| 95 | @item uint_fast64_t |
| 96 | @end itemize |
| 97 | |
| 98 | If you want an integer with the widest range possible on the platform on |
| 99 | which it is being used, use one of the following. If you use these, |
| 100 | you should write code that takes into account the variable size and range |
| 101 | of the integer. |
| 102 | |
| 103 | @itemize @bullet |
| 104 | @item intmax_t |
| 105 | @item uintmax_t |
| 106 | @end itemize |
| 107 | |
| 108 | @Theglibc{} also provides macros that tell you the maximum and |
| 109 | minimum possible values for each integer data type. The macro names |
| 110 | follow these examples: @code{INT32_MAX}, @code{UINT8_MAX}, |
| 111 | @code{INT_FAST32_MIN}, @code{INT_LEAST64_MIN}, @code{UINTMAX_MAX}, |
| 112 | @code{INTMAX_MAX}, @code{INTMAX_MIN}. Note that there are no macros for |
| 113 | unsigned integer minima. These are always zero. |
| 114 | @cindex maximum possible integer |
| 115 | @cindex minimum possible integer |
| 116 | |
There are similar macros for use with C's built-in integer types, which
should come with your C compiler. These are described in @ref{Data Type
Measurements}.
| 120 | |
Don't forget you can use the C @code{sizeof} operator with any of these
data types to get the number of bytes of storage each uses.
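
As a minimal sketch of how these types and limit macros fit together,
the following fragment clamps a 32-bit sum instead of letting it
overflow, and reports the storage size of a ``least'' type.  The helper
names @code{saturating_add32} and @code{least16_storage_bits} are
hypothetical, and the use of @code{CHAR_BIT} from @file{limits.h} is an
assumption that goes beyond what this section describes.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: add two int32_t values, clamping the result to
   the range [INT32_MIN, INT32_MAX] instead of overflowing.  */
int32_t
saturating_add32 (int32_t a, int32_t b)
{
  /* int64_t is wide enough to hold any sum of two int32_t values.  */
  int64_t sum = (int64_t) a + (int64_t) b;

  if (sum > INT32_MAX)
    return INT32_MAX;
  if (sum < INT32_MIN)
    return INT32_MIN;
  return (int32_t) sum;
}

/* Hypothetical helper: number of bits of storage that int_least16_t
   occupies on this machine; at least 16, possibly more.  */
unsigned int
least16_storage_bits (void)
{
  return (unsigned int) (sizeof (int_least16_t) * CHAR_BIT);
}
```

The exact-width types are used here because the clamping logic depends
on a known range; the ``least'' type only promises a minimum width.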
| 123 | |
| 124 | |
| 125 | @node Integer Division |
| 126 | @section Integer Division |
| 127 | @cindex integer division functions |
| 128 | |
| 129 | This section describes functions for performing integer division. These |
| 130 | functions are redundant when GNU CC is used, because in GNU C the |
| 131 | @samp{/} operator always rounds towards zero. But in other C |
| 132 | implementations, @samp{/} may round differently with negative arguments. |
| 133 | @code{div} and @code{ldiv} are useful because they specify how to round |
| 134 | the quotient: towards zero. The remainder has the same sign as the |
| 135 | numerator. |
| 136 | |
| 137 | These functions are specified to return a result @var{r} such that the value |
| 138 | @code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals |
| 139 | @var{numerator}. |
| 140 | |
| 141 | @pindex stdlib.h |
| 142 | To use these facilities, you should include the header file |
| 143 | @file{stdlib.h} in your program. |
| 144 | |
| 145 | @comment stdlib.h |
| 146 | @comment ISO |
| 147 | @deftp {Data Type} div_t |
| 148 | This is a structure type used to hold the result returned by the @code{div} |
| 149 | function. It has the following members: |
| 150 | |
| 151 | @table @code |
| 152 | @item int quot |
| 153 | The quotient from the division. |
| 154 | |
| 155 | @item int rem |
| 156 | The remainder from the division. |
| 157 | @end table |
| 158 | @end deftp |
| 159 | |
| 160 | @comment stdlib.h |
| 161 | @comment ISO |
| 162 | @deftypefun div_t div (int @var{numerator}, int @var{denominator}) |
| 163 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 164 | @c Functions in this section are pure, and thus safe. |
| 165 | This function @code{div} computes the quotient and remainder from |
| 166 | the division of @var{numerator} by @var{denominator}, returning the |
| 167 | result in a structure of type @code{div_t}. |
| 168 | |
| 169 | If the result cannot be represented (as in a division by zero), the |
| 170 | behavior is undefined. |
| 171 | |
| 172 | Here is an example, albeit not a very useful one. |
| 173 | |
| 174 | @smallexample |
| 175 | div_t result; |
| 176 | result = div (20, -6); |
| 177 | @end smallexample |
| 178 | |
| 179 | @noindent |
| 180 | Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}. |
| 181 | @end deftypefun |
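
The identity stated at the start of this section can be checked
directly.  The following is only a sketch; the helper name
@code{div_identity_holds} is hypothetical and not part of the library.

```c
#include <stdlib.h>

/* Hypothetical helper: return nonzero if the result of div satisfies
   quot * denominator + rem == numerator.  The caller must not pass a
   zero denominator, or any pair whose quotient cannot be represented,
   because the behavior of div is undefined in those cases.  */
int
div_identity_holds (int numerator, int denominator)
{
  div_t r = div (numerator, denominator);

  return r.quot * denominator + r.rem == numerator;
}
```

With a negative numerator, such as @code{div (-20, 6)}, the quotient is
still rounded towards zero and the remainder takes the sign of the
numerator.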
| 182 | |
| 183 | @comment stdlib.h |
| 184 | @comment ISO |
| 185 | @deftp {Data Type} ldiv_t |
| 186 | This is a structure type used to hold the result returned by the @code{ldiv} |
| 187 | function. It has the following members: |
| 188 | |
| 189 | @table @code |
| 190 | @item long int quot |
| 191 | The quotient from the division. |
| 192 | |
| 193 | @item long int rem |
| 194 | The remainder from the division. |
| 195 | @end table |
| 196 | |
| 197 | (This is identical to @code{div_t} except that the components are of |
| 198 | type @code{long int} rather than @code{int}.) |
| 199 | @end deftp |
| 200 | |
| 201 | @comment stdlib.h |
| 202 | @comment ISO |
| 203 | @deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator}) |
| 204 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 205 | The @code{ldiv} function is similar to @code{div}, except that the |
| 206 | arguments are of type @code{long int} and the result is returned as a |
| 207 | structure of type @code{ldiv_t}. |
| 208 | @end deftypefun |
| 209 | |
| 210 | @comment stdlib.h |
| 211 | @comment ISO |
| 212 | @deftp {Data Type} lldiv_t |
| 213 | This is a structure type used to hold the result returned by the @code{lldiv} |
| 214 | function. It has the following members: |
| 215 | |
| 216 | @table @code |
| 217 | @item long long int quot |
| 218 | The quotient from the division. |
| 219 | |
| 220 | @item long long int rem |
| 221 | The remainder from the division. |
| 222 | @end table |
| 223 | |
| 224 | (This is identical to @code{div_t} except that the components are of |
| 225 | type @code{long long int} rather than @code{int}.) |
| 226 | @end deftp |
| 227 | |
| 228 | @comment stdlib.h |
| 229 | @comment ISO |
| 230 | @deftypefun lldiv_t lldiv (long long int @var{numerator}, long long int @var{denominator}) |
| 231 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 232 | The @code{lldiv} function is like the @code{div} function, but the |
| 233 | arguments are of type @code{long long int} and the result is returned as |
| 234 | a structure of type @code{lldiv_t}. |
| 235 | |
| 236 | The @code{lldiv} function was added in @w{ISO C99}. |
| 237 | @end deftypefun |
| 238 | |
| 239 | @comment inttypes.h |
| 240 | @comment ISO |
| 241 | @deftp {Data Type} imaxdiv_t |
| 242 | This is a structure type used to hold the result returned by the @code{imaxdiv} |
| 243 | function. It has the following members: |
| 244 | |
| 245 | @table @code |
| 246 | @item intmax_t quot |
| 247 | The quotient from the division. |
| 248 | |
| 249 | @item intmax_t rem |
| 250 | The remainder from the division. |
| 251 | @end table |
| 252 | |
| 253 | (This is identical to @code{div_t} except that the components are of |
| 254 | type @code{intmax_t} rather than @code{int}.) |
| 255 | |
| 256 | See @ref{Integers} for a description of the @code{intmax_t} type. |
| 257 | |
| 258 | @end deftp |
| 259 | |
| 260 | @comment inttypes.h |
| 261 | @comment ISO |
| 262 | @deftypefun imaxdiv_t imaxdiv (intmax_t @var{numerator}, intmax_t @var{denominator}) |
| 263 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 264 | The @code{imaxdiv} function is like the @code{div} function, but the |
| 265 | arguments are of type @code{intmax_t} and the result is returned as |
| 266 | a structure of type @code{imaxdiv_t}. |
| 267 | |
| 268 | See @ref{Integers} for a description of the @code{intmax_t} type. |
| 269 | |
| 270 | The @code{imaxdiv} function was added in @w{ISO C99}. |
| 271 | @end deftypefun |
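
Here is a small sketch of @code{imaxdiv} in use.  The helper name
@code{split_minutes} and the choice of splitting a duration into
minutes and seconds are purely illustrative.

```c
#include <inttypes.h>

/* Hypothetical helper: split a duration given in seconds into whole
   minutes (quot) and leftover seconds (rem).  Both members carry the
   sign of the input because the quotient rounds towards zero.  */
imaxdiv_t
split_minutes (intmax_t seconds)
{
  return imaxdiv (seconds, 60);
}
```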
| 272 | |
| 273 | |
| 274 | @node Floating Point Numbers |
| 275 | @section Floating Point Numbers |
| 276 | @cindex floating point |
| 277 | @cindex IEEE 754 |
| 278 | @cindex IEEE floating point |
| 279 | |
| 280 | Most computer hardware has support for two different kinds of numbers: |
| 281 | integers (@math{@dots{}-3, -2, -1, 0, 1, 2, 3@dots{}}) and |
| 282 | floating-point numbers. Floating-point numbers have three parts: the |
| 283 | @dfn{mantissa}, the @dfn{exponent}, and the @dfn{sign bit}. The real |
| 284 | number represented by a floating-point value is given by |
| 285 | @tex |
| 286 | $(s \mathrel? -1 \mathrel: 1) \cdot 2^e \cdot M$ |
| 287 | @end tex |
| 288 | @ifnottex |
| 289 | @math{(s ? -1 : 1) @mul{} 2^e @mul{} M} |
| 290 | @end ifnottex |
| 291 | where @math{s} is the sign bit, @math{e} the exponent, and @math{M} |
| 292 | the mantissa. @xref{Floating Point Concepts}, for details. (It is |
| 293 | possible to have a different @dfn{base} for the exponent, but all modern |
| 294 | hardware uses @math{2}.) |
| 295 | |
| 296 | Floating-point numbers can represent a finite subset of the real |
| 297 | numbers. While this subset is large enough for most purposes, it is |
| 298 | important to remember that the only reals that can be represented |
| 299 | exactly are rational numbers that have a terminating binary expansion |
| 300 | shorter than the width of the mantissa. Even simple fractions such as |
| 301 | @math{1/5} can only be approximated by floating point. |
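
One way to observe this is to narrow a @code{double} to @code{float}
and see whether the value survives.  A value with a short binary
expansion does; the approximation of @math{1/5} does not, because the
two formats round it differently.  This is only a sketch: the helper
name @code{exact_in_float} is hypothetical, and @code{volatile} is used
on the assumption that it keeps the compiler from carrying extra
precision in a register.

```c
/* Hypothetical helper: return nonzero if x can be stored in a float
   without any change in value.  */
int
exact_in_float (double x)
{
  /* volatile forces the value to be rounded to float precision.  */
  volatile float narrowed = (float) x;

  return (double) narrowed == x;
}
```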
| 302 | |
| 303 | Mathematical operations and functions frequently need to produce values |
| 304 | that are not representable. Often these values can be approximated |
| 305 | closely enough for practical purposes, but sometimes they can't. |
| 306 | Historically there was no way to tell when the results of a calculation |
| 307 | were inaccurate. Modern computers implement the @w{IEEE 754} standard |
| 308 | for numerical computations, which defines a framework for indicating to |
| 309 | the program when the results of calculation are not trustworthy. This |
| 310 | framework consists of a set of @dfn{exceptions} that indicate why a |
| 311 | result could not be represented, and the special values @dfn{infinity} |
| 312 | and @dfn{not a number} (NaN). |
| 313 | |
| 314 | @node Floating Point Classes |
| 315 | @section Floating-Point Number Classification Functions |
| 316 | @cindex floating-point classes |
| 317 | @cindex classes, floating-point |
| 318 | @pindex math.h |
| 319 | |
| 320 | @w{ISO C99} defines macros that let you determine what sort of |
| 321 | floating-point number a variable holds. |
| 322 | |
| 323 | @comment math.h |
| 324 | @comment ISO |
| 325 | @deftypefn {Macro} int fpclassify (@emph{float-type} @var{x}) |
| 326 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 327 | This is a generic macro which works on all floating-point types and |
| 328 | which returns a value of type @code{int}. The possible values are: |
| 329 | |
| 330 | @vtable @code |
| 331 | @item FP_NAN |
| 332 | The floating-point number @var{x} is ``Not a Number'' (@pxref{Infinity |
| 333 | and NaN}) |
| 334 | @item FP_INFINITE |
| 335 | The value of @var{x} is either plus or minus infinity (@pxref{Infinity |
| 336 | and NaN}) |
| 337 | @item FP_ZERO |
| 338 | The value of @var{x} is zero. In floating-point formats like @w{IEEE |
| 339 | 754}, where zero can be signed, this value is also returned if |
| 340 | @var{x} is negative zero. |
| 341 | @item FP_SUBNORMAL |
| 342 | Numbers whose absolute value is too small to be represented in the |
| 343 | normal format are represented in an alternate, @dfn{denormalized} format |
| 344 | (@pxref{Floating Point Concepts}). This format is less precise but can |
| 345 | represent values closer to zero. @code{fpclassify} returns this value |
| 346 | for values of @var{x} in this alternate format. |
| 347 | @item FP_NORMAL |
| 348 | This value is returned for all other values of @var{x}. It indicates |
| 349 | that there is nothing special about the number. |
| 350 | @end vtable |
| 351 | |
| 352 | @end deftypefn |
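
As an illustration, the value returned by @code{fpclassify} is
typically handled with a @code{switch} statement.  The helper name
@code{fp_class_name} and the strings it returns are hypothetical.

```c
#include <math.h>

/* Hypothetical helper: return a short description of the class that
   fpclassify reports for x.  */
const char *
fp_class_name (double x)
{
  switch (fpclassify (x))
    {
    case FP_NAN:
      return "NaN";
    case FP_INFINITE:
      return "infinite";
    case FP_ZERO:
      return "zero";
    case FP_SUBNORMAL:
      return "subnormal";
    case FP_NORMAL:
      return "normal";
    default:
      /* An implementation may define further classes.  */
      return "unknown";
    }
}
```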
| 353 | |
| 354 | @code{fpclassify} is most useful if more than one property of a number |
| 355 | must be tested. There are more specific macros which only test one |
| 356 | property at a time. Generally these macros execute faster than |
| 357 | @code{fpclassify}, since there is special hardware support for them. |
| 358 | You should therefore use the specific macros whenever possible. |
| 359 | |
| 360 | @comment math.h |
| 361 | @comment ISO |
| 362 | @deftypefn {Macro} int isfinite (@emph{float-type} @var{x}) |
| 363 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 364 | This macro returns a nonzero value if @var{x} is finite: not plus or |
| 365 | minus infinity, and not NaN. It is equivalent to |
| 366 | |
| 367 | @smallexample |
| 368 | (fpclassify (x) != FP_NAN && fpclassify (x) != FP_INFINITE) |
| 369 | @end smallexample |
| 370 | |
| 371 | @code{isfinite} is implemented as a macro which accepts any |
| 372 | floating-point type. |
| 373 | @end deftypefn |
| 374 | |
| 375 | @comment math.h |
| 376 | @comment ISO |
| 377 | @deftypefn {Macro} int isnormal (@emph{float-type} @var{x}) |
| 378 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 379 | This macro returns a nonzero value if @var{x} is finite and normalized. |
| 380 | It is equivalent to |
| 381 | |
| 382 | @smallexample |
| 383 | (fpclassify (x) == FP_NORMAL) |
| 384 | @end smallexample |
| 385 | @end deftypefn |
| 386 | |
| 387 | @comment math.h |
| 388 | @comment ISO |
| 389 | @deftypefn {Macro} int isnan (@emph{float-type} @var{x}) |
| 390 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 391 | This macro returns a nonzero value if @var{x} is NaN. It is equivalent |
| 392 | to |
| 393 | |
| 394 | @smallexample |
| 395 | (fpclassify (x) == FP_NAN) |
| 396 | @end smallexample |
| 397 | @end deftypefn |
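
The single-property macros are convenient for filtering data.  The
following sketch averages only the finite elements of an array; the
helper name @code{finite_mean} and its convention of returning NaN for
an empty selection are assumptions made for this example.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper: return the mean of the finite elements of
   values, skipping infinities and NaNs.  Returns NaN if no element
   is finite.  */
double
finite_mean (const double *values, size_t count)
{
  double sum = 0.0;
  size_t used = 0;

  for (size_t i = 0; i < count; i++)
    if (isfinite (values[i]))
      {
        sum += values[i];
        used++;
      }

  return used > 0 ? sum / (double) used : NAN;
}
```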
| 398 | |
| 399 | @comment math.h |
| 400 | @comment GNU |
| 401 | @deftypefn {Macro} int issignaling (@emph{float-type} @var{x}) |
| 402 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 403 | This macro returns a nonzero value if @var{x} is a signaling NaN |
| 404 | (sNaN). It is based on draft TS 18661 and currently enabled as a GNU |
| 405 | extension. |
| 406 | @end deftypefn |
| 407 | |
| 408 | Another set of floating-point classification functions was provided by |
| 409 | BSD. @Theglibc{} also supports these functions; however, we |
| 410 | recommend that you use the ISO C99 macros in new code. Those are standard |
| 411 | and will be available more widely. Also, since they are macros, you do |
| 412 | not have to worry about the type of their argument. |
| 413 | |
| 414 | @comment math.h |
| 415 | @comment BSD |
| 416 | @deftypefun int isinf (double @var{x}) |
| 417 | @comment math.h |
| 418 | @comment BSD |
| 419 | @deftypefunx int isinff (float @var{x}) |
| 420 | @comment math.h |
| 421 | @comment BSD |
| 422 | @deftypefunx int isinfl (long double @var{x}) |
| 423 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 424 | This function returns @code{-1} if @var{x} represents negative infinity, |
| 425 | @code{1} if @var{x} represents positive infinity, and @code{0} otherwise. |
| 426 | @end deftypefun |
| 427 | |
| 428 | @comment math.h |
| 429 | @comment BSD |
| 430 | @deftypefun int isnan (double @var{x}) |
| 431 | @comment math.h |
| 432 | @comment BSD |
| 433 | @deftypefunx int isnanf (float @var{x}) |
| 434 | @comment math.h |
| 435 | @comment BSD |
| 436 | @deftypefunx int isnanl (long double @var{x}) |
| 437 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 438 | This function returns a nonzero value if @var{x} is a ``not a number'' |
| 439 | value, and zero otherwise. |
| 440 | |
| 441 | @strong{NB:} The @code{isnan} macro defined by @w{ISO C99} overrides |
| 442 | the BSD function. This is normally not a problem, because the two |
| 443 | routines behave identically. However, if you really need to get the BSD |
| 444 | function for some reason, you can write |
| 445 | |
| 446 | @smallexample |
| 447 | (isnan) (x) |
| 448 | @end smallexample |
| 449 | @end deftypefun |
| 450 | |
| 451 | @comment math.h |
| 452 | @comment BSD |
| 453 | @deftypefun int finite (double @var{x}) |
| 454 | @comment math.h |
| 455 | @comment BSD |
| 456 | @deftypefunx int finitef (float @var{x}) |
| 457 | @comment math.h |
| 458 | @comment BSD |
| 459 | @deftypefunx int finitel (long double @var{x}) |
| 460 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
This function returns a nonzero value if @var{x} is finite (neither
infinite nor a ``not a number'' value), and zero otherwise.
| 463 | @end deftypefun |
| 464 | |
| 465 | @strong{Portability Note:} The functions listed in this section are BSD |
| 466 | extensions. |
| 467 | |
| 468 | |
| 469 | @node Floating Point Errors |
| 470 | @section Errors in Floating-Point Calculations |
| 471 | |
| 472 | @menu |
| 473 | * FP Exceptions:: IEEE 754 math exceptions and how to detect them. |
| 474 | * Infinity and NaN:: Special values returned by calculations. |
| 475 | * Status bit operations:: Checking for exceptions after the fact. |
| 476 | * Math Error Reporting:: How the math functions report errors. |
| 477 | @end menu |
| 478 | |
| 479 | @node FP Exceptions |
| 480 | @subsection FP Exceptions |
| 481 | @cindex exception |
| 482 | @cindex signal |
| 483 | @cindex zero divide |
| 484 | @cindex division by zero |
| 485 | @cindex inexact exception |
| 486 | @cindex invalid exception |
| 487 | @cindex overflow exception |
| 488 | @cindex underflow exception |
| 489 | |
| 490 | The @w{IEEE 754} standard defines five @dfn{exceptions} that can occur |
| 491 | during a calculation. Each corresponds to a particular sort of error, |
| 492 | such as overflow. |
| 493 | |
| 494 | When exceptions occur (when exceptions are @dfn{raised}, in the language |
| 495 | of the standard), one of two things can happen. By default the |
| 496 | exception is simply noted in the floating-point @dfn{status word}, and |
| 497 | the program continues as if nothing had happened. The operation |
| 498 | produces a default value, which depends on the exception (see the table |
| 499 | below). Your program can check the status word to find out which |
| 500 | exceptions happened. |
| 501 | |
| 502 | Alternatively, you can enable @dfn{traps} for exceptions. In that case, |
| 503 | when an exception is raised, your program will receive the @code{SIGFPE} |
| 504 | signal. The default action for this signal is to terminate the |
| 505 | program. @xref{Signal Handling}, for how you can change the effect of |
| 506 | the signal. |
| 507 | |
| 508 | @findex matherr |
| 509 | In the System V math library, the user-defined function @code{matherr} |
| 510 | is called when certain exceptions occur inside math library functions. |
| 511 | However, the Unix98 standard deprecates this interface. We support it |
| 512 | for historical compatibility, but recommend that you do not use it in |
| 513 | new programs. When this interface is used, exceptions may not be |
| 514 | raised. |
| 515 | |
| 516 | @noindent |
| 517 | The exceptions defined in @w{IEEE 754} are: |
| 518 | |
| 519 | @table @samp |
| 520 | @item Invalid Operation |
| 521 | This exception is raised if the given operands are invalid for the |
| 522 | operation to be performed. Examples are |
| 523 | (see @w{IEEE 754}, @w{section 7}): |
| 524 | @enumerate |
| 525 | @item |
| 526 | Addition or subtraction: @math{@infinity{} - @infinity{}}. (But |
| 527 | @math{@infinity{} + @infinity{} = @infinity{}}). |
| 528 | @item |
| 529 | Multiplication: @math{0 @mul{} @infinity{}}. |
| 530 | @item |
| 531 | Division: @math{0/0} or @math{@infinity{}/@infinity{}}. |
| 532 | @item |
| 533 | Remainder: @math{x} REM @math{y}, where @math{y} is zero or @math{x} is |
| 534 | infinite. |
| 535 | @item |
Square root if the operand is less than zero. More generally, any
| 537 | mathematical function evaluated outside its domain produces this |
| 538 | exception. |
| 539 | @item |
| 540 | Conversion of a floating-point number to an integer or decimal |
| 541 | string, when the number cannot be represented in the target format (due |
| 542 | to overflow, infinity, or NaN). |
| 543 | @item |
| 544 | Conversion of an unrecognizable input string. |
| 545 | @item |
| 546 | Comparison via predicates involving @math{<} or @math{>}, when one or |
| 547 | other of the operands is NaN. You can prevent this exception by using |
| 548 | the unordered comparison functions instead; see @ref{FP Comparison Functions}. |
| 549 | @end enumerate |
| 550 | |
| 551 | If the exception does not trap, the result of the operation is NaN. |
| 552 | |
| 553 | @item Division by Zero |
| 554 | This exception is raised when a finite nonzero number is divided |
| 555 | by zero. If no trap occurs the result is either @math{+@infinity{}} or |
| 556 | @math{-@infinity{}}, depending on the signs of the operands. |
| 557 | |
| 558 | @item Overflow |
| 559 | This exception is raised whenever the result cannot be represented |
| 560 | as a finite value in the precision format of the destination. If no trap |
| 561 | occurs the result depends on the sign of the intermediate result and the |
| 562 | current rounding mode (@w{IEEE 754}, @w{section 7.3}): |
| 563 | @enumerate |
| 564 | @item |
| 565 | Round to nearest carries all overflows to @math{@infinity{}} |
| 566 | with the sign of the intermediate result. |
| 567 | @item |
| 568 | Round toward @math{0} carries all overflows to the largest representable |
| 569 | finite number with the sign of the intermediate result. |
| 570 | @item |
| 571 | Round toward @math{-@infinity{}} carries positive overflows to the |
| 572 | largest representable finite number and negative overflows to |
| 573 | @math{-@infinity{}}. |
| 574 | |
| 575 | @item |
| 576 | Round toward @math{@infinity{}} carries negative overflows to the |
| 577 | most negative representable finite number and positive overflows |
| 578 | to @math{@infinity{}}. |
| 579 | @end enumerate |
| 580 | |
| 581 | Whenever the overflow exception is raised, the inexact exception is also |
| 582 | raised. |
| 583 | |
| 584 | @item Underflow |
| 585 | The underflow exception is raised when an intermediate result is too |
| 586 | small to be calculated accurately, or if the operation's result rounded |
| 587 | to the destination precision is too small to be normalized. |
| 588 | |
When no trap is installed for the underflow exception, underflow is
signaled (via the underflow flag) only when both tininess and loss of
accuracy have been detected. The operation then continues with an
imprecise small value, or zero if the destination precision cannot hold
the small exact result.
| 594 | |
| 595 | @item Inexact |
This exception is signaled if a rounded result is not exact (such as
when calculating the square root of two) or a result overflows without
an overflow trap.
| 599 | @end table |
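
The default results listed above can be observed when traps are
disabled and the rounding mode is round to nearest, which is assumed
here.  In this sketch the helper names @code{divide} and
@code{multiply} are hypothetical, and @code{volatile} is used on the
assumption that it makes the compiler perform the operation at run
time.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: divide at run time.  */
double
divide (double numerator, double denominator)
{
  volatile double n = numerator, d = denominator;

  return n / d;
}

/* Hypothetical helper: multiply at run time.  */
double
multiply (double a, double b)
{
  volatile double x = a, y = b;

  return x * y;
}
```

With these helpers, @code{divide (1.0, 0.0)} yields positive infinity
(division by zero), @code{divide (0.0, 0.0)} yields NaN (invalid
operation), and @code{multiply (DBL_MAX, 2.0)} yields positive infinity
(overflow).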
| 600 | |
| 601 | @node Infinity and NaN |
| 602 | @subsection Infinity and NaN |
| 603 | @cindex infinity |
| 604 | @cindex not a number |
| 605 | @cindex NaN |
| 606 | |
| 607 | @w{IEEE 754} floating point numbers can represent positive or negative |
| 608 | infinity, and @dfn{NaN} (not a number). These three values arise from |
| 609 | calculations whose result is undefined or cannot be represented |
| 610 | accurately. You can also deliberately set a floating-point variable to |
| 611 | any of them, which is sometimes useful. Some examples of calculations |
| 612 | that produce infinity or NaN: |
| 613 | |
| 614 | @ifnottex |
| 615 | @smallexample |
| 616 | @math{1/0 = @infinity{}} |
| 617 | @math{log (0) = -@infinity{}} |
| 618 | @math{sqrt (-1) = NaN} |
| 619 | @end smallexample |
| 620 | @end ifnottex |
| 621 | @tex |
| 622 | $${1\over0} = \infty$$ |
| 623 | $$\log 0 = -\infty$$ |
| 624 | $$\sqrt{-1} = \hbox{NaN}$$ |
| 625 | @end tex |
| 626 | |
| 627 | When a calculation produces any of these values, an exception also |
| 628 | occurs; see @ref{FP Exceptions}. |
| 629 | |
| 630 | The basic operations and math functions all accept infinity and NaN and |
| 631 | produce sensible output. Infinities propagate through calculations as |
| 632 | one would expect: for example, @math{2 + @infinity{} = @infinity{}}, |
| 633 | @math{4/@infinity{} = 0}, atan @math{(@infinity{}) = @pi{}/2}. NaN, on |
| 634 | the other hand, infects any calculation that involves it. Unless the |
| 635 | calculation would produce the same result no matter what real value |
| 636 | replaced NaN, the result is NaN. |
| 637 | |
| 638 | In comparison operations, positive infinity is larger than all values |
| 639 | except itself and NaN, and negative infinity is smaller than all values |
| 640 | except itself and NaN. NaN is @dfn{unordered}: it is not equal to, |
| 641 | greater than, or less than anything, @emph{including itself}. @code{x == |
| 642 | x} is false if the value of @code{x} is NaN. You can use this to test |
| 643 | whether a value is NaN or not, but the recommended way to test for NaN |
| 644 | is with the @code{isnan} function (@pxref{Floating Point Classes}). In |
| 645 | addition, @code{<}, @code{>}, @code{<=}, and @code{>=} will raise an |
| 646 | exception when applied to NaNs. |
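
The self-comparison property can be sketched as follows.  The helper
name @code{is_nan_by_comparison} is hypothetical; in real code the
@code{isnan} macro remains the recommended test.

```c
#include <math.h>

/* Hypothetical helper: detect NaN using only the fact that NaN is
   not equal to itself.  volatile discourages the compiler from
   simplifying the comparison away.  */
int
is_nan_by_comparison (double x)
{
  volatile double v = x;

  return v != v;
}
```

Infinities, by contrast, compare equal to themselves, so the helper
returns zero for @code{INFINITY} and @code{-INFINITY}.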
| 647 | |
| 648 | @file{math.h} defines macros that allow you to explicitly set a variable |
| 649 | to infinity or NaN. |
| 650 | |
| 651 | @comment math.h |
| 652 | @comment ISO |
| 653 | @deftypevr Macro float INFINITY |
| 654 | An expression representing positive infinity. It is equal to the value |
| 655 | produced by mathematical operations like @code{1.0 / 0.0}. |
| 656 | @code{-INFINITY} represents negative infinity. |
| 657 | |
| 658 | You can test whether a floating-point value is infinite by comparing it |
| 659 | to this macro. However, this is not recommended; you should use the |
| 660 | @code{isfinite} macro instead. @xref{Floating Point Classes}. |
| 661 | |
| 662 | This macro was introduced in the @w{ISO C99} standard. |
| 663 | @end deftypevr |
| 664 | |
| 665 | @comment math.h |
| 666 | @comment GNU |
| 667 | @deftypevr Macro float NAN |
| 668 | An expression representing a value which is ``not a number''. This |
| 669 | macro is a GNU extension, available only on machines that support the |
| 670 | ``not a number'' value---that is to say, on all machines that support |
| 671 | IEEE floating point. |
| 672 | |
| 673 | You can use @samp{#ifdef NAN} to test whether the machine supports |
| 674 | NaN. (Of course, you must arrange for GNU extensions to be visible, |
| 675 | such as by defining @code{_GNU_SOURCE}, and then you must include |
| 676 | @file{math.h}.) |
| 677 | @end deftypevr |
| 678 | |
| 679 | @w{IEEE 754} also allows for another unusual value: negative zero. This |
| 680 | value is produced when you divide a positive number by negative |
| 681 | infinity, or when a negative result is smaller than the limits of |
| 682 | representation. |
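
Negative zero compares equal to positive zero, so an ordinary
comparison cannot tell them apart.  The sketch below uses the
@w{ISO C99} @code{signbit} macro from @file{math.h}, which is not
described in this section; the helper name @code{is_negative_zero} is
hypothetical.

```c
#include <math.h>

/* Hypothetical helper: return nonzero if x is zero and its sign bit
   is set.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```

For example, @code{is_negative_zero (1.0 / -INFINITY)} is nonzero,
while @code{is_negative_zero (0.0)} is zero.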
| 683 | |
| 684 | @node Status bit operations |
| 685 | @subsection Examining the FPU status word |
| 686 | |
| 687 | @w{ISO C99} defines functions to query and manipulate the |
| 688 | floating-point status word. You can use these functions to check for |
| 689 | untrapped exceptions when it's convenient, rather than worrying about |
| 690 | them in the middle of a calculation. |
| 691 | |
| 692 | These constants represent the various @w{IEEE 754} exceptions. Not all |
| 693 | FPUs report all the different exceptions. Each constant is defined if |
| 694 | and only if the FPU you are compiling for supports that exception, so |
| 695 | you can test for FPU support with @samp{#ifdef}. They are defined in |
| 696 | @file{fenv.h}. |
| 697 | |
| 698 | @vtable @code |
| 699 | @comment fenv.h |
| 700 | @comment ISO |
| 701 | @item FE_INEXACT |
| 702 | The inexact exception. |
| 703 | @comment fenv.h |
| 704 | @comment ISO |
| 705 | @item FE_DIVBYZERO |
| 706 | The divide by zero exception. |
| 707 | @comment fenv.h |
| 708 | @comment ISO |
| 709 | @item FE_UNDERFLOW |
| 710 | The underflow exception. |
| 711 | @comment fenv.h |
| 712 | @comment ISO |
| 713 | @item FE_OVERFLOW |
| 714 | The overflow exception. |
| 715 | @comment fenv.h |
| 716 | @comment ISO |
| 717 | @item FE_INVALID |
| 718 | The invalid exception. |
| 719 | @end vtable |
| 720 | |
| 721 | The macro @code{FE_ALL_EXCEPT} is the bitwise OR of all exception macros |
| 722 | which are supported by the FP implementation. |
| 723 | |
| 724 | These functions allow you to clear exception flags, test for exceptions, |
| 725 | and save and restore the set of exceptions flagged. |
| 726 | |
| 727 | @comment fenv.h |
| 728 | @comment ISO |
| 729 | @deftypefun int feclearexcept (int @var{excepts}) |
| 730 | @safety{@prelim{}@mtsafe{}@assafe{@assposix{}}@acsafe{@acsposix{}}} |
| 731 | @c The other functions in this section that modify FP status register |
| 732 | @c mostly do so with non-atomic load-modify-store sequences, but since |
| 733 | @c the register is thread-specific, this should be fine, and safe for |
| 734 | @c cancellation. As long as the FP environment is restored before the |
| 735 | @c signal handler returns control to the interrupted thread (like any |
| 736 | @c kernel should do), the functions are also safe for use in signal |
| 737 | @c handlers. |
| 738 | This function clears all of the supported exception flags indicated by |
| 739 | @var{excepts}. |
| 740 | |
| 741 | The function returns zero in case the operation was successful, a |
| 742 | non-zero value otherwise. |
| 743 | @end deftypefun |
| 744 | |
| 745 | @comment fenv.h |
| 746 | @comment ISO |
| 747 | @deftypefun int feraiseexcept (int @var{excepts}) |
| 748 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 749 | This function raises the supported exceptions indicated by |
| 750 | @var{excepts}. If more than one exception bit in @var{excepts} is set, |
| 751 | the order in which the exceptions are raised is undefined, except that |
| 752 | overflow (@code{FE_OVERFLOW}) and underflow (@code{FE_UNDERFLOW}) are |
| 753 | raised before inexact (@code{FE_INEXACT}). Whether raising overflow or |
| 754 | underflow also raises the inexact exception is |
| 755 | implementation-dependent. |
| 756 | |
| 757 | The function returns zero in case the operation was successful, a |
| 758 | non-zero value otherwise. |
| 759 | @end deftypefun |
| 760 | |
| 761 | @comment fenv.h |
| 762 | @comment ISO |
| 763 | @deftypefun int fetestexcept (int @var{excepts}) |
| 764 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 765 | Test whether the exception flags indicated by the parameter @var{excepts} |
| 766 | are currently set. If any of them are, a nonzero value is returned |
| 767 | which specifies which exceptions are set. Otherwise the result is zero. |
| 768 | @end deftypefun |
| 769 | |
| 770 | To understand these functions, imagine that the status word is an |
| 771 | integer variable named @var{status}. @code{feclearexcept} is then |
| 772 | equivalent to @samp{status &= ~excepts} and @code{fetestexcept} is |
| 773 | equivalent to @samp{(status & excepts)}. The actual implementation may |
| 774 | be very different, of course. |
| 775 | |
| 776 | Exception flags are only cleared when the program explicitly requests it, |
| 777 | by calling @code{feclearexcept}. If you want to check for exceptions |
| 778 | from a set of calculations, you should clear all the flags first. Here |
| 779 | is a simple example of the way to use @code{fetestexcept}: |
| 780 | |
| 781 | @smallexample |
| 782 | @{ |
| 783 | double f; |
| 784 | int raised; |
| 785 | feclearexcept (FE_ALL_EXCEPT); |
| 786 | f = compute (); |
| 787 | raised = fetestexcept (FE_OVERFLOW | FE_INVALID); |
| 788 | if (raised & FE_OVERFLOW) @{ /* @dots{} */ @} |
| 789 | if (raised & FE_INVALID) @{ /* @dots{} */ @} |
| 790 | /* @dots{} */ |
| 791 | @} |
| 792 | @end smallexample |
| 793 | |
| 794 | You cannot explicitly set bits in the status word. You can, however, |
| 795 | save the entire status word and restore it later. This is done with the |
| 796 | following functions: |
| 797 | |
| 798 | @comment fenv.h |
| 799 | @comment ISO |
| 800 | @deftypefun int fegetexceptflag (fexcept_t *@var{flagp}, int @var{excepts}) |
| 801 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 802 | This function stores in the variable pointed to by @var{flagp} an |
| 803 | implementation-defined value representing the current setting of the |
| 804 | exception flags indicated by @var{excepts}. |
| 805 | |
| 806 | The function returns zero in case the operation was successful, a |
| 807 | non-zero value otherwise. |
| 808 | @end deftypefun |
| 809 | |
| 810 | @comment fenv.h |
| 811 | @comment ISO |
| 812 | @deftypefun int fesetexceptflag (const fexcept_t *@var{flagp}, int @var{excepts}) |
| 813 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 814 | This function restores the flags for the exceptions indicated by |
| 815 | @var{excepts} to the values stored in the variable pointed to by |
| 816 | @var{flagp}. |
| 817 | |
| 818 | The function returns zero in case the operation was successful, a |
| 819 | non-zero value otherwise. |
| 820 | @end deftypefun |
| 821 | |
| 822 | Note that the value stored in @code{fexcept_t} bears no resemblance to |
| 823 | the bit mask returned by @code{fetestexcept}. The type may not even be |
| 824 | an integer. Do not attempt to modify an @code{fexcept_t} variable. |
| 825 | |
| 826 | @node Math Error Reporting |
| 827 | @subsection Error Reporting by Mathematical Functions |
| 828 | @cindex errors, mathematical |
| 829 | @cindex domain error |
| 830 | @cindex range error |
| 831 | |
| 832 | Many of the math functions are defined only over a subset of the real or |
| 833 | complex numbers. Even if they are mathematically defined, their result |
| 834 | may be larger or smaller than the range representable by their return |
| 835 | type without loss of accuracy. These are known as @dfn{domain errors}, |
| 836 | @dfn{overflows}, and |
| 837 | @dfn{underflows}, respectively. Math functions do several things when |
| 838 | one of these errors occurs. In this manual we will refer to the |
| 839 | complete response as @dfn{signalling} a domain error, overflow, or |
| 840 | underflow. |
| 841 | |
| 842 | When a math function suffers a domain error, it raises the invalid |
| 843 | exception and returns NaN. It also sets @var{errno} to @code{EDOM}; |
| 844 | this is for compatibility with old systems that do not support @w{IEEE |
| 845 | 754} exception handling. Likewise, when overflow occurs, math |
| 846 | functions raise the overflow exception and, in the default rounding |
| 847 | mode, return @math{@infinity{}} or @math{-@infinity{}} as appropriate |
| 848 | (in other rounding modes, the largest finite value of the appropriate |
| 849 | sign is returned when appropriate for that rounding mode). They also |
| 850 | set @var{errno} to @code{ERANGE} if returning @math{@infinity{}} or |
| 851 | @math{-@infinity{}}; @var{errno} may or may not be set to |
| 852 | @code{ERANGE} when a finite value is returned on overflow. When |
| 853 | underflow occurs, the underflow exception is raised, and zero |
| 854 | (appropriately signed) or a subnormal value, as appropriate for the |
| 855 | mathematical result of the function and the rounding mode, is |
| 856 | returned. @var{errno} may be set to @code{ERANGE}, but this is not |
| 857 | guaranteed; it is intended that @theglibc{} should set it when the |
| 858 | underflow is to an appropriately signed zero, but not necessarily for |
| 859 | other underflows. |
| 860 | |
| 861 | Some of the math functions are defined mathematically to result in a |
| 862 | complex value over parts of their domains. The most familiar example of |
| 863 | this is taking the square root of a negative number. The complex math |
| 864 | functions, such as @code{csqrt}, will return the appropriate complex value |
| 865 | in this case. The real-valued functions, such as @code{sqrt}, will |
| 866 | signal a domain error. |
| 867 | |
| 868 | Some older hardware does not support infinities. On that hardware, |
| 869 | overflows instead return a particular very large number (usually the |
| 870 | largest representable number). @file{math.h} defines macros you can use |
| 871 | to test for overflow on both old and new hardware. |
| 872 | |
| 873 | @comment math.h |
| 874 | @comment ISO |
| 875 | @deftypevr Macro double HUGE_VAL |
| 876 | @comment math.h |
| 877 | @comment ISO |
| 878 | @deftypevrx Macro float HUGE_VALF |
| 879 | @comment math.h |
| 880 | @comment ISO |
| 881 | @deftypevrx Macro {long double} HUGE_VALL |
| 882 | An expression representing a particular very large number. On machines |
| 883 | that use @w{IEEE 754} floating point format, @code{HUGE_VAL} is infinity. |
| 884 | On other machines, it's typically the largest positive number that can |
| 885 | be represented. |
| 886 | |
| 887 | Mathematical functions return the appropriately typed version of |
| 888 | @code{HUGE_VAL} or @code{@minus{}HUGE_VAL} when the result is too large |
| 889 | to be represented. |
| 890 | @end deftypevr |
| 891 | |
| 892 | @node Rounding |
| 893 | @section Rounding Modes |
| 894 | |
| 895 | Floating-point calculations are carried out internally with extra |
| 896 | precision, and then rounded to fit into the destination type. This |
| 897 | ensures that results are as precise as the input data. @w{IEEE 754} |
| 898 | defines four possible rounding modes: |
| 899 | |
| 900 | @table @asis |
| 901 | @item Round to nearest. |
| 902 | This is the default mode. It should be used unless there is a specific |
| 903 | need for one of the others. In this mode results are rounded to the |
| 904 | nearest representable value. If the result is midway between two |
| 905 | representable values, the even one is chosen. @dfn{Even} here |
| 906 | means the lowest-order bit is zero. This rounding mode prevents |
| 907 | statistical bias and limits round-off: each operation's relative error |
| 908 | is at most half of the machine epsilon (@code{FLT_EPSILON} for @code{float}). |
| 909 | |
| 910 | @c @item Round toward @math{+@infinity{}} |
| 911 | @item Round toward plus Infinity. |
| 912 | All results are rounded to the smallest representable value |
| 913 | which is greater than the result. |
| 914 | |
| 915 | @c @item Round toward @math{-@infinity{}} |
| 916 | @item Round toward minus Infinity. |
| 917 | All results are rounded to the largest representable value which is less |
| 918 | than the result. |
| 919 | |
| 920 | @item Round toward zero. |
| 921 | All results are rounded to the largest representable value whose |
| 922 | magnitude is less than that of the result. In other words, if the |
| 923 | result is negative it is rounded up; if it is positive, it is rounded |
| 924 | down. |
| 925 | @end table |
| 926 | |
| 927 | @noindent |
| 928 | @file{fenv.h} defines constants which you can use to refer to the |
| 929 | various rounding modes. Each one will be defined if and only if the FPU |
| 930 | supports the corresponding rounding mode. |
| 931 | |
| 932 | @table @code |
| 933 | @comment fenv.h |
| 934 | @comment ISO |
| 935 | @vindex FE_TONEAREST |
| 936 | @item FE_TONEAREST |
| 937 | Round to nearest. |
| 938 | |
| 939 | @comment fenv.h |
| 940 | @comment ISO |
| 941 | @vindex FE_UPWARD |
| 942 | @item FE_UPWARD |
| 943 | Round toward @math{+@infinity{}}. |
| 944 | |
| 945 | @comment fenv.h |
| 946 | @comment ISO |
| 947 | @vindex FE_DOWNWARD |
| 948 | @item FE_DOWNWARD |
| 949 | Round toward @math{-@infinity{}}. |
| 950 | |
| 951 | @comment fenv.h |
| 952 | @comment ISO |
| 953 | @vindex FE_TOWARDZERO |
| 954 | @item FE_TOWARDZERO |
| 955 | Round toward zero. |
| 956 | @end table |
| 957 | |
| 958 | Underflow is an unusual case. Normally, @w{IEEE 754} floating point |
| 959 | numbers are always normalized (@pxref{Floating Point Concepts}). |
| 960 | Numbers smaller than @math{2^r} (where @math{r} is the minimum exponent, |
| 961 | @code{FLT_MIN_EXP-1} for @code{float}) cannot be represented as |
| 962 | normalized numbers. Rounding all such numbers to zero or @math{2^r} |
| 963 | would cause some algorithms to fail at 0. Therefore, they are left in |
| 964 | denormalized form. That produces loss of precision, since some bits of |
| 965 | the mantissa are stolen to indicate the position of the binary point. |
| 966 | |
| 967 | If a result is too small to be represented as a denormalized number, it |
| 968 | is rounded to zero. However, the sign of the result is preserved; if |
| 969 | the calculation was negative, the result is @dfn{negative zero}. |
| 970 | Negative zero can also result from some operations on infinity, such as |
| 971 | @math{4/-@infinity{}}. |
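
Both effects can be observed from C. The sketch below assumes
@w{IEEE 754} arithmetic with gradual underflow enabled; the helper names
are hypothetical. It uses @code{signbit} because an ordinary comparison
cannot distinguish the two zeros, and @code{fpclassify} to recognize a
denormalized (subnormal) result.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: nonzero if X is a zero with the sign bit set.
   A plain comparison cannot tell, because -0.0 == 0.0 is true.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}

/* Hypothetical helper: halve X at run time.  The volatile copy keeps
   the compiler from folding the division.  */
float
halve (float x)
{
  volatile float v = x;
  return v / 2.0f;
}

/* Executable form of the claims in the text.  */
void
check_underflow_claims (void)
{
  assert (is_negative_zero (4.0 / -INFINITY));
  assert (!is_negative_zero (0.0));
  /* FLT_MIN is the smallest normalized float; half of it is subnormal.  */
  assert (fpclassify (halve (FLT_MIN)) == FP_SUBNORMAL);
}
```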
| 972 | |
| 973 | At any time one of the above four rounding modes is selected. You can |
| 974 | find out which one with this function: |
| 975 | |
| 976 | @comment fenv.h |
| 977 | @comment ISO |
| 978 | @deftypefun int fegetround (void) |
| 979 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 980 | Returns the currently selected rounding mode, represented by one of the |
| 981 | values of the defined rounding mode macros. |
| 982 | @end deftypefun |
| 983 | |
| 984 | @noindent |
| 985 | To change the rounding mode, use this function: |
| 986 | |
| 987 | @comment fenv.h |
| 988 | @comment ISO |
| 989 | @deftypefun int fesetround (int @var{round}) |
| 990 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 991 | Changes the currently selected rounding mode to @var{round}. If |
| 992 | @var{round} does not correspond to one of the supported rounding modes |
| 993 | nothing is changed. @code{fesetround} returns zero if it changed the |
| 994 | rounding mode, a nonzero value if the mode is not supported. |
| 995 | @end deftypefun |
| 996 | |
| 997 | You should avoid changing the rounding mode if possible. It can be an |
| 998 | expensive operation; also, some hardware requires you to compile your |
| 999 | program differently for it to work. The resulting code may run slower. |
| 1000 | See your compiler documentation for details. |
| 1001 | @c This section used to claim that functions existed to round one number |
| 1002 | @c in a specific fashion. I can't find any functions in the library |
| 1003 | @c that do that. -zw |
| 1004 | |
| 1005 | @node Control Functions |
| 1006 | @section Floating-Point Control Functions |
| 1007 | |
| 1008 | @w{IEEE 754} floating-point implementations allow the programmer to |
| 1009 | decide whether traps will occur for each of the exceptions, by setting |
| 1010 | bits in the @dfn{control word}. In C, traps result in the program |
| 1011 | receiving the @code{SIGFPE} signal; see @ref{Signal Handling}. |
| 1012 | |
| 1013 | @strong{NB:} @w{IEEE 754} says that trap handlers are given details of |
| 1014 | the exceptional situation, and can set the result value. C signals do |
| 1015 | not provide any mechanism to pass this information back and forth. |
| 1016 | Trapping exceptions in C is therefore not very useful. |
| 1017 | |
| 1018 | It is sometimes necessary to save the state of the floating-point unit |
| 1019 | while you perform some calculation. The library provides functions |
| 1020 | which save and restore the exception flags, the set of exceptions that |
| 1021 | generate traps, and the rounding mode. This information is known as the |
| 1022 | @dfn{floating-point environment}. |
| 1023 | |
| 1024 | The functions to save and restore the floating-point environment all use |
| 1025 | a variable of type @code{fenv_t} to store information. This type is |
| 1026 | defined in @file{fenv.h}. Its size and contents are |
| 1027 | implementation-defined. You should not attempt to manipulate a variable |
| 1028 | of this type directly. |
| 1029 | |
| 1030 | To save the state of the FPU, use one of these functions: |
| 1031 | |
| 1032 | @comment fenv.h |
| 1033 | @comment ISO |
| 1034 | @deftypefun int fegetenv (fenv_t *@var{envp}) |
| 1035 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1036 | Store the floating-point environment in the variable pointed to by |
| 1037 | @var{envp}. |
| 1038 | |
| 1039 | The function returns zero in case the operation was successful, a |
| 1040 | non-zero value otherwise. |
| 1041 | @end deftypefun |
| 1042 | |
| 1043 | @comment fenv.h |
| 1044 | @comment ISO |
| 1045 | @deftypefun int feholdexcept (fenv_t *@var{envp}) |
| 1046 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1047 | Store the current floating-point environment in the object pointed to by |
| 1048 | @var{envp}. Then clear all exception flags, and set the FPU to trap no |
| 1049 | exceptions. Not all FPUs support trapping no exceptions; if |
| 1050 | @code{feholdexcept} cannot set this mode, it returns a nonzero value. If it |
| 1051 | succeeds, it returns zero. |
| 1052 | @end deftypefun |
| 1053 | |
| 1054 | The functions which restore the floating-point environment can take these |
| 1055 | kinds of arguments: |
| 1056 | |
| 1057 | @itemize @bullet |
| 1058 | @item |
| 1059 | Pointers to @code{fenv_t} objects, which were initialized previously by a |
| 1060 | call to @code{fegetenv} or @code{feholdexcept}. |
| 1061 | @item |
| 1062 | @vindex FE_DFL_ENV |
| 1063 | The special macro @code{FE_DFL_ENV} which represents the floating-point |
| 1064 | environment as it was available at program start. |
| 1065 | @item |
| 1066 | Implementation defined macros with names starting with @code{FE_} and |
| 1067 | having type @code{fenv_t *}. |
| 1068 | |
| 1069 | @vindex FE_NOMASK_ENV |
| 1070 | If possible, @theglibc{} defines a macro @code{FE_NOMASK_ENV} |
| 1071 | which represents an environment where every exception raised causes a |
| 1072 | trap to occur. You can test for this macro using @code{#ifdef}. It is |
| 1073 | only defined if @code{_GNU_SOURCE} is defined. |
| 1074 | |
| 1075 | Some platforms might define other predefined environments. |
| 1076 | @end itemize |
| 1077 | |
| 1078 | @noindent |
| 1079 | To set the floating-point environment, you can use either of these |
| 1080 | functions: |
| 1081 | |
| 1082 | @comment fenv.h |
| 1083 | @comment ISO |
| 1084 | @deftypefun int fesetenv (const fenv_t *@var{envp}) |
| 1085 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1086 | Set the floating-point environment to that described by @var{envp}. |
| 1087 | |
| 1088 | The function returns zero in case the operation was successful, a |
| 1089 | non-zero value otherwise. |
| 1090 | @end deftypefun |
| 1091 | |
| 1092 | @comment fenv.h |
| 1093 | @comment ISO |
| 1094 | @deftypefun int feupdateenv (const fenv_t *@var{envp}) |
| 1095 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1096 | Like @code{fesetenv}, this function sets the floating-point environment |
| 1097 | to that described by @var{envp}. However, if any exceptions were |
| 1098 | flagged in the status word before @code{feupdateenv} was called, they |
| 1099 | remain flagged after the call. In other words, after @code{feupdateenv} |
| 1100 | is called, the status word is the bitwise OR of the previous status word |
| 1101 | and the one saved in @var{envp}. |
| 1102 | |
| 1103 | The function returns zero in case the operation was successful, a |
| 1104 | non-zero value otherwise. |
| 1105 | @end deftypefun |
| 1106 | |
| 1107 | @noindent |
| 1108 | To control, for individual exceptions, whether raising them causes a trap |
| 1109 | to occur, you can use the following functions. |
| 1110 | |
| 1111 | @strong{Portability Note:} These functions are all GNU extensions. |
| 1112 | |
| 1113 | @comment fenv.h |
| 1114 | @comment GNU |
| 1115 | @deftypefun int feenableexcept (int @var{excepts}) |
| 1116 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1117 | This function enables traps for each of the exceptions indicated by |
| 1118 | the parameter @var{excepts}. The individual exceptions are described in |
| 1119 | @ref{Status bit operations}. Only the specified exceptions are |
| 1120 | enabled; the status of the other exceptions is not changed. |
| 1121 | |
| 1122 | The function returns the previously enabled exceptions if the |
| 1123 | operation was successful, @code{-1} otherwise. |
| 1124 | @end deftypefun |
| 1125 | |
| 1126 | @comment fenv.h |
| 1127 | @comment GNU |
| 1128 | @deftypefun int fedisableexcept (int @var{excepts}) |
| 1129 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1130 | This function disables traps for each of the exceptions indicated by |
| 1131 | the parameter @var{excepts}. The individual exceptions are described in |
| 1132 | @ref{Status bit operations}. Only the specified exceptions are |
| 1133 | disabled; the status of the other exceptions is not changed. |
| 1134 | |
| 1135 | The function returns the previously enabled exceptions if the |
| 1136 | operation was successful, @code{-1} otherwise. |
| 1137 | @end deftypefun |
| 1138 | |
| 1139 | @comment fenv.h |
| 1140 | @comment GNU |
| 1141 | @deftypefun int fegetexcept (void) |
| 1142 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1143 | The function returns a bitmask of all currently enabled exceptions. It |
| 1144 | returns @code{-1} in case of failure. |
| 1145 | @end deftypefun |
| 1146 | |
| 1147 | @node Arithmetic Functions |
| 1148 | @section Arithmetic Functions |
| 1149 | |
| 1150 | The C library provides functions to do basic operations on |
| 1151 | floating-point numbers. These include absolute value, maximum and minimum, |
| 1152 | normalization, bit twiddling, rounding, and a few others. |
| 1153 | |
| 1154 | @menu |
| 1155 | * Absolute Value:: Absolute values of integers and floats. |
| 1156 | * Normalization Functions:: Extracting exponents and putting them back. |
| 1157 | * Rounding Functions:: Rounding floats to integers. |
| 1158 | * Remainder Functions:: Remainders on division, precisely defined. |
| 1159 | * FP Bit Twiddling:: Sign bit adjustment. Adding epsilon. |
| 1160 | * FP Comparison Functions:: Comparisons without risk of exceptions. |
| 1161 | * Misc FP Arithmetic:: Max, min, positive difference, multiply-add. |
| 1162 | @end menu |
| 1163 | |
| 1164 | @node Absolute Value |
| 1165 | @subsection Absolute Value |
| 1166 | @cindex absolute value functions |
| 1167 | |
| 1168 | These functions are provided for obtaining the @dfn{absolute value} (or |
| 1169 | @dfn{magnitude}) of a number. The absolute value of a real number |
| 1170 | @var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is |
| 1171 | negative. For a complex number @var{z}, whose real part is @var{x} and |
| 1172 | whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt |
| 1173 | (@var{x}*@var{x} + @var{y}*@var{y})}}. |
| 1174 | |
| 1175 | @pindex math.h |
| 1176 | @pindex stdlib.h |
| 1177 | Prototypes for @code{abs}, @code{labs} and @code{llabs} are in @file{stdlib.h}; |
| 1178 | @code{imaxabs} is declared in @file{inttypes.h}; |
| 1179 | @code{fabs}, @code{fabsf} and @code{fabsl} are declared in @file{math.h}. |
| 1180 | @code{cabs}, @code{cabsf} and @code{cabsl} are declared in @file{complex.h}. |
| 1181 | |
| 1182 | @comment stdlib.h |
| 1183 | @comment ISO |
| 1184 | @deftypefun int abs (int @var{number}) |
| 1185 | @comment stdlib.h |
| 1186 | @comment ISO |
| 1187 | @deftypefunx {long int} labs (long int @var{number}) |
| 1188 | @comment stdlib.h |
| 1189 | @comment ISO |
| 1190 | @deftypefunx {long long int} llabs (long long int @var{number}) |
| 1191 | @comment inttypes.h |
| 1192 | @comment ISO |
| 1193 | @deftypefunx intmax_t imaxabs (intmax_t @var{number}) |
| 1194 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1195 | These functions return the absolute value of @var{number}. |
| 1196 | |
| 1197 | Most computers use a two's complement integer representation, in which |
| 1198 | the absolute value of @code{INT_MIN} (the smallest possible @code{int}) |
| 1199 | cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined. |
| 1200 | |
| 1201 | @code{llabs} and @code{imaxabs} are new to @w{ISO C99}. |
| 1202 | |
| 1203 | See @ref{Integers} for a description of the @code{intmax_t} type. |
| 1204 | |
| 1205 | @end deftypefun |
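
When the argument may be @code{INT_MIN}, one way to sidestep the undefined
case is to compute the magnitude in an unsigned type. The helper
@code{unsigned_abs} below is a hypothetical sketch, not a library
function, and assumes a two's complement @code{int}.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: absolute value of N as an unsigned int, which
   can hold the magnitude of INT_MIN on two's complement machines.  */
unsigned int
unsigned_abs (int n)
{
  /* Convert before negating: unsigned arithmetic wraps around and
     never overflows.  */
  return n < 0 ? 0u - (unsigned int) n : (unsigned int) n;
}

/* Executable form of the claims above.  */
void
check_unsigned_abs (void)
{
  assert (unsigned_abs (-5) == 5u);
  assert (unsigned_abs (INT_MIN) == (unsigned int) INT_MAX + 1u);
}
```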
| 1206 | |
| 1207 | @comment math.h |
| 1208 | @comment ISO |
| 1209 | @deftypefun double fabs (double @var{number}) |
| 1210 | @comment math.h |
| 1211 | @comment ISO |
| 1212 | @deftypefunx float fabsf (float @var{number}) |
| 1213 | @comment math.h |
| 1214 | @comment ISO |
| 1215 | @deftypefunx {long double} fabsl (long double @var{number}) |
| 1216 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1217 | This function returns the absolute value of the floating-point number |
| 1218 | @var{number}. |
| 1219 | @end deftypefun |
| 1220 | |
| 1221 | @comment complex.h |
| 1222 | @comment ISO |
| 1223 | @deftypefun double cabs (complex double @var{z}) |
| 1224 | @comment complex.h |
| 1225 | @comment ISO |
| 1226 | @deftypefunx float cabsf (complex float @var{z}) |
| 1227 | @comment complex.h |
| 1228 | @comment ISO |
| 1229 | @deftypefunx {long double} cabsl (complex long double @var{z}) |
| 1230 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1231 | These functions return the absolute value of the complex number @var{z} |
| 1232 | (@pxref{Complex Numbers}). The absolute value of a complex number is: |
| 1233 | |
| 1234 | @smallexample |
| 1235 | sqrt (creal (@var{z}) * creal (@var{z}) + cimag (@var{z}) * cimag (@var{z})) |
| 1236 | @end smallexample |
| 1237 | |
| 1238 | This function should always be used instead of the direct formula |
| 1239 | because it takes special care to avoid losing precision. It may also |
| 1240 | take advantage of hardware support for this operation. See @code{hypot} |
| 1241 | in @ref{Exponents and Logarithms}. |
| 1242 | @end deftypefun |
| 1243 | |
| 1244 | @node Normalization Functions |
| 1245 | @subsection Normalization Functions |
| 1246 | @cindex normalization functions (floating-point) |
| 1247 | |
| 1248 | The functions described in this section are primarily provided as a way |
| 1249 | to efficiently perform certain low-level manipulations on floating point |
| 1250 | numbers that are represented internally using a binary radix; |
| 1251 | see @ref{Floating Point Concepts}. These functions are required to |
| 1252 | have equivalent behavior even if the representation does not use a radix |
| 1253 | of 2, but of course they are unlikely to be particularly efficient in |
| 1254 | those cases. |
| 1255 | |
| 1256 | @pindex math.h |
| 1257 | All these functions are declared in @file{math.h}. |
| 1258 | |
| 1259 | @comment math.h |
| 1260 | @comment ISO |
| 1261 | @deftypefun double frexp (double @var{value}, int *@var{exponent}) |
| 1262 | @comment math.h |
| 1263 | @comment ISO |
| 1264 | @deftypefunx float frexpf (float @var{value}, int *@var{exponent}) |
| 1265 | @comment math.h |
| 1266 | @comment ISO |
| 1267 | @deftypefunx {long double} frexpl (long double @var{value}, int *@var{exponent}) |
| 1268 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1269 | These functions are used to split the number @var{value} |
| 1270 | into a normalized fraction and an exponent. |
| 1271 | |
| 1272 | If the argument @var{value} is not zero, the return value is @var{value} |
| 1273 | times a power of two, and its magnitude is always in the range 1/2 |
| 1274 | (inclusive) to 1 (exclusive). The corresponding exponent is stored in |
| 1275 | @code{*@var{exponent}}; the return value multiplied by 2 raised to this |
| 1276 | exponent equals the original number @var{value}. |
| 1277 | |
| 1278 | For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and |
| 1279 | stores @code{4} in @code{exponent}. |
| 1280 | |
| 1281 | If @var{value} is zero, then the return value is zero and |
| 1282 | zero is stored in @code{*@var{exponent}}. |
| 1283 | @end deftypefun |
| 1284 | |
| 1285 | @comment math.h |
| 1286 | @comment ISO |
| 1287 | @deftypefun double ldexp (double @var{value}, int @var{exponent}) |
| 1288 | @comment math.h |
| 1289 | @comment ISO |
| 1290 | @deftypefunx float ldexpf (float @var{value}, int @var{exponent}) |
| 1291 | @comment math.h |
| 1292 | @comment ISO |
| 1293 | @deftypefunx {long double} ldexpl (long double @var{value}, int @var{exponent}) |
| 1294 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1295 | These functions return the result of multiplying the floating-point |
| 1296 | number @var{value} by 2 raised to the power @var{exponent}. (They can |
| 1297 | be used to reassemble floating-point numbers that were taken apart |
| 1298 | by @code{frexp}.) |
| 1299 | |
| 1300 | For example, @code{ldexp (0.8, 4)} returns @code{12.8}. |
| 1301 | @end deftypefun |
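
The relationship between @code{frexp} and @code{ldexp} can be sketched as
a round trip. The helper @code{split_and_rebuild} is a hypothetical name
used only for illustration. For finite arguments the round trip is exact,
since scaling by a power of two does not change the significand.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: take the finite number VALUE apart with frexp
   and put it back together with ldexp.  */
double
split_and_rebuild (double value)
{
  int exponent;
  double fraction;

  assert (isfinite (value));
  fraction = frexp (value, &exponent);
  /* For nonzero VALUE the fraction's magnitude lies in [1/2, 1).  */
  assert (value == 0.0
          || (fabs (fraction) >= 0.5 && fabs (fraction) < 1.0));
  return ldexp (fraction, exponent);
}
```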
| 1302 | |
| 1303 | The following functions, which come from BSD, provide facilities |
| 1304 | equivalent to those of @code{ldexp} and @code{frexp}. See also the |
| 1305 | @w{ISO C} function @code{logb} which originally also appeared in BSD. |
| 1306 | |
| 1307 | @comment math.h |
| 1308 | @comment BSD |
| 1309 | @deftypefun double scalb (double @var{value}, double @var{exponent}) |
| 1310 | @comment math.h |
| 1311 | @comment BSD |
| 1312 | @deftypefunx float scalbf (float @var{value}, float @var{exponent}) |
| 1313 | @comment math.h |
| 1314 | @comment BSD |
| 1315 | @deftypefunx {long double} scalbl (long double @var{value}, long double @var{exponent}) |
| 1316 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1317 | The @code{scalb} function is the BSD name for @code{ldexp}. |
| 1318 | @end deftypefun |
| 1319 | |
| 1320 | @comment math.h |
| 1321 | @comment BSD |
| 1322 | @deftypefun double scalbn (double @var{x}, int @var{n}) |
| 1323 | @comment math.h |
| 1324 | @comment BSD |
| 1325 | @deftypefunx float scalbnf (float @var{x}, int @var{n}) |
| 1326 | @comment math.h |
| 1327 | @comment BSD |
| 1328 | @deftypefunx {long double} scalbnl (long double @var{x}, int @var{n}) |
| 1329 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1330 | @code{scalbn} is identical to @code{scalb}, except that the exponent |
| 1331 | @var{n} is an @code{int} instead of a floating-point number. |
| 1332 | @end deftypefun |
| 1333 | |
| 1334 | @comment math.h |
| 1335 | @comment BSD |
| 1336 | @deftypefun double scalbln (double @var{x}, long int @var{n}) |
| 1337 | @comment math.h |
| 1338 | @comment BSD |
| 1339 | @deftypefunx float scalblnf (float @var{x}, long int @var{n}) |
| 1340 | @comment math.h |
| 1341 | @comment BSD |
| 1342 | @deftypefunx {long double} scalblnl (long double @var{x}, long int @var{n}) |
| 1343 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1344 | @code{scalbln} is identical to @code{scalb}, except that the exponent |
| 1345 | @var{n} is a @code{long int} instead of a floating-point number. |
| 1346 | @end deftypefun |
| 1347 | |
| 1348 | @comment math.h |
| 1349 | @comment BSD |
| 1350 | @deftypefun double significand (double @var{x}) |
| 1351 | @comment math.h |
| 1352 | @comment BSD |
| 1353 | @deftypefunx float significandf (float @var{x}) |
| 1354 | @comment math.h |
| 1355 | @comment BSD |
| 1356 | @deftypefunx {long double} significandl (long double @var{x}) |
| 1357 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1358 | @code{significand} returns the mantissa of @var{x} scaled to the range |
| 1359 | @math{[1, 2)}. |
| 1360 | It is equivalent to @w{@code{scalb (@var{x}, (double) -ilogb (@var{x}))}}. |
| 1361 | |
| 1362 | This function exists mainly for use in certain standardized tests |
| 1363 | of @w{IEEE 754} conformance. |
| 1364 | @end deftypefun |
| 1365 | |
| 1366 | @node Rounding Functions |
| 1367 | @subsection Rounding Functions |
| 1368 | @cindex converting floats to integers |
| 1369 | |
| 1370 | @pindex math.h |
| 1371 | The functions listed here perform operations such as rounding and |
| 1372 | truncation of floating-point values. Some of these functions convert |
| 1373 | floating point numbers to integer values. They are all declared in |
| 1374 | @file{math.h}. |
| 1375 | |
| 1376 | You can also convert floating-point numbers to integers simply by |
| 1377 | casting them to @code{int}. This discards the fractional part, |
| 1378 | effectively rounding towards zero. However, this only works if the |
| 1379 | result can actually be represented as an @code{int}---for very large |
| 1380 | numbers, this is impossible. The functions listed here return the |
| 1381 | result as a @code{double} instead to get around this problem. |
| 1382 | |
| 1383 | @comment math.h |
| 1384 | @comment ISO |
| 1385 | @deftypefun double ceil (double @var{x}) |
| 1386 | @comment math.h |
| 1387 | @comment ISO |
| 1388 | @deftypefunx float ceilf (float @var{x}) |
| 1389 | @comment math.h |
| 1390 | @comment ISO |
| 1391 | @deftypefunx {long double} ceill (long double @var{x}) |
| 1392 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1393 | These functions round @var{x} upwards to the nearest integer, |
| 1394 | returning that value as a @code{double}. Thus, @code{ceil (1.5)} |
| 1395 | is @code{2.0}. |
| 1396 | @end deftypefun |
| 1397 | |
| 1398 | @comment math.h |
| 1399 | @comment ISO |
| 1400 | @deftypefun double floor (double @var{x}) |
| 1401 | @comment math.h |
| 1402 | @comment ISO |
| 1403 | @deftypefunx float floorf (float @var{x}) |
| 1404 | @comment math.h |
| 1405 | @comment ISO |
| 1406 | @deftypefunx {long double} floorl (long double @var{x}) |
| 1407 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1408 | These functions round @var{x} downwards to the nearest |
| 1409 | integer, returning that value as a @code{double}. Thus, @code{floor |
| 1410 | (1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}. |
| 1411 | @end deftypefun |
| 1412 | |
| 1413 | @comment math.h |
| 1414 | @comment ISO |
| 1415 | @deftypefun double trunc (double @var{x}) |
| 1416 | @comment math.h |
| 1417 | @comment ISO |
| 1418 | @deftypefunx float truncf (float @var{x}) |
| 1419 | @comment math.h |
| 1420 | @comment ISO |
| 1421 | @deftypefunx {long double} truncl (long double @var{x}) |
| 1422 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1423 | The @code{trunc} functions round @var{x} towards zero to the nearest |
| 1424 | integer (returned in floating-point format). Thus, @code{trunc (1.5)} |
| 1425 | is @code{1.0} and @code{trunc (-1.5)} is @code{-1.0}. |
| 1426 | @end deftypefun |
| 1427 | |
| 1428 | @comment math.h |
| 1429 | @comment ISO |
| 1430 | @deftypefun double rint (double @var{x}) |
| 1431 | @comment math.h |
| 1432 | @comment ISO |
| 1433 | @deftypefunx float rintf (float @var{x}) |
| 1434 | @comment math.h |
| 1435 | @comment ISO |
| 1436 | @deftypefunx {long double} rintl (long double @var{x}) |
| 1437 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1438 | These functions round @var{x} to an integer value according to the |
| 1439 | current rounding mode. @xref{Floating Point Parameters}, for |
| 1440 | information about the various rounding modes. The default |
| 1441 | rounding mode is to round to the nearest integer; some machines |
| 1442 | support other modes, but round-to-nearest is always used unless |
| 1443 | you explicitly select another. |
| 1444 | |
| 1445 | If @var{x} was not initially an integer, these functions raise the |
| 1446 | inexact exception. |
| 1447 | @end deftypefun |
| 1448 | |
| 1449 | @comment math.h |
| 1450 | @comment ISO |
| 1451 | @deftypefun double nearbyint (double @var{x}) |
| 1452 | @comment math.h |
| 1453 | @comment ISO |
| 1454 | @deftypefunx float nearbyintf (float @var{x}) |
| 1455 | @comment math.h |
| 1456 | @comment ISO |
| 1457 | @deftypefunx {long double} nearbyintl (long double @var{x}) |
| 1458 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1459 | These functions return the same value as the @code{rint} functions, but |
| 1460 | do not raise the inexact exception if @var{x} is not an integer. |
| 1461 | @end deftypefun |
| 1462 | |
| 1463 | @comment math.h |
| 1464 | @comment ISO |
| 1465 | @deftypefun double round (double @var{x}) |
| 1466 | @comment math.h |
| 1467 | @comment ISO |
| 1468 | @deftypefunx float roundf (float @var{x}) |
| 1469 | @comment math.h |
| 1470 | @comment ISO |
| 1471 | @deftypefunx {long double} roundl (long double @var{x}) |
| 1472 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1473 | These functions are similar to @code{rint}, but they always round to the |
| 1474 | nearest integer, with halfway cases rounded away from zero, regardless |
| 1475 | of the current rounding mode. |
| 1476 | @end deftypefun |
| 1477 | |
| 1478 | @comment math.h |
| 1479 | @comment ISO |
| 1480 | @deftypefun {long int} lrint (double @var{x}) |
| 1481 | @comment math.h |
| 1482 | @comment ISO |
| 1483 | @deftypefunx {long int} lrintf (float @var{x}) |
| 1484 | @comment math.h |
| 1485 | @comment ISO |
| 1486 | @deftypefunx {long int} lrintl (long double @var{x}) |
| 1487 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1488 | These functions are just like @code{rint}, but they return a |
| 1489 | @code{long int} instead of a floating-point number. |
| 1490 | @end deftypefun |
| 1491 | |
| 1492 | @comment math.h |
| 1493 | @comment ISO |
| 1494 | @deftypefun {long long int} llrint (double @var{x}) |
| 1495 | @comment math.h |
| 1496 | @comment ISO |
| 1497 | @deftypefunx {long long int} llrintf (float @var{x}) |
| 1498 | @comment math.h |
| 1499 | @comment ISO |
| 1500 | @deftypefunx {long long int} llrintl (long double @var{x}) |
| 1501 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1502 | These functions are just like @code{rint}, but they return a |
| 1503 | @code{long long int} instead of a floating-point number. |
| 1504 | @end deftypefun |
| 1505 | |
| 1506 | @comment math.h |
| 1507 | @comment ISO |
| 1508 | @deftypefun {long int} lround (double @var{x}) |
| 1509 | @comment math.h |
| 1510 | @comment ISO |
| 1511 | @deftypefunx {long int} lroundf (float @var{x}) |
| 1512 | @comment math.h |
| 1513 | @comment ISO |
| 1514 | @deftypefunx {long int} lroundl (long double @var{x}) |
| 1515 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1516 | These functions are just like @code{round}, but they return a |
| 1517 | @code{long int} instead of a floating-point number. |
| 1518 | @end deftypefun |
| 1519 | |
| 1520 | @comment math.h |
| 1521 | @comment ISO |
| 1522 | @deftypefun {long long int} llround (double @var{x}) |
| 1523 | @comment math.h |
| 1524 | @comment ISO |
| 1525 | @deftypefunx {long long int} llroundf (float @var{x}) |
| 1526 | @comment math.h |
| 1527 | @comment ISO |
| 1528 | @deftypefunx {long long int} llroundl (long double @var{x}) |
| 1529 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1530 | These functions are just like @code{round}, but they return a |
| 1531 | @code{long long int} instead of a floating-point number. |
| 1532 | @end deftypefun |
| 1533 | |
| 1534 | |
| 1535 | @comment math.h |
| 1536 | @comment ISO |
| 1537 | @deftypefun double modf (double @var{value}, double *@var{integer-part}) |
| 1538 | @comment math.h |
| 1539 | @comment ISO |
| 1540 | @deftypefunx float modff (float @var{value}, float *@var{integer-part}) |
| 1541 | @comment math.h |
| 1542 | @comment ISO |
| 1543 | @deftypefunx {long double} modfl (long double @var{value}, long double *@var{integer-part}) |
| 1544 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1545 | These functions break the argument @var{value} into an integer part and a |
| 1546 | fractional part (between @code{-1} and @code{1}, exclusive). Their sum |
| 1547 | equals @var{value}. Each of the parts has the same sign as @var{value}, |
| 1548 | and the integer part is always rounded toward zero. |
| 1549 | |
| 1550 | @code{modf} stores the integer part in @code{*@var{integer-part}}, and |
| 1551 | returns the fractional part. For example, @code{modf (2.5, &intpart)} |
| 1552 | returns @code{0.5} and stores @code{2.0} into @code{intpart}. |
| 1553 | @end deftypefun |
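
A negative argument shows that both parts returned by @code{modf} carry the sign of the input. The following sketch wraps @code{modf} in a hypothetical helper, @code{split_number}, which also converts the integer part to @code{long}; it assumes the caller passes a value whose integer part fits in that type.

```c
#include <math.h>

/* Hypothetical helper: return the fractional part of VALUE and store
   its integer part through WHOLE.  The cast is only safe when the
   integer part is representable as a long.  */
double
split_number (double value, long *whole)
{
  double intpart;
  double frac = modf (value, &intpart);
  *whole = (long) intpart;
  return frac;
}
```

Calling @code{split_number (-3.75, &w)} returns @code{-0.75} and stores @code{-3} in @code{w}.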
| 1554 | |
| 1555 | @node Remainder Functions |
| 1556 | @subsection Remainder Functions |
| 1557 | |
| 1558 | The functions in this section compute the remainder on division of two |
| 1559 | floating-point numbers. Each is a little different; pick the one that |
| 1560 | suits your problem. |
| 1561 | |
| 1562 | @comment math.h |
| 1563 | @comment ISO |
| 1564 | @deftypefun double fmod (double @var{numerator}, double @var{denominator}) |
| 1565 | @comment math.h |
| 1566 | @comment ISO |
| 1567 | @deftypefunx float fmodf (float @var{numerator}, float @var{denominator}) |
| 1568 | @comment math.h |
| 1569 | @comment ISO |
| 1570 | @deftypefunx {long double} fmodl (long double @var{numerator}, long double @var{denominator}) |
| 1571 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1572 | These functions compute the remainder from the division of |
| 1573 | @var{numerator} by @var{denominator}. Specifically, the return value is |
| 1574 | @code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n} |
| 1575 | is the quotient of @var{numerator} divided by @var{denominator}, rounded |
| 1576 | towards zero to an integer. Thus, @w{@code{fmod (6.5, 2.3)}} returns |
| 1577 | @code{1.9}, which is @code{6.5} minus @code{4.6}. |
| 1578 | |
| 1579 | The result has the same sign as the @var{numerator} and has magnitude |
| 1580 | less than the magnitude of the @var{denominator}. |
| 1581 | |
| 1582 | If @var{denominator} is zero, @code{fmod} signals a domain error. |
| 1583 | @end deftypefun |
| 1584 | |
| 1585 | @comment math.h |
| 1586 | @comment BSD |
| 1587 | @deftypefun double drem (double @var{numerator}, double @var{denominator}) |
| 1588 | @comment math.h |
| 1589 | @comment BSD |
| 1590 | @deftypefunx float dremf (float @var{numerator}, float @var{denominator}) |
| 1591 | @comment math.h |
| 1592 | @comment BSD |
| 1593 | @deftypefunx {long double} dreml (long double @var{numerator}, long double @var{denominator}) |
| 1594 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1595 | These functions are like @code{fmod} except that they round the |
| 1596 | internal quotient @var{n} to the nearest integer instead of towards zero |
| 1597 | to an integer. For example, @code{drem (6.5, 2.3)} returns @code{-0.4}, |
| 1598 | which is @code{6.5} minus @code{6.9}. |
| 1599 | |
| 1600 | The absolute value of the result is less than or equal to half the |
| 1601 | absolute value of the @var{denominator}. The difference between |
| 1602 | @code{fmod (@var{numerator}, @var{denominator})} and @code{drem |
| 1603 | (@var{numerator}, @var{denominator})} is always either |
| 1604 | @var{denominator}, minus @var{denominator}, or zero. |
| 1605 | |
| 1606 | If @var{denominator} is zero, @code{drem} signals a domain error. |
| 1607 | @end deftypefun |
| 1608 | |
| 1609 | @comment math.h |
| 1610 | @comment BSD |
| 1611 | @deftypefun double remainder (double @var{numerator}, double @var{denominator}) |
| 1612 | @comment math.h |
| 1613 | @comment BSD |
| 1614 | @deftypefunx float remainderf (float @var{numerator}, float @var{denominator}) |
| 1615 | @comment math.h |
| 1616 | @comment BSD |
| 1617 | @deftypefunx {long double} remainderl (long double @var{numerator}, long double @var{denominator}) |
| 1618 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1619 | These functions are another name for the @code{drem} functions. |
| 1620 | @end deftypefun |
| 1621 | |
| 1622 | @node FP Bit Twiddling |
| 1623 | @subsection Setting and modifying single bits of FP values |
| 1624 | @cindex FP arithmetic |
| 1625 | |
| 1626 | There are some operations that are too complicated or expensive to |
| 1627 | perform by hand on floating-point numbers. @w{ISO C99} defines |
| 1628 | functions to do these operations, which mostly involve changing single |
| 1629 | bits. |
| 1630 | |
| 1631 | @comment math.h |
| 1632 | @comment ISO |
| 1633 | @deftypefun double copysign (double @var{x}, double @var{y}) |
| 1634 | @comment math.h |
| 1635 | @comment ISO |
| 1636 | @deftypefunx float copysignf (float @var{x}, float @var{y}) |
| 1637 | @comment math.h |
| 1638 | @comment ISO |
| 1639 | @deftypefunx {long double} copysignl (long double @var{x}, long double @var{y}) |
| 1640 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1641 | These functions return @var{x} but with the sign of @var{y}. They work |
| 1642 | even if @var{x} or @var{y} are NaN or zero. Both of these can carry a |
| 1643 | sign (although not all implementations support it) and this is one of |
| 1644 | the few operations that can tell the difference. |
| 1645 | |
| 1646 | @code{copysign} never raises an exception. |
| 1647 | @c except signalling NaNs |
| 1648 | |
| 1649 | This function is defined in @w{IEC 559} (and the appendix with |
| 1650 | recommended functions in @w{IEEE 754}/@w{IEEE 854}). |
| 1651 | @end deftypefun |
| 1652 | |
| 1653 | @comment math.h |
| 1654 | @comment ISO |
| 1655 | @deftypefun int signbit (@emph{float-type} @var{x}) |
| 1656 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1657 | @code{signbit} is a generic macro which can work on all floating-point |
| 1658 | types. It returns a nonzero value if the value of @var{x} has its sign |
| 1659 | bit set. |
| 1660 | |
| 1661 | This is not the same as @code{x < 0.0}, because @w{IEEE 754} floating |
| 1662 | point allows zero to be signed. The comparison @code{-0.0 < 0.0} is |
| 1663 | false, but @code{signbit (-0.0)} will return a nonzero value. |
| 1664 | @end deftypefun |
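
Since an ordinary comparison cannot distinguish the two zeros, a test for negative zero has to combine a comparison with @code{signbit}. A minimal sketch, assuming an @w{IEEE 754} platform with signed zeros; @code{is_negative_zero} is a hypothetical name.

```c
#include <math.h>

/* Hypothetical helper: return 1 if X is a zero whose sign bit is set,
   and 0 otherwise.  The comparison with 0.0 is true for both zeros;
   signbit then separates them.  */
int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x);
}
```

In the same spirit, @code{copysign (3.0, -0.0)} yields @code{-3.0}, because @code{copysign} looks only at the sign bit of its second argument.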
| 1665 | |
| 1666 | @comment math.h |
| 1667 | @comment ISO |
| 1668 | @deftypefun double nextafter (double @var{x}, double @var{y}) |
| 1669 | @comment math.h |
| 1670 | @comment ISO |
| 1671 | @deftypefunx float nextafterf (float @var{x}, float @var{y}) |
| 1672 | @comment math.h |
| 1673 | @comment ISO |
| 1674 | @deftypefunx {long double} nextafterl (long double @var{x}, long double @var{y}) |
| 1675 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1676 | The @code{nextafter} function returns the next representable neighbor of |
| 1677 | @var{x} in the direction towards @var{y}. The size of the step between |
| 1678 | @var{x} and the result depends on the type of the result. If |
| 1679 | @math{@var{x} = @var{y}} the function simply returns @var{y}. If either |
| 1680 | value is @code{NaN}, @code{NaN} is returned. Otherwise |
| 1681 | a value corresponding to the value of the least significant bit in the |
| 1682 | mantissa is added or subtracted, depending on the direction. |
| 1683 | @code{nextafter} will signal overflow or underflow if the result goes |
| 1684 | outside of the range of normalized numbers. |
| 1685 | |
| 1686 | This function is defined in @w{IEC 559} (and the appendix with |
| 1687 | recommended functions in @w{IEEE 754}/@w{IEEE 854}). |
| 1688 | @end deftypefun |
| 1689 | |
| 1690 | @comment math.h |
| 1691 | @comment ISO |
| 1692 | @deftypefun double nexttoward (double @var{x}, long double @var{y}) |
| 1693 | @comment math.h |
| 1694 | @comment ISO |
| 1695 | @deftypefunx float nexttowardf (float @var{x}, long double @var{y}) |
| 1696 | @comment math.h |
| 1697 | @comment ISO |
| 1698 | @deftypefunx {long double} nexttowardl (long double @var{x}, long double @var{y}) |
| 1699 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1700 | These functions are identical to the corresponding versions of |
| 1701 | @code{nextafter} except that their second argument is a @code{long |
| 1702 | double}. |
| 1703 | @end deftypefun |
| 1704 | |
| 1705 | @cindex NaN |
| 1706 | @comment math.h |
| 1707 | @comment ISO |
| 1708 | @deftypefun double nan (const char *@var{tagp}) |
| 1709 | @comment math.h |
| 1710 | @comment ISO |
| 1711 | @deftypefunx float nanf (const char *@var{tagp}) |
| 1712 | @comment math.h |
| 1713 | @comment ISO |
| 1714 | @deftypefunx {long double} nanl (const char *@var{tagp}) |
| 1715 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 1716 | @c The unsafe-but-ruled-safe locale use comes from strtod. |
| 1717 | The @code{nan} function returns a representation of NaN, provided that |
| 1718 | NaN is supported by the target platform. |
| 1719 | @code{nan ("@var{n-char-sequence}")} is equivalent to |
| 1720 | @code{strtod ("NAN(@var{n-char-sequence})", NULL)}. |
| 1721 | |
| 1722 | The argument @var{tagp} is used in an unspecified manner. On @w{IEEE |
| 1723 | 754} systems, there are many representations of NaN, and @var{tagp} |
| 1724 | selects one. On other systems it may do nothing. |
| 1725 | @end deftypefun |
| 1726 | |
| 1727 | @node FP Comparison Functions |
| 1728 | @subsection Floating-Point Comparison Functions |
| 1729 | @cindex unordered comparison |
| 1730 | |
| 1731 | The standard C comparison operators provoke exceptions when one or other |
| 1732 | of the operands is NaN. For example, |
| 1733 | |
| 1734 | @smallexample |
| 1735 | int v = a < 1.0; |
| 1736 | @end smallexample |
| 1737 | |
| 1738 | @noindent |
| 1739 | will raise an exception if @var{a} is NaN. (This does @emph{not} |
| 1740 | happen with @code{==} and @code{!=}; those merely return false and true, |
| 1741 | respectively, when NaN is examined.) Frequently this exception is |
| 1742 | undesirable. @w{ISO C99} therefore defines comparison functions that |
| 1743 | do not raise exceptions when NaN is examined. All of the functions are |
| 1744 | implemented as macros which allow their arguments to be of any |
| 1745 | floating-point type. The macros are guaranteed to evaluate their |
| 1746 | arguments only once. |
| 1747 | |
| 1748 | @comment math.h |
| 1749 | @comment ISO |
| 1750 | @deftypefn Macro int isgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1751 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1752 | This macro determines whether the argument @var{x} is greater than |
| 1753 | @var{y}. It is equivalent to @code{(@var{x}) > (@var{y})}, but no |
| 1754 | exception is raised if @var{x} or @var{y} are NaN. |
| 1755 | @end deftypefn |
| 1756 | |
| 1757 | @comment math.h |
| 1758 | @comment ISO |
| 1759 | @deftypefn Macro int isgreaterequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1760 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1761 | This macro determines whether the argument @var{x} is greater than or |
| 1762 | equal to @var{y}. It is equivalent to @code{(@var{x}) >= (@var{y})}, but no |
| 1763 | exception is raised if @var{x} or @var{y} are NaN. |
| 1764 | @end deftypefn |
| 1765 | |
| 1766 | @comment math.h |
| 1767 | @comment ISO |
| 1768 | @deftypefn Macro int isless (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1769 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1770 | This macro determines whether the argument @var{x} is less than @var{y}. |
| 1771 | It is equivalent to @code{(@var{x}) < (@var{y})}, but no exception is |
| 1772 | raised if @var{x} or @var{y} are NaN. |
| 1773 | @end deftypefn |
| 1774 | |
| 1775 | @comment math.h |
| 1776 | @comment ISO |
| 1777 | @deftypefn Macro int islessequal (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1778 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1779 | This macro determines whether the argument @var{x} is less than or equal |
| 1780 | to @var{y}. It is equivalent to @code{(@var{x}) <= (@var{y})}, but no |
| 1781 | exception is raised if @var{x} or @var{y} are NaN. |
| 1782 | @end deftypefn |
| 1783 | |
| 1784 | @comment math.h |
| 1785 | @comment ISO |
| 1786 | @deftypefn Macro int islessgreater (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1787 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1788 | This macro determines whether the argument @var{x} is less or greater |
| 1789 | than @var{y}. It is equivalent to @code{(@var{x}) < (@var{y}) || |
| 1790 | (@var{x}) > (@var{y})} (although it only evaluates @var{x} and @var{y} |
| 1791 | once), but no exception is raised if @var{x} or @var{y} are NaN. |
| 1792 | |
| 1793 | This macro is not equivalent to @code{@var{x} != @var{y}}, because that |
| 1794 | expression is true if @var{x} or @var{y} are NaN. |
| 1795 | @end deftypefn |
| 1796 | |
| 1797 | @comment math.h |
| 1798 | @comment ISO |
| 1799 | @deftypefn Macro int isunordered (@emph{real-floating} @var{x}, @emph{real-floating} @var{y}) |
| 1800 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1801 | This macro determines whether its arguments are unordered. In other |
| 1802 | words, it is true if @var{x} or @var{y} are NaN, and false otherwise. |
| 1803 | @end deftypefn |
| 1804 | |
| 1805 | Not all machines provide hardware support for these operations. On |
| 1806 | machines that don't, the macros can be very slow. Therefore, you should |
| 1807 | not use these macros when NaN is not a concern. |
| 1808 | |
| 1809 | @strong{NB:} There are no macros @code{isequal} or @code{isunequal}. |
| 1810 | They are unnecessary, because the @code{==} and @code{!=} operators do |
| 1811 | @emph{not} raise an exception if one or both of the operands are NaN. |
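
The comparison macros combine naturally into a three-way comparison that never raises an exception. The following is a sketch only; @code{compare_quiet} and its return convention (with @code{2} standing for ``unordered'') are assumptions made for this example.

```c
#include <math.h>

/* Hypothetical helper: compare X and Y without raising an exception.
   Returns -1 if X < Y, 1 if X > Y, 0 if they are equal, and 2 if the
   operands are unordered, i.e. at least one of them is NaN.  */
int
compare_quiet (double x, double y)
{
  if (isunordered (x, y))
    return 2;
  if (isless (x, y))
    return -1;
  if (isgreater (x, y))
    return 1;
  return 0;
}
```

Note that @code{islessgreater (NAN, 1.0)} is zero although @code{NAN != 1.0} is true, which is why the unordered case is tested first.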
| 1812 | |
| 1813 | @node Misc FP Arithmetic |
| 1814 | @subsection Miscellaneous FP arithmetic functions |
| 1815 | @cindex minimum |
| 1816 | @cindex maximum |
| 1817 | @cindex positive difference |
| 1818 | @cindex multiply-add |
| 1819 | |
| 1820 | The functions in this section perform miscellaneous but common |
| 1821 | operations that are awkward to express with C operators. On some |
| 1822 | processors these functions can use special machine instructions to |
| 1823 | perform these operations faster than the equivalent C code. |
| 1824 | |
| 1825 | @comment math.h |
| 1826 | @comment ISO |
| 1827 | @deftypefun double fmin (double @var{x}, double @var{y}) |
| 1828 | @comment math.h |
| 1829 | @comment ISO |
| 1830 | @deftypefunx float fminf (float @var{x}, float @var{y}) |
| 1831 | @comment math.h |
| 1832 | @comment ISO |
| 1833 | @deftypefunx {long double} fminl (long double @var{x}, long double @var{y}) |
| 1834 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1835 | The @code{fmin} function returns the lesser of the two values @var{x} |
| 1836 | and @var{y}. It is similar to the expression |
| 1837 | @smallexample |
| 1838 | ((x) < (y) ? (x) : (y)) |
| 1839 | @end smallexample |
| 1840 | except that @var{x} and @var{y} are only evaluated once. |
| 1841 | |
| 1842 | If an argument is NaN, the other argument is returned. If both arguments |
| 1843 | are NaN, NaN is returned. |
| 1844 | @end deftypefun |
| 1845 | |
| 1846 | @comment math.h |
| 1847 | @comment ISO |
| 1848 | @deftypefun double fmax (double @var{x}, double @var{y}) |
| 1849 | @comment math.h |
| 1850 | @comment ISO |
| 1851 | @deftypefunx float fmaxf (float @var{x}, float @var{y}) |
| 1852 | @comment math.h |
| 1853 | @comment ISO |
| 1854 | @deftypefunx {long double} fmaxl (long double @var{x}, long double @var{y}) |
| 1855 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1856 | The @code{fmax} function returns the greater of the two values @var{x} |
| 1857 | and @var{y}. |
| 1858 | |
| 1859 | If an argument is NaN, the other argument is returned. If both arguments |
| 1860 | are NaN, NaN is returned. |
| 1861 | @end deftypefun |
| 1862 | |
| 1863 | @comment math.h |
| 1864 | @comment ISO |
| 1865 | @deftypefun double fdim (double @var{x}, double @var{y}) |
| 1866 | @comment math.h |
| 1867 | @comment ISO |
| 1868 | @deftypefunx float fdimf (float @var{x}, float @var{y}) |
| 1869 | @comment math.h |
| 1870 | @comment ISO |
| 1871 | @deftypefunx {long double} fdiml (long double @var{x}, long double @var{y}) |
| 1872 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1873 | The @code{fdim} function returns the positive difference between |
| 1874 | @var{x} and @var{y}. The positive difference is @math{@var{x} - |
| 1875 | @var{y}} if @var{x} is greater than @var{y}, and @math{0} otherwise. |
| 1876 | |
| 1877 | If @var{x}, @var{y}, or both are NaN, NaN is returned. |
| 1878 | @end deftypefun |
| 1879 | |
| 1880 | @comment math.h |
| 1881 | @comment ISO |
| 1882 | @deftypefun double fma (double @var{x}, double @var{y}, double @var{z}) |
| 1883 | @comment math.h |
| 1884 | @comment ISO |
| 1885 | @deftypefunx float fmaf (float @var{x}, float @var{y}, float @var{z}) |
| 1886 | @comment math.h |
| 1887 | @comment ISO |
| 1888 | @deftypefunx {long double} fmal (long double @var{x}, long double @var{y}, long double @var{z}) |
| 1889 | @cindex butterfly |
| 1890 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 1891 | The @code{fma} function performs floating-point multiply-add. This is |
| 1892 | the operation @math{(@var{x} @mul{} @var{y}) + @var{z}}, but the |
| 1893 | intermediate result is not rounded to the destination type. This can |
| 1894 | sometimes improve the precision of a calculation. |
| 1895 | |
| 1896 | This function was introduced because some processors have a special |
| 1897 | instruction to perform multiply-add. The C compiler cannot use it |
| 1898 | directly, because the expression @samp{x*y + z} is defined to round the |
| 1899 | intermediate result. @code{fma} lets you choose when you want to round |
| 1900 | only once. |
| 1901 | |
| 1902 | @vindex FP_FAST_FMA |
| 1903 | On processors which do not implement multiply-add in hardware, |
| 1904 | @code{fma} can be very slow since it must avoid intermediate rounding. |
| 1905 | @file{math.h} defines the symbols @code{FP_FAST_FMA}, |
| 1906 | @code{FP_FAST_FMAF}, and @code{FP_FAST_FMAL} when the corresponding |
| 1907 | version of @code{fma} is no slower than the expression @samp{x*y + z}. |
| 1908 | In @theglibc{}, this always means the operation is implemented in |
| 1909 | hardware. |
| 1910 | @end deftypefun |
| 1911 | |
| 1912 | @node Complex Numbers |
| 1913 | @section Complex Numbers |
| 1914 | @pindex complex.h |
| 1915 | @cindex complex numbers |
| 1916 | |
| 1917 | @w{ISO C99} introduces support for complex numbers in C. This is done |
| 1918 | with a new type specifier, @code{complex}. It is a macro defined by |
| 1919 | @file{complex.h}, so it is available only if that header has been included. |
| 1920 | There are three complex types, corresponding to the three real types: |
| 1921 | @code{float complex}, @code{double complex}, and @code{long double complex}. |
| 1922 | |
| 1923 | To construct complex numbers you need a way to indicate the imaginary |
| 1924 | part of a number. There is no standard notation for an imaginary |
| 1925 | floating-point constant. Instead, @file{complex.h} defines two macros |
| 1926 | that can be used to create complex numbers. |
| 1927 | |
| 1928 | @deftypevr Macro {const float complex} _Complex_I |
| 1929 | This macro is a representation of the complex number ``@math{0+1i}''. |
| 1930 | Multiplying a real floating-point value by @code{_Complex_I} gives a |
| 1931 | complex number whose value is purely imaginary. You can use this to |
| 1932 | construct complex constants: |
| 1933 | |
| 1934 | @smallexample |
| 1935 | @math{3.0 + 4.0i} = @code{3.0 + 4.0 * _Complex_I} |
| 1936 | @end smallexample |
| 1937 | |
| 1938 | Note that @code{_Complex_I * _Complex_I} has the value @code{-1}, but |
| 1939 | the type of that value is @code{complex}. |
| 1940 | @end deftypevr |
| 1941 | |
| 1942 | @c Put this back in when gcc supports _Imaginary_I. It's too confusing. |
| 1943 | @ignore |
| 1944 | @noindent |
| 1945 | Without an optimizing compiler this is more expensive than the use of |
| 1946 | @code{_Imaginary_I}, but it is better than nothing. You can avoid all |
| 1947 | the hassles if you use the @code{I} macro below, provided the name is not a |
| 1948 | problem. |
| 1949 | |
| 1950 | @deftypevr Macro {const float imaginary} _Imaginary_I |
| 1951 | This macro is a representation of the value ``@math{1i}''. I.e., it is |
| 1952 | the value for which |
| 1953 | |
| 1954 | @smallexample |
| 1955 | _Imaginary_I * _Imaginary_I = -1 |
| 1956 | @end smallexample |
| 1957 | |
| 1958 | @noindent |
| 1959 | The result is not of type @code{float imaginary} but instead @code{float}. |
| 1960 | One can use it to easily construct complex numbers, as in |
| 1961 | |
| 1962 | @smallexample |
| 1963 | 3.0 - _Imaginary_I * 4.0 |
| 1964 | @end smallexample |
| 1965 | |
| 1966 | @noindent |
| 1967 | which results in the complex number with a real part of 3.0 and an |
| 1968 | imaginary part of -4.0. |
| 1969 | @end deftypevr |
| 1970 | @end ignore |
| 1971 | |
| 1972 | @noindent |
| 1973 | @code{_Complex_I} is a bit of a mouthful. @file{complex.h} also defines |
| 1974 | a shorter name for the same constant. |
| 1975 | |
| 1976 | @deftypevr Macro {const float complex} I |
| 1977 | This macro has exactly the same value as @code{_Complex_I}. Most of the |
| 1978 | time it is preferable. However, it causes problems if you want to use |
| 1979 | the identifier @code{I} for something else. You can safely write |
| 1980 | |
| 1981 | @smallexample |
| 1982 | #include <complex.h> |
| 1983 | #undef I |
| 1984 | @end smallexample |
| 1985 | |
| 1986 | @noindent |
| 1987 | if you need @code{I} for your own purposes. (In that case we recommend |
| 1988 | you also define some other short name for @code{_Complex_I}, such as |
| 1989 | @code{J}.) |
| 1990 | |
| 1991 | @ignore |
| 1992 | If the implementation does not support the @code{imaginary} types, |
| 1993 | @code{I} is defined as @code{_Complex_I}, which is the second-best |
| 1994 | solution. It can still be used in the same way but requires a more |
| 1995 | clever compiler to get the same results. |
| 1996 | @end ignore |
| 1997 | @end deftypevr |
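
The recommendation about @code{I} can be sketched as follows. The short name @code{J} is the one suggested in the text; the helper @code{make_complex} is hypothetical and assumes finite arguments, since multiplying an infinite value by @code{_Complex_I} can produce a NaN real part.

```c
#include <complex.h>
#undef I                 /* free the identifier I for other uses */
#define J _Complex_I     /* short replacement name for the constant */

/* Hypothetical helper: build the complex number RE + IM*i from two
   finite real values.  */
double complex
make_complex (double re, double im)
{
  return re + im * J;
}
```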
| 1998 | |
| 1999 | @node Operations on Complex |
| 2000 | @section Projections, Conjugates, and Decomposing of Complex Numbers |
| 2001 | @cindex project complex numbers |
| 2002 | @cindex conjugate complex numbers |
| 2003 | @cindex decompose complex numbers |
| 2004 | @pindex complex.h |
| 2005 | |
| 2006 | @w{ISO C99} also defines functions that perform basic operations on |
| 2007 | complex numbers, such as decomposition and conjugation. The prototypes |
| 2008 | for all these functions are in @file{complex.h}. All functions are |
| 2009 | available in three variants, one for each of the three complex types. |
| 2010 | |
| 2011 | @comment complex.h |
| 2012 | @comment ISO |
| 2013 | @deftypefun double creal (complex double @var{z}) |
| 2014 | @comment complex.h |
| 2015 | @comment ISO |
| 2016 | @deftypefunx float crealf (complex float @var{z}) |
| 2017 | @comment complex.h |
| 2018 | @comment ISO |
| 2019 | @deftypefunx {long double} creall (complex long double @var{z}) |
| 2020 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2021 | These functions return the real part of the complex number @var{z}. |
| 2022 | @end deftypefun |
| 2023 | |
| 2024 | @comment complex.h |
| 2025 | @comment ISO |
| 2026 | @deftypefun double cimag (complex double @var{z}) |
| 2027 | @comment complex.h |
| 2028 | @comment ISO |
| 2029 | @deftypefunx float cimagf (complex float @var{z}) |
| 2030 | @comment complex.h |
| 2031 | @comment ISO |
| 2032 | @deftypefunx {long double} cimagl (complex long double @var{z}) |
| 2033 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2034 | These functions return the imaginary part of the complex number @var{z}. |
| 2035 | @end deftypefun |
| 2036 | |
| 2037 | @comment complex.h |
| 2038 | @comment ISO |
| 2039 | @deftypefun {complex double} conj (complex double @var{z}) |
| 2040 | @comment complex.h |
| 2041 | @comment ISO |
| 2042 | @deftypefunx {complex float} conjf (complex float @var{z}) |
| 2043 | @comment complex.h |
| 2044 | @comment ISO |
| 2045 | @deftypefunx {complex long double} conjl (complex long double @var{z}) |
| 2046 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2047 | These functions return the conjugate value of the complex number |
| 2048 | @var{z}. The conjugate of a complex number has the same real part and a |
| 2049 | negated imaginary part. In other words, @samp{conj(a + bi) = a + -bi}. |
| 2050 | @end deftypefun |
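
Decomposition and conjugation are often used together. As a small sketch, the hypothetical helper @code{norm_squared} below computes the squared magnitude of a complex number as the product of the number and its conjugate; the imaginary part of that product is zero, so only the real part is kept.

```c
#include <complex.h>

/* Hypothetical helper: return the squared magnitude of Z, computed as
   the real part of Z multiplied by its conjugate.  */
double
norm_squared (double complex z)
{
  return creal (z * conj (z));
}
```

For @code{3.0 + 4.0 * I} the real part is @code{3.0}, the imaginary part is @code{4.0}, the conjugate has imaginary part @code{-4.0}, and @code{norm_squared} returns @code{25.0}.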
| 2051 | |
| 2052 | @comment complex.h |
| 2053 | @comment ISO |
| 2054 | @deftypefun double carg (complex double @var{z}) |
| 2055 | @comment complex.h |
| 2056 | @comment ISO |
| 2057 | @deftypefunx float cargf (complex float @var{z}) |
| 2058 | @comment complex.h |
| 2059 | @comment ISO |
| 2060 | @deftypefunx {long double} cargl (complex long double @var{z}) |
| 2061 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2062 | These functions return the argument of the complex number @var{z}. |
| 2063 | The argument of a complex number is the angle in the complex plane |
| 2064 | between the positive real axis and a line passing through zero and the |
| 2065 | number. This angle is measured in the usual fashion and ranges from |
| 2066 | @math{-@pi{}} to @math{@pi{}}. |
| 2067 | |
| 2068 | @code{carg} has a branch cut along the negative real axis. |
| 2069 | @end deftypefun |
| 2070 | |
| 2071 | @comment complex.h |
| 2072 | @comment ISO |
| 2073 | @deftypefun {complex double} cproj (complex double @var{z}) |
| 2074 | @comment complex.h |
| 2075 | @comment ISO |
| 2076 | @deftypefunx {complex float} cprojf (complex float @var{z}) |
| 2077 | @comment complex.h |
| 2078 | @comment ISO |
| 2079 | @deftypefunx {complex long double} cprojl (complex long double @var{z}) |
| 2080 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2081 | These functions return the projection of the complex value @var{z} onto |
| 2082 | the Riemann sphere. Values with an infinite imaginary part are projected |
| 2083 | to positive infinity on the real axis, even if the real part is NaN. If |
| 2084 | the real part is infinite, the result is equivalent to |
| 2085 | |
| 2086 | @smallexample |
| 2087 | INFINITY + I * copysign (0.0, cimag (z)) |
| 2088 | @end smallexample |
| 2089 | @end deftypefun |
| 2090 | |
| 2091 | @node Parsing of Numbers |
| 2092 | @section Parsing of Numbers |
| 2093 | @cindex parsing numbers (in formatted input) |
| 2094 | @cindex converting strings to numbers |
| 2095 | @cindex number syntax, parsing |
| 2096 | @cindex syntax, for reading numbers |
| 2097 | |
| 2098 | This section describes functions for ``reading'' integer and |
| 2099 | floating-point numbers from a string. It may be more convenient in some |
| 2100 | cases to use @code{sscanf} or one of the related functions; see |
| 2101 | @ref{Formatted Input}. But often you can make a program more robust by |
| 2102 | finding the tokens in the string by hand, then converting the numbers |
| 2103 | one by one. |
| 2104 | |
| 2105 | @menu |
| 2106 | * Parsing of Integers:: Functions for conversion of integer values. |
| 2107 | * Parsing of Floats:: Functions for conversion of floating-point |
| 2108 | values. |
| 2109 | @end menu |
| 2110 | |
| 2111 | @node Parsing of Integers |
| 2112 | @subsection Parsing of Integers |
| 2113 | |
| 2114 | @pindex stdlib.h |
| 2115 | @pindex wchar.h |
| 2116 | The @samp{str} functions are declared in @file{stdlib.h} and those |
| 2117 | beginning with @samp{wcs} are declared in @file{wchar.h}; the |
| 2118 | @code{intmax_t} and @code{uintmax_t} variants are declared in |
| 2119 | @file{inttypes.h}. The prototypes in this section use @code{restrict} |
| 2120 | because the @w{ISO C} standard does so for the functions it defines, |
| 2121 | even though the qualifier may seem superfluous here. |
| 2122 | |
| 2123 | @comment stdlib.h |
| 2124 | @comment ISO |
| 2125 | @deftypefun {long int} strtol (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2126 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2127 | @c strtol uses the thread-local pointer to the locale in effect, and |
| 2128 | @c strtol_l loads the LC_NUMERIC locale data from it early on and once, |
| 2129 | @c but if the locale is the global locale, and another thread calls |
| 2130 | @c setlocale in a way that modifies the pointer to the LC_CTYPE locale |
| 2131 | @c category, the behavior of e.g. IS*, TOUPPER will vary throughout the |
| 2132 | @c execution of the function, because they re-read the locale data from |
| 2133 | @c the given locale pointer. We solved this by documenting setlocale as |
| 2134 | @c MT-Unsafe. |
| 2135 | The @code{strtol} (``string-to-long'') function converts the initial |
| 2136 | part of @var{string} to a signed integer, which is returned as a value |
| 2137 | of type @code{long int}. |
| 2138 | |
| 2139 | This function attempts to decompose @var{string} as follows: |
| 2140 | |
| 2141 | @itemize @bullet |
| 2142 | @item |
| 2143 | A (possibly empty) sequence of whitespace characters. Which characters |
| 2144 | are whitespace is determined by the @code{isspace} function |
| 2145 | (@pxref{Classification of Characters}). These are discarded. |
| 2146 | |
| 2147 | @item |
| 2148 | An optional plus or minus sign (@samp{+} or @samp{-}). |
| 2149 | |
| 2150 | @item |
| 2151 | A nonempty sequence of digits in the radix specified by @var{base}. |
| 2152 | |
| 2153 | If @var{base} is zero, decimal radix is assumed unless the series of |
| 2154 | digits begins with @samp{0} (specifying octal radix), or @samp{0x} or |
| 2155 | @samp{0X} (specifying hexadecimal radix); in other words, the same |
| 2156 | syntax used for integer constants in C. |
| 2157 | |
| 2158 | Otherwise @var{base} must have a value between @code{2} and @code{36}. |
| 2159 | If @var{base} is @code{16}, the digits may optionally be preceded by |
| 2160 | @samp{0x} or @samp{0X}. If @var{base} is not a legal value, @code{strtol} |
| 2161 | returns @code{0L} and sets the global variable @code{errno} to @code{EINVAL}. |
| 2162 | |
| 2163 | @item |
| 2164 | Any remaining characters in the string. If @var{tailptr} is not a null |
| 2165 | pointer, @code{strtol} stores a pointer to this tail in |
| 2166 | @code{*@var{tailptr}}. |
| 2167 | @end itemize |
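The decomposition above can be observed with a short sketch.  It passes
a @var{base} of zero so that @code{strtol} picks the radix from the
prefix, and it uses the tail pointer to report how many characters were
consumed.  The helper name @code{leading_number} is an assumption made
for this example only.

```c
#include <stdlib.h>

/* Hypothetical helper: convert the leading number in TEXT, letting
   strtol choose the radix from the prefix (base 0), and store the
   number of characters consumed, including skipped whitespace and
   any sign, in *CONSUMED.  */
long int
leading_number (const char *text, size_t *consumed)
{
  char *tail;
  long int value = strtol (text, &tail, 0);

  *consumed = (size_t) (tail - text);
  return value;
}
```

With this helper, @code{"0x1A rest"} converts to 26 with four characters
consumed, @code{"017"} converts to 15, and @code{"kg"} converts to 0 with
no characters consumed, which is how a failed conversion shows up.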
| 2168 | |
| 2169 | If the string is empty, contains only whitespace, or does not contain an |
| 2170 | initial substring that has the expected syntax for an integer in the |
| 2171 | specified @var{base}, no conversion is performed. In this case, |
| 2172 | @code{strtol} returns a value of zero and the value stored in |
| 2173 | @code{*@var{tailptr}} is the value of @var{string}. |
| 2174 | |
| 2175 | In a locale other than the standard @code{"C"} locale, this function |
| 2176 | may recognize additional implementation-dependent syntax. |
| 2177 | |
| 2178 | If the string has valid syntax for an integer but the value is not |
| 2179 | representable because of overflow, @code{strtol} returns either |
| 2180 | @code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as |
| 2181 | appropriate for the sign of the value. It also sets @code{errno} |
| 2182 | to @code{ERANGE} to indicate there was overflow. |
| 2183 | |
| 2184 | You should not check for errors by examining the return value of |
| 2185 | @code{strtol}, because the string might be a valid representation of |
| 2186 | @code{0L}, @code{LONG_MAX}, or @code{LONG_MIN}. Instead, check whether |
| 2187 | @code{*@var{tailptr}} points to what you expect after the number |
| 2188 | (e.g. @code{'\0'} if the string should end after the number). You also |
| 2189 | need to clear @code{errno} before the call and check it afterward, in |
| 2190 | case there was overflow. |
| 2191 | |
| 2192 | There is an example at the end of this section. |
| 2193 | @end deftypefun |
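The error-checking advice given for @code{strtol} can be sketched as a
strict wrapper that accepts a string only if all of it is one in-range
decimal number.  The name @code{parse_long} and its return convention
are assumptions made for this illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse all of TEXT as a base-10 long int.
   Returns 0 and stores the value in *RESULT on success; returns -1
   and leaves *RESULT untouched on any failure.  */
int
parse_long (const char *text, long int *result)
{
  char *tail;

  errno = 0;                    /* Clear errno before the call.  */
  long int value = strtol (text, &tail, 10);

  if (tail == text)             /* No digits were converted.  */
    return -1;
  if (*tail != '\0')            /* Unexpected characters follow.  */
    return -1;
  if (errno == ERANGE)          /* The value does not fit.  */
    return -1;

  *result = value;
  return 0;
}
```

Note that the return value of @code{strtol} is never used to detect
failure; only the tail pointer and @code{errno} are examined.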
| 2194 | |
| 2195 | @comment wchar.h |
| 2196 | @comment ISO |
| 2197 | @deftypefun {long int} wcstol (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2198 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2199 | The @code{wcstol} function is equivalent to the @code{strtol} function |
| 2200 | in nearly all aspects but handles wide character strings. |
| 2201 | |
| 2202 | The @code{wcstol} function was introduced in @w{Amendment 1} of @w{ISO C90}. |
| 2203 | @end deftypefun |
| 2204 | |
| 2205 | @comment stdlib.h |
| 2206 | @comment ISO |
| 2207 | @deftypefun {unsigned long int} strtoul (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2208 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2209 | The @code{strtoul} (``string-to-unsigned-long'') function is like |
| 2210 | @code{strtol} except it converts to an @code{unsigned long int} value. |
| 2211 | The syntax is the same as described above for @code{strtol}. The value |
| 2212 | returned on overflow is @code{ULONG_MAX} (@pxref{Range of Type}). |
| 2213 | |
| 2214 | If @var{string} depicts a negative number, @code{strtoul} acts the same |
| 2215 | as @code{strtol} but casts the result to an unsigned integer. That means |
| 2216 | for example that @code{strtoul} on @code{"-1"} returns @code{ULONG_MAX} |
| 2217 | and an input more negative than @code{LONG_MIN} returns |
| 2218 | (@code{ULONG_MAX} + 1) / 2. |
| 2219 | |
| 2220 | @code{strtoul} sets @code{errno} to @code{EINVAL} if @var{base} is out of |
| 2221 | range, or @code{ERANGE} on overflow. |
| 2222 | @end deftypefun |
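Because @code{strtoul} accepts a leading minus sign, a caller that wants
only nonnegative input has to reject the sign itself.  The following is
a minimal sketch under that assumption; @code{parse_ulong} is a
hypothetical name, not a library function.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse all of TEXT as a nonnegative base-10
   number.  Returns 0 on success, -1 on failure.  */
int
parse_ulong (const char *text, unsigned long int *result)
{
  const char *p = text;
  char *tail;

  /* Skip the whitespace strtoul itself would skip, so that a minus
     sign right after it can be detected and refused.  */
  while (isspace ((unsigned char) *p))
    p++;
  if (*p == '-')
    return -1;

  errno = 0;
  unsigned long int value = strtoul (p, &tail, 10);
  if (tail == p || *tail != '\0' || errno == ERANGE)
    return -1;

  *result = value;
  return 0;
}
```

Without the explicit check, the input @code{"-1"} would be accepted and
converted to @code{ULONG_MAX}.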
| 2223 | |
| 2224 | @comment wchar.h |
| 2225 | @comment ISO |
| 2226 | @deftypefun {unsigned long int} wcstoul (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2227 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2228 | The @code{wcstoul} function is equivalent to the @code{strtoul} function |
| 2229 | in nearly all aspects but handles wide character strings. |
| 2230 | |
| 2231 | The @code{wcstoul} function was introduced in @w{Amendment 1} of @w{ISO C90}. |
| 2232 | @end deftypefun |
| 2233 | |
| 2234 | @comment stdlib.h |
| 2235 | @comment ISO |
| 2236 | @deftypefun {long long int} strtoll (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2237 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2238 | The @code{strtoll} function is like @code{strtol} except that it returns |
| 2239 | a @code{long long int} value, and accepts numbers with a correspondingly |
| 2240 | larger range. |
| 2241 | |
| 2242 | If the string has valid syntax for an integer but the value is not |
| 2243 | representable because of overflow, @code{strtoll} returns either |
| 2244 | @code{LLONG_MAX} or @code{LLONG_MIN} (@pxref{Range of Type}), as |
| 2245 | appropriate for the sign of the value. It also sets @code{errno} to |
| 2246 | @code{ERANGE} to indicate there was overflow. |
| 2247 | |
| 2248 | The @code{strtoll} function was introduced in @w{ISO C99}. |
| 2249 | @end deftypefun |
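The overflow behavior described for @code{strtoll} can be observed with
a short sketch that reports in which direction a value went out of
range.  The helper name @code{range_status} is an assumption made for
this example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if the number in TEXT is too large
   for a long long int, -1 if it is too small, and 0 otherwise
   (including when TEXT holds no number at all).  */
int
range_status (const char *text)
{
  errno = 0;
  long long int value = strtoll (text, NULL, 10);

  if (errno != ERANGE)
    return 0;
  /* On overflow strtoll returns LLONG_MAX or LLONG_MIN, according
     to the sign of the input.  */
  return value == LLONG_MAX ? 1 : -1;
}
```

The saturated return value alone is ambiguous, since @code{LLONG_MAX}
is also a valid result; it is meaningful here only because @code{errno}
was cleared before the call and found set to @code{ERANGE} afterward.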
| 2250 | |
| 2251 | @comment wchar.h |
| 2252 | @comment ISO |
| 2253 | @deftypefun {long long int} wcstoll (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2254 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2255 | The @code{wcstoll} function is equivalent to the @code{strtoll} function |
| 2256 | in nearly all aspects but handles wide character strings. |
| 2257 | |
| 2258 | The @code{wcstoll} function was introduced in @w{ISO C99}. |
| 2259 | @end deftypefun |
| 2260 | |
| 2261 | @comment stdlib.h |
| 2262 | @comment BSD |
| 2263 | @deftypefun {long long int} strtoq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2264 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2265 | @code{strtoq} (``string-to-quad-word'') is the BSD name for @code{strtoll}. |
| 2266 | @end deftypefun |
| 2267 | |
| 2268 | @comment wchar.h |
| 2269 | @comment GNU |
| 2270 | @deftypefun {long long int} wcstoq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2271 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2272 | The @code{wcstoq} function is equivalent to the @code{strtoq} function |
| 2273 | in nearly all aspects but handles wide character strings. |
| 2274 | |
| 2275 | The @code{wcstoq} function is a GNU extension. |
| 2276 | @end deftypefun |
| 2277 | |
| 2278 | @comment stdlib.h |
| 2279 | @comment ISO |
| 2280 | @deftypefun {unsigned long long int} strtoull (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2281 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2282 | The @code{strtoull} function is related to @code{strtoll} the same way |
| 2283 | @code{strtoul} is related to @code{strtol}. |
| 2284 | |
| 2285 | The @code{strtoull} function was introduced in @w{ISO C99}. |
| 2286 | @end deftypefun |
| 2287 | |
| 2288 | @comment wchar.h |
| 2289 | @comment ISO |
| 2290 | @deftypefun {unsigned long long int} wcstoull (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2291 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2292 | The @code{wcstoull} function is equivalent to the @code{strtoull} function |
| 2293 | in nearly all aspects but handles wide character strings. |
| 2294 | |
| 2295 | The @code{wcstoull} function was introduced in @w{ISO C99}. |
| 2296 | @end deftypefun |
| 2297 | |
| 2298 | @comment stdlib.h |
| 2299 | @comment BSD |
| 2300 | @deftypefun {unsigned long long int} strtouq (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2301 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2302 | @code{strtouq} is the BSD name for @code{strtoull}. |
| 2303 | @end deftypefun |
| 2304 | |
| 2305 | @comment wchar.h |
| 2306 | @comment GNU |
| 2307 | @deftypefun {unsigned long long int} wcstouq (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2308 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2309 | The @code{wcstouq} function is equivalent to the @code{strtouq} function |
| 2310 | in nearly all aspects but handles wide character strings. |
| 2311 | |
| 2312 | The @code{wcstouq} function is a GNU extension. |
| 2313 | @end deftypefun |
| 2314 | |
| 2315 | @comment inttypes.h |
| 2316 | @comment ISO |
| 2317 | @deftypefun intmax_t strtoimax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2318 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2319 | The @code{strtoimax} function is like @code{strtol} except that it returns |
| 2320 | an @code{intmax_t} value, and accepts numbers in the corresponding range. |
| 2321 | |
| 2322 | If the string has valid syntax for an integer but the value is not |
| 2323 | representable because of overflow, @code{strtoimax} returns either |
| 2324 | @code{INTMAX_MAX} or @code{INTMAX_MIN} (@pxref{Integers}), as |
| 2325 | appropriate for the sign of the value. It also sets @code{errno} to |
| 2326 | @code{ERANGE} to indicate there was overflow. |
| 2327 | |
| 2328 | See @ref{Integers} for a description of the @code{intmax_t} type. The |
| 2329 | @code{strtoimax} function was introduced in @w{ISO C99}. |
| 2330 | @end deftypefun |
| 2331 | |
| 2332 | @comment inttypes.h |
| 2333 | @comment ISO |
| 2334 | @deftypefun intmax_t wcstoimax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2335 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2336 | The @code{wcstoimax} function is equivalent to the @code{strtoimax} function |
| 2337 | in nearly all aspects but handles wide character strings. |
| 2338 | |
| 2339 | The @code{wcstoimax} function was introduced in @w{ISO C99}. |
| 2340 | @end deftypefun |
| 2341 | |
| 2342 | @comment inttypes.h |
| 2343 | @comment ISO |
| 2344 | @deftypefun uintmax_t strtoumax (const char *restrict @var{string}, char **restrict @var{tailptr}, int @var{base}) |
| 2345 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2346 | The @code{strtoumax} function is related to @code{strtoimax} |
| 2347 | the same way that @code{strtoul} is related to @code{strtol}. |
| 2348 | |
| 2349 | See @ref{Integers} for a description of the @code{uintmax_t} type. The |
| 2350 | @code{strtoumax} function was introduced in @w{ISO C99}. |
| 2351 | @end deftypefun |
| 2352 | |
| 2353 | @comment inttypes.h |
| 2354 | @comment ISO |
| 2355 | @deftypefun uintmax_t wcstoumax (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}, int @var{base}) |
| 2356 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2357 | The @code{wcstoumax} function is equivalent to the @code{strtoumax} function |
| 2358 | in nearly all aspects but handles wide character strings. |
| 2359 | |
| 2360 | The @code{wcstoumax} function was introduced in @w{ISO C99}. |
| 2361 | @end deftypefun |
| 2362 | |
| 2363 | @comment stdlib.h |
| 2364 | @comment ISO |
| 2365 | @deftypefun {long int} atol (const char *@var{string}) |
| 2366 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2367 | This function is similar to the @code{strtol} function with a @var{base} |
| 2368 | argument of @code{10}, except that it need not detect overflow errors. |
| 2369 | The @code{atol} function is provided mostly for compatibility with |
| 2370 | existing code; using @code{strtol} is more robust. |
| 2371 | @end deftypefun |
| 2372 | |
| 2373 | @comment stdlib.h |
| 2374 | @comment ISO |
| 2375 | @deftypefun int atoi (const char *@var{string}) |
| 2376 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2377 | This function is like @code{atol}, except that it returns an @code{int}. |
| 2378 | The @code{atoi} function is also considered obsolete; use @code{strtol} |
| 2379 | instead. |
| 2380 | @end deftypefun |
| 2381 | |
| 2382 | @comment stdlib.h |
| 2383 | @comment ISO |
| 2384 | @deftypefun {long long int} atoll (const char *@var{string}) |
| 2385 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2386 | This function is similar to @code{atol}, except it returns a @code{long |
| 2387 | long int}. |
| 2388 | |
| 2389 | The @code{atoll} function was introduced in @w{ISO C99}. It too is |
| 2390 | obsolete (despite having just been added); use @code{strtoll} instead. |
| 2391 | @end deftypefun |
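To illustrate why @code{strtol} is preferred over @code{atoi}, here is a
sketch of a replacement that reports failure instead of silently
returning a value.  The name @code{parse_int} is hypothetical.  Unlike a
plain @code{strtol} call, it also checks that the result fits in an
@code{int}.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: a checked stand-in for atoi.  Returns 0 and
   stores the value in *RESULT on success, -1 on failure.  */
int
parse_int (const char *text, int *result)
{
  char *tail;

  errno = 0;
  long int value = strtol (text, &tail, 10);

  /* Reject input with no digits or with trailing characters.  */
  if (tail == text || *tail != '\0')
    return -1;
  /* Reject values outside the range of long int or of int.  */
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return -1;

  *result = (int) value;
  return 0;
}
```

With @code{atoi}, the inputs @code{"0"} and @code{"abc"} both produce
zero; with this sketch the second one is reported as an error.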
| 2392 | |
| 2393 | None of the functions described in this section so far handle the |
| 2394 | alternative representations of characters described in the locale |
| 2395 | data. Some locales specify a thousands separator and how it is to be |
| 2396 | used, which can make large numbers more readable. To read such |
| 2397 | numbers, use the @code{scanf} functions with the @samp{'} |
| 2398 | flag. |
| 2399 | |
| 2400 | Here is a function which parses a string as a sequence of integers and |
| 2401 | returns the sum of them: |
| 2402 | |
| 2403 | @smallexample |
| 2404 | long int |
| 2405 | sum_ints_from_string (const char *string) |
| 2406 | @{ |
| 2407 |   long int sum = 0; |
| 2408 | |
| 2409 |   while (1) @{ |
| 2410 |     char *tail; |
| 2411 |     long int next; |
| 2412 | |
| 2413 |     /* @r{Skip whitespace by hand, to detect the end.} */ |
| 2414 |     while (isspace ((unsigned char) *string)) string++; |
| 2415 |     if (*string == 0) |
| 2416 |       break; |
| 2417 | |
| 2418 |     /* @r{There is more nonwhitespace, so parse it as a number.} */ |
| 2419 |     errno = 0; |
| 2420 |     next = strtol (string, &tail, 0); |
| 2421 |     /* @r{Stop if nothing was converted, to avoid looping forever.} */ |
| 2422 |     if (tail == string) |
| 2423 |       break; |
| 2424 |     /* @r{Add it in, if not overflow.} */ |
| 2425 |     if (errno) |
| 2426 |       printf ("Overflow\n"); |
| 2427 |     else |
| 2428 |       sum += next; |
| 2429 |     string = tail;  /* @r{Advance past it.} */ |
| 2430 |   @} |
| 2431 | |
| 2432 |   return sum; |
| 2433 | @} |
| 2434 | @end smallexample |
| 2435 | |
| 2436 | @node Parsing of Floats |
| 2437 | @subsection Parsing of Floats |
| 2438 | |
| 2439 | @pindex stdlib.h |
| 2440 | The @samp{str} functions are declared in @file{stdlib.h} and those |
| 2441 | beginning with @samp{wcs} are declared in @file{wchar.h}. One might |
| 2442 | wonder about the use of @code{restrict} in the prototypes of the |
| 2443 | functions in this section. It is seemingly useless but the @w{ISO C} |
| 2444 | standard uses it (for the functions defined there) so we have to do it |
| 2445 | as well. |
| 2446 | |
| 2447 | @comment stdlib.h |
| 2448 | @comment ISO |
| 2449 | @deftypefun double strtod (const char *restrict @var{string}, char **restrict @var{tailptr}) |
| 2450 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2451 | @c Besides the unsafe-but-ruled-safe locale uses, this uses a lot of |
| 2452 | @c mpn, but it's all safe. |
| 2453 | @c |
| 2454 | @c round_and_return |
| 2455 | @c get_rounding_mode ok |
| 2456 | @c mpn_add_1 ok |
| 2457 | @c mpn_rshift ok |
| 2458 | @c MPN_ZERO ok |
| 2459 | @c MPN2FLOAT -> mpn_construct_(float|double|long_double) ok |
| 2460 | @c str_to_mpn |
| 2461 | @c mpn_mul_1 -> umul_ppmm ok |
| 2462 | @c mpn_add_1 ok |
| 2463 | @c mpn_lshift_1 -> mpn_lshift ok |
| 2464 | @c STRTOF_INTERNAL |
| 2465 | @c MPN_VAR ok |
| 2466 | @c SET_MANTISSA ok |
| 2467 | @c STRNCASECMP ok, wide and narrow |
| 2468 | @c round_and_return ok |
| 2469 | @c mpn_mul ok |
| 2470 | @c mpn_addmul_1 ok |
| 2471 | @c ... mpn_sub |
| 2472 | @c mpn_lshift ok |
| 2473 | @c udiv_qrnnd ok |
| 2474 | @c count_leading_zeros ok |
| 2475 | @c add_ssaaaa ok |
| 2476 | @c sub_ddmmss ok |
| 2477 | @c umul_ppmm ok |
| 2478 | @c mpn_submul_1 ok |
| 2479 | The @code{strtod} (``string-to-double'') function converts the initial |
| 2480 | part of @var{string} to a floating-point number, which is returned as a |
| 2481 | value of type @code{double}. |
| 2482 | |
| 2483 | This function attempts to decompose @var{string} as follows: |
| 2484 | |
| 2485 | @itemize @bullet |
| 2486 | @item |
| 2487 | A (possibly empty) sequence of whitespace characters. Which characters |
| 2488 | are whitespace is determined by the @code{isspace} function |
| 2489 | (@pxref{Classification of Characters}). These are discarded. |
| 2490 | |
| 2491 | @item |
| 2492 | An optional plus or minus sign (@samp{+} or @samp{-}). |
| 2493 | |
| 2494 | @item A floating point number in decimal or hexadecimal format. The |
| 2495 | decimal format is: |
| 2496 | @itemize @minus |
| 2497 | |
| 2498 | @item |
| 2499 | A nonempty sequence of digits optionally containing a decimal-point |
| 2500 | character---normally @samp{.}, but it depends on the locale |
| 2501 | (@pxref{General Numeric}). |
| 2502 | |
| 2503 | @item |
| 2504 | An optional exponent part, consisting of a character @samp{e} or |
| 2505 | @samp{E}, an optional sign, and a sequence of digits. |
| 2506 | |
| 2507 | @end itemize |
| 2508 | |
| 2509 | The hexadecimal format is as follows: |
| 2510 | @itemize @minus |
| 2511 | |
| 2512 | @item |
| 2513 | @samp{0x} or @samp{0X} followed by a nonempty sequence of hexadecimal digits |
| 2514 | optionally containing a decimal-point character---normally @samp{.}, but |
| 2515 | it depends on the locale (@pxref{General Numeric}). |
| 2516 | |
| 2517 | @item |
| 2518 | An optional binary-exponent part, consisting of a character @samp{p} or |
| 2519 | @samp{P}, an optional sign, and a sequence of digits. |
| 2520 | |
| 2521 | @end itemize |
| 2522 | |
| 2523 | @item |
| 2524 | Any remaining characters in the string. If @var{tailptr} is not a null |
| 2525 | pointer, a pointer to this tail of the string is stored in |
| 2526 | @code{*@var{tailptr}}. |
| 2527 | @end itemize |
| 2528 | |
| 2529 | If the string is empty, contains only whitespace, or does not contain an |
| 2530 | initial substring that has the expected syntax for a floating-point |
| 2531 | number, no conversion is performed. In this case, @code{strtod} returns |
| 2532 | a value of zero and the value stored in @code{*@var{tailptr}} is the |
| 2533 | value of @var{string}. |
| 2534 | |
| 2535 | In a locale other than the standard @code{"C"} or @code{"POSIX"} locales, |
| 2536 | this function may recognize additional locale-dependent syntax. |
| 2537 | |
| 2538 | If the string has valid syntax for a floating-point number but the value |
| 2539 | is outside the range of a @code{double}, @code{strtod} will signal |
| 2540 | overflow or underflow as described in @ref{Math Error Reporting}. |
| 2541 | |
| 2542 | @code{strtod} recognizes four special input strings. The strings |
| 2543 | @code{"inf"} and @code{"infinity"} are converted to @math{@infinity{}}, |
| 2544 | or to the largest representable value if the floating-point format |
| 2545 | doesn't support infinities. You can prepend a @code{"+"} or @code{"-"} |
| 2546 | to specify the sign. Case is ignored when scanning these strings. |
| 2547 | |
| 2548 | The strings @code{"nan"} and @code{"nan(@var{chars@dots{}})"} are converted |
| 2549 | to NaN. Again, case is ignored. If @var{chars@dots{}} are provided, they |
| 2550 | are used in some unspecified fashion to select a particular |
| 2551 | representation of NaN (there can be several). |
| 2552 | |
| 2553 | Since zero is a valid result as well as the value returned on error, you |
| 2554 | should check for errors in the same way as for @code{strtol}, by |
| 2555 | examining @code{errno} and @code{*@var{tailptr}}. |
| 2556 | @end deftypefun |
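The input forms accepted by @code{strtod} can be tried out with a small
sketch that converts a whitespace-separated list of numbers.  The helper
name @code{parse_doubles} is an assumption made for this example, and
range errors are deliberately not checked in order to keep it short.

```c
#include <stdlib.h>

/* Hypothetical helper: convert up to MAX whitespace-separated numbers
   from TEXT into OUT and return how many were converted.  Scanning
   stops at the first token that strtod does not accept.  */
size_t
parse_doubles (const char *text, double *out, size_t max)
{
  size_t count = 0;

  while (count < max)
    {
      char *tail;
      double value = strtod (text, &tail);

      if (tail == text)     /* Nothing converted: end of input or junk.  */
        break;
      out[count++] = value;
      text = tail;          /* Continue after the converted number.  */
    }
  return count;
}
```

Given the string @code{"1.5 0x1.8p1 -inf 1e2"}, this sketch stores four
values: 1.5, 3.0 (the hexadecimal form), negative infinity, and 100.0.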
| 2557 | |
| 2558 | @comment stdlib.h |
| 2559 | @comment ISO |
| 2560 | @deftypefun float strtof (const char *@var{string}, char **@var{tailptr}) |
| 2561 | @comment stdlib.h |
| 2562 | @comment ISO |
| 2563 | @deftypefunx {long double} strtold (const char *@var{string}, char **@var{tailptr}) |
| 2564 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2565 | These functions are analogous to @code{strtod}, but return @code{float} |
| 2566 | and @code{long double} values respectively. They report errors in the |
| 2567 | same way as @code{strtod}. @code{strtof} can be substantially faster |
| 2568 | than @code{strtod}, but has less precision; conversely, @code{strtold} |
| 2569 | can be much slower but has more precision (on systems where @code{long |
| 2570 | double} is a separate type). |
| 2571 | |
| 2572 | These functions were GNU extensions before they were added to @w{ISO C99}. |
| 2573 | @end deftypefun |
| 2574 | |
| 2575 | @comment wchar.h |
| 2576 | @comment ISO |
| 2577 | @deftypefun double wcstod (const wchar_t *restrict @var{string}, wchar_t **restrict @var{tailptr}) |
| 2578 | @comment wchar.h |
| 2579 | @comment ISO |
| 2580 | @deftypefunx float wcstof (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
| 2581 | @comment wchar.h |
| 2582 | @comment ISO |
| 2583 | @deftypefunx {long double} wcstold (const wchar_t *@var{string}, wchar_t **@var{tailptr}) |
| 2584 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2585 | The @code{wcstod}, @code{wcstof}, and @code{wcstold} functions are |
| 2586 | equivalent in nearly all aspects to the @code{strtod}, @code{strtof}, and |
| 2587 | @code{strtold} functions, but they handle wide character strings. |
| 2588 | |
| 2589 | The @code{wcstod} function was introduced in @w{Amendment 1} of @w{ISO |
| 2590 | C90}. The @code{wcstof} and @code{wcstold} functions were introduced in |
| 2591 | @w{ISO C99}. |
| 2592 | @end deftypefun |
| 2593 | |
| 2594 | @comment stdlib.h |
| 2595 | @comment ISO |
| 2596 | @deftypefun double atof (const char *@var{string}) |
| 2597 | @safety{@prelim{}@mtsafe{@mtslocale{}}@assafe{}@acsafe{}} |
| 2598 | This function is similar to the @code{strtod} function, except that it |
| 2599 | need not detect overflow and underflow errors. The @code{atof} function |
| 2600 | is provided mostly for compatibility with existing code; using |
| 2601 | @code{strtod} is more robust. |
| 2602 | @end deftypefun |
| 2603 | |
| 2604 | @Theglibc{} also provides @samp{_l} versions of these functions, |
| 2605 | which take an additional argument, the locale to use in conversion. |
| 2606 | |
| 2607 | See also @ref{Parsing of Integers}. |
| 2608 | |
| 2609 | @node System V Number Conversion |
| 2610 | @section Old-fashioned System V number-to-string functions |
| 2611 | |
| 2612 | The old @w{System V} C library provided three functions to convert |
| 2613 | numbers to strings, with unusual and hard-to-use semantics. @Theglibc{} |
| 2614 | also provides these functions and some natural extensions. |
| 2615 | |
| 2616 | These functions are only available in @theglibc{} and on systems descended |
| 2617 | from AT&T Unix. Therefore, unless these functions do precisely what you |
| 2618 | need, it is better to use @code{sprintf}, which is standard. |
| 2619 | |
| 2620 | All these functions are defined in @file{stdlib.h}. |
| 2621 | |
| 2622 | @comment stdlib.h |
| 2623 | @comment SVID, Unix98 |
| 2624 | @deftypefun {char *} ecvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
| 2625 | @safety{@prelim{}@mtunsafe{@mtasurace{:ecvt}}@asunsafe{}@acsafe{}} |
| 2626 | The function @code{ecvt} converts the floating-point number @var{value} |
| 2627 | to a string with at most @var{ndigit} decimal digits. The |
| 2628 | returned string contains no decimal point or sign. The first digit of |
| 2629 | the string is non-zero (unless @var{value} is actually zero) and the |
| 2630 | last digit is rounded to nearest. @code{*@var{decpt}} is set to the |
| 2631 | index in the string of the first digit after the decimal point. |
| 2632 | @code{*@var{neg}} is set to a nonzero value if @var{value} is negative, |
| 2633 | zero otherwise. |
| 2634 | |
| 2635 | If @var{ndigit} decimal digits would exceed the precision of a |
| 2636 | @code{double} it is reduced to a system-specific value. |
| 2637 | |
| 2638 | The returned string is statically allocated and overwritten by each call |
| 2639 | to @code{ecvt}. |
| 2640 | |
| 2641 | If @var{value} is zero, it is implementation defined whether |
| 2642 | @code{*@var{decpt}} is @code{0} or @code{1}. |
| 2643 | |
| 2644 | For example: @code{ecvt (12.3, 5, &d, &n)} returns @code{"12300"} |
| 2645 | and sets @var{d} to @code{2} and @var{n} to @code{0}. |
| 2646 | @end deftypefun |
| 2647 | |
| 2648 | @comment stdlib.h |
| 2649 | @comment SVID, Unix98 |
| 2650 | @deftypefun {char *} fcvt (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
| 2651 | @safety{@prelim{}@mtunsafe{@mtasurace{:fcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
| 2652 | The function @code{fcvt} is like @code{ecvt}, but @var{ndigit} specifies |
| 2653 | the number of digits after the decimal point. If @var{ndigit} is less |
| 2654 | than zero, @var{value} is rounded to the @math{(-@var{ndigit}+1)}'th place to the |
| 2655 | left of the decimal point. For example, if @var{ndigit} is @code{-1}, |
| 2656 | @var{value} will be rounded to the nearest 10. If @var{ndigit} is |
| 2657 | negative and its magnitude exceeds the number of digits to the left of the |
| 2658 | decimal point in @var{value}, @var{value} will be rounded to one significant digit. |
| 2659 | |
| 2660 | If @var{ndigit} decimal digits would exceed the precision of a |
| 2661 | @code{double} it is reduced to a system-specific value. |
| 2662 | |
| 2663 | The returned string is statically allocated and overwritten by each call |
| 2664 | to @code{fcvt}. |
| 2665 | @end deftypefun |
| 2666 | |
| 2667 | @comment stdlib.h |
| 2668 | @comment SVID, Unix98 |
| 2669 | @deftypefun {char *} gcvt (double @var{value}, int @var{ndigit}, char *@var{buf}) |
| 2670 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2671 | @c gcvt calls sprintf, that ultimately calls vfprintf, which malloc()s |
| 2672 | @c args_value if it's too large, but gcvt never exercises this path. |
| 2673 | @code{gcvt} is functionally equivalent to @samp{sprintf (buf, "%.*g", |
| 2674 | ndigit, value)}. It is provided only for compatibility's sake. It |
| 2675 | returns @var{buf}. |
| 2676 | |
| 2677 | If @var{ndigit} decimal digits would exceed the precision of a |
| 2678 | @code{double} it is reduced to a system-specific value. |
| 2679 | @end deftypefun |
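Since these functions are not standard, portable code may prefer the
@code{printf} family.  The following sketch shows one possible
counterpart of @code{gcvt} built on @code{snprintf}, which also guards
against writing past the end of the buffer.  The helper name
@code{format_significant} and its return convention are assumptions made
for this example.

```c
#include <stdio.h>

/* Hypothetical helper: format VALUE with at most NDIGIT significant
   digits into BUF, which holds SIZE bytes.  Returns 0 on success and
   -1 if the result did not fit or an output error occurred.  */
int
format_significant (double value, int ndigit, char *buf, size_t size)
{
  /* The precision of %g counts significant digits, so it is passed
     through the '*' precision argument.  */
  int needed = snprintf (buf, size, "%.*g", ndigit, value);

  return (needed >= 0 && (size_t) needed < size) ? 0 : -1;
}
```

For instance, formatting @code{3.14159} with three significant digits
stores @code{"3.14"} in the buffer.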
| 2680 | |
| 2681 | As extensions, @theglibc{} provides versions of these three |
| 2682 | functions that take @code{long double} arguments. |
| 2683 | |
| 2684 | @comment stdlib.h |
| 2685 | @comment GNU |
| 2686 | @deftypefun {char *} qecvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
| 2687 | @safety{@prelim{}@mtunsafe{@mtasurace{:qecvt}}@asunsafe{}@acsafe{}} |
| 2688 | This function is equivalent to @code{ecvt} except that it takes a |
| 2689 | @code{long double} for the first parameter and that @var{ndigit} is |
| 2690 | restricted by the precision of a @code{long double}. |
| 2691 | @end deftypefun |
| 2692 | |
| 2693 | @comment stdlib.h |
| 2694 | @comment GNU |
| 2695 | @deftypefun {char *} qfcvt (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}) |
| 2696 | @safety{@prelim{}@mtunsafe{@mtasurace{:qfcvt}}@asunsafe{@ascuheap{}}@acunsafe{@acsmem{}}} |
| 2697 | This function is equivalent to @code{fcvt} except that it |
| 2698 | takes a @code{long double} for the first parameter and that @var{ndigit} is |
| 2699 | restricted by the precision of a @code{long double}. |
| 2700 | @end deftypefun |
| 2701 | |
| 2702 | @comment stdlib.h |
| 2703 | @comment GNU |
| 2704 | @deftypefun {char *} qgcvt (long double @var{value}, int @var{ndigit}, char *@var{buf}) |
| 2705 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2706 | This function is equivalent to @code{gcvt} except that it takes a |
| 2707 | @code{long double} for the first parameter and that @var{ndigit} is |
| 2708 | restricted by the precision of a @code{long double}. |
| 2709 | @end deftypefun |
| 2710 | |
| 2711 | |
| 2712 | @cindex gcvt_r |
| 2713 | The @code{ecvt} and @code{fcvt} functions, and their @code{long double} |
| 2714 | equivalents, all return a string located in a static buffer which is |
| 2715 | overwritten by the next call to the function. @Theglibc{} |
| 2716 | provides another set of extended functions which write the converted |
| 2717 | string into a user-supplied buffer. These have the conventional |
| 2718 | @code{_r} suffix. |
| 2719 | |
| 2720 | @code{gcvt_r} is not necessary, because @code{gcvt} already uses a |
| 2721 | user-supplied buffer. |
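
As a minimal sketch of why the @code{_r} variants are useful, the following hypothetical helper @code{convert_pair} converts two values and keeps both digit strings alive at once. Each result lands in its own caller-supplied buffer, so the second conversion cannot overwrite the first. The @code{_DEFAULT_SOURCE} macro is an assumption about how the declarations are exposed.

```c
#define _DEFAULT_SOURCE   /* assumed to expose ecvt_r in <stdlib.h> */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: convert X and Y to NDIGIT significant digits.
   The digit strings are stored in XBUF and YBUF, each LEN bytes
   long.  Returns 0 on success and -1 if either conversion fails.  */
static int
convert_pair (double x, double y, int ndigit,
              char *xbuf, char *ybuf, size_t len)
{
  int decpt, neg;   /* not needed by this sketch, but required */

  assert (xbuf != NULL && ybuf != NULL && xbuf != ybuf);
  if (ecvt_r (x, ndigit, &decpt, &neg, xbuf, len) != 0)
    return -1;
  return ecvt_r (y, ndigit, &decpt, &neg, ybuf, len);
}
```

With plain @code{ecvt}, the first result would have to be copied out of the static buffer before the second call.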
| 2722 | |
| 2723 | @comment stdlib.h |
| 2724 | @comment GNU |
| 2725 | @deftypefun int ecvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
| 2726 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2727 | The @code{ecvt_r} function is the same as @code{ecvt}, except |
| 2728 | that it places its result into the user-specified buffer pointed to by |
| 2729 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
| 2730 | case of an error and zero otherwise. |
| 2731 | |
| 2732 | This function is a GNU extension. |
| 2733 | @end deftypefun |
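
The following sketch shows one way to turn the pieces returned by @code{ecvt_r} (the digit string, @var{decpt} and @var{neg}) back into readable text. The helper @code{format_ecvt} and its output layout @samp{[-]0.DIGITSeEXP} are illustrative choices, not part of the library. @code{_DEFAULT_SOURCE} is assumed to expose the declaration.

```c
#define _DEFAULT_SOURCE   /* assumed to expose ecvt_r in <stdlib.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write VALUE to OUT as "[-]0.DIGITSeEXP" using
   NDIGIT significant digits.  Returns 0 on success, -1 if ecvt_r
   reports an error or OUT is too small.  */
static int
format_ecvt (double value, int ndigit, char *out, size_t outlen)
{
  char digits[64];
  int decpt, neg;
  int n;

  assert (out != NULL);
  /* ecvt_r returns zero on success and -1 on error.  */
  if (ecvt_r (value, ndigit, &decpt, &neg, digits, sizeof digits) != 0)
    return -1;

  /* DECPT is the position of the decimal point relative to the
     start of the digit string, so it serves as the exponent here.  */
  n = snprintf (out, outlen, "%s0.%se%d", neg ? "-" : "", digits, decpt);
  return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

Calling @code{format_ecvt (3.14159, 4, out, sizeof out)} is expected to store @samp{0.3142e1} in @var{out}.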
| 2734 | |
| 2735 | @comment stdlib.h |
| 2736 | @comment GNU |
| 2737 | @deftypefun int fcvt_r (double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
| 2738 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2739 | The @code{fcvt_r} function is the same as @code{fcvt}, except that it |
| 2740 | places its result into the user-specified buffer pointed to by |
| 2741 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
| 2742 | case of an error and zero otherwise. |
| 2743 | |
| 2744 | This function is a GNU extension. |
| 2745 | @end deftypefun |
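
Because @code{fcvt_r} counts @var{ndigit} from the decimal point, its result can be reassembled into fixed-point text by reinserting the point at position @var{decpt}. The sketch below does this in a hypothetical helper @code{format_fcvt}. For brevity it handles only values whose magnitude is at least 1. @code{_DEFAULT_SOURCE} is assumed to expose the declaration.

```c
#define _DEFAULT_SOURCE   /* assumed to expose fcvt_r in <stdlib.h> */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write VALUE to OUT with NDIGIT digits after
   the decimal point.  Returns 0 on success and -1 on failure.  */
static int
format_fcvt (double value, int ndigit, char *out, size_t outlen)
{
  char digits[128];
  int decpt, neg;
  int n;

  assert (out != NULL);
  if (fcvt_r (value, ndigit, &decpt, &neg, digits, sizeof digits) != 0)
    return -1;

  /* This sketch only covers the case where the decimal point falls
     inside the digit string.  */
  if (decpt < 1 || (size_t) decpt > strlen (digits))
    return -1;

  /* Print the first DECPT digits, the point, then the remainder.  */
  n = snprintf (out, outlen, "%s%.*s.%s",
                neg ? "-" : "", decpt, digits, digits + decpt);
  return (n < 0 || (size_t) n >= outlen) ? -1 : 0;
}
```

For example, @code{format_fcvt (3.14159, 2, out, sizeof out)} is expected to store @samp{3.14}, because @code{fcvt_r} yields the digits @samp{314} with @var{decpt} set to 1.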
| 2746 | |
| 2747 | @comment stdlib.h |
| 2748 | @comment GNU |
| 2749 | @deftypefun int qecvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
| 2750 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2751 | The @code{qecvt_r} function is the same as @code{qecvt}, except |
| 2752 | that it places its result into the user-specified buffer pointed to by |
| 2753 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
| 2754 | case of an error and zero otherwise. |
| 2755 | |
| 2756 | This function is a GNU extension. |
| 2757 | @end deftypefun |
| 2758 | |
| 2759 | @comment stdlib.h |
| 2760 | @comment GNU |
| 2761 | @deftypefun int qfcvt_r (long double @var{value}, int @var{ndigit}, int *@var{decpt}, int *@var{neg}, char *@var{buf}, size_t @var{len}) |
| 2762 | @safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}} |
| 2763 | The @code{qfcvt_r} function is the same as @code{qfcvt}, except |
| 2764 | that it places its result into the user-specified buffer pointed to by |
| 2765 | @var{buf}, with length @var{len}. The return value is @code{-1} in |
| 2766 | case of an error and zero otherwise. |
| 2767 | |
| 2768 | This function is a GNU extension. |
| 2769 | @end deftypefun |