doc/ansi_C.doc

   1 .de NS
   2 .sp
   3 .in 0
   4 \\fBANS \\$1:\\fP
   5 ..
   6 .TL
   7 Amsterdam Compiler Kit-ANSI C compiler compliance statements
   8 .AU
   9 Hans van Eck
  10 .AI
  11 Dept. of Mathematics and Computer Science
  12 Vrije Universiteit
  13 Amsterdam, The Netherlands
  14 .PP
  15 This document specifies the implementation-defined behaviour of the ANSI-C
  16 front end of the Amsterdam Compiler Kit as required by ANS X3.159-1989.  Since
  17 the implementation-defined behaviour sometimes depends on the machine
  18 the compiler runs on or generates code for, some items are left
  19 unspecified in this document\(dg.
  20 .FS
  21 \(dg When cross-compiling, run-time behaviour may differ from
  22 compile-time behaviour.
  23 .FE
  24 The compiler assumes that it runs on a UNIX system.
  25 .NS A.6.3.1
  26 .IP -
  27 Diagnostics are placed on the standard error output.  They have the
  28 following specification:
  29 .br
  30 "<file>", line <nr>: [(<class>)] <diagnostic>
  31 .br
  32 There are three classes of diagnostics: "error", "strict" and "warning".
  33 When the class is "error", the <class> is absent.
  34 .br
  35 The class "strict" is used for violations of the standard which are
  36 not severe enough to stop compilation.  An example is the occurrence
  37 of non-white-space characters after an '#else' or '#endif' pre-processing
  38 directive.  The class "warning" is used for legal but dubious
  39 constructions.  An example is overflow of constant expressions.
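.LP
The diagnostic format described above can be sketched as a small formatting
routine.  This is only an illustration of the layout, not code taken from the
compiler; the name format_diagnostic and its parameters are hypothetical.
A null class stands for "error", for which the class tag is omitted.
.DS
```c
#include <stdio.h>

/* Hypothetical helper, not part of ACK: formats one diagnostic line as
 *   "<file>", line <nr>: [(<class>)] <diagnostic>
 * Pass cls == NULL for the class "error", whose tag is left out.
 * The double quotes are passed as characters to keep the format
 * strings free of escape sequences. */
int format_diagnostic(char *buf, size_t size, const char *file, int line,
                      const char *cls, const char *text)
{
    if (cls == NULL)
        return snprintf(buf, size, "%c%s%c, line %d: %s",
                        '"', file, '"', line, text);
    return snprintf(buf, size, "%c%s%c, line %d: (%s) %s",
                    '"', file, '"', line, cls, text);
}
```
.DE
Calling the routine with the class "strict" or "warning" yields the
parenthesised tag; calling it with a null class yields the bare error form.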
  40 .NS A.6.3.2
  41 .IP -
  42 The function 'main' can have two arguments.  The first argument is an
  43 integer specifying the number of arguments on the command line.  The second
  44 argument is a pointer to an array of pointers to the arguments (as
  45 strings).
  46 .IP -
  47 Interactive devices are terminals.
  48 .NS A.6.3.3
  49 .IP -
  50 The number of significant characters is an option.  By default it is 64.
  51 There is a distinction between upper and lower case.
  52 .NS A.6.3.4
  53 .IP -
  54 The compiler assumes ASCII-characters in both the source and execution
  55 character set.
  56 .IP -
  57 There are no multi-byte characters.
  58 .IP -
  59 There are 8 bits in a character.
  60 .IP -
  61 Character constants with values that cannot be represented in 8 bits
  62 are truncated.
  63 .IP -
  64 Character constants that are more than 1 character wide will have the
  65 first character specified in the least significant byte.
  66 .IP -
  67 The only supported locale is "C".
  68 .IP -
  69 A plain 'char' has the same range of values as 'signed char'.
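.LP
The rule for character constants wider than one character can be modelled
as follows.  This is a sketch under the assumptions stated in this section
(ASCII, 8 bits per character); the function multichar_value is hypothetical
and only computes the value that a constant such as 'ab' would get under
that rule.
.DS
```c
/* Hypothetical helper, not part of ACK: computes the value that a
 * multi-character constant such as 'ab' gets when the first character
 * is placed in the least significant byte (8 bits per character).
 * Characters that do not fit in an unsigned long are ignored here; that
 * is a choice of this sketch, not documented behaviour. */
unsigned long multichar_value(const char *chars)
{
    unsigned long value = 0;
    unsigned shift = 0;

    while (*chars != 0 && shift < 8 * sizeof value) {
        value |= (unsigned long)(unsigned char)*chars << shift;
        shift += 8;
        chars++;
    }
    return value;
}
```
.DE
For the two characters a and b this gives 0x6261 in ASCII: the code of a in
the low byte and the code of b in the next byte.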
  70 .NS A.6.3.5
  71 .IP -
  72 The compiler assumes that it runs on and compiles for a
  73 two's-complement binary number system.  Shorts use 2 bytes and longs
  74 use 4 bytes.  The size of an 'int' is machine-dependent.
  75 .IP -
  76 Converting an integer to a shorter signed integer is implemented by
  77 ignoring the high-order byte(s) of the former.
  78 Converting an unsigned integer to a signed integer of the same size
  79 only changes the type recorded by the compiler; the bit-pattern remains
  80 unchanged.
  81 .IP -
  82 The results of bitwise operations on signed integers are what can be
  83 expected on a two's-complement machine.
  84 .IP -
  85 If either operand is negative, whether the result of the / operator is the
  86 largest integer less than or equal to the algebraic quotient or the
  87 smallest integer greater than or equal to the algebraic quotient is machine
  88 dependent, as is the sign of the result of the % operator.
  89 .IP -
  90 The right-shift of a negative value is negative.
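.LP
The conversion to a shorter signed integer described above (dropping the
high-order bytes) can be modelled in portable C.  The sketch below assumes a
2-byte target, as for shorts; the name narrow_to_16_bits is hypothetical, and
the code uses unsigned arithmetic so that it does not itself depend on
implementation-defined behaviour.
.DS
```c
/* Hypothetical helper: models the conversion of a long to a 2-byte
 * signed integer by keeping the low-order 16 bits and reading them as
 * a two's-complement number. */
long narrow_to_16_bits(long wide)
{
    unsigned long low = (unsigned long)wide & 0xFFFFUL;

    if (low >= 0x8000UL)            /* sign bit of the 16-bit value set */
        return (long)low - 0x10000L;
    return (long)low;
}
```
.DE
With this model 0x12345 becomes 0x2345, and 0x18000 becomes -32768 because
bit 15 of the remaining pattern is set.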
  91 .NS A.6.3.6
  92 .IP -
  93 The representation of floating-point values is machine-dependent.
  94 When native floating-point is not present an IEEE-emulation is used.
  95 The compiler uses high-precision floating-point for constant folding.
  96 .IP -
  97 Truncation is always to the nearest floating-point number that can
  98 be represented.
  99 .NS A.6.3.7
 100 .IP -
 101 The type returned by the sizeof-operator (also known as size_t)
 102 is 'unsigned int'.  This is done for backward compatibility reasons.
 103 .IP -
 104 Casting an integer to a pointer or vice versa has no effect in
 105 bit-pattern when the sizes are equal.  Otherwise the value will be
 106 truncated or zero-extended (depending on the direction of the
 107 conversion and the relative sizes).
 108 .IP -
 109 When a pointer is as large as an integer, the type of a 'ptrdiff_t' will
 110 be 'int'.  Otherwise the type will be 'long'.
 111 .NS A.6.3.8
 112 .IP -
 113 Since the front end has only limited control over the registers, it can
 114 only make it more likely that variables that are declared as
 115 registers also end up in registers.  The only things that can possibly be
 116 put into registers are: 'int', 'long', 'float', 'double', 'long double'
 117 and pointers.
 118 .NS A.6.3.9
 119 .IP -
 120 When a member of a union object is accessed using a member of a
 121 different type, the resulting value will usually be garbage.  The
 122 compiler makes no effort to catch these errors.
 123 .IP -
 124 The alignment of types is a compile-time option.  The alignment of
 125 a structure-member is the alignment of its type.  Usually, the
 126 alignment is passed on to the compiler by the 'ack' program.  When a
 127 user wants to do this manually, he/she should be prepared for trouble.
 128 .IP -
 129 A "plain" 'int' bit-field is taken as a 'signed int'.  This means that
 130 a field with a size of 1 bit can only store the values 0 and -1.
 131 .IP -
 132 The order of allocation of bit-fields is a compile-time option.  By
 133 default, high-order bits are allocated first.
 134 .IP -
 135 An enum has the same size as a "plain" 'int'.
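.LP
The remark on 1-bit fields can be illustrated as follows.  In this sketch
the field is declared 'signed int' explicitly, so that it behaves the same
on compilers that treat a plain 'int' bit-field differently; the names
one_bit and store_in_one_bit are hypothetical.
.DS
```c
/* The field is declared 'signed int' explicitly, so the sketch does not
 * depend on how a given compiler treats a plain 'int' bit-field. */
struct one_bit {
    signed int flag : 1;
};

/* Stores 'value' in the 1-bit field and reads it back.  Only 0 and -1
 * are representable; other values are outside the field's range. */
int store_in_one_bit(int value)
{
    struct one_bit b;

    b.flag = value;
    return b.flag;
}
```
.DE
Code that uses such a field as a flag should therefore test it against 0
rather than compare it with 1.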
 136 .NS A.6.3.10
 137 .IP -
 138 An access to a volatile declared variable is done by just mentioning
 139 the variable.  E.g. the statement "x;" where x is declared volatile,
 140 constitutes an access.
 141 .NS A.6.3.11
 142 .IP -
 143 There is no fixed limit on the number of declarators that may modify an
 144 arithmetic, structure or union type, although specifying too many may
 145 cause the compiler to run out of memory.
 146 .NS A.6.3.12
 147 .IP -
 148 The maximum number of cases in a switch-statement is on the order of
 149 1e9, although the compiler may run out of memory somewhat earlier.
 150 .NS A.6.3.13
 151 .IP -
 152 Since both the pre-processor and the compiler assume ASCII-characters,
 153 a single character constant in a conditional-inclusion directive
 154 matches the same value in the execution character set.
 155 .IP -
 156 The pre-processor recognizes -I... command-line options.  The
 157 directories thus specified are searched first.  After that, depending on the
 158 command that the preprocessor is called with, machine/system-dependent
 159 directories are searched.  After that, ~em/include/_tail_ac and
 160 /usr/include are visited.
 161 .IP -
 162 Quoted names are first looked for in the directory that contains the
 163 file doing the include.
 164 .IP -
 165 The characters in a h- or q- char-sequence are taken to be UNIX
 166 paths.
 167 .IP -
 168 Neither the compiler nor the preprocessor knows any pragmas.
 169 .IP -
 170 Since the compiler runs on UNIX, __DATE__ and __TIME__ will always be
 171 defined.
 172 .NS A.6.3.14
 173 .IP -
 174 NULL is defined as ((void *)0).  This is done in order to flag dubious
 175 constructions like "int x = NULL;".
 176 .IP -
 177 The diagnostic printed by 'assert' is as follows:
 178 .ti +4n
 179 "Assertion "<expr>" failed, file "<file>", line <line>",
 180 .br
 181 where <expr> is the argument to the assert macro, printed as a string,
 182 <file> is the name of the source file and <line> is the line number.
 183 .KS
 184 .IP -
 185 The sets for character test macros.
 186 .TS
 187 l l.
 188 name:   set:
 189 isalnum()       0-9A-Za-z
 190 isalpha()       A-Za-z
 191 iscntrl()       \e000-\e037\e177
 192 islower()       a-z
 193 isupper()       A-Z
 194 isprint()       <space>-~ (== \e040-\e176)
 195 .TE
 196 .KE
 197 In addition, there is an isascii() macro, which tests whether a character
 198 is an ASCII character.  Characters in the range from \e000 to \e177 are ASCII
 199 characters.
 200 .KS
 201 .IP -
 202 The behaviour of mathematical functions on domain error:
 203 .TS
 204 l c
 205 l n.
 206 name:   returns:
 207 asin()  0.0
 208 acos()  0.0
 209 atan2() 0.0
 210 fmod()  0.0
 211 log()   -HUGE_VAL
 212 log10() -HUGE_VAL
 213 pow()   0.0
 214 sqrt()  0.0
 215 .TE
 216 .KE
 217 .IP -
 218 Underflow range errors do not cause errno to be set.
 219 .IP -
 220 The function fmod() returns 0.0 and sets errno to EDOM when the second
 221 argument is 0.0.
 222 .IP -
 223 The set of signals for the signal() function depends on the UNIX-system
 224 which the compiler is compiling for.  The default handling, semantics
 225 and behaviour of these signals are those specified by the operating
 226 system vendor.  The default handling is not reset when SIGILL is
 227 received.
 228 .IP -
 229 A text-stream need not end in a new-line character.
 230 .IP -
 231 White space characters before a new-line appear when read in.
 232 .IP -
 233 There may be any number of null characters appended to a binary
 234 stream.
 235 .IP -
 236 The file position indicator of an append mode stream is initially
 237 positioned at the beginning of the file.
 238 .IP -
 239 A write on a text stream does not cause the associated file to be
 240 truncated beyond that point.
 241 .IP -
 242 The buffering intended by the standard is fully supported.
 243 .IP -
 244 A zero-length file actually exists.
 245 .IP -
 246 A file name can consist of any character, except for the '\e0' and
 247 the '/'.
 248 .IP -
 249 A file can be open multiple times.
 250 .IP -
 251 When a remove() is done on an open file, reading and writing behave
 252 just as can be expected from a non-removed file.  When the associated
 253 stream is closed, all written data will be lost.
 254 .IP -
 255 When a file exists prior to a call to rename(), the behaviour is that
 256 of the underlying UNIX system.  Normally, the call would fail.
 257 .IP -
 258 The %p conversion in fprintf() has the same effect as %#x or %#lx,
 259 depending on the sizes of pointer and integer.
 260 .IP -
 261 The %p conversion in fscanf() has the same effect as %x or %lx,
 262 depending on the sizes of pointer and integer.
 263 .IP -
 264 A - character that is neither the first nor the last character in the
 265 scanlist for %[ conversion is taken to be a range indicator.  When the
 266 first character has a higher ASCII-value than the second, the - will
 267 just be put into the scanlist.
 268 .IP -
 269 The value of errno when fgetpos() or ftell() fails is that of lseek().
 270 This means:
 271 .RS
 272 .IP "EBADF \-" 10
 273 when the stream is not valid
 274 .IP "ESPIPE \-"
 275 when fildes is associated with a pipe (and on some systems: sockets)
 276 .IP "EINVAL \-"
 277 the resulting file pointer would be negative
 278 .RE
 279 .LP
 280 .IP -
 281 The messages generated by perror() depend on the value of errno.
 282 The mapping of errors to strings is done by strerror().
 283 .IP -
 284 When the requested size is zero, malloc(), calloc() and realloc()
 285 return a null-pointer.
 286 .IP -
 287 When abort() is called, output buffers will be flushed.  Temporary files
 288 (made with the tmpfile() function) will have disappeared when SIGABRT
 289 is not caught or ignored.
 290 .IP -
 291 The exit() function returns the low-order eight bits of its argument
 292 to the environment.
 293 .IP -
 294 The predefined environment names are controlled by the user.
 295 Setting environment variables is done through the putenv() function.
 296 This function accepts a pointer to char as its argument.
 297 For instance, to set the environment variable TERM to a230, one writes
 298 .ti +4n
 299 putenv("TERM=a230");
 300 .br
 301 The argument to putenv() is stored in an internal table, so malloc'ed
 302 strings cannot be freed until another call to putenv() (which sets the
 302 strings can not be freed until another call to putenv() (which sets the
 303 same environment variable) is made.  The function returns 1 if it fails,
 304 0 otherwise.
 305 .LP
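Because the argument to putenv() stays in use after the call, the string
should live in storage that outlasts the caller.  The following is a minimal
sketch of that pattern, assuming a system that declares putenv() in
<stdlib.h>; the helper set_term and the buffer term_setting are hypothetical
names, and the feature-test macro is only needed on systems that hide
putenv() by default.
.DS
```c
#define _XOPEN_SOURCE 600   /* assumption: needed by some systems for putenv() */
#include <stdlib.h>
#include <stdio.h>

/* Static storage: the string handed to putenv() stays in use by the
 * environment, so it must remain valid after this function returns. */
static char term_setting[64];

/* Hypothetical helper: sets TERM to 'name' through putenv().
 * Returns 0 on success and non-zero on failure. */
int set_term(const char *name)
{
    int n = snprintf(term_setting, sizeof term_setting, "TERM=%s", name);

    if (n < 0 || (size_t)n >= sizeof term_setting)
        return 1;                   /* name does not fit in the buffer */
    return putenv(term_setting) != 0;
}
```
.DE
After a successful call such as set_term("a230"), getenv("TERM") yields
a230.  A second call overwrites the same buffer, so no storage is leaked.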
 306 .IP -
 307 The argument to system() is passed as the argument to /bin/sh -c.
 308 .IP -
 309 The strings returned by strerror() depend on errno in the following
 310 way:
 311 .TS
 312 l l.
 313 errno   string
 314 0       "Error 0",
 315 EPERM   "Not owner",
 316 ENOENT  "No such file or directory",
 317 ESRCH   "No such process",
 318 EINTR   "Interrupted system call",
 319 EIO     "I/O error",
 320 ENXIO   "No such device or address",
 321 E2BIG   "Arg list too long",
 322 ENOEXEC "Exec format error",
 323 EBADF   "Bad file number",
 324 ECHILD  "No children",
 325 EAGAIN  "No more processes",
 326 ENOMEM  "Not enough core",
 327 EACCES  "Permission denied",
 328 EFAULT  "Bad address",
 329 ENOTBLK "Block device required",
 330 EBUSY   "Mount device busy",
 331 EEXIST  "File exists",
 332 EXDEV   "Cross-device link",
 333 ENODEV  "No such device",
 334 ENOTDIR "Not a directory",
 335 EISDIR  "Is a directory",
 336 EINVAL  "Invalid argument",
 337 ENFILE  "File table overflow",
 338 EMFILE  "Too many open files",
 339 ENOTTY  "Not a typewriter",
 340 ETXTBSY "Text file busy",
 341 EFBIG   "File too large",
 342 ENOSPC  "No space left on device",
 343 ESPIPE  "Illegal seek",
 344 EROFS   "Read-only file system",
 345 EMLINK  "Too many links",
 346 EPIPE   "Broken pipe",
 347 EDOM    "Math argument",
 348 ERANGE  "Result too large"
 349 .TE
 350 Everything else causes strerror() to return "unknown error".
 351 .IP -
 352 The local time zone is by default MET (GMT + 1:00:00).  This can be
 353 changed through the TZ environment variable, or by some changes in the
 354 sources.
 355 .IP -
 356 The clock() function returns the number of ticks since process
 357 startup.
 358 .SH
 359 References
 360 .IP [1]
 361 ANS X3.159-1989
 362 .I
 363 American National Standard for Information Systems -
 364 Programming Language C
 365 .R