Textbooks are full of good advice:
Use other aids as well. Explaining your code to someone else (even a teddy bear) is wonderfully effective. Use a debugger to get a stack trace. Use some of the commercial tools that check for memory leaks, array bounds violations, suspect code and the like. Step through your program when it has become clear that you have the wrong picture of how the code works.
-- Brian W. Kernighan, Rob Pike, The Practice of Programming, 1999 (Chapter 5: Debugging)
Enable every optional warning; view the warnings as a risk-free, high-return investment in your program. Don't ask, "Should I enable this warning?" Instead ask, "Why shouldn't I enable it?" Turn on every warning unless you have an excellent reason not to.
-- Steve Maguire, Writing Solid Code, 1993
Sound familiar? But which option does the job? This page tries to answer that kind of question.
Constructive feedback is welcome.
[From Thomas Wolf]
There are two IEEE standards for floating-point arithmetic, numbered 754 and 854. People usually mean the 754 standard, published as ANSI/IEEE Std 754-1985. It is also an IEC standard (IEC 559:1989) and has been published in ACM SIGPLAN Notices 22(2), pp. 9-25, Feb. 1987.
The ANSI/IEEE Std 854-1987 standard allows both binary and decimal bases for floating-point values, and it doesn't specify how floating-point numbers are encoded (i.e. the bit layout).
Most implementations default to the "non stop arithmetic" behaviour, where arithmetic exceptions are masked at start up and consequently are not delivered synchronously to the program. This can usually be overridden by using a compiler flag (C/C++, Fortran) or by calling a system API to modify the exception mask. Unfortunately, these APIs are not part of any standard.
Here are the main ones:
Several packages contain some relevant code:
[From J. Giles]
D. W. Matula, A Formalization of Floating-Point Numeric Base Conversion, IEEE Transactions on Computers, vol. C-19, no. 8, pp. 681-692, August 1970
Basically, the condition is that the number of decimal digits D, and the number of binary bits B should be related as follows:
10^(D-1) > 2^B
If that's the case, then translating the binary to decimal and back to binary again is an identity operation. So, for IEEE single precision, the number of bits is 24 (counting the hidden normalization bit), so you should have D=9 (or more). For IEEE double, B is 53 and you want D=17 (or more). For Intel's version of double extended, you have B=64, so D>=21.
Similarly, if you want to translate from decimal to binary and back to decimal and get the same answer, the required relation between the precisions is:
2^(B-1) > 10^D
That is, some people like to enter 0.1 and not get 0.09999997 back. For this, the maximum number of decimal digits you should use for the IEEE and Intel binary representations is:
single: D<=6
double: D<=15
double-extended: D<=18
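The two inequalities above can be checked numerically. The following C sketch is an illustration, not part of the original note: the function names are hypothetical, and the round-trip test assumes that the C library's snprintf and strtod round correctly (glibc's do).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define LOG10_2 0.30102999566398120   /* log10(2) */

/* Smallest D with 10^(D-1) > 2^B:
   binary -> decimal -> binary is then the identity. */
int min_digits_binary_roundtrip(int bits)
{
    assert(bits > 0);
    return (int)(bits * LOG10_2) + 2;
}

/* Largest D with 2^(B-1) > 10^D:
   decimal -> binary -> decimal is then the identity. */
int max_digits_decimal_roundtrip(int bits)
{
    assert(bits > 1);
    return (int)((bits - 1) * LOG10_2);
}

/* Print x with `digits` significant digits, read the text back,
   and report whether the original value was recovered. */
int double_survives_text(double x, int digits)
{
    char buf[64];

    assert(digits >= 1 && digits <= 40);
    snprintf(buf, sizeof buf, "%.*e", digits - 1, x);
    return strtod(buf, NULL) == x;
}
```

Called with B = 24, 53 and 64, the two helper functions reproduce the digit counts quoted above. double_survives_text shows the practical consequence: a value such as 1.0/3.0 is not recovered from 15 digits but is recovered from 17.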
The evaluation of a function reference shall neither affect nor be affected by the evaluation of any other entity within the statement. If a function reference causes definition or undefinition of an actual argument of the function, that argument or any associated entities shall not appear elsewhere in the same statement.
Jon Bentley, author of Programming Pearls, published Writing Efficient Programs, in which he provides a unified, pragmatic treatment of program efficiency, independent of language and host platform. For ease of presentation, he codified his methods as a set of terse rules in Appendix C of Writing Efficient Programs.
Win32: fixed at link time
You can increase this by specifying the linker option /stack:n where n is the number of bytes (in decimal) you want for the stack. In Developer Studio, select Project..Settings..Link and add the option switch to the list of Project Options.
You can also change the stack size of an already linked executable with the command editbin /stack:n program.exe.
Finally, a command like "limit stacksize unlimited" increases the limit to a maximum value set by some kernel parameter. The kernel can be configured to change this limit.
If the maxima are too small, a quick workaround for Fortran programmers is to "SAVE" (or "ALLOCATE") the offending arrays or to compile the subroutines (or the whole program) with a flag that does an automatic "SAVE".
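On POSIX systems, the limit that a command such as "limit stacksize" manipulates can also be inspected and raised from within a program. This is a minimal sketch, assuming getrlimit and setrlimit from <sys/resource.h>; the function names are hypothetical. How a raised limit affects an already running process is system dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Read the soft and hard stack size limits of the current process.
   Returns 0 on success, -1 on failure. */
int get_stack_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    assert(soft != NULL && hard != NULL);
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Raise the soft limit to the hard limit, which an unprivileged
   process is allowed to do. Returns 0 on success, -1 on failure. */
int raise_stack_limit_to_max(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A launcher program could call raise_stack_limit_to_max() before starting the real application, so that the child process inherits the larger limit.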
Use set -u to trap on uninitialized variables and set -x to get a trace.
if [ "x${VAR-}" = x ]; then
if [ "`echo -e xxx`" = xxx ]; then
echo='echo -e'
else
echo='echo'
fi
Additionally, nearly all compilers can be asked to give "more" warnings, as detailed within this document, and most of them offer a "verify syntax only" option.
Note that sometimes the flow analysis is performed only when the relevant optimisation flag is set.
Fast, powerful, usable through telnet but raw. Quickest way to get a stack trace ("where" in dbx, "backtrace" or "bt" in gdb).
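When no debugger is at hand, a program linked against glibc can print its own stack trace. The sketch below assumes glibc's <execinfo.h> (backtrace and backtrace_symbols_fd). The function names are hypothetical, and the signal handler is best effort only, since backtrace is not guaranteed to be async-signal-safe.

```c
#include <assert.h>
#include <execinfo.h>   /* glibc specific */
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Write the current call stack to file descriptor fd.
   Returns the number of frames captured. */
int print_stack_trace(int fd)
{
    void *frames[MAX_FRAMES];
    int count;

    assert(fd >= 0);
    count = backtrace(frames, MAX_FRAMES);
    backtrace_symbols_fd(frames, count, fd);
    return count;
}

/* Signal handler: dump the stack to stderr, then terminate. */
static void crash_handler(int sig)
{
    (void)sig;
    print_stack_trace(STDERR_FILENO);
    _exit(1);
}

/* Install the handler for SIGSEGV. Returns 0 on success, -1 on failure. */
int install_crash_trace(void)
{
    return signal(SIGSEGV, crash_handler) == SIG_ERR ? -1 : 0;
}
```

With gcc, linking with -rdynamic usually lets glibc print function names. Otherwise the raw addresses can be passed to addr2line.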
awk '/^[0-9]/{print $2}' prog.exe.stackdump | addr2line -f -e prog.exe
set CYGWIN=error_start=x:\path\to\cygwin\bin\dumper.exe
(possibly in cygwin.bat). Then, upon a crash, a core file will be created. You can then use gdb to analyse the result.
On UNIX systems that have no development environment (and therefore lack such debuggers as xdb, dbx, ...) you can still get some information from the coredump by using "adb" or "sdb". These are low-level, general-purpose debuggers with a rather terse interface. One simple recipe for "adb" is this:
> adb programfile core
$c
:q
$c gives you the stack trace, so that you at least know in which routine the program crashed; :q ends the debug session.
Each Unix vendor offers its own graphical debugger (workshop on Solaris, cvd on Irix, dde on HP, ladebug -gui (was dxladebug) on Compaq, xldb on AIX, ...).
My preference goes to the Data Display Debugger (ddd), which provides a uniform interface across Unix platforms, encapsulates the system debugger (gdb, dbx, ladebug, ...) and works well (Dr Dobb's article).
setenv TVROOT xxx
setenv PATH "${PATH}:${TVROOT}/bin"
setenv MANPATH "${MANPATH}:${TVROOT}/man"
[f77, f90] -64 -g <prog-name>.f -L${TVROOT}/lib -ldbfork_n64 -lmpi -o <application-name>
[f77, f90] -n32 -g <prog-name>.f -L"${TVROOT}"/lib -ldbfork_n32 -lmpi -o <application-name>
mpirun -np <nprocs> totalview <application-name>
Feature | Detail | GNU compilers - Linux (and any platform supported by gcc) | Sun compilers - Sun Solaris | HP compilers - HP HP-UX | IBM compilers - IBM AIX, Linux, … | SGI compilers - SGI IRIX | Compaq compilers - Tru64 Unix | Microsoft C++ compiler - Windows |
---|---|---|---|---|---|---|---|---|
version of targeted tools | | GNU compilers (gcc, g++ [3.4.x]) | Sun compilers (cc, CC [Workshop 6.0 update 1]) | HP compilers (cc [11.x], aCC [3.x]) | IBM compilers (xlc [8.0]) | SGI compilers (cc, CC) | Compaq compilers (discontinued: cc [5.3], cxx [6.2]) | Microsoft Visual C++ (CL.EXE) [7.0] |
verify syntax only | | -fsyntax-only | [cc] -xe | N/A | -qsyntaxonly | [all compilers] -Hf (was -fe) | -Hf | /Zs |
floating point trapping (note) | principle | system dependent API calls (references) | compiler support: -ftrap=xxx | linker support: +FP xxx | compiler support: -qflttrap=xxx | link with -lfpe + setenv TRAP_FPE ... | compiler support (-fptm<x>) | API calls (_control87) + debugger support |
floating point trapping (note) | trap DIV, INV, OV | API calls (Linux/glibc and x86 examples) | -ftrap=common or -fnonstd | +FP VZO | -qflttrap=inv:ov:zero:en | see example | default (-fptm n) | API calls |
integer trapping | | overflow: -ftrapv; divide by zero: default | overflow: N/A; divide by zero: default | overflow: N/A; divide by zero: default | overflow: N/A; divide by zero: -qcheck=divzero (implied by -qcheck=all) | overflow: -DEBUG:div_check=3 (seems not to work); divide by zero: default | overflow: N/A; divide by zero: default | debugger support |
standard conformance | | [gcc, g++] -ansi -pedantic | [cc] -Xa, -Xc | [cc] -Aa, +Mlevel [aCC] -Aa, +p | -qlanglvl=<xx> | [cc, CC] -ansi [CC] -LANG:std | [cc] -std<n> [cxx] -std strict_ansi | /Za |
run-time detection of uninitialized variable (note) | | [tools] On linux x86, valgrind | [tools] various available | N/A | [compiler support] for stack storage: -qinitauto=FF. In practice -qinitauto=FF -qflttrap=inv:ov:zero:en -qfloat=nans (??); for heap storage -qheapdebug (AIX specific) | [compiler support] -DEBUG:trap_uninitialized (was -trapuv); static memory: -Wl,'-f 0xFFFFFFFF' | [compiler support] -trapuv (static memory: -Wl,'-f 0xFFFFFFFF' doesn't operate as on SGI) [tools] atom -tool third | [compiler support] /GZ (MS C++ 6.0), /RTC1 (MS C++ 7.0) [tools] commercial tools |
compile time flow analysis (note) | | -Wuninitialized -O | [C] lint -Nlevel=n (n>=2) | [hp compilers] +Onoinitcheck | -qinfo=uni | N/A | N/A | /Z3, /Z4 |
put literal strings in read-only memory | | default | [cc] -xstrconst [CC] -features=conststrings | [cc] +ESlit | -qro, -qroconst | [all compilers] -use_readonly_const -G0 -rdata_shared | [all compilers] -readonly_strings | /GF |
abort on dereferencing null pointer | | default | default | -z | -qcheck=nullptr (implied by -qcheck=all) | default | N/A | default |
take advantage of aliasing rules (note) | | -fstrict-aliasing (implied by -O2 (gcc>=3.x)) | [cc] -xalias_level=std | [cc] +Optrs_ansi, +Optrs_strongly_typed, +Otype_safety=ansi | -qalias=ansi -O (was -qansialias) | -OPT:alias=typed, (seems not to work: -LANG:alias_const) | -ansi_alias | N/A |
check varargs | | N/A (?) | N/A | N/A | N/A | -DEBUG:varargs_interface_check, -DEBUG:varargs_prototypes | [cc] -vararg | N/A |
check calls | compile time | [gcc] K&R decl.: -Wstrict-prototypes, -Wold-style-definition, missing decl.: -Wmissing-prototypes | [cc] K&R decl.: -fd | [cc] missing decl.: +w1 | decl. consistency: -qinfo=dcl, missing decl.: -qinfo=pro | [cc] missing decl.: -fullwarn | [cc] missing decl.: -warnprotos | ?? |
check calls | link or run time | N/A | N/A | N/A | link time: -qextchk (AIX specific) | N/A | N/A | N/A |
link or run time memory debugging (note) | | [tools] various (Valgrind, Electric Fence, ...) [library] glibc | [compilers] stack overflow check: -xcheck=stkovf (>= 7.0) [tools] various available [library] man watchmalloc | [tools] gdb (aka wdb) | -qheapdebug (AIX specific), -qcheck=bound (implied by -qcheck=all) | [all compilers] -DEBUG:subscript_check [library] man malloc_ss | [cc] -check_bounds [tools] atom -tool third | build in debug mode, buffer security check: /GS (MS C++ 7.0), /RTC1 (MS C++ 7.0) [tools] commercial tools, MS pageheap, built-in facilities defined in crtdbg.h |
flags for debugger | | -g, -ggdb, -g3, -ggdb3 | -g, -xs, -g0 | -g, [aCC] -g0, +objdebug, +d | -g, -qfullpath, -qlinedebug | -g, -g3, [CC] -gslim | -g0, -g1, -g2, -g3, [cxx] -gall | build in debug mode |
reentrant code | | (glibc based systems, e.g. Linux) -D_REENTRANT | -mt | -D_POSIX_C_SOURCE=199506L (c.f. man pthread) [aCC >=3.30] -mt | use the ..._r commands (xlc_r, ...) | -D_POSIX_C_SOURCE=199506L (c.f. man 3 intro) | [all compilers] -pthread | /MD, /ML, /MT |

Feature | Detail | GNU compiler - Linux (and any platform supported by gcc) | Intel compiler - Linux | Sun compiler - Sun Solaris | HP compiler - HP HP-UX | IBM compiler - IBM AIX, Linux, … | SGI compilers - SGI IRIX | Compaq compiler - Tru64 Unix | Compaq Fortran compiler - Windows | Salford Fortran compilers - Windows | NAGWare Fortran compiler (any supported platform) | Lahey Fortran compiler |
---|---|---|---|---|---|---|---|---|---|---|---|---|
version of targeted tools | | GNU compiler (gfortran [4.2.x], g77 [discontinued after 3.4.x]) | Intel Fortran compiler (ifort [9.1]) | Sun compiler (f90, f95 [Workshop 7.0]) | HP compiler (f90 [2.4]) | IBM compiler (xlf [10.1]) | SGI compilers (f90 [7.3]) | Compaq compiler (discontinued: f90, f95 [5.5]) | Compaq Visual Fortran (discontinued: DF [6.6]) (Windows Fortran compilers) | Salford Fortran compilers FTN77 [4.0], FTN95 [3.0] (Windows Fortran compilers) | NAGWare compiler (f95 [5.0]) | Lahey/Fujitsu compiler lf95 [6.0] (Windows Fortran compilers) |
verify syntax only | | -fsyntax-only | -syntax, -y | N/A | N/A | N/A | -Hf (was -fe) | -syntax_only | /syntax_only | N/A | -M -M -nomod (no module files produced) | N/A |
floating point trapping (note) | principle | [gfortran] -ffpe-trap=xxx [g77] system dependent API calls (references) | compiler support (-fpe<x>) | compiler support: -ftrap=xxx | linker support: +FP xxx | compiler support: -qflttrap=xxx | link with -lfpe + setenv TRAP_FPE ... | compiler support (-fpe<x>) | /fpe:<level> | API calls | compiler support: -ieee=xxx | --trap <args> |
floating point trapping (note) | trap DIV, INV, OV | [gfortran] -ffpe-trap=invalid,zero,overflow [g77] API calls (Linux/glibc and x86 examples) | -fpe 0 | -ftrap=common or -fnonstd | +FP VZO | -qflttrap=inv:ov:zero:en | see example | default (-fpe) | /fpe:0 (non default on x86) | default | default (-ieee=stop) | -trap dio |
integer trapping | | overflow: -ftrapv; divide by zero: default | overflow: N/A; divide by zero: default | overflow: N/A; divide by zero: default | overflow: possible with directive; divide by zero: default | overflow: N/A; divide by zero: default | overflow: -DEBUG:div_check=3 (seems not to work); divide by zero: default | overflow: -check overflow; divide by zero: default | overflow: /check:overflow; divide by zero: default | overflow: N/A; divide by zero: default | overflow: N/A; divide by zero: default | overflow: N/A; divide by zero: default |
standard conformance | | -pedantic | -stand | -ansi | +langlvl=xx | -qlanglvl=<xx> | -ansi | -std<xx> | /stand | /ANSI, [FTN95] /ISO, /RESTRICT_SYNTAX | default | --f95 |
run-time detection of uninitialized variable (note) | | [tools] On linux x86, valgrind [gfortran >= 4.3] -finit-real=nan, -finit-integer=xxx, -finit-logical=xxx (f2c has -trapuv since June 2001) | (compile with -auto) [compiler support] -ftrapuv (ifort 9.0 does not use NaN which makes it less useful), -check uninit [tools] valgrind on x86 | [tools] various available (compile with -stackvar) | N/A | [compiler support] for stack storage: -qinitauto=FFF00000. In practice [xlf] -qnosave -qinitauto=FFF00000 -qflttrap=inv:ov:zero:en | [compiler support] -DEBUG:trap_uninitialized (was -trapuv); static memory: -Wl,'-f 0xFFFFFFFF' | [compiler support] -trapuv (compile with -automatic) (static memory: -Wl,'-f 0xFFFFFFFF' doesn't operate as on SGI) [tools] atom -tool third | [compiler support] N/A (/automatic may help somewhat, see as well 1) [tools] commercial tools | /UNDEF | [compiler support] -nan, -C=undefined [tools] some may help | --check |
compile time flow analysis (note) | | -Wuninitialized -O | (>=10.x) -diag-enable sv (disable object file generation) | -XlistE | +Onoinitcheck | N/A | ftnlint (limited analysis) | -automatic (optimisation must be on) (-warn uninitialized on by default) | default with /automatic | default | limited | ?? |
put literal strings in read-only memory | | default | default (/assume:protect_constants) | N/A | N/A | N/A | N/A | -readonly_strings, -assume protect_constants (default) | default (/assume:protect_constants) | /CHECK, [FTN95] /FULL_UNDEF | default or N/A | --npca/--pca |
abort on dereferencing null pointer | | N/A | -check pointer | ?? | ?? | ?? | ?? | ?? | ?? | /FULL_UNDEF | -C=pointer | ?? |
stack oriented/static allocation | | [gfortran >= 4.3] -frecursive, [g77] default / -fno-automatic | -auto / -save (default is -auto_scalar) | -stackvar / default | default (+nosave) / +save | -qnosave / -qsave | default / -static | -automatic / default (-static) | /automatic / default (/static) | default / /SAV | default / -save | default (--nsav) / --sav |
disallow implicit declaration | | [gfortran] -fimplicit-none [g77] -Wimplicit | -u, -implicitnone | -u | +implicit_none | -u (or -qundef) | -u | -u (or -warn declarations) | /warn:declarations | /IMPLICIT_NONE | -u | --in |
check calls | compile time | [g77] (per file) default | (across files) -gen-interfaces and -warn interfaces | (per file) default | ?? | N/A | (per file) default | (per file) -warn argument_checking | (per file) /warn:argument_checking | N/A | (per file) default | (per file) default |
check calls | link or run time | N/A | mismatch in number of arguments (Windows only) /iface:cvf | N/A | N/A | link time: -qextchk (AIX specific) | N/A | N/A | mismatch in number of arguments detected due to stdcall convention | /CHECK, /FULLCHECK, [FTN95] /FULL_UNDEF | run time: -C=calls | --check, --checkglobal |
link or run time memory debugging (note 1, note 2) | | [gfortran, g77] -fbounds-check [tools] various (Valgrind, ...) [library] Linux/glibc | -check bounds | -C, -xcheck=stkovf (f95 >= 7.0) [tools] various available | +check=all [tools] gdb (aka wdb) | -C | -DEBUG:subscript_check (set the environment variable F90_BOUNDS_CHECK_ABORT to "YES") | -C [tools] atom -tool third | [DF] /check:bounds | /CHECK, /FULLCHECK, [FTN95] /FULL_UNDEF | [f95] -C; memory tracing: -mtrace [tools] various | --check |
stack trace on crash | | gfortran >= 4.3 -fbacktrace | -traceback | | +fp_exception | -qsigtrap=xl__trcedump | | | /traceback | | -gline | default (--trace) |
flags for debugger | | -g, -ggdb, -g3, -ggdb3 | -g, -inline_debug_info | -g, -xs, -g0 | -g | -g, -qfullpath | -g, -g3 | -g0, -g1, -g2, -g3, [f95] -assume gfullpath, [f95] -ladebug | build in debug mode | /DEBUG, [FTN95] /FULL_DEBUG | -g | -g |
reentrant code | | ?? | -recursive, -threads | ?? | ?? | use the ..._r commands (xlf_r, ...) | ?? | -reentrancy threaded | /recursive, /threads | /MULTI_THREADED (FTN95>=3.0) | -thread_safe | ?? |
This may be implemented:
Note: Fortran applications benefit from being built with a "no save" flag. This can lead to a stack overflow (see Increasing the stack size). Fortran static data should be initialized. Some linkers offer some detection capability.
Selective Fortran bound checking
Sometimes this kind of instrumentation cannot be applied everywhere because some routines rely on dirty tricks. One workaround is to disable the feature locally with a directive, if the compiler offers one. It is possible at least with:
"!DIR$ [NO]BOUNDS"
"OPTIONS /CHECK=NOBOUNDS" (before the program unit)
"@PROCESS NOCHECK"
The resulting portability problem can be solved either by using INCLUDE (or #include) with a small file containing the relevant code or by guarding it with some #ifdef.
Of course, an alternative is to apply the compile-line option selectively in the build procedure.
2 and, if available, MALLOC_PERTURB_ to a perturbation byte such as B.
Build with gcc -c trapfpe.c and link with trapfpe.o or build with gcc -shared -o trapfpe.so trapfpe.c and set LD_PRELOAD to this library. Adapted from: info g77 'Trouble' 'Missing Features' 'Floating-point Exception Handling'.
Starting with glibc 2.2, the following C99-style (but glibc specific) code is preferred.
#define _GNU_SOURCE 1
#include <fenv.h>
static void __attribute__ ((constructor)) trapfpe(void)
{
/* Enable some exceptions. At startup all exceptions are masked. */
feenableexcept(FE_INVALID|FE_DIVBYZERO|FE_OVERFLOW);
}
Previous versions of glibc require some platform dependent code (x86 specific).
Look for the "info" pages. Under emacs on a Linux box: "C-h i".
Interesting "debug" nodes:
"lint" with gcc (by James Hu) - somewhat out-of-date:
glint() {
gcc -ansi -pedantic -pedantic-errors -O \
-Wall -W -Wtraditional -Wpointer-arith -Wbad-function-cast \
-Wcast-qual -Wcast-align -Wwrite-strings -Wconversion \
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations \
-S -o - "$@" > /dev/null; }
Joseph Myers' favourite warnings as of gcc 3.4.
Some useful flags to find out what's going on. Many others are available.
According to the documentation, to use the libstdc++ debug mode, compile your application with the compiler flag -D_GLIBCXX_DEBUG.
GCC (at least up to the 3.4 release), does not offer "referential transparency" on Intel x86 that uses floating point extended registers. It can lead to puzzling results as explained by Brad Lucier who showed as well how the compiler could be fixed. The x87 architecture which has only 8 80-bit FPU registers is particularly sensitive to uncontrollable register spills to 64 bits in memory that cause double roundings. Furthermore, variables allocated on the stack may be optimized away in register, changing the precision and producing different results depending on the level of optimization of the program. Some possible actions are detailed below. Note that the x86-64 architecture in 64 bit mode uses the SSE floating point device and not the x87 stack and is therefore immune to this problem.
Note that even if gcc produced code that spilled double extended registers to the stack without rounding, the results would still occasionally differ from those obtained on an architecture without extended arithmetic, because of double rounding.
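A program can find out whether its compiler evaluates expressions in extended precision by looking at the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below is an illustration only: the function names are hypothetical, and forcing a value through a volatile variable is a common workaround rather than a guaranteed cure for double rounding.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Value of FLT_EVAL_METHOD for this compiler and these options. */
int current_fp_eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Human-readable meaning of a FLT_EVAL_METHOD value (C99, 5.2.4.2.2). */
const char *fp_eval_method_name(int method)
{
    switch (method) {
    case 0:  return "each operation is evaluated in the precision of its type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "everything is evaluated as long double (typical of x87)";
    default: return "indeterminate or implementation defined";
    }
}

/* Round a possibly extended-precision value to double by storing it
   in memory: the volatile qualifier prevents it from staying in a register. */
double force_double(double x)
{
    volatile double stored = x;
    return stored;
}

/* Sanity check: an exactly representable value must come back unchanged. */
void check_force_double(void)
{
    assert(force_double(0.5) == 0.5);
}
```

Printing fp_eval_method_name(current_fp_eval_method()) at start up is a cheap way to document which evaluation mode a given build uses.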
Using "long double" explicitly, for instance on some reduction operations (sum, dot product, ...), may help.
Use "#pragma STDC FENV_ACCESS ON" when it is supported. Meanwhile, careful testing is required.
wdb 2.0 and later (HP-supported version of gdb) offers some memory debugging capability.
Note: xxlf allows one to put together the options through a GUI.
;; GNU Emacs 20.x (but not XEmacs) has broken the compilation regexps.
;; So add one for NAGWare f95
(require 'compile)
(setq compilation-error-regexp-alist
(append compilation-error-regexp-alist
'(("^[A-Za-z ]+[:] \\([^.]+[.][a-zA-Z0-9]+\\), line \\([0-9]+\\)[:]"
1 2))))
NAGWare f95 offers several flags that should be used without reservation during development.
IMPLICIT NONE is in effect by default, unless overridden by explicit IMPLICIT statements.
f95 offers run-time checking through the option -C (even when an array's last dimension is declared as * - look at the man page), can trap uninitialised floating point variables with -nan and perform uninitialised variable detection (-C=undefined).
Additionally, depending on the platform, you can:
Type "man DEBUG_group" for more information.
First of all, any program compiled with -trapuv will trap on invalid.
If your program is linked with -lfpe, then this flag forces floating-point errors to trap following the value of TRAP_FPE. Look at "man handle_sigfpes" for more information.
You can use these aliases (for your .cshrc) (adapted from Peter Shenkin):
# floating point trap on SGI. Must be linked with -lfpe.
# c.f. man handle_sigfpes.
alias fpdebug setenv TRAP_FPE \
"UNDERFL=DEFAULT\;OVERFL=TRACE\(1\)\,ABORT\;DIVZERO=TRACE\(1\)\,ABORT\;INVALID=TRACE\(1\)\,ABORT"
alias fpundebug 'unsetenv TRAP_FPE'
# trap by default
fpdebug
If I say "fpdebug" at the shell level, then run a program linked with -lfpe, IEEE floating-point exceptions will be trapped; if I haven't said "fpdebug", or have later said "fpundebug", the program will execute normally.
Some tools can help to detect uninitialized variables, memory corruptions and leaks (with Sun Fortran, think of -stackvar).
Diagnostics capabilities sections.
For debugging, we recommend the following switch settings:
-chk (a,e,s,u,x) -chkglobal -g -pca -stchk -trace -w -info
(Note: -chkglobal or -chk (x) must be specified when compiling every file of the program, or incorrect results may occur.)
The examples below show how to trap the exceptions Invalid, Divide by zero and Overflow on the x87 and SSE floating point units.
Systems | Example for x87 | Example for SSE |
---|---|---|
Windows APIs | Use _controlfp or _control87. | With Visual Studio 2005, _controlfp and _control87 affect the control words for both the x87 and the SSE FPU. |
Linux/glibc 2.2 and later | Use feenableexcept. | Use feenableexcept, which sets both the x87 and the SSE control words from glibc 2.3.3 onwards for x86_32. |
Linux/glibc 2.1 and older | | These APIs do not affect the SSE FPU. |
FreeBSD | FreeBSD post March 2005 implements feenableexcept. For older versions, follow the example below. | Unknown |
Compilers supporting xmmintrin.h | These APIs do not affect the x87 FPU. | |
gcc compatible assembler (e.g. Cygwin) | | |
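For compilers that supply xmmintrin.h, the SSE exception mask can be changed with the macros that header defines. The following is a minimal sketch rather than the page's original example. It assumes a GCC-compatible compiler that defines __SSE__, the function names are hypothetical, and on other targets the functions do nothing. As noted in the table, the x87 unit is not affected.

```c
#include <assert.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Unmask Invalid, Divide by zero and Overflow on the SSE unit, so that
   they raise SIGFPE. Returns the previous mask so it can be restored. */
unsigned int sse_trap_common(void)
{
#if defined(__SSE__)
    unsigned int old_mask = _MM_GET_EXCEPTION_MASK();
    unsigned int trapped = _MM_MASK_INVALID | _MM_MASK_DIV_ZERO
                         | _MM_MASK_OVERFLOW;

    _MM_SET_EXCEPTION_MASK(old_mask & ~trapped);
    return old_mask;
#else
    return 0u;   /* no SSE unit known to the compiler: nothing to do */
#endif
}

/* Restore a mask previously returned by sse_trap_common(). */
void sse_restore_mask(unsigned int old_mask)
{
#if defined(__SSE__)
    /* Only exception-mask bits may be set in old_mask. */
    assert((old_mask & ~(unsigned int)_MM_MASK_MASK) == 0);
    _MM_SET_EXCEPTION_MASK(old_mask);
#else
    (void)old_mask;
#endif
}
```

Calling sse_trap_common() around a suspect section and sse_restore_mask() afterwards limits trapping to that section, in the same spirit as the precision macros further down.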
Under Linux/glibc, the x87 floating point processing unit operates by default under double extended precision. Under Win32 and FreeBSD, the default is set to double precision mode. This can lead to observed differences for some algorithms. Setting the precision mode and restoring it around a specific section is the most reliable way to fix the problem as illustrated below.
References:
#if defined(_WIN32)
# include <float.h>
# ifdef SINGLE
# define _CW_PREC PC_24
# else
# define _CW_PREC PC_53
# endif
# define x86_SetPrecision \
unsigned int _oldcw_pc; \
_oldcw_pc = _control87(0,0) & MCW_PC; \
_control87(_CW_PREC,MCW_PC)
# define x86_RestorePrecision \
_control87(_oldcw_pc,MCW_PC)
#elif defined(i386) && defined(__FreeBSD__)
# include <floatingpoint.h>
# ifdef SINGLE
# define _CW_PREC PC_PS
# else
# define _CW_PREC PC_PD
# endif
# define x86_SetPrecision \
fp_prec_t _oldcw_pc; \
_oldcw_pc = fpgetprec(); \
fpsetprec(_CW_PREC)
# define x86_RestorePrecision \
fpsetprec(_oldcw_pc)
#elif defined(i386) && defined(__GNUC__)
# ifdef SINGLE
# define _CW_PREC _FPU_SINGLE
# else
# define _CW_PREC _FPU_DOUBLE
# endif
# if defined(linux)
# include <fpu_control.h>
# else
# define _FPU_EXTENDED 0x300
# define _FPU_DOUBLE 0x200
# define _FPU_SINGLE 0x0
# define _FPU_GETCW(cw) __asm__ __volatile__("fnstcw %0" : "=m" (*&cw))
# define _FPU_SETCW(cw) __asm__ __volatile__("fldcw %0" : : "m" (*&cw))
# define fpu_control_t unsigned int
# endif
# define x86_SetPrecision \
fpu_control_t _oldcw_pc; \
{ fpu_control_t _cw; \
_FPU_GETCW(_cw); \
_oldcw_pc = _cw & _FPU_EXTENDED; \
_cw = (_cw & ~_FPU_EXTENDED) | _CW_PREC; \
_FPU_SETCW(_cw); \
}
# define x86_RestorePrecision \
{ fpu_control_t _cw; \
_FPU_GETCW(_cw); \
_cw = (_cw & ~_FPU_EXTENDED) | _oldcw_pc; \
_FPU_SETCW(_cw); \
}
#else
# define x86_SetPrecision
# define x86_RestorePrecision
#endif
/* UNTESTED, see "ieee_flags" for an alternative */
#if defined(i386) && defined(__sun) && \
(defined(__SUNPRO_C) || defined(__SUNPRO_CC))
# include <fenv.h>
# ifdef SINGLE
# define _CW_PREC FE_FLTPREC
# else
# define _CW_PREC FE_DBLPREC
# endif
# define x86_SetPrecision \
int _oldcw_pc; \
_oldcw_pc = fegetprec(); \
fesetprec(_CW_PREC)
# define x86_RestorePrecision \
fesetprec(_oldcw_pc)
#endif
Refer to the CNRS/IDRIS documentation (in French).
This page does not deal with optimization. However here are some relevant links:
Malcolm Cohen, Mario Deilmann, James Giles, Herman D. Knoble, Jean-Yves L'Excellent, Arjen Markus, Michel Olagnon, Gareth Shaw
$Id: CompilerTricks.html,v 1.260 2008/01/28 13:51:50 adesitter Exp $ by Arnaud Desitter.