Compiler and tools tricks

Textbooks are full of good advice:

Use other aids as well. Explaining your code to someone else (even a teddy bear) is wonderfully effective. Use a debugger to get a stack trace. Use some of the commercial tools that check for memory leaks, array bounds violations, suspect code and the like. Step through your program when it has become clear that you have the wrong picture of how the code works.
— Brian W. Kernighan, Rob Pike, The practice of programming, 1999 (Chapter 5: Debugging)

Enable every optional warning; view the warnings as a risk-free, high-return investment in your program. Don't ask, "Should I enable this warning?" Instead ask, "Why shouldn't I enable it?" Turn on every warning unless you have an excellent reason not to.
— Steve Maguire, Writing Solid Code, 1993

Sound familiar? But which options should you use? This page tries to answer that kind of question.

Constructive feedback is welcome.



Theoretical background

Floating point arithmetic

IEEE Standards for Floating-Point Arithmetic

Language bindings

Miscellaneous

Aliasing

Bentley's Rules

Jon Bentley, author of Programming Pearls, published Writing Efficient Programs, in which he provides a unified, pragmatic treatment of program efficiency, independent of language and host platform. For ease of presentation, he codified his methods as a set of terse rules in Appendix C of Writing Efficient Programs.

War stories related to development


Numerical topics

Exchange Fortran unformatted data between heterogeneous machines

Approximate "diff"

Increasing the stack size


Code coverage tools


Code profiling


make


Bourne shell


Static analyzers

lint

Additionally, nearly all compilers can be asked to give "more" warnings as detailed within this document. And most of them offer a "verify syntax only" option.
Note that sometimes the flow analysis is performed only when the relevant optimisation flag is set.

Source browser

Source beautifier

Metrics


Debuggers

"Command line" debuggers

Fast, powerful, usable through telnet but raw. Quickest way to get a stack trace ("where" in dbx, "backtrace" or "bt" in gdb).

Graphical debuggers

Each Unix vendor offers its own graphical debugger (workshop on Solaris, cvd on Irix, dde on HP, ladebug -gui (was dxladebug) on Compaq, xldb on AIX, ...).
My preference goes to the Data Display Debugger (ddd), which provides a uniform interface across Unix platforms, encapsulates the system debugger (gdb, dbx, ladebug, ...) and works well (Dr Dobb's article).

Availability and setup of DDD

Debuggers for parallel applications

Trace debugger

Various debugger tricks


Comparison tables

C/C++

Debugging capabilities with some development environments for C and C++
GNU compilers - Linux
(and any supported platform by gcc)
Sun compilers - Sun Solaris HP compilers - HP HP-UX IBM compilers - IBM AIX, Linux, … SGI compilers - SGI IRIX Compaq compilers - Tru64 Unix Microsoft C++ compiler - Windows
version of targeted tools GNU compilers
(gcc, g++ [3.4.x])
Sun compilers
(cc, CC [Workshop 6.0 update 1])
HP compilers
(cc [11.x], aCC [3.x])
IBM compilers
(xlc [8.0])
SGI compilers
(cc, CC)
Compaq compilers
(discontinued: cc [5.3], cxx [6.2])
Microsoft Visual C++ (CL.EXE) [7.0]
verify syntax only -fsyntax-only [cc] -xe N/A -qsyntaxonly [all compilers] -Hf (was -fe) -Hf /Zs
floating point trapping
(note)
principle system dependent API calls
(references)
compiler support: -ftrap=xxx linker support: +FP xxx compiler support: -qflttrap=xxx link with -lfpe + setenv TRAP_FPE ... compiler support
(-fptm<x>)
API calls (_control87) + debugger support
trap DIV, INV, OV API calls (Linux/glibc and x86 examples) -ftrap=common or -fnonstd +FP VZO -qflttrap=inv:ov:zero:en see example default (-fptm n) API calls
integer trapping overflow: -ftrapv
divide by zero: default
overflow: N/A
divide by zero: default
overflow: N/A
divide by zero: default
overflow: N/A
divide by zero:-qcheck=divzero (implied by -qcheck=all)
overflow: -DEBUG:div_check=3 (seems not to work)
divide by zero: default
overflow: N/A
divide by zero: default
debugger support
standard conformance [gcc, g++] -ansi -pedantic [cc] -Xa, -Xc [cc] -Aa, +Mlevel
[aCC]-Aa, +p
-qlanglvl=<xx> [cc, CC] -ansi
[CC] -LANG:std
[cc] -std<n>
[cxx] -std strict_ansi
/Za
run-time detection of uninitialized variable
(note)
[tools] On linux x86, valgrind
[tools] various available N/A [compiler support] for stack storage: -qinitauto=FF
In practice -qinitauto=FF -qflttrap=inv:ov:zero:en -qfloat=nans (??)
for heap storage -qheapdebug (AIX specific)
[compiler support] -DEBUG:trap_uninitialized (was -trapuv)
static memory: -Wl,'-f 0xFFFFFFFF'
[compiler support] -trapuv (static memory: -Wl,'-f 0xFFFFFFFF' doesn't operate as on SGI)
[tools] atom -tool third
[compiler support] /GZ (MS C++ 6.0), /RTC1 (MS C++ 7.0)
[tools] commercial tools
compile time flow analysis
(note)
-Wuninitialized -O [C] lint -Nlevel=n (n>=2) [hp compilers] +Onoinitcheck -qinfo=uni
N/A N/A /Z3, /Z4
put literal strings in read-only memory default [cc] -xstrconst
[CC] -features=conststrings
[cc] +ESlit -qro, -qroconst [all compilers] -use_readonly_const -G0 -rdata_shared [all compilers] -readonly_strings /GF
abort on dereferencing null pointer default default -z -qcheck=nullptr (implied by -qcheck=all) default N/A default
take advantage of aliasing rules
(note)
-fstrict-aliasing (implied by -O2 (gcc>=3.x)) [cc] -xalias_level=std [cc] +Optrs_ansi, +Optrs_strongly_typed, +Otype_safety=ansi -qalias=ansi -O (was -qansialias) -OPT:alias=typed, (seems not to work: -LANG:alias_const) -ansi_alias N/A
check varargs N/A (?) N/A N/A N/A -DEBUG:varargs_interface_check, -DEBUG:varargs_prototypes [cc] -vararg N/A
check calls compile time [gcc] K&R decl.: -Wstrict-prototypes, -Wold-style-definition, missing decl.: -Wmissing-prototypes [cc] K&R decl.: -fd [cc] missing decl.: +w1 decl. consistency: -qinfo=dcl, missing decl.: -qinfo=pro [cc] missing decl.: -fullwarn [cc] missing decl.: -warnprotos ??
link or run time N/A N/A N/A link time:-qextchk (AIX specific) N/A N/A N/A
link or run time memory debugging
(note)
[tools] various (Valgrind, Electric Fence, ...)
[library] glibc
[compilers] stack overflow check: -xcheck=stkovf (>= 7.0)
[tools] various available
[library] man watchmalloc
[tools] gdb (aka wdb) -qheapdebug (AIX specific), -qcheck=bound (implied by -qcheck=all) [all compilers] -DEBUG:subscript_check
[library] man malloc_ss
[cc] -check_bounds
[tools] atom -tool third
build in debug mode, buffer security check: /GS (MS C++ 7.0), /RTC1 (MS C++ 7.0)
[tools] commercial tools, MS pageheap, built-in facilities defined in crtdbg.h
flags for debugger -g, -ggdb, -g3, -ggdb3 -g, -xs, -g0 -g, [aCC] -g0, +objdebug, +d -g, -qfullpath, -qlinedebug -g, -g3, [CC] -gslim -g0, -g1, -g2, -g3, [cxx] -gall build in debug mode
reentrant code (glibc based systems, e.g. Linux) -D_REENTRANT -mt -D_POSIX_C_SOURCE=199506L (c.f. man pthread)
[aCC >=3.30] -mt
use the ..._r commands (xlc_r, ...) -D_POSIX_C_SOURCE=199506L (c.f. man 3 intro) [all compilers] -pthread /MD, /ML, /MT

Fortran

Debugging capabilities with some development environments for Fortran
GNU compiler - Linux
(and any supported platform by gcc)
Intel compiler - Linux Sun compiler - Sun Solaris HP compiler - HP HP-UX IBM compiler - IBM AIX, Linux, … SGI compilers - SGI IRIX Compaq compiler - Tru64 Unix Compaq Fortran compiler - Windows Salford Fortran compilers - Windows NAGWare Fortran compiler
(any supported platform)
Lahey Fortran compiler
version of targeted tools GNU compiler
(gfortran [4.2.x], g77 [discontinued after 3.4.x])
Intel Fortran compiler
(ifort [9.1])
Sun compiler
(f90, f95 [Workshop 7.0])
HP compiler
(f90 [2.4])
IBM compiler
(xlf [10.1])
SGI compilers
(f90 [7.3])
Compaq compiler
(discontinued: f90, f95 [5.5])
Compaq Visual Fortran (discontinued: DF [6.6])
(Windows Fortran compilers)
Salford Fortran compilers
FTN77 [4.0], FTN95 [3.0]
(Windows Fortran compilers)
NAGWare compiler
(f95 [5.0])
Lahey/Fujitsu compiler
lf95 [6.0]
(Windows Fortran compilers)
verify syntax only -fsyntax-only -syntax, -y N/A N/A N/A -Hf (was -fe) -syntax_only /syntax_only N/A -M
-M -nomod (no module files produced)
N/A
floating point trapping
(note)
principle [gfortran] -ffpe-trap=xxx
[g77] system dependent API calls
(references)
compiler support
(-fpe<x>)
compiler support: -ftrap=xxx linker support: +FP xxx compiler support: -qflttrap=xxx link with -lfpe + setenv TRAP_FPE ... compiler support
(-fpe<x>)
/fpe:<level> API calls compiler support: -ieee=xxx --trap <args>
trap DIV, INV, OV [gfortran] -ffpe-trap=invalid,zero,overflow
[g77] API calls (Linux/glibc and x86 examples)
-fpe 0 -ftrap=common or -fnonstd +FP VZO -qflttrap=inv:ov:zero:en see example default (-fpe) /fpe:0 (non default on x86) default default (-ieee=stop) -trap dio
integer trapping overflow: -ftrapv
divide by zero: default
overflow: N/A
divide by zero: default
overflow: N/A
divide by zero: default
overflow: possible with directive
divide by zero: default
overflow: N/A
divide by zero: default
overflow: -DEBUG:div_check=3 (seems not to work)
divide by zero: default
overflow: -check overflow
divide by zero: default
overflow: /check:overflow
divide by zero: default
overflow: N/A
divide by zero: default
overflow: N/A
divide by zero: default
overflow: N/A
divide by zero: default
standard conformance -pedantic -stand -ansi +langlvl=xx -qlanglvl=<xx> -ansi -std<xx> /stand /ANSI, [FTN95] /ISO, /RESTRICT_SYNTAX default --f95
run-time detection of uninitialized variable
(note)
[tools] On linux x86, valgrind
[gfortran >= 4.3] -finit-real=nan, -finit-integer=xxx, -finit-logical=xxx
(f2c has -trapuv since June 2001)
(compile with -auto) [compiler support] -ftrapuv (ifort 9.0 does not use NaN which makes it less useful), -check uninit
[tools] valgrind on x86
[tools] various available (compile with -stackvar) N/A [compiler support] for stack storage: -qinitauto=FFF00000
In practice [xlf] -qnosave -qinitauto=FFF00000 -qflttrap=inv:ov:zero:en
[compiler support] -DEBUG:trap_uninitialized (was -trapuv)
static memory: -Wl,'-f 0xFFFFFFFF'
[compiler support] -trapuv (compile with -automatic)
(static memory: -Wl,'-f 0xFFFFFFFF' doesn't operate as on SGI)
[tools] atom -tool third
[compiler support] N/A (/automatic may help somewhat, see as well 1)
[tools] commercial tools
/UNDEF [compiler support] -nan, -C=undefined
[tools] some may help
--check
compile time flow analysis
(note)
-Wuninitialized -O (>=10.x) -diag-enable sv (disable object file generation) -XlistE +Onoinitcheck N/A ftnlint (limited analysis) -automatic (optimisation must be on)
(-warn uninitialized on by default)
default with /automatic default limited ??
put literal strings in read-only memory default default (/assume:protect_constants) N/A N/A N/A N/A -readonly_strings, -assume protect_constants (default) default (/assume:protect_constants) /CHECK, [FTN95] /FULL_UNDEF default or N/A --npca/--pca
abort on dereferencing null pointer N/A -check pointer ?? ?? ?? ?? ?? ?? /FULL_UNDEF -C=pointer ??
stack oriented/static allocation [gfortran >= 4.3] -frecursive, [g77] default / -fno-automatic -auto / -save (default is -auto_scalar) -stackvar / default default (+nosave) / +save -qnosave / -qsave default / -static -automatic / default (-static) /automatic / default (/static) default / /SAV default / -save default (--nsav) / --sav
disallow implicit declaration [gfortran] -fimplicit-none
[g77] -Wimplicit
-u, -implicitnone -u +implicit_none -u (or -qundef) -u -u (or -warn declarations) /warn:declarations /IMPLICIT_NONE -u --in
check calls compile time [g77] (per file) default (across files) -gen-interfaces and -warn interfaces (per file) default ?? N/A (per file) default (per file) -warn argument_checking (per file) /warn:argument_checking N/A (per file) default (per file) default
link or run time N/A mismatch in number of arguments (Windows only) /iface:cvf N/A N/A link time:-qextchk (AIX specific) N/A N/A mismatch in number of arguments detected due to stdcall convention /CHECK, /FULLCHECK, [FTN95] /FULL_UNDEF run time: -C=calls --check, --checkglobal
link or run time memory debugging
(note 1, note 2)
[gfortran, g77] -fbounds-check
[tools] various (Valgrind, ...)
[library] Linux/glibc
-check bounds -C, -xcheck=stkovf (f95 >= 7.0)
[tools] various available
+check=all
[tools] gdb (aka wdb)
-C -DEBUG:subscript_check (set the environment variable F90_BOUNDS_CHECK_ABORT to "YES") -C
[tools] atom -tool third
[DF] /check:bounds /CHECK, /FULLCHECK, [FTN95] /FULL_UNDEF [f95] -C
memory tracing: -mtrace
[tools] various
--check
stack trace on crash gfortran >= 4.3 -fbacktrace -traceback +fp_exception -qsigtrap=xl__trcedump /traceback -gline default (--trace)
flags for debugger -g, -ggdb, -g3, -ggdb3 -g, -inline_debug_info -g, -xs, -g0 -g -g, -qfullpath -g, -g3 -g0, -g1, -g2, -g3, [f95] -assume gfullpath, [f95] -ladebug build in debug mode /DEBUG, [FTN95] /FULL_DEBUG -g -g
reentrant code ?? -recursive, -threads ?? ?? use the ..._r commands (xlf_r, ...) ?? -reentrancy threaded /recursive, /threads /MULTI_THREADED (FTN95>=3.0) -thread_safe ??

Notes

  1. For details about this table, see the relevant section in this page.
  2. floating point trapping
    • IEEE 754: most people choose to trap on Invalid, Divide by zero and Overflow, and to ignore Underflow and Inexact.
    • For more information, refer to the floating point section.
  3. Run-time detection of uninitialized variables can be done:
    • by filling each variable with some "special" value (depending on the type: SNaN, 0x80 or 0x81 on each byte, an out-of-bound address, ...) that may or may not create a hardware trap upon usage (caveat: assignments may not trigger a trap). Without a trap, the detection is not perfect on its own but the overhead is kept low. Often thread-safe and compatible with full optimisation. This technique was pioneered with WATFOR on IBM 7040 in 1965 and exploited some hardware peculiarity (parity check).
      This technique can also be coupled with some additional run-time checks. See, for instance, the Salford FTN77 User's Guide, chapter 8.
      Example: Salford compilers, SGI compilers, Cray compilers, nag_lvi of the NAGWare f77 tools, ...
    • by using some kind of memory colouring algorithm. The detection is thorough. May or may not be thread-safe depending on the implementation. The overhead can be large.
      Example: Lahey compilers, Rational purify, Julian Seward's valgrind on Linux x86, Sun dbx, atom on Compaq

    This may be implemented:

    1. by a source-to-source preprocessor or the compiler system
      • +: complete semantic visibility
      • -: need recompilation
      • -: can't deal with binary library
    2. by a postprocessor at machine code level
      • +: no recompilation. possible relink.
      • +: can deal with binary library
      • -: semantic information is lost

    Note: Fortran applications benefit from being built with a "no save" flag. This can lead to a stack overflow (see Increasing the stack size). Fortran static data should be initialized. Some linkers offer some detection capability.

  4. compile time flow analysis
    For Fortran 77, ftnchek does some good flow analysis.
  5. memory debugging
    • covers different topics: out-of-bound access (on the stack, on the heap, in static memory), memory leaks, ...
    • in practice, purify and valgrind do the most thorough job.
    • Various free tools can be very valuable on many platforms.
    • For completeness, some hardware supports debugging, for instance the Unisys A-Series (see also Risks Digest 23.24). See also John R. Mashey's point of view.
  6. Aliasing: refer to the aliasing section.
  7. Selective Fortran bound checking
    Sometimes this kind of instrumentation cannot be applied everywhere because some routines rely on dirty tricks. One workaround is to disable the feature locally by using a directive, if the compiler offers one. It's possible at least with:

    • SGI and Cray f90 (portable comment "!DIR$ [NO]BOUNDS")
    • SGI f77, Sun f77 and Compaq compilers (non portable statement "OPTIONS /CHECK=NOBOUNDS" before the program unit)
    • xlf on AIX (non portable directive "@PROCESS NOCHECK")

    Any resulting portability problem can be solved either by using INCLUDE (or #include) with a small file containing the relevant code or by tagging it with some #ifdef.
    Of course, an alternative is to apply the compile-line option selectively in the build procedure.
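
Returning to note 3, the "special value" technique can be illustrated with a rough sketch. It fills a block of storage with 0xFF bytes, the pattern listed in the tables for xlc (-qinitauto=FF), and checks how the result reads back as a double. The helper names (poison_fill, is_nan, poisoned_double_is_nan) are invented for this example, and IEEE 754 doubles are assumed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite n bytes at p with the 0xFF pattern,
   mimicking what a "fill uninitialized storage" compiler option does. */
void poison_fill(void *p, size_t n)
{
    memset(p, 0xFF, n);
}

/* A NaN is the only floating-point value that compares unequal to itself. */
int is_nan(double x)
{
    volatile double v = x;  /* keep the comparison from being folded away */
    return v != v;
}

/* Returns 1 if a poisoned double reads back as NaN
   (expected on IEEE 754 targets, where all-ones is a NaN encoding). */
int poisoned_double_is_nan(void)
{
    double d;
    poison_fill(&d, sizeof d);
    return is_nan(d);
}
```

On most current hardware an all-ones double is a quiet NaN, so it propagates through arithmetic rather than trapping. This matches the caveat in note 3: without a trap, the fill value only makes the bug visible in the results.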


Compaq (Digital)

Debug flags

[all compilers] -trapuv
Forces all uninitialized stack variables to be initialized with 0xfff58005fff58005. When this value is used as a floating-point variable, it is treated as a floating-point NaN and causes a floating-point trap. When it is used as a pointer, an address or segmentation violation usually occurs.
With Fortran, think of using -automatic.
[Fortran] -automatic
Places local variables on the run-time stack.
[Fortran] -C
check for subscripts out of range at runtime
[Fortran] -check <xxx>
Add some checks. Check (!) it out in the manual pages (man f77, man f90).
advice: -check format -check output_conversion -check overflow [-check underflow]
[Fortran] -syntax_only
Specifies that the source file will be checked only for correct syntax. No code is generated, no object file is produced, and some error checking done by the optimizer is bypassed (for example, checking for uninitialized variables). This option lets you do a quick syntax check of your source file. The default is -nosyntax_only.
[Fortran] -u or -warn declarations
Makes the default type of a variable undefined (IMPLICIT NONE), which causes the compiler to issue a warning for any undeclared symbols.
[Fortran] -warn <xxx>
Enable some warnings. Check it out in the manual pages. (man f77, man f90).
advice: -warn argument_checking
[cc] -w
Activate warnings.
[cc] -warnprotos
Causes the compiler to produce warning messages when a function is called that is not declared with a full prototype. This checking is more strict than required by ANSI C.
[cc] -check
Performs compile-time code checking. With this flag, the compiler checks for code that exhibits non portable behavior, represents a possible unintended code sequence, or possibly affects operation of the program because of a quiet change in the ANSI C Standard. Some of these checks have traditionally been associated with the lint utility. This flag is available for -newc and -migrate only.
[cc] -portable
Enables the issuance of diagnostics that warn about any non portable usages encountered. This flag is not available when you use the -oldc flag.
[cc] -std
Enforces the ANSI C standard, but allows some common programming practices disallowed by the standard.
[cc] -varargs
Prints warnings for all lines that may require the varargs.h macros.
[cc, cxx] -readonly_strings
Causes all string literals to be read-only.
[cc, cxx] -Hf
Halt processing after compiling and template instantiation (if applicable)

Tracing flags

-Wl,'-ySYMBOL'
Print the name of each linked file in which SYMBOL appears.

Runtime checking


Linux/glibc

Documentation

Runtime checking

Floating point trapping

Build with gcc -c trapfpe.c and link with trapfpe.o or build with gcc -shared -o trapfpe.so trapfpe.c and set LD_PRELOAD to this library. Adapted from: info g77 'Trouble' 'Missing Features' 'Floating-point Exception Handling'.

Starting with glibc 2.2, the following C99-style (but glibc specific) code is preferred.

#define _GNU_SOURCE 1
#include <fenv.h>
static void __attribute__ ((constructor)) trapfpe(void)
{
  /* Enable some exceptions. At startup all exceptions are masked. */
  feenableexcept(FE_INVALID|FE_DIVBYZERO|FE_OVERFLOW);
}

Previous versions of glibc require some platform dependent code (x86 specific).


GNU compilers

Documentation

Look for the "info" pages. Under emacs on a Linux box: "C-h i".

Interesting "debug" nodes:

Debug flags

-fsyntax-only
Check the code for syntax errors, but do not do anything beyond that.
-ggdb
Produce debugging information for use by GDB. This means to use the most expressive format available, including GDB extensions if at all possible.
-g3, -ggdb3
Level 3 includes extra information, such as all the macro definitions present in the program. Some debuggers support macro expansion when you use -g3.
-fmessage-length=0
Messages on one line, i.e. not split. Useful with some editors' "jump on error" mode.
[gcc] -Wall -W -Wstrict-prototypes -Wwrite-strings -pedantic -O (add -Wold-style-definition with gcc 3.4 and later)
[g++] -Wall -W -Wwrite-strings -pedantic -O
activate all the standard warnings.
[gcc, g++] -Wunused-macros
warn about macros defined in the main file that are unused.
[g++] -Weffc++
warn about violations of the guidelines from Scott Meyers' "Effective C++" books. Noisy but useful. Use with -fmessage-length=0 and grep -v as a filter.
[g++] -Wold-style-cast
warn about C-style casts.
[g77] -Wall -W -Wsurprising -pedantic -O
activate all the standard warnings.
[g77] -Wimplicit
Warn whenever a variable, array, or function is implicitly declared.
[g77 (>= 2.95)] -fbounds-check
run-time checks for array subscripts and substrings.
[all compilers (>= 2.95)] -fstrict-aliasing
Allows the compiler to assume the strictest aliasing rules applicable to the language being compiled.
It may help to find buggy code. In principle, code produced by f2c should be safe for such optimization. Implied by -O2 since gcc 3.x.
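
As an illustration of what the aliasing rules permit, here is a minimal sketch of reading the bit pattern of a float in a way that stays valid under -fstrict-aliasing. The function name float_bits is made up for this example, and a 32-bit IEEE 754 float is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Reads the bit pattern of a float without breaking the aliasing rules.
   memcpy copies the object representation, so the float is never accessed
   through an lvalue of an incompatible type. The shortcut
       *(uint32_t *)&f
   is the kind of code that strict aliasing allows the compiler to
   miscompile. Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers generally turn such a fixed-size memcpy into a plain register move, so the safe form usually costs nothing.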

"lint" with gcc (by James Hu) - somewhat out-of-date:

glint() {
gcc -ansi -pedantic -pedantic-errors -O \
-Wall -W -Wtraditional -Wpointer-arith -Wbad-function-cast \
-Wcast-qual -Wcast-align -Wwrite-strings -Wconversion \
-Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations \
-S -o - "$@" > /dev/null; }

Joseph Myers' favourite warnings as of gcc 3.4.

Tracing flags

Some useful flags to find out what's going on. Many others are available.

-v
Print (on standard error output) the commands executed to run the stages of compilation.
-H
Print the name of each header file used, in addition to other normal activities.
[gcc] -E -dM -xc /dev/null
shows which symbols are defined. Use -xc++ for C++.
[GNU ld] -Wl,-M
Print a link map to the standard output.
[GNU ld] -Wl,--cref
Output a cross reference table.
[GNU ld] -Wl,'-y SYMBOL'
Print the name of each linked file in which SYMBOL appears.
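
The macros listed by gcc -E -dM can also be tested in source code, for example to record which compiler built a binary. A small sketch; the function name gnuc_version is made up here, and the non-zero branch assumes a GCC-compatible compiler (gcc 3.0 or later for __GNUC_PATCHLEVEL__).

```c
/* Returns the GCC-compatible version as major*10000 + minor*100 + patch,
   or 0 when the compiler does not define __GNUC__. */
long gnuc_version(void)
{
#if defined(__GNUC__)
    return __GNUC__ * 10000L + __GNUC_MINOR__ * 100L + __GNUC_PATCHLEVEL__;
#else
    return 0L;
#endif
}
```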

libstdc++ debug mode

According to the documentation, to use the libstdc++ debug mode, compile your application with the compiler flag -D_GLIBCXX_DEBUG.

Floating point implementation

GCC (at least up to the 3.4 release) does not offer "referential transparency" on Intel x86, which uses extended-precision floating point registers. This can lead to puzzling results, as explained by Brad Lucier, who also showed how the compiler could be fixed. The x87 architecture, which has only eight 80-bit FPU registers, is particularly sensitive to uncontrollable register spills to 64 bits in memory, which cause double rounding. Furthermore, variables allocated on the stack may be optimized into registers, changing the precision and producing different results depending on the optimization level of the program. Some possible actions are detailed below. Note that the x86-64 architecture in 64-bit mode uses the SSE floating point unit rather than the x87 stack and is therefore immune to this problem.

Note that even if gcc produced code that spilled double-extended registers to the stack without rounding, the results would still occasionally differ from those on an architecture without extended arithmetic, because of double rounding.
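
One way to see whether a given build is exposed to this is the C99 macro FLT_EVAL_METHOD from <float.h>, which reports the precision in which intermediate results are evaluated. A minimal sketch follows; the function names describe_eval_method and current_eval_method are invented for the example.

```c
#include <float.h>

/* Maps a C99 FLT_EVAL_METHOD value to a short description. */
const char *describe_eval_method(int method)
{
    switch (method) {
    case 0:  return "each operation is evaluated in the precision of its type";
    case 1:  return "float and double are evaluated as double";
    case 2:  return "all operations are evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
}

/* Description for the translation unit being compiled. */
const char *current_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return describe_eval_method(FLT_EVAL_METHOD);
#else
    return "FLT_EVAL_METHOD is not provided (pre-C99 <float.h>)";
#endif
}
```

With gcc, a 32-bit x86 build using the x87 unit typically reports 2, while a build using SSE arithmetic (as on x86-64) typically reports 0. Treat these values as typical rather than guaranteed.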


HP

Debug flags

[all compilers] +FP flags
Specify how the runtime environment for trapping floating-point operations should be initialized at program startup. The default is that all traps are disabled. See ld(1) for specific values for flags.
You can try: +FP VZOuiD
[f90] +fp_exceptions
[f77] +T
Enable floating-point exceptions and cause the running program to issue a procedure traceback for runtime errors. Can be mixed with +FP
[f90] +check=all
Enable compile-time range checking of array subscripts.
[f90] +langlvl={90|default}
Issue warnings for all extensions to the Fortran 90 standard (+langlvl=90). The default, +langlvl=default, allows extensions.
[f90] +implicit_none
Cause the types of identifiers to be implicitly undefined.
[cc] -A<mode>
Specify the compilation standard to be used by the compiler.
[cc] +w<n>
Specify the level of the warning messages.
[aCC] -Aa
ISO C++ conformance
[aCC] +w
Warn about all questionable constructs.
[aCC] +p
Disallows all anachronistic constructs.
[all compilers] +Onoinitcheck
The optimizer does not initialize uninitialized variables, but issues warning messages when it discovers them.
[cc] +ESlit
Place string literals and const-qualified data into read-only memory.
[cc, aCC] -z
Do not bind anything to address zero. This option allows run-time detection of null pointers.

Tracing flags

-Wc,-list,progress
get the complete list of include files.
-Wl,-y SYMBOL
Print the name of each linked file in which SYMBOL appears.

Runtime checking

wdb 2.0 and later (HP-supported version of gdb) offers some memory debugging capability.


IBM

xlc and xlf debug flags

-qflttrap
Generate instructions to trap floating-point exceptions
(try: -qflttrap=inv:ov:zero:en)
-qfloat=nans
Detects (at run time) operations that involve signaling NaN values (SNaN)
-qflag=<sev1>:<sev2>
Specifies severity level of diagnostics to be reported in listing, <sev1>, and on screen, <sev2>. Severity levels include:
I
Informational messages.
L
Language-level messages.
W
Warning messages.
E
Error messages.
S
Severe error messages.
U
Unrecoverable error messages.
-qinitauto=<hh>
Initialize automatic storage to <hh>. <hh> is a hexadecimal value. This generates extra code and should only be used for error determination.
Try: [xlc] -qinitauto=FF, [xlf] -qinitauto=FFF00000
-qlanglvl=<langlvl>
Specify language level to be used during compilation. ([xlc] <langlvl> can be ansi, saal2, saa, extended, or classic)

xlc debug flags

-qextchk (AIX specific)
Perform external name type-checking and function call checking.
-qheapdebug (AIX specific)
Enables debug versions of memory management functions
-qlinedebug
Generates abbreviated line number and source file name information for the debugger.
-qro
Put string literals in read only area.
-qroconst
Put constant values in read only area.
-qsyntaxonly
Causes the compiler to perform syntax checking without generating an object file.
-qcheck=<option>
Generate code that performs run-time checks.
nullptr
Runtime checking of addresses contained in pointer variables used to reference storage.
bounds
Runtime checking of addresses when subscripting within an object of known size.
divzero
Runtime checking of integer division. A trap will occur if an attempt is made to divide by zero.
all
Switches on all the above suboptions.
-qinfo=all
Produce additional lint-like messages. Turns on all diagnostic messages for all groups.

xlf debug flags

Note: xxlf allows one to put together the options through a GUI.

-C or -qcheck
Performs run-time checking of array bounds and character substring expressions.
-qextchk (AIX specific)
Performs procedure interface checking as well as detection of mismatched common blocks.
Done at link time.
-qnosave
Sets the storage class of local variables to AUTOMATIC.
-u or -qundef
Specifies undefined (no) implicit data typing.


NAGWare

Availability and documentation

Debug flags

"Mandatory" options

NAGWare f95 offers several flags that should be used without reservation during development.

-C (and -C=all)
Compile code with most runtime checks enabled (including check of array bounds and check of procedure references). Version 5.0 of the compiler introduces "-C=undefined" that detects uninitialised variables. See documentation for limitation.
Do not think twice: hard code -C in your Makefile.
-nan
Initialise floating point variables to IEEE Signalling NaN. This includes local variables and INTENT(OUT) dummy arguments. This may be useful for finding uninitialised floating point variables while keeping the overhead low. Works only when -ieee=<mode> is set to "stop" (the default).

Other useful flags

-ieee=<mode>
Set the mode of IEEE arithmetic operation according to mode, which must be one of full, nonstd or stop.
The default mode, -ieee=stop, is recommended for most situations.
-info
Request output of information messages.
-M
Produce module information files (.mod files) only.
In effect, it can be used to check syntax with minimal compilation.
-M -nomod allows a syntax check without any file produced.
-u
Specify that IMPLICIT NONE is in effect by default, unless overridden by explicit IMPLICIT statements.

Runtime checking

f95 offers run-time checking through the option -C (even when an array's last dimension is declared as * - look at the man page), can trap uninitialised floating point variables with -nan and perform uninitialised variables detection (-C=undefined).
Additionally, depending on the platform, you can:


Portland Group compilers

Availability and documentation

Debug flags

-g
Includes debugging information in the object module.
-Ktrap=option[,option]...
Controls the behavior of the IA-32 processor when IA-32 floating-point exceptions occur. -Ktrap=fp is equivalent to -Ktrap=inv,divz,ovf.
-Mpgflag
Selects options for code generation
[no]bounds
specifies whether array bounds checking is enabled or disabled.
chkfpstk
check for internal consistency of the x86 FP stack in the prologue of a function and after returning from a function or subroutine call.
chkptr
check for NULL pointers (pgf90 and pghpf only).
chkstk
check the stack for available space upon entry to and before the start of a parallel region. Useful when many private variables are declared.
[no]dclchk
determines whether all program variables must be declared (pgf77, pgf90, and pghpf only).
[no]depchk
checks for potential data dependences.
info
print informational messages regarding optimization and code generation to standard output as compilation proceeds.
[no]iomutex
determines whether critical sections are generated around Fortran I/O calls (pgf77, pgf90, and pghpf only).
[no]save
determines whether the compiler assumes that all local variables are subject to the SAVE statement (pgf77, pgf90, and pghpf only).
standard
causes the compiler to flag source code that does not conform to the ANSI standard (pgf77, pgf90, and pghpf only).
[no]unixlogical
determines whether logical .TRUE. and .FALSE. are determined by non-zero (TRUE) and zero (FALSE) values for unixlogical. With nounixlogical, the default, -1 values are TRUE and 0 values are FALSE (pgf77, pgf90, and pghpf only).

SGI

Debug flags

[all compilers] -DEBUG:subscript_check (or -C, Fortran only)
check for subscripts out of range at runtime
[all compilers] -DEBUG:trap_uninitialized (or -trapuv)
Force all uninitialized stack, automatic and dynamically allocated variables to be initialized with 0xFFFA5A5A. When this value is used as a floating point variable, it is treated as a floating point NaN and causes a floating point trap. When it is used as a pointer, an address or segmentation violation will most likely occur.
Note: be aware that the program will trap on the IEEE invalid exception. This exception can be produced by something other than an uninitialized variable, for instance sqrt(-1.).
[all compilers] -DEBUG:div_check=3
check all integer divides for zero divisors or overflow at run time.
[all compilers except f90] -use_readonly_const -G0 -rdata_shared
Puts string literals and file-level (static, common, or external) const qualified initialized variables into a .rodata section to separate these objects from data likely to be modified. This is the default. However, if you want constants to not be writable, then in addition to specifying -use_readonly_const, you must also specify -G0 -rdata_shared, because by default, the linker makes .rodata and gp-relative sections writable.
[all compilers] -Wl,'-f <fill>'
Sets the fill pattern for holes between sections within an output segment. The argument fill must be a four-byte hexadecimal constant.
In practice 0xFFFFFFFF will allow the detection of uninitialized static variables, essentially useful with Fortran.
[all compilers] -ansi
strict ansi
[all compilers] -fullwarn
full warnings
[all compilers] -Hf (was -fe)
check syntax only (stop after "front end")

Type "man DEBUG_group" for more information.
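To see why the 0xFFFA5A5A fill pattern used by -DEBUG:trap_uninitialized shows up as a NaN, the bit pattern can be decoded by hand. The sketch below is only an illustration in portable C, assuming IEEE 754 single and double formats; the helper names are hypothetical and nothing in it is SGI-specific.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as an IEEE 754 float
   and report whether it is a NaN (a NaN is the only value with x != x). */
int pattern32_is_nan(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);   /* well-defined way to reinterpret bits */
    return f != f;
}

/* Same check for a double whose two 32-bit halves both hold the pattern,
   as happens when memory is filled word by word. */
int pattern64_is_nan(uint32_t bits)
{
    uint64_t wide = ((uint64_t)bits << 32) | bits;
    double d;
    memcpy(&d, &wide, sizeof d);
    return d != d;
}
```

Both pattern32_is_nan(0xFFFA5A5A) and pattern64_is_nan(0xFFFA5A5A) return 1, whereas the pattern of 1.0f, 0x3F800000, gives 0 from pattern32_is_nan. The sign bit is set and the exponent field is all ones in both widths, with a non-zero mantissa, which is the definition of a NaN.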

Floating point trapping

First of all, any program compiled with -trapuv will trap on invalid.

If your program is linked with -lfpe, floating-point errors are trapped according to the value of the TRAP_FPE environment variable. See "man handle_sigfpes" for more information.

You can use these aliases (for your .cshrc) (adapted from Peter Shenkin):

# floating point trap on SGI. Must be linked with -lfpe.
# c.f. man handle_sigfpes.
alias fpdebug setenv TRAP_FPE \
"UNDERFL=DEFAULT\;OVERFL=TRACE\(1\)\,ABORT\;DIVZERO=TRACE\(1\)\,ABORT\;INVALID=TRACE\(1\)\,ABORT"
alias fpundebug 'unsetenv TRAP_FPE'
# trap by default
fpdebug

If I say "fpdebug" at the shell level, then run a program linked with -lfpe, IEEE floating-point exceptions will be trapped; if I haven't said "fpdebug", or have later said "fpundebug", the program will execute normally.

Tracing flags

-Wl,'-ySYMBOL'
Print the name of each linked file in which SYMBOL appears.

[Contents]


Sun

Debug flags

[all compilers] -xs
Allow debugging by dbx without .o files.
[all compilers] -ftrap=common
Trap on invalid, division by zero, and overflow.
To be effective this option must be used when compiling the main program.
[all compilers] -fnonstd
-ftrap=common + disable gradual underflow.
[all compilers] -xcheck=stkovf (>= 7.0)
Checks for stack overflow at the start of a function.
[Fortran] -stackvar
Allocate local variables on the stack. Useful to help catch uninitialized variables.
[Fortran] -ansi
Identify many non-ANSI extensions.
[Fortran] -C
Check array references for out of range subscripts.
[Fortran] -XlistE
Do global program checking.
Try: -XlistE -Xlisto /dev/tty
[Fortran] -fpover=yes
Detect floating-point overflow in formatted input.
[C] -fd
Reports K&R function declarations and definitions.
[C] -X[a|c|s|t]
Specifies the degree of conformance to the ANSI/ISO C standard.
[C] -xe
Performs only syntax and semantic checking on the source file, but does not produce any object or executable file.
[C] -xstrconst
Inserts string literals into the read-only data section of the text segment instead of the default data segment.
[C++] +w
Prints extra warnings where necessary.
[C++] +w2
Prints even more warnings.
[C++] -features=conststrings
Inserts string literals into the read-only memory.
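Flags such as -xstrconst and -features=conststrings turn a write to a string literal into a run-time fault. The C sketch below, with made-up function names, shows one common way to stay on the safe side: keep pointers to literals const so the compiler rejects accidental writes, and copy the text into a caller-owned buffer when it has to be modified.

```c
#include <string.h>

/* A pointer to a literal, declared const so that a write does not compile.
   Without const, "p[0] = 'H'" would compile but is undefined behaviour,
   and it faults once literals live in read-only memory. */
const char *greeting(void)
{
    return "hello";
}

/* Copy the literal into writable storage owned by the caller before
   modifying it. The result is always NUL-terminated. */
void capitalised_greeting(char *out, size_t size)
{
    if (size == 0)
        return;
    strncpy(out, greeting(), size - 1);
    out[size - 1] = '\0';
    if (out[0] >= 'a' && out[0] <= 'z')
        out[0] = (char)(out[0] - 'a' + 'A');
}
```

With a 16-byte buffer, capitalised_greeting fills it with "Hello" while the literal returned by greeting() is left untouched.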

Tracing flags

-H
Print the name of each header file used.

Runtime checking

Some tools can help to detect uninitialized variables, memory corruptions and leaks (with Sun Fortran, think of -stackvar).

[Contents]


Windows and Linux Fortran compilers

Debug flags

[Contents]


x86 specifics

Floating point trapping

The examples below show how to trap the exceptions Invalid, Divide by zero and Overflow on the x87 and SSE floating point units.

Floating point trapping on x87 and SSE FPU

Windows APIs
Example for x87: use _controlfp or _control87.
#include <float.h>
unsigned int cw;
/* could use _controlfp */
cw = _control87(0,0) & MCW_EM;
cw &= ~(_EM_INVALID|_EM_ZERODIVIDE|_EM_OVERFLOW);
_control87(cw,MCW_EM);
Example for SSE: with Visual Studio 2005, _controlfp and _control87 affect the control words for both the x87 and the SSE FPU.

Linux/glibc 2.2 and later
Example for x87: use feenableexcept.
#define _GNU_SOURCE 1
#include <fenv.h>
feenableexcept(FE_INVALID|FE_DIVBYZERO|FE_OVERFLOW);
Example for SSE: use feenableexcept, which sets both the x87 and the SSE control words from glibc 2.3.3 onwards for x86_32.

Linux/glibc 2.1 and older
Example for x87:
#include <fpu_control.h>
fpu_control_t cw;
_FPU_GETCW(cw);
cw &= ~(_FPU_MASK_IM | _FPU_MASK_ZM | _FPU_MASK_OM);
_FPU_SETCW(cw);
Example for SSE: these APIs do not affect the SSE FPU.

FreeBSD
Example for x87: FreeBSD post March 2005 implements feenableexcept. For older versions, follow the example below.
#include <floatingpoint.h>
fp_except_t cw;
cw = fpgetmask();
fpsetmask(cw & ~(FP_X_INV | FP_X_DZ | FP_X_OFL));
Example for SSE: unknown.

Compilers supporting xmmintrin.h
Example for x87: these APIs do not affect the x87 FPU.
Example for SSE:
#include <xmmintrin.h>
_MM_SET_EXCEPTION_MASK(_MM_GET_EXCEPTION_MASK() &
                       ~(_MM_MASK_INVALID|
                         _MM_MASK_DIV_ZERO|
                         _MM_MASK_OVERFLOW)
                       );

gcc compatible assembler (e.g. Cygwin)
Example for x87:
unsigned int cw;
__asm__ __volatile__ ("fnstcw %0" : "=m" (cw));
cw &= ~(0x01 | 0x04 | 0x08);
__asm__ __volatile__ ("fldcw %0" : : "m" (cw));
Example for SSE:
unsigned int cw;
__asm__  __volatile__ ("stmxcsr %0" : "=m" (cw));
cw &= ~((0x01|0x04|0x08) << 7);
__asm__  __volatile__ ("ldmxcsr %0" : : "m" (cw));
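The hexadecimal constants in the assembler examples are the exception mask bits of the two control registers. As a rough illustration, the sketch below applies the same bit arithmetic to plain integers, so it runs anywhere without touching the FPU. The macro and function names are made up for this example; the bit positions and the power-on values 0x037F (x87 control word) and 0x1F80 (MXCSR) are the ones documented for x86 processors.

```c
/* Exception mask bits in the x87 control word (bits 0 to 5).
   A set bit masks the exception; a cleared bit lets it trap. */
#define X87_MASK_INVALID   0x01u
#define X87_MASK_DIVZERO   0x04u
#define X87_MASK_OVERFLOW  0x08u

/* In MXCSR the same masks sit 7 bits higher (bits 7 to 12). */
#define MXCSR_MASK_SHIFT   7

/* Clear the mask bits so that invalid, divide by zero and overflow trap. */
unsigned int x87_enable_traps(unsigned int cw)
{
    return cw & ~(X87_MASK_INVALID | X87_MASK_DIVZERO | X87_MASK_OVERFLOW);
}

/* Same operation on an MXCSR value. */
unsigned int mxcsr_enable_traps(unsigned int csr)
{
    unsigned int masks = X87_MASK_INVALID | X87_MASK_DIVZERO | X87_MASK_OVERFLOW;
    return csr & ~(masks << MXCSR_MASK_SHIFT);
}
```

Starting from the power-on defaults, x87_enable_traps(0x037F) gives 0x0372 and mxcsr_enable_traps(0x1F80) gives 0x1900. These are the values that fldcw and ldmxcsr would then load.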

Floating point precision mode

Under Linux/glibc, the x87 floating point processing unit operates by default under double extended precision. Under Win32 and FreeBSD, the default is set to double precision mode. This can lead to observed differences for some algorithms. Setting the precision mode and restoring it around a specific section is the most reliable way to fix the problem as illustrated below.

References:

#if defined(_WIN32)
# include <float.h>
# ifdef SINGLE
#  define _CW_PREC PC_24
# else
#  define _CW_PREC PC_53
# endif
# define x86_SetPrecision \
  unsigned int _oldcw_pc; \
  _oldcw_pc = _control87(0,0) & MCW_PC; \
  _control87(_CW_PREC,MCW_PC)
# define x86_RestorePrecision \
  _control87(_oldcw_pc,MCW_PC)

#elif defined(i386) && defined(__FreeBSD__)
# include <floatingpoint.h>
# ifdef SINGLE
#  define _CW_PREC FP_PS  /* fp_prec_t value for 24-bit precision */
# else
#  define _CW_PREC FP_PD  /* fp_prec_t value for 53-bit precision */
# endif
# define x86_SetPrecision \
  fp_prec_t _oldcw_pc; \
  _oldcw_pc = fpgetprec(); \
  fpsetprec(_CW_PREC)
# define x86_RestorePrecision \
  fpsetprec(_oldcw_pc)

#elif defined(i386) && defined(__GNUC__)
# ifdef SINGLE
#  define _CW_PREC _FPU_SINGLE
# else
#  define _CW_PREC _FPU_DOUBLE
# endif
# if defined(linux)
#  include <fpu_control.h>
# else
#  define _FPU_EXTENDED 0x300
#  define _FPU_DOUBLE   0x200
#  define _FPU_SINGLE   0x0
#  define _FPU_GETCW(cw) __asm__ __volatile__("fnstcw %0" : "=m" (*&cw))
#  define _FPU_SETCW(cw) __asm__ __volatile__("fldcw %0" : : "m" (*&cw))
#  define fpu_control_t unsigned int
# endif
# define x86_SetPrecision \
  fpu_control_t _oldcw_pc; \
  { fpu_control_t _cw; \
    _FPU_GETCW(_cw); \
    _oldcw_pc = _cw & _FPU_EXTENDED; \
    _cw = (_cw & ~_FPU_EXTENDED) | _CW_PREC; \
    _FPU_SETCW(_cw); \
  }
# define x86_RestorePrecision \
  { fpu_control_t _cw; \
    _FPU_GETCW(_cw); \
    _cw = (_cw & ~_FPU_EXTENDED) | _oldcw_pc; \
    _FPU_SETCW(_cw); \
  }

#else
# define x86_SetPrecision
# define x86_RestorePrecision
#endif
/* UNTESTED, see "ieee_flags" for an alternative */
#if defined(i386) && defined(__sun) && \
    (defined(__SUNPRO_C) || defined(__SUNPRO_CC)) 
# include <fenv.h>
# ifdef SINGLE
#  define _CW_PREC FE_FLTPREC
# else
#  define _CW_PREC FE_DBLPREC
# endif
# define x86_SetPrecision \
  int _oldcw_pc; \
  _oldcw_pc = fegetprec(); \
  fesetprec(_CW_PREC)
# define x86_RestorePrecision \
  fesetprec(_oldcw_pc)
#endif
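A possible way to use these macros is sketched below. Because x86_SetPrecision declares a variable, it is placed last among the declarations of a block and paired with x86_RestorePrecision before every exit. The function name and the fallback definitions are assumptions made so that the fragment compiles on its own; in real code the macros would come from a header holding the definitions above.

```c
/* Fallback so the sketch compiles alone: no-op versions of the macros.
   In practice, include the header that defines them instead. */
#ifndef x86_SetPrecision
# define x86_SetPrecision
# define x86_RestorePrecision
#endif

/* Hypothetical example: a dot product whose result should not depend on
   the x87 precision mode inherited from the caller. */
double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;
    x86_SetPrecision;          /* save the old mode, select the new one */

    for (i = 0; i < n; i++)
        sum += x[i] * y[i];

    x86_RestorePrecision;      /* put the caller's precision mode back */
    return sum;
}
```

For the vectors (1, 2, 3) and (4, 5, 6) the function returns 32. A function with several return statements needs x86_RestorePrecision before each of them, otherwise the caller keeps running in the modified mode.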

Compilers on super-computers

NEC SX

Refer to the CNRS/IDRIS documentation (in French).

Cray debug flags

[f90] -e ni
i
Generates a run-time error when an uninitialized local real or integer variable is used in a floating-point operation or array subscript
Also see the -f option on segldr(1) [f90 -Wl"-f indef"] and the -D preset= option on cld(1).
n
Generates messages to note all nonstandard Fortran usage, based on the Fortran 95 standard
[f90] -Wl"-f indef"
set undefined value in static variables (i.e. "COMMON")
[f90] -m 0
Message types enabled: Error, warning, caution, note, and comment
[f90] -R runchk
run-time checks. Look at the documentation.
Ex: -Rab: arguments (type and number) and bound checking
[f90] -t num (UNICOS Systems Only)
The -t num option specifies the number of bits to be truncated on floating-point operations. For num, enter an integer in the range 0 ≤ num ≤ 47. The default is 0.
Useful for stability tests.
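The idea behind -t num can be imitated on other machines by zeroing the low-order mantissa bits of intermediate results and checking how much the final answer moves. The helper below is only an analogy for IEEE 754 doubles (52 stored mantissa bits), not a description of what the Cray compiler does; its name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Zero the 'nbits' least significant mantissa bits of an IEEE 754 double.
   nbits is clamped to the range 0..52. Intended for finite values only:
   clearing all the mantissa bits of a NaN would turn it into an infinity. */
double truncate_mantissa(double x, int nbits)
{
    uint64_t bits;

    if (nbits <= 0)
        return x;
    if (nbits > 52)
        nbits = 52;

    memcpy(&bits, &x, sizeof bits);          /* read the raw representation */
    bits &= ~((UINT64_C(1) << nbits) - 1);   /* clear the low nbits bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For instance truncate_mantissa(1.5, 52) returns 1.0, while truncate_mantissa(1.5, 51) leaves 1.5 unchanged. A computation is suspect when applying such a truncation to its intermediate values changes the leading digits of the result.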

[Contents]


Optimization

This page does not deal with optimization. However here are some relevant links:

[Contents]


Acknowledgments

Malcolm Cohen, Mario Deilmann, James Giles, Herman D. Knoble, Jean-Yves L'Excellent, Arjen Markus, Michel Olagnon, Gareth Shaw

[Contents]


$Id: CompilerTricks.html,v 1.260 2008/01/28 13:51:50 adesitter Exp $

by Arnaud Desitter.
© Arnaud Desitter, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
This page can be redistributed as long as the copyright mention is preserved.