[Home]
[Search]
[Contents]
Optimizing Code
Optimizers modify compiled code to make it more efficient.
This chapter describes how the Digital Mars global optimization
feature works and how to use it to produce faster and smaller programs.
What's in This Chapter:
- What global optimization is, and how it differs from local
optimization.
- When and how to use the global optimization feature.
- Global optimizer options in alphabetical order.
- Hints and tips for maximizing the efficiency of
optimization.
About Global Optimization
A global optimizer (as opposed to a local optimizer) analyzes and
optimizes all the code for a function as a whole. Local optimizers
look at only one statement at a time or at short sequences of
statements. A global optimizer uses a powerful technique called
"data flow analysis" to gather information about each function. The
more an optimizer knows about a function, the more optimizations
that are possible. This technique is why global optimizers improve
code more than local optimizers do.
Advantages of using the global optimization feature
You can tune most C and C++ programs to a particular machine, and
the global optimization feature does few transformations that you
cannot duplicate at the source level. However, this kind of fine-tuning
is usually not appropriate because:
- Tuning code takes time. Compilers are meant to allow
programmers to be more productive, and global
optimization largely eliminates the need to tune programs
by hand.
- Cryptically tuned code is harder to maintain.
- Code that is tuned for one machine may turn out to be
poorly tuned for another. A widely applicable example of
this problem is the fact that different compilers support
different register values.
- Digital Mars C++ uses simple methods to expand inline
functions, relying on the global optimization feature to
"fix" them into more acceptable code.
- C++ is one of the few languages that permits implicit
promotion of user-defined types; global optimization
helps clean up the resulting temporary variables.
When to use the global optimization feature
Global optimization slows compilation but is a highly effective
timesaver on frequently executed programs. Optimization also
moves code to the extent that some of the correlation between
source and object code is lost. Therefore, do not use global
optimization in conjunction with the debugger.
Effects of optimization on performance
Typical computation-bound programs can speed up as much as 30
percent when optimized. I/O-bound programs may not speed up at
all. The most dramatic speed improvements are in small, frequently
executed loops. Speed improves least in code that consists primarily
of function calls.
Optimization versus assembly language code
Globally optimized code still does not run nearly as efficiently as
carefully crafted assembly language code. One reason is that the as-sembly
language programmer knows more about the algorithm than
the compiler knows. The compiler always optimizes conservatively,
to ensure safe optimizations. However, you may not have time to
carefully craft an entire program or program component in assembly
language; an optimizing compiler then offers the most cost-effective
solution.
What global optimization does not do
Global optimization does not replace inefficient algorithms with
efficient ones. It cannot recognize a bubble sort and replace it with a
quick sort. Regardless of how good a compiler might be, improving
the algorithm often produces the most dramatic increases in
execution speed.
Optimizing for Windows
All global optimization methods are safe for use with code that runs
on any version of Microsoft Windows.
Using the Global Optimization Feature To use the global optimization feature, specify the -o option to SC.
For example,
sc demo.c -o
means compile and optimize demo.c.
To optimize from the IDDE, use the Code Optimization subpage
on the Build page of the Project Settings dialog box.
Using Optimization Flags
If you use the -o option without specifying any flags (the default),
the compiler uses all optimization methods and optimizes the speed
of the program. The tradeoff is this option takes up a lot of memory.
Optimizing for speed vs. space
When space is at a premium, you can change the default of "speed
over space" to "space over speed," like this:
sc demo.c -o+space
This command compiles demo.c with all optimizations applied; the
speed optimization is turned off, and the space optimization is
turned on. For information about what speed and space
optimizations do, see "Space/speed tradeoffs" later in this chapter.
Turning off all optimization methods
To turn all optimizations off, use the flags -o-all or -o+none. The
-o+all and -o-none flags are equivalent to -o.
For example:
sc demo.c -o+none
compiles demo.c without performing global optimization.
Turning off specific optimization methods
If you must explicitly disable specific methods, do so by first turning
on all methods with -o and then disabling individual methods one
at a time with -flag. For example:
sc demo.c -o-cnp -o-cp
performs all optimizations except "constant propagation" and "copy
propagation."
Note:
You cannot turn on individual optimization
methods.
List of global optimization flags
The table below lists all the optimization flags. Complete
descriptions of each flag follow the table.
- all
- Perform all optimizations
- cnp
- Constant propagation
- cp
- Copy propagation
- cse
- Global common subexpressions elimination
- da
- Dead assignment elimination
- dc
- Dead code elimination
- dv
- Dead variable elimination; compute live ranges
- li
- Loop invariant removal
- liv
- Loop induction variable replacement
- loop
- Iterate until optimizations have no further effect
- none
- Do no optimizations
- reg
- Assign variables to registers where possible
- space
- Favor space optimizations over execution speed optimizations
- speed, time
- Favor execution speed (time) optimizations over space
optimizations
- vbe
- Very busy expressions
- w
- Turn on warnings
all Turns on every optimization except space. The flags none and
all are mutually exclusive; +all is equivalent to -none (shutting
off none). Use +all to optimize your code to the fullest extent.
cnp replaces certain variables with constants. Consider the code:
A = 5;
for (i = 0; i < A; i++)
abc[i] = A;
A always has the value 5 within the loop body. To optimize, the
compiler replaces A with its value:
A = 5;
for (i = 0; i < 5; i++)
abc[i] = 5;
Constant propagation opportunities frequently occur when loop
rotation is done. For example, constant propagation converts:
while (e)
expression;
to:
if (e)
do
expression;
while (e)
cp Copy propagation is similar to constant propagation, except it copies
variables instead of constants. For example, it replaces:
A = b;
for (i = 0; i < A; i++)
abc[i] = A;
with:
A = b;
for (i = 0; i < b; i++)
abc[i] = b;
Copy propagation frequently uncovers unnecessary assignments,
such as the assignment to A, that can be removed.
cse
Common subexpression elimination removes redundant
computations. For example:
plus10 = A + B + 10;
minus10 = A + B - 10;
tang = (A + B) / cos(B); // A + B is computed three times
becomes:
T = A + B;
plus10 = T + 10;
minus10 = T - 10;
tang = T / cos(B); // A + B is computed once
da
Eliminates assignments to variables that have no further use. For
example:
int abc()
{ int f;
static int g = 0;
if (g != 3)
{ f = 3; // A dead assignment to f
g = 3; // Not a dead assignment
}
}
dc
Dead code is code that can never execute. Sometimes dead code is
subtle:
#define VALUE 0
if (VALUE > 10)
a = dead + code; // This statement is dead
dv
Dead variables are automatic variables that are declared but never
used. They often result from other optimizations. The live range of a
variable is that portion of a function in which the variable's value
must be preserved. If the live range is zero, the variable is dead. If
two or more variables have nonintersecting live ranges, they can
occupy the same memory location.
When allocating a large number of variables into a fixed number of
registers, the compiler figures the live range of each register and
compares it with the live range of the corresponding variable. If they
do not intersect, the compiler assigns the variable to that register.
li
If an expression appears in a loop body but its result never changes,
then it is a loop invariant, and the compiler can remove its
computation from the loop. For example:
while (f()) // b * c is loop invariant
g(b * c);
becomes:
T = b * c;
while (f())
g(T);
liv
Loop invariant removal makes loops faster, especially those that cycle
through an array. It can make code slightly larger, however. For
example:
int a[ARRAY_SIZE], i;
for (i = 0; i < ARRAY_SIZE; i++)
a[i] = GetNextElement();
Without this optimization, your code has to perform a multiplication
each time it figures the address for the next array element
(i * sizeof( int)). With the optimization, it remembers the
address of the last element and adds the size of an element to that
address.
loop
Performing many optimizations makes further optimizations
possible. This switch tells the compiler to continue until it discovers
no further optimizations.
none
The none switch shuts off all optimizations, which is the same as
not using the optimization feature. The none and all
optimizations are mutually exclusive; +none is equal to -all
(shutting off all).
reg
reg puts as many variables in registers as possible. Variables are
assigned to registers based on how often they are used and how
deeply nested in loops they are.
Using the register storage class is unnecessary when you use the
global optimization feature. However, the compiler honors all such
declarations in your code.
space
speed/time
Sometimes you must choose whether you want your code to run
faster or to take up a minimal amount of memory. One specific
example is:
while (f())
g();
When optimized for space, the above example becomes:
goto L1;
do
{ g();
L1:;
} while (f());
When optimized for speed, the example becomes:
if (f())
{
do
g();
while(f());
}
Note:
If you don't specify +space, the compiler defaults
to optimizing for speed.
The speed and space flags are mutually exclusive:
-speed is the same as +space
-space is the same as +speed
vbe
Very busy expressions occur along more than one path in the flow
graph:
if (condition)
a = b * c;
else
d = b * c;
Here b * c is a very busy expression. The optimization is:
T = b * c;
if (condition)
a = T;
else
d = T;
This optimization only saves space; it does not increase execution
speed.
Optimization Tips
Here are some ideas to keep in mind when you use the global
optimization feature.
Use of const and volatile keywords
The const keyword defines a variable that is never modified, so the
compiler does not have to make worst-case assumptions about that
variable's changing.
The volatile keyword tells the compiler that the variable can
change asynchronously and never "optimizes out" loads and stores
to volatile variables.
You can use the const keyword for static tables and the volatile
keyword for hardware locations (like low-memory globals) and data
that could change when an interrupt occurs.
Example:
const static int a = 5, b[3] = { 1,2,7 };
extern volatile bool flag;
If a working program fails when optimized
The compiler uses a strict interpretation of the definition of the C++
language and strives to conform to it. However, some aspects of C++
are undefined, and the compiler may interpret the same code
differently, depending on whether optimization is performed. Some
reasons that code may behave differently when optimized include:
Recompiling code with the global optimization feature can uncover
problems similar to those encountered when porting to another
compiler. Adherence to standard "portable" coding practices
minimizes any problems encountered when using the optimization
feature.