|
|
Subscribe / Log in / New account

GCC gets a new Optimizer Framework

This article brought to you by LWN subscribers

Subscribers to LWN.net made this article — and everything that surrounds it — possible. If you appreciate our content, please buy a subscription and make the next set of articles possible.

May 12, 2004

This article was contributed by Steven Bosscher and Diego Novillo.

Earlier this week, the first bits a major compiler internals overhaul have been merged into the development mainline of the GNU Compiler Collection (GCC) for inclusion in the next release.

GCC is used as the system compiler for GNU/Linux and many other operating systems. The system compiler is one of the components of an operating system that has a massive impact on the performance of the system as a whole. From the kernel to productivity applications, from the C library to even the compiler itself, almost all executable binaries are compiled with the system compiler, so it has to be stable and produce good code. It is therefore not surprising that major changes to the internals of a stable compiler almost never happen. But computer architectures change, so at some point an aging compiler will have to undergo big surgery or risk becoming irrelevant. And GCC is aging.

While GCC produces reasonably good code for a large number of architectures, even its most recent version essentially builds on the compiler framework started by Richard Stallman in the early 1980's. In this framework, code improving transformations are performed on an intermediate representation called Register Transfer Language (RTL), an architecture independent, lisp-like assembly language. Older versions of GCC used this framework mostly for local optimizations, but such limited optimizations are insufficient for modern architectures with RISC-like properties and a significant difference between the speed of the chip and of memory access.

So, with the release of GCC 3.0, a number of global optimizations acting on RTL were introduced. Unfortunately, for many code transformations, RTL is not a suitable and effective representation because it is too close to the actual machine language. This hinders several of the high-level analyses performed by modern compilers. It has become more and more obvious that a new, high-level intermediate representation needs to be added to GCC. The Tree SSA project has been started to address this need.

The goal of the Tree SSA project is to build a completely machine-independent optimization framework based on the Static Single Assignment (SSA) form. SSA is an intermediate representation (IR) that is becoming increasingly popular because it allows efficient implementations of data flow analysis and optimizing transformations.

In SSA form, every temporary variable is only assigned a value once. Actual programs are seldom in SSA form initially, because variables tend to be assigned multiple times, not just once. An SSA-based compiler modifies the program representation so that every time a variable is assigned in the original program, a new version of the variable is created. Different versions of the same variable are distinguished by subscripting the variable name with its version number.

Variables used in the right-hand side of expressions are renamed so that their version number matches that of the most recent assignment. It is not always possible to statically determine what is the most recent assignment for a given use. These ambiguities are the result of branches and loops in the program's flow of control. To solve them, the SSA form introduces a new type of operation called PHI functions, these merge multiple incoming assignments to generate a new definition; they are placed at points in the program where the flow of control causes more than one assignment to be available.

Figure 1Figure 2

For example, consider the code fragment in Figure 1, where it may not be known at compile time which of the branches will execute. The USE-DEF chains for 'x' are drawn in the figure. In the second 'switch', the compiler has to assume that any of the assignments to 'x' in the first switch may have been executed. In this case, the SSA conversion process will introduce a PHI function for 'x' to create the needed unique definition, as shown in figure 2.

Notice that PHI functions are an artifact used internally by the SSA form and are never emitted in the final code. The PHI function that defines 'x_4' in the previous example simply means that 'x_4' can take the value of 'x_1', 'x_2', or 'x_3' at run time.

Once the program is in SSA form, flow of control and USE-DEF chains are explicitly represented in the intermediate representation, giving almost instantaneous information to passes like constant propagation and folding. The properties of the SSA form greatly simplify data flow analysis, and indeed many traditional compiler optimizations, such as constant and copy propagation and also some forms of common subexpression elimination, are relatively straightforward and fast on functions and even whole programs represented in SSA form.

Before work on these optimizations could start, a whole new optimization framework had to be implemented:

  • A new intermediate representation.
  • GCC already constructed each function as an abstract syntax tree (AST), but there was no single AST representation in GCC. Instead, each language defined its own trees which were translated piecewise to RTL and then optimized in the old framework. With Tree SSA, two new language independent representations have been added to resolve this issue.

    All the language front-ends now emit a very high-level IR called GENERIC. Each function is handed over to the language independent parts of the compiler as a tree in GENERIC form. Next, this tree is lowered to GIMPLE form, another new IR derived from the SIMPLE representation proposed by the McCAT project out of McGill University.

    The GIMPLE representation looks like three-address code. All side effects are explicit so that a function in GIMPLE form is ready for analysis. Most of the existing front-ends have been modified to emit GENERIC so that they can be optimized using Tree SSA. The next release will also include a Fortran 95 front-end, which is the first front-end built directly to emit GENERIC.

  • Analyses for rewriting the GIMPLE representation in SSA form.
  • In the old framework, no optimizations were performed on the AST. This meant that there was no need for a control flow graph, or for data flow analysis to be performed. All of this is now necessary before a representation can be rewritten into SSA form.

    To avoid unnecessary code duplication, a lot of effort was spent on rewriting the old framework so that it was possible to share many of the basic control flow graph manipulations between the old and the new framework. Data flow analyses had to be implemented from scratch.

    One particularly interesting analysis is alias analysis. GCC now implements several types of alias analysis: type-based flow-insensitive analysis, flow-insensitive points-to analysis, and flow-sensitive points-to analysis. Most analyses are currently intra-procedural, although some inter-procedural analyses are partially implemented or planned.

  • Passes for performing the actual code optimizations.
  • Passes that have already been implemented include sparse conditional constant propagation, partial redundancy elimination, dead code and dead store elimination, and scalar replacement of aggregates. Also, a lot of dominator tree based optimizations and some conditional execution conversions have been implemented. Many of these passes replace equivalent passes that work on RTL.

All the new parts together account for about 100,000 lines of new code, not including the many changes to existing parts of the compiler. The framework implemented as part of the Tree SSA project adds a whole new path to the compilation process, while no RTL passes have been disabled yet.

Still, a compiler with the Tree SSA passes enabled is not significantly slower than the recently released GCC 3.4.0, and a number of very expensive passes in the RTL framework have already been subsumed by Tree SSA passes. Once these RTL passes have been disabled and removed, the resulting compiler will be a lot faster than GCC 3.4.0, while the generated code is at least as good, and often better.

Index entries for this article
GuestArticlesBosscher, Steven


(Log in to post comments)

gfortran

Posted May 13, 2004 3:16 UTC (Thu) by vblum (guest, #1151) [Link]

One of the interesting side aspects of the tree-ssa branch is not optimization related - it
was supposed to include gfortran, finally an implementation of Fortran 90 (95?), which for
some of us would be a huge optimisation in itself ...

What is the status of gfortran in tree-ssa? Is it still in? Will it be ready / usable by the time
gcc 3.5 appears? How does it compare to the steadily improving, but rather privately
developed g95? Are those branches sundered for good?

questions, questions, but they are mildly interesting to someone using the language ...

gfortran

Posted May 13, 2004 6:55 UTC (Thu) by jamesh (guest, #1159) [Link]

Are you sure these two projects aren't one and the same?

Looking at README.backend file in the G95 CVS, it says they require the tree-ssa branch.

gfortran

Posted May 13, 2004 15:26 UTC (Thu) by vblum (guest, #1151) [Link]

Early 2003, there was a fork in the project. My understanding was that Andy Vaught (who, I think, founded g95?) did not agree with a roadmap for gcc inclusion that some of his co-developers wanted. The others splut and started on direct gcc inclusion - Andy went on, allegedly focussing on a usable compiler first and worrying about inclusion politics later. The others intergrated their branch of g95 into gcc tree-ssa officially. This explains why the gcc-g95 has all its mailing lists & infrastructure at gcc.gnu.org, while I cannot even find a mailing list on g95 itself any more - you have to email Andy Vaught directly, it seems ... on the other hand, he mainatins an active blog and the compiler seems to produce workable code for some highly complex pieces of code now ...

Anyone out there with better information is welcome to correct me.

gfortran

Posted May 13, 2004 17:20 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

The only "live" part of the g95 (or gfortran) project is now fully integrated with GCC development.

gfortran

Posted May 13, 2004 17:39 UTC (Thu) by vblum (guest, #1151) [Link]

So are gfortran (in gcc) and www.g95.org one branch? The appearance is very different ... (and g95.org is certainly rather live ...)

gfortran

Posted May 20, 2004 9:53 UTC (Thu) by joib (subscriber, #8541) [Link]

No. g95 is a separate project, although there is some cross-pollination. GNU gfortran lives at http://gcc.gnu.org/fortran/ , and as has been said, is now part of mainline gcc, expected to release in 2005. Unfortunately the web page is somewhat out of date, check the fortran@gcc.gnu.org mailing list instead, archive is at http://gcc.gnu.org/ml/fortran/ .

If you want to compile F95 code right away, g95 is probably in a more usable state at the moment. However, considering that g95 is a one-man show I don't think it's long-term viability is as good as that of gfortran.

PL/I Frontend for GCC

Posted May 13, 2004 6:30 UTC (Thu) by henriksorensen (guest, #6313) [Link]

On a side note, there is also being developed a PL/I frontend for GCC. The aim of the project is also to use the GENERIC/GIMPLE representations.

GCC gets a new Optimizer Framework

Posted May 13, 2004 10:10 UTC (Thu) by rjw (guest, #10415) [Link]

I wonder what this will mean for gcj?

Hopefully it will be closer integrated into the mainline, and we could see things like compiling a subset of C/ C++ / fortran to JVM bytecode. AFAIK, at the moment, gcj doesn't use all that much of the same code as the other frontends.

Another interesting area ( for mono folks) would be if this would make an IL backend easier... and maybe even enable Managed C++, ugly as it is..

GCC gets a new Optimizer Framework

Posted May 13, 2004 17:22 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

gcj has been "fully integrated into the mainline" both before and after the switch to tree-ssa.

GCC gets a new Optimizer Framework

Posted May 13, 2004 17:35 UTC (Thu) by Per_Bothner (subscriber, #7375) [Link]

Hopefully [gcj] will be closer integrated into the mainline, and we could see things like compiling a subset of C/ C++ / fortran to JVM bytecode. AFAIK, at the moment, gcj doesn't use all that much of the same code as the other frontends.

The tree-ssa work doesn't change the "integration" of gcj at all - it's as integrated after the tree-ssa merge as it was in 3.4. And it isn't less "integrated" than other languages. It doesn't share code with the C-specific frontends but neither does Fortran.

One way ssa can help gcj is that can make it easier to write high-level optimizers with better knowledge of Java semantics.

Compiling a subset of C/ C++ / Fortran to JVM bytecode is a very hard problem. The logical way to do it is to write a "JVM backend", and so conceptually you'd cross-compile to the JVM. Steve Chamberlain (then) of Transmeta contributed a PicoJava port which has been removed, but it could used as a starting point. Whether something like this would be useful enough to justify the work is a different matter.

I'd be much more interested in "JVM support" for gdb, so it could debug interpreted bytecode.

GCC gets a new Optimizer Framework

Posted May 16, 2004 12:31 UTC (Sun) by khim (subscriber, #9252) [Link]

Is is "less-integrated". When you compile with GCJ to native code it uses the same front-end. But when you want to compile to JVM - it's whole other compiler! Will it be possible to make JVM just another target ? RTL was unusable for this - but what about SSA ?

GCC gets a new Optimizer Framework

Posted May 20, 2004 9:43 UTC (Thu) by rmathew (subscriber, #20961) [Link]

Is is "less-integrated". When you compile with GCJ to native code it uses the same front-end. But when you want to compile to JVM - it's whole other compiler! Will it be possible to make JVM just another target ? RTL was unusable for this - but what about SSA ?

First off, RTL has not gone anywhere after the Tree-SSA merge - it is still there, though no longer where almost all of the optimisations are performed.

Second, the actual front end remains the same whether you are compiling to native code or to a class file containing JVM bytecodes - the difference is that JVM bytecodes are generated straight from the tree representation while native code is generated after the trees are converted to RTL and generic and machine-specific optimisations performed.

The main reason is that the GCC infrastructure does not directly support the notion of generating code for two different backends at the same time and it is not trivial to add support for such a thing.

Another reason is probably political.

GCC gets a new Optimizer Framework

Posted May 2, 2005 23:22 UTC (Mon) by plugwash (subscriber, #29694) [Link]

yeah rms is a "Free" software zealot

how much control does he have over gcc nowadays

notice the way he was advocating self-censorship of code in that discussion linked to

GCC gets a new Optimizer Framework

Posted May 13, 2004 19:14 UTC (Thu) by iabervon (subscriber, #722) [Link]

I wouldn't actually be too surprised if full C99 support were possible on the JVM (with a suitable library, of course). There are some limits on, for example, what you're allowed to do on pointers as integers and turn them back to pointers successfully. It would probably actually be possible to make a conforming implementation of C with type checking, garbage collection, etc., and it might be possible to use the JVM's services to do it.

GCC gets a new Optimizer Framework

Posted May 14, 2004 9:37 UTC (Fri) by rjw (guest, #10415) [Link]

http://www.xwt.org/mips2java/

its definitely possible, and this is one (ugly) starting point.

The jvm is turing complete, obviously.

The big problem is just working out what people are doing with pointers to be able to implement them efficently on the jvm....

More of this kind of article please

Posted May 13, 2004 13:33 UTC (Thu) by mrussell (guest, #3649) [Link]

This is a great article - in-depth coverage of the design of a important user-space tool, at a similar level of depth to the kernel development section. I would love to see more like this (perhaps commissioned from people involved in major projects like apache, scripting languages, databases etc - there's no shortage of subject matter).

More of this kind of article please

Posted May 13, 2004 18:09 UTC (Thu) by kalle (guest, #548) [Link]

I agree. Articles like this one and the ones in the kernel section is the reason I subscribe to LWN.

GCC gets a new Optimizer Framework

Posted May 13, 2004 14:19 UTC (Thu) by JamesErik (subscriber, #17417) [Link]

Can anyone comment as to whether the GCC teams has become any more amenable
to the idea of exporting intermediate representations to a user-accesible format such
as XML? The GCC-XML patches do this, but it would be nice if it were a standard
feature of the compiler itself. From what I understand, the GCC team's hesitancy is
for fear of proprietary backends being able to leverage the GCC front-end via a
standardized, accessible IR format. Still, it would be hugely useful for user-level
source code analysis (doxygen, for example) and source code generation (for me).

GCC gets a new Optimizer Framework

Posted May 13, 2004 17:17 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

While there are a variety of useful debugging dumps from GCC, the FSF is strongly opposed to mechanisms that would allow the complete internal state to be dumped and reloaded (for example, a complete XML representation of the parse trees), because a variety of people will quickly use such mechanisms to add proprietary optimization passes and back ends to GCC, instead of contributing that code in source form under the GPL to GCC. To support their argument, they point out that we only have a GNU C++ front end and a GNU Objective-C front end because these types of interfaces were not present at the time these front ends were developed (in both cases, companies involved tried to find legal ways of making their employees' work proprietary, then gave up and contributed the code).

The FSF will not try to prevent other people from making or distributing "XML patches", but those kinds of patches won't be accepted for the official GCC. At least, not now; it is possible that the policy will have to be relaxed at some point to properly support the C++ "export" keyword, in the sense that some form of pre-parsed external representation will have to be stored. And, in the case of Ada, there is a standard external representation, and the FSF is quite willing to let the GCC developers do what is necessary to support language standards.

Might be 4.0

Posted May 13, 2004 17:28 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

It is possible that the first release of GCC based on the tree-ssa framework will be called 4.0 rather than 3.5. The decision on that hasn't been made yet. It is certainly a bigger internal change than any that was made between GCC 2.x and GCC 3.x: the major change there was in the completely new C++ standard library, but for C there were no major architectural or user-visible changes.

Might be 4.0

Posted May 14, 2004 21:10 UTC (Fri) by xorbe (guest, #3165) [Link]

People who are concerned with getting the job done really don't care what it gets versioned as. 3.5, 4.0, either way it'll compile my code. Thanks for GCC!

Nice article

Posted May 13, 2004 18:29 UTC (Thu) by Ross (guest, #4065) [Link]

I never expected to read about compiler optimization theory on LWN but
I'm happy to see it here. The article was short and readable.

Nice article

Posted May 14, 2004 19:15 UTC (Fri) by Gady (guest, #1141) [Link]

I agree. This article, like the kernel page, is written just for me - technical but not specific. A pleasure to read.

Nice article

Posted May 19, 2004 18:49 UTC (Wed) by roelofs (guest, #2599) [Link]

I agree. This article, like the kernel page, is written just for me - technical but not specific. A pleasure to read.

Quite. Kudos to Steven and Diego!

Greg

GCC gets a new Optimizer Framework

Posted May 13, 2005 13:53 UTC (Fri) by tomek (guest, #29487) [Link]

A lot of compilers have come and gone over the last years but the GNU Compiler Collection has been one of the leading compilers in use over the last 15 years. It has a long and critically important history in the free and open source movement. GCC version 4 now features a new optimisation framework (Tree-SSA) and includes improvements to its optimiser (e.g. dead code elimination, autovectorisation of loops) as well as language-specific improvements. The changelog states: "Independent testers have measured speed-ups up to 25% in real-world production code, compared to the 3.4 family" (for C++). Tree-SSA will enable the development of many many more optimizations than were reasonably possible with the old infrastructure - so GCC 4.0 is the base of the next round of optimization, which will be part of GCC 4.1. Tom


Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds