Jun 28
2009

Compiling Templates

Posted by Walter Bright in Programming LanguagesProgramming Language ImplementationMetaprogramminglanguage engineeringD ProgrammingCPlusPlusCompilers

WalterBright

Templates can be a complex and intimidating feature. I know I'm never comfortable with a feature until I know how it compiles and works down to the bottom level. Any complex process can be broken down into simpler, lower level steps that can be mastered and then understanding built up from there. Rather than get into a long discussion of the arcana of templates, I'll show the step-by-step way they're compiled. Often, much of the strangeness of the arcana becomes clearer once the compilation process makes sense.

Given the simple template program that computes a factorial at compile time:

template<int n> class factorial
{
  public:
    enum
    {
      result =
         n * factorial<n-1>::result
    };
};

template<> class factorial<1>
{
  public:
    enum { result = 1 };
};

int foo()
{
    return factorial<10>::result;
}

1. The source code is preprocessed and lexed (converted to a stream of tokens). We'll skip past that as it's standard compiler technology and not peculiar to templates.

2. The two factorial templates are parsed. Parsing means a data structure is created that represents the templates. Our two factorial templates are installed into the symbol table, and since they have the same name, they are often stored in a list hanging off the 'factorial' name. The bodies (the stuff between the { }'s) of the templates are stored as ASTs (Abstract Syntax Trees). Sometimes, as in the Digital Mars C++ compiler, they are stored as lists of tokens.

3. The function foo() is semantically analyzed, meaning that the AST is evaluated, typechecked, symbols defined, etc. The factorial<10> is recognized as an instantiation of a template named factorial with an argument that is a literal integer 10.

4. factorial is looked up in the symbol table, and two factorial template declarations are found. The compiler must determine which of those templates to instantiate.

5. The template instantiation arguments, in this case the literal 10, is applied to each template declaration. The list of template declarations that match are called the 'candidate list'. In this case, the 10 matches the int n parameter of the first factorial template, and does not match the 1 parameter of the second. So there is only one candidate, and that is selected.

6. If there are more than one candidate templates, the compiler must determine the so-called best match. This is a lot like function overloading. The technique used for finding the best match for both C++ and the D programming language is called partial ordering.

7. Now we have the specific template declaration that needs to be instantiated and the actual arguments to instantiate it with. But there is a rule in C++ and D that a template with a specific set of arguments can only have one instantiation for the whole program. This means that if the template has already been instantiated with the same arguments, we can simply point to that instantiation and we're done.

Finding an existing instantiation is done by 'mangling' the template name, namespace, and arguments into a string. For factorial<10>, this mangled name looks like (for g++) 9factorialILi10E. (This can be read as a "9 character name 'factorial' with a literal integer 10 argument followed by the end of the arguments.) None of the template body is needed for this. This string will be unique in the program.

Hung off of the template declaration in the symbol table is another table with those mangled template instantiation names as indices to the corresponding template instantiations.

If the template instantiation isn't found, then it needs to be instantiated.

8. A fresh symbol table scope is created. The template declaration parameters are declared in it as symbols initialized to the actual argument values. Type parameters are declared as typedefs (aliases in D), integer parameters are declared as constants. The template declaration body is then parsed (if not already an AST), and semantically analyzed by the regular other bits of the compiler that deal with functions, classes, and declarations.

A reference to the new instantiation is also installed into the template declaration's table of instantiations.

9. The declarations created by the template instantiation are now treated as regular types and symbols by the rest of the compiler, which doesn't know or care they came from templates.

10. The astute reader will wonder what happens when separate compilation units generate the same template instantiations with the same arguments. Wasn't there a rule that says only one unique instantiation per program? You're quite right, and the solution is to be found in that the template instantiations are identified in the object file as "COMDATs", short for common data blocks, a delightful term which instructs the linker to remove the duplicates.
(Early C++ compilers which didn't have the luxury of COMDAT-supporting linkers generally had a kludgily awful time with this problem.)


I hope this has shed at least a little light on how templates work. If you want to learn more about how real compilers work, I am hosting a seminar in the fall on compiler construction.

Thanks to Eric Niebler, Andrei Alexandrescu and Jason House for review a draft of this.



Comments (0)Add Comment

Write comment
You must be logged in to a comment. Please register if you do not have an account yet.

busy

Get your FREE Subscription to Dr. Dobb’s Digest today!

Dobbs Code Talk Quick Poll

This time next year, your most important operating system (host and/or target) will be:

Look Who's Code Talking


Cecil Morris
City: St. Louis

Mike Firkser
City: Edison

Carl Dreher
City: Dallas

Victor Schrader
City: San Diego

Magnus Martensson
City: Malmo

Dirk Collins
City: Liberty

Dobbs Code Talk Tags

.NET abstraction Ada Adobe Agile Ajax algorithm Algorithmic complexity ALM Analogical reasoning Android Anecdotes Apple Application Development AppStore Architecture and Design ARM Artificial Intelligence Artificial Life Assembler Programming Audio files AVX AWK Banking Bazaar Best Practices Blender Books Brain computer interfacing Build C C Programming C Sharp Cartoon Category theory Cellular automata Clojure Cloud Computing Cobol Cocoa Coder Of The Month Cognition as compression Collaboration Common Process/Frameworks Compilers Computational humour Computational narrative Computational politics Computer Science Computers in art computing pioneers concurrency Conferences Consciousness research Contest Contest140 contests CPlusPlus crime CSharp D Programming Data Centers Databases Debugging Delphi Deployment design Design Patterns Digital Signal Processing Distributed Django Documentation DSL dynamic language Eclipse EDA education Emacs Embedded Systems Encryption engineering Erlang Etymology Excel exception handling Facebook Financial computing Five Questions Flash Flash Lite Flex Forth Fortran Fraud FreeBSD Fun Functional Programming gadgets Games Gender Git gnuplot Go Google Graphics GUI hardware Heron High School High-Performance Computing History Holographic reduced representations HTML5 Humanity Humour Hungarian Notation Identity Inkscape Innovation Intel Interview iPhone J2EE Java JavaFX JavaOne JavaScript language engineering Legal lex LINQ Linux Lisp Literate Programming Logic Programming m4 Mainframes Make Mathematica Mercurial Mesh messaging Metaprogramming Microsoft MID Miscellaneous Musings ML Mobile Software Mobility modeling modular programming multicore Music MVC myblog Natural Language Processing Networking Neural networks newspeak Nokia numerical computing Object Rexx ObjectiveC Office Office 2007 Online spreadsheets OOP Open Source Openaccess publishing OpenBSD OpenSolaris Operating Systems Optimization Oracle Pair Programming Parallelism Concurrency Parsing Pascal Patents Patterns Performance Perl PHP Podcast Pop11 Poplog Privacy Processing Productivity Programming Language Implementation Programming Language One Programming language semantics Programming Languages Programming Style Project Management Prolog Psychology Public understanding of science puzzle Python QA Quantum Computing Quotes Rails Realtime recls Requirements Research practice REST Review RIA rich internet applications Robotics Ruby SaaS Software as a service Scala Schadenfreude Science fiction Screencast Scripting SD Best Practices Search Security Semantic Web Silverlight Snobol SOA social Social Networks Society for the Study of Artificial Intelligence a Software Development Methodology and Management Songs and poems Spending Priorities Spreadsheets SQL Startups Statistics Storage String pattern matching Survey Teaching Testing The Business of Programming The Dobbs Challenge The Future Theory Topology Transhumanism Travel on the Job Twitter Types Unix Upgrade Usability Use Cases USENET User Experience User Interface Design Version Control video virtual machines Virtualization Visual Studio Visual Studio Sponsored Post WCF Web Development Windows Windows 7 Windows Live Wireless WOA WPF X Window System yacc

Subscribe to Dr. Dobbs Newsletter

Email:
Dr. Dobb's Update
Delivered twice a week, Dr. Dobb's Update provides unbiased and objective news, commentary and technical features spanning the entire software development marketplace.

Latest Comments

Jonathan's Last Day at Sun
For the 8 years I worked there, it was fantastic. I worked there under McNealy and I have undying admiration for the guy. I only knew Jonathan periphe...
Implementing Thread Local Storage on OS ...
Back in the day, I did a fair amount of work with PThreads. Wonderful design. Some quirks, but basically really, really nice. Although I wrote a lot ...
More Technonecrophilia with Snobol One-L...
Yeah, It's probably identical except for the (embedded) copy number, I would think. Once it became freely distributable, the copy I've been distribut...
More Technonecrophilia with Snobol One-L...
There's a spitbol-3.7-win.exe at http://code.google.com/p/spitbol/downloads/list . I found it via Dave Shield's blog page http://daveshields.wordpress...
Jonathan's Last Day at Sun
Sadness.

The Latest From Our Member Blogs

How To Select Trainees
Written by Joel Wiesen   
01/27/10
Hiring the right trainee can be harder than hiring a trained programmer.  One approach is described at my website: http://www.aprtestingservices.com/business/lpat/
 
Technical Job Interviews
Written by Keith Kerlan   
01/20/10
What is the best way to interview for software developer positions?  I've been on both sides of the job interviewing table, but have been on the interviewee side of some not too  great inter
 
Timers/timeouts in multi-threaded event-loops
Written by Christof Meerwald   
01/03/10
The traditional way to integrate timeout handling (or timers) in (single-threaded) event loops was to just pass the appropriate timeout value to the select/poll/epoll syscall. While this works fine
 
C vs C++
Written by Issam Lahlali   
12/04/09
I think that the debate "C vs C++" will end when the two langages died, and each one have its advantages and inconvenients, the choice of one instead of another depend on the application c
 
Great Jobs at CISCO
Written by Brent Rogers   
11/30/09
Hello! I am a recruiter at CISCO. We have a number of great jobopportunities at CISCO right now. Please take a look at the job links listedbelow and please send me an updated resume if you are interes
 
OK Labs, ST-Ericsson, and the Mobile/Wireless Ecosystem
Written by Steve Subar   
11/17/09
Two weeks ago, OK Labs and ST-Ericsson announced the selection of OK Labs as ST-Ericsson's mobile virtualization partner. To earn this coveted position, OK Labs prevailed in a rigorous evaluation
 
C++ Ninjas Needed in Santa Clara, California
Written by Brent Rogers   
09/30/09
Hello! I am a recruiter at CISCO. Our PostPath teamin Santa Clara is building a new Email SaaS business at CISCO. We are looking forsenior developers with Zimbra expertise to help us accomplish this t
 
Fighting Fragmentation with Mobile Virtualization
Written by Steve Subar   
09/21/09
Last week Motorola and T-Mobile announced the launch of a new and innovative Android-based smartphone, the Cliq. This attractive, feature-rich slider handset happens to build on a chipset and firmware
 
Insights into Router Design: Unit Testing of Networking Protocols
Written by Rajesh Kumar Venkateswaran   
09/07/09
  Unit testing is a software validation methodology through which a programmer tests individual modules or units of source code. If the programmer has been responsible for developing a networ
 
Insights into Router Design: Implementation of Networking Protocols
Written by Rajesh Kumar Venkateswaran   
09/06/09
  Modern data networking consists of a large number of networking protocols, each of which has its own domain of applicability. Some run on end stations (also called hosts), some on enterp
 
Insights and Innovations in Networking
Written by Rajesh Kumar Venkateswaran   
09/05/09
Networking devices such as routers and switches have evolved quite a bit over the past years, both in the service provider network and in the enterprise. It is a challenge to build these devices, bo
 
reddit threads community
Written by Christof Meerwald   
08/30/09
I have just started a threads community over at reddit to cover topics such as multithreading, concurrency and parallel programming. Feel free to join if you are interested. -- cmeerw.org 
 

The Latest From Dr. Dobbs

DDJ