Aug 01
2008

What is Invariant Good For?

Posted by Walter Bright in Software Development Methodology and ManagementProgramming LanguagesParallelism ConcurrencyOptimizationMetaprogrammingHigh-Performance ComputingFunctional ProgrammingD ProgrammingconcurrencyCompilersApplication Development

WalterBright

Invariant data in a program is data that, once initialized, never  changes. A new term was coined for it after discovering that the existing jargon const, final, readonly and immutable had different meanings for different programmers and different programming languages (and even had shifting meanings within programming languages!). A term was needed that enabled a precise definition.

The existing terminology confusion led to the discovery that invariance doesn't get much respect in the programming language business (*), it's the Rodney Dangerfield of concepts. C++ only has invariance for literals and top level const. There is, for example, no way to declare a pointer to invariant data. Java has similar problems with its final storage class. There is no way to declare an entire data structure as invariant, or even the leaves of a data structure as invariant. Perl has invariance only for strings.

(Note: pointer to const in C++ does not declare a pointer to invariant data, as another alias to that same data can still change it.)

The D programming language, however, brings invariant to the fore as a transitive, first class, pervasive language feature. Whole data structures can be invariant, as well as leaves and substructures of it. It adds some measure of complexity to the language, and since other languages have such severely limited invariant support, and yet large and successful programs are developed with them, the obvious question is why bother? Invariance doesn't seem to be a property anyone misses.

So here's a list, in no particular order of importance, of what invariance is good for and why it is worth the extra complexity:

  • By marking data as invariant, the self-documenting nature of the type is improved. There is no wondering if any part of a function's implementation modifies any part of an invariant argument's data structure. There is no wondering if any subsequent user of an invariant variable changes it or any part of what it refers to.
  • Invariant data enables many optimization opportunities. If an optimizer can figure out what an invariant variable is initialized with, it knows that any use of that variable will use the same value, and so generic code can be replaced with special case code for that particular value. 
  •  Data computed using invariant operands is also invariant, and so the result can be cached instead of recomputed.
  • Invariant function arguments make pure functions possible, and with pure functions one can use functional programming techniques. Pure functions are inherently thread safe and parallelizable, which is important in the coming ubiquitous world of multiple cores. 
  •  Invariant data can be treated as having the semantics of value types, but with the overhead of a reference type. Strings (discussed in one of my previous columns) are a canonical example of this. 
  •  If a deep copy of a data structure is required, such as for implementing a copy-modify-swap lock-free algorithm, the invariant portions of the structure need not be copied. For example, a symbol table using invariant strings for its leaves need not copy those strings when copying the symbol table.
  • Invariant data can be shared among multiple threads without need for locking, synchronization, etc. Data races and sequential consistency problems with mutable data are just not possible with invariant data.
  • Invariant data can be automatically partitioned out by the development tools into ROM for use in embedded systems.
  • In virtual memory systems, invariant data can be placed into pages marked with the read-only hardware bit, to provide protection against memory corruption and detection of such bugs.

I believe these add up to a compelling case for invariant support, what do you think?


(*) Functional programming (FP) languages inherently require invariance, but these languages have not really entered mainstream use. Perhaps this is because FP languages take a good thing too far and make everything invariant. FP has sometimes crept in the back door, for example C++ templates were discovered to actually be a functional programming language.



Comments (9)Add Comment
Please do not use the term "invariant" for this concept
written by Andrew Koenig, August 01, 2008
There are two good reasons not to use "invariant" for the concept you describe.

First, the term is already widely used for a different concept, namely the notion of a condition that will always be true at a particular point in the code. An invariant is an important intellectual tool in correctness proofs.

Second, there is another term that is already in widespread use in other programming languages for this concept, namely "immutable." For example, strings in Python are immutable in the sense that there is no operation you can perform that will change the value of a string object. Operations that appear on the surface to be doing so, such as +=, actually create a new object and rebind the relevant variable to that object.
Invariant meaning
written by Walter Bright, August 01, 2008
I appreciate your points, and D even does have a special syntax called a "class invariant" which is as you describe. But at this point, it's water under the bridge, and there's a lot of momentum behind 'invariant'.
Immutable data structures
written by Frank Hileman, August 01, 2008
Excellent points about immutable data structures. For me, the most important advantantage of immutable structures is a decrease in debugging time. Modifying data increases the "state space" of a component or application, and I think that is the main source of errors -- unexpected states.

Please change the terminology to "immutable" if possible. It is confusing. "Invariant" has a well established meaning.
invariant name
written by Walter Bright, August 01, 2008
While invariant does have another meaning, I don't think there is confusion between "class invariant" and "invariant data."
Invariant name
written by Sean Kelly, August 02, 2008
But the current approach is overloading a keyword for two entirely different uses. This even necessitated a syntax change by requiring the addition of empty parens on class invariant blocks, making D 1.0 and D 2.0 code not cross-compatible. And D 2.0 is still touted as an experimental branch of the language, which suggests to me that syntax changes are to be expected as concepts are refined.
Invariant name
written by Joel Rosdahl, August 02, 2008
I think, too, that "invariant" has other connotations than what's wanted in this context. "Immutable" is way better.

Walter: What problem do you see with "immutable"?
Application of invariant
written by Sean Kelly, August 03, 2008
Keywords aside, while I agree that the concept of invariance is a good one, I remain somewhat concerned that the syntax with which invariance is applied in D may result in unmaintainable code.

In C++ containers, for example, it's not uncommon to need both const and non-const versions of class methods that vary only by return type. In D 2.0, we not only have const methods but invariant methods as well, thus further increasing the maintenance cost of such classes.

Unlike C++ however, it os both possible and even desirable to overload a function for const and invariant parameters in order to achieve any optimization benefit from the latter when possible. If I have a collection of algorithms that operate on arrays, for example, it would be natural for me to declare these array parameters as const to provide a contract to the user. However, if the user passes actual invariant data then I would like the compiler to optimize my algorithm appropriately, thus requiring additional permutations of the function for each const/invariant parameter. This isn't an issue for template functions if the template system overloads on const-ness, but traditional functions are left with little recourse but explicit duplication, and not only two duplicates but potentially 2N (N being the number of const parameters for the function). As a library programmer who will have to do this by necessity, I am very much concerned that the potential, theoretical gain in optimization and automatic parallelization will not at all be worth the maintenance cost.
Um
written by Sean Kelly, August 03, 2008
Make that 2^N.
Experience with immutable data in Python
written by Andrew Koenig, August 03, 2008
Python has an important part of the core language semantics hooked into immutability -- only immutable data can serve as a key in a hash-table-based data structure. I'm sure you see why: If you can change the value of a key without moving it, the data structure won't be able to find it because its location is still tied to its old has code.

So, because strings are immutable, you can use them as keys. Tuples are immutable if all their elements are, so you can use a tuple of strings (or mixed strings and numbers) as keys as well.

However, sets are mutable -- otherwise, how can you add an element to one. That means that you can't use a set as a key, which means that you can't have a set of sets. The solution: A new data type named frozenset, which is like a set but is immutable. So if you want a set of sets, you have to convert each set to a frozenset (which copies the data structure), and then you can use it as a set element.

This somewhat clumsy circumlocution appears necessary, but perhaps it suggests that functional and imperative paradigms don't always mix all that well.

Write comment
You must be logged in to a comment. Please register if you do not have an account yet.

busy

Get your FREE Subscription to Dr. Dobb’s Digest today!

Dobbs Code Talk Quick Poll

This time next year, your most important operating system (host and/or target) will be:

Look Who's Code Talking


Robert Jacques
City: Baltimore

James Monschke
City: San Francisco

David Chapin
City: Mill Creek

Iqbal Khan
City: San Ramon

Simon Carless
City: San Francisco

Dobbs Code Talk Tags

.NET abstraction Ada Adobe Agile Ajax algorithm Algorithmic complexity ALM Analogical reasoning Android Anecdotes Apple Application Development AppStore Architecture and Design ARM Artificial Intelligence Artificial Life Assembler Programming Audio files AVX AWK Banking Bazaar Best Practices Blender Books Brain computer interfacing Build C C Programming C Sharp Cartoon Category theory Cellular automata Clojure Cloud Computing Cobol Cocoa Coder Of The Month Cognition as compression Collaboration Common Process/Frameworks Compilers Computational humour Computational narrative Computational politics Computer Science Computers in art computing pioneers concurrency Conferences Consciousness research Contest Contest140 contests CPlusPlus crime CSharp D Programming Data Centers Databases Debugging Delphi Deployment design Design Patterns Digital Signal Processing Distributed Django Documentation DSL dynamic language Eclipse EDA education Emacs Embedded Systems Encryption engineering Erlang Etymology Excel exception handling Facebook Financial computing Five Questions Flash Flash Lite Flex Forth Fortran Fraud FreeBSD Fun Functional Programming gadgets Games Gender Git gnuplot Go Google Graphics GUI hardware Heron High School High-Performance Computing History Holographic reduced representations HTML5 Humanity Humour Hungarian Notation Identity Inkscape Innovation Intel Interview iPhone J2EE Java JavaFX JavaOne JavaScript language engineering Legal lex LINQ Linux Lisp Literate Programming Logic Programming m4 Mainframes Make Mathematica Mercurial Mesh messaging Metaprogramming Microsoft MID Miscellaneous Musings ML Mobile Software Mobility modeling modular programming multicore Music MVC myblog Natural Language Processing Networking Neural networks newspeak Nokia numerical computing Object Rexx ObjectiveC Office Office 2007 Online spreadsheets OOP Open Source Openaccess publishing OpenBSD OpenSolaris Operating Systems Optimization Oracle Pair Programming Parallelism Concurrency Parsing Pascal Patents Patterns Performance Perl PHP Podcast Pop11 Poplog Privacy Processing Productivity Programming Language Implementation Programming Language One Programming language semantics Programming Languages Programming Style Project Management Prolog Psychology Public understanding of science puzzle Python QA Quantum Computing Quotes Rails Realtime recls Requirements Research practice REST Review RIA rich internet applications Robotics Ruby SaaS Software as a service Scala Schadenfreude Science fiction Screencast Scripting SD Best Practices Search Security Semantic Web Silverlight Snobol SOA social Social Networks Society for the Study of Artificial Intelligence a Software Development Methodology and Management Songs and poems Spending Priorities Spreadsheets SQL Startups Statistics Storage String pattern matching Survey Teaching Testing The Business of Programming The Dobbs Challenge The Future Theory Topology Transhumanism Travel on the Job Twitter Types Unix Upgrade Usability Use Cases USENET User Experience User Interface Design Version Control video virtual machines Virtualization Visual Studio Visual Studio Sponsored Post WCF Web Development Windows Windows 7 Windows Live Wireless WOA WPF X Window System yacc

Subscribe to Dr. Dobbs Newsletter

Email:
Dr. Dobb's Update
Delivered twice a week, Dr. Dobb's Update provides unbiased and objective news, commentary and technical features spanning the entire software development marketplace.

Latest Comments

Jonathan's Last Day at Sun
For the 8 years I worked there, it was fantastic. I worked there under McNealy and I have undying admiration for the guy. I only knew Jonathan periphe...
Implementing Thread Local Storage on OS ...
Back in the day, I did a fair amount of work with PThreads. Wonderful design. Some quirks, but basically really, really nice. Although I wrote a lot ...
More Technonecrophilia with Snobol One-L...
Yeah, It's probably identical except for the (embedded) copy number, I would think. Once it became freely distributable, the copy I've been distribut...
More Technonecrophilia with Snobol One-L...
There's a spitbol-3.7-win.exe at http://code.google.com/p/spitbol/downloads/list . I found it via Dave Shield's blog page http://daveshields.wordpress...
Jonathan's Last Day at Sun
Sadness.

The Latest From Our Member Blogs

How To Select Trainees
Written by Joel Wiesen   
01/27/10
Hiring the right trainee can be harder than hiring a trained programmer.  One approach is described at my website: http://www.aprtestingservices.com/business/lpat/
 
Technical Job Interviews
Written by Keith Kerlan   
01/20/10
What is the best way to interview for software developer positions?  I've been on both sides of the job interviewing table, but have been on the interviewee side of some not too  great inter
 
Timers/timeouts in multi-threaded event-loops
Written by Christof Meerwald   
01/03/10
The traditional way to integrate timeout handling (or timers) in (single-threaded) event loops was to just pass the appropriate timeout value to the select/poll/epoll syscall. While this works fine
 
C vs C++
Written by Issam Lahlali   
12/04/09
I think that the debate "C vs C++" will end when the two langages died, and each one have its advantages and inconvenients, the choice of one instead of another depend on the application c
 
Great Jobs at CISCO
Written by Brent Rogers   
11/30/09
Hello! I am a recruiter at CISCO. We have a number of great jobopportunities at CISCO right now. Please take a look at the job links listedbelow and please send me an updated resume if you are interes
 
OK Labs, ST-Ericsson, and the Mobile/Wireless Ecosystem
Written by Steve Subar   
11/17/09
Two weeks ago, OK Labs and ST-Ericsson announced the selection of OK Labs as ST-Ericsson's mobile virtualization partner. To earn this coveted position, OK Labs prevailed in a rigorous evaluation
 
C++ Ninjas Needed in Santa Clara, California
Written by Brent Rogers   
09/30/09
Hello! I am a recruiter at CISCO. Our PostPath teamin Santa Clara is building a new Email SaaS business at CISCO. We are looking forsenior developers with Zimbra expertise to help us accomplish this t
 
Fighting Fragmentation with Mobile Virtualization
Written by Steve Subar   
09/21/09
Last week Motorola and T-Mobile announced the launch of a new and innovative Android-based smartphone, the Cliq. This attractive, feature-rich slider handset happens to build on a chipset and firmware
 
Insights into Router Design: Unit Testing of Networking Protocols
Written by Rajesh Kumar Venkateswaran   
09/07/09
  Unit testing is a software validation methodology through which a programmer tests individual modules or units of source code. If the programmer has been responsible for developing a networ
 
Insights into Router Design: Implementation of Networking Protocols
Written by Rajesh Kumar Venkateswaran   
09/06/09
  Modern data networking consists of a large number of networking protocols, each of which has its own domain of applicability. Some run on end stations (also called hosts), some on enterp
 
Insights and Innovations in Networking
Written by Rajesh Kumar Venkateswaran   
09/05/09
Networking devices such as routers and switches have evolved quite a bit over the past years, both in the service provider network and in the enterprise. It is a challenge to build these devices, bo
 
reddit threads community
Written by Christof Meerwald   
08/30/09
I have just started a threads community over at reddit to cover topics such as multithreading, concurrency and parallel programming. Feel free to join if you are interested. -- cmeerw.org 
 

The Latest From Dr. Dobbs

DDJ