<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.wordaligned.org/~d/styles/itemcontent.css"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
<channel>
<title>Word Aligned</title>
<link>http://wordaligned.org</link>
<description>tales from the code face</description>
<dc:creator>tag@wordaligned.org</dc:creator>
<language>en-gb</language>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.wordaligned.org/wordaligned" /><feedburner:info uri="wordaligned" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
<title>Knuth visited, Brains Limited</title>
<description>&lt;p&gt;Lucky me, I &lt;strong&gt;did&lt;/strong&gt; get a ticket to see Professor Donald Knuth deliver the &lt;a href="http://www.bcs.org/content/ConWebDoc/38048"&gt;2011 Turing Lecture&lt;/a&gt; at Cardiff University. He started out by saying he&amp;#8217;d been wanting to come here ever since seeing Yellow Submarine in the 60s because of some &lt;a href="http://www.imdb.com/title/tt0063823/quotes"&gt;dialogue&lt;/a&gt; which goes something like:
&lt;/p&gt;
&lt;p&gt;&amp;#8220;What do you call a big school of whales?&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;A University of Wales.&amp;#8221;
&lt;/p&gt;
&lt;p&gt;What a fine country Wales is for a computer scientist, he continued, you have a place called &lt;a href="http://maps.google.co.uk/maps?f=q&amp;amp;source=s_q&amp;amp;hl=en&amp;amp;geocode=&amp;amp;q=login&amp;amp;aq=&amp;amp;sll=53.800651,-4.064941&amp;amp;sspn=17.130747,46.538086&amp;amp;ie=UTF8&amp;amp;hq=&amp;amp;hnear=Login,+Whitland,+United+Kingdom&amp;amp;ll=51.837475,-4.745407&amp;amp;spn=0.279173,0.727158&amp;amp;z=11"&gt;Login&lt;/a&gt;. A village in Pembrokeshire, someone in the audience chipped in, it&amp;#8217;s very beautiful. Wonderful, all the more reason to visit!
&lt;/p&gt;
&lt;p&gt;Openings over, the lecture turned into a question and answer session. If I&amp;#8217;m honest, I&amp;#8217;d have preferred something more formal, but having just completed &lt;a href="http://www-cs-faculty.stanford.edu/~uno/taocp.html"&gt;TAOCP4A&lt;/a&gt;, all 900 pages of it, &lt;strong&gt;and&lt;/strong&gt; Volume 8 of his collected papers, the &amp;#8220;dessert&amp;#8221; edition of the series on &lt;a href="http://www-cs-faculty.stanford.edu/~uno/fg.html"&gt;fun and games&lt;/a&gt;, Professor Knuth was in a holiday mood, and it was a delight to hear him field questions on the future of computer science, P = NP, literate programming, analog computers, brute force vs. elegance, and whether the increase in computing power diminishes the art of programming.
&lt;/p&gt;
&lt;p&gt;Yes, computers grow exponentially more powerful &amp;#8212; but human appetites grow yet more quickly. And even if you have all of that power, can you be sure the computers are getting the right answers? This afternoon, Professor Knuth said, while strolling through Cardiff, he&amp;#8217;d seen the famous brewery &lt;strong&gt;Brains &amp;amp; Co. Ltd.&lt;/strong&gt; and thought, yes, that&amp;#8217;s about right. Brains limited.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/arkadyevna/253441813/" title="BRAINS BRAINS BRAINS by Arkadyevna, on Flickr"&gt;&lt;img src="http://farm1.static.flickr.com/106/253441813_b1673d90a8.jpg" width="500" height="334" alt="BRAINS BRAINS BRAINS" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;My thanks to Arkadyevna for permission to use her wonderful &lt;a href="http://www.flickr.com/photos/arkadyevna/253441813/"&gt;photo&lt;/a&gt;.
&lt;/p&gt;</description>
<dc:date>2011-02-03</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/knuth-visited-brains-limited</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/p_UycUgCXL0/knuth-visited-brains-limited</link>
<category>Knuth</category>
<feedburner:origLink>http://wordaligned.org/articles/knuth-visited-brains-limited</feedburner:origLink></item>

<item>
<title>Set.insert or set.add?</title>
<description>&lt;h2&gt;Get set, go!&lt;/h2&gt;
&lt;p&gt;Suppose you have an element &lt;code&gt;e&lt;/code&gt; to put in a set &lt;code&gt;S&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;Should you:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;S.add(e)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;or:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;S.insert(e)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;?
&lt;/p&gt;
&lt;p&gt;It depends on which language you&amp;#8217;re using. I use C++ and Python and I usually get it wrong.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; S.insert(e)
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
AttributeError: 'set' object has no attribute 'insert'

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Try again!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;error: 'class std::set&amp;lt;int, std::less&amp;lt;int&amp;gt;, std::allocator&amp;lt;int&amp;gt; &amp;gt;' 
has no member named 'add'

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Maybe my &lt;a href="http://wordaligned.org/articles/accidental-emacs" title="Emacs of course!"&gt;IDE&lt;/a&gt; should auto-complete the correct member function but it doesn&amp;#8217;t, or at least I haven&amp;#8217;t configured it to, so instead I&amp;#8217;ve worked out how to remember.
&lt;/p&gt;
&lt;p&gt;Now, neither C++ nor Python pins down how a set should be implemented &amp;#8212; read the language standard and reference manual respectively and all you&amp;#8217;ll get is an interface and some hints. Read between the lines of these references, though, or study &lt;a href="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a01064_source.html" title="G++ stl_tree.h, on which std::sets and std::multisets are based"&gt;the&lt;/a&gt; &lt;a href="http://svn.python.org/view/python/trunk/Objects/setobject.c?view=markup" title="setobject.c, from CPython"&gt;implementations&lt;/a&gt;, and you&amp;#8217;ll soon realise a Python set is an unordered container designed for fast membership, union, intersection, and differencing operations &amp;#8212; much like the mathematical sets I learned about at school &amp;#8212; whereas a C++ set is an ordered container, featuring logarithmic access times and persistent iterators. 
&lt;/p&gt;
&lt;p&gt;Think: C++ set &amp;asymp; binary tree; Python set &amp;asymp; hashed array.
&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s apparent which method is correct for which language now. To put something into a binary tree you must recurse down the tree and find where to &lt;strong&gt;insert&lt;/strong&gt; it. Hence &lt;code&gt;std::set::insert()&lt;/code&gt; is correct C++. To put something into a hashed array you hash it and &lt;strong&gt;add&lt;/strong&gt; it right there. Hence &lt;code&gt;set.add()&lt;/code&gt; is proper Python.
&lt;/p&gt;
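&lt;p&gt;As a quick check of the Python half of that rule, here is a minimal sketch to try at the interpreter. The element &lt;code&gt;"e"&lt;/code&gt; is just a placeholder value chosen for illustration.
&lt;/p&gt;

```python
# A Python set hashes each element and adds it; no insertion point is involved.
S = set()
S.add("e")
S.add("e")  # adding the same element twice leaves a single copy

print(S)                     # {'e'}
print(hasattr(S, "add"))     # True
print(hasattr(S, "insert"))  # False, hence the AttributeError shown earlier
```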

&lt;h2&gt;How long is a string?&lt;/h2&gt;
&lt;p&gt;I&amp;#8217;m suggesting programmers should know at least some of what goes on in their standard language library implementations. Appreciating an API isn&amp;#8217;t always enough. You &lt;strong&gt;insert&lt;/strong&gt; into trees and &lt;strong&gt;add&lt;/strong&gt; to hashes: so if your set is a tree, call &lt;code&gt;S.insert()&lt;/code&gt;, and if it&amp;#8217;s a hash, &lt;code&gt;S.add()&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;Such logical arguments don&amp;#8217;t always deliver.
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Question:&lt;/strong&gt; Suppose now that &lt;code&gt;S&lt;/code&gt; is a string and you&amp;#8217;re after its length. Should you use &lt;code&gt;S.length()&lt;/code&gt; or &lt;code&gt;S.size()&lt;/code&gt;?
&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Answer:&lt;/strong&gt; Neither or both.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/the-g-uk/3867089043/" title="string [how long?] by the|G|, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2538/3867089043_2f2b3f5fa6.jpg" width="485" height="149" alt="string [how long?]" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;In Python a string is a standard sequence and as for all other sequences &lt;code&gt;len(S)&lt;/code&gt; does the trick. In C++ a string is a standard container and as for all other containers &lt;code&gt;S.size()&lt;/code&gt; returns the number of elements; &lt;strong&gt;but&lt;/strong&gt;, being &lt;code&gt;std::string&lt;/code&gt;, &lt;code&gt;S.length()&lt;/code&gt; does too.
&lt;/p&gt;
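&lt;p&gt;On the Python side, a small sketch shows &lt;code&gt;len()&lt;/code&gt; doing the same job for several built-in types. The sample values are made up for illustration.
&lt;/p&gt;

```python
# len() works uniformly across the built-in sequences.
S = "how long?"
print(len(S))          # 9
print(len([1, 2, 3]))  # 3
print(len((4, 5)))     # 2

# A Python string has neither a length() nor a size() method.
print(hasattr(S, "length"), hasattr(S, "size"))  # False False
```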
&lt;p&gt;Oh, and the next revision of C++ features an &lt;code&gt;unordered_set&lt;/code&gt; (available now as &lt;code&gt;std::tr1::unordered_set&lt;/code&gt;) which is a hashed container. I think &lt;code&gt;unordered_set&lt;/code&gt; is a poor name for something which models a set better than &lt;code&gt;std::set&lt;/code&gt; does but that&amp;#8217;s the price it pays for coming late to the party. And you don&amp;#8217;t &lt;code&gt;std::unordered_set::add&lt;/code&gt; elements to it, you &lt;code&gt;std::unordered_set::insert&lt;/code&gt; them.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;My thanks to &lt;a href="http://www.flickr.com/photos/the-g-uk"&gt;the|G|&amp;trade;&lt;/a&gt; for permission to use his &lt;a href="http://www.flickr.com/photos/the-g-uk/3867089043" title="string [how long?] on Flickr"&gt;string&lt;/a&gt;.
&lt;/p&gt;</description>
<dc:date>2010-11-17</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/setinsert-or-setadd</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/pGPcHZlX1NY/setinsert-or-setadd</link>
<category>C++</category>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/setinsert-or-setadd</feedburner:origLink></item>

<item>
<title>Define pedantic</title>
<description>&lt;p&gt;My dictionary &lt;span id="definition"&gt;defines a pedant&lt;/span&gt; as:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;strong&gt;pedant&lt;/strong&gt; &lt;em&gt;n.&lt;/em&gt; &lt;strong&gt;1.&lt;/strong&gt; A person who relies too much on academic learning or who is concerned chiefly with academic detail.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Apparently the word derives from the Italian, &lt;em&gt;pedante&lt;/em&gt;, meaning teacher. During my career as a computer programmer a number of my colleagues have been surprisingly pedantic about the proper use of English.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&amp;#8220;I refuse to join a supermarket queue marked &lt;strong&gt;10 items or less&lt;/strong&gt;.&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;I do wish people would stop using &lt;strong&gt;target&lt;/strong&gt; as a verb. You &lt;strong&gt;aim&lt;/strong&gt; at a target, you don&amp;#8217;t &lt;strong&gt;target&lt;/strong&gt; it.&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;I am an exceptionally &lt;strong&gt;skilled grammarian&lt;/strong&gt; in English &amp;#8230; Take that rule and shove it!&amp;#8221;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Some of this fussiness may well be a reaction against &lt;a href="http://news.bbc.co.uk/1/hi/7457287.stm"&gt;corporate double-speak&lt;/a&gt;. Still, I wouldn&amp;#8217;t have expected programmers to be 1) so particular and 2) so certain they&amp;#8217;re right. Maybe this attitude comes from all those years of writing code. Programming languages are strict about what they&amp;#8217;ll accept: after all, they have standards!
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/5142421926/" title="The C++ Standard vs Perl in a Nutshell by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm2.static.flickr.com/1072/5142421926_6b76c52749_m.jpg" width="240" height="183" alt="The C++ Standard vs Perl in a Nutshell" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;Some programming languages are more pedantic than others. Paul Graham memorably characterises C++ as a pernickety aunt&lt;a id="fn1link" href="http://wordaligned.org/articles/define-pedantic#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. By contrast, Perl won&amp;#8217;t pick nits. Designed by Larry Wall to be his software &lt;a href="http://www.wall.org/~larry/pm.html" title="Perl, the first postmodern computer language. Larry Wall"&gt;butler&lt;/a&gt;, Perl interprets your ill-expressed wishes with discretion and assurance. Hence you can end up with a Perl program which gets on with its job but which no-one fully understands.
&lt;/p&gt;
&lt;p&gt;Pedants, by definition, take things too far, but pedantry in programming isn&amp;#8217;t all bad. GCC has a useful &lt;a href="http://gcc.gnu.org/onlinedocs/gcc/Standards.html"&gt;&lt;code&gt;-pedantic&lt;/code&gt;&lt;/a&gt; flag. It helps you write portable programs. Perl has a &lt;a href="http://perldoc.perl.org/strict.html"&gt;&lt;code&gt;use strict&lt;/code&gt;&lt;/a&gt; pragma which recasts the butler as a personal trainer. 
&lt;/p&gt;
&lt;p&gt;When it comes to correctness, attention to detail matters. What if this input parameter goes negative? Will that file be closed when an exception is thrown? Can your algorithm handle an empty container?
&lt;/p&gt;
&lt;p&gt;I recently fixed a defect in some (of my own) code which assumed conformant input. When faced with garbage-in this code failed even to generate garbage-out, instead getting caught in an infinite loop. Should a pedantic program insist on correct inputs or should it consider how to handle all possible inputs? &lt;a href="http://wordaligned.org/articles/define-pedantic#definition"&gt;Define pedantic&lt;/a&gt;.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/define-pedantic#fn1link"&gt;[1]&lt;/a&gt;: It turns out my memory is at fault here. When I checked the reference I discovered Paul Graham makes no explicit mention of C++. 
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;We need a language that lets us scribble and smudge and smear, not a language where you have to sit with a teacup of types balanced on your knee and make polite conversation with a strict old aunt of a compiler. &amp;#8212; Paul Graham, &lt;a href="http://www.paulgraham.com/hp.html" title="Hackers and Painters"&gt;Hackers and Painters&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;You might have guessed which language he&amp;#8217;s promoting for scribbling but there&amp;#8217;s no mention of Lisp either in this particular essay. In fact only one programming language earns a name-check. I&amp;#8217;ll let you find out which one for yourselves.
&lt;/p&gt;</description>
<dc:date>2010-11-02</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/define-pedantic</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/ZwpJK28-YBM/define-pedantic</link>
<category>C++</category>
<category>Perl</category>
<feedburner:origLink>http://wordaligned.org/articles/define-pedantic</feedburner:origLink></item>

<item>
<title>Hiding iterator boilerplate behind a Boost facade</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocfilling-in-missing-methods-python" name="toc0" id="toc0"&gt;Filling in missing methods. Python&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocfilling-in-missing-methods-c" name="toc1" id="toc1"&gt;Filling in missing methods. C++&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocenter-boost-iterators" name="toc2" id="toc2"&gt;Enter Boost iterators&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocusing-boostiteratorfacade" name="toc3" id="toc3"&gt;Using boost::iterator_facade&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toctemplates-and-traits" name="toc4" id="toc4"&gt;Templates and Traits&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocconstructors-destructors-and-operators" name="toc5" id="toc5"&gt;Constructors, destructors and operators&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocwrinkles" name="toc6" id="toc6"&gt;Wrinkles&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocless-code-more-software" name="toc7" id="toc7"&gt;Less code, more software&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#tocperformance" name="toc8" id="toc8"&gt;Performance&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;p&gt;&lt;a href="http://www.flickr.com/photos/davehamster/2336911145/" title="SS Great Britain by Dave Hamster, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2379/2336911145_5275811ec0_m.jpg" width="240" height="160" alt="SS Great Britain"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc0" name="tocfilling-in-missing-methods-python" id="tocfilling-in-missing-methods-python"&gt;Filling in missing methods. Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s another wholesome &lt;a href="http://code.activestate.com/recipes/576685" title="Total ordering class decorator, by Raymond Hettinger"&gt;recipe&lt;/a&gt; served up by Raymond Hettinger.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Total ordering class decorator&lt;/div&gt;

&lt;pre class="prettyprint"&gt;def total_ordering(cls):
    'Class decorator that fills-in missing ordering methods'    
    convert = {
        '__lt__': [('__gt__', lambda self, other: other &amp;lt; self),
                   ('__le__', lambda self, other: not other &amp;lt; self),
                   ('__ge__', lambda self, other: not self &amp;lt; other)],
        '__le__': [('__ge__', lambda self, other: other &amp;lt;= self),
                   ('__lt__', lambda self, other: not other &amp;lt;= self),
                   ('__gt__', lambda self, other: not self &amp;lt;= other)],
        '__gt__': [('__lt__', lambda self, other: other &amp;gt; self),
                   ('__ge__', lambda self, other: not other &amp;gt; self),
                   ('__le__', lambda self, other: not self &amp;gt; other)],
        '__ge__': [('__le__', lambda self, other: other &amp;gt;= self),
                   ('__gt__', lambda self, other: not other &amp;gt;= self),
                   ('__lt__', lambda self, other: not self &amp;gt;= other)]
    }
    roots = set(dir(cls)) &amp;amp; set(convert)
    assert roots, 'must define at least one ordering operation: &amp;lt; &amp;gt; &amp;lt;= &amp;gt;='
    root = max(roots)       # prefer __lt__ to __le__ to __gt__ to __ge__
    for opname, opfunc in convert[root]:
        if opname not in roots:
            opfunc.__name__ = opname
            opfunc.__doc__ = getattr(int, opname).__doc__
            setattr(cls, opname, opfunc)
    return cls

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you have a class, &lt;code&gt;X&lt;/code&gt;, which implements one or more of the ordering operators, &lt;code&gt;&amp;lt;, &amp;lt;=, &amp;gt;, &amp;gt;=&lt;/code&gt;, then &lt;code&gt;total_ordering(X)&lt;/code&gt; adapts and returns the class with the missing operators filled in. Alternatively, use standard decorator syntax to adapt a class. If we apply &lt;code&gt;@total_ordering&lt;/code&gt; to a &lt;code&gt;Point&lt;/code&gt; class
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __lt__(self, other):
        return (self.x, self.y) &amp;lt; (other.x, other.y)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;then we can compare points however we like
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; p = Point(1,2)
&amp;gt;&amp;gt;&amp;gt; q = Point(1,3)
&amp;gt;&amp;gt;&amp;gt; p &amp;lt; q, p &amp;gt; q, p &amp;gt;= q, p &amp;lt;= q
(True, False, False, True)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s a nice touch: the freshly-baked methods even have documentation!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; help(Point)
Help on class Point in module __main__:

class Point
 |  Methods defined here:
 |  
 |  __ge__(self, other)
 |      x.__ge__(y) &amp;lt;==&amp;gt; x&amp;gt;=y
 |  
 |  __gt__(self, other)
 |      x.__gt__(y) &amp;lt;==&amp;gt; x&amp;gt;y
 |  
 |  __init__(self, x, y)
 |  
 |  __le__(self, other)
 |      x.__le__(y) &amp;lt;==&amp;gt; x&amp;lt;=y
 |  
 |  __lt__(self, other)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Writing class decorators may not be the first thing a new Python programmer attempts, but once you&amp;#8217;ve discovered the relationship between Python&amp;#8217;s special method names and the more familiar operator symbols, I think this recipe is remarkably straightforward.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;convert = {
    '__lt__': [('__gt__', lambda self, other: other &amp;lt; self),
               ('__le__', lambda self, other: not other &amp;lt; self),
               ('__ge__', lambda self, other: not self &amp;lt; other)],
    '__le__': [('__ge__', lambda self, other: other &amp;lt;= self),
               ('__lt__', lambda self, other: not other &amp;lt;= self),
               ('__gt__', lambda self, other: not self &amp;lt;= other)],
    '__gt__': [('__lt__', lambda self, other: other &amp;gt; self),
               ('__ge__', lambda self, other: not other &amp;gt; self),
               ('__le__', lambda self, other: not self &amp;gt; other)],
    '__ge__': [('__le__', lambda self, other: other &amp;gt;= self),
               ('__gt__', lambda self, other: not other &amp;gt;= self),
               ('__lt__', lambda self, other: not self &amp;gt;= other)]
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Before moving on to something more challenging, look again at one of the recipe&amp;#8217;s key ingredients, the &lt;code&gt;convert&lt;/code&gt; dict, which helps create the missing ordering functions from existing ones. As you can see, there&amp;#8217;s much repetition here, and plenty of opportunities for cut-and-paste errors.
&lt;/p&gt;
&lt;p&gt;This block of code is an example of what programmers term &lt;a href="http://en.wikipedia.org/wiki/Boilerplate_(text)#Boilerplate_code"&gt;boilerplate&lt;/a&gt;. By using the total ordering decorator, we can avoid boilerplating our own code.&lt;a id="fn1link" href="http://wordaligned.org/articles/boost-iterator-facade#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;
&lt;/p&gt;
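&lt;p&gt;Python&amp;#8217;s standard library also carries a decorator of the same name, &lt;code&gt;functools.total_ordering&lt;/code&gt;. The sketch below assumes that library decorator rather than the recipe listed above, so the generated methods may differ in detail. It spells the comparisons with the function forms from the &lt;code&gt;operator&lt;/code&gt; module, and the &lt;code&gt;_key&lt;/code&gt; helper is a hypothetical name introduced just for this example.
&lt;/p&gt;

```python
import functools
import operator


@functools.total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def _key(self):
        # Hypothetical helper: the tuple used for every comparison.
        return (self.x, self.y)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        # operator.lt(a, b) is the function form of the less-than comparison.
        return operator.lt(self._key(), other._key())


p = Point(1, 2)
q = Point(1, 3)
# operator.gt, operator.ge and operator.le reach methods the decorator filled in.
results = (operator.lt(p, q), operator.gt(p, q),
           operator.ge(p, q), operator.le(p, q))
print(results)  # (True, False, False, True)
```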

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc1" name="tocfilling-in-missing-methods-c" id="tocfilling-in-missing-methods-c"&gt;Filling in missing methods. C++&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Python is dynamic and self-aware, happy to expose its internals for this kind of tinkering.  It takes real wizardry to achieve similar results with a &lt;a href="http://sites.google.com/site/steveyegge2/tour-de-babel" title="C++ is the dumbest language on earth ... doesn't know about itself. It is not introspective"&gt;less flexible language, such as C++&lt;/a&gt; &amp;#8212; but it can be done.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;In a &lt;a href="http://wordaligned.org/articles/binary-search-revisited"&gt;previous article&lt;/a&gt; we developed a random access file iterator in C++. At its heart, this iterator simply repositioned itself using file-seeks and dereferenced itself using file-reads. There wasn&amp;#8217;t much to it.
&lt;/p&gt;
&lt;p&gt;Unfortunately we had to fill out the iterator with the various members required to make it comply with the standard random access iterator requirements (which was the whole point, since we wanted something we could use with standard binary search algorithms).
&lt;/p&gt;
&lt;p&gt;We had to expose standard typedefs:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;typedef std::random_access_iterator_tag iterator_category;
typedef item value_type;
typedef std::streamoff difference_type;
typedef item * pointer;
typedef item &amp;amp; reference;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Worse, we had to implement a full set of comparison, iteration, step and access functions. Please, page down past the following code block! I only include it here so you can see how long it goes on for.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Iterator boilerplate&lt;/div&gt;

&lt;pre class="prettyprint"&gt;public: // Comparison
    bool operator&amp;lt;(iter const &amp;amp; other) const
    {
        return pos &amp;lt; other.pos;
    }
    
    bool operator&amp;gt;(iter const &amp;amp; other) const
    {
        return pos &amp;gt; other.pos;
    }
    
    bool operator==(iter const &amp;amp; other) const
    {
        return pos == other.pos;
    }
    
    bool operator!=(iter const &amp;amp; other) const
    {
        return pos != other.pos;
    }
    
public: // Iteration
    iter &amp;amp; operator++()
    {
        return *this += 1;
    }
    
    iter &amp;amp; operator--()
    {
        return *this -= 1;
    }
    
    iter operator++(int)
    {
        iter tmp(*this);
        ++(*this);
        return tmp;
    }
    
    iter operator--(int)
    {
        iter tmp(*this);
        --(*this);
        return tmp;
    }
    
public: // Step
    iter &amp;amp; operator+=(difference_type n)
    {
        advance(n);
        return *this;
    }
    
    iter &amp;amp; operator-=(difference_type n)
    {
        advance(-n);
        return *this;
    }
    
    iter operator+(difference_type n)
    {
        iter result(*this);
        return result += n;
    }
    
    iter operator-(difference_type n)
    {
        iter result(*this);
        return result -= n;
    }
    
public: // Distance
    difference_type operator-(iter &amp;amp; other)
    {
        return pos - other.pos;
    }
    
public: // Access
    value_type operator*()
    {
        return (*this)[0];
    }

    value_type operator[](difference_type n)
    {
        std::streampos restore = getpos();
        advance(n);
        value_type const result = read();
        setpos(restore);
        return result;
    }
    ....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;How tiresome! Most of these member functions are directly and unsurprisingly implemented in a standard way. It would be nice if we could write (and test!) what we actually needed to and have a decorator fill in the rest.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/chr1sp/3997724676/" title="Library - Ephesus by Chris. P, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2554/3997724676_bf73106637.jpg" width="500" height="334" alt="Library - Ephesus"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc2" name="tocenter-boost-iterators" id="tocenter-boost-iterators"&gt;Enter Boost iterators&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Actually, we can! I&amp;#8217;m grateful to proggitor dzorz for &lt;a href="http://www.reddit.com/r/programming/comments/c8fsk/binary_search_revisited/c0quxr0"&gt;telling me how&lt;/a&gt;.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;A nicer solution would use boost::iterator_facade and just implement dereference, equal, increment, decrement, advance and distance_to.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Like many programmers I have mixed feelings about C++ &amp;#8212; when it&amp;#8217;s good it&amp;#8217;s very very good, but when it&amp;#8217;s bad it&amp;#8217;s horrid &amp;#8212; and these feelings are only amplified by the &lt;a href="http://www.boost.org" title="Boost library home page"&gt;Boost&lt;/a&gt; library. Boost is superb, so long as you stick to the good parts.
&lt;/p&gt;
&lt;p&gt;So which parts are good? It depends. On you, who you work with, and the platforms you&amp;#8217;re working on.
&lt;/p&gt;
&lt;p&gt;In my previous article I used an ingenious iterator adaptor from the &lt;a href="http://www.boost.org/doc/libs/release/libs/spirit/index.html"&gt;Boost.Spirit&lt;/a&gt; parser library to disastrous effect. If only I&amp;#8217;d looked a little more carefully I&amp;#8217;d have discovered something more useful in a more obvious place. &lt;a href="http://www.boost.org/doc/libs/release/libs/iterator/"&gt;Boost.Iterator&lt;/a&gt; could have helped.
&lt;/p&gt;
&lt;p&gt;As dzorz points out, &lt;code&gt;boost::iterator_facade&lt;/code&gt; can work with any C++ iterable. Implement whatever subset of 
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     dereference
 &lt;/li&gt;

 &lt;li&gt;
     equal
 &lt;/li&gt;

 &lt;li&gt;
     increment 
 &lt;/li&gt;

 &lt;li&gt;
     decrement
 &lt;/li&gt;

 &lt;li&gt;
     advance
 &lt;/li&gt;

 &lt;li&gt;
     distance_to
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;is appropriate and &lt;code&gt;iterator_facade&lt;/code&gt; will fill in the boilerplate required to standardise your iterator.
&lt;/p&gt;
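&lt;p&gt;A loose Python analogy, and emphatically not the Boost API: the abstract base class &lt;code&gt;collections.abc.Sequence&lt;/code&gt; asks for two core methods and supplies the rest as mixins, much as the facade does for an iterator. &lt;code&gt;Countdown&lt;/code&gt; is a hypothetical class invented for this sketch.
&lt;/p&gt;

```python
from collections.abc import Sequence


class Countdown(Sequence):
    """Hypothetical sequence counting down from start to 1."""

    def __init__(self, start):
        self.start = start

    # The two core operations that Sequence requires.
    def __len__(self):
        return self.start

    def __getitem__(self, index):
        if index not in range(len(self)):
            raise IndexError(index)
        return self.start - index


c = Countdown(3)
# Iteration, reversal, membership and index() all come from the base class.
print(list(c))            # [3, 2, 1]
print(list(reversed(c)))  # [1, 2, 3]
print(2 in c)             # True
print(c.index(1))         # 2
```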
&lt;p&gt;In our case, we&amp;#8217;ll need the full set. That&amp;#8217;s because we&amp;#8217;re after a random access iterator. Other iterators need rather less. Here&amp;#8217;s a &lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/iterator_facade.html#iterator-facade-requirements"&gt;table&lt;/a&gt; showing the relationship between core operations and iterator concepts.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/iterator_facade.html#iterator-facade-requirements"&gt;&lt;img src="http://wordaligned.org/images/iterator-facade.png" alt="iterator_facade Core Operations"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc3" name="tocusing-boostiteratorfacade" id="tocusing-boostiteratorfacade"&gt;Using boost::iterator_facade&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The &lt;a href="http://www.boost.org/doc/libs/release/libs/iterator/"&gt;Boost.Iterator&lt;/a&gt; documentation is well-written but daunting. Read it from top to bottom and you&amp;#8217;ll get:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     rationale and theory
 &lt;/li&gt;

 &lt;li&gt;
     plans for standardisation (which don&amp;#8217;t seem correct &lt;a id="fn2link" href="http://wordaligned.org/articles/boost-iterator-facade#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;)
 &lt;/li&gt;

 &lt;li&gt;
     &lt;strong&gt;usage notes&lt;/strong&gt;
 &lt;/li&gt;

 &lt;li&gt;
     some subtle points on the implementation and its predecessor
 &lt;/li&gt;

 &lt;li&gt;
     a namecheck for the curiously recurring template pattern
 &lt;/li&gt;

 &lt;li&gt;
     a fat reference section detailing the boilerplate which this library allows you to forget
 &lt;/li&gt;

 &lt;li&gt;
     a &lt;strong&gt;tutorial&lt;/strong&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you&amp;#8217;re tempted to skip to the end of the page, you&amp;#8217;ll see this code block.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;boost/type_traits/is_convertible.hpp&amp;gt;
#include &amp;lt;boost/utility/enable_if.hpp&amp;gt;
  
  ....
  
private:
  struct enabler {};
  
public:
  template &amp;lt;class OtherValue&amp;gt;
  node_iter(
      node_iter&amp;lt;OtherValue&amp;gt; const&amp;amp; other
    , typename boost::enable_if&amp;lt;
          boost::is_convertible&amp;lt;OtherValue*,Value*&amp;gt;
        , enabler
      &amp;gt;::type = enabler()
  )
    : m_node(other.m_node) {}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;According to the surrounding documentation this is &amp;#8220;magic&amp;#8221;. I find it scary.
&lt;/p&gt;
&lt;p&gt;Luckily it turns out the library is straightforward to use. What you really want, as a newcomer, are the &lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/iterator_facade.html#usage"&gt;usage notes&lt;/a&gt; and the &lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/iterator/doc/iterator_facade.html#tutorial-example"&gt;tutorial example&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;The tutorial walks through the process of skinning a singly-linked list with a forwards iterator facade. This is a different use case to ours: the tutorial shows a basic class which implements what it should, and the facade allows it to be treated as a forwards iterator. In our case we&amp;#8217;ve already created a full-blown random access iterator. We can retrospectively apply &lt;code&gt;iterator_facade&lt;/code&gt; to strip our class back to basics.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc4" name="toctemplates-and-traits" id="toctemplates-and-traits"&gt;Templates and Traits&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Where we had:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template &amp;lt;typename item&amp;gt;
class text_file_iter
{
public: // Traits typedefs, which make this class usable with
        // algorithms which need a random access iterator.
    typedef std::random_access_iterator_tag iterator_category;
    typedef item value_type;
    typedef std::streamoff difference_type;
    typedef item * pointer;
    typedef item &amp;amp; reference;
....
};

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We now need (my thanks here to Giuseppe for correcting the code I originally posted here):
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template &amp;lt;typename value&amp;gt;
class text_file_iter
    : public boost::iterator_facade&amp;lt;
      text_file_iter&amp;lt;value&amp;gt;
    , value
    , std::random_access_iterator_tag
    , value
    , std::streamoff
    &amp;gt;
{
....
};

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Yes, the class accepts itself as a template parameter. That&amp;#8217;s the curious recursion.)
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc5" name="tocconstructors-destructors-and-operators" id="tocconstructors-destructors-and-operators"&gt;Constructors, destructors and operators&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We still need iterator constructors and destructors &amp;#8212; these are unchanged &amp;#8212; but &lt;strong&gt;we can eliminate every single operator&lt;/strong&gt; shown in the &amp;#8220;Iterator boilerplate&amp;#8221; code block above.
&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s what we need instead, to ensure &lt;code&gt;iterator_facade&lt;/code&gt; can do its job. The &lt;code&gt;read()&lt;/code&gt; member function we had before doesn&amp;#8217;t need changing.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;    ....
private: // Everything Boost's iterator facade needs
    friend class boost::iterator_core_access;
    
    value dereference() const
    {
        return read();
    }
    
    bool equal(iter const &amp;amp; other) const
    {
        return pos == other.pos;
    }
    
    void increment()
    {
        advance(1);
    }
    
    void decrement()
    {
        advance(-1);
    }
    
    void advance(std::streamoff n)
    {
        in.seekg(n, std::ios_base::cur);
        pos = in.tellg();
    }
    
    std::streamoff distance_to(iter const &amp;amp; other) const
    {
        return other.pos - pos;
    }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;And that really is all there is to it. I&amp;#8217;m impressed.
&lt;/p&gt;
&lt;p&gt;Notice, by the way, that &lt;code&gt;friend&lt;/code&gt; is used to expose the primitive, private member functions for use by the &lt;code&gt;boost::iterator_core_access&lt;/code&gt; class. This follows the example set by the tutorial. I&amp;#8217;ve written enough C and Python to question C++&amp;#8217;s sophisticated access rules &amp;#8212; you have &lt;code&gt;public&lt;/code&gt;, &lt;code&gt;protected&lt;/code&gt; and &lt;code&gt;private&lt;/code&gt;, but that&amp;#8217;s &lt;strong&gt;still&lt;/strong&gt; not enough, so you need a &lt;code&gt;friend&lt;/code&gt; declaration to cut through it all &amp;#8212; which tempts me to simply make &lt;code&gt;dereference()&lt;/code&gt;, &lt;code&gt;equal()&lt;/code&gt; etc. public, but then the facade wouldn&amp;#8217;t be a proper facade. Users should treat the final class exactly as they would any other random access iterator, and designating these members as &lt;code&gt;private&lt;/code&gt; means they&amp;#8217;ll have to.
&lt;/p&gt;
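
&lt;p&gt;The division of labour can be sketched in Python. This is an analogue for illustration only, not Boost&amp;#8217;s implementation: the names &lt;code&gt;IteratorFacade&lt;/code&gt; and &lt;code&gt;ListCursor&lt;/code&gt; are hypothetical, only a few operators are derived, and Python has no &lt;code&gt;private&lt;/code&gt; or &lt;code&gt;friend&lt;/code&gt;, so the core operations stay visible.
&lt;/p&gt;

```python
import copy


class IteratorFacade:
    """Derives public operators from a few core operations.

    Subclasses supply dereference(), equal(), advance() and distance_to().
    """

    def __eq__(self, other):
        return self.equal(other)

    def __ne__(self, other):
        return not self.equal(other)

    def __add__(self, n):
        # Work on a copy so the original cursor is left where it was.
        result = copy.copy(self)
        result.advance(n)
        return result

    def __sub__(self, other):
        # b - a is the number of steps needed to get from a to b.
        return other.distance_to(self)

    def __getitem__(self, n):
        return (self + n).dereference()


class ListCursor(IteratorFacade):
    """A random access cursor over a list: core operations only."""

    def __init__(self, items, pos=0):
        self.items = items
        self.pos = pos

    def dereference(self):
        return self.items[self.pos]

    def equal(self, other):
        return self.pos == other.pos

    def advance(self, n):
        self.pos += n

    def distance_to(self, other):
        return other.pos - self.pos


first = ListCursor([10, 20, 30, 40])
last = first + 4

print(first[2])      # 30
print(last - first)  # 4
print(first != last) # True
```

&lt;p&gt;As with the C++ version, the concrete class says nothing about operators. It only describes how to read, compare and move.
&lt;/p&gt;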

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc6" name="tocwrinkles" id="tocwrinkles"&gt;Wrinkles&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;You&amp;#8217;ll notice the &lt;code&gt;dereference()&lt;/code&gt; member function has a &lt;code&gt;const&lt;/code&gt; signature. However, the &lt;code&gt;read()&lt;/code&gt; member function is non-const.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;    // Return the item at the current position
    value_type read()
    {
        value_type n = 0;
        
        // Reverse till we hit whitespace or the start of the file
        while (in &amp;amp;&amp;amp; !isspace(in.peek()))
        {
            in.unget();
        }
        in.clear();
        
        in &amp;gt;&amp;gt; n;
        return n;
    }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;in&lt;/code&gt; is a data member of type &lt;code&gt;std::ifstream&lt;/code&gt;, and clearly the read call modifies it. That&amp;#8217;s what this alarming compiler error is trying to tell us.
&lt;/p&gt;
&lt;pre&gt;
boost_binary_search_text_file.cpp: In member function 'value text_file_iter&amp;lt;value&amp;gt;::read() const [with value = long long int]':
boost_binary_search_text_file.cpp:90:   instantiated from 'value text_file_iter&amp;lt;value&amp;gt;::dereference() const [with value = long long int]'
/opt/local/include/boost/iterator/iterator_facade.hpp:516:   instantiated from 'static typename Facade::reference boost::iterator_core_access::dereference(const Facade&amp;amp;) [with Facade = text_file_iter&amp;lt;long long int&amp;gt;]'
/opt/local/include/boost/iterator/iterator_facade.hpp:634:   instantiated from 'Reference boost::iterator_facade&amp;lt;I, V, TC, R, D&amp;gt;::operator*() const [with Derived = text_file_iter&amp;lt;long long int&amp;gt;, Value = long long int, CategoryOrTraversal = boost::random_access_traversal_tag, Reference = long long int, Difference = long long int]'
/usr/include/c++/4.2.1/bits/stl_algo.h:4240:   instantiated from 'bool std::binary_search(_ForwardIterator, _ForwardIterator, const _Tp&amp;amp;) [with _ForwardIterator = text_file_iter&amp;lt;long long int&amp;gt;, _Tp = main::number]'
boost_binary_search_text_file.cpp:203:   instantiated from here
boost_binary_search_text_file.cpp:174: error: passing 'const std::basic_ifstream&amp;lt;char, std::char_traits&amp;lt;char&amp;gt; &amp;gt;' as 'this' argument of 'typename std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;::int_type std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;::peek() [with _CharT = char, _Traits = std::char_traits&amp;lt;char&amp;gt;]' discards qualifiers
boost_binary_search_text_file.cpp:176: error: passing 'const std::basic_ifstream&amp;lt;char, std::char_traits&amp;lt;char&amp;gt; &amp;gt;' as 'this' argument of 'std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;&amp;amp; std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;::unget() [with _CharT = char, _Traits = std::char_traits&amp;lt;char&amp;gt;]' discards qualifiers
boost_binary_search_text_file.cpp:178: error: passing 'const std::basic_ifstream&amp;lt;char, std::char_traits&amp;lt;char&amp;gt; &amp;gt;' as 'this' argument of 'void std::basic_ios&amp;lt;_CharT, _Traits&amp;gt;::clear(std::_Ios_Iostate) [with _CharT = char, _Traits = std::char_traits&amp;lt;char&amp;gt;]' discards qualifiers
boost_binary_search_text_file.cpp:180: error: passing 'const std::basic_ifstream&amp;lt;char, std::char_traits&amp;lt;char&amp;gt; &amp;gt;' as 'this' argument of 'std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;&amp;amp; std::basic_istream&amp;lt;_CharT, _Traits&amp;gt;::operator&amp;gt;&amp;gt;(long long int&amp;amp;) [with _CharT = char, _Traits = std::char_traits&amp;lt;char&amp;gt;]' discards qualifiers
&lt;/pre&gt;

&lt;p&gt;Related to this, the &lt;code&gt;Reference&lt;/code&gt; template parameter (shown in bold in the listing below) is actually a &lt;code&gt;value&lt;/code&gt;, rather than the (default) &lt;code&gt;value &amp;amp;&lt;/code&gt;. As we originally implemented it, our file iterator reads values lazily, only when clients request them. We have no reference to return.
&lt;/p&gt;
&lt;pre&gt;
template &amp;lt;typename value&amp;gt;
class text_file_iter
    : public boost::iterator_facade&amp;lt;
      text_file_iter&amp;lt;value&amp;gt;
    , value
    , std::random_access_iterator_tag
    , &lt;strong&gt;value&lt;/strong&gt;
    , std::streamoff
    &amp;gt;
{
....
};
&lt;/pre&gt;

&lt;p&gt;I faced a dilemma here. Either I could modify my original file iterator, including a current value data member, which I would take care to update every time we repositioned the file read position. Then our references could be real references and &lt;code&gt;read()&lt;/code&gt; would naturally be &lt;code&gt;const&lt;/code&gt;, simply returning a reference to this member. Or, I could make the &lt;code&gt;in&lt;/code&gt; input stream data member &lt;code&gt;mutable&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Mutable&lt;/code&gt; makes me uneasy for the same reason that &lt;code&gt;friend&lt;/code&gt; does &amp;#8212; if you can shake off the rigours of const-correctness so easily, then why bother with it? &amp;#8212; and for this reason the first option appealed. However, a read-only file is an unusual container: we do not change it, but we cannot supply const references to its elements without reading them in, and that will mean changes to the file input stream. The easier option, involving the smallest code change, was to make &lt;code&gt;in&lt;/code&gt; mutable. So that&amp;#8217;s what I did.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc7" name="tocless-code-more-software" id="tocless-code-more-software"&gt;Less code, more software&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;By employing two of my least favourite C++ keywords I now had a class which provided the functions it should, and, thanks to the magic worked by Boost&amp;#8217;s iterator facade, I also had a class which I could use as a standard random access iterator. Most of the code changes were the deletion of boilerplate &amp;#8212; very satisfying. I added code too, since I decided to invest a little more effort in tests. I didn&amp;#8217;t have any doubts about the Boost library&amp;#8217;s correctness but I thought I might not have been using it correctly. Happily these tests all passed.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/boost-iterator-facade#toc8" name="tocperformance" id="tocperformance"&gt;Performance&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Let&amp;#8217;s not forget why we originally wanted a random access file iterator: we had a large sorted file, too large to read into memory, and we wanted to test for the presence of a number in this file.
&lt;/p&gt;
&lt;p&gt;For test purposes I created a file just over 4GB in size. A simple linear search through this file took around 180 seconds on my (aging laptop) computer, and was light on memory use. By creating a random access file iterator, boilerplate and all, we took advantage of the standard binary search algorithm and reduced the time to around 4 milliseconds.
&lt;/p&gt;
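
&lt;p&gt;To get a feel for why the binary search wins so decisively, here is a rough Python sketch which counts element reads. It assumes a virtual sorted sequence standing in for the file; &lt;code&gt;CountingSequence&lt;/code&gt; and &lt;code&gt;contains&lt;/code&gt; are names made up for this example, and the standard &lt;code&gt;bisect&lt;/code&gt; module does the searching.
&lt;/p&gt;

```python
import bisect


class CountingSequence:
    """A sorted virtual sequence 0, 2, 4, ... which counts element reads."""

    def __init__(self, size):
        self.size = size
        self.reads = 0

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        self.reads += 1
        return 2 * index


def contains(seq, value):
    """Binary search membership test, in the spirit of std::binary_search."""
    pos = bisect.bisect_left(seq, value)
    return pos != len(seq) and seq[pos] == value


numbers = CountingSequence(10 ** 9)
print(contains(numbers, 123456))  # True: even numbers are present
print(contains(numbers, 123457))  # False: odd numbers are not
print(numbers.reads)              # total reads across both searches
```

&lt;p&gt;Each search over a billion elements needs only about 30 reads, since every probe halves the remaining range. A linear search could need up to a billion.
&lt;/p&gt;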
&lt;p&gt;How would our version using Boost iterator facade do? I wasn&amp;#8217;t expecting it to be faster than the original, but I wouldn&amp;#8217;t have been surprised if it gave it a close run: using Boost doesn&amp;#8217;t usually involve compromise. In fact, over repeated runs of my performance tests there was no significant difference between the two iterator versions &amp;#8212; or at least there wasn&amp;#8217;t once a helpful reader had discovered and fixed a bug in my program, which was causing it to run correctly but slowly.
&lt;/p&gt;
&lt;p&gt;To trust a facade I guess you need some knowledge of what lies behind it.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: During my original performance tests, reported in the first version of this article, the Boost iterator performed woefully, far slower even than a linear search. By this time I&amp;#8217;d lost patience, and it was left up to a reader, Giuseppe, to &lt;a href="http://wordaligned.org/articles/boost-iterator-facade#comment-60988668"&gt;point out my mistake&lt;/a&gt;. I&amp;#8217;d been using a &lt;code&gt;boost::random_access_traversal_tag&lt;/code&gt; template parameter, with the result that &lt;code&gt;std::distance()&lt;/code&gt; was using repeated increments rather than calling &lt;code&gt;distance_to()&lt;/code&gt; to get an immediate result, and consequently ran very slowly. I should have used &lt;code&gt;std::random_access_iterator_tag&lt;/code&gt;. I modified my code accordingly and confirmed that the Boost version does indeed perform on a par with the original version.
&lt;/p&gt;
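
&lt;p&gt;The cost of that mistake can be sketched in a few lines of Python. The two functions below are hypothetical stand-ins for the two ways of computing a distance, using plain integers in place of file positions, and they assume &lt;code&gt;last&lt;/code&gt; is not before &lt;code&gt;first&lt;/code&gt;.
&lt;/p&gt;

```python
def distance_by_stepping(first, last):
    """Count the increments needed to walk from first to last."""
    steps = 0
    while first != last:
        first += 1
        steps += 1
    return steps


def distance_by_subtraction(first, last):
    """Compute the same distance in a single operation."""
    return last - first


print(distance_by_stepping(0, 1000000))     # 1000000, after a million increments
print(distance_by_subtraction(0, 1000000))  # 1000000, immediately
```

&lt;p&gt;Both give the same answer, but the first does work proportional to the distance, which for a multi-gigabyte file swamps any saving from the binary search itself.
&lt;/p&gt;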
&lt;hr /&gt;

&lt;p&gt;My thanks to &lt;a href="http://www.flickr.com/photos/chr1sp/" title="Chris. P on Flickr"&gt;Chris P&lt;/a&gt; for permission to use his &lt;a href="http://www.flickr.com/photos/chr1sp/3997724676"&gt;photograph&lt;/a&gt; of the &lt;a href="http://en.wikipedia.org/wiki/Library_of_Celsus" title="Library of Celsus, Wikipedia"&gt;Library of Celsus&lt;/a&gt; at Ephesus, or rather of its beautiful facade. Ephesus is famous for the Temple of Artemis, one of the seven wonders of the ancient world, of which only fragments remain. Thanks too to &lt;a href="http://www.flickr.com/photos/davehamster/"&gt;Dave Hamster&lt;/a&gt; for the boilerplate &lt;a href="http://www.flickr.com/photos/davehamster/2336911145/"&gt;photo&lt;/a&gt; &amp;#8212; actually a detail from the hull of the &lt;a href="http://www.ssgreatbritain.org"&gt;SS Great Britain&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;If you&amp;#8217;d like to continue this experiment the code and the tests I used are available via anonymous SVN access from &lt;a href="http://svn.wordaligned.org/svn/etc/search_text_file"&gt;http://svn.wordaligned.org/svn/etc/search_text_file&lt;/a&gt;.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/boost-iterator-facade#fn1link"&gt;[1]&lt;/a&gt;: As of Python 2.7 and 3.2, the standard library will include a version of this recipe. It&amp;#8217;s in the &lt;a href="http://docs.python.org/dev/py3k/library/functools.html#functools.total_ordering" title="functools.total_ordering decorator documentation"&gt;functools module&lt;/a&gt;. For some reason, your class &amp;#8220;should supply an __eq__()&amp;#8221; method.
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/boost-iterator-facade#fn2link"&gt;[2]&lt;/a&gt;: According to the Boost.Iterator documentation:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Both &lt;code&gt;iterator_facade&lt;/code&gt; and &lt;code&gt;iterator_adaptor&lt;/code&gt; as well as many of the specialized adaptors mentioned below have been proposed for standardization, and accepted into the first C++ technical report; see our [Standard Proposal For Iterator Facade and Adaptor (PDF)][tr1proposal] for more details.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I assumed this meant there&amp;#8217;d be &lt;code&gt;tr1::iterator_(facade|adaptor)&lt;/code&gt; classes, but I don&amp;#8217;t think that&amp;#8217;s the case. Unlike other (good) bits of Boost, the Iterator library doesn&amp;#8217;t seem likely to be part of the next C++ release.
&lt;/p&gt;</description>
<dc:date>2010-07-07</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/boost-iterator-facade</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/GpCyHa2q-xw/boost-iterator-facade</link>
<category>C++</category>
<category>Python</category>
<category>Boost</category>
<feedburner:origLink>http://wordaligned.org/articles/boost-iterator-facade</feedburner:origLink></item>

<item>
<title>Equality and Equivalence</title>
<description>&lt;p&gt;&lt;a href="http://oracleofbacon.org"&gt;&lt;img style="float:right;" src="http://wordaligned.org/images/kevin-bacon.jpg" alt="Kevin Bacon mugshot"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;If A &amp;lt;= B and B &amp;lt;= A then A and B must be equal, right?
&lt;/p&gt;
&lt;p&gt;Wrong, actually.
&lt;/p&gt;
&lt;p&gt;We could rank actors according to their Bacon number, for example. Hugh Grant and Daniel Day-Lewis both have a &lt;a href="http://oracleofbacon.org" title="so says the Oracle of Bacon"&gt;Bacon number of 2&lt;/a&gt;, but that doesn&amp;#8217;t make them equal!
&lt;/p&gt;
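
&lt;p&gt;Here is a minimal Python sketch of the same point. The &lt;code&gt;Actor&lt;/code&gt; type and the &lt;code&gt;equivalent&lt;/code&gt; function are invented for this example, and the Bacon numbers are the ones quoted above.
&lt;/p&gt;

```python
from collections import namedtuple

Actor = namedtuple("Actor", ["name", "bacon_number"])


def equivalent(a, b):
    """True when neither actor ranks below the other by Bacon number."""
    return a.bacon_number == b.bacon_number


grant = Actor("Hugh Grant", 2)
day_lewis = Actor("Daniel Day-Lewis", 2)

print(equivalent(grant, day_lewis))  # True: the ranking cannot separate them
print(grant == day_lewis)            # False: they are still different actors
```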
&lt;p&gt;I think most programmers get the distinction between ordering and equality, but it&amp;#8217;s easy to forget.
&lt;/p&gt;
&lt;p&gt;Part of the problem is that the less-than-or-&lt;strong&gt;equal&lt;/strong&gt;-to and greater-than-or-&lt;strong&gt;equal&lt;/strong&gt;-to operators both mention &lt;strong&gt;equal&lt;/strong&gt;. The standard programming representation of these operators includes a single equals symbol, while the representation of equality has two equals symbols. Symbolically we might assume:
&lt;/p&gt;
&lt;pre style="font-size:400%"&gt;&amp;lt;= &lt;span style="color:#930;"&gt;&amp;and;&lt;/span&gt; &amp;gt;= &lt;span style="color:#930;"&gt;&amp;rArr;&lt;/span&gt; == &lt;/pre&gt;

&lt;p&gt;Wrong!
&lt;/p&gt;
&lt;div style="font-size:800%"&gt;&amp;#x2620;&lt;/div&gt;

&lt;p&gt;Another part of the problem is that we tend to think of numbers as archetypal objects: &lt;a href="http://www.google.com/search?q=%22when+in+doubt+do+as+the+ints+do%22" title="Scott Meyers advice"&gt;when in doubt, do as the ints do&lt;/a&gt;. For integers, it&amp;#8217;s true, equality and equivalence are the same. The same is true of real numbers, but what about their &lt;a href="http://docs.sun.com/source/806-3568/ncg_goldberg.html" title="What Every Computer Scientist Should Know About Floating-Point Arithmetic"&gt;floating point representations&lt;/a&gt;? A &lt;code&gt;NaN&lt;/code&gt; doesn&amp;#8217;t even equal itself. Complex numbers have no standard comparison operators.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; 3+4j == 4j+3
True
&amp;gt;&amp;gt;&amp;gt; 3+4j &amp;lt;= 4j+3
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
TypeError: no ordering relation is defined for complex numbers

&lt;/pre&gt;

&lt;/div&gt;
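
&lt;p&gt;The &lt;code&gt;NaN&lt;/code&gt; claim can be checked in the same way. A small sketch, using only built-in floats and the standard &lt;code&gt;math&lt;/code&gt; module:
&lt;/p&gt;

```python
import math

nan = float("nan")

print(nan == nan)       # False: a NaN does not equal itself
print(nan != nan)       # True
print(math.isnan(nan))  # True: the reliable way to test for NaN
```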

&lt;p&gt;To avoid trouble, remember that the &lt;strong&gt;equal&lt;/strong&gt; in less-than-or-&lt;strong&gt;equal&lt;/strong&gt;-to should really be &lt;strong&gt;equivalent&lt;/strong&gt;. 
&lt;/p&gt;
&lt;p&gt;Oh, and please don&amp;#8217;t confuse equality with assignment.
&lt;/p&gt;</description>
<dc:date>2010-06-09</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/equals-equals</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/OJMjmD6rLKI/equals-equals</link>
<category>Algorithms</category>
<category>Characters</category>
<feedburner:origLink>http://wordaligned.org/articles/equals-equals</feedburner:origLink></item>

<item>
<title>Binary search revisited</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocrecap" name="toc0" id="toc0"&gt;Recap&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocthe-problem" name="toc1" id="toc1"&gt;The Problem&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocinput-iterators" name="toc2" id="toc2"&gt;Input iterators&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocfind" name="toc3" id="toc3"&gt;Find&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocrewrite-the-file" name="toc4" id="toc4"&gt;Rewrite the file!&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocadapting-iterators" name="toc5" id="toc5"&gt;Adapting iterators&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocmultipass-iterator" name="toc6" id="toc6"&gt;Multipass iterator&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocnot-so-fast" name="toc7" id="toc7"&gt;Not so fast&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocbetter-than-find" name="toc8" id="toc8"&gt;Better than find&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocimplementation" name="toc9" id="toc9"&gt;Implementation&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tochardware-used" name="toc10" id="toc10"&gt;Hardware used&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#tocconclusions" name="toc11" id="toc11"&gt;Conclusions&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc0" name="tocrecap" id="tocrecap"&gt;Recap&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Recently &lt;a href="http://wordaligned.org/articles/binary-search"&gt;I wrote&lt;/a&gt; about C++&amp;#8217;s standard binary search algorithms (yes, four of them!) which do such a fine job of:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     specifying exactly what kind of range a binary search requires
 &lt;/li&gt;

 &lt;li&gt;
     separating the core algorithm from the details of the range it&amp;#8217;s working on
 &lt;/li&gt;

 &lt;li&gt;
     delivering precise results
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To support these claims I included an implementation of a file iterator, suitable for use with &lt;code&gt;std::binary_search()&lt;/code&gt; etc. to efficiently locate values in very large files.
&lt;/p&gt;
&lt;p&gt;Now, there are a couple of issues with this approach:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     we had to write a lot of code to make a file iterator suitable for use with standard algorithms
 &lt;/li&gt;

 &lt;li&gt;
     this file iterator only works on highly structured files, where each value occupies a fixed number of bytes
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In this follow-up article we&amp;#8217;ll consider each of these issues in a little more depth by working through two very different solutions to a related problem.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc1" name="tocthe-problem" id="tocthe-problem"&gt;The Problem&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Suppose, once again, that we have a large file, a few gigabytes, say. The file contains numbers, in order, and we&amp;#8217;re interested in testing if this file contains a given number. This time, though, the file is a text file, where the numbers are represented in the usual way as sequences of digits separated by whitespace.
&lt;/p&gt;
&lt;pre&gt;
$ less lots-of-numbers
...
10346  11467 11469 11472  11501 
  11662    12204 12290
...
&lt;/pre&gt;

&lt;p&gt;Note that a number in this file does not occupy a fixed number of bytes. If we jump to a new position in the file using a seek operation, we cannot expect to land exactly where a number starts. Thus the random access file iterator we developed last time won&amp;#8217;t work.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc2" name="tocinput-iterators" id="tocinput-iterators"&gt;Input iterators&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In C++ an input file is an example of an input stream, and the standard library gives us &lt;code&gt;istream_iterators&lt;/code&gt; which perform formatted input. In our case, an &lt;code&gt;istream_iterator&amp;lt;int&amp;gt;&lt;/code&gt; effectively converts the file into a stream of numbers.
&lt;/p&gt;
&lt;p&gt;Istream iterators are &lt;a href="http://www.sgi.com/tech/stl/InputIterator.html" title="InputIterator, SGI STL documentation"&gt;input iterators&lt;/a&gt;. They progress through the input stream, item by item, with no repeating or rewinding allowed. Despite their limitations, the C++ standard library provides some algorithms which require nothing more than basic input iterators. For example, to count up even numbers in the file whose name is supplied on the command line we might use &lt;code&gt;std::count_if&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;fstream&amp;gt;
#include &amp;lt;iostream&amp;gt;
#include &amp;lt;iterator&amp;gt;

bool is_even(int x)
{
    return x % 2 == 0;
}

int main(int argc, char * argv[])
{
    typedef std::istream_iterator&amp;lt;int&amp;gt; i_iter;
    typedef std::ostream_iterator&amp;lt;int&amp;gt; o_iter;
    std::ifstream in(argv[1]);
    
    std::cout &amp;lt;&amp;lt; std::count_if(i_iter(in), i_iter(), is_even) &amp;lt;&amp;lt; '\n';
    
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The next version of C++ supports lambda functions, so you&amp;#8217;ll be able to put &lt;code&gt;is_even&lt;/code&gt; right where it&amp;#8217;s used, in the &lt;code&gt;count_if()&lt;/code&gt; function call. Or, with the current version of C++, you could write:
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Ouch!&lt;/div&gt;

&lt;pre class="prettyprint"&gt;    ....
    std::count_if(i_iter(in), i_iter(),
                 std::not1(std::bind2nd(std::modulus&amp;lt;int&amp;gt;(), 2)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Maybe not!
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc3" name="tocfind" id="tocfind"&gt;Find&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The very simplest search algorithm, &lt;code&gt;std::find&lt;/code&gt;, needs nothing more than an input iterator. To determine if a number is in a file, we &lt;strong&gt;could&lt;/strong&gt; just invoke &lt;code&gt;std::find&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;typedef std::istream_iterator&amp;lt;int&amp;gt; i_iter;

bool 
is_number_in_file(char const * filename, int n)
{
    std::ifstream in(filename);
    i_iter begin(in);
    i_iter end;
    return std::find(begin, end, n) != end;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the find algorithm advances through the numbers in the file, from start to finish, stopping as soon as it hits one equal to the supplied value, &lt;code&gt;n&lt;/code&gt;. We can expect this function to be light on memory use &amp;#8212; there will be some buffering at the lower levels of the file access, but nothing more &amp;#8212; and the function is evidently correct.
&lt;/p&gt;
&lt;p&gt;It would be correct even if our file was unsorted, however. Is there any way we can do better?
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc4" name="tocrewrite-the-file" id="tocrewrite-the-file"&gt;Rewrite the file!&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In the previous article we developed a random access iterator for accessing binary files, and usable for efficient binary searches of sorted binary files. Now would be a good time to question the problem specification. Is this a one-off? Or are we going to be testing the presence of more numbers in the file in future? And if so, can we convert the file to binary to save time in the long run?
&lt;/p&gt;
&lt;p&gt;Although I&amp;#8217;m not going to pursue this option here, it may well be the best approach. For now, though, let&amp;#8217;s assume we have a one-off problem to solve, and that we aren&amp;#8217;t allowed to tinker with the input.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc5" name="tocadapting-iterators" id="tocadapting-iterators"&gt;Adapting iterators&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;If we want to use &lt;code&gt;std::binary_search&lt;/code&gt; we need, as a minimum, &lt;a href="http://www.sgi.com/tech/stl/ForwardIterator.html" title="ForwardIterator, SGI STL documentation"&gt;forward iterators&lt;/a&gt;. Like input iterators, forward iterators advance, one step at a time. Unlike input iterators, you can copy a forward iterator and dereference or advance that copy in future, independently of the original.
&lt;/p&gt;
&lt;p&gt;Forward iterators are suitable for multipass algorithms, such as &lt;code&gt;std::search&lt;/code&gt;, which looks for the first occurrence of a sequence within a sequence (a generalised &lt;code&gt;strstr&lt;/code&gt;, if you like), or &lt;code&gt;std::adjacent_find&lt;/code&gt; and &lt;code&gt;std::search_n&lt;/code&gt; which look for repeated elements; and of course &lt;code&gt;std::binary_search&lt;/code&gt;, which is our immediate interest.
&lt;/p&gt;
&lt;p&gt;Wouldn&amp;#8217;t it be nice if we could convert our istream iterators into forwards iterators? Then we could plug them directly into all these algorithms.
&lt;/p&gt;
&lt;p&gt;Other languages allow this. You can replicate streams in the Unix shell with &lt;code&gt;tee&lt;/code&gt;. And you can do something similar in Python, thanks to one of the standard &lt;a href="http://docs.python.org/py3k/library/itertools.html"&gt;iterator tools&lt;/a&gt;. Independent iterators over the same sequence needed? &lt;tt&gt;&lt;a href="http://docs.python.org/py3k/library/itertools.html#itertools.tee"&gt;Itertools.tee&lt;/a&gt;&lt;/tt&gt; is your friend. The example below codes up adjacent find in Python.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from itertools import tee
import sys

def adjacent_find(xs):
    '''Does the supplied iterable contain any adjacent repeats?
    
    Returns True if xs contains two consecutive, equal items,
    False otherwise. 
    '''
    try:
        curr, next_ = tee(xs)
        next(next_)
        return any(c == n for c, n in zip(curr, next_))
    except StopIteration:
        return False

&lt;/pre&gt;

&lt;/div&gt;
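
&lt;p&gt;Here&amp;#8217;s a quick sketch of what &lt;code&gt;tee&lt;/code&gt; buys us. The generator &lt;code&gt;numbers&lt;/code&gt; is a made-up stand-in for an input stream: it can only be consumed once. Teeing it gives two iterators which advance independently of one another.
&lt;/p&gt;

```python
from itertools import tee

def numbers():
    # A one-shot generator: once a value is consumed, it is gone.
    for n in (3, 1, 4, 1, 5):
        yield n

# tee returns independent iterators over the same underlying stream.
first, second = tee(numbers())

# Advance one copy by three items; the other copy is unaffected.
consumed = [next(first), next(first), next(first)]
remaining_first = list(first)
all_second = list(second)

print(consumed)         # [3, 1, 4]
print(remaining_first)  # [1, 5]
print(all_second)       # [3, 1, 4, 1, 5]
```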

&lt;p&gt;&lt;a href="http://www.jezuk.co.uk/mango" title="Mango: iterators, algorithms, functions, for Java, by Jez Higgins"&gt;&lt;img src="http://www.jezuk.co.uk/files/mango-header.png" alt="Mango: iterators, algorithms, functions, for Java, by Jez Higgins"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Why, even Java has iterator adaptors, courtesy of Jez Higgins&amp;#8217; &lt;a href="http://www.jezuk.co.uk/mango" title="Mango: iterators, algorithms, functions, for Java, by Jez Higgins"&gt;Mango library&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;What about C++? I couldn&amp;#8217;t find any such iterator adaptors in the standard library, but I turned up something in the standard library research and development unit, also known as &lt;a href="http://www.boost.org" title="Free, peer-reviewed, portable C++ source libraries"&gt;Boost&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://boost.org"&gt;&lt;img src="http://www.boost.org/doc/libs/1_43_0/boost.png" alt="Boost logo"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc6" name="tocmultipass-iterator" id="tocmultipass-iterator"&gt;Multipass iterator&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/index.html"&gt;Boost.Spirit&lt;/a&gt; is a remarkable C++ parser framework, which uses operator overloading to represent parsers directly as EBNF grammars in C++. Somewhere in its depths it tracks back, and hence must adapt input iterators into forward iterators &amp;#8212; or &lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/support/multi_pass.html"&gt;multipass iterators&lt;/a&gt;, to use its own term.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The &lt;code&gt;multi_pass&lt;/code&gt; iterator will convert any input iterator into a forward iterator suitable for use with Spirit.Qi. &lt;code&gt;multi_pass&lt;/code&gt; will buffer data when needed and will discard the buffer when its contents is not needed anymore. This happens either if only one copy of the iterator exists or if no backtracking can occur.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;What&amp;#8217;s good enough for parsing is more than good enough for searching. Here&amp;#8217;s a function which detects whether a number is in a file. Most of the code here just includes the right headers and defines some typedefs. By leaning on high quality support libraries we&amp;#8217;ve overcome our first issue: we no longer have to write loads of code just to call binary search!
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Boost spirit multipass iterators&lt;/div&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;fstream&amp;gt;
#include &amp;lt;iterator&amp;gt;
#include &amp;lt;algorithm&amp;gt;

#include &amp;lt;boost/spirit/include/support_multi_pass.hpp&amp;gt;

namespace spirit = boost::spirit;

typedef long long number;
typedef std::istream_iterator&amp;lt;number&amp;gt; in_it;
typedef spirit::multi_pass&amp;lt;in_it&amp;gt; fwd_it;

/*
  Returns true if the input number can be found in the named 
  file, false otherwise. The file must contain ordered, 
  whitespace separated numbers.
*/
bool
is_number_in_file(number n, char const * filename)
{
    std::ifstream in(filename);
    
    fwd_it begin = spirit::make_default_multi_pass(in_it(in));
    fwd_it end = spirit::make_default_multi_pass(in_it());
    
    return std::binary_search(begin, end, n);
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc7" name="tocnot-so-fast" id="tocnot-so-fast"&gt;Not so fast&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;If this library-based solution looks too good to be true, that&amp;#8217;s because it is! As we noted &lt;a href="http://wordaligned.org/articles/binary-search#tocstdbinarysearch-requirements"&gt;before&lt;/a&gt;, the standard binary search algorithm may indeed work with forward iterators, but it works far better with random access iterators. There&amp;#8217;s no point reducing the number of integer comparisons to &lt;code&gt;O(log(N))&lt;/code&gt; if we&amp;#8217;re going to advance our iterators &lt;code&gt;O(N)&lt;/code&gt; times.
&lt;/p&gt;
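
&lt;p&gt;To put rough numbers on this, here is a sketch in Python of a lower bound search which counts its own work. It is an illustration rather than the code of any real library: the sequence &lt;code&gt;xs&lt;/code&gt; stands in for a forward iterator range, &lt;code&gt;forward_lower_bound&lt;/code&gt; is an invented name, and &lt;code&gt;steps&lt;/code&gt; tallies the single-step advances a forward iterator would have to make.
&lt;/p&gt;

```python
from operator import lt

def forward_lower_bound(xs, target):
    '''Lower bound search which counts comparisons and iterator steps.

    Indexing xs is a shortcut for this sketch. A real forward
    iterator would reach the middle of the range one step at a
    time, and that is the cost the steps counter records.
    Returns (index, comparisons, steps).
    '''
    first, length = 0, len(xs)
    comparisons = steps = 0
    while length:
        half = length // 2
        middle = first + half
        steps += half               # advance a copy of first to middle
        comparisons += 1
        if lt(xs[middle], target):  # lt(a, b) is the less-than test
            first = middle + 1      # copy middle, then one more step
            steps += 1
            length -= half + 1
        else:
            length = half
    return first, comparisons, steps

# Search for the last of a million or so ordered numbers.
n = 2 ** 20
index, comparisons, steps = forward_lower_bound(range(n), n - 1)
print(index, comparisons, steps)
```

&lt;p&gt;For this input the comparison count stays at around 20, while the step count is close to the length of the whole range.
&lt;/p&gt;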
&lt;p&gt;What&amp;#8217;s worse, these multipass iterators aren&amp;#8217;t magic. Did you read the small print concerning Python&amp;#8217;s &lt;tt&gt;&lt;a href="http://docs.python.org/py3k/library/itertools.html#itertools.tee"&gt;tee&lt;/a&gt;&lt;/tt&gt; iterator?
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;This itertool may require significant auxiliary storage (depending on how much temporary data needs to be stored).
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;If teed iterators diverge, intervening values have to be stored somewhere, and the same appears to be true of our inscrutable multipass iterators. Huge chunks of our large input file are buffered into memory. When I ran this function to confirm the presence of a single number somewhere near the middle of a 4.4GB input file, it took over 19 minutes.
&lt;/p&gt;
&lt;pre&gt;
real	19m13.675s
user	5m19.219s
sys	1m26.278s
&lt;/pre&gt;

&lt;p&gt;Much of this time was spent paging.
&lt;/p&gt;
&lt;p&gt;As a comparison, testing for the same value using a linear search, &lt;code&gt;std::find&lt;/code&gt;, took just under 3 minutes.
&lt;/p&gt;
&lt;pre&gt;
real	2m48.139s
user	2m21.336s
sys	0m7.252s
&lt;/pre&gt;

&lt;p&gt;You&amp;#8217;ll have noticed that we used default multipass iterators. These iterators permit multi-dimensional &lt;a href="http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/support/multi_pass.html"&gt;customisation&lt;/a&gt;. I wasn&amp;#8217;t feeling brave enough to attempt a template storage policy class, and I very much doubt I could have beaten a simple linear find anyway; anything built on a generic input iterator is unlikely to solve our problem efficiently.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc8" name="tocbetter-than-find" id="tocbetter-than-find"&gt;Better than find&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We can beat &lt;code&gt;std::find&lt;/code&gt; with a bit of ingenuity. Standard istream iterators are useful but, in this case, not a good starting point. A better idea is to create a novel iterator which uses file seek operations to advance through the file, then fine-tunes the file position to point at a number.
&lt;/p&gt;
&lt;p&gt;Imagine an iterator which can be positioned at any seekable position in the file, and which dereferences to the first number in the file which ends at or after that position. The graphic below shows a file with 11 seekable positions, 0 through 10 inclusive.
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     positions 0 and 1 dereference to the number 42
 &lt;/li&gt;

 &lt;li&gt;
     positions 2, 3, 4 and 5 dereference to the number 57
 &lt;/li&gt;

 &lt;li&gt;
     positions 6, 7, 8 and 9 dereference to the number 133
 &lt;/li&gt;

 &lt;li&gt;
     it is an error to try and dereference position 10, at the end of the file
 &lt;/li&gt;
&lt;/ul&gt;
&lt;img src="http://wordaligned.org/images/text-file-iterator.png" alt="Text file iterator"/&gt;

&lt;p&gt;Now, this is a rather unusual iterator. It iterates over the numbers in the file, but each number gets repeated for every byte in the file it occupies. Despite this duality it&amp;#8217;s perfectly usable &amp;#8212; so long as we keep a clear head. Binary searches are fine.
&lt;/p&gt;
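
&lt;p&gt;Here&amp;#8217;s a minimal sketch of this dereferencing rule, in Python rather than C++, working on an in-memory byte string instead of a real file. The names &lt;code&gt;item_at&lt;/code&gt; and &lt;code&gt;contains&lt;/code&gt; are invented for the illustration, and the sketch assumes the data has no trailing whitespace.
&lt;/p&gt;

```python
from operator import lt

def item_at(data, pos):
    '''Return the number which byte offset pos dereferences to.'''
    start = pos
    # Step back until we reach whitespace or the start of the data.
    while start != 0 and not data[start:start + 1].isspace():
        start -= 1
    # Look forwards again: split() skips any leading whitespace.
    return int(data[start:].split()[0])

def contains(data, target):
    '''Binary search over byte offsets rather than over items.'''
    lo, hi = 0, len(data)
    while lo != hi:
        mid = (lo + hi) // 2
        if lt(item_at(data, mid), target):  # lt(a, b) is less-than
            lo = mid + 1
        else:
            hi = mid
    return lo != len(data) and item_at(data, lo) == target

data = b"42  57 133"
print([item_at(data, pos) for pos in range(len(data))])
# [42, 42, 57, 57, 57, 57, 133, 133, 133, 133]
print(contains(data, 57), contains(data, 58))  # True False
```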
&lt;p&gt;How does this version perform?
&lt;/p&gt;
&lt;p&gt;Recall, a linear search for a single value in the middle of a 4.4GB file took nearly 3 minutes. Running 10 binary searches through the same file took just 40 milliseconds &amp;#8212; that&amp;#8217;s a rate of 250 searches a second!
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc9" name="tocimplementation" id="tocimplementation"&gt;Implementation&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s our weird new iterator. It should be usable on files containing whitespace separated items of any type for which the stream read &lt;code&gt;operator&amp;gt;&amp;gt;()&lt;/code&gt; has been defined.
&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s quite a lot of code here, but much of it is random access iterator scaffolding. The interesting functions are the private implementation details towards the end of the class.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;fstream&amp;gt;
#include &amp;lt;ios&amp;gt;
#include &amp;lt;iterator&amp;gt;
#include &amp;lt;stdexcept&amp;gt;
#include &amp;lt;string&amp;gt;

#include &amp;lt;ctype.h&amp;gt;

// File read error, thrown when low level file access fails.
class file_read_error : public std::runtime_error
{
public:
    file_read_error(std::string const &amp;amp; what)
        : std::runtime_error(what)
    {
    }
};

/*
  Here's an unusual iterator which can be used to binary search
  for whitespace-separated items in a text file.
  
  It masquerades as a random access iterator but a file
  is not usually a random access device. Nonetheless, file seek
  operations are quicker than stepping through the file item by
  item.
  
  The unusual thing is that the iterators correspond to 
  file offsets rather than items within the file.
  
  Here's a short example where the items are numbers.
  
  +---+---+---+---+---+---+---+---+---+---+
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
  +---+---+---+---+---+---+---+---+---+---+
  |'4'|'2'|   |   |'5'|'7'|   |'1'|'3'|'3'|
  +---+---+---+---+---+---+---+---+---+---+
  
  The graphic shows a text file which contains 3 numbers,
  42, 57, 133, separated by whitespace.
  
  The file itself is 10 bytes long, and hence there are 11
  iterators over the file, corresponding to actual file positions
  (including the one-past-the end position). To dereference an
  iterator, we step back through the file until we reach either
  whitespace or the start of the file. Then we look forwards 
  again and read in the next item.
  
  In the graphic above:
  
   - Iterators 0 and 1 point to number 42
   - Iterators 2, 3, 4 and 5 point to number 57
   - Iterators 6, 7, 8, 9 point to number 133
   - Iterator 10 is the end, and must not be dereferenced
  
  Dereferencing an iterator always returns an item which is in
  the file, and all items in the file have iterators pointing to
  them, so std::binary_search based on these iterators is valid.
  
  The iterators also expose their underlying file positions
  directly (via the getpos() member function), and with a
  little thought we can make use of std::lower_bound() and
  std::upper_bound().
*/
template &amp;lt;typename item&amp;gt;
class text_file_item_iter
{
    typedef text_file_item_iter&amp;lt;item&amp;gt; iter;
    
private: // Sanity
    
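    // Check the stream is OK, throwing an error on failure.
    // Taking the stream itself, rather than a bool, keeps the calls
    // below compiling: C++11 made stream-to-bool conversion explicit.
    void check(std::ios const &amp;amp; stream, std::string const &amp;amp; what)
    {
        if (stream.fail())
        {
            throw file_read_error(what);
        }
    }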
    
public: // Traits typedefs, which make this class usable with
        // algorithms which need a random access iterator.
    typedef std::random_access_iterator_tag iterator_category;
    typedef item value_type;
    typedef std::streamoff difference_type;
    typedef item * pointer;
    typedef item &amp;amp; reference;
    
    enum start_pos { begin, end };
    
public: // Lifecycle
    text_file_item_iter(iter const &amp;amp; other)
        : fname(other.fname)
    {
        open();
        setpos(other.pos);
    }
    
    text_file_item_iter()
        : pos(-1)
    {
    }
    
    text_file_item_iter(std::string const &amp;amp; fname,
                        start_pos where = begin)
        : fname(fname)
        , pos(-1)
    {
        open();
        if (where == end)
        {
            seek_end();
        }
    }
    
    ~text_file_item_iter()
    {
        close();
    }
    
    iter &amp;amp; operator=(iter const &amp;amp; other)
    {
        close();
        fname = other.fname;
        open();
        setpos(other.pos);
        return *this;
    } 
    
public: // Comparison
        // Note: it's an error to compare iterators over different files.
    bool operator&amp;lt;(iter const &amp;amp; other) const
    {
        return pos &amp;lt; other.pos;
    }
    
    bool operator&amp;gt;(iter const &amp;amp; other) const
    {
        return pos &amp;gt; other.pos;
    }
    
    bool operator==(iter const &amp;amp; other) const
    {
        return pos == other.pos;
    }
    
    bool operator!=(iter const &amp;amp; other) const
    {
        return pos != other.pos;
    }
    
public: // Iteration
    iter &amp;amp; operator++()
    {
        return *this += 1;
    }
    
    iter &amp;amp; operator--()
    {
        return *this -= 1;
    }
    
    iter operator++(int)
    {
        iter tmp(*this);
        ++(*this);
        return tmp;
    }
    
    iter operator--(int)
    {
        iter tmp(*this);
        --(*this);
        return tmp;
    }
    
public: // Step
    iter &amp;amp; operator+=(difference_type n)
    {
        advance(n);
        return *this;
    }
    
    iter &amp;amp; operator-=(difference_type n)
    {
        advance(-n);
        return *this;
    }
    
    iter operator+(difference_type n)
    {
        iter result(*this);
        return result += n;
    }
    
    iter operator-(difference_type n)
    {
        iter result(*this);
        return result -= n;
    }
    
public: // Distance
    difference_type operator-(iter &amp;amp; other)
    {
        return pos - other.pos;
    }
    
public: // Access
    value_type operator*()
    {
        return (*this)[0];
    }
    
    value_type operator[](difference_type n)
    {
        std::streampos restore = getpos();
        advance(n);
        value_type const result = read();
        setpos(restore);
        return result;
    }
    
    // Allow direct access to the underlying stream position
    std::streampos getpos()
    {
        std::streampos pos_ = in.tellg();
        check(in, "getpos failed");
        return pos_;
    }
    
private: // Implementation details
    void open()
    {
        in.open(fname.c_str(), std::ios::binary);
        check(in, "open failed");
        pos = getpos();
    }
    
    void close()
    {
        if (in.is_open())
        {
            in.close();
            check(in, "close failed");
        }
    }
    
    void advance(difference_type n)
    {
        check(in.seekg(n, std::ios_base::cur), "advance failed");
        pos = getpos();
    }
    
    void seek_end()
    {
        check(in.seekg(0, std::ios_base::end), "seek_end failed");
        chop_whitespace();
        pos = getpos();
    }
    
    void chop_whitespace()
    {
        do
        {
            in.unget();
        } while (isspace(in.peek()));
        in.get();
        in.clear();
    }
    
    void setpos(std::streampos newpos)
    {
        check(in.seekg(newpos), "setpos failed");
        pos = newpos;
    }
    
    // Return the item at the current position
    value_type read()
    {
        item n = 0;
        // Reverse till we hit whitespace or the start of the file
        while (in &amp;amp;&amp;amp; !isspace(in.peek()))
        {
            in.unget();
        }
        in.clear();
        check(in &amp;gt;&amp;gt; n, "read failed");
        return n;
    }
    
private: // State
    std::string fname;
    std::ifstream in;
    std::streampos pos;
};

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc10" name="tochardware-used" id="tochardware-used"&gt;Hardware used&lt;/a&gt;&lt;/h3&gt;
&lt;pre&gt;
  Model Name:	            MacBook
  Model Identifier:	    MacBook1,1
  Processor Name:           Intel Core Duo
  Processor Speed:          2 GHz
  Number Of Processors:	    1
  Total Number Of Cores:    2
  L2 Cache (per processor): 2 MB
  Memory:                   2 GB
  Bus Speed:                667 MHz
&lt;/pre&gt;

&lt;p&gt;&lt;a href="http://www.flickr.com/photos/photobunny_earl/1008279066" title="Mushroom, by photobunny"&gt;&lt;img src="http://farm2.static.flickr.com/1440/1008279066_847d73c90d.jpg" alt="Mushroom, by photobunny"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search-revisited#toc11" name="tocconclusions" id="tocconclusions"&gt;Conclusions&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Initially the Boost.Spirit solution looked promising but we pushed it too hard. Suitable abstractions can remove complexity; but they can also hide it. When efficiency matters, we need a handle on what&amp;#8217;s going on.
&lt;/p&gt;
&lt;p&gt;After this false start we &lt;strong&gt;did&lt;/strong&gt; find a way to create a file iterator suitable for use with the standard binary search algorithms. Use it with care, though!
&lt;/p&gt;</description>
<dc:date>2010-05-26</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/binary-search-revisited</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/vN5JXjm_NNA/binary-search-revisited</link>
<category>C++</category>
<category>Algorithms</category>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/binary-search-revisited</feedburner:origLink></item>

<item>
<title>Man or man(1)?</title>
<description>&lt;p&gt;How careless, we&amp;#8217;d forgotten to configure &lt;a href="http://gd.tuwien.ac.at/linuxcommand.org/man_pages/logrotate8.html"&gt;log rotation&lt;/a&gt;. So our application had gone with a default designed for a less verbose age, rotating files as soon as they exceeded a megabyte in size, and never throwing any of them away. Oh, and it was putting these log files at the root of the file system where they&amp;#8217;d somehow gone unnoticed for some time. As a consequence, the file system had become clogged up with squillions of files.
&lt;/p&gt;
&lt;pre&gt;
$ cd /
$ ls
...
server.log.736624
server.log.736625
server.log.736626
server.log.736627
...
&lt;/pre&gt;

&lt;p&gt;&lt;a href="http://www.flickr.com/photos/fdecomite/2318674303"&gt;&lt;img src="http://farm3.static.flickr.com/2206/2318674303_a3c9d8bef4.jpg" alt="20 levels by fdecomit"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;How many files, exactly?
&lt;/p&gt;
&lt;pre&gt;
$ ls | wc -l
^C
&lt;/pre&gt;

&lt;p&gt;No time to wait. Too many! We had to act fast.
&lt;/p&gt;
&lt;p&gt;We changed the log rotate configuration to something more appropriate, restarted the application, and set about cleaning up. Now, this is when you &lt;strong&gt;don&amp;#8217;t&lt;/strong&gt; want to open a file browser and drag files into the trash can, not unless you like watching egg-timers. The desktop metaphor fails when you have squillions of files on your desk. Alarmingly, the shell complains too.
&lt;/p&gt;
&lt;pre&gt;
$ rm server.log.*
-bash: /bin/rm: Argument list too long
&lt;/pre&gt;

&lt;p&gt;At this point, a clear head and a steady hand are needed. I use pathname expansion and &lt;code&gt;rm&lt;/code&gt; all the time and I&amp;#8217;m confident the commands I type will have the right effect. But in my current situation &amp;#8212; as root user, in the root directory, on a machine running an unfamiliar flavour of Unix, about to combine &lt;code&gt;find&lt;/code&gt; with &lt;code&gt;xargs&lt;/code&gt; and &lt;code&gt;rm&lt;/code&gt; &amp;#8212; I grow nervous.
&lt;/p&gt;
&lt;p&gt;How to stop &lt;code&gt;find&lt;/code&gt; from descending? &lt;code&gt;-maxdepth&lt;/code&gt;, I think, but level &lt;code&gt;0&lt;/code&gt; or &lt;code&gt;1&lt;/code&gt;? Is &lt;code&gt;-print&lt;/code&gt; required? Should I create a scratch directory and practise?
&lt;/p&gt;
&lt;p&gt;Enough questions already! Are you a man or a &lt;code&gt;man(1)&lt;/code&gt; reader?
&lt;/p&gt;
&lt;pre&gt;
$ find / -maxdepth 1 -name 'server.log.*' | xargs rm -f
&lt;/pre&gt;

&lt;p&gt;&lt;a href="http://www.flickr.com/photos/sarahandmikeprobably/3356749485/"&gt;&lt;img src="http://farm4.static.flickr.com/3660/3356749485_66f532e6c0.jpg" alt="74/365: Falling Cards, by Sarah and Mike ...probably"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Done!
&lt;/p&gt;</description>
<dc:date>2010-05-19</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/man-man</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/JjA7-RBydoo/man-man</link>
<category>Unix</category>
<category>Shell</category>
<feedburner:origLink>http://wordaligned.org/articles/man-man</feedburner:origLink></item>

<item>
<title>Binary search returns … ?</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocbsearch-in-c" name="toc0" id="toc0"&gt;Bsearch in C&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocbinary-search-in-c" name="toc1" id="toc1"&gt;Binary search in C++&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocstdbinarysearch-requirements" name="toc2" id="toc2"&gt;Std::binary_search() requirements&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocstdbinarysearch-limitations" name="toc3" id="toc3"&gt;Std::binary_search() limitations&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toclocating-missing-elements" name="toc4" id="toc4"&gt;Locating missing elements&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toclowerbound" name="toc5" id="toc5"&gt;Lower_bound&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocbinary-search-variants" name="toc6" id="toc6"&gt;Binary search variants&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tociterating-over-numbers-in-a-file" name="toc7" id="toc7"&gt;Iterating over numbers in a file&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toctriple-fail" name="toc8" id="toc8"&gt;Triple fail&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/binary-search#tocthanks" name="toc9" id="toc9"&gt;Thanks&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;p&gt;In an article inspired by Jon Bentley&amp;#8217;s classic book, &lt;a href="http://www.cs.bell-labs.com/cm/cs/pearls/"&gt;Programming Pearls&lt;/a&gt;, Mike Taylor &lt;a href="http://reprog.wordpress.com/2010/04/19/are-you-one-of-the-10-percent/" title="Are you one of the 10% of programmers who can write a binary search?"&gt;invites his readers&lt;/a&gt; to implement the binary search algorithm. To spice things up, he requests we work:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     without reference to any existing implementation
 &lt;/li&gt;

 &lt;li&gt;
     without calling any library routine, such as &lt;code&gt;bsearch&lt;/code&gt;
 &lt;/li&gt;

 &lt;li&gt;
     without writing tests.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Mike Taylor doesn&amp;#8217;t formally specify the problem. He&amp;#8217;s confident his readers will know what a binary search is, and if not, the description he quotes from Programming Pearls should suffice:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Binary search solves the problem [of searching within a pre-sorted array] by keeping track of a range within the array in which T [i.e. the sought value] must be if it is anywhere in the array.  Initially, the range is the entire array.  The range is shrunk by comparing its middle element to T and discarding half the range.  The process continues until T is discovered in the array, or until the range in which it must lie is known to be empty.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;So could our binary search implementation simply return a binary result, &lt;code&gt;true&lt;/code&gt; if &lt;code&gt;T&lt;/code&gt; is in the array, &lt;code&gt;false&lt;/code&gt; otherwise? Well, Yes. And No. A binary search can provide more information, as Mike Taylor hints when he mentions &lt;code&gt;bsearch&lt;/code&gt;. 
&lt;/p&gt;
&lt;p&gt;Jon Bentley and Mike Taylor are primarily interested in how often programmers make a mess of what appears to be a simple assignment and in how to avoid this mess. In this article, I&amp;#8217;d like to point out that the problem specification needs attention too.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/pinprick/2547648374"&gt;&lt;img src="http://farm4.static.flickr.com/3066/2547648374_587dbe8f4b_m.jpg" alt="unwrapped morbier"/&gt;&lt;/a&gt;
   &lt;a href="http://www.flickr.com/photos/pinprick/2546825997"&gt;&lt;img src="http://farm4.static.flickr.com/3083/2546825997_c28af1da65_m.jpg" alt="cut morbier"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc0" name="tocbsearch-in-c" id="tocbsearch-in-c"&gt;Bsearch in C&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The C library&amp;#8217;s &lt;code&gt;bsearch&lt;/code&gt; function returns the location of &lt;code&gt;T&lt;/code&gt;, if found, or a sentinel value otherwise. We might use the array index of &lt;code&gt;T&lt;/code&gt; or &lt;code&gt;-1&lt;/code&gt; as location and sentinel. Standard C uses pointers:
&lt;/p&gt;
&lt;pre&gt;
&lt;b&gt;NAME&lt;/b&gt;
    &lt;b&gt;bsearch&lt;/b&gt; -- binary search of a sorted table
    
&lt;b&gt;SYNOPSIS&lt;/b&gt;
    #include &amp;lt;stdlib.h&amp;gt;
    
    void *
    &lt;b&gt;bsearch&lt;/b&gt;(const void *key, const void *base, size_t nel, 
        size_t width,
        int (*compar) (const void *, const void *));
    
&lt;b&gt;DESCRIPTION&lt;/b&gt; 
    The &lt;b&gt;bsearch()&lt;/b&gt; function searches an array of `nel` objects, 
    the initial member of which is pointed to by `base`, for a member
    that matches the  object pointed to by `key`.  The size (in bytes)
    of each member of the array is specified by `width`.
    
    The contents of the array should be in ascending sorted order 
    according to the comparison function referenced by `compar`.  The 
    `compar` routine is expected to have two arguments which point to
    the `key` object and to an array member, in that order.  It should 
    return an integer which is less than, equal to, or greater than
    zero if the `key` object is found, respectively, to be less than,
    to match, or be greater than the array member.

&lt;b&gt;RETURN VALUES&lt;/b&gt;
    The &lt;b&gt;bsearch()&lt;/b&gt; function returns a pointer to a matching member
    of the array, or a null pointer if no match is found.  If two members
    compare as equal, which member is matched is unspecified.
&lt;/pre&gt;

&lt;p&gt;Void pointers, function pointers, raw memory &amp;#8212; generic functions in C aren&amp;#8217;t pretty. How would this function look in a language with better support for generic programming?
&lt;/p&gt;
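
&lt;p&gt;For comparison, the index-or-sentinel convention mentioned above might look like this in Python. It&amp;#8217;s a sketch only: &lt;code&gt;index_of&lt;/code&gt; is a name I&amp;#8217;ve made up, built on the standard &lt;code&gt;bisect&lt;/code&gt; module.
&lt;/p&gt;

```python
from bisect import bisect_left

def index_of(xs, t):
    '''Return the index of t in the sorted list xs, or -1 if absent.'''
    # bisect_left finds the first position at which t could be
    # inserted while keeping xs in order.
    i = bisect_left(xs, t)
    if i != len(xs) and xs[i] == t:
        return i
    return -1

print(index_of([1, 3, 5, 7], 5))  # 2
print(index_of([1, 3, 5, 7], 4))  # -1
```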

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc1" name="tocbinary-search-in-c" id="tocbinary-search-in-c"&gt;Binary search in C++&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;C++ programmers can of course use &lt;code&gt;bsearch&lt;/code&gt; directly since C++ includes the standard C library. The C++ counterpart would seem to be &lt;a href="http://www.sgi.com/tech/stl/binary_search.html"&gt;&lt;code&gt;std::binary_search&lt;/code&gt;&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;At first glance &lt;code&gt;std::binary_search&lt;/code&gt; appears to be a weakened version of &lt;code&gt;bsearch&lt;/code&gt;. Like &lt;code&gt;bsearch&lt;/code&gt;, it searches for a value. Unlike &lt;code&gt;bsearch&lt;/code&gt;, it simply returns a boolean result: &lt;code&gt;true&lt;/code&gt; if the value is found, &lt;code&gt;false&lt;/code&gt; otherwise. Nonetheless, it can tell us more than &lt;code&gt;bsearch&lt;/code&gt; in some circumstances.
&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s return to Mike Taylor&amp;#8217;s second constraint, the one about implementing functions which already exist in standard libraries. In a &lt;a href="http://reprog.wordpress.com/2010/04/21/binary-search-redux-part-1/" title="Mike Taylor discusses his binary search challenge"&gt;follow up article&lt;/a&gt; he explains:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&amp;#8230; sometimes you do need to write a binary search, and the library routines won&amp;#8217;t get the job done.  Or if they will, they&amp;#8217;re grotesquely inefficient.  For example, suppose you have a 64-bit integer, and you need to find out whether it&amp;#8217;s among the nine billion 64-bit integers that are stored in ascending order in a 72 Gb file.  The naive solution is to read the file into memory, making an array (or, heaven help us, an Array) of nine billion elements, then invoke the library search function.  And of course that just plain won&amp;#8217;t work &amp;#8212; the array won&amp;#8217;t fit in memory.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Agreed! We should know how our wheels work and be ready to reinvent them when necessary: but C++&amp;#8217;s &lt;code&gt;std::binary_search&lt;/code&gt; &lt;strong&gt;will&lt;/strong&gt; solve this problem efficiently. All we need is a suitable iterator over the file, in this case one which:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     increments in 8 byte steps
 &lt;/li&gt;

 &lt;li&gt;
     uses file seeks for larger steps
 &lt;/li&gt;

 &lt;li&gt;
     is dereferenced by reading 8 byte values from the file
 &lt;/li&gt;

 &lt;li&gt;
     stores file position, for use in ordering and distance operations
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I include an &lt;a href="http://wordaligned.org/articles/binary-search#tociterating-over-numbers-in-a-file"&gt;implementation&lt;/a&gt; of just such an iterator towards the end of this article. My aging laptop didn&amp;#8217;t have enough disk space for a 72GB data file but I found room for a 5GB one. &lt;code&gt;std::binary_search()&lt;/code&gt; took milliseconds to test the presence of values in this file, and the times improved dramatically on repeat runs; using a linear search, the time extended to minutes, and repeat runs showed no such improvements.
&lt;/p&gt;
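
&lt;p&gt;The four requirements listed above can be sketched without any iterator scaffolding at all. The Python below is an illustration under assumptions of my own (little-endian, signed, 8 byte values, and the invented names &lt;code&gt;read_item&lt;/code&gt; and &lt;code&gt;file_contains&lt;/code&gt;); it is not the C++ implementation this article goes on to present.
&lt;/p&gt;

```python
import os
import tempfile
from operator import lt

WIDTH = 8  # bytes per stored integer

def read_item(f, index):
    '''Dereference: seek to the index-th value and decode it.'''
    f.seek(index * WIDTH)
    return int.from_bytes(f.read(WIDTH), "little", signed=True)

def file_contains(path, target):
    '''Binary search a file of sorted, fixed-width integers.'''
    count = os.path.getsize(path) // WIDTH
    with open(path, "rb") as f:
        lo, hi = 0, count
        while lo != hi:
            mid = (lo + hi) // 2
            if lt(read_item(f, mid), target):  # lt(a, b) is less-than
                lo = mid + 1
            else:
                hi = mid
        return lo != count and read_item(f, lo) == target

# Write a small sorted file of even numbers, then search it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for n in range(0, 2000, 2):
        tmp.write(n.to_bytes(WIDTH, "little", signed=True))

found_even = file_contains(tmp.name, 1234)
found_odd = file_contains(tmp.name, 1235)
os.remove(tmp.name)
print(found_even, found_odd)  # True False
```

&lt;p&gt;Only a handful of 8 byte reads are needed per search, however large the file, which is the property the iterator needs to preserve.
&lt;/p&gt;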

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc2" name="tocstdbinarysearch-requirements" id="tocstdbinarysearch-requirements"&gt;Std::binary_search() requirements&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It&amp;#8217;s fair to suggest that creating a custom iterator just so we could use &lt;code&gt;std::binary_search&lt;/code&gt; merely moves the problem. The iterator&amp;#8217;s implementation is longer and arguably more fiddly than any custom binary search function would be. Why couldn&amp;#8217;t we use a standard &lt;a href="http://www.sgi.com/tech/stl/istream_iterator.html"&gt;input stream iterator&lt;/a&gt; with the standard binary search algorithm?
&lt;/p&gt;
&lt;p&gt;The reason is that &lt;code&gt;std::istream_iterator&lt;/code&gt;s are &lt;a href="http://www.sgi.com/tech/stl/InputIterator.html" title="SGI STL input iterator documentation"&gt;input iterators&lt;/a&gt;, suitable only for single pass algorithms. Binary search doesn&amp;#8217;t need to take any backwards steps but it does need to be able to copy its iterators and advance them repeatedly. As a minimum, then, it requires &lt;a href="http://www.sgi.com/tech/stl/ForwardIterator.html" title="SGI STL forwards iterator documentation"&gt;forward iterators&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;Note the algorithm&amp;#8217;s &lt;a href="http://www.sgi.com/tech/stl/binary_search.html"&gt;complexity&lt;/a&gt;!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The number of comparisons is logarithmic: at most &lt;code&gt;log(last - first) + 2&lt;/code&gt;. If ForwardIterator is a Random Access Iterator then the number of steps through the range is also logarithmic; otherwise, the number of steps is proportional to &lt;code&gt;last - first&lt;/code&gt;.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;In the case of our large file of numbers, comparisons are cheap; there&amp;#8217;s little point minimising them if we&amp;#8217;re going to take billions of short steps through the file. This is why we created a random access file iterator&lt;a id="fn1link" href="http://wordaligned.org/articles/binary-search#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;.
&lt;/p&gt;
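&lt;p&gt;To make the distinction concrete, here is a minimal sketch which counts the comparisons &lt;code&gt;std::binary_search&lt;/code&gt; makes over a vector (random access iterators) and over a singly linked list (forward iterators only). The names &lt;code&gt;counting_less&lt;/code&gt;, &lt;code&gt;count_comparisons&lt;/code&gt; and &lt;code&gt;compare_containers&lt;/code&gt; are hypothetical helpers made up for this illustration; they are not part of any library.
&lt;/p&gt;

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: a less-than which counts how often it is called.
struct counting_less
{
    long * count;
    bool operator()(int a, int b) const
    {
        ++*count;
        return a < b;
    }
};

// Returns the number of comparisons binary_search makes over [first, last).
template <typename fwd_it>
long count_comparisons(fwd_it first, fwd_it last, int target)
{
    long count = 0;
    bool const found =
        std::binary_search(first, last, target, counting_less{&count});
    (void)found; // only the comparison count matters here
    return count;
}

// Builds the sorted values 0, 1, ..., n - 1 in a vector and in a
// forward_list, searches both, and returns the two comparison counts.
std::pair<long, long> compare_containers(int n, int target)
{
    assert(n >= 0);
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::forward_list<int> fl(v.begin(), v.end());
    return std::make_pair(count_comparisons(v.begin(), v.end(), target),
                          count_comparisons(fl.begin(), fl.end(), target));
}
```

&lt;p&gt;For a million elements both counts should come out at a couple of dozen. What the counts do not show is the stepping: to measure and split the list, the algorithm has to walk it node by node, which is the linear cost the quoted documentation warns about.
&lt;/p&gt;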
&lt;p&gt;A more subtle point is that binary search deals with equivalence rather than equality: it only requires a less-than operator (or a comparison function), and returns true if it can find an element &lt;code&gt;x&lt;/code&gt; which satisfies &lt;code&gt;!(x &amp;lt; t) &amp;amp;&amp;amp; !(t &amp;lt; x)&lt;/code&gt;.
&lt;/p&gt;
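&lt;p&gt;As an illustration of equivalence, consider a comparison which ignores letter case. The sketch below assumes a hypothetical &lt;code&gt;less_ignoring_case&lt;/code&gt; function, written for this example only, and passes it to &lt;code&gt;std::binary_search&lt;/code&gt; as the comparison function.
&lt;/p&gt;

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical comparison: orders strings while ignoring letter case.
bool less_ignoring_case(std::string const & a, std::string const & b)
{
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y)
        { return std::tolower(x) < std::tolower(y); });
}

// True if some element of the sorted range is equivalent to the target,
// that is, neither orders before the other under less_ignoring_case.
bool contains_ignoring_case(std::vector<std::string> const & sorted,
                            std::string const & target)
{
    assert(std::is_sorted(sorted.begin(), sorted.end(), less_ignoring_case));
    return std::binary_search(sorted.begin(), sorted.end(), target,
                              less_ignoring_case);
}
```

&lt;p&gt;Given the names &lt;code&gt;alice&lt;/code&gt;, &lt;code&gt;Bob&lt;/code&gt; and &lt;code&gt;carol&lt;/code&gt;, a search for &lt;code&gt;BOB&lt;/code&gt; succeeds even though no element is equal to it: &lt;code&gt;Bob&lt;/code&gt; and &lt;code&gt;BOB&lt;/code&gt; are merely equivalent under this ordering.
&lt;/p&gt;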
&lt;p&gt;The point I&amp;#8217;m making is that C++ does a nice job of separating algorithms and containers, which is why the same algorithm can be used on vectors, files, arrays etc. It also carefully defines minimum requirements on the types used by algorithms&lt;a id="fn2link" href="http://wordaligned.org/articles/binary-search#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc3" name="tocstdbinarysearch-limitations" id="tocstdbinarysearch-limitations"&gt;Std::binary_search() limitations&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;We noted earlier that &lt;code&gt;std::binary_search&lt;/code&gt; delivers nothing more than a binary result. Is the element there or not? From the SGI STL &lt;a href="http://www.sgi.com/tech/stl/binary_search.html"&gt;documentation&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Note that this is not necessarily the information you are interested in!
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Even &lt;code&gt;bsearch&lt;/code&gt; tells us where it found the match; or rather, where it found &lt;b&gt;a&lt;/b&gt; match, since there could be several. This imprecision is one of &lt;code&gt;bsearch&lt;/code&gt;&amp;#8217;s failings &amp;#8212; but it really lets us down when it can&amp;#8217;t find the element: in this case, it subdivides the range until it finds where the element would be if it were there, realises there is no match, then throws all positional information away and returns a null pointer.
&lt;/p&gt;
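&lt;p&gt;A short sketch of both behaviours, calling &lt;code&gt;std::bsearch&lt;/code&gt; from C++. The wrapper &lt;code&gt;find_index&lt;/code&gt; and the comparison &lt;code&gt;compare_ints&lt;/code&gt; are hypothetical names introduced for this example.
&lt;/p&gt;

```cpp
#include <cstddef>
#include <cstdlib>

// Three-way comparison of two ints, in the form bsearch expects.
int compare_ints(void const * a, void const * b)
{
    int const x = *static_cast<int const *>(a);
    int const y = *static_cast<int const *>(b);
    return (x > y) - (x < y);
}

// Returns the index of some element equal to key, or -1 if there is none.
// Which of several equal elements is found is unspecified, and on failure
// nothing is learned about where key would belong.
std::ptrdiff_t find_index(int const * sorted, std::size_t n, int key)
{
    void const * hit =
        std::bsearch(&key, sorted, n, sizeof(int), compare_ints);
    if (hit == nullptr)
    {
        return -1;
    }
    return static_cast<int const *>(hit) - sorted;
}
```

&lt;p&gt;Searching the array &lt;code&gt;1, 3, 3, 3, 7&lt;/code&gt; for 3 may return index 1, 2 or 3; searching it for 4 returns -1, with no hint that 4 belongs just before the 7.
&lt;/p&gt;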
&lt;p&gt;Suppose our large file represents a set of numbers and we want to know where our test number should go in this file, if it isn&amp;#8217;t already present? A C++ binary search algorithm can do this, but it isn&amp;#8217;t &lt;code&gt;std::binary_search&lt;/code&gt;.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc4" name="toclocating-missing-elements" id="toclocating-missing-elements"&gt;Locating missing elements&lt;/a&gt;&lt;/h3&gt;
&lt;img src="http://wordaligned.org/images/london-marathon-2008.jpg" alt="London Marathon, runners crossing Tower Bridge"/&gt;

&lt;p&gt;Here&amp;#8217;s another problem binary search can solve. Suppose we want to know how many runners finished the 2010 London marathon in a time between 3 and 4 hours. Let&amp;#8217;s suppose we&amp;#8217;ve already loaded the ordered finishing times into an array.
&lt;/p&gt;
&lt;p&gt;We might try using &lt;code&gt;bsearch&lt;/code&gt; to find the position of the runners who finished with a time of exactly 3 hours and with a time of exactly 4 hours. Then the answer would be the difference between these two positions.
&lt;/p&gt;
&lt;p&gt;There are two problems with this approach:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     what if no one finished with a time of &lt;strong&gt;exactly&lt;/strong&gt; 3 or 4 hours? 
 &lt;/li&gt;

 &lt;li&gt;
     what if more than one runner finished with a time of exactly 3 or 4 hours?
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In the first case &lt;code&gt;bsearch&lt;/code&gt; returns a null pointer and we can&amp;#8217;t complete our calculation. In the second case, &lt;code&gt;bsearch&lt;/code&gt; makes no guarantees about which of the equally-placed runners it will find, and even if we can make our calculation, we cannot be sure it is correct.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;bsearch&lt;/code&gt; is not much use, then, but a binary search can give us our answer. 
&lt;/p&gt;
&lt;p&gt;Imagine we had a late result for the race, a runner who recorded a time of exactly 3 hours. What&amp;#8217;s the first position in the array at which we could place this runner, whilst maintaining the array ordering? Similarly, where&amp;#8217;s the first position at which we could insert a runner with a time of 4 hours, maintaining the array ordering? Both these positions are well defined and precise &amp;#8212; even if everyone finished the race in less than 3 hours, or even if no one ran the race &amp;#8212; and the correct answer is their difference.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc5" name="toclowerbound" id="toclowerbound"&gt;Lower_bound&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;C++ supplies just such an algorithm. It goes by the name of &lt;a href="http://www.sgi.com/tech/stl/lower_bound.html"&gt;&lt;code&gt;std::lower_bound&lt;/code&gt;&lt;/a&gt;, but really it&amp;#8217;s good old binary search. We want to find the first place our target element could go, whilst maintaining the ordering, which we do by repeatedly splitting the range.
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     while the range is non-empty
 &lt;/li&gt;

 &lt;li&gt;
     look at the element in the middle of the range
 &lt;/li&gt;

 &lt;li&gt;
     is its value less than the target value?
 &lt;/li&gt;

 &lt;li&gt;
     if so, continue looking in the top half of the range
 &lt;/li&gt;

 &lt;li&gt;
     if not, continue looking in the bottom half of the range
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The while loop exits when the range has been reduced to a single point and this point is what we return. On my platform, the code itself reads a bit like:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename fwd_it, typename t&amp;gt;
fwd_it
lower_bound(fwd_it first, fwd_it last, const t &amp;amp; val)
{
    typedef typename iterator_traits&amp;lt;fwd_it&amp;gt;::difference_type distance;
    
    distance len = std::distance(first, last);
    distance half;
    fwd_it middle;
    
    while (len &amp;gt; 0)
    {
        half = len &amp;gt;&amp;gt; 1;
        middle = first;
        std::advance(middle, half);
        if (*middle &amp;lt; val)
        {
            first = middle;
            ++first;
            len = len - half - 1;
        }
        else
            len = half;
    }
    return first;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I think this version of binary search is &lt;a href="http://wordaligned.org/articles/next-permutation" title="Next_permutation: when C++ gets it right"&gt;yet another gem from the C++ standard library&lt;/a&gt;. As Jon Bentley and Mike Taylor eloquently point out, the implementation is subtle &amp;#8212; in particular, if &lt;code&gt;(*middle &amp;lt; val)&lt;/code&gt; we must eliminate &lt;code&gt;middle&lt;/code&gt; or risk an infinite loop &amp;#8212; but by tightening the problem specification and paring back the requirements we&amp;#8217;ve created a function which is far more useful than &lt;code&gt;bsearch&lt;/code&gt; and arguably simpler to code.
&lt;/p&gt;
&lt;p&gt;For comparison, here&amp;#8217;s the &lt;code&gt;bsearch&lt;/code&gt; implemented by glibc, version 2.11.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;/* Perform a binary search for KEY in BASE which has NMEMB elements
   of SIZE bytes each.  The comparisons are done by (*COMPAR)().  */
void *
bsearch (const void *key, const void *base, size_t nmemb, size_t size,
         int (*compar) (const void *, const void *))
{
  size_t l, u, idx;
  const void *p;
  int comparison;
  
  l = 0;
  u = nmemb;
  while (l &amp;lt; u)
    {
      idx = (l + u) / 2;
      p = (void *) (((const char *) base) + (idx * size));
      comparison = (*compar) (key, p);
      if (comparison &amp;lt; 0)
        u = idx;
      else if (comparison &amp;gt; 0)
        l = idx + 1;
      else
        return (void *) p;
    }

  return NULL;
}

&lt;/pre&gt;

&lt;/div&gt;
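&lt;p&gt;Returning to the marathon question, here is a minimal sketch of the count using &lt;code&gt;std::lower_bound&lt;/code&gt;. It assumes the finishing times are held in seconds in a sorted vector, and the function name &lt;code&gt;count_finishers&lt;/code&gt; is made up for this example. Because both bounds are found with &lt;code&gt;lower_bound&lt;/code&gt;, a runner on exactly the lower limit is counted and a runner on exactly the upper limit is not.
&lt;/p&gt;

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the finishing times t, in seconds, with lo <= t < hi.
// Assumes `times` is sorted in ascending order.
std::ptrdiff_t count_finishers(std::vector<int> const & times, int lo, int hi)
{
    assert(std::is_sorted(times.begin(), times.end()));
    // The first position at which a runner with time lo (or hi) could be
    // inserted without disturbing the ordering.
    auto const first = std::lower_bound(times.begin(), times.end(), lo);
    auto const last = std::lower_bound(times.begin(), times.end(), hi);
    return last - first;
}
```

&lt;p&gt;A call such as &lt;code&gt;count_finishers(times, 3 * 3600, 4 * 3600)&lt;/code&gt; gives a well defined answer whether or not anyone finished on exactly 3 or 4 hours, and returns zero for an empty array.
&lt;/p&gt;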


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc6" name="tocbinary-search-variants" id="tocbinary-search-variants"&gt;Binary search variants&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;On my platform, &lt;code&gt;std::binary_search&lt;/code&gt; is built directly on &lt;code&gt;std::lower_bound&lt;/code&gt;. Here&amp;#8217;s the code.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename fwd_it, typename t&amp;gt;
bool
binary_search(fwd_it first, fwd_it last, const t &amp;amp; val)
{
    fwd_it i = std::lower_bound(first, last, val);
    return i != last &amp;amp;&amp;amp; !(val &amp;lt; *i);
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;std::upper_bound&lt;/code&gt; searches a sorted range to find the last position at which an item could be inserted without changing the ordering.
&lt;/p&gt;
&lt;p&gt;&lt;code&gt;std::equal_range&lt;/code&gt; returns a pair of iterators, logically equal to &lt;code&gt;make_pair(lower_bound(...), upper_bound(...))&lt;/code&gt;.
&lt;/p&gt;
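&lt;p&gt;The relationship between the three variants can be checked directly. The sketch below uses a hypothetical helper, &lt;code&gt;count_equal&lt;/code&gt;, which counts the elements equivalent to a value and asserts along the way that &lt;code&gt;equal_range&lt;/code&gt; agrees with the two separate bound searches.
&lt;/p&gt;

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts the elements equivalent to val in a sorted vector.
std::ptrdiff_t count_equal(std::vector<int> const & sorted, int val)
{
    typedef std::vector<int>::const_iterator it;
    std::pair<it, it> const range =
        std::equal_range(sorted.begin(), sorted.end(), val);
    // equal_range agrees with the two separate bound searches.
    assert(range.first == std::lower_bound(sorted.begin(), sorted.end(), val));
    assert(range.second == std::upper_bound(sorted.begin(), sorted.end(), val));
    return range.second - range.first;
}
```

&lt;p&gt;For the values &lt;code&gt;1, 3, 3, 3, 7&lt;/code&gt; the count for 3 is three, and the count for 4 is zero: both bounds land on the position of the 7, so the range is empty.
&lt;/p&gt;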
&lt;hr /&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc7" name="tociterating-over-numbers-in-a-file" id="tociterating-over-numbers-in-a-file"&gt;Iterating over numbers in a file&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The iterator class I created to use &lt;code&gt;std::binary_search&lt;/code&gt; on a file containing fixed width binary formatted numbers appears below. To determine whether the file &lt;code&gt;numbers.bin&lt;/code&gt; contains the target value &lt;code&gt;288230376151711744&lt;/code&gt;, we would write something like:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;

....
    
    typedef binary_file_number_iter&amp;lt;long long, 8&amp;gt; iter;
    long long target = 288230376151711744LL;
    
    bool found = std::binary_search(iter("numbers.bin", iter::begin),
                                    iter("numbers.bin", iter::end),
                                    target);

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To test the performance of these iterators I created a 5GB binary file packed with 8 byte numbers. These numbers were multiples of 3:
&lt;/p&gt;
&lt;pre title="File contents"&gt;
0, 3, 6, 9, ..., 2015231997
&lt;/pre&gt;
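&lt;p&gt;The program used to generate this file is not shown here. One way it might be done is sketched below, under the assumption that each number is written as 8 bytes, most significant byte first, which is the format the iterator expects. The function name &lt;code&gt;write_multiples_of_three&lt;/code&gt; is hypothetical.
&lt;/p&gt;

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes 0, 3, 6, ... (count values in all) to the named file, each as
// an 8 byte value, most significant byte first.
void write_multiples_of_three(std::string const & filename, long long count)
{
    std::ofstream out(filename.c_str(), std::ios::binary);
    assert(out.is_open());
    for (long long i = 0; i != count; ++i)
    {
        unsigned long long const value =
            3ULL * static_cast<unsigned long long>(i);
        char buf[8];
        for (int b = 0; b != 8; ++b)
        {
            // Byte 0 holds the top 8 bits, byte 7 the bottom 8.
            buf[b] = static_cast<char>((value >> (8 * (7 - b))) & 0xff);
        }
        out.write(buf, 8);
    }
}
```

&lt;p&gt;A file ending at 2015231997 holds 671744000 such values, or 5373952000 bytes.
&lt;/p&gt;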

&lt;p&gt;I then timed how long it took to search this file for 10 interesting numbers (and to confirm the returned results were as expected).
&lt;/p&gt;
&lt;pre title="Search targets"&gt;
-1, 0, 1, 2, 1007616000, 1007616001, 1007616002, 1007616003, 2015231997, 2015232000
&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;std::binary_search()&lt;/code&gt; recorded a time of 0.308 seconds on a rather old MacBook&lt;a id="fn3link" href="http://wordaligned.org/articles/binary-search#fn3"&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt;. Using a hand-coded linear search the run time was just over 38 minutes. That is, the binary search ran more than 7000 times faster on this sample.
&lt;/p&gt;
&lt;p&gt;Interestingly, repeated runs of the binary search test using the same input file and the same targets ran in an average time of just 0.030 seconds, a 10-fold speed-up over the first run. By contrast, repeating the linear search showed no such improvement. I&amp;#8217;m attributing this to operating system file caching, but I don&amp;#8217;t pretend to know exactly what&amp;#8217;s going on here. (My thanks to Michal Mocny for his explanation in the &lt;a href="http://wordaligned.org/articles/binary-search#comment-49972118"&gt;comments&lt;/a&gt; below).
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Binary file number iterator&lt;/div&gt;

&lt;pre class="prettyprint"&gt;#ifndef BINARY_FILE_NUMBER_ITERATOR_HPP_INCLUDED
#define BINARY_FILE_NUMBER_ITERATOR_HPP_INCLUDED

#include &amp;lt;fstream&amp;gt;
#include &amp;lt;ios&amp;gt;
#include &amp;lt;iostream&amp;gt;
#include &amp;lt;iterator&amp;gt;
#include &amp;lt;stdexcept&amp;gt;
#include &amp;lt;string&amp;gt;

// File read error, thrown when low level file access fails.
class file_read_error : public std::runtime_error
{
public:
    file_read_error(std::string const &amp;amp; what)
        : std::runtime_error(what)
    {
    }
};

// This iterator class is used for numbers packed into a file
// using a fixed width binary format. Numbers must be packed
// most significant byte first.
//
// The file is not read into memory. Iterators are moved by
// file seeking and dereferenced by reading from the file.
//
// These iterators declare themselves to be random access
// iterators but a file is not usually a random access device.
// For example, advancing an iterator a large distance may well
// take longer than advancing a small distance.
template &amp;lt;typename number, int number_size&amp;gt;
class binary_file_number_iter
{
    typedef binary_file_number_iter&amp;lt;number, number_size&amp;gt; iter;
    
private: // Sanity
    // Check things are OK, closing the stream and throwing an error on failure.
    void check(bool ok, std::string const &amp;amp; what)
    {
        if (!ok)
        {
            close();
            throw file_read_error(what);          
        }
    }
    
public: // Traits typedefs, which make this class usable with
        // algorithms which need a random access iterator.
    typedef std::random_access_iterator_tag iterator_category;
    typedef number value_type;
    typedef std::streamoff difference_type;
    typedef number * pointer;
    typedef number &amp;amp; reference;
    
public:
    static int const number_width = number_size;
    
public: // Enum used to construct begin, end iterators
    enum start_pos { begin, end };
    
public: // Lifecycle
    binary_file_number_iter(std::string const &amp;amp; filename,
                            start_pos where = begin)
        : filename(filename)
        , pos(-1)
    {
        open();
        if (where == end)
        {
            seek_end();
        }
    }
    
    binary_file_number_iter()
        : pos(-1)
    {
    }
    
    binary_file_number_iter(iter const &amp;amp; other)
        : filename(other.filename)
    {
        open();
        setpos(other.pos);
    }
    
    ~binary_file_number_iter()
    {
        close();
    }
    
    iter &amp;amp; operator=(iter const &amp;amp; other)
    {
        close();
        filename = other.filename;
        open();
        setpos(other.pos);
        return *this;
    } 
    
public: // Comparison
        // Note: it is an error to compare iterators into different files.
    bool operator&amp;lt;(iter const &amp;amp; other) const
    {
        return pos &amp;lt; other.pos;
    }
    
    bool operator&amp;gt;(iter const &amp;amp; other) const
    {
        return pos &amp;gt; other.pos;
    }
    
    bool operator==(iter const &amp;amp; other) const
    {
        return pos == other.pos;
    }
    
    bool operator!=(iter const &amp;amp; other) const
    {
        return pos != other.pos;
    }
    
public: // Iteration
    iter &amp;amp; operator++()
    {
        return *this += 1;
    }
    
    iter &amp;amp; operator--()
    {
        return *this -= 1;
    }
    
    iter operator++(int)
    {
        iter tmp(*this);
        ++(*this);
        return tmp;
    }
    
    iter operator--(int)
    {
        iter tmp(*this);
        --(*this);
        return tmp;
    }
    
public: // Step
    iter &amp;amp; operator+=(difference_type n)
    {
        advance(n);
        return *this;
    }
    
    iter &amp;amp; operator-=(difference_type n)
    {
        advance(-n);
        return *this;
    }
    
    iter operator+(difference_type n)
    {
        iter result(*this);
        return result += n;
    }
    
    iter operator-(difference_type n)
    {
        iter result(*this);
        return result -= n;
    }
    
public: // Distance
    difference_type operator-(iter const &amp;amp; other) const
    {
        return (pos - other.pos) / number_size;
    }
    
public: // Access
    value_type operator*()
    {
        return (*this)[0];
    }
    
    value_type operator[](difference_type n)
    {
        std::streampos restore = getpos();
        advance(n);
        value_type const result = read();
        setpos(restore);
        return result;
    }
    
    // Allow access to the underlying stream position
    std::streampos getpos()
    {
        std::streampos s = in.tellg();
        check(!in.fail(), "getpos failed");
        return s;
    }
private: // Implementation details
    void open()
    {
        in.open(filename.c_str(), std::ios::binary);
        check(!in.fail(), "open failed");
        pos = getpos();
    }
    
    void close()
    {
        if (in.is_open())
        {
            in.close();
        }
    }
    
    void advance(difference_type n)
    {
        check(!in.seekg(n * number_size, std::ios_base::cur).fail(), "advance failed");
        pos = getpos();
    }
    
    void seek_end()
    {
        check(!in.seekg(0, std::ios_base::end).fail(), "seek_end failed");
        pos = getpos();
    }
    
    void setpos(std::streampos newpos)
    {
        check(!in.seekg(newpos).fail(), "setpos failed");
        pos = newpos;
    }
    
    value_type read()
    {
        number n = 0;
        unsigned char buf[number_size];
        check(!in.read((char *)buf, number_size).fail(), "read failed");
        
        for (int i = 0; i != number_size; ++i)
        {
            n &amp;lt;&amp;lt;= 8;
            n |= buf[i];
        }
        return n;
    }
    
private: // State
    std::string filename;
    std::ifstream in;
    std::streampos pos;
};

#endif // BINARY_FILE_NUMBER_ITERATOR_HPP_INCLUDED

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here are some basic tests for the binary file number iterator.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Test binary file number iterator&lt;/div&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;assert.h&amp;gt;
#include &amp;lt;fstream&amp;gt;
#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;ext/algorithm&amp;gt; // For Gnu's non-standard is_sorted
#include &amp;lt;iostream&amp;gt;

#include "binary_file_number_iterator.hpp"

typedef binary_file_number_iter&amp;lt;long long, 8&amp;gt; iter8;
typedef binary_file_number_iter&amp;lt;int, 4&amp;gt; iter4;
typedef binary_file_number_iter&amp;lt;short, 2&amp;gt; iter2;
typedef binary_file_number_iter&amp;lt;char, 1&amp;gt; iter1;

template &amp;lt;typename fwd_it&amp;gt;
bool is_sorted(fwd_it beg, fwd_it end)
{
    return __gnu_cxx::is_sorted(beg, end);
}

char const * empty_test_file(char const * name)
{
    std::ofstream ofile;
    ofile.open(name);
    ofile.close();
    return name;
}

/*
  Create a small test file containing numbers, in ascending order,
  for number sizes 1, 2, 4 and 8 bytes.
  
  A hex view of the file looks like:
  
  0000 0000 0000 0000 0303 0303 0303 0303
  0606 0606 0606 0606 0909 0909 0909 0909
  0c0c 0c0c 0c0c 0c0c 0f0f 0f0f 0f0f 0f0f
*/
char const * basic_test_file(char const * name)
{
    std::ofstream ofile;
    ofile.open(name);
    for (unsigned char i = 0; i != 18; i += 3)
        for (unsigned j = 0; j != 8; ++j)
            ofile &amp;lt;&amp;lt; i;            
    ofile.close();
    return name;
}

void empty_file_tests()
{
    char const * empty_file = empty_test_file("empty_test_file");
    iter1 beg(empty_file, iter1::begin);
    iter1 end(empty_file, iter1::end);
    assert(beg == end);
    assert(std::lower_bound(beg, end, -1) == end);
    assert(std::upper_bound(beg, end, -1) == end);
    assert(!std::binary_search(beg, end, 0));
    assert(std::equal_range(beg, end, -1) == std::make_pair(beg, beg));
}

template &amp;lt;typename value_type&amp;gt;
value_type repeat(int v, int w)
{
    value_type result = 0;
    while (w-- != 0)
    {
        result &amp;lt;&amp;lt;= 8;
        result |= v;
    }
    return result;
}

template &amp;lt;typename iter&amp;gt;
void basic_file_tests()
{
    char const * basic_file = basic_test_file("basic_test_file");
    
    typedef typename iter::value_type value_t;
    typedef typename std::pair&amp;lt;iter, iter&amp;gt; range;
    int const w = iter::number_width;
    
    iter beg(basic_file, iter::begin);
    iter end(basic_file, iter::end);
    assert(beg &amp;lt; end);
    assert(!(beg &amp;gt; end));
    assert(!(beg == end));
    assert(beg != end);
    assert(end - beg == 48 / w);
    
    iter mid = beg;
    assert(mid[0] == 0);
    assert(mid[8/w] == repeat&amp;lt;value_t&amp;gt;(3, w));
    assert(*mid == 0);
    assert(*mid++ == 0);
    assert(*--mid == 0);
    assert(*(mid += 16/w) == repeat&amp;lt;value_t&amp;gt;(6, w));
    assert(mid &amp;lt; end);
    assert(mid &amp;gt; beg);
    
    assert(is_sorted(beg, end));
    assert(std::lower_bound(beg, mid, -1) == beg);
    assert(std::lower_bound(beg, mid, 0) == beg);
    assert(std::upper_bound(beg, mid, 0) == beg + 8/w);
    assert(std::upper_bound(beg, mid, 1) == beg + 8/w);
    assert(std::binary_search(beg, end, 0));
    assert(std::binary_search(beg, end, repeat&amp;lt;value_t&amp;gt;(0xf, w)));
    
    mid = beg + 8/w;
    assert(std::equal_range(beg, end, 0) == std::make_pair(beg, mid));
    assert(std::equal_range(beg, end, 1) == std::make_pair(mid, mid));
}

int main()
{
    empty_file_tests();
    basic_file_tests&amp;lt;iter1&amp;gt;();
    basic_file_tests&amp;lt;iter2&amp;gt;();
    basic_file_tests&amp;lt;iter4&amp;gt;();
    basic_file_tests&amp;lt;iter8&amp;gt;();
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc8" name="toctriple-fail" id="toctriple-fail"&gt;Triple fail&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In this article we&amp;#8217;ve discussed binary search:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     referring to existing implementations
 &lt;/li&gt;

 &lt;li&gt;
     calling library routines, such as &lt;code&gt;std::binary_search&lt;/code&gt;
 &lt;/li&gt;

 &lt;li&gt;
     and writing some tests.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Despite this indiscipline, we never even bothered to roll our own binary search: we&amp;#8217;ve tackled the exact opposite of the problem which Mike Taylor set. Programming tasks rarely start with a clear specification, and even if they do, the specification needs questioning.
&lt;/p&gt;
&lt;hr /&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/binary-search#toc9" name="tocthanks" id="tocthanks"&gt;Thanks&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;My thanks to &lt;a href="http://www.flickr.com/photos/pinprick/"&gt;pinprick&lt;/a&gt; for the &lt;a href="http://www.flickr.com/photos/pinprick/2547648374"&gt;cheese&lt;/a&gt; &lt;a href="http://www.flickr.com/photos/pinprick/2546825997"&gt;photos&lt;/a&gt;, and for this delicious description.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;morbier is a soft-ripened, washed rind cheese. the tradition of bathing the rinds in salty water (or strong ale) goes back to trappist monks, who perfected the art. washing the rind makes it tougher, protecting the cheese and making it last longer. washing the rind also makes it a place where a certain bacteria, b. linens, love to hang out. while they work their magic, making the cheese inside smooth and creamy and silky, they also make the outside stinky. there isn&amp;#8217;t any good way to put it. however, most stinky cheese taste amazing, and once you realize that, you find that you love the smell of stinky cheese. stink on the outside means gold on the inside! 
   &amp;#8212; &lt;a href="http://www.flickr.com/photos/pinprick/2547648374"&gt;pinprick&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;How long before flickr implements scratch and sniff?
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/binary-search#fn1link"&gt;[1]&lt;/a&gt;: Well, something which masquerades as a random access iterator. Files are not usually random access devices, and the time taken by a seek operation may well vary with the seek offset. By supplying random access scaffolding, we at least ensure that a single, efficient, seek operation is used each time we advance the file position. 
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/binary-search#fn2link"&gt;[2]&lt;/a&gt;: The C++ standard describes the requirements on types in some detail. Unfortunately C++ implementations provide little support for enforcing these requirements. Violations are likely to be punished by &lt;a href="http://wordaligned.org/articles/koenigs-first-rule-of-debugging#a-problem-on-line-106"&gt;grotesque compiler warnings&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn3" href="http://wordaligned.org/articles/binary-search#fn3link"&gt;[3]&lt;/a&gt;: The laptop specification:
&lt;/p&gt;
&lt;pre&gt;
Hardware Overview:
  Model Name:	            MacBook
  Model Identifier:	    MacBook1,1
  Processor Name:	    Intel Core Duo
  Processor Speed:	    2 GHz
  Number Of Processors:	    1
  Total Number Of Cores:    2
  L2 Cache (per processor): 2 MB
  Memory:                   2 GB
  Bus Speed:                667 MHz
&lt;/pre&gt;</description>
<dc:date>2010-05-12</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/binary-search</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/k9VHRhXsS0U/binary-search</link>
<category>Algorithms</category>
<category>C</category>
<category>C++</category>
<feedburner:origLink>http://wordaligned.org/articles/binary-search</feedburner:origLink></item>

<item>
<title>Think, quote, escape</title>
<description>&lt;p&gt;Evidently Jamie had got in before me. Somehow he&amp;#8217;d unpacked the new 2U server and balanced it on the side of his desk. He looked hassled. I didn&amp;#8217;t ask. The server&amp;#8217;s fans whirred noisily. On Jamie&amp;#8217;s second monitor I could see the familiar chatter of Linux installing itself. I stepped over the cardboard and polystyrene, sat, woke my machine.
&lt;/p&gt;
&lt;p&gt;Moments later I heard the install disk eject. Jamie typed something, cursed. The disk clattered into the bin. The room grew quiet as the server shut down.
&lt;/p&gt;
&lt;p&gt;We were a small, new team. Nonetheless, we&amp;#8217;d put something substantial together. It built on a Java framework and ran on Linux. We&amp;#8217;d tweaked the framework, meaning we had to build it from source before building our own stuff, so a clean build took almost half an hour. Just over half an hour later, Jamie had burned a new install disk. He placed it in the server&amp;#8217;s DVD drive. The fans roared up again. Ten minutes later, more cursing, another disk in the bin.
&lt;/p&gt;
&lt;p&gt;I walked across. Jamie was glaring at some code. A Perl list? His cursor was poised over an item in this list, a single-quoted string, inside which was a sed command, whose arguments themselves needed quoting, which was evidently meant to edit the contents of a double-quoted string in some configuration file. Think: quote and escape. I could see there were &lt;em&gt;four&lt;/em&gt; DVDs in the bin. Jamie must have been in for some time. No wonder he looked hassled.
&lt;/p&gt;
&lt;p&gt;So, what&amp;#8217;s up?
&lt;/p&gt;
&lt;p&gt;A trade show in the States. The salesman was out there. A bare machine had already been delivered and he needed the software, so Jamie had to cut a DVD which would directly install the operating system together with our application. Please don&amp;#8217;t expect a salesman to download an ISO image and set up enough of a network to boot from it. A courier was booked for midday to collect the DVD. That&amp;#8217;s what&amp;#8217;s up. So leave me alone and let me get on with it.
&lt;/p&gt;
&lt;p&gt;He didn&amp;#8217;t actually say that last bit. It&amp;#8217;s true, though, he was the Linux expert. I stubbornly watched as he changed some double quotes for single ones, added a couple more backslashes, checked in the file, kicked off another build.
&lt;/p&gt;
&lt;p&gt;Back at my desk, I reviewed the version control change logs. Evidently Jamie was working on a post-install script which took the form of a list of actions which would be evaluated and executed in Perl. Looking at the file diffs, the sticking point seemed to be a sed command to edit the X display settings. Embedding sed within Perl was proving tricky.
&lt;/p&gt;
&lt;p&gt;Jamie&amp;#8217;s edit-build-burn-install-check cycle seemed crazy to me. Why not recreate the broken post-install step as a standalone operation? Soon enough I&amp;#8217;d found a way to reproduce the problem. After reading documentation and experimenting I figured out how to nest and escape the various strings. I admit, it took me longer than I expected. By this time Jamie had solved the problem by trial and error anyway &amp;#8212; and as proof he had an install disk which he knew worked. He may well have spent less time actually concentrating on the issue than I had; the build-burn-install phases of his process all ran as background activities.
&lt;/p&gt;
&lt;p&gt;In the event we needed to revise the software anyway. So the salesman had to download a new build at the last minute. Jamie stayed late the next day to talk him through the installation.
&lt;/p&gt;</description>
<dc:date>2010-03-30</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/think-quote-escape</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/6jdPZYA7SV8/think-quote-escape</link>
<category>Self</category>
<category>Build</category>
<feedburner:origLink>http://wordaligned.org/articles/think-quote-escape</feedburner:origLink></item>

<item>
<title>Beware the March of IDEs!</title>
<description>&lt;h3&gt;The March of IDEs&lt;/h3&gt;
&lt;blockquote&gt;&lt;p&gt;Visual Studio can be one of the programmer&amp;#8217;s best friends, but over the years it has become increasingly pushy, domineering, and suffering from unsettling control issues. Should we just surrender to Visual Studio&amp;#8217;s insistence on writing our code for us? Or is Visual Studio sapping our programming intelligence rather than augmenting it? 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Charles Petzold, &lt;a href="http://www.charlespetzold.com/etc/DoesVisualStudioRotTheMind.html"&gt;Does Visual Studio Rot the Mind?&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Charles_Petzold" title="Charles Petzold, Wikipedia"&gt;&lt;img src="http://upload.wikimedia.org/wikipedia/en/9/93/Charles_petzold.png" alt="Charles Petzold, Wikipedia"/&gt;&lt;/a&gt;
   &lt;a href="http://www.charlespetzold.com/pw5/" title="Programming Windows home page"&gt;&lt;img src="http://www.charlespetzold.com/pw5/pw5.png" alt="Programming Windows front cover"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Ides_of_March" title="The Ides of March is the name of March 15 in the Roman calendar"&gt;March the 15th&lt;/a&gt; is a good day to revisit a talk &lt;a href="http://www.charlespetzold.com/"&gt;Charles Petzold&lt;/a&gt; gave to the NYC .NET Developer&amp;#8217;s Group some four and a half years ago. The speaker is probably best known as the author of &lt;a href="http://www.charlespetzold.com/pw5/"&gt;Programming Windows&lt;/a&gt;, now in its fifth edition, and during his talk he confesses to a love/hate relationship with the software most people use for Windows programming &amp;#8212; Microsoft Visual Studio (VS).
&lt;/p&gt;
&lt;p&gt;If you haven&amp;#8217;t read the &lt;a href="http://www.charlespetzold.com/etc/DoesVisualStudioRotTheMind.html"&gt;talk&lt;/a&gt;, read it now; and if you have, give it another look. Charles Petzold frets about the addictive nature of VS: if you become hooked on IntelliSense will you become dependent and mentally flabby? And what about the code VS generates in directories which it chooses?
&lt;/p&gt;
&lt;p&gt;Beware! Your IDE is taking over: you&amp;#8217;ll end up working for it.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;The story has a happy ending. Charles Petzold finds an antidote, working on programs to solve a series of maths puzzles set by &lt;a href="http://www.newscientist.com/" title="New Scientist web site"&gt;New Scientist&lt;/a&gt;.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;I decided to use plain old ANSI C, and to edit the source code in Notepad &amp;#8212; which has no IntelliSense and no sense of any other kind &amp;#8212; and to compile on the command line using both the Microsoft C compiler and the Gnu C compiler.
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Me and my IDE&lt;/h3&gt;
&lt;p&gt;I rarely use Visual Studio these days, mainly because I&amp;#8217;m primarily developing on Unix platforms. I have used VS in the past, though even then I still turned to Emacs to edit files. For debugging, I&amp;#8217;ll admit VS is peerless. VS project files aren&amp;#8217;t portable. It took someone far more expert than me a considerable amount of time to create a cross-platform build system capable of keeping everyone happy. Similarly, a few years ago I spent a short while doing Java development on Linux. Again, I tended to fall back on Emacs for editing, while using &lt;a href="http://www.eclipse.org/"&gt;Eclipse&lt;/a&gt; to enforce style rules, tidy imports, debug, that kind of thing. I would have used Eclipse more but it kept on crashing. When I first started developing on a Mac I tried out Xcode. It opened far too many windows and GDB didn&amp;#8217;t seem to want to play with it. I abandoned it.
&lt;/p&gt;
&lt;p&gt;I suppose &lt;a href="http://wordaligned.org/articles/accidental-emacs" title="Things I discovered about Emacs, by accident"&gt;I do use Emacs&lt;/a&gt; as an IDE for code development, and for everything else, but it&amp;#8217;s set up the way I want and it never tries to smooth over the details of what it&amp;#8217;s up to. I frequently drop into the shell (even if it&amp;#8217;s a shell running within Emacs).
&lt;/p&gt;
&lt;p&gt;My biggest problem with the traditional IDE is the view it imposes on a software project. The IDE makes you feel like you&amp;#8217;re an airplane pilot sitting in front of a bank of controls, steering great swathes of code on a demanding course. The &lt;strong&gt;I&lt;/strong&gt; in &lt;strong&gt;I&lt;/strong&gt;DE emphasises integration; it&amp;#8217;s in danger of creating &lt;strong&gt;I&lt;/strong&gt;nterdependencies, of coupling everything. I prefer something more focused.
&lt;/p&gt;

&lt;h3&gt;Learning to Say &amp;#8220;Hello, World&amp;#8221;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://oreilly.com/catalog/9780596809492"&gt;&lt;img src="http://covers.oreilly.com/images/9780596809492/cat.gif" alt="97TEPSK cover"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;It took me a while to reach this view. Charles Petzold&amp;#8217;s epiphany is almost identical to one I had myself, many years ago. 
&lt;/p&gt;
&lt;p&gt;Towards the end of last year, O&amp;#8217;Reilly books posed the question: What 97 things should every programmer know? In answer, I wrote about the moment I decided to shut down my IDE and get back to basics.
&lt;/p&gt;
&lt;p&gt;You can find the story, &lt;a href="http://programmer.97things.oreilly.com/wiki/index.php/Learn_to_Say_%22Hello,_World%22" title="My contribution to 97 Things Every Programmer Should Know"&gt;Learn to Say &amp;#8220;Hello, World&amp;#8221;&lt;/a&gt; on the wiki which was used for developing the book. Alternatively, the submission was accepted for publication, so you can buy &lt;a href="http://oreilly.com/catalog/9780596809492"&gt;the book&lt;/a&gt; itself.
&lt;/p&gt;
&lt;hr /&gt;


&lt;h3&gt;The Ides of March&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.britishmuseum.org/explore/highlights/highlight_objects/cm/s/silver_denarius_of_marcus_juni.aspx" title="A coin celebrating ancient Rome's most famous murder"&gt;&lt;img src="http://www.britishmuseum.org/images/an107083_m.jpg" alt="A coin celebrating ancient Rome's most famous murder"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;This [silver denarius] was struck in honour of Marcus Junius Brutus, one of the assassins of Julius Caesar. The reverse shows the cap of liberty given to freed slaves flanked by two daggers. This indicates Brutus&amp;#8217; intention of freeing Rome from Caesar&amp;#8217;s imperial ambitions and the murder weapons employed to do so. Below is the day of the deed; EID.MAR, the ides of March.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; From the &lt;a href="http://www.britishmuseum.org/explore/highlights/highlight_objects/cm/s/silver_denarius_of_marcus_juni.aspx" title="A coin celebrating ancient Rome's most famous murder"&gt;British Museum&lt;/a&gt; collection
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Credits&lt;/h3&gt;
&lt;p&gt;My thanks to &lt;a href="http://curbralan.com/" title="Curbralan, Kevlin Henney's consultancy"&gt;Kevlin Henney&lt;/a&gt; for editing my contribution to &lt;a href="http://oreilly.com/catalog/9780596809492"&gt;97 Things Every Programmer Should Know&lt;/a&gt;, and indeed to everyone at O&amp;#8217;Reilly involved with putting the book together.
&lt;/p&gt;
&lt;p&gt;I wish I could claim the title of this blog post was my own invention. I&amp;#8217;m also struggling to remember where I first heard it. I think it might have been Matt Bowers, who also invented &lt;a href="http://wordaligned.org/articles/antisocial-build-orders" title="Anti-Social Build Orders"&gt;build ASBOs&lt;/a&gt;. Anyone?
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/G3wwyfql5Rc" height="1" width="1"/&gt;</description>
<dc:date>2010-03-15</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/beware-the-march-of-ides</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/G3wwyfql5Rc/beware-the-march-of-ides</link>
<category>IDE</category>
<category>Windows</category>
<feedburner:origLink>http://wordaligned.org/articles/beware-the-march-of-ides</feedburner:origLink></item>

<item>
<title>Pi seconds is a nanocentury</title>
<description>&lt;p&gt;&lt;a href="http://www.wolframalpha.com/input/?i=pi" title="Pi, Wolfram alpha" style="font-weight: 900; font-size: 2000%;"&gt;&amp;pi;&lt;/a&gt;
&lt;/p&gt;
&lt;blockquote&gt;&lt;blockquote&gt;&lt;p&gt;&lt;i&gt;&amp;pi; seconds is a nanocentury&lt;/i&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;&amp;#8212; Jon Bentley, quoting Tom Duff, &lt;a href="http://www.cs.bell-labs.com/cm/cs/pearls/" title="Programming Pearls website."&gt;Programming Pearls&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;No, this isn&amp;#8217;t a mysterious and fundamental connection between circles and the solar system. Actually, it&amp;#8217;s incorrect: 
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     nanocentury ~3.15576 seconds
 &lt;/li&gt;

 &lt;li&gt;
     ratio of a circle&amp;#8217;s circumference to its diameter ~3.14159
 &lt;/li&gt;
&lt;/ul&gt;
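&lt;p&gt;As a rough check of the two figures above, here is a minimal Python sketch. It assumes a year of 365.25 days, and the constant names are illustrative rather than taken from Programming Pearls.
&lt;/p&gt;

```python
import math

# Assumption: a year of 365.25 days, so a century is 36525 days.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365.25

seconds_per_century = 100 * DAYS_PER_YEAR * SECONDS_PER_DAY
nanocentury = seconds_per_century * 1e-9   # one billionth of a century, in seconds

# How far is pi from a nanocentury, relative to the nanocentury?
relative_error = abs(nanocentury - math.pi) / nanocentury

print("nanocentury in seconds:", nanocentury)
print("pi:", math.pi)
print("relative error:", relative_error)
```

&lt;p&gt;Under that assumption the two values differ by a little under half a percent, so the claim is not an exact identity.
&lt;/p&gt;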
&lt;p&gt;Rather, &lt;a href="http://www.iq0.com/duffgram/whine.html" title="The inventor of Duff's device says it like it is"&gt;Tom Duff&lt;/a&gt;&amp;#8217;s rule of thumb is a useful mnemonic. One year is roundabout 3.14&amp;times;10&lt;sup&gt;7&lt;/sup&gt; seconds. The accuracy of this figure is more than good enough for back of an envelope calculations. (Such calculations would be easier if we adopted a decimal time system &amp;#8212; 100 seconds in a minute, 100 minutes in an hour, say &amp;#8212; but time is one of the few currencies with internationally agreed denominations. I can&amp;#8217;t see it changing, and 100 days in a year doesn&amp;#8217;t make sense anyway.)
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/pXFi1O1CGyA" height="1" width="1"/&gt;</description>
<dc:date>2010-03-14</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/pi-seconds</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/pXFi1O1CGyA/pi-seconds</link>
<category>Pi</category>
<feedburner:origLink>http://wordaligned.org/articles/pi-seconds</feedburner:origLink></item>

<item>
<title>Bike charts by Google</title>
<description>&lt;p&gt;I&amp;#8217;ve liked the &lt;a href="http://code.google.com/apis/chart/"&gt;Google chart API&lt;/a&gt; ever since &lt;a href="http://wordaligned.org/articles/the-maximum-subsequence-problem" title="My first charts using the google API"&gt;I first discovered it&lt;/a&gt;. Pack a text definition of an image into a URL &lt;code&gt;http://chart.apis.google.com/chart?YOUR-IMAGE-HERE&lt;/code&gt; and you&amp;#8217;ll be served up a freshly cooked PNG. It&amp;#8217;s free. There&amp;#8217;s not even a watermark.
&lt;/p&gt;
&lt;img width="320px" height="160px" src="http://chart.apis.google.com/chart?chs=320x160&amp;amp;cht=gom&amp;amp;chd=t:70&amp;amp;chl=Nice!" alt="Swing-o-meter, Nice!"/&gt;

&lt;pre&gt;
http://chart.apis.google.com/chart?  # A chart, please
    &amp;chs=320x160                     # sized 320x160 pixels
    &amp;cht=gom                         # of type swin&lt;b&gt;gom&lt;/b&gt;eter
    &amp;chd=t:70                        # with 70% swing
    &amp;chl=Nice!                       # labeled "Nice!"
&lt;/pre&gt;
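&lt;p&gt;The same URL could be assembled programmatically. What follows is a minimal sketch using Python&amp;#8217;s standard &lt;code&gt;urllib.parse&lt;/code&gt; module; the helper name &lt;code&gt;chart_url&lt;/code&gt; is hypothetical, and the code only builds the string without contacting the service.
&lt;/p&gt;

```python
from urllib.parse import urlencode

# Endpoint as quoted in the text above.
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def chart_url(**params):
    """Return a chart URL with the given parameters packed into the query string.

    chart_url is a hypothetical helper, not part of any Google library.
    """
    # Keep ':' and '!' readable; other reserved characters get percent-encoded.
    return CHART_ENDPOINT + "?" + urlencode(params, safe=":!")


# The swing-o-meter from the example: size, type, data and label.
url = chart_url(chs="320x160", cht="gom", chd="t:70", chl="Nice!")
print(url)
```

&lt;p&gt;Keeping the parameters as keyword arguments mirrors the one-parameter-per-line layout shown above.
&lt;/p&gt;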

&lt;p&gt;Gone are the days when the &lt;a href="http://code.google.com/apis/chart/docs/making_charts.html" title="main entry point to the google chart API docs"&gt;documentation&lt;/a&gt; fitted on a single web-page. The API has fattened up and filled out. Every time I visit something new has been added: &lt;a href="http://code.google.com/apis/chart/docs/gallery/formulas.html" title="or should that be formulas?"&gt;mathematical formulae&lt;/a&gt; written in TeX; a &lt;a href="http://code.google.com/apis/chart/docs/chart_playground.html" title="Live chart playground"&gt;playground&lt;/a&gt; where you can sketch a chart directly; a &lt;a href="http://code.google.com/intl/uk/apis/chart/docs/debugging.html"&gt;validation&lt;/a&gt; option which tells you where you went wrong &amp;#8212; much more helpful than a bare 404.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;New to me: &lt;a href="http://code.google.com/apis/chart/docs/gallery/dynamic_icons.html" title="dynamic icons - callouts, bubbles, pins, and other graphics"&gt;dynamic icons&lt;/a&gt;, which let you create &amp;#8220;a variety of interesting callouts, pins, or bubbles that mix text and images&amp;#8221;. 
&lt;/p&gt;
&lt;img src="http://chart.apis.google.com/chart?chst=d_fnote&amp;amp;chld=thought|2|993300|h|We+could+have+|fun+with+this!" alt="here's a thought..."/&gt;

&lt;img src="http://chart.apis.google.com/chart?chst=d_bubble_icon_texts_big&amp;amp;chld=bicycle|bb|ffff33|663300|Classic+Tour+Finishes|Let's+make+some+charts+which+depict|classic+stage+finishes+in+the+Tour+de+France" alt="classic cycle charts"/&gt;

&lt;img src="http://chart.apis.google.com/chart?chst=d_bubble_text_small&amp;amp;chld=bbbr|Good+idea,+go+for+it!|ffff33|663300" alt="Go for it!"/&gt;

&lt;p&gt;Mercurial Manxman Mark Cavendish won an incredible &lt;strong&gt;6 stages&lt;/strong&gt; of last year&amp;#8217;s Tour. Here he is, becoming the first Briton ever to win the final showdown on the Champs-&amp;Eacute;lys&amp;eacute;es, and winning it by an immense margin. For me, it was a bitter-sweet moment: that sprint should have put Cav in the green jersey, but he&amp;#8217;d thrown away his chance in the points competition earlier in the race with an &lt;a href="http://tag.wordaligned.org/posts/cav-wants-race"&gt;act of petulance&lt;/a&gt; which I still struggle to understand.
&lt;/p&gt;
&lt;img alt="Cavendish, first on the Champs-&amp;Eacute;lys&amp;eacute;es" src="http://chart.apis.google.com/chart?&amp;amp;cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:ZZZZZZZZZZZZZZZZZZZ&amp;amp;chco=aaaaaa&amp;amp;chm=B,0000ff,0,0:7,0|B,ffffff,0,6:12,0|B,ff0000,0,12:,0,&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=15|y;s=simple_text_icon_left;d=,14,000,helicopter,24,000,FFF;of=0,120;dp=11|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=10|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=9|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=range,3,9,.8|y;s=simple_text_icon_left;d=,14,000,civic-building,24,000,FFF;of=0,14;dp=1"/&gt;

&lt;p&gt;The Champs-&amp;Eacute;lys&amp;eacute;es may have a cobbled surface but it&amp;#8217;s level and straight &amp;#8212; definitely one for the sprinters. How about something twisted and mountainous? This second tableau recreates Fabian Cancellara&amp;#8217;s dare-devil descent during stage 7 of last year&amp;#8217;s Tour. Defending the maillot jaune, Cancellara got dropped by the peloton following a wheel change. Watch him weave between team cars and camera bikes at top speed to regain his place. &lt;a href="http://www.youtube.com/watch?v=RxXqQqAc2pA" title="Watch Cancellara's descent on YouTube"&gt;Awesome!&lt;/a&gt;
&lt;/p&gt;
&lt;img alt="Cancellara descending" src="http://chart.apis.google.com/chart?cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:zyxwwvuttsrppponmmlkjihgfedcbaZYXVUUUTSRQONNMLKJIHHHGFEEEDCCCCCBBBBB&amp;amp;chco=aaaaaa&amp;amp;chm=B,ffff33,0,0,0&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=4,8;dp=11|y;s=simple_text_icon_left;d=,14,000,car-dealer,24,000,FFF;of=0,10;dp=range,15,24,4|y;s=simple_text_icon_left;d=,14,000,helicopter,24,000,FFF;of=0,80;dp=16|y;s=simple_text_icon_left;d=,14,000,motorcycle,24,000,FFF;of=0,6;dp=range,7,21,9|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=4,8;dp=range,45,60,2"/&gt;

&lt;p&gt;Now for a real classic &amp;#8212; when Stephen Roche dug deep during an epic mountain stage in the 1987 Tour. Pedro Delgado, wearing yellow, had built a substantial lead over his rival on the climb up La Plagne. Yet somehow Roche clawed his way back into contention, appearing at the finish line just 5 seconds down on Delgado. He surprised everyone. He collapsed, exhausted, and had to be given oxygen, but he&amp;#8217;d done enough. Roche went on to win the Tour. &lt;a href="http://www.youtube.com/watch?v=sQojh-wqL04" title="Roche at La Plagne, commentary by Phil Liggett"&gt;Formidable!&lt;/a&gt;
&lt;/p&gt;
&lt;img alt="IT'S STEPHEN ROCHE!" src="http://chart.apis.google.com/chart?cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:ACDEHIJKMOQSTUVXYabcdfghjkmnoppqqrssttuu&amp;amp;chco=aaaaaa&amp;amp;chm=B,ffff33,0,0,0&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=29|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,12;dp=32|y;s=simple_text_icon_left;d=,14,000,wc-male,24,000,FFF;of=0,16;dp=35|y;s=simple_text_icon_left;d=,14,000,medical,24,000,FFF;of=0,16;dp=37|y;s=bubble_text_small;d=bbbr,that+looks+like+Stephen+Roche....+IT'S+STEPHEN+ROCHE!,ffff00,000000;of=40,230"/&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/Bi2tz6kVghE" height="1" width="1"/&gt;</description>
<dc:date>2010-02-18</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/bike-charts</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/Bi2tz6kVghE/bike-charts</link>
<category>Google</category>
<category>Cycling</category>
<category>Charts</category>
<feedburner:origLink>http://wordaligned.org/articles/bike-charts</feedburner:origLink></item>

<item>
<title>When you comment on a comment</title>
<description>&lt;blockquote&gt;&lt;p&gt;&lt;a href="http://twitter.com/ianbicking/status/8891604954"&gt;@ianbicking&lt;/a&gt; these days, I very rarely bother reading anything where I cannot comment. &amp;#8212; &lt;a href="http://twitter.com/drjtwit/status/8898216561"&gt;@drjtwit&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;You&amp;#8217;ll notice there are no comments here. I hate discussing things via blog comments. If you&amp;#8217;d like to talk, drop me a line. If you&amp;#8217;d like to discuss things in public, post on your blog. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://codahale.com"&gt;Coda Hale&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;When you leave a comment on a comment, how often do you wonder what your rights are? Not too often, I&amp;#8217;d guess. Over the years, it has become an accepted fact that content contributed to a website simply belongs to that website. If the website, or blog for today&amp;#8217;s web, goes away then all of your contributions disappear along with it.
   A real world analogy would be sending in letters or artwork to a magazine. There&amp;#8217;s usually that disclaimer which says the publication can do whatever with your submission. And, of course, they can&amp;#8217;t return anything to you. It belongs to the magazine now.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Daniel Ha, &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;A commenter&amp;#8217;s rights&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;In a recent &lt;a href="http://dantwining.com/2010/01/30/using-twitter-ids-for-comments/"&gt;blog post Dan Twining&lt;/a&gt; writes about blog comments and asks what I think of &lt;a href="http://disqus.com"&gt;Disqus&lt;/a&gt;, the commenting service used here at &lt;a href="http://wordaligned.org/"&gt;Word Aligned&lt;/a&gt;. The question comes at an interesting time.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;&lt;a href="http://disqus.com"&gt;&lt;img src="http://wordaligned.org/images/disqus-comments.gif" alt="Disqus comments" style="float:right;"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;Unlike Blogger, WordPress, Typepad, etc., Disqus isn&amp;#8217;t a blogging platform. Disqus do comments, and their central idea is integration. You don&amp;#8217;t need yet another online id to leave a Disqus comment. Sign in using your OpenID or Yahoo! account, for example. You can have your comments tweeted or posted on Facebook. Disqus works with whatever blogging system you use and it works &lt;strong&gt;across&lt;/strong&gt; different systems: if a blog uses Disqus and you post a comment on that blog, then that comment remains yours &amp;#8212; the blog owner can&amp;#8217;t edit it, and other readers can click through to see comments you&amp;#8217;ve posted on other sites. Well, other comments posted using Disqus that is. Clearly these connections extend as more sites adopt Disqus, and this seems to be happening.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;I used to use Haloscan for comments. I switched to Disqus about this time last year. I had no particular complaints with Haloscan; installation was simple, I didn&amp;#8217;t get any spam comments, and the service proved reliable enough. It just seemed Haloscan wasn&amp;#8217;t going anywhere. As it turns out, Haloscan will soon be gone. They&amp;#8217;re shutting down the service this week.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;It so happens Dan&amp;#8217;s question comes when the action on this site is happening in the comments section. Which makes me think.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Back to Daniel Ha, the Disqus CEO, who has clearly thought harder about comments than I ever will. Towards the top of this note is a quote from his article &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;&amp;#8220;A Commenter&amp;#8217;s Rights&amp;#8221;&lt;/a&gt;. Daniel Ha goes on to point out that times have changed, and that online publishers are able to involve their readers and include their input in more sophisticated ways. It&amp;#8217;s better for both commenters and publishers, he suggests, if commenters retain rights over their original material.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;So what are a commenter&amp;#8217;s rights? I&amp;#8217;m going to make an initial attempt to materialize what some rights should be.&lt;/p&gt;
&lt;ol style="list-style-type:lower-alpha"&gt;&lt;li&gt;The ability to edit and remove their comments&lt;/li&gt;
&lt;li&gt;Access to all of their comments, even if it has been deleted on a blog&lt;/li&gt;
&lt;li&gt;The right to use their own comments as blog posts. After all, a commenter is just a publisher not writing on his own website.&lt;/li&gt;
&lt;li&gt;A life for the comment beyond a single blog. I want to take my comments with me, even if the blog shuts down.&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;This may seem threatening to the publisher, but it really isn&amp;#8217;t. A commenter should have rights to what they post, but bloggers should still have control over content that appear on their blogs. Bloggers should still control:&lt;/p&gt;
&lt;ol style="list-style-type:lower-alpha"&gt;&lt;li&gt;Whether or not someone is allowed to comment on his blog&lt;/li&gt;
&lt;li&gt;The deletion of a comment&lt;/li&gt;
&lt;li&gt;The modification of a comment, as long as the original copy is still accessible and the edit is transparent&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&amp;mdash; Daniel Ha, &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;A commenter&amp;#8217;s rights&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;Why bother with site-hosted comments at all? If people want to comment they can do so on specially designed community sites like Proggit or Hacker News. This is the internet: we go where we please, we find what we like.
&lt;/p&gt;
&lt;p&gt;Well, I guess I included comments on this site because that&amp;#8217;s what other blogs did, and because I thought it was a way to engage readers and persuade them to return. The truth is, most readers arrive here via &lt;a href="http://www.reddit.com/domain/wordaligned.org"&gt;proggit&lt;/a&gt;; if I did away with comments here I might get more comments on reddit, and consequently more visits.
&lt;/p&gt;
&lt;p&gt;Look again at Daniel Ha&amp;#8217;s words
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;When you leave a comment on a comment &amp;#8230;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I&amp;#8217;m guessing this is a typo, and that he meant &amp;#8220;When you leave a comment on a &lt;strong&gt;website&lt;/strong&gt; &amp;#8230;&amp;#8221;. If so, it&amp;#8217;s an interesting slip. Modern comment systems are designed for comments on comments as much as for comments on the original article. Why bother with the original if the comments are more interesting? Jump straight into the discussion!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;[&amp;#8230;] It&amp;#8217;s not that I don&amp;#8217;t like sites with comments on, but when you read a site with comments it automatically puts you, the reader, in a defensive mode where you&amp;#8217;re saying, &amp;#8220;what&amp;#8217;s good in this comment thread? What can I skim?&amp;#8221;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;These are John Gruber&amp;#8217;s words but you won&amp;#8217;t find them on his &lt;a href="http://daringfireball.net"&gt;Daring Fireball&lt;/a&gt; website (the quotation has been &lt;a href="http://shawnblanc.net/2007/07/why-daring-fireball-is-comment-free/"&gt;transcribed&lt;/a&gt; by Shawn Blanc from an interview), and you won&amp;#8217;t find comments there either. Have a look at an article on Daring Fireball. &lt;a href="http://daringfireball.net/2010/02/winer_flash_open_standards"&gt;Here&amp;#8217;s a recent one about Adobe Flash&lt;/a&gt;. Now try to imagine how the article would look with comments.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve exported any Haloscan comments left on this site and imported them into Disqus using the &lt;a href="http://groups.google.com/group/disqus-dev"&gt;disqus developer API&lt;/a&gt;. My thanks to Roberto Alsina for his &lt;a href="http://lateral.netmanagers.com.ar/weblog/posts/BB856.html" title="Migrating from Haloscan to Disqus - instructions"&gt;helpful pointers&lt;/a&gt; on how to do this. Links to comments will have broken, but everything else should be fine. I&amp;#8217;m sticking with comments and I&amp;#8217;m sticking with Disqus, which gets better all the time. Please &lt;a href="mailto:tag@wordaligned.org"&gt;let me know&lt;/a&gt; if you spot any problems.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ll &lt;a href="http://stevenf.com/pages/shutup/" title="A user style sheet which hides comments"&gt;shut up&lt;/a&gt; now.
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/102729l9Vuo" height="1" width="1"/&gt;</description>
<dc:date>2010-02-10</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/comments-on-comments</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/102729l9Vuo/comments-on-comments</link>
<category>Self</category>
<category>Disqus</category>
<feedburner:origLink>http://wordaligned.org/articles/comments-on-comments</feedburner:origLink></item>

<item>
<title>Power programming</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocpowerful-or-dangerous" name="toc0" id="toc0"&gt;Powerful or dangerous?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocdecision-trees" name="toc1" id="toc1"&gt;Decision trees&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toccuteness-calculator" name="toc2" id="toc2"&gt;Cuteness calculator&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toceval" name="toc3" id="toc3"&gt;Eval&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocdynamic-or-hacky" name="toc4" id="toc4"&gt;Dynamic or hacky?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocjam-to-golf" name="toc5" id="toc5"&gt;Jam to golf&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toccode-vs-data" name="toc6" id="toc6"&gt;Code vs data&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocpowerful-language-vs-power-user" name="toc7" id="toc7"&gt;Powerful language vs power user?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-a-first-impressions-of-arc" name="toc8" id="toc8"&gt;Appendix A: First impressions of Arc&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-b-c-solution" name="toc9" id="toc9"&gt;Appendix B: C++ solution&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-c-a-python-solution" name="toc10" id="toc10"&gt;Appendix C: A Python Solution&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocnotes" name="toc11" id="toc11"&gt;Notes&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc0" name="tocpowerful-or-dangerous" id="tocpowerful-or-dangerous"&gt;Powerful or dangerous?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Recently I &lt;a href="http://wordaligned.org/articles/next-permutation" title="Next permutation: when C++ gets it right"&gt;wrote about&lt;/a&gt; one of the &lt;a href="http://code.google.com/codejam/"&gt;Google Code Jam&lt;/a&gt; challenges, where, perhaps surprisingly, the best answer &amp;#8212; the most elegant and obviously correct answer, requiring the fewest lines of code, with virtually zero space overhead, and running the quickest &amp;#8212; the very best answer was coded in C++.
&lt;/p&gt;
&lt;p&gt;Why should this be surprising? C++ is a powerful language.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In my experience there is almost no limit to the damage that a sufficiently ingenious fool can do with C++. But there is also almost no limit to the degree of complexity that a skillful library designer can hide behind a simple, safe, and elegant C++ interface. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Greg Colvin, &lt;a href="http://www.artima.com/cppsource/spiritofc2.html" title="Greg Colvin, In the Spirit of C"&gt;&amp;#8220;In the Spirit of C&amp;#8221;&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Yes. And yes! But in this article I wanted to discuss something C++ &lt;strong&gt;can&amp;#8217;t&lt;/strong&gt; do. Let&amp;#8217;s start with another &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;challenge&lt;/a&gt; from the same round of the 2009 Google Code Jam.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc1" name="tocdecision-trees" id="tocdecision-trees"&gt;Decision trees&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;&lt;p&gt;Decision trees &amp;#8212; in particular, a type called classification trees &amp;#8212; are data structures that are used to classify &lt;i&gt;items&lt;/i&gt; into &lt;i&gt;categories&lt;/i&gt; using &lt;i&gt;features&lt;/i&gt; of those items. For example, each animal is either &amp;#8220;cute&amp;#8221; or not. For any given animal, we can decide whether it is cute by looking at the animal&amp;#8217;s features and using the following decision tree.&lt;/p&gt;
&lt;pre&gt;(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)
&lt;/pre&gt;&lt;p&gt;&amp;mdash; &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#"&gt;Decision Trees, Google Code Jam 2009&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="http://www.zazzle.com/cute_beaver_magnet-147411069592023743"&gt;&lt;img src="http://wordaligned.org/images/cute-beaver.png" alt="Cute beaver!" width="227px" height="193px" style="float:right;margin:25px 25px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The challenge goes on to describe the structure more formally, then steps through an example calculation. What is the probability, &lt;code&gt;p&lt;/code&gt;, that a beaver is cute?
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;For example, a beaver is an animal that has two features: &lt;code&gt;furry&lt;/code&gt; and &lt;code&gt;freshwater&lt;/code&gt;. We start at the root with &lt;code&gt;p&lt;/code&gt; equal to &lt;code&gt;1&lt;/code&gt;. We multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.2&lt;/code&gt;, the weight of the root and move into the first sub-tree because the beaver has the &lt;code&gt;furry&lt;/code&gt; feature. There, we multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.81&lt;/code&gt;, which makes &lt;code&gt;p&lt;/code&gt; equal to &lt;code&gt;0.162&lt;/code&gt;. From there we move further down into the second sub-tree because the beaver does not have the fast feature. Finally, we multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.2&lt;/code&gt; and end up with &lt;code&gt;0.0324&lt;/code&gt; &amp;#8212; the probability that the beaver is cute. 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;img src="http://wordaligned.org/images/decision-tree.png" alt="Decision tree calculation"/&gt;

&lt;p&gt;The challenge itself involves processing input comprising a number of test cases. Each test case consists of a decision tree followed by a number of animals. A solution should parse the input and output the calculated cuteness probabilities.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc2" name="toccuteness-calculator" id="toccuteness-calculator"&gt;Cuteness calculator&lt;/a&gt;&lt;/h3&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def cuteness(decision_tree, features):
    """Return the probability an animal is cute.
    
    - decision_tree, the decision tree
    - features, a container of the animal's features
    """
    p = 1.0
    dt = decision_tree
    has_feature = features.__contains__
    while dt:
        weight, *dt = dt
        p *= weight
        if dt:
            feat, lt, rt = dt
            dt = lt if has_feature(feat) else rt
    return p

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calculating an animal&amp;#8217;s cuteness given a decision tree and the animal&amp;#8217;s features isn&amp;#8217;t hard. In Python we don&amp;#8217;t need to code up a specialised decision tree class &amp;#8212; a nested tuple does just fine. The &lt;code&gt;cuteness()&lt;/code&gt; function shown above descends the decision tree, switching left or right according to each feature&amp;#8217;s presence or absence. The running time of this algorithm is proportional to the depth of the tree multiplied by the length of the feature list, since each membership test scans the features; as far as the code jam challenge goes, it&amp;#8217;s not a concern.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; decision_tree = (
...     0.2, 'furry',
...         (0.81, 'fast',
...             (0.3,),
...             (0.2,),
...         ),
...         (0.1, 'fishy',
...             (0.3, 'freshwater',
...                  (0.01,),
...                  (0.01,),
...             ),
...             (0.1,),
...         ),
...     )
&amp;gt;&amp;gt;&amp;gt; beaver = ('furry', 'freshwater')
&amp;gt;&amp;gt;&amp;gt; cuteness(decision_tree, beaver)
0.032400000000000005

&lt;/pre&gt;

&lt;/div&gt;
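
&lt;p&gt;As an aside on that running time, here is a minimal sketch which assumes the features arrive as a &lt;code&gt;frozenset&lt;/code&gt; rather than a tuple. Membership tests on a set take constant time on average, so the walk then costs only the depth of the tree. The cheetah and trout are hypothetical animals of my own; only the beaver comes from the challenge text.&lt;/p&gt;

```python
def cuteness(decision_tree, features):
    """Return the probability an animal is cute (same walk as above)."""
    p = 1.0
    dt = decision_tree
    while dt:
        weight, *dt = dt          # peel off this node's weight
        p *= weight
        if dt:                    # internal node: feature plus two sub-trees
            feat, lt, rt = dt
            dt = lt if feat in features else rt
    return p

decision_tree = (
    0.2, 'furry',
        (0.81, 'fast', (0.3,), (0.2,)),
        (0.1, 'fishy',
            (0.3, 'freshwater', (0.01,), (0.01,)),
            (0.1,)),
)

# Hypothetical animals; only the beaver appears in the challenge text.
animals = {
    'beaver': frozenset(['furry', 'freshwater']),
    'cheetah': frozenset(['furry', 'fast']),
    'trout': frozenset(['fishy', 'freshwater']),
}

for name, features in sorted(animals.items()):
    print(name, round(cuteness(decision_tree, features), 6))
```

&lt;p&gt;Any container supporting &lt;code&gt;in&lt;/code&gt; will do, so the function itself needs no change beyond spelling the membership test directly.&lt;/p&gt;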

&lt;p&gt;No, the real problem here is how to parse the input data to create the decision trees and feature sets. As you can see, though, the textual specification of a decision tree closely resembles a Python representation of that decision tree. Just add punctuation.
&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;td&gt;Specification&lt;/td&gt;&lt;td&gt;Python&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tr&gt;&lt;td&gt;&lt;pre&gt;(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)&lt;/pre&gt;&lt;/td&gt;&lt;td&gt;&lt;pre&gt;(0.2, 'furry',
  (0.81, 'fast',
    (0.3,),
    (0.2,),
  ),
  (0.1, 'fishy',
    (0.3, 'freshwater',
      (0.01,),
      (0.01,),
    ),
    (0.1,),
  ),
)&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Rather than parse the decision tree definition by hand, why not tweak it so that it &lt;strong&gt;is&lt;/strong&gt; a valid Python nested tuple? Then we can just let the Python interpreter &lt;a href="http://docs.python.org/library/functions.html#eval"&gt;&lt;tt&gt;eval&lt;/tt&gt;&lt;/a&gt; the tuple and use it directly.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc3" name="toceval" id="toceval"&gt;Eval&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;A program&amp;#8217;s ability to read and execute source code at run-time is one of the things which makes &lt;a href="http://en.wikipedia.org/wiki/Dynamic_programming_language#Eval"&gt;dynamic languages&lt;/a&gt; dynamic. You can&amp;#8217;t do it in C and C++ &amp;#8212; no, sneaking instructions &lt;a href="http://en.wikipedia.org/wiki/Buffer_overrun"&gt;past the end of a buffer&lt;/a&gt; doesn&amp;#8217;t count. Should you do it in Python? Well, it won&amp;#8217;t hurt to give it a try.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;import re

spec = '''\
(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)
'''

tuple_rep = re.sub(r'([\.\d]+|\))', r'\1,', spec)
tuple_rep = re.sub(r'([a-z]+)', r'"\1",', tuple_rep)
decision_tree = eval(tuple_rep)[0]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we start with the input specification of the decision tree (imagine this has been read directly from standard input). The first regex substitution inserts a comma after each number and each right parenthesis. The second substitution quotes each feature string and inserts a comma after it. This turns the decision tree&amp;#8217;s specification into a textual representation of a nested Python tuple. We then &lt;code&gt;eval&lt;/code&gt; that tuple and assign the result to &lt;code&gt;decision_tree&lt;/code&gt; &amp;#8212; a Python decision tree we can go on and use in the rest of our program. And that&amp;#8217;s the code jam challenge cracked, pretty much.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from pprint import pprint
&amp;gt;&amp;gt;&amp;gt; pprint(decision_tree)
(0.2,
 'furry',
 (0.81, 'fast', (0.3,), (0.2,)),
 (0.1, 'fishy', (0.3, 'freshwater', (0.01,), (0.01,)), (0.1,)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Minor wrinkle: you&amp;#8217;ll have spotted the final decision tree is the first element of the evaluated tuple. That&amp;#8217;s because the regex substitution puts a trailing comma after the right parenthesis which closes the decision tree specification, which turns &lt;code&gt;tuple_rep&lt;/code&gt; into a one-tuple. The single element contained in this one-tuple is what we need.)
&lt;/p&gt;
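
&lt;p&gt;To watch the two substitutions and the one-tuple wrinkle at work, here is a small sketch which prints the text after each step. The one-line tree is made up for the purpose and is not part of the challenge input.&lt;/p&gt;

```python
import re

spec = '(0.2 furry (0.3) (0.1))'   # a tiny, made-up tree on one line

# Step 1: a comma after every number and every right parenthesis.
step1 = re.sub(r'([\.\d]+|\))', r'\1,', spec)
print(step1)   # (0.2, furry (0.3,), (0.1,),),

# Step 2: quote each feature name and follow it with a comma.
step2 = re.sub(r'([a-z]+)', r'"\1",', step1)
print(step2)   # (0.2, "furry", (0.3,), (0.1,),),

# The trailing comma makes the whole expression a one-tuple.
result = eval(step2)
print(len(result))   # 1
print(result[0])     # (0.2, 'furry', (0.3,), (0.1,))
```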

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc4" name="tocdynamic-or-hacky" id="tocdynamic-or-hacky"&gt;Dynamic or hacky?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As you can see, it doesn&amp;#8217;t take much code to pull the decision tree in ready for use. Python allows us to convert between text and code and to execute code within the current environment: you just need to keep a clear head and remember where you are. Regular expressions may not have the first-class language support they enjoy in Perl and Ruby, but they are well supported, and the raw string syntax makes them more readable.
&lt;/p&gt;
&lt;p&gt;Certainly, this code snippet is easier to put together than a full-blown &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=0" title="Google's analysis of the decision tree challenge, including a parser"&gt;parser&lt;/a&gt;, but I think it will take more than this to convince a C++ programmer that Python is a powerful language, rather than a dangerous tool for ingenious fools. It fails to convince me. I can&amp;#8217;t remember ever using &lt;code&gt;eval&lt;/code&gt; or &lt;code&gt;exec&lt;/code&gt; in production code, where keeping a separation between layers is more important than speed of coding.
&lt;/p&gt;
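
&lt;p&gt;If &lt;code&gt;eval&lt;/code&gt; feels too sharp a tool, one alternative is the standard library function &lt;code&gt;ast.literal_eval&lt;/code&gt;, which accepts only literals (numbers, strings, tuples and the like) and refuses everything else. The sketch below swaps it in for &lt;code&gt;eval&lt;/code&gt;; the function name &lt;code&gt;read_decision_tree_safely&lt;/code&gt; and the shortened tree are my own inventions for illustration.&lt;/p&gt;

```python
import ast
import re

def read_decision_tree_safely(spec):
    """Same regex rewrite as before, evaluated with ast.literal_eval."""
    tuple_rep = re.sub(r'([\.\d]+|\))', r'\1,', spec)
    tuple_rep = re.sub(r'([a-z]+)', r'"\1",', tuple_rep)
    return ast.literal_eval(tuple_rep)[0]

spec = '''\
(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1)
)
'''
print(read_decision_tree_safely(spec))

# Anything that is not a plain literal is rejected rather than executed.
try:
    ast.literal_eval('__import__("os").getcwd()')
except ValueError as err:
    print('rejected:', type(err).__name__)
```

&lt;p&gt;The regex rewrite is unchanged, so this is still a trick rather than a parser, but text which is not a nested literal can no longer run as code.&lt;/p&gt;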

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc5" name="tocjam-to-golf" id="tocjam-to-golf"&gt;Jam to golf&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://codegolf.com"&gt;&lt;img src="http://codegolf.com/images/logo.png" alt="Code Golf logo" width="332px" height="75px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;That said, Python is a fine language for scripting, and speed of coding &lt;strong&gt;is&lt;/strong&gt; what matters in this particular challenge. Just for fun, what if there were &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf" title="Decision tree code golf on Stack Overflow"&gt;a prize for brevity&lt;/a&gt;? Then of course Perl would &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf/1442392#1442392" title="gnibbler's winning Perl entry"&gt;win&lt;/a&gt;!
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Code Jam golf, by gnibbler, Stack Overflow&lt;/div&gt;

&lt;pre class="prettyprint"&gt;say("Case #$_:"),
$_=eval"''".'.&amp;lt;&amp;gt;'x&amp;lt;&amp;gt;,
s:[a-z]+:*(/ $&amp;amp;\\s/?:g,s/\)\s*\(/):/g,
eval"\$_=&amp;lt;&amp;gt;;say$_;"x&amp;lt;&amp;gt;for 1..&amp;lt;&amp;gt;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that this does more than simply parse a decision tree &amp;#8212; it&amp;#8217;s a complete solution to the code jam &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;challenge&lt;/a&gt;, reading trees, features, calculating cutenesses, and producing output in the required format. Sadly that&amp;#8217;s all I can say about it because the details of its operation are beyond me.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc6" name="toccode-vs-data" id="toccode-vs-data"&gt;Code vs data&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Dynamically executing code may not generally be needed or welcomed in production Python, and over-reliance on the same trick risks reinforcing Perl&amp;#8217;s &amp;#8220;write-only&amp;#8221; reputation, but Python and Perl aren&amp;#8217;t the only contenders. &lt;span /&gt;The equivalence of code and data marks Lisp&amp;#8217;s apotheosis. Take a look at a &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf/1540845#1540845" title="Arc solution to decision tree"&gt;Lisp solution&lt;/a&gt; to the challenge. This one is coded up in &lt;a href="http://arclanguage.org" title="Arc, a new dialect of Lisp"&gt;Arc&lt;/a&gt;.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
(def r () (read))
(for i 1 (r)
  (prn "Case #" i ":")
  (r)
  (= z (r))
  (repeat (r)
    (r)
    (loop (= g (n-of (r) (r))
             c z
             p 1)
       c
       (= p (* (pop c) p)
          c (if (pos (pop c) g)
                (c 0)
                (cadr c))))
    (prn p)))
&lt;/pre&gt;

&lt;p&gt;Which challenge does this solve? 
&lt;/p&gt;
&lt;p&gt;I meant the code golf challenge, of solving the decision tree problem using the fewest keystrokes. At 154 characters this Arc program is nearly half as long again as the winning Perl entry, but it&amp;#8217;s hardly flabby. What really impresses me, though, is that the code is (almost) as readable as it is succinct. It&amp;#8217;s elegant code. The only real scars left by the battle for brevity are the one-character variable names. Here&amp;#8217;s the same code with improved variable names and some comments added. It&amp;#8217;s the &lt;code&gt;(read)&lt;/code&gt; calls which evaluate expressions on standard input. The &lt;code&gt;(for ...)&lt;/code&gt; and &lt;code&gt;(repeat ...)&lt;/code&gt; expressions operate as you might expect. The third looping construct, &lt;code&gt;(loop ...)&lt;/code&gt;, initialises, tests and proceeds, much like a C &lt;code&gt;for&lt;/code&gt; loop.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
(for i 1 (read)               ; Read N, # test cases, and loop
  (prn "Case #" i ":")
  
  (read)                      ; Skip L, # lines taken by decision tree
  (= dtree (read))            ; and read the tree in directly
  
  (repeat (read)              ; Repeat over A, # animals
    (read)                    ; Skip animal name
    ; Read in the animal's features and walk down the 
    ; decision tree calculating p, the cuteness probability
    (loop (= features (n-of (read) (read)) 
             dt dtree
             p 1)
       dt
       (= p (* (pop dt) p)
          dt (if (pos (pop dt) features)
                (car dt)
                (cadr dt))))
    (prn p)))
&lt;/pre&gt;

&lt;p&gt;You could argue the elegance of this solution is due to the fact the input comprises a sequence of tokens and &lt;a href="http://en.wikipedia.org/wiki/S-expression" title="S-expressions, Wikipedia"&gt;S-expressions&lt;/a&gt;. If commas had been used to separate input elements and the text fields had been enclosed in quotes, then maybe a Python solution would have been equally clean. Or if the input had been in XML, then we&amp;#8217;d be looking to a library rather than &lt;code&gt;eval&lt;/code&gt; for parsing the input.
&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s a fair point, but the equivalence of code and data counts as Lisp&amp;#8217;s biggest idea. Where Python&amp;#8217;s &lt;code&gt;eval&lt;/code&gt; is workable but rarely needed, Lisp&amp;#8217;s &lt;code&gt;(read)&lt;/code&gt; is fundamental.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc7" name="tocpowerful-language-vs-power-user" id="tocpowerful-language-vs-power-user"&gt;Powerful language vs power user?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;So, the most elegant answer to the code jam decision tree challenge would also be the quickest to write, and it would be written in Lisp. Did code jam champion, &lt;a href="http://www.go-hero.net/jam/09/name/ACRush" title="ACRush's code jam solutions"&gt;ACRush&lt;/a&gt;, submit a Lisp solution?
&lt;/p&gt;
&lt;p&gt;Absolutely not!
&lt;/p&gt;
&lt;p&gt;Another fundamental thing about Lisp is that it&amp;#8217;s straightforward to parse. A C++ expert can knock up an input parser for decision trees and features to order. ACRush brushed this round aside with a perfect score, taking just 45 minutes to code up working C++ solutions to this question &lt;strong&gt;and two others&lt;/strong&gt;. I&amp;#8217;ve reproduced his solution to the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;decision tree challenge&lt;/a&gt; at the end of this article. It&amp;#8217;s plain and direct. Given the time constraints, I think it exhibits astonishing fluency &amp;#8212; the work of someone who can think in C++.
&lt;/p&gt;
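
&lt;p&gt;To back up the claim that this input is straightforward to parse, here is a minimal hand-written parser sketched in Python. It is not ACRush&amp;#8217;s method, and the names &lt;code&gt;tokenize&lt;/code&gt; and &lt;code&gt;parse&lt;/code&gt; are my own. It assumes every node is either a bare weight or a weight, a feature and exactly two sub-trees.&lt;/p&gt;

```python
import re

def tokenize(spec):
    """Split a tree specification into parentheses, numbers and names."""
    return re.findall(r'[()]|[^\s()]+', spec)

def parse(tokens):
    """Consume one tree from the front of the token list; return a nested tuple."""
    if tokens.pop(0) != '(':
        raise ValueError('tree must start with "("')
    weight = float(tokens.pop(0))
    if tokens[0] == ')':              # leaf: just a weight
        tokens.pop(0)
        return (weight,)
    feature = tokens.pop(0)           # internal node: feature, then two sub-trees
    left = parse(tokens)
    right = parse(tokens)
    if tokens.pop(0) != ')':
        raise ValueError('tree must end with ")"')
    return (weight, feature, left, right)

spec = '''\
(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)
'''
print(parse(tokenize(spec)))
```

&lt;p&gt;The result is the same nested tuple the &lt;code&gt;eval&lt;/code&gt; trick produces, at the cost of a dozen or so extra lines.&lt;/p&gt;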
&lt;p&gt;In this article we&amp;#8217;ve encountered four programming languages:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     Python
 &lt;/li&gt;

 &lt;li&gt;
     Perl
 &lt;/li&gt;

 &lt;li&gt;
     Lisp
 &lt;/li&gt;

 &lt;li&gt;
     C++
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These languages are very different but they share features too. They are all mature, popular and well-supported&lt;a id="fn1link" href="http://wordaligned.org/articles/power-programming#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. Each is a powerful general purpose programming language. &lt;span /&gt;But ultimately, the power of the programmer is what matters.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc8" name="tocappendix-a-first-impressions-of-arc" id="tocappendix-a-first-impressions-of-arc"&gt;Appendix A: First impressions of Arc&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s another revision of the Arc solution, this time decomposed into subfunctions. I found no complete formal documentation of &lt;a href="http://arclanguage.org" title="Arc, a new dialect of Lisp"&gt;Arc&lt;/a&gt;. You&amp;#8217;ll have to read the source and follow the forum, and to actually run any code you&amp;#8217;ll have to download an old version of MzScheme. The official line is: by all means have a play, but expect things to change. That said, the language looks delightful, practical, and quite &lt;a href="http://www.paulgraham.com/arcll1.html" title="No onions in the varnish, says Paul Graham"&gt;onion free&lt;/a&gt;. The &lt;a href="http://ycombinator.com/arc/tut.txt"&gt;tutorial&lt;/a&gt; made me smile. Recommended reading.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
; The input is a sequence of valid Arc expressions.
; Create some read aliases to execute these.
(= skip read
   decision-tree read
   n-features read 
   n-tests read
   n-animals read)

(def animal-features ()
     ; Get an animal's features
     (skip) ; animal name
     (n-of (n-features) (read)))

(def cuteness (dtree features)
     ; Calculate cuteness from a decision tree and feature set
     (= dt dtree
        p 1.0)
     (while dt
          (= p (* (pop dt) p)
             dt (if (pos (pop dt) features)
                (car dt)
                (cadr dt))))
     p)

; Loop through the tests, printing results
(for i 1 (n-tests)
     (prn "Case #" i ":")
     (skip) ; # lines the tree specification takes
     (= dtree (decision-tree))
     (repeat 
         (n-animals)
         (prn (cuteness dtree (animal-features)))))
&lt;/pre&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc9" name="tocappendix-b-c-solution" id="tocappendix-b-c-solution"&gt;Appendix B: C++ solution&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s champion ACRush&amp;#8217;s C++ solution. I&amp;#8217;ve removed some general purpose macros from the top of the file. You can download the &lt;a href="http://code.google.com/codejam/contest/scoreboard/do?cmd=GetSourceCode&amp;amp;contest=186264&amp;amp;problem=171116&amp;amp;io_set_id=1&amp;amp;username=ACRush"&gt;original here&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;set&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;vector&amp;gt;
#include &amp;lt;sstream&amp;gt;
#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;cstdlib&amp;gt;

using namespace std;

vector&amp;lt;string&amp;gt; A;
vector&amp;lt;int&amp;gt; P;
set&amp;lt;string&amp;gt; M;

#define SIZE(X) ((int)(X.size()))

double solve(int H,int T)
{
    H++;T--;
    double p=atof(A[H].c_str());
    if (H==T) return p;
    if (M.find(A[H+1])!=M.end())
        return p*solve(H+2,P[H+2]);
    else
        return p*solve(P[T],T);
}
int main()
{
    freopen("A-large.in","r",stdin);freopen("A-large.out","w",stdout);
    int testcase;
    scanf("%d",&amp;amp;testcase);
    for (int caseId=1;caseId&amp;lt;=testcase;caseId++)
    {
        int nline;
        scanf("%d",&amp;amp;nline);
        A.clear();
        char str[1024];
        gets(str);
        for (int i=0;i&amp;lt;nline;i++)
        {
            gets(str);
            string s="";
            for (int k=0;str[k];k++)
                if (str[k]=='(' || str[k]==')')
                    s+=" "+string(1,str[k])+" ";
                else
                    s+=str[k];
            istringstream sin(s);
            for (;sin&amp;gt;&amp;gt;s;A.push_back(s));
        }
        P.resize(SIZE(A),-1);
        vector&amp;lt;int&amp;gt; stack;
        for (int i=0;i&amp;lt;SIZE(A);i++)
            if (A[i]=="(")
                stack.push_back(i);
            else if (A[i]==")")
            {
                int p=stack[SIZE(stack)-1];
                P[i]=p;
                P[p]=i;
                stack.pop_back();
            }
        int cnt;
        printf("Case #%d:\n",caseId);
        for (scanf("%d",&amp;amp;cnt);cnt&amp;gt;0;cnt--)
        {
            scanf("%s",str);
            M.clear();
            int length;
            for (scanf("%d",&amp;amp;length);length&amp;gt;0;length--)
            {
                scanf("%s",str);
                M.insert(str);
            }
            double r=solve(0,SIZE(A)-1);
            printf("%.12lf\n",r);
        }
    }
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc10" name="tocappendix-c-a-python-solution" id="tocappendix-c-a-python-solution"&gt;Appendix C: A Python Solution&lt;/a&gt;&lt;/h3&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;import re
from itertools import islice

def cuteness(decision_tree, features):
    p = decision_tree[0]
    if len(decision_tree) &amp;gt; 1:
        _, feat, lt, rt = decision_tree
        p *= cuteness(lt if feat in features else rt, features)
    return p

def read_decision_tree(spec):
    tuple_rep = re.sub(r'([\.\d]+|\))', r'\1,', spec)
    tuple_rep = re.sub(r'([a-z]+)', r'"\1",', tuple_rep)
    return eval(tuple_rep)[0]

def take_lines(lines, n):
    return ''.join(islice(lines, n))

def main(fp):
    lines = iter(fp)
    n_tests = int(next(lines))
    for tc in range(1, n_tests + 1):
        print("Case #%d:" % tc)
        tree_spec = take_lines(lines, int(next(lines)))
        dtree = read_decision_tree(tree_spec)
        n_animals = int(next(lines))
        for line in islice(lines, n_animals):
            features = set(line.split()[2:])
            print(cuteness(dtree, features))

import sys
main(sys.stdin)

&lt;/pre&gt;

&lt;/div&gt;

&lt;hr /&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc11" name="tocnotes" id="tocnotes"&gt;Notes&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/power-programming#fn1link"&gt;[1]&lt;/a&gt; (Arc may not be mature, popular or well-supported; but Lisp certainly is.)
&lt;/p&gt;</description>
<dc:date>2010-01-26</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/power-programming</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/JnMrlsjIGO4/power-programming</link>
<category>Python</category>
<category>Perl</category>
<category>Lisp</category>
<category>Arc</category>
<feedburner:origLink>http://wordaligned.org/articles/power-programming</feedburner:origLink></item>

<item>
<title>Python, Surprise me!</title>
<description>&lt;h3&gt;A Simple Function&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a simple function which converts the third item of a list into an integer and returns it, returning -1 if the list has fewer than three entries or if the third entry fails to convert.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    '''Convert the third item of xs into an int and return it.
        
    Returns -1 on failure.
    '''    
    try:
        return int(xs[2])
    except IndexError, ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Unfortunately this simple function is simply wrong. Evidently some exceptions aren&amp;#8217;t being caught.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; third_int([1, 2, 3, 4])
3
&amp;gt;&amp;gt;&amp;gt; third_int([1])
-1
&amp;gt;&amp;gt;&amp;gt; third_int(('1', '2', '3', '4',))
3
&amp;gt;&amp;gt;&amp;gt; third_int(['one', 'two', 'three', 'four'])
Traceback (most recent call last):
    ....
ValueError: invalid literal for int() with base 10: 'three'

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;How ever did a &lt;code&gt;ValueError&lt;/code&gt; sneak past the &lt;code&gt;except&lt;/code&gt; clause?
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;The Real Surprise&lt;/h3&gt;
&lt;p&gt;There&amp;#8217;s nothing mysterious or surprising going on here, but I&amp;#8217;ll delay answering this question for a moment. For me, the real surprise about Python is that, generally, I get it right first time. Python similarly &lt;a href="http://www.python.org/about/success/esr" title="Why Python? by Eric S. Raymond"&gt;caught Eric S. Raymond by surprise&lt;/a&gt;. His first surprise was that it took him just 20 minutes to get used to syntactically significant whitespace. And just 100 minutes later &amp;#8230;
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;My second [surprise] came a couple of hours into the project, when I noticed (allowing for pauses needed to look up new features in &lt;em&gt;Programming Python&lt;/em&gt;) I was generating working code nearly as fast as I could type. When I realized this, I was quite startled. An important measure of effort in coding is the frequency with which you write something that doesn&amp;#8217;t actually match your mental representation of the problem, and have to backtrack on realizing that what you just typed won&amp;#8217;t actually tell the language to do what you&amp;#8217;re thinking. An important measure of good language design is how rapidly the percentage of missteps of this kind falls as you gain experience with the language.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Eric S. Raymond, &lt;a href="http://www.python.org/about/success/esr" title="Why Python? by Eric S. Raymond"&gt;Why Python?&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I certainly don&amp;#8217;t generate working code as fast as I can type, and I&amp;#8217;m not even a particularly &lt;a href="http://steve-yegge.blogspot.com/2008/09/programmings-dirtiest-little-secret.html" title="Learn to type, Yegge says"&gt;quick typist&lt;/a&gt;, but I rarely make syntactic errors when writing Python &amp;#8212; and I don&amp;#8217;t often need to consult the documentation on such matters. As Chuck Allison memorably puts it: &lt;a href="http://www.artima.com/cppsource/simple.html"&gt;&amp;#8220;the syntax is so clean it squeaks&amp;#8221;&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Parentheses Required(?)&lt;/h3&gt;
&lt;p&gt;There are some oddities and gotchas though. I don&amp;#8217;t object to the &lt;a href="http://effbot.org/pyfaq/why-must-self-be-used-explicitly-in-method-definitions-and-calls.htm"&gt;explicit &lt;code&gt;self&lt;/code&gt;&lt;/a&gt; in methods, but I do sometimes forget to write it &amp;#8212; especially if I&amp;#8217;ve just switched over from C++. 
&lt;/p&gt;
&lt;p&gt;A side-effect of the whitespace thing is that you can&amp;#8217;t just wrap a long line. The &lt;a href="http://docs.python.org/reference/lexical_analysis.html#explicit-line-joining"&gt;line ending&lt;/a&gt; needs to be escaped.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;if 1900 &amp;lt; year &amp;lt; 2100 and 1 &amp;lt;= month &amp;lt;= 12 \
    and 1 &amp;lt;= day &amp;lt;= 31 and 0 &amp;lt;= hour &amp;lt; 24 \
    and 0 &amp;lt;= minute &amp;lt; 60 and 0 &amp;lt;= second &amp;lt; 60: # Looks like a valid date
    return 1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Alternatively, parenthesize.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;if (1900 &amp;lt; year &amp;lt; 2100 and 1 &amp;lt;= month &amp;lt;= 12
    and 1 &amp;lt;= day &amp;lt;= 31 and 0 &amp;lt;= hour &amp;lt; 24
    and 0 &amp;lt;= minute &amp;lt; 60 and 0 &amp;lt;= second &amp;lt; 60): # Looks like a valid date
    return 1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the above, the parentheses aren&amp;#8217;t required to group terms, but instead serve to implicitly continue the line of code past a couple of newline characters.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Tree_%28data_structure%29"&gt;&lt;img width="360px" src="http://upload.wikimedia.org/wikipedia/commons/thumb/f/f7/Binary_tree.svg/500px-Binary_tree.svg.png" alt="Wikipedia Tree"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Parentheses serve more than one role in Python&amp;#8217;s syntax. As in all C-family languages, they can group expressions. They also get involved building tuples, &lt;code&gt;(1, 2, 3)&lt;/code&gt; or &lt;code&gt;('red', 0xff0000)&lt;/code&gt; for example. Beware the special case: a one-tuple needs a trailing comma, &lt;code&gt;("singleton",)&lt;/code&gt;. This isn&amp;#8217;t something I forget or accidentally omit, but it can make things fiddly. Here&amp;#8217;s a tuple-tised &lt;a href="http://en.wikipedia.org/wiki/Tree_%28data_structure%29"&gt;tree&lt;/a&gt;, where we represent a tree as a tuple whose first element is a node value, and any subsequent elements are sub-trees. Careful with those commas!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = (2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,))))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Actually, tuples are just comma-separated lists of expressions &amp;#8212; no parentheses required &amp;#8212; so we might equally well have written:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = 2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the superfluous outermost parentheses have been omitted; the inner ones are still required for grouping.
&lt;/p&gt;
&lt;p&gt;How about we always append a trailing comma to our tuples so the one-tuple no longer looks different?
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = 2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,))),

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That&amp;#8217;s allowed and fine. Unless we need an empty tuple, that is, in which case the parentheses &lt;strong&gt;are&lt;/strong&gt; required. And a comma would be wrong.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; ()
()
&amp;gt;&amp;gt;&amp;gt; (),
((),)
&amp;gt;&amp;gt;&amp;gt; ,
   ....
SyntaxError: invalid syntax
&amp;gt;&amp;gt;&amp;gt; (,)
   ....
SyntaxError: invalid syntax
&amp;gt;&amp;gt;&amp;gt; tuple()
()

&lt;/pre&gt;

&lt;/div&gt;
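
&lt;p&gt;The tuple spellings discussed above can be checked against one another. A short sketch, using a cut-down tree of my own:&lt;/p&gt;

```python
# A one-tuple needs its comma; parentheses alone only group.
assert (4) == 4
assert (4,) == tuple([4])

# Outer parentheses and a trailing comma are both optional for longer tuples.
with_parens = (2, (7, (2,)), (5,))
without_parens = 2, (7, (2,)), (5,)
trailing_comma = 2, (7, (2,)), (5,),
assert with_parens == without_parens == trailing_comma

# The empty tuple is the odd one out: parentheses required, comma forbidden.
assert () == tuple()
assert len(((),)) == 1    # "()," is a one-tuple holding an empty tuple
print('all tuple spellings agree')
```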

&lt;p&gt;Python 3 introduces a nice new syntax for &lt;code&gt;set&lt;/code&gt; literals, reusing the braces which traditionally enclose &lt;code&gt;dict&lt;/code&gt;s.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; ls = { 1, 11, 21, 1211, 111221, 312211 }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Again, beware the edge case: &lt;code&gt;{}&lt;/code&gt; is an empty &lt;code&gt;dict&lt;/code&gt;, not an empty set.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; zs = {}
&amp;gt;&amp;gt;&amp;gt; type(zs)
&amp;lt;class 'dict'&amp;gt;
&amp;gt;&amp;gt;&amp;gt; zs = set()
&amp;gt;&amp;gt;&amp;gt; type(zs)
&amp;lt;class 'set'&amp;gt;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Python 3 allows non-ASCII characters in identifiers, but not any old character, so we &lt;strong&gt;cannot&lt;/strong&gt; get away with
&lt;/p&gt;
&lt;pre&gt;
&amp;gt;&amp;gt;&amp;gt; &amp;empty; = set()
      ^
SyntaxError: invalid character in identifier
&lt;/pre&gt;

&lt;p&gt;Parentheses are used for function calls too, and also for generator expressions. Here&amp;#8217;s a lazy list of squares of numbers less than a million.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sqs = (x * x for x in range(1000000))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s the sum of these numbers.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sum((x * x for x in range(1000000)))
333332833333500000

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Actually, we can omit the generator-expression parentheses in the sum. The function call parentheses magically turn the enclosed &lt;code&gt;x * x for x in range(1000000)&lt;/code&gt; into a generator expression. As usual, Python does what we want.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sum(x * x for x in range(1000000))
333332833333500000

&lt;/pre&gt;

&lt;/div&gt;
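
&lt;p&gt;One caveat is worth a sketch. The bare form only works when the generator expression is the sole argument; pass anything else alongside it and the parentheses are needed again. The helper &lt;code&gt;compiles&lt;/code&gt; is a name of my own, used to test source text without running it.&lt;/p&gt;

```python
def compiles(source):
    """Return True if source is a syntactically valid expression."""
    try:
        compile(source, 'demo', 'eval')
    except SyntaxError:
        return False
    return True

# A bare generator expression is fine as the only argument.
print(sum(x * x for x in range(10)))           # 285

# With a second argument (here sum()'s start value) it needs its own parentheses.
print(sum((x * x for x in range(10)), 100))    # 385

# Without them the source does not even compile.
print(compiles('sum((x * x for x in range(10)), 100)'))   # True
print(compiles('sum(x * x for x in range(10), 100)'))     # False
```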


&lt;h3&gt;Serious about Syntax&lt;/h3&gt;
&lt;p&gt;If you&amp;#8217;ve read this far you may well be thinking: &amp;#8220;So what?&amp;#8221; I haven&amp;#8217;t shown any gotchas, merely a few quirks and corner cases. As already mentioned, the real surprise is that Python fails to surprise. Part of this, as I hope I&amp;#8217;ve shown here, can be attributed to the interpreter, which positively invites you to experiment; but mainly &lt;span /&gt;Python&amp;#8217;s clean and transparent design takes the credit. Repeating Eric S. Raymond: you don&amp;#8217;t have to &amp;#8220;actually tell the language to do what you&amp;#8217;re thinking&amp;#8221;.
&lt;/p&gt;
&lt;p&gt;Since I first started using Python the syntax has grown considerably, yet the extensions and additions seem almost as if they&amp;#8217;d been planned from the start&lt;a id="fn1link" href="http://wordaligned.org/articles/python-surprise-me#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. Generator expressions complement list comprehensions. The yield statement fits nicely with iteration.
&lt;/p&gt;
&lt;p&gt;Even more remarkably, Python 3 has chosen to break backwards compatibility, so it can undo those few early choices which now seem mistakes. Which brings us back to the broken function at the top of this article. Here it is again, docstring omitted for brevity.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except IndexError, ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I really did write a function like this, and I really did get it wrong in just this way. The code is syntactically valid, but I should have written
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except (IndexError, ValueError):
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The parentheses in the &lt;code&gt;except&lt;/code&gt; clause are crucial. The formal syntax of this form of &lt;a href="http://docs.python.org/reference/compound_stmts.html#the-try-statement"&gt;try statement&lt;/a&gt; is
&lt;/p&gt;
&lt;pre&gt;
try1_stmt ::=  "try" ":" suite
               ("except" [expression [("as" | ",") target]] ":" suite)+
               ["else" ":" suite]
               ["finally" ":" suite]
&lt;/pre&gt;

&lt;p&gt;In the corrected version of &lt;code&gt;third_int()&lt;/code&gt;, the parentheses group &lt;code&gt;IndexError, ValueError&lt;/code&gt; into a single expression, a tuple, and the except clause matches any object with class (or base class) &lt;code&gt;IndexError&lt;/code&gt; or &lt;code&gt;ValueError&lt;/code&gt;. The broken version is very different, as becomes clear if we use the alternative &lt;code&gt;"as"&lt;/code&gt; form.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except IndexError as ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the except clause will match an object with class or base class &lt;code&gt;IndexError&lt;/code&gt;, and assigns that object to the target, which is called &lt;code&gt;ValueError&lt;/code&gt; (and which shadows the &amp;#8220;real&amp;#8221; ValueError in the rest of the function definition). If &lt;code&gt;int()&lt;/code&gt; raises a &lt;code&gt;ValueError&lt;/code&gt;, it will not be matched.
&lt;/p&gt;
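
&lt;p&gt;A quick way to see the difference, sketched for Python 3 where only the &lt;code&gt;as&lt;/code&gt; form is accepted. The name &lt;code&gt;broken_third_int&lt;/code&gt; and the sample inputs are illustrative assumptions, not part of the original function.
&lt;/p&gt;

```python
def third_int(xs):
    """Return the third item of xs as an int, or -1 if that fails."""
    try:
        return int(xs[2])
    except (IndexError, ValueError):
        # The tuple matches either exception class.
        return -1


def broken_third_int(xs):
    """The broken version, spelled with the equivalent "as" form."""
    try:
        return int(xs[2])
    except IndexError as ValueError:
        # Only IndexError is matched; ValueError is just a local name here.
        return -1


print(third_int(["1", "2", "3"]))      # third item converts cleanly
print(third_int(["1", "2"]))           # IndexError is caught
print(third_int(["1", "2", "x"]))      # ValueError is caught
print(broken_third_int(["1", "2"]))    # IndexError is still caught

try:
    broken_third_int(["1", "2", "x"])
except ValueError as error:
    # The exception raised by int() was never matched, so it escapes.
    print("escaped:", error)
```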

&lt;h3&gt;Won&amp;#8217;t Get Fooled Again&lt;/h3&gt;
&lt;p&gt;Oh, I get it, now. It &lt;strong&gt;is&lt;/strong&gt; a bit subtle, but I won&amp;#8217;t make that mistake again.
&lt;/p&gt;
&lt;p&gt;Wait, there&amp;#8217;s more! In Python 3k, my broken implementation is properly broken &amp;#8212; a syntax error.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;Python 3.1
&amp;gt;&amp;gt;&amp;gt; def third_int(xs):
...     try:
...         return int(xs[2])
...     except IndexError, ValueError:
  File "&amp;lt;stdin&amp;gt;", line 4
    except IndexError, ValueError:
                     ^
SyntaxError: invalid syntax

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The Python 3k syntax of this form of &lt;a href="http://docs.python.org/py3k/reference/compound_stmts.html#the-try-statement"&gt;try statement&lt;/a&gt; reads
&lt;/p&gt;
&lt;pre&gt;
try1_stmt ::=  "try" ":" suite
               ("except" [expression ["as" target]] ":" suite)+
               ["else" ":" suite]
               ["finally" ":" suite]
&lt;/pre&gt;

&lt;p&gt;You can&amp;#8217;t use a comma to capture the target any more. It&amp;#8217;s an advance and a simplification. Why am I not surprised?
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/python-surprise-me#fn1link"&gt;[1]&lt;/a&gt;: With the possible exception of &lt;a href="http://docs.python.org/reference/expressions.html#boolean-operations"&gt;conditional expressions&lt;/a&gt;, that is.
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/hC5iZAu8OII" height="1" width="1"/&gt;</description>
<dc:date>2009-12-15</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/python-surprise-me</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/hC5iZAu8OII/python-surprise-me</link>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/python-surprise-me</feedburner:origLink></item>

<item>
<title>Next permutation: When C++ gets it right</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocthe-next-number-problem" name="toc0" id="toc0"&gt;The Next Number Problem&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocchoice-of-algorithm" name="toc1" id="toc1"&gt;Choice of Algorithm&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toclexicographical-ordering" name="toc2" id="toc2"&gt;Lexicographical Ordering&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocnext-permutation-in-action" name="toc3" id="toc3"&gt;Next permutation in action&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocsnail-sorts-revenge" name="toc4" id="toc4"&gt;Snail sort&amp;#8217;s revenge&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocthe-next-number-solved" name="toc5" id="toc5"&gt;The Next Number, Solved&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocimplementation" name="toc6" id="toc6"&gt;Implementation&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocwhats-happening-here" name="toc7" id="toc7"&gt;What&amp;#8217;s happening here?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocbeautiful-c" name="toc8" id="toc8"&gt;Beautiful C++?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocpermutations-in-python" name="toc9" id="toc9"&gt;Permutations in Python&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc0" name="tocthe-next-number-problem" id="tocthe-next-number-problem"&gt;The Next Number Problem&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Suppose you have a fixed list of digits chosen from the range 1..9. What numbers can you make with them? You&amp;#8217;re allowed as many zeros as you want. Write the numbers in increasing order.
&lt;/p&gt;
&lt;p&gt;Exactly &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;this puzzle&lt;/a&gt; came up in the recent &lt;a href="http://code.google.com/codejam"&gt;Google Code Jam&lt;/a&gt; programming contest:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;You are writing out a list of numbers. Your list contains all numbers with exactly &lt;strong&gt;D&lt;sub&gt;i&lt;/sub&gt;&lt;/strong&gt; digits in its decimal representation which are equal to i, for each i between 1 and 9, inclusive. You are writing them out in ascending order.&lt;/p&gt;&lt;p&gt;For example, you might be writing every number with two &amp;#8216;1&amp;#8217;s and one &amp;#8216;5&amp;#8217;. Your list would begin 115, 151, 511, 1015, 1051.&lt;/p&gt;&lt;p&gt;Given N, the last number you wrote, compute what the next number in the list will be.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The competition has closed now, but if you&amp;#8217;d like to give it a go sample input files can be found on the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;website&lt;/a&gt;, where you can also upload your results and have them checked.
&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s a short section from a trial I ran on my computer. Input numbers are in the left-hand column: the corresponding output numbers are in the right-hand column.
&lt;/p&gt;
&lt;pre style="font-size:150%"&gt;
50110812884911623516 &amp;rarr; 50110812884911623561
82454322474161687049 &amp;rarr; 82454322474161687094
82040229261723155710 &amp;rarr; 82040229261723157015
43888989554234187388 &amp;rarr; 43888989554234187838
76080994872481480636 &amp;rarr; 76080994872481480663
31000989133449480678 &amp;rarr; 31000989133449480687
20347716554681051891 &amp;rarr; 20347716554681051918
&lt;/pre&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc1" name="tocchoice-of-algorithm" id="tocchoice-of-algorithm"&gt;Choice of Algorithm&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As with many of the code jam challenges, you&amp;#8217;ll need to write a program which runs fast enough; but choosing the right algorithm is more important than choosing the right language. Typically a high-level interpreted language like Python allows me to code and test a solution far more quickly than a low-level language like C or C++ does.
&lt;/p&gt;
&lt;p&gt;In this particular case, though, like most &lt;a href="http://www.go-hero.net/jam/09/problems/2/2"&gt;successful candidates&lt;/a&gt;, I used C++. &lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;Here&amp;#8217;s why&lt;/a&gt;.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;blockquote&gt;&lt;p&gt;&lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;&lt;code&gt;Next_permutation&lt;/code&gt;&lt;/a&gt; transforms the range of elements &lt;code&gt;[first, last)&lt;/code&gt; into the lexicographically next greater permutation of the elements. [&amp;#8230;] If such a permutation exists, &lt;code&gt;next_permutation&lt;/code&gt; transforms &lt;code&gt;[first, last)&lt;/code&gt; into that permutation and returns true. Otherwise it transforms &lt;code&gt;[first, last)&lt;/code&gt; into the lexicographically smallest permutation and returns &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Although the next number problem appears to be about numbers and lexicographical ordering appears to be about words, &lt;code&gt;std::next_permutation&lt;/code&gt; is exactly what&amp;#8217;s needed here.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc2" name="toclexicographical-ordering" id="toclexicographical-ordering"&gt;Lexicographical Ordering&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/4099819327/" title="Lexicographical order by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2449/4099819327_4063635302.jpg" width="500" height="216" alt="Lexicographical order" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;A dictionary provides the canonical example of lexicographical ordering. Words are built from characters, which can be alphabetically ordered A, B, C, &amp;#8230; , so in the dictionary words which begin with &lt;strong&gt;A&lt;/strong&gt; appear before words which begin with &lt;strong&gt;B&lt;/strong&gt;, which themselves come in front of words beginning with &lt;strong&gt;C&lt;/strong&gt;, etc. If two words start with the same letter, pop that letter from the head of each word and compare the tails, which puts AARDVARK before ANIMAL and, applying this rule recursively, after &lt;a href="http://www.aardman.com/" title="Bristol's finest"&gt;AARDMAN&lt;/a&gt;. Imagine there&amp;#8217;s an empty word marking position zero, before A, right at the front of the dictionary, and our recursive definition is complete.
&lt;/p&gt;
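
&lt;p&gt;Python compares strings in just this way, so a small sketch can check the dictionary example. The word list is simply the words mentioned above plus the empty word; the digit strings are borrowed from the code jam problem statement.
&lt;/p&gt;

```python
# Python orders strings lexicographically, character by character,
# with the empty string sorting before everything else.
words = ["ANIMAL", "AARDVARK", "", "AARDMAN"]
ordered_words = sorted(words)
print(ordered_words)

# The same rule applies to strings of digits of equal length.
ordered_digits = sorted(["511", "115", "151"])
print(ordered_digits)
```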

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc3" name="tocnext-permutation-in-action" id="tocnext-permutation-in-action"&gt;Next permutation in action&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a simple program which shows &lt;code&gt;next_permutation()&lt;/code&gt; in action.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;cstdio&amp;gt;

int main()
{
    char xs[] = "123";
    do
    {
        std::puts(xs);
    }
    while (std::next_permutation(xs, xs + sizeof(xs) - 1));
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This program outputs lexicographically ordered permutations of 1, 2 and 3. When the main function returns, the array &lt;code&gt;xs&lt;/code&gt; will have cycled round to hold the lexicographically smallest arrangement of its elements, which is &lt;code&gt;"123"&lt;/code&gt;. Note that we never convert the characters &lt;code&gt;'1'&lt;/code&gt;, &lt;code&gt;'2'&lt;/code&gt;, &lt;code&gt;'3'&lt;/code&gt; into the numbers &lt;code&gt;1&lt;/code&gt;, &lt;code&gt;2&lt;/code&gt;, &lt;code&gt;3&lt;/code&gt;. The digit characters compare in the same order as the numbers they represent, so all works as expected.
&lt;/p&gt;
&lt;pre&gt;
123
132
213
231
312
321
&lt;/pre&gt;

&lt;p&gt;If we tweak and rerun the same program with &lt;code&gt;xs&lt;/code&gt; initialised to &lt;code&gt;"AAADKRRV"&lt;/code&gt; we get rather more output.
&lt;/p&gt;
&lt;pre&gt;
AAADKRRV
AAADKRVR
AAADKVRR
...
AARDVARK
...
VRRKAADA
VRRKADAA
VRRKDAAA
&lt;/pre&gt;

&lt;p&gt;The sequence &lt;strong&gt;doesn&amp;#8217;t&lt;/strong&gt; start by repeating &lt;code&gt;"AAADKRRV"&lt;/code&gt; 6 times, once for every permutation of the 3 A&amp;#8217;s. Only strictly increasing permutations are included. And although the repeated calls to &lt;code&gt;next_permutation&lt;/code&gt; generate a series of permutations, the algorithm holds no state. Each function call works on its input range afresh.
&lt;/p&gt;
&lt;p&gt;This second run of the program yields 3360 lines of output, even though there are 8! = 40320 possible permutations of 8 characters. Each unique permutation corresponds to 3! &amp;times; 2! = 12 actual permutations of the 8 characters (because there are 3 A&amp;#8217;s and 2 R&amp;#8217;s), and 40320 &amp;divide; 12 is 3360.
&lt;/p&gt;
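
&lt;p&gt;The arithmetic is easy to double-check. This sketch computes the count from the formula and then confirms it by brute force, using a set to collapse the positional duplicates; the variable names are mine.
&lt;/p&gt;

```python
from itertools import permutations
from math import factorial

letters = "AAADKRRV"

# 8! arrangements, divided by the 3! orderings of the As and the 2!
# orderings of the Rs, which are indistinguishable.
expected = factorial(len(letters)) // (factorial(3) * factorial(2))

# Brute-force check: generate all 8! positional permutations and keep
# only the distinct ones.
distinct = len(set(permutations(letters)))

print(expected, distinct)
```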

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc4" name="tocsnail-sorts-revenge" id="tocsnail-sorts-revenge"&gt;Snail sort&amp;#8217;s revenge&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/tim_norris/2789759648/"&gt;&lt;img src="http://farm4.static.flickr.com/3143/2789759648_ab4bfb5ea8.jpg" width="500px" height="333px" alt="...and in last place. By Tim Norris"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;As you can see, &lt;code&gt;next_permutation&lt;/code&gt; sorts an input range, one step at a time.  When &lt;code&gt;next_permutation&lt;/code&gt; eventually returns false, the range will be perfectly ordered. Hence we have &lt;code&gt;snail_sort()&lt;/code&gt;, hailed by the SGI STL &lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;documentation&lt;/a&gt; as the worst known deterministic sorting algorithm.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template &amp;lt;class Iter&amp;gt; 
void snail_sort(Iter first, Iter last)
{
    while (std::next_permutation(first, last)) {}
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Very witty, and evidence that code can be both &lt;a href="http://wordaligned.org/articles/elegance-and-efficiency"&gt;elegant and inefficient&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;In two important edge cases, though, &lt;code&gt;snail_sort&lt;/code&gt; performs on a par with super-charged &lt;code&gt;quicksort&lt;/code&gt;!
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     I snail sorted an array filled with 100000000 zeros in 0.502 seconds. Running quicksort on the same array took 5.504 seconds. 
 &lt;/li&gt;

 &lt;li&gt;
     Starting with an array of the same size filled with the values 99999999, 99999998, 99999997, &amp;#8230; 1, 0 snail sort&amp;#8217;s 0.500 seconds trounced quicksort&amp;#8217;s 4.08 seconds.
 &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc5" name="tocthe-next-number-solved" id="tocthe-next-number-solved"&gt;The Next Number, Solved&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s an outline solution to the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;next number problem&lt;/a&gt;. (I&amp;#8217;ve glossed over the exact input and output file formats for clarity.) It reads numbers from standard input and writes next numbers to standard output. &lt;code&gt;Next_permutation&lt;/code&gt; does the hard work, and there&amp;#8217;s a bit of fiddling when we have to increase the number of digits by adding a zero.&lt;a id="fn1link" href="http://wordaligned.org/articles/next-permutation#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;iostream&amp;gt;
#include &amp;lt;string&amp;gt;

/*
 Given a string of digits, shift any leading '0's
 past the first non-zero digit and insert an extra zero.
 
 Examples:
  
 123 -&amp;gt; 1023
 008 -&amp;gt; 8000
 034 -&amp;gt; 3004
*/
void insert_a_zero(std::string &amp;amp; number)
{
    size_t nzeros = number.find_first_not_of('0');
    number = number.substr(nzeros);
    number.insert(1, nzeros + 1, '0');
}

/*
 Outline solution to the 2009 code jam Next Number problem.
 
 Given a string representing a decimal number, find the next
 number which can be formed from the same set of digits. Add
 another zero if necessary. Repeat for all such strings read
 from standard input.
*/
int main()
{
    std::string number;
    while (std::cin &amp;gt;&amp;gt; number)
    {
        if (!std::next_permutation(number.begin(), number.end()))
        {
            insert_a_zero(number);
        }
        std::cout &amp;lt;&amp;lt; number &amp;lt;&amp;lt; '\n';
    }
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;
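
&lt;p&gt;For comparison, here is a rough Python rendering of the zero-inserting helper. It is a sketch rather than a translation of the C++ above: it returns a new string instead of modifying its argument, and it assumes the input contains at least one non-zero digit.
&lt;/p&gt;

```python
def insert_a_zero(number):
    """Move any leading zeros behind the first non-zero digit, adding one more.

    Assumes number is a string of digits with at least one non-zero digit.
    """
    stripped = number.lstrip("0")
    nzeros = len(number) - len(stripped)
    return stripped[0] + "0" * (nzeros + 1) + stripped[1:]


# The three examples from the comment in the C++ version.
for digits in ("123", "008", "034"):
    print(digits, insert_a_zero(digits))
```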


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc6" name="tocimplementation" id="tocimplementation"&gt;Implementation&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Having used the C++ standard library to solve the puzzle, let&amp;#8217;s take a look at how it works. Next permutation is a clever algorithm which shuffles a collection in place. My system implements it like this&lt;a id="fn2link" href="http://wordaligned.org/articles/next-permutation#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename Iter&amp;gt;
bool next_permutation(Iter first, Iter last)
{
    if (first == last)
        return false;
    Iter i = first;
    ++i;
    if (i == last)
        return false;
    i = last;
    --i;
        
    for(;;)
    {
        Iter ii = i;
        --i;
        if (*i &amp;lt; *ii)
        {
            Iter j = last;
            while (!(*i &amp;lt; *--j))
            {}
            std::iter_swap(i, j);
            std::reverse(ii, last);
            return true;
        }
        if (i == first)
        {
            std::reverse(first, last);
            return false;
        }
    }
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We start with a range delimited by a pair of bi-directional iterators, &lt;code&gt;[first, last)&lt;/code&gt;. If the range contains one item or fewer, there can be no next permutation, so leave the range as is and return &lt;code&gt;false&lt;/code&gt;. Otherwise, enter the &lt;code&gt;for&lt;/code&gt; loop with an iterator &lt;code&gt;i&lt;/code&gt; pointing at the final item in the range.
&lt;/p&gt;
&lt;p&gt;At each pass through the body of this for loop we decrement &lt;code&gt;i&lt;/code&gt; by one, stepping towards the first item in the range. We are looking for one of two conditions:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     the value pointed to by &lt;code&gt;i&lt;/code&gt; is smaller than the one it pointed to previously
 &lt;/li&gt;

 &lt;li&gt;
     &lt;code&gt;i&lt;/code&gt; reaches the first item in the range
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Put another way, we divide the range into a head and tail, where the tail is the longest possible decreasing tail of the range.
&lt;/p&gt;
&lt;p&gt;If this tail is the whole range (the second condition listed above) then the whole range is in reverse order, and we have the lexicographical maximum formed from its elements. Reversing the range returns it to its lexicographical minimum, and we can return &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;If this tail is not the whole range, then the final item in the head of the range, the item &lt;code&gt;i&lt;/code&gt; points to, is smaller than at least one of the items in the tail (the first item of the tail, if no other), and we can certainly generate a greater permutation by moving it towards the end of the range. To find the next permutation, we reverse iterate from the end of the range until we find an item &lt;code&gt;*j&lt;/code&gt; bigger than &lt;code&gt;*i&lt;/code&gt;; that&amp;#8217;s what the while loop does. Swapping the items pointed to by &lt;code&gt;i&lt;/code&gt; and &lt;code&gt;j&lt;/code&gt; ensures the head of the range is bigger than it was, and the tail of the range remains in reverse order. Finally, we reverse the tail of the range, leaving us with a permutation exactly one beyond the input permutation, and we return &lt;code&gt;true&lt;/code&gt;.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc7" name="tocwhats-happening-here" id="tocwhats-happening-here"&gt;What&amp;#8217;s happening here?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It&amp;#8217;s clear from this paper analysis that the algorithm is of linear complexity. Essentially, it walks up and down the tail of the list, comparing and swapping. But why does it work?
&lt;/p&gt;
&lt;p&gt;Let &lt;code&gt;xs&lt;/code&gt; be the range &lt;code&gt;[first, last)&lt;/code&gt;. As described above, divide this range into prefix and suffix subranges, &lt;code&gt;head&lt;/code&gt; and &lt;code&gt;tail&lt;/code&gt;, where &lt;code&gt;tail&lt;/code&gt; is the longest monotonically decreasing (that is, non-increasing) tail of the range.
&lt;/p&gt;
&lt;p&gt;If the &lt;code&gt;head&lt;/code&gt; of the range is empty, then the range &lt;code&gt;xs&lt;/code&gt; is clearly at its lexicographical maximum. 
&lt;/p&gt;
&lt;p&gt;Otherwise, &lt;code&gt;tail&lt;/code&gt; is a lexicographical maximum of the elements it contains, and &lt;code&gt;xs&lt;/code&gt; is therefore the largest permutation which starts with the subrange &lt;code&gt;head&lt;/code&gt;. What will the &lt;code&gt;head&lt;/code&gt; of the next permutation be? We have to swap the final item in &lt;code&gt;head&lt;/code&gt; with the smallest item of &lt;code&gt;tail&lt;/code&gt; which exceeds it: the definition of &lt;code&gt;tail&lt;/code&gt; guarantees at least one such item exists. Now we want to permute the new &lt;code&gt;tail&lt;/code&gt; to be at its lexicographical minimum, which is a matter of sorting it from low to high.
&lt;/p&gt;
&lt;p&gt;Since &lt;code&gt;tail&lt;/code&gt; is in reverse order, finding the smallest item larger than &lt;code&gt;head[-1]&lt;/code&gt; is a matter of walking back from the end of the range to find the first such item; and once we&amp;#8217;ve swapped these items, &lt;code&gt;tail&lt;/code&gt; remains in reverse order, so a simple reverse will sort it.
&lt;/p&gt;
&lt;p&gt;As an example consider finding the next permutation of:
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8342666411
&lt;/pre&gt;

&lt;p&gt;The longest monotonically decreasing tail is &lt;code&gt;666411&lt;/code&gt;, and the corresponding head is &lt;code&gt;8342&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8342 666411
&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;666411&lt;/code&gt; is, by definition, reverse-ordered, and cannot be increased by permuting its elements. To find the next permutation, we must increase the head; a matter of finding the smallest tail element larger than the head&amp;#8217;s final &lt;code&gt;2&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;2&lt;/span&gt; 666411
&lt;/pre&gt;

&lt;p&gt;Walking back from the end of tail, the first element greater than &lt;code&gt;2&lt;/code&gt; is &lt;code&gt;4&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;2&lt;/span&gt; 666&lt;span style="color:#930"&gt;4&lt;/span&gt;11
&lt;/pre&gt;

&lt;p&gt;Swap the &lt;code&gt;2&lt;/code&gt; and the &lt;code&gt;4&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;4&lt;/span&gt; 666&lt;span style="color:#930"&gt;2&lt;/span&gt;11
&lt;/pre&gt;

&lt;p&gt;Since head has increased, we now have a greater permutation. To reduce to the next permutation, we reverse tail, putting it into increasing order.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8344 112666
&lt;/pre&gt;

&lt;p&gt;Join the head and tail back together. The permutation one greater than &lt;code&gt;8342666411&lt;/code&gt; is &lt;code&gt;8344112666&lt;/code&gt;.
&lt;/p&gt;
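
&lt;p&gt;The first step of the walkthrough, splitting the range into head and tail, can be sketched in a few lines of Python. The helper name &lt;code&gt;split_head_tail&lt;/code&gt; is hypothetical; it is not part of any library.
&lt;/p&gt;

```python
def split_head_tail(xs):
    """Split xs into (head, tail), where tail is the longest
    non-increasing suffix of xs."""
    i = len(xs) - 1
    # Walk back while each item is at least as big as the one after it.
    while i > 0 and xs[i - 1] >= xs[i]:
        i -= 1
    return xs[:i], xs[i:]


head, tail = split_head_tail("8342666411")
print(head, tail)

# When the whole range is in reverse order the head is empty.
print(split_head_tail("321"))
```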

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc8" name="tocbeautiful-c" id="tocbeautiful-c"&gt;Beautiful C++?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://wordaligned.org/articles/looping-forever-and-ever"&gt;&lt;img  src="http://wordaligned.org/images/mite.jpg" alt="for(;;) dust mite"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;C++ has its &lt;a href="http://yosefk.com/c++fqa/defective.html" title="If you are an expert in the intricacies of C++, please consider this knowledge a kind of martial art - something a real master never uses. Yossi Kreinin"&gt;detractors&lt;/a&gt;, who characterise it as subtle, &lt;a href="http://twitter.com/dabeaz/status/5677453478" title="C++0x reminds me of blocks stacked by my toddler. Really wobbly and one block too many makes it topple. @dabeaz"&gt;complex&lt;/a&gt;, and &lt;a href="http://www2.research.att.com/~bs/bs_faq.html#really-say-that" title="C++ can blow your whole leg off. Bjarne Stroustrup"&gt;dangerous&lt;/a&gt;; but sometimes it excels. Look once more at the C++ implementation of this algorithm.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename Iter&amp;gt;
bool next_permutation(Iter first, Iter last)
{
    if (first == last)
        return false;
    Iter i = first;
    ++i;
    if (i == last)
        return false;
    i = last;
    --i;
    
    for(;;)
    {
        Iter ii = i;
        --i;
        if (*i &amp;lt; *ii)
        {
            Iter j = last;
            while (!(*i &amp;lt; *--j))
            {}
            std::iter_swap(i, j);
            std::reverse(ii, last);
            return true;
        }
        if (i == first)
        {
            std::reverse(first, last);
            return false;
        }
    }
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;span /&gt;With its special cases, boolean literals, multiple returns (4, count them!), disembodied and infinite loops, this code fails to exhibit conventional beauty. Yet &lt;em&gt;it is&lt;/em&gt; beautiful. All the next permutation algorithm needs are iterators which can advance forwards or backwards, step by step. And that&amp;#8217;s all this implementation uses.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m as excited as anyone by the mathematical rigour of &lt;a href="http://www.informit.com/articles/article.aspx?p=1407357&amp;amp;seqNum=3" title="Great article by Andrei Alexandrescu, which questions a pure Haskell quicksort implementation"&gt;functional programming&lt;/a&gt;, but sometimes computer science is about algorithms with virtually no space overhead, algorithms which loop rather than recurse. Sometimes it&amp;#8217;s about shuffling, nudging and swapping &amp;#8212; operations which map directly to the machine&amp;#8217;s most primitive operations. In such cases, C++ gets it right.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc9" name="tocpermutations-in-python" id="tocpermutations-in-python"&gt;Permutations in Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For the code jam, though, as mentioned earlier, having a super-fast program rarely matters. More often, it&amp;#8217;s about developing a fast enough program super-quickly.
&lt;/p&gt;
&lt;p&gt;I find Python a far quicker language for developing code than C++. (Indeed, sometimes when it&amp;#8217;s obvious from the outset that a final program will need implementing in C++, I put together a working prototype using Python, which I then translate.) Could we solve the next number problem using code from the standard Python library?
&lt;/p&gt;
&lt;p&gt;At first glance, &lt;a href="http://docs.python.org/py3k/library/itertools.html#itertools.permutations"&gt;itertools.permutations&lt;/a&gt; looks promising.
&lt;/p&gt;
&lt;blockquote&gt;&lt;h3&gt;&lt;tt&gt;itertools.permutations(&lt;em&gt;iterable&lt;/em&gt;, &lt;em&gt;r=None&lt;/em&gt;)&lt;/tt&gt;&lt;/h3&gt;&lt;p&gt;Return successive &lt;em&gt;r&lt;/em&gt; length permutations of elements in the &lt;em&gt;iterable&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;If &lt;em&gt;r&lt;/em&gt; is not specified or is &lt;tt&gt;None&lt;/tt&gt;, then &lt;em&gt;r&lt;/em&gt; defaults to the length
of the &lt;em&gt;iterable&lt;/em&gt; and all possible full-length permutations
are generated.&lt;/p&gt;&lt;p&gt;Permutations are emitted in lexicographic sort order.  So, if the input &lt;em&gt;iterable&lt;/em&gt; is sorted, the permutation tuples will be produced in sorted order.&lt;/p&gt;&lt;p&gt;Elements are treated as unique based on their position, not on their value.  So if the input elements are unique, there will be no repeat values in each permutation.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;However, this algorithm doesn&amp;#8217;t care about the values of the items in the iterable, and the lexicographic sort order applies to the indices of these items. So although the ordering of the generated items is well-defined:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     we get repeats, and
 &lt;/li&gt;

 &lt;li&gt;
     it&amp;#8217;s not the ordering we want (in this case)
 &lt;/li&gt;
&lt;/ol&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import permutations
&amp;gt;&amp;gt;&amp;gt; concat = ''.join
&amp;gt;&amp;gt;&amp;gt; list(map(concat, permutations('AAA')))
['AAA', 'AAA', 'AAA', 'AAA', 'AAA', 'AAA']
&amp;gt;&amp;gt;&amp;gt; list(map(concat, permutations('231')))
['231', '213', '321', '312', '123', '132']

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It &lt;strong&gt;is&lt;/strong&gt; possible to code up &lt;code&gt;next_permutation&lt;/code&gt; using nothing more than the standard itertools, but it isn&amp;#8217;t advisable.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Snail permute&lt;/div&gt;

&lt;pre class="prettyprint"&gt;from itertools import permutations, groupby

def next_permutation(xs):
    """Calculate the next permutation of the sequence xs.
    
    Returns a pair (yn, xs'), where yn is a boolean and xs' is the 
    next permutation. If yn is True, xs' will be the lexicographic 
    next permutation of xs, otherwise xs' is the lexicographic 
    smallest permutation of xs.
    """
    xs = tuple(xs)
    if not xs:
        return False, xs
    else:
        ps = [p for p, gp in groupby(sorted(permutations(xs)))]
        ix = ps.index(xs) + 1
        if ix == len(ps):
            return False, ps[0]
        else:
            return True, ps[ix]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As it happens, a solution based on this exhaustive search would score points in the code jam since it copes with the small input set. For the large input set its factorial complexity rules it out, and we&amp;#8217;d need to implement the next permutation algorithm &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=1"&gt;from scratch&lt;/a&gt;.
&lt;/p&gt;
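&lt;p&gt;For comparison, here is a minimal sketch of the direct approach, assuming the classic pivot, swap and reverse technique (my reading of the standard algorithm, not the code jam solution linked above). It keeps the return convention of the snail version. Comparisons go through &lt;code&gt;operator.lt&lt;/code&gt;, so &lt;code&gt;lt(a, b)&lt;/code&gt; reads as &amp;#8220;a is less than b&amp;#8221;.
&lt;/p&gt;

```python
from operator import lt

def next_permutation(xs):
    """Sketch of a direct next_permutation; same return pair as above.

    Returns (True, next permutation of xs) if there is one, otherwise
    (False, smallest permutation of xs).
    """
    xs = list(xs)
    # Positions where an item is less than its right-hand neighbour.
    rising = [i for i in range(len(xs) - 1) if lt(xs[i], xs[i + 1])]
    if not rising:
        # xs is in non-increasing order: it is the last permutation,
        # so wrap around to the first by reversing it.
        return False, tuple(reversed(xs))
    pivot = rising[-1]
    # Everything right of the pivot is non-increasing. Find the
    # rightmost item there which exceeds the pivot item.
    swap = max(j for j in range(pivot + 1, len(xs)) if lt(xs[pivot], xs[j]))
    xs[pivot], xs[swap] = xs[swap], xs[pivot]
    # Reversing the tail puts it into non-decreasing order.
    xs[pivot + 1:] = reversed(xs[pivot + 1:])
    return True, tuple(xs)
```

&lt;p&gt;Each call does work linear in the length of the input, rather than enumerating every permutation.
&lt;/p&gt;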
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/next-permutation#fn1link"&gt;[1]&lt;/a&gt;: A more cunning &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=1"&gt;solution&lt;/a&gt; avoids the special case by pushing the extra zero to head of the string before applying &lt;code&gt;next_permutation&lt;/code&gt;, then popping it if it hasn&amp;#8217;t been moved.
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/next-permutation#fn2link"&gt;[2]&lt;/a&gt;: I&amp;#8217;ve tweaked the layout and parameter names for use on this site.
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/ESjRGU30bM8" height="1" width="1"/&gt;</description>
<dc:date>2009-11-19</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/next-permutation</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/ESjRGU30bM8/next-permutation</link>
<category>Puzzles</category>
<category>C++</category>
<category>Algorithms</category>
<category>Python</category>
<category>Google</category>
<feedburner:origLink>http://wordaligned.org/articles/next-permutation</feedburner:origLink></item>

<item>
<title>Python on Ice</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython" name="toc0" id="toc0"&gt;Python?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#toctwisted-81-on-python-25" name="toc1" id="toc1"&gt;Twisted 8.1 on Python 2.5&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-on-word-aligned" name="toc2" id="toc2"&gt;Python 3 on Word Aligned&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-absent-from-europython" name="toc3" id="toc3"&gt;Python 3 absent from Europython&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-literature" name="toc4" id="toc4"&gt;Python 3 Literature&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocthe-cost-of-python-3" name="toc5" id="toc5"&gt;The Cost of Python 3&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocevolution-of-python" name="toc6" id="toc6"&gt;Evolution of Python&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocaccepting-python-3" name="toc7" id="toc7"&gt;Accepting Python 3&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;blockquote&gt;&lt;p&gt;A moratorium on Python changes is probably a good thing&amp;#8212;the last edition of my book nearly made my head explode. &amp;#8212; &lt;a href="http://twitter.com/dabeaz/status/5055586588"&gt;@dabeaz&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc0" name="tocpython" id="tocpython"&gt;Python?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&amp;#8220;Python?&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;Yes, Python. It&amp;#8217;s a &lt;a href="http://wordaligned.org/articles/pitching-python-in-three-syllables"&gt;high-level&lt;/a&gt; language, we used it for the prototype. We can use it for parts of the system where performance isn&amp;#8217;t critical. Connecting components together. The web server.&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;But what will we do when Python changes? It&amp;#8217;s a developing language, right? How can we maintain our system?&amp;#8221;
&lt;/p&gt;
&lt;p&gt;Not an issue, I explained. Python takes backwards compatibility very seriously. Besides, we choose which version of Python to deploy with, we choose when we migrate &amp;#8212; maybe never. Look, you can &lt;a href="http://python.org/download/releases"&gt;download the source&lt;/a&gt; for every version of Python ever released. All you need is a &lt;a href="http://www.python.org/dev/peps/pep-0007" title="A C89 compiler, in fact. (PEP 7, Style Guide for C Code)"&gt;C compiler&lt;/a&gt;. C is the porting layer, if you like, and C isn&amp;#8217;t going anywhere in a hurry.
&lt;/p&gt;
&lt;p&gt;In all honesty, I expected more maintenance issues with the C++ parts of our product, where the language may not have changed in a decade but &lt;a href="http://wordaligned.org/articles/code-rot"&gt;compilers are only just catching up&lt;/a&gt; with it. In fact, I didn&amp;#8217;t have to argue for long to persuade senior management, at least not on this issue. They&amp;#8217;d already seen how quickly I could get things up and running using Python. Even though the company had more experience with C, C++, Java, and even .Net, I convinced them Python had a role in the server-based system we were developing.
&lt;/p&gt;
&lt;p&gt;Nonetheless, I didn&amp;#8217;t think it the right time to mention Python 3. Why confuse things?
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc1" name="toctwisted-81-on-python-25" id="toctwisted-81-on-python-25"&gt;Twisted 8.1 on Python 2.5&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I won&amp;#8217;t go into detail about the product. Data flowed through it, redirected dynamically using tees and filters, and robots were attached to the resulting streams to monitor them. A web UI presented controls and a view of the system. We used C and C++ for managing the bulk of the flow. The robots we coded in both Python and C++. We connected and coordinated everything using &lt;a href="http://twistedmatrix.com" title="Twisted is an event-driven networking engine written in Python"&gt;Twisted&lt;/a&gt;, a Python networking engine. Our initial deployment used Python 2.5 and Twisted 8.1. We&amp;#8217;ve since upgraded to Python 2.6 and Twisted 8.2.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc2" name="tocpython-3-on-word-aligned" id="tocpython-3-on-word-aligned"&gt;Python 3 on Word Aligned&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;At work there was no question of using Python 3 even though it became available when we started development. Twisted hadn&amp;#8217;t been released against Python 3 (&lt;a href="http://stackoverflow.com/questions/172306/how-are-you-planning-on-handling-the-migration-to-python-3/214601#214601"&gt;it still hasn&amp;#8217;t&lt;/a&gt;) and even if it had been, we wouldn&amp;#8217;t have trusted it immediately. Here at &lt;a href="http://wordaligned.org/"&gt;Word Aligned&lt;/a&gt;, though, I switched to Python 3 pretty much as soon as it was officially released. Since the start of 2009, &lt;a href="http://wordaligned.org/articles/perl-6-python-3#tocword-aligned-and-python-30"&gt;any Python code published on this site has been written in Python 3&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;Since then I&amp;#8217;ve come to question my decision. I want people to visit my site and I want them to stay long enough to read any code here. Python is perfect because it&amp;#8217;s readable and accessible. Anyone who&amp;#8217;s ever written a program, whatever the  language, can understand Python. But many times I&amp;#8217;ve felt the need to explain my Python 3 code, not to Java, C#, C++ and C users, not even to Perl and Ruby users, but to Python users!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Note that in Python 3 &amp;#8230; whereas in Python 2 &amp;#8230; available in Python 3.1 only &amp;#8230; you&amp;#8217;d need to write &amp;#8230; from __future__ import print_function
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I wouldn&amp;#8217;t have felt the need to say any of this if I&amp;#8217;d stuck with Python 2.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc3" name="tocpython-3-absent-from-europython" id="tocpython-3-absent-from-europython"&gt;Python 3 absent from Europython&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/4036266347/" title="Europython 2009 bag code by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2700/4036266347_9862579d68.jpg" width="500" height="208" alt="Code on the Europython 2009 bag" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;At Europython 2009 I was struck by the absence of Python 3 from the agenda. None of &lt;a href="http://www.europython.eu/talks/timetable/" title="Europython 2009 timetable"&gt;the sessions&lt;/a&gt; covered Python 3, used Python 3, or even mentioned Python 3 (unless you count David Jones&amp;#8217; talk on &lt;a href="http://www.europython.eu/talks/talk_abstracts/index.html#talk74"&gt;Loving Old Versions of Python&lt;/a&gt;). The only Python 3 code I saw appeared on the conference bag; a lightly obfuscated script which printed out the conference destination. Note that &lt;code&gt;print&lt;/code&gt; is being used in a way which works with both 2.x and 3.x &amp;#8212; that is, with parentheses and taking a single parameter. Very few systems resolve &lt;code&gt;/usr/bin/env python&lt;/code&gt; as Python 3, though&lt;a id="fn1link" href="http://wordaligned.org/articles/antipep#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;, which is lucky since even this simple function raises an exception under Python 3 (and transforming the code using &lt;code&gt;&lt;a href="http://docs.python.org/library/2to3.html"&gt;2to3&lt;/a&gt;&lt;/code&gt; makes it worse).
&lt;/p&gt;
&lt;p&gt;This Python 3 silence was at last broken during the question and answer session which followed the final keynote on the final day of the conference. An audibly nervous member of the audience asked &lt;a href="http://www.python.org/psf"&gt;Python Software Foundation&lt;/a&gt; supremo &lt;a href="http://holdenweb.com"&gt;Steve Holden&lt;/a&gt; a question:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Audience member: A source of confusion is the Python 2 Python 3 thing. How are you going about getting people to move from Python 2 to Python 3?
&lt;/p&gt;
&lt;p&gt;Steve Holden: I&amp;#8217;m not trying to get people to move to Python 3. [Audience applauds].
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Steve Holden went on to round out this answer, saying that 2.6 is the recommended production version of Python. Anyone who took Python 3.0 into production, he said, would have been &amp;#8220;kicked in the teeth by the fact that the IO subsystem performed execrably slowly, it was really dreadful&amp;#8221; &amp;#8212; a fact the 3.0 release notes failed to mention, but which has been &lt;a href="http://docs.python.org/3.1/whatsnew/3.1.html#optimizations"&gt;fixed in Python 3.1&lt;/a&gt;. For teaching purposes, or for greenfield development which doesn&amp;#8217;t need to reuse other people&amp;#8217;s code, by all means try Python 3, he said. Python 3 is the future of Python. There&amp;#8217;s a migration strategy in place.
&lt;/p&gt;
&lt;p&gt;And what about the overhead on the core Python development team, who now have two versions to maintain? Well, Steve Holden said, there are tools to automate patching and merging, but yes, there&amp;#8217;s an overhead.
&lt;/p&gt;
&lt;p&gt;(To hear the question and full response, there&amp;#8217;s &lt;a href="http://blip.tv/file/2351630" title="The Python Software Foundation and Us, Steve Holden, Europython 2009"&gt;a video at blip.tv&lt;/a&gt;. Fast forward to 52 minutes and 40 seconds.)
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc4" name="tocpython-3-literature" id="tocpython-3-literature"&gt;Python 3 Literature&lt;/a&gt;&lt;/h3&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0672329786"&gt;&lt;img src="http://wordaligned.org/images/books/python-essential-reference4.png" alt="Python Essential Reference, 4th edition"/&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/gp/product/1430224150?ie=UTF8&amp;amp;tag=diveintomark-20"&gt;&lt;img height="222px" src="http://ecx.images-amazon.com/images/I/51N9HK%2B7WGL._SL300_.jpg" alt="Dive into Python 3 cover"/&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;How have book authors reacted to Python 3? Mark Pilgrim has dived in with  aplomb. His introductory book, &lt;a href="http://www.diveintopython3.org"&gt;Dive into Python 3&lt;/a&gt;, uses Python 3 as Python 3 was intended. For example, you won&amp;#8217;t find &lt;a href="http://docs.python.org/py3k/library/stdtypes.html#old-string-formatting-operations"&gt;% characters&lt;/a&gt; used in string formatting; {up to date &lt;a href="http://www.python.org/dev/peps/pep-3101/" title="PEP 3101, Advanced String Formatting"&gt;braces&lt;/a&gt; are used exclusively}. It&amp;#8217;s an engaging, painstakingly-written book, and (bonus!) the online version is an object lesson in how to craft HTML.
&lt;/p&gt;
&lt;p&gt;David Beazley attacks the problem in a different way, but then his subject is different. His comprehensive &lt;a href="http://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0672329786"&gt;Python Essential Reference&lt;/a&gt; aims to cover the core language and its standard library in its entirety: think of it as the reference any serious Python programmer &lt;a href="http://www.dabeaz.com/blog/2009/08/essential-misconceptions.html"&gt;would like to have within reach&lt;/a&gt;. David Beazley&amp;#8217;s approach is to concentrate on the common subset of Python 2 and 3, omitting features of 2 which aren&amp;#8217;t in 3 and avoiding features of 3 which haven&amp;#8217;t been backported to 2. His book succeeds but it does raise some awkward questions. Will Pythonistas find themselves maintaining parallel code-bases, and end up twisting their code until it fits into the intersection of two flavours of the language?
&lt;/p&gt;
&lt;img src="http://chart.apis.google.com/chart?cht=v&amp;amp;chco=ffdd66,33ccff&amp;amp;chs=450x330&amp;amp;chd=t:100,90,100,70,100,70,70&amp;amp;chdl=Python+2|Python+3&amp;amp;chf=bg,s,EFEFEF&amp;amp;chdlp=l&amp;amp;chtt=Safe+Programming+Zone|Python+2+%e2%88%a9+Python+3&amp;amp;chts=333333,24" alt="Safe Python Programming zone"/&gt;

&lt;p&gt;Or will they simply avoid Python 3?
&lt;/p&gt;
&lt;p&gt;David Beazley eventually covers new Python 3 features in an appendix, by which time the strain has started to show:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Finally, even though Python 3.0 is described as the latest and greatest, it suffers from numerous performance and behavioral problems [&amp;#8230;] in the opinion of this author, Python 3.0 is really only suitable for experimental use by seasoned Python veterans.
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc5" name="tocthe-cost-of-python-3" id="tocthe-cost-of-python-3"&gt;The Cost of Python 3&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;David Beazley may have ended up feeling like a Python 3 beta tester, but, as discussed at the start of this article, most Python users have a free choice. We can live a little longer with Python 2 &lt;a href="http://wiki.python.org/moin/PythonWarts"&gt;warts&lt;/a&gt; in exchange for a proven platform and an excellent set of supporting libraries. We can try and write to a language subset. We can use Python 2 and import much of the future. Or we can dive into Python 3.
&lt;/p&gt;
&lt;p&gt;The people who must find the language fork tough are the Python suppliers. Our choice, as consumers, means work for them: we&amp;#8217;ve mentioned the core Python team, who must surely spend more time patching and testing; think of Python library writers (such as the wizards behind Twisted).
&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s another important class of supplier: the people working on alternative Python implementations, the ones which work on &lt;a href="http://www.jython.org"&gt;Java&lt;/a&gt;, or &lt;a href="http://www.codeplex.com/IronPython"&gt;.Net&lt;/a&gt;, the ones which have no global interpreter lock, the ones which can run deeply recursive functions. Pythonistas are understandably excited about &lt;a href="http://code.google.com/p/unladen-swallow/"&gt;Unladen Swallow&lt;/a&gt;, a development branch of CPython 2.6. Just look at the project &lt;a href="http://code.google.com/p/unladen-swallow/wiki/ProjectPlan"&gt;goals&lt;/a&gt;!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;We want to make Python faster, but we also want to make it easy for large, well-established applications to switch to Unladen Swallow.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     Produce a version of Python at least 5x faster than CPython.
 &lt;/li&gt;

 &lt;li&gt;
     Python application performance should be stable.
 &lt;/li&gt;

 &lt;li&gt;
     Maintain source-level compatibility with CPython applications.
 &lt;/li&gt;

 &lt;li&gt;
     Maintain source-level compatibility with CPython extension modules.
 &lt;/li&gt;

 &lt;li&gt;
     We do not want to maintain a Python implementation forever; we view our work as a branch, not a fork. 
 &lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;&lt;p&gt;In summary, if Unladen Swallow touches down safely, it will become CPython, and anyone using CPython will benefit from a high-level language capable of performing at native speeds.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d like to highlight the final goal, the one about maintaining, branching and forking. Unladen Swallow is a branch taken from CPython 2.6&lt;a id="fn2link" href="http://wordaligned.org/articles/antipep#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;. If it succeeds, much hard and unglamorous work will be needed to merge it to the latest CPython 2, and patch it across to CPython 3. Python implementers must pay a high price, in terms of increased workload, for the Python 2, Python 3 fork.
&lt;/p&gt;
&lt;p&gt;Could Python have evolved in a more linear way, by deprecating then removing features, while adding in new ones? I guess not: a mature language wouldn&amp;#8217;t dare break backwards compatibility.
&lt;/p&gt;
&lt;p&gt;Would it?
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc6" name="tocevolution-of-python" id="tocevolution-of-python"&gt;Evolution of Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;My employers were right to characterise Python as a developing language. On the subject of incremental change, I wanted to highlight again the &lt;a href="http://wordaligned.org/articles/python-counters"&gt;evolution of the multiset in Python&lt;/a&gt;, from Python 1.4, released in 1996, through to Python 3.1, just 4 months old.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Evolution of the Multiset in Python&lt;/div&gt;

&lt;pre class="prettyprint"&gt;def multiset_14(xs):
    multiset = {}
    for x in xs:
        if multiset.has_key(x):
            multiset[x] = multiset[x] + 1
        else:
            multiset[x] = 1
    return multiset

def multiset_15(xs):
    multiset = {}        
    for x in xs:
        multiset[x] = multiset.get(x, 0) + 1
    return multiset

import collections

def multiset_25(xs):
    multiset = collections.defaultdict(int)
    for x in xs:
        multiset[x] += 1
    return multiset

def multiset_31(xs):
    return collections.Counter(xs)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Lovely!
&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size:8px"&gt;&lt;code&gt;multiset_14()&lt;/code&gt; won&amp;#8217;t work in Python 3, and &lt;code&gt;multiset_31()&lt;/code&gt; won&amp;#8217;t work in Python 2 (not yet, anyway).&lt;/span&gt;
&lt;/p&gt;
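&lt;p&gt;As a sketch of what the common subset looks like here: swapping &lt;code&gt;has_key&lt;/code&gt; for the &lt;code&gt;in&lt;/code&gt; operator gives a variant of the first version which should run under both Python 2 and Python 3. The name &lt;code&gt;multiset_in&lt;/code&gt; is made up for this illustration, and I&amp;#8217;m assuming a Python recent enough to support &lt;code&gt;in&lt;/code&gt; on dictionaries (2.2 or later, if memory serves).
&lt;/p&gt;

```python
def multiset_in(xs):
    # Hypothetical name: multiset_14 with has_key replaced by "in".
    multiset = {}
    for x in xs:
        if x in multiset:
            multiset[x] = multiset[x] + 1
        else:
            multiset[x] = 1
    return multiset

# print with parentheses and a single argument works in 2.x and 3.x.
counts = multiset_in("abracadabra")
print(sorted(counts.items()))
```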

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc7" name="tocaccepting-python-3" id="tocaccepting-python-3"&gt;Accepting Python 3&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;&lt;p&gt;The main goal of the Python development community at this point should be to get widespread acceptance of Python 3000. &amp;#8212; Guido van Rossum, &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/006305.html"&gt;2009-10-21&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Unlike Steve Holden, Guido van Rossum &lt;strong&gt;is&lt;/strong&gt; trying to get people to move to Python 3, or at least to accept it. &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/006305.html"&gt;Here&amp;#8217;s how&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;I propose a moratorium on language changes. This would be a period of several years during which no changes to Python&amp;#8217;s grammar or language semantics will be accepted. The reason is that frequent changes to the language cause pain for implementers of alternate implementations (Jython, IronPython, PyPy, and others probably already in the wings) at little or no benefit to the average user (who won&amp;#8217;t see the changes for years to come and might not be in a position to upgrade to the latest version for years after).
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Wow!
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m not close enough to Python development to know exactly what&amp;#8217;s involved here, but a scan of the &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/thread.html#6305"&gt;email thread&lt;/a&gt; suggests this proposal has been widely accepted. I think it&amp;#8217;s clear from the rest of this article that I sympathise with the motivation behind it. Yet I can&amp;#8217;t help feeling uneasy about putting Python on ice. Yes, there have been changes to the language grammar over the past fifteen years. I wouldn&amp;#8217;t say they&amp;#8217;ve been frequent, and there aren&amp;#8217;t many I&amp;#8217;d want to do without, even if I only get to use them (in production) a year or two after they&amp;#8217;ve been released. Yes, these changes cause pain to implementers&lt;a id="fn3link" href="http://wordaligned.org/articles/antipep#fn3"&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt;, but that&amp;#8217;s not the whole story. Who said implementing a language would be easy? Perhaps much of the pain comes from implementing the changes twice, once for 2 and once for 3.
&lt;/p&gt;
&lt;p&gt;Soon there&amp;#8217;ll be a &lt;a href="http://www.python.org/dev/peps" title="Python Enhancement Proposals"&gt;PEP&lt;/a&gt; stating more formally what exactly a moratorium on language changes will mean. That is,
&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;P&lt;/strong&gt;ython &lt;strong&gt;E&lt;/strong&gt;nhancement &lt;strong&gt;P&lt;/strong&gt;roposal which &lt;strong&gt;P&lt;/strong&gt;roposes: Stop &lt;strong&gt;E&lt;/strong&gt;nhancing &lt;strong&gt;P&lt;/strong&gt;ython!
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/antipep#fn1link"&gt;[1]&lt;/a&gt; I wonder if anyone can guess why this code fails, just by looking at it? As a hint, highlight the rest of this paragraph. &lt;span style="color:white"&gt;Some Python 2 codecs are byte rather than text oriented, and Python 3 prohibits this kind of confusion.&lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/antipep#fn2link"&gt;[2]&lt;/a&gt; More details of the Unladen Swallow branch approach. 
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In order to achieve our combination of performance and compatibility goals, we opt to modify CPython, rather than start our own implementation from scratch. In particular, we opt to start working on CPython 2.6.1: Python 2.6 nestles nicely between 2.4/2.5 (which most interesting applications are using) and 3.x (which is the eventual future). Starting from a CPython release allows us to avoid reimplementing a wealth of built-in functions, objects and standard library modules, and allows us to reuse the existing, well-used CPython C extension API. Starting from a 2.x CPython release allows us to more easily migrate existing applications; if we were to start with 3.x, and ask large application maintainers to first port their application, we feel this would be a non-starter for our intended audience. 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;&lt;a id="fn3" href="http://wordaligned.org/articles/antipep#fn3link"&gt;[3]&lt;/a&gt;: C++ compiler writers, gear up for C++0x!
&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/yHwk_vQgrg8" height="1" width="1"/&gt;</description>
<dc:date>2009-10-28</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/antipep</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/yHwk_vQgrg8/antipep</link>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/antipep</feedburner:origLink></item>

<item>
<title>Steady on Subversion</title>
<description>&lt;p&gt;&lt;a href="http://subversion.tigris.org"&gt;&lt;img src="http://subversion.tigris.org/images/subversion_logo_hor-468x64.png" width="468px" height="64px" alt="Subversion banner"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;My secret shame&lt;/h3&gt;
&lt;p&gt;In a world of distributed version control systems I&amp;#8217;m ashamed to confess I still use &lt;a href="http://subversion.tigris.org"&gt;Subversion&lt;/a&gt;. We use it at work, exclusively. I use it at home, by default. Worse still, in a small way, I help promote and perpetuate this antiquated version control system: if you want to &lt;a href="http://www.google.com/search?q=mirror+subversion+repository"&gt;mirror a Subversion repository&lt;/a&gt; or set up a &lt;a href="http://www.google.com/search?q=subversion+pre-commit+hook"&gt;Subversion pre-commit hook&lt;/a&gt;, you may well find some faded notes I wrote on these subjects.
&lt;/p&gt;
&lt;p&gt;Whisper the words. &lt;span style="font-size:8px"&gt;I still like Subversion.&lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;What do I like most? The command to revert a change. Merge it backwards.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ svn merge --change -666

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Branches and tags are one and the same. For someone who grew up with &lt;a href="http://www.nongnu.org/cvs/"&gt;CVS&lt;/a&gt;, that&amp;#8217;s quite a relief. Anyone can grasp the versioned file tree model. No one wants their version control system to surprise them. My boss, who doesn&amp;#8217;t get to program as much as he&amp;#8217;d like, has discovered a &lt;a href="http://versionsapp.com/"&gt;shiny Subversion client&lt;/a&gt; &amp;#8212; and he doesn&amp;#8217;t even use Windows. The sales team, who do use Windows, can use Subversion to collaborate on their office documents. &lt;a href="http://tortoisesvn.tigris.org/"&gt;TortoiseSVN&lt;/a&gt; interfaces to the Word diff tool, a nice touch. And software developers can surely find stable Subversion plug-ins for whatever tools they use.
&lt;/p&gt;
&lt;p&gt;The &lt;a href="http://svnbook.red-bean.com/"&gt;Subversion documentation&lt;/a&gt; is solid and has been for some while. I&amp;#8217;m surprised anyone ever arrives at my website &lt;a href="http://wordaligned.org/tag/subversion"&gt;seeking tips&lt;/a&gt;, but arrive they do, and in ever-increasing numbers.
&lt;/p&gt;
&lt;p&gt;Subversion does enough. The hard parts of my job are deciding what software to write, writing it, and working as a team. &lt;span /&gt;Version control should be frictionless, the easy bit. Which it is.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;Of course Subversion has weak points. It should be faster (whee, see how &lt;a href="http://git-scm.com"&gt;git&lt;/a&gt; flies!) and merging can be irksome (improving on CVS wasn&amp;#8217;t much of a target). But the biggest annoyance I&amp;#8217;ve had with Subversion is caused by its ubiquity and its continuing upgrade trajectory. Somehow I&amp;#8217;ve ended up accessing 1.4 and 1.5 format repositories on a machine which hosts 1.4, 1.5 and 1.6 clients in &lt;code&gt;/usr/local/bin&lt;/code&gt;, &lt;code&gt;/usr/bin&lt;/code&gt; and &lt;code&gt;/opt/local/bin&lt;/code&gt;, not necessarily in that order. Silly me, I&amp;#8217;m sorted now, I think, but I&amp;#8217;d happily see Subversion go into maintenance mode. &lt;span /&gt;For managing change, give me stable software.
&lt;/p&gt;

&lt;h3&gt;Do it the same, but better!&lt;/h3&gt;
&lt;p&gt;As mentioned at the start of this post, though, the world of version control has itself changed. Subversion represents evolution: by being a better CVS, it aimed to supplant its ancestor and become the VCS of choice for open source projects. CVS has indeed been supplanted, but true progress has come from the distributed version control revolution.
&lt;/p&gt;
&lt;p&gt;We&amp;#8217;ve been talking about a single, central source tree which develops in discrete steps. Everyone has a local working copy of the files in this tree, which they keep up to date, routinely merging changes back to base. Check out, check in. It &lt;strong&gt;is&lt;/strong&gt; an easy model to understand, but in practice there can be problems. What happens when you can&amp;#8217;t access the tree? Or when it gets pulled in different directions? Or when you lose track of who merged what where when? Now consider the distributed version control world, where the model extends to multiple trees. Everyone copies the entire repository as needed. Clone, merge.
&lt;/p&gt;
&lt;p&gt;In this distributed world a project needn&amp;#8217;t have a single, central repository&lt;a id="fn1link" href="http://wordaligned.org/articles/steady-on-subversion#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. What&amp;#8217;s more, there is no single leading distributed version control system. As a result, open source projects are spoiled for choice.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://git-scm.com/"&gt;&lt;img src="http://wordaligned.org/images/git-logo.png" alt="git logo"/&gt;&lt;/a&gt;&lt;a href="http://bazaar-vcs.org/"&gt;&lt;img src="http://bazaar-vcs.org/bzricons/bazaar-logo.png" alt="Bazaar logo"/&gt;&lt;/a&gt;&lt;a href="http://mercurial.selenic.com"&gt;&lt;img src="http://www.selenic.com/hg-logo/logo-droplets-100.png" alt="Mercurial logo"/&gt;&lt;/a&gt;&lt;a href="http://www.darcs.net"&gt;&lt;img src="http://www.darcs.net/logos/logo.png" alt="Darcs logo"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The rise of the DVCS is a fascinating history, though one I&amp;#8217;ve yet to directly engage with &amp;#8212; unless you count the growing collection of DVCSes taking root on my hard disk (none of which shipped with my &lt;a href="http://www.apple.com/macosx/" title="Snow Leopard"&gt;operating system&lt;/a&gt;). I like the feel of &lt;a href="http://git-scm.com"&gt;git&lt;/a&gt;. Python will &lt;a href="http://python.org/dev/peps/pep-0385/" title="PEP 385. Migrating from svn to Mercurial"&gt;migrate to mercurial&lt;/a&gt;. For now, I&amp;#8217;m staying put.
&lt;/p&gt;

&lt;h3&gt;Definitive commentary&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.red-bean.com/sussman/"&gt;Ben Collins-Sussman&lt;/a&gt; is one of the original designers and developers of Subversion, and co-author of &lt;a href="http://svnbook.red-bean.com/"&gt;Version Control with Subversion&lt;/a&gt;. His essays on the changing field of version control make fine reading. A couple of years ago he wrote:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Today, Subversion has now gone from &amp;#8220;cool subversive product&amp;#8221; to &amp;#8220;the default safe choice&amp;#8221; for both 80% and 20% audiences. The 80% companies who were once using crappy version control (or no version control at all) are now blogging to one another &amp;#8212; web developers giving &amp;#8220;hot tips&amp;#8221; to each other about using version control (and Subversion in particular) to manage their web sites at their small web-development shops. What was once new and hot to 20% people has finally trickled down to everyday-tool status among the 80%.
&lt;/p&gt;
&lt;p&gt;The great irony here &amp;#8230; is that Subversion was originally intended to subvert the open source world. It&amp;#8217;s done that to a reasonable degree, but it&amp;#8217;s proven far more subversive in the corporate world!
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Ben Collins-Sussman, &lt;a href="http://blog.red-bean.com/sussman/?p=79" title="Version Control and the 80%"&gt;Version Control and &amp;#8220;the 80%&amp;#8221;&lt;/a&gt;, October 2007
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;In April last year he followed up with:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;[&amp;#8230;] we think that [Subversion] will probably be the &amp;#8220;final&amp;#8221; centralized system that gets written in the open source world &amp;#8212; it represents the end-of-the-line for this model of code collaboration. 
&lt;/p&gt;
&lt;p&gt;It will continue to be used for many years, but specifically it will gain huge mindshare in the corporate world, while (eventually) losing mindshare to distributed systems in the open-source arena &amp;#8230; Subversion isn&amp;#8217;t anywhere near &amp;#8220;fading away&amp;#8221;. Quite the opposite: its adoption is still growing quadratically in the corporate world, with no sign of slowing down. This is happening independently of open source trailblazers losing interest in it. It may end up becoming a mainly &amp;#8220;corporate&amp;#8221; open source project (that is, all development funded by corporations that depend on it), but that&amp;#8217;s a fine way for a piece of mature software to settle down.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Ben Collins-Sussman, &lt;a href="http://blog.red-bean.com/sussman/?p=90"&gt;Subversion&amp;#8217;s Future&lt;/a&gt;, April 2008
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Long live Subversion&lt;/h3&gt;
&lt;p&gt;Ben Collins-Sussman backs up his essay with &lt;a href="http://subversion.tigris.org/images/svn-dav-securityspace-survey.png"&gt;a graph&lt;/a&gt; showing the increasing numbers of Apache Subversion servers discoverable on the internet. His claims square with my personal experience. I&amp;#8217;m a corporate Subversion user and I don&amp;#8217;t see my employer switching version control systems any time soon (it&amp;#8217;s my decision as much as anyone&amp;#8217;s). What&amp;#8217;s more, Subversion is used in most of the companies I know of, where it has supplanted both legacy and proprietary systems. As stated already, version control isn&amp;#8217;t the hard part of my job, but should I ever need to change jobs, Subversion won&amp;#8217;t stand in my way.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/steady-on-subversion#fn1link"&gt;[1]&lt;/a&gt;: A project may well choose to nominate a single central repository as the &amp;#8220;master&amp;#8221; repository. The functionality offered by distributed version control systems is effectively a superset of that offered by centralised ones.
&lt;/p&gt;</description>
<dc:date>2009-10-13</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/steady-on-subversion</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/MLYzlslh92Y/steady-on-subversion</link>
<category>Subversion</category>
<feedburner:origLink>http://wordaligned.org/articles/steady-on-subversion</feedburner:origLink></item>

<item>
<title>Favicon</title>
<description>&lt;p&gt;On the subject of &lt;a href="http://wordaligned.org/"&gt;this site&lt;/a&gt;, I wanted to mention the recent addition of a favicon &lt;img src="http://wordaligned.org/images/favicon.png" alt="Little chap favicon"/&gt;. Per pixel, it&amp;#8217;s cost me more effort than any other feature; but then it&amp;#8217;s accessed more than any other asset. It&amp;#8217;s meant to be a piece from a jigsaw puzzle. I got the idea when &lt;a href="http://wordaligned.org/articles/recursive-pictures"&gt;re-reading Life A User&amp;#8217;s Manual&lt;/a&gt;. I like &lt;a href="http://wordaligned.org/tag/puzzles"&gt;puzzles&lt;/a&gt; and piecing things together.
&lt;/p&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/gp/product/0002719991?ie=UTF8&amp;amp;tag=wordalig-20"&gt;&lt;img src="http://wordaligned.org/images/books/life-a-users-manual.jpg" alt="Life A User's Manual"/&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;Perec&amp;#8217;s great masterpiece is packed with interwoven stories and trickery, but at its heart is the epic battle between the millionaire, Bartlebooth, and the puzzle-maker, Gaspard Winckler. Bartlebooth begins his campaign by learning how to paint, which takes him 10 years. For the next 20 years he travels the world, painting a watercolour of a different port every couple of weeks. He sends the paintings back home to Paris. On receipt, Winckler glues each picture to a board which he then cuts, making a series of jigsaw puzzles for Bartlebooth to solve on his return. Once Bartlebooth completes each puzzle, an ingenious process is used to glue its pieces together and re-join the cut fibres of the paper; then the picture itself is lifted from the board, returned to the port it depicts, and washed clean in the sea; and finally the paper is returned in something close to its original state to Bartlebooth.
&lt;/p&gt;
&lt;p&gt;Thus, after 50 years of work, there will be nothing to show.
&lt;/p&gt;
&lt;img src="http://wordaligned.org/images/jigsaw-fr.png" alt="French jigsaw pieces"/&gt;

&lt;p&gt;In the book&amp;#8217;s preamble Perec describes familiar die-cut jigsaws, classifying the best known pieces of such puzzles as &amp;#8220;little chaps&amp;#8221;, &amp;#8220;double crosses&amp;#8221; and &amp;#8220;crossbars&amp;#8221;. Such diversions are eschewed by the true puzzler:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The art of jigsaw puzzling begins with wooden puzzles cut by hand, whose maker undertakes to ask himself all the questions the player will have to solve, and, instead of allowing chance to cover his tracks, aims to replace it with cunning, trickery and subterfuge. All the elements occurring in the image to be reassembled &amp;#8212; this armchair covered in gold brocade, that three-pointed black hat with its rather ruined black plume, or that silver-braided bright yellow livery &amp;#8212; serve by design as points of departure for trails that lead to false information.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;hr /&gt;

&lt;p&gt;My thanks to Tim Beard for scanning a couple of pages from his edition of &lt;a href="http://en.wikipedia.org/wiki/La_Vie_mode_d%27emploi" title="Life A User's Manual, Wikipedia"&gt;La Vie, Mode d&amp;#8217;Emploi&lt;/a&gt;. I wanted to know what the &amp;#8220;little chaps&amp;#8221; etc. were before they got translated into English. I realise my favicon &lt;img src="http://wordaligned.org/images/favicon.png" alt="Little chap favicon"/&gt; could be &lt;a href="http://typophile.com/node/60577" title="Astonishing exploded view of improved YouTube favicon"&gt;improved&lt;/a&gt; but I don&amp;#8217;t know how to go about it. Anyone?
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube1.png" width="52px" height="26px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube2.png" width="104px" height="52px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube3.png" width="208px" height="104px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;
&lt;/p&gt;</description>
<dc:date>2009-09-16</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/favicon</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/meZDMcEn2Cg/favicon</link>
<category>Self</category>
<category>Perec</category>
<category>Puzzles</category>
<feedburner:origLink>http://wordaligned.org/articles/favicon</feedburner:origLink></item>

<item>
<title>Code Rot</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocdvbcodec-fail" name="toc0" id="toc0"&gt;Dvbcodec Fail&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toccode-rot" name="toc1" id="toc1"&gt;Code Rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocstandard-c" name="toc2" id="toc2"&gt;Standard C++&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocsupport-rot" name="toc3" id="toc3"&gt;Support Rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tochow-did-this-ever-compile" name="toc4" id="toc4"&gt;How did this ever compile?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocboost-advances" name="toc5" id="toc5"&gt;Boost advances&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocchanging-behaviour" name="toc6" id="toc6"&gt;Changing behaviour&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toccode-inaction" name="toc7" id="toc7"&gt;Code inaction&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocrotten-artefacts" name="toc8" id="toc8"&gt;Rotten artefacts&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocstopping-the-rot" name="toc9" id="toc9"&gt;Stopping the rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocthanks" name="toc10" id="toc10"&gt;Thanks&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;blockquote&gt;&lt;p&gt;Those of us who have to tiptoe around non-standard or ancient compilers will know that template template parameters are off limits. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://www.oxyware.com/CheckedInt.pdf" title="CheckedInt: A policy-based range-checked integer, Hubert Matthews"&gt;Hubert Matthews (PDF)&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc0" name="tocdvbcodec-fail" id="tocdvbcodec-fail"&gt;Dvbcodec Fail&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Long ago, way back in 2004, I wrote an &lt;a href="http://wordaligned.org/docs/dvbcodec/index.html"&gt;article&lt;/a&gt; for &lt;a href="http://accu.org/index.php/journals/241" title="A Mini-project to decode a Mini-Language, Thomas Guest"&gt;Overload&lt;/a&gt; describing how to use the &lt;a href="http://www.boost.org/doc/libs/1_39_0/libs/spirit/index.html" title="Boost Spirit C++ parser framework"&gt;Boost Spirit&lt;/a&gt; parser framework to generate C++ code which could convert structured binary data to text. I went on to republish this article on my own website, where I also included a source distribution.
&lt;/p&gt;
&lt;p&gt;Much has changed since then. The C++ language may not have, but compiler and platform support for it has improved considerably. Boost survives &amp;#8212; indeed, many of its libraries will feed into the next version of C++. Overload thrives, adapting to an age when printed magazines about programming are all but extinct. My old website proved less durable: I&amp;#8217;ve changed domain name and shuffled things around more than once. But you can still find the article online if you look hard enough, and recently someone did indeed find it. He, let&amp;#8217;s call him Rick, downloaded the source code archive, &lt;a href="http://wordaligned.org/docs/dvbcodec/dvbcodec-1.0.zip" title="Rotten dvbcodec source distribution"&gt;dvbcodec-1.0.zip&lt;/a&gt;, extracted it, scanned the README, typed:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ make

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&amp;#8230; and discovered the code didn&amp;#8217;t even build.
&lt;/p&gt;
&lt;p&gt;At this point many of us would assume (correctly) the code had not been maintained. We&amp;#8217;d delete it and write off the few minutes it took to evaluate it. Rick decided instead to contact me and let me know my code was broken. He even offered a fix for one problem.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc1" name="toccode-rot" id="toccode-rot"&gt;Code Rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Sad to say, I wasn&amp;#8217;t entirely surprised. I no longer use this code. Unused code stops working. It decays.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m not talking about a compiled executable, which the compiler has tied to a particular platform, and which therefore progressively degrades as the platform advances. (I&amp;#8217;ve heard stories about device drivers for which the source code has long been lost, and which require ever more elaborate emulation shims to keep them alive.) I&amp;#8217;m talking about source code. And the decay isn&amp;#8217;t usually literal, though I suppose you might have a source listing on a mouldy printout, or an unreadable floppy disk.
&lt;/p&gt;
&lt;p&gt;No, the code itself is usually a pristine copy of the original. Publishers often attach checksums to source distributions so readers can verify their download is correct. I hadn&amp;#8217;t taken this precaution with my &lt;code&gt;dvbcodec-1.0.zip&lt;/code&gt; but I&amp;#8217;m certain the version Rick downloaded was exactly the same as the one I created 5 years ago. Yet in that time it had stopped working. Why?
&lt;/p&gt;
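&lt;p&gt;As a rough illustration of that precaution, here is a minimal Python sketch of checksum verification. The function names, the choice of SHA-256 and the chunk size are all assumptions made for this example; nothing like it shipped with &lt;code&gt;dvbcodec-1.0.zip&lt;/code&gt;.
&lt;/p&gt;

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        # Read in chunks so a large archive need not fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's digest equals the digest the publisher listed."""
    return sha256_of(path) == published_hex.strip().lower()
```

&lt;p&gt;A matching digest only shows that the bytes are unchanged. It says nothing about whether those bytes still build.
&lt;/p&gt;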
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc2" name="tocstandard-c" id="tocstandard-c"&gt;Standard C++&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As already mentioned, this was C++ code. C++ is backed by an ISO standard, ratified in 1998, with corrigenda published in 2003. You might expect C++ code to improve with age, compiling and running more quickly, less likely to run out of resources.
&lt;/p&gt;
&lt;p&gt;Not so. My favourite counter-example comes from a nice paper &lt;a href="http://www.oxyware.com/CheckedInt.pdf" title="CheckedInt: A policy-based range-checked integer, Hubert Matthews"&gt;&amp;#8220;CheckedInt: A policy-based range-checked integer&amp;#8221; (PDF)&lt;/a&gt; published by Hubert Matthews in 2004 which discusses how to use C++ templates to implement a range-checked integer. The paper includes a source code listing together with some notes to help readers forced to &amp;#8220;tiptoe around non-standard or ancient compilers&amp;#8221; (think: MSVC6). Yet when I experimented with this code in 2005 I found myself tripped up by a strict and up-to-date compiler.
&lt;/p&gt;
&lt;pre&gt;
$ g++ -Wall -c checked_int.cpp
checked_int.cpp: In constructor `CheckedInt&amp;lt;low, high, ValueChecker&amp;gt;::CheckedInt(int)':
checked_int.cpp:45: error: there are no arguments to `RangeCheck' that
depend on a template parameter, so a declaration of `RangeCheck' must
be available
checked_int.cpp:45: error: (if you use `-fpermissive', G++ will accept
your code, but allowing the use of an undeclared name is deprecated)
&lt;/pre&gt;

&lt;p&gt;I emailed Hubert Matthews using the address included at the top of his paper. He swiftly and kindly put me straight on how to fix the problem.
&lt;/p&gt;
&lt;p&gt;What&amp;#8217;s interesting here is that this code is pure C++, just over a page of it. It has no dependencies on third party libraries. Hubert Matthews is a C++ expert and he acknowledges the help of two more experts, &lt;a href="http://erdani.org" title="Author of Modern C++ and coauthor of C++ Coding Standards"&gt;Andrei Alexandrescu&lt;/a&gt; and &lt;a href="http://curbralan.com" title="Programming guru"&gt;Kevlin Henney&lt;/a&gt;, in his paper. Yet the code fails to build using both ancient and modern compilers. In its published form it has the briefest of shelf-lives.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc3" name="tocsupport-rot" id="tocsupport-rot"&gt;Support Rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Code alone is of limited use. What really matters for its ongoing health is that someone cares about it &amp;#8212; someone exercises, maintains and supports it. Hubert Matthews included an email address in his paper and I was able to contact him using that address.
&lt;/p&gt;
&lt;p&gt;How well would my code shape up on this front? Putting myself in Rick&amp;#8217;s position, I unzipped the source distribution I&amp;#8217;d archived 5 years ago. I was pleased to find a README which, at the very top, provides a URL for updates, &lt;a href="http://homepage.ntlworld.com/thomas.guest"&gt;http://homepage.ntlworld.com/thomas.guest&lt;/a&gt;. I was less pleased to find this URL gave me a &lt;strong&gt;404 Not Found&lt;/strong&gt; error. Similarly, when I tried emailing the project maintainer mentioned in the README, I got a &lt;strong&gt;550 Invalid recipient&lt;/strong&gt; error: the attempted delivery to &lt;a href="mailto:thomas.guest@ntlworld.com"&gt;thomas.guest@ntlworld.com&lt;/a&gt; had failed permanently.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://homepage.ntlworld.com/thomas.guest"&gt;&lt;img src="http://wordaligned.org/images/ntlworld-404.png" alt="NTL World 404" width="520px" height="400px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.w3.org/Provider/Style/URI" title="Tim Berners-Lee's classic advice on creating stable links"&gt;Cool URIs don&amp;#8217;t change&lt;/a&gt; but my old NTL homepage was anything but cool; it came for free with a dial-up connection I&amp;#8217;ve happily since abandoned. Looking back, maybe I should have found a more stable location for my code. If I&amp;#8217;d set up (e.g.) a Sourceforge project then my &lt;code&gt;dvbcodec&lt;/code&gt; project might still be alive and supported, possibly even by a new maintainer.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc4" name="tochow-did-this-ever-compile" id="tochow-did-this-ever-compile"&gt;How did this ever compile?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Wise hindsight wouldn&amp;#8217;t resurrect my code. If I wanted to continue I&amp;#8217;d have to go it alone. Here&amp;#8217;s what the README had to say about platform requirements.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;   REQUIREMENTS and PLATFORMS
&lt;/p&gt;
&lt;p&gt;   To build the dvbcodec you will need Version 1.31.0 of Boost, or later.
&lt;/p&gt;
&lt;p&gt;   You will also need a good C++ compiler. The dvbcodec has been built and
      tested on the Windows operating system using: GCC 3.3.1, MSVC 7.1
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;A &amp;#8220;good C++ compiler&amp;#8221;, eh? As we&amp;#8217;ve already seen, GCC 3.3.1 may be good but my platform has GCC 4.0.1 installed, which is better. If my records can be believed, this &lt;code&gt;upperCase()&lt;/code&gt; function compiled cleanly using both GCC 3.3.1 and MSVC 7.1.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;std::string
upperCase(std::string const &amp;amp; lower)
{
    std::string upper = lower;
    
    for (std::string&amp;lt;char&amp;gt;::iterator cc = upper.begin();
         cc != upper.end(); ++cc)
    {
        * cc = std::toupper(* cc);
    }
    
    return upper;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Huh? The name &lt;code&gt;std::string&lt;/code&gt; is a typedef for &lt;code&gt;std::basic_string&amp;lt;char&amp;gt;&lt;/code&gt; and there&amp;#8217;s no such thing as a &lt;code&gt;std::basic_string&amp;lt;char&amp;gt;&amp;lt;char&amp;gt;::iterator&lt;/code&gt;, which is what GCC 4.0.1 reports:
&lt;/p&gt;
&lt;pre&gt;
stringutils.cpp:58: error: 'std::string' is not a template
&lt;/pre&gt;

&lt;p&gt;The simple fix is to write &lt;code&gt;std::string::iterator&lt;/code&gt; instead of &lt;code&gt;std::string&amp;lt;char&amp;gt;::iterator&lt;/code&gt;. A better fix, suggested by Rick, is to use &lt;code&gt;std::transform()&lt;/code&gt;. I wonder why I missed this first time round?
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;std::string
upperCase(std::string const &amp;amp; lower)
{
    std::string upper = lower;
    std::transform(upper.begin(), upper.end(), upper.begin(), ::toupper);
    return upper;
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc5" name="tocboost-advances" id="tocboost-advances"&gt;Boost advances&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;GCC has become stricter about what it accepts even though the formal specification of what it should do (the C++ standard) has stayed put. The Boost C++ libraries have more freedom to evolve, and the next round of build problems I encountered related to Boost.Spirit&amp;#8217;s evolution. Whilst it would be possible to require dvbcodec users to build against Boost 1.31 (which can still be downloaded from the &lt;a href="http://www.boost.org"&gt;Boost website&lt;/a&gt;) it wouldn&amp;#8217;t be reasonable. So I updated my machine (using MacPorts) to make sure I had an up-to-date version of Boost, 1.38 at the time of writing.
&lt;/p&gt;
&lt;pre&gt;
$ sudo port upgrade boost
&lt;/pre&gt;

&lt;p&gt;Boost&amp;#8217;s various dependencies triggered an upgrade of boost-jam, gperf, libiconv, ncursesw, ncurses, gettext, zlib, bzip2, and this single command took over an hour to complete.
&lt;/p&gt;
&lt;p&gt;I discovered that Boost.Spirit, the C++ parser framework on which &lt;code&gt;dvbcodec&lt;/code&gt; is based, has gone through an overhaul. According to the change log the flavour of Spirit used by &lt;code&gt;dvbcodec&lt;/code&gt; is now known respectfully as Spirit Classic. A clever use of namespaces and include path forwarding meant my &amp;#8220;classic&amp;#8221; client code would at least compile, at the expense of some deprecation warnings.
&lt;/p&gt;
&lt;pre&gt;
Computing dependencies for decodeout.cpp...
Compiling decodeout.cpp...
In file included from codectypedefs.hpp:11,
                 from decodecontext.hpp:10,
                 from decodeout.cpp:8:
/opt/local/include/boost/spirit/tree/ast.hpp:18:4: warning: #warning "This header is deprecated. Please use: boost/spirit/include/classic_ast.hpp"
In file included from codectypedefs.hpp:12,
                 from decodecontext.hpp:10,
                 from decodeout.cpp:8:
&lt;/pre&gt;

&lt;p&gt;To suppress these warnings I included the preferred header. I then had to change namespace directives from &lt;code&gt;boost::spirit&lt;/code&gt; to &lt;code&gt;boost::spirit::classic&lt;/code&gt;. I fleetingly considered porting my code to Spirit V2, but decided against it: for even after this first round of changes, I still had a build problem.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc6" name="tocchanging-behaviour" id="tocchanging-behaviour"&gt;Changing behaviour&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Actually, this was a second-level build problem. The &lt;code&gt;dvbcodec&lt;/code&gt; build has multiple phases:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     it builds a program to generate code. This generator can parse binary format syntax descriptions and emit C++ code which will convert data formatted according to these descriptions
 &lt;/li&gt;

 &lt;li&gt;
     it runs this generator with the available syntax descriptions as inputs
 &lt;/li&gt;

 &lt;li&gt;
     it compiles the emitted C++ code into a final &lt;code&gt;dvbcodec&lt;/code&gt; executable
 &lt;/li&gt;
&lt;/ol&gt;
&lt;img src="http://wordaligned.org/images/dvbcodec-build.png" alt="Dvbcodec build process"/&gt;
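&lt;p&gt;The three phases can be sketched in a few lines of Python. This is a toy under stated assumptions, not the dvbcodec generator: it emits Python rather than C++, it understands only flat lists of name and bit-width pairs, and every name in it (&lt;code&gt;parse_fields&lt;/code&gt;, &lt;code&gt;emit_decoder&lt;/code&gt;, &lt;code&gt;build_decoder&lt;/code&gt;, the &lt;code&gt;example_section&lt;/code&gt; description and its &lt;code&gt;padding&lt;/code&gt; field) is invented for the illustration.
&lt;/p&gt;

```python
def parse_fields(description):
    """Phase 1 tool: turn 'name width' lines into (name, width) pairs."""
    fields = []
    for line in description.splitlines():
        parts = line.split()
        # Lines such as the section header and the braces are ignored.
        if len(parts) == 2 and parts[1].isdigit():
            fields.append((parts[0], int(parts[1])))
    return fields


def emit_decoder(fields):
    """Phase 2: emit source code for a decoder of the given fields."""
    lines = [
        "def decode(data):",
        "    bits = ''.join(format(byte, '08b') for byte in data)",
        "    out, pos = {}, 0",
    ]
    for name, width in fields:
        lines.append("    out[%r] = int(bits[pos:pos + %d], 2)" % (name, width))
        lines.append("    pos += %d" % width)
    lines.append("    return out")
    return "\n".join(lines)


def build_decoder(description):
    """Phase 3: compile the emitted source and return the decoder."""
    namespace = {}
    exec(emit_decoder(parse_fields(description)), namespace)
    return namespace["decode"]


# A made-up syntax description, loosely in the style of the real ones.
SYNTAX = """
example_section() {
    table_id                   8
    section_syntax_indicator   1
    padding                    7
}
"""

decode = build_decoder(SYNTAX)
print(decode(bytes([0x42, 0x80])))
# prints {'table_id': 66, 'section_syntax_indicator': 1, 'padding': 0}
```

&lt;p&gt;Keeping the emitted source as plain text between phases means it can be inspected, or compiled separately, which is the property the real build relies on.
&lt;/p&gt;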

&lt;p&gt;I ran into a problem during the second phase of this process. The dvbcodec generator no longer parsed all of the supplied syntax descriptions. Specifically, I was seeing this conditional test raise an exception when trying to parse section format syntax descriptions.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;    if (!parse(section_format,
               section_grammar,
               space_p).full)
    {
        throw SectionFormatParseException(section_format);
    }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;parse&lt;/code&gt; is &lt;code&gt;boost::spirit::classic::parse&lt;/code&gt;, which parses something &amp;#8212; the section format syntax description, passed as a string in this case &amp;#8212; according to the supplied grammar. The third parameter, &lt;code&gt;boost::spirit::classic::space_p&lt;/code&gt;, is a skip parser which tells &lt;code&gt;parse&lt;/code&gt; to skip whitespace between tokens. The call to &lt;code&gt;parse&lt;/code&gt; returns a &lt;code&gt;parse_info&lt;/code&gt; struct whose &lt;code&gt;full&lt;/code&gt; field is a boolean which will be set to &lt;code&gt;true&lt;/code&gt; if the input section format has been fully consumed.
&lt;/p&gt;
&lt;p&gt;I soon figured out that the parse call was failing to fully consume binary syntax descriptions with trailing spaces, such as the one shown below.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;" program_association_section() {"
"    table_id                   8"
"    section_syntax_indicator   1"
"    '0'                        1"
....
"    CRC_32                    32"
" }                              "

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If I stripped the trailing whitespace after the closing brace before calling &lt;code&gt;parse()&lt;/code&gt; all would be fine. I wasn&amp;#8217;t fine about this fix though. The Spirit documentation is very good but it had been a while since I&amp;#8217;d read it and, as already mentioned, my code used the &amp;#8220;classic&amp;#8221; version of Spirit, in danger of becoming the &amp;#8220;legacy&amp;#8221; then &amp;#8220;deprecated&amp;#8221; and eventually the &amp;#8220;dead&amp;#8221; version. Re-reading the documentation it wasn&amp;#8217;t clear to me exactly what the correct behaviour of &lt;code&gt;parse()&lt;/code&gt; should be in this case. Should it fully consume trailing space? Had my program ever worked?
&lt;/p&gt;
&lt;p&gt;I went back in time, downloading and building against Boost 1.31, and satisfied myself that my code used to work, though maybe it worked due to a bug in the old version of Spirit. Stripping trailing spaces before parsing allowed my code to work with Spirit past and present, so I curtailed my investigation and made the fix.
&lt;/p&gt;
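&lt;p&gt;The shape of the fix can be shown without Spirit. The Python sketch below is an analogy, not Spirit&amp;#8217;s algorithm: a hypothetical grammar, written as a regular expression, skips whitespace before each token but not after the last one, so a full match fails on trailing spaces until they are stripped. The names &lt;code&gt;SECTION&lt;/code&gt; and &lt;code&gt;parses_fully&lt;/code&gt;, and the sample description, are made up for this example.
&lt;/p&gt;

```python
import re

# Toy grammar: a name, "()", then braces around "field width" pairs.
# Whitespace is skipped before each token, never after the last one.
SECTION = re.compile(r"\s*\w+\(\)\s*\{(\s*\S+\s+\d+)*\s*\}")


def parses_fully(text):
    """Mimic checking parse(...).full: True only if all input is consumed."""
    return SECTION.fullmatch(text) is not None


description = " example_section() {\n    table_id 8\n    CRC_32 32\n }      "

print(parses_fully(description))           # False: trailing spaces remain
print(parses_fully(description.rstrip()))  # True: input fully consumed
```

&lt;p&gt;Stripping the input before parsing gives the same answer whether or not the parser itself consumes trailing whitespace, which is why the workaround holds across library versions.
&lt;/p&gt;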
&lt;p&gt;(Interestingly, Boost 1.31 found a way to warn me I was using a compiler it didn&amp;#8217;t know about.
&lt;/p&gt;
&lt;pre&gt;
boost_1_31_0/boost/config/compiler/gcc.hpp:92:7: warning: 
#warning "Unknown compiler version - please run the configure tests and report the results"
&lt;/pre&gt;

&lt;p&gt;I ignored this warning.)
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc7" name="toccode-inaction" id="toccode-inaction"&gt;Code inaction&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Apologies for the lengthy explanation in the previous section. The point is, few software projects stand alone, and changes in any dependencies, &lt;strong&gt;including bug fixes&lt;/strong&gt;, can have knock-on effects. In this instance, I consider myself lucky; &lt;code&gt;dvbcodec&lt;/code&gt;&amp;#8217;s unusual three-phase build enabled me to catch a runtime error before generating the final product. Of course, to actually catch that error, I needed to at least try building my code.
&lt;/p&gt;
&lt;p&gt;More simply: if you don&amp;#8217;t use your code, it rots.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc8" name="tocrotten-artefacts" id="tocrotten-artefacts"&gt;Rotten artefacts&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It wasn&amp;#8217;t just the code which had gone off. My source distribution included documentation &amp;#8212; the plain text version of the article I&amp;#8217;d written for Overload &amp;#8212; and the Makefile had a build target to generate an HTML version of this documentation. This target depended on &lt;a href="http://www.boost.org/doc/tools/quickbook/index.html" title="Quickbook, a Boost documentation tool"&gt;Quickbook&lt;/a&gt;, another Boost tool. Quickbook generates Docbook XML from plain text source, and Docbook is a good starting point for HTML, PDF and other standard output formats.
&lt;/p&gt;
&lt;p&gt;This is quite a sophisticated toolchain. It&amp;#8217;s also one I no longer use. Most of what I write goes straight to the web and I don&amp;#8217;t need such a fiddly process just to produce HTML. So I decided to freshen up dead links, leave the original documentation as a record, and simply cut the documentation target from the Makefile.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc9" name="tocstopping-the-rot" id="tocstopping-the-rot"&gt;Stopping the rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As we&amp;#8217;ve seen, software, like other soft organic things, breaks down over time. How can we stop the rot?
&lt;/p&gt;
&lt;p&gt;Freezing software to a particular executable built against a fixed set of dependencies to run on a single platform is one way &amp;#8212; and maybe some of us still have an aging Windows 95 machine, kept alive purely to run some such frozen program.
&lt;/p&gt;
&lt;p&gt;A better solution is to actively tend the software and ensure it stays in shape. Exercise it regularly on a build server. Record test results. Fix faults as and when they appear. Review the architecture. Upgrade the platform and dependencies. Prune unused features, splice in new ones. This is the path taken by the Boost project, though certainly the growth far outpaces any pruning (the Boost 1.39 download is 5 times bigger than its 1.31 ancestor). Boost takes forwards and backwards compatibility seriously, hence the ongoing support for Spirit classic and the compiler version certification headers. Maintaining compatibility can be at odds with simplicity.
&lt;/p&gt;
&lt;p&gt;There is another way too. Although the &lt;code&gt;dvbcodec&lt;/code&gt; project has collapsed into disrepair the idea behind it certainly hasn&amp;#8217;t. I&amp;#8217;ve taken this same idea &amp;#8212; of parsing formal syntax descriptions to generate code which handles binary formatted data &amp;#8212; and enhanced it to work more flexibly and with a wider range of inputs. Whenever I come across a new binary data structure, I paste its syntax into a text file, regenerate the code, and I can work with this structure. Unfortunately I can&amp;#8217;t show you any code (it&amp;#8217;s proprietary) but I hope I&amp;#8217;ve shown you the idea. Effectively, &lt;span /&gt;the old C++ code has been left to rot but the idea within it remains green, recoded in Python. Maybe I should find a way to humanely destroy the C++ and all links to it, but for now I&amp;#8217;ll let it degrade, an illustration of its time.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Is it possible that software is not like anything else, that it is meant to be discarded: that the whole point is to see it as a soap bubble? &amp;#8212; &lt;a href="http://www.cs.yale.edu/quotes.html"&gt;Alan J. Perlis&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc10" name="tocthanks" id="tocthanks"&gt;Thanks&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I would like to thank Rick Engelbrecht for reporting and helping to fix the bugs discussed in this article.
&lt;/p&gt;
&lt;p&gt;This article first appeared in Overload 92, and I would like to thank the team at Overload for their expert help.
&lt;/p&gt;</description>
<dc:date>2009-09-03</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/code-rot</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/_PKUCHV6Xms/code-rot</link>
<category>C++</category>
<category>Build</category>
<feedburner:origLink>http://wordaligned.org/articles/code-rot</feedburner:origLink></item>

</channel>
</rss>

