<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.wordaligned.org/~d/styles/itemcontent.css"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
<channel>
<title>Word Aligned</title>
<link>http://wordaligned.org</link>
<description>tales from the code face</description>
<dc:creator>tag@wordaligned.org</dc:creator>
<language>en-gb</language>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.wordaligned.org/wordaligned" /><feedburner:info uri="wordaligned" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
<title>Bike charts by Google</title>
<description>&lt;p&gt;I&amp;#8217;ve liked the &lt;a href="http://code.google.com/apis/chart/"&gt;Google chart API&lt;/a&gt; ever since &lt;a href="http://wordaligned.org/articles/the-maximum-subsequence-problem" title="My first charts using the google API"&gt;I first discovered it&lt;/a&gt;. Pack a text definition of an image into a URL &lt;code&gt;http://chart.apis.google.com/chart?YOUR-IMAGE-HERE&lt;/code&gt; and you&amp;#8217;ll be served up a freshly cooked PNG. It&amp;#8217;s free. There&amp;#8217;s not even a watermark.
&lt;/p&gt;
&lt;img width="320px" height="160px" src="http://chart.apis.google.com/chart?chs=320x160&amp;amp;cht=gom&amp;amp;chd=t:70&amp;amp;chl=Nice!" alt="Swing-o-meter, Nice!"/&gt;

&lt;pre&gt;
http://chart.apis.google.com/chart?  # A chart, please
    &amp;chs=320x160                     # sized 320x160 pixels
    &amp;cht=gom                         # of type swin&lt;b&gt;gom&lt;/b&gt;eter
    &amp;chd=t:70                        # with 70% swing
    &amp;chl=Nice!                       # labeled "Nice!"
&lt;/pre&gt;
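&lt;p&gt;Assembling that query string by hand gets fiddly once labels need escaping. Here&amp;#8217;s a minimal sketch of one way to build the same URL in Python. The &lt;code&gt;chart_url&lt;/code&gt; helper is hypothetical, not part of the chart API, and it only builds the string; it makes no request.&lt;/p&gt;

```python
from urllib.parse import urlencode

BASE = "http://chart.apis.google.com/chart"

def chart_url(**params):
    """Return a chart URL for the given parameters (hypothetical helper)."""
    # Keep ':' and '!' readable; other reserved characters get percent-encoded.
    return BASE + "?" + urlencode(params, safe=":!")

# The swing-o-meter from the listing above: size, type, data, label.
url = chart_url(chs="320x160", cht="gom", chd="t:70", chl="Nice!")
print(url)
```

&lt;p&gt;Keyword arguments keep their order, so the parameters appear in the URL in the order they were written.&lt;/p&gt;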

&lt;p&gt;Gone are the days when the &lt;a href="http://code.google.com/apis/chart/docs/making_charts.html" title="main entry point to the google chart API docs"&gt;documentation&lt;/a&gt; fitted on a single web-page. The API has fattened up and filled out. Every time I visit something new has been added: &lt;a href="http://code.google.com/apis/chart/docs/gallery/formulas.html" title="or should that be formulas?"&gt;mathematical formulae&lt;/a&gt; written in TeX; a &lt;a href="http://code.google.com/apis/chart/docs/chart_playground.html" title="Live chart playground"&gt;playground&lt;/a&gt; where you can sketch a chart directly; a &lt;a href="http://code.google.com/intl/uk/apis/chart/docs/debugging.html"&gt;validation&lt;/a&gt; option which tells you where you went wrong &amp;#8212; much more helpful than a bare 404.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;New to me: &lt;a href="http://code.google.com/apis/chart/docs/gallery/dynamic_icons.html" title="dynamic icons - callouts, bubbles, pins, and other graphics"&gt;dynamic icons&lt;/a&gt;, which let you create &amp;#8220;a variety of interesting callouts, pins, or bubbles that mix text and images&amp;#8221;. 
&lt;/p&gt;
&lt;img src="http://chart.apis.google.com/chart?chst=d_fnote&amp;amp;chld=thought|2|993300|h|We+could+have+|fun+with+this!" alt="here's a thought..."/&gt;

&lt;img src="http://chart.apis.google.com/chart?chst=d_bubble_icon_texts_big&amp;amp;chld=bicycle|bb|ffff33|663300|Classic+Tour+Finishes|Let's+make+some+charts+which+depict|classic+stage+finishes+in+the+Tour+de+France" alt="classic cycle charts"/&gt;

&lt;img src="http://chart.apis.google.com/chart?chst=d_bubble_text_small&amp;amp;chld=bbbr|Good+idea,+go+for+it!|ffff33|663300" alt="Go for it!"/&gt;

&lt;p&gt;Mercurial Manxman Mark Cavendish won an incredible &lt;strong&gt;6 stages&lt;/strong&gt; of last year&amp;#8217;s Tour. Here he is, becoming the first Briton ever to win the final showdown on the Champs-&amp;Eacute;lys&amp;eacute;es, and winning it by an immense margin. For me, it was a bitter-sweet moment: that sprint should have put Cav in the green jersey, but he&amp;#8217;d thrown away his chance in the points competition earlier in the race with an &lt;a href="http://tag.wordaligned.org/posts/cav-wants-race"&gt;act of petulance&lt;/a&gt; which I still struggle to understand.
&lt;/p&gt;
&lt;img alt="Cavendish, first on the Champs-&amp;Eacute;lys&amp;eacute;es" src="http://chart.apis.google.com/chart?&amp;amp;cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:ZZZZZZZZZZZZZZZZZZZ&amp;amp;chco=aaaaaa&amp;amp;chm=B,0000ff,0,0:7,0|B,ffffff,0,6:12,0|B,ff0000,0,12:,0,&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=15|y;s=simple_text_icon_left;d=,14,000,helicopter,24,000,FFF;of=0,120;dp=11|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=10|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=9|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=range,3,9,.8|y;s=simple_text_icon_left;d=,14,000,civic-building,24,000,FFF;of=0,14;dp=1"/&gt;

&lt;p&gt;The Champs-&amp;Eacute;lys&amp;eacute;es may have a cobbled surface but it&amp;#8217;s level and straight &amp;#8212; definitely one for the sprinters. How about something twisted and mountainous? This second tableau recreates Fabian Cancellara&amp;#8217;s daredevil descent during stage 7 of last year&amp;#8217;s Tour. Defending the maillot jaune, Cancellara got dropped by the peloton following a wheel change. Watch him weave between team cars and camera bikes at top speed to regain his place. &lt;a href="http://www.youtube.com/watch?v=RxXqQqAc2pA" title="Watch Cancellara's descent on YouTube"&gt;Awesome!&lt;/a&gt;
&lt;/p&gt;
&lt;img alt="Cancellara descending" src="http://chart.apis.google.com/chart?cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:zyxwwvuttsrppponmmlkjihgfedcbaZYXVUUUTSRQONNMLKJIHHHGFEEEDCCCCCBBBBB&amp;amp;chco=aaaaaa&amp;amp;chm=B,ffff33,0,0,0&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=4,8;dp=11|y;s=simple_text_icon_left;d=,14,000,car-dealer,24,000,FFF;of=0,10;dp=range,15,24,4|y;s=simple_text_icon_left;d=,14,000,helicopter,24,000,FFF;of=0,80;dp=16|y;s=simple_text_icon_left;d=,14,000,motorcycle,24,000,FFF;of=0,6;dp=range,7,21,9|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=4,8;dp=range,45,60,2"/&gt;

&lt;p&gt;Now for a real classic &amp;#8212; when Stephen Roche dug deep during an epic mountain stage in the 1987 Tour. Pedro Delgado, wearing yellow, had built a substantial lead over his rival on the climb up La Plagne. Yet somehow Roche clawed his way back into contention, appearing at the finish line just 5 seconds down on Delgado. He surprised everyone. He collapsed, exhausted, and had to be given oxygen, but he&amp;#8217;d done enough. Roche went on to win the Tour. &lt;a href="http://www.youtube.com/watch?v=sQojh-wqL04" title="Roche at La Plagne, commentary by Phil Liggett"&gt;Formidable!&lt;/a&gt;
&lt;/p&gt;
&lt;img alt="IT'S STEPHEN ROCHE!" src="http://chart.apis.google.com/chart?cht=lc&amp;amp;chs=540x280&amp;amp;chls=4,3,0&amp;amp;chd=s:ACDEHIJKMOQSTUVXYabcdfghjkmnoppqqrssttuu&amp;amp;chco=aaaaaa&amp;amp;chm=B,ffff33,0,0,0&amp;amp;chem=y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,10;dp=29|y;s=simple_text_icon_left;d=,14,000,bicycle,24,000,FFF;of=0,12;dp=32|y;s=simple_text_icon_left;d=,14,000,wc-male,24,000,FFF;of=0,16;dp=35|y;s=simple_text_icon_left;d=,14,000,medical,24,000,FFF;of=0,16;dp=37|y;s=bubble_text_small;d=bbbr,that+looks+like+Stephen+Roche....+IT'S+STEPHEN+ROCHE!,ffff00,000000;of=40,230"/&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;</description>
<dc:date>2010-02-18</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/bike-charts</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/Bi2tz6kVghE/bike-charts</link>
<category>Google</category>
<category>Cycling</category>
<category>Charts</category>
<feedburner:origLink>http://wordaligned.org/articles/bike-charts</feedburner:origLink></item>

<item>
<title>When you comment on a comment</title>
<description>&lt;blockquote&gt;&lt;p&gt;&lt;a href="http://twitter.com/ianbicking/status/8891604954"&gt;@ianbicking&lt;/a&gt; these days, I very rarely bother reading anything where I cannot comment. &amp;#8212; &lt;a href="http://twitter.com/drjtwit/status/8898216561"&gt;@drjtwit&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;You&amp;#8217;ll notice there are no comments here. I hate discussing things via blog comments. If you&amp;#8217;d like to talk, drop me a line. If you&amp;#8217;d like to discuss things in public, post on your blog. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://codahale.com"&gt;Coda Hale&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;When you leave a comment on a comment, how often do you wonder what your rights are? Not too often, I&amp;#8217;d guess. Over the years, it has become an accepted fact that content contributed to a website simply belongs to that website. If the website, or blog for today&amp;#8217;s web, goes away then all of your contributions disappear along with it.
   A real world analogy would be sending in letters or artwork to a magazine. There&amp;#8217;s usually that disclaimer which says the publication can do whatever with your submission. And, of course, they can&amp;#8217;t return anything to you. It belongs to the magazine now.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Daniel Ha, &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;A commenter&amp;#8217;s rights&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;In a recent &lt;a href="http://dantwining.com/2010/01/30/using-twitter-ids-for-comments/"&gt;blog post Dan Twining&lt;/a&gt; writes about blog comments and asks what I think of &lt;a href="http://disqus.com"&gt;Disqus&lt;/a&gt;, the commenting service used here at &lt;a href="http://wordaligned.org/"&gt;Word Aligned&lt;/a&gt;. The question comes at an interesting time.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;&lt;a href="http://disqus.com"&gt;&lt;img src="http://wordaligned.org/images/disqus-comments.gif" alt="Disqus comments" style="float:right;"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;&lt;p&gt;Unlike Blogger, WordPress, TypePad, etc., Disqus isn&amp;#8217;t a blogging platform. Disqus do comments, and their central idea is integration. You don&amp;#8217;t need yet another online ID to leave a Disqus comment. Sign in using your OpenID or Yahoo! account, for example. You can have your comments tweeted or posted on Facebook. Disqus works with whatever blogging system you use and it works &lt;strong&gt;across&lt;/strong&gt; different systems: if a blog uses Disqus and you post a comment on that blog, then that comment remains yours &amp;#8212; the blog owner can&amp;#8217;t edit it, and other readers can click through to see comments you&amp;#8217;ve posted on other sites. Well, other comments posted using Disqus, that is. Clearly these connections extend as more sites adopt Disqus, and this seems to be happening.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;I used to use Haloscan for comments. I switched to Disqus about this time last year. I had no particular complaints with Haloscan; installation was simple, I didn&amp;#8217;t get any spam comments, and the service proved reliable enough. It just seemed Haloscan wasn&amp;#8217;t going anywhere. As it turns out, Haloscan will soon be gone. They&amp;#8217;re shutting down the service this week.
&lt;/p&gt;

 &lt;/li&gt;

 &lt;li&gt;&lt;p&gt;It so happens Dan&amp;#8217;s question comes when the action on this site is happening in the comments section. Which makes me think.
&lt;/p&gt;

 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Back to Daniel Ha, the Disqus CEO, who has clearly thought harder about comments than I ever will. Towards the top of this note is a quote from his article &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;&amp;#8220;A Commenter&amp;#8217;s Rights&amp;#8221;&lt;/a&gt;. Daniel Ha goes on to point out that times have changed, and that online publishers are able to involve their readers and include their input in more sophisticated ways. It&amp;#8217;s better for both commenters and publishers, he suggests, if commenters retain rights over their original material.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;So what are a commenter&amp;#8217;s rights? I&amp;#8217;m going to make an initial attempt to materialize what some rights should be.&lt;/p&gt;
&lt;ol style="list-style-type:lower-alpha"&gt;&lt;li&gt;The ability to edit and remove their comments&lt;/li&gt;
&lt;li&gt;Access to all of their comments, even if it has been deleted on a blog&lt;/li&gt;
&lt;li&gt;The right to use their own comments as blog posts. After all, a commenter is just a publisher not writing on his own website.&lt;/li&gt;
&lt;li&gt;A life for the comment beyond a single blog. I want to take my comments with me, even if the blog shuts down.&lt;/li&gt;&lt;/ol&gt;
&lt;p&gt;This may seem threatening to the publisher, but it really isn&amp;#8217;t. A commenter should have rights to what they post, but bloggers should still have control over content that appear on their blogs. Bloggers should still control:&lt;/p&gt;
&lt;ol style="list-style-type:lower-alpha"&gt;&lt;li&gt;Whether or not someone is allowed to comment on his blog&lt;/li&gt;
&lt;li&gt;The deletion of a comment&lt;/li&gt;
&lt;li&gt;The modification of a comment, as long as the original copy is still accessible and the edit is transparent&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&amp;mdash; Daniel Ha, &lt;a href="http://blog.disqus.net/2008/05/30/a-commenters-rights/"&gt;A commenter&amp;#8217;s rights&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;Why bother with site-hosted comments at all? If people want to comment they can do so on specially designed community sites like Proggit or Hacker News. This is the internet: we go where we please, we find what we like.
&lt;/p&gt;
&lt;p&gt;Well, I guess I included comments on this site because that&amp;#8217;s what other blogs did, and because I thought it was a way to engage readers and persuade them to return. The truth is, most readers arrive here via &lt;a href="http://www.reddit.com/domain/wordaligned.org"&gt;proggit&lt;/a&gt;; if I did away with comments here I might get more comments on reddit, and consequently more visits.
&lt;/p&gt;
&lt;p&gt;Look again at Daniel Ha&amp;#8217;s words:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;When you leave a comment on a comment &amp;#8230;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I&amp;#8217;m guessing this is a typo, and that he meant &amp;#8220;When you leave a comment on a &lt;strong&gt;website&lt;/strong&gt; &amp;#8230;&amp;#8221;. If so, it&amp;#8217;s an interesting slip. Modern comment systems are designed for comments on comments as much as for comments on the original article. Why bother with the original if the comments are more interesting? Jump straight into the discussion!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;[&amp;#8230;] It&amp;#8217;s not that I don&amp;#8217;t like sites with comments on, but when you read a site with comments it automatically puts you, the reader, in a defensive mode where you&amp;#8217;re saying, &amp;#8220;what&amp;#8217;s good in this comment thread? What can I skim?&amp;#8221;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;These are John Gruber&amp;#8217;s words but you won&amp;#8217;t find them on his &lt;a href="http://daringfireball.net"&gt;Daring Fireball&lt;/a&gt; website (the quotation has been &lt;a href="http://shawnblanc.net/2007/07/why-daring-fireball-is-comment-free/"&gt;transcribed&lt;/a&gt; by Shawn Blanc from an interview), and you won&amp;#8217;t find comments there either. Have a look at an article on Daring Fireball. &lt;a href="http://daringfireball.net/2010/02/winer_flash_open_standards"&gt;Here&amp;#8217;s a recent one about Adobe Flash&lt;/a&gt;. Now try to imagine how the article would look with comments.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve exported any Haloscan comments left on this site and imported them into Disqus using the &lt;a href="http://groups.google.com/group/disqus-dev"&gt;Disqus developer API&lt;/a&gt;. My thanks to Roberto Alsina for his &lt;a href="http://lateral.netmanagers.com.ar/weblog/posts/BB856.html" title="Migrating from Haloscan to Disqus - instructions"&gt;helpful pointers&lt;/a&gt; on how to do this. Links to comments will have broken, but everything else should be fine. I&amp;#8217;m sticking with comments and I&amp;#8217;m sticking with Disqus, which gets better all the time. Please &lt;a href="mailto:tag@wordaligned.org"&gt;let me know&lt;/a&gt; if you spot any problems.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ll &lt;a href="http://stevenf.com/pages/shutup/" title="A user style sheet which hides comments"&gt;shut up&lt;/a&gt; now.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;/div&gt;</description>
<dc:date>2010-02-10</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/comments-on-comments</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/102729l9Vuo/comments-on-comments</link>
<category>Self</category>
<category>Disqus</category>
<feedburner:origLink>http://wordaligned.org/articles/comments-on-comments</feedburner:origLink></item>

<item>
<title>Power programming</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocpowerful-or-dangerous" name="toc0" id="toc0"&gt;Powerful or dangerous?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocdecision-trees" name="toc1" id="toc1"&gt;Decision trees&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toccuteness-calculator" name="toc2" id="toc2"&gt;Cuteness calculator&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toceval" name="toc3" id="toc3"&gt;Eval&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocdynamic-or-hacky" name="toc4" id="toc4"&gt;Dynamic or hacky?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocjam-to-golf" name="toc5" id="toc5"&gt;Jam to golf&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toccode-vs-data" name="toc6" id="toc6"&gt;Code vs data&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocpowerful-language-vs-power-user" name="toc7" id="toc7"&gt;Powerful language vs power user?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-a-first-impressions-of-arc" name="toc8" id="toc8"&gt;Appendix A: First impressions of Arc&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-b-c-solution" name="toc9" id="toc9"&gt;Appendix B: C++ solution&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocappendix-c-a-python-solution" name="toc10" id="toc10"&gt;Appendix C: A Python Solution&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/power-programming#tocnotes" name="toc11" id="toc11"&gt;Notes&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc0" name="tocpowerful-or-dangerous" id="tocpowerful-or-dangerous"&gt;Powerful or dangerous?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Recently I &lt;a href="http://wordaligned.org/articles/next-permutation" title="Next permutation: when C++ gets it right"&gt;wrote about&lt;/a&gt; one of the &lt;a href="http://code.google.com/codejam/"&gt;Google Code Jam&lt;/a&gt; challenges, where, perhaps surprisingly, the best answer &amp;#8212; the most elegant and obviously correct answer, requiring the fewest lines of code, with virtually zero space overhead, and running the quickest &amp;#8212; the very best answer was coded in C++.
&lt;/p&gt;
&lt;p&gt;Why should this be surprising? C++ is a powerful language.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In my experience there is almost no limit to the damage that a sufficiently ingenious fool can do with C++. But there is also almost no limit to the degree of complexity that a skillful library designer can hide behind a simple, safe, and elegant C++ interface. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Greg Colvin, &lt;a href="http://www.artima.com/cppsource/spiritofc2.html" title="Greg Colvin, In the Spirit of C"&gt;&amp;#8220;In the Spirit of C&amp;#8221;&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Yes. And yes! But in this article I wanted to discuss something C++ &lt;strong&gt;can&amp;#8217;t&lt;/strong&gt; do. Let&amp;#8217;s start with another &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;challenge&lt;/a&gt; from the same round of the 2009 Google Code Jam.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc1" name="tocdecision-trees" id="tocdecision-trees"&gt;Decision trees&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;&lt;p&gt;Decision trees &amp;#8212; in particular, a type called classification trees &amp;#8212; are data structures that are used to classify &lt;i&gt;items&lt;/i&gt; into &lt;i&gt;categories&lt;/i&gt; using &lt;i&gt;features&lt;/i&gt; of those items. For example, each animal is either &amp;#8220;cute&amp;#8221; or not. For any given animal, we can decide whether it is cute by looking at the animal&amp;#8217;s features and using the following decision tree.&lt;/p&gt;
&lt;pre&gt;(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)
&lt;/pre&gt;&lt;p&gt;&amp;mdash; &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#"&gt;Decision Trees, Google Code Jam 2009&lt;/a&gt;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="http://www.zazzle.com/cute_beaver_magnet-147411069592023743"&gt;&lt;img src="http://wordaligned.org/images/cute-beaver.png" alt="Cute beaver!" width="227px" height="193px" style="float:right;margin:25px 25px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The challenge goes on to describe the structure more formally, then steps through an example calculation. What is the probability, &lt;code&gt;p&lt;/code&gt;, that a beaver is cute?
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;For example, a beaver is an animal that has two features: &lt;code&gt;furry&lt;/code&gt; and &lt;code&gt;freshwater&lt;/code&gt;. We start at the root with &lt;code&gt;p&lt;/code&gt; equal to &lt;code&gt;1&lt;/code&gt;. We multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.2&lt;/code&gt;, the weight of the root and move into the first sub-tree because the beaver has the &lt;code&gt;furry&lt;/code&gt; feature. There, we multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.81&lt;/code&gt;, which makes &lt;code&gt;p&lt;/code&gt; equal to &lt;code&gt;0.162&lt;/code&gt;. From there we move further down into the second sub-tree because the beaver does not have the fast feature. Finally, we multiply &lt;code&gt;p&lt;/code&gt; by &lt;code&gt;0.2&lt;/code&gt; and end up with &lt;code&gt;0.0324&lt;/code&gt; &amp;#8212; the probability that the beaver is cute. 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;img src="http://wordaligned.org/images/decision-tree.png" alt="Decision tree calculation"/&gt;

&lt;p&gt;The challenge itself involves processing input comprising a number of test cases. Each test case consists of a decision tree followed by a number of animals. A solution should parse the input and output the calculated cuteness probabilities.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc2" name="toccuteness-calculator" id="toccuteness-calculator"&gt;Cuteness calculator&lt;/a&gt;&lt;/h3&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def cuteness(decision_tree, features):
    """Return the probability an animal is cute.

    - decision_tree, the decision tree as a nested tuple
    - features, the animal's features
    """
    p = 1.0
    dt = decision_tree
    has_feature = features.__contains__
    while dt:
        # Split the node into its weight and whatever follows it.
        weight, *dt = dt
        p *= weight
        if dt:
            # Internal node: go left if the feature is present, else right.
            feat, lt, rt = dt
            dt = lt if has_feature(feat) else rt
    return p

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Calculating an animal&amp;#8217;s cuteness given a decision tree and the animal&amp;#8217;s features isn&amp;#8217;t hard. In Python we don&amp;#8217;t need to code up a specialised decision tree class &amp;#8212; a nested tuple does just fine. The &lt;code&gt;cuteness()&lt;/code&gt; function shown above descends the decision tree, switching left or right according to each feature&amp;#8217;s presence or absence. The running time of this algorithm is proportional to the depth of the tree multiplied by the length of the feature list, because each membership test scans the feature tuple; as far as the code jam challenge goes, it&amp;#8217;s not a concern.
&lt;/p&gt;
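&lt;p&gt;If the feature lists were long, one option would be to hash the features once so that each lookup takes constant time on average. Here&amp;#8217;s a minimal sketch, assuming the same nested-tuple representation; the name &lt;code&gt;cuteness_hashed&lt;/code&gt; is made up for this illustration.&lt;/p&gt;

```python
def cuteness_hashed(decision_tree, features):
    """Return the probability an animal is cute, using hashed lookups."""
    feature_set = frozenset(features)  # built once, then O(1) average lookups
    p = 1.0
    node = decision_tree
    while node:
        weight, *rest = node
        p *= weight
        if not rest:
            break  # leaf node: nothing but a weight
        feat, left, right = rest
        node = left if feat in feature_set else right
    return p

tree = (0.2, 'furry',
        (0.81, 'fast', (0.3,), (0.2,)),
        (0.1, 'fishy', (0.3, 'freshwater', (0.01,), (0.01,)), (0.1,)))
print(cuteness_hashed(tree, ('furry', 'freshwater')))
```

&lt;p&gt;The descent is unchanged; only the membership test differs.&lt;/p&gt;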
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; decision_tree = (
...     0.2, 'furry',
...         (0.81, 'fast',
...             (0.3,),
...             (0.2,),
...         ),
...         (0.1, 'fishy',
...             (0.3, 'freshwater',
...                  (0.01,),
...                  (0.01,),
...             ),
...             (0.1,),
...         ),
...     )
&amp;gt;&amp;gt;&amp;gt; beaver = ('furry', 'freshwater')
&amp;gt;&amp;gt;&amp;gt; cuteness(decision_tree, beaver)
0.032400000000000005

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;No, the real problem here is how to parse the input data to create the decision trees and feature sets. As you can see, though, the textual specification of a decision tree closely resembles a Python representation of that decision tree. Just add punctuation.
&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;td&gt;Specification&lt;/td&gt;&lt;td&gt;Python&lt;/td&gt;&lt;/tr&gt;&lt;/thead&gt;
&lt;tr&gt;&lt;td&gt;&lt;pre&gt;(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)&lt;/pre&gt;&lt;/td&gt;&lt;td&gt;&lt;pre&gt;(0.2, 'furry',
  (0.81, 'fast',
    (0.3,),
    (0.2,),
  ),
  (0.1, 'fishy',
    (0.3, 'freshwater',
      (0.01,),
      (0.01,),
    ),
    (0.1,),
  ),
)&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Rather than parse the decision tree definition by hand, why not tweak it so that it &lt;strong&gt;is&lt;/strong&gt; a valid Python nested tuple? Then we can just let the Python interpreter &lt;a href="http://docs.python.org/library/functions.html#eval"&gt;&lt;tt&gt;eval&lt;/tt&gt;&lt;/a&gt; the tuple and use it directly.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc3" name="toceval" id="toceval"&gt;Eval&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;A program&amp;#8217;s ability to read and execute source code at run-time is one of the things which makes &lt;a href="http://en.wikipedia.org/wiki/Dynamic_programming_language#Eval"&gt;dynamic languages&lt;/a&gt; dynamic. You can&amp;#8217;t do it in C and C++ &amp;#8212; no, sneaking instructions &lt;a href="http://en.wikipedia.org/wiki/Buffer_overrun"&gt;past the end of a buffer&lt;/a&gt; doesn&amp;#8217;t count. Should you do it in Python? Well, it won&amp;#8217;t hurt to give it a try.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;import re

spec = '''\
(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)
'''

tuple_rep = re.sub(r'([\.\d]+|\))', r'\1,', spec)
tuple_rep = re.sub(r'([a-z]+)', r'"\1",', tuple_rep)
decision_tree = eval(tuple_rep)[0]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we start with the input specification of the decision tree (imagine this has been read directly from standard input). The first regex substitution inserts commas after numbers and right parentheses. The second substitution quotes each feature string and inserts a comma after it. This turns the decision tree&amp;#8217;s specification into a textual representation of a nested Python tuple. We then &lt;code&gt;eval&lt;/code&gt; that tuple and assign the result to &lt;code&gt;decision_tree&lt;/code&gt; &amp;#8212; a Python decision tree we can go on and use in the rest of our program. And that&amp;#8217;s the code jam challenge cracked, pretty much.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from pprint import pprint
&amp;gt;&amp;gt;&amp;gt; pprint(decision_tree)
(0.2,
 'furry',
 (0.81, 'fast', (0.3,), (0.2,)),
 (0.1, 'fishy', (0.3, 'freshwater', (0.01,), (0.01,)), (0.1,)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Minor wrinkle: you&amp;#8217;ll have spotted the final decision tree is the first element of the evaluated tuple. That&amp;#8217;s because the regex substitution puts a trailing comma after the right parenthesis which closes the decision tree specification, which turns &lt;code&gt;tuple_rep&lt;/code&gt; into a one-tuple. The single element contained in this one-tuple is what we need.)
&lt;/p&gt;
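&lt;p&gt;To make the two substitutions concrete, here&amp;#8217;s a minimal sketch which applies them to a much smaller tree and prints each intermediate string. The one-line &lt;code&gt;spec&lt;/code&gt; and the names &lt;code&gt;step1&lt;/code&gt;, &lt;code&gt;step2&lt;/code&gt; and &lt;code&gt;evaluated&lt;/code&gt; are made up for illustration; the regular expressions are the ones used above.
&lt;/p&gt;

```python
import re

# A tiny, made-up specification in the same format as the one above.
spec = "(0.5 cool (1.0) (0.5))"

# Step 1: put a comma after every number and every right parenthesis.
step1 = re.sub(r'([\.\d]+|\))', r'\1,', spec)
# Step 2: quote each feature name and follow it with a comma.
step2 = re.sub(r'([a-z]+)', r'"\1",', step1)

print(step1)   # (0.5, cool (1.0,), (0.5,),),
print(step2)   # (0.5, "cool", (1.0,), (0.5,),),

# The trailing comma after the final parenthesis makes this a one-tuple.
evaluated = eval(step2)
print(len(evaluated))   # 1
print(evaluated[0])     # (0.5, 'cool', (1.0,), (0.5,))
```

&lt;p&gt;The commas inserted after the leaf weights are what turn &lt;code&gt;(1.0)&lt;/code&gt; into the one-tuple &lt;code&gt;(1.0,)&lt;/code&gt;, so every node, leaf or not, ends up as a tuple.
&lt;/p&gt;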

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc4" name="tocdynamic-or-hacky" id="tocdynamic-or-hacky"&gt;Dynamic or hacky?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As you can see, it doesn&amp;#8217;t take much code to pull the decision tree in ready for use. Python allows us to convert between text and code and to execute code within the current environment: you just need to keep a clear head and remember where you are. Regular expressions may not have the first-class language support they enjoy in Perl and Ruby, but they are well supported, and the raw string syntax makes them more readable.
&lt;/p&gt;
&lt;p&gt;Certainly, this code snippet is easier to put together than a full-blown &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=0" title="Google's analysis of the decision tree challenge, including a parser"&gt;parser&lt;/a&gt;, but I think it will take more than this to convince a C++ programmer that Python is a powerful language, rather than a dangerous tool for ingenious fools. It fails to convince me. I can&amp;#8217;t remember ever using &lt;code&gt;eval&lt;/code&gt; or &lt;code&gt;exec&lt;/code&gt; in production code, where keeping a separation between layers is more important than speed of coding.
&lt;/p&gt;
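&lt;p&gt;For comparison, a reader which avoids &lt;code&gt;eval&lt;/code&gt; need not be long either. What follows is a rough sketch, not the parser from Google&amp;#8217;s analysis: the function name &lt;code&gt;parse_tree&lt;/code&gt;, its helpers and its token pattern are assumptions made for this example, and it only handles the tree syntax shown in this article.
&lt;/p&gt;

```python
import re

def parse_tree(spec):
    """Turn a decision tree specification into nested tuples, without eval."""
    # Tokens are single parentheses, or runs of anything that is
    # neither whitespace nor a parenthesis (weights and feature names).
    tokens = re.findall(r'[()]|[^\s()]+', spec)
    pos = 0

    def expect(token):
        nonlocal pos
        if tokens[pos] != token:
            raise ValueError('expected %r at token %d' % (token, pos))
        pos += 1

    def node():
        nonlocal pos
        expect('(')
        weight = float(tokens[pos])
        pos += 1
        if tokens[pos] == ')':      # a leaf holds just a weight
            pos += 1
            return (weight,)
        feature = tokens[pos]       # otherwise: a feature and two subtrees
        pos += 1
        left, right = node(), node()
        expect(')')
        return (weight, feature, left, right)

    return node()

print(parse_tree('(0.5 cool (1.0) (0.5))'))   # (0.5, 'cool', (1.0,), (0.5,))
```

&lt;p&gt;It produces the same nested tuples as the regex-and-&lt;code&gt;eval&lt;/code&gt; version, at the cost of a couple of dozen lines rather than three.
&lt;/p&gt;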

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc5" name="tocjam-to-golf" id="tocjam-to-golf"&gt;Jam to golf&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://codegolf.com"&gt;&lt;img src="http://codegolf.com/images/logo.png" alt="Code Golf logo" width="332px" height="75px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;That said, Python is a fine language for scripting, and speed of coding &lt;strong&gt;is&lt;/strong&gt; what matters in this particular challenge. Just for fun, what if there were &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf" title="Decision tree code golf on Stack Overflow"&gt;a prize for brevity&lt;/a&gt;? Then of course Perl would &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf/1442392#1442392" title="gnibbler's winning Perl entry"&gt;win&lt;/a&gt;!
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Code Jam golf, by gnibbler, Stack Overflow&lt;/div&gt;

&lt;pre class="prettyprint"&gt;say("Case #$_:"),
$_=eval"''".'.&amp;lt;&amp;gt;'x&amp;lt;&amp;gt;,
s:[a-z]+:*(/ $&amp;amp;\\s/?:g,s/\)\s*\(/):/g,
eval"\$_=&amp;lt;&amp;gt;;say$_;"x&amp;lt;&amp;gt;for 1..&amp;lt;&amp;gt;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that this does more than simply parse a decision tree &amp;#8212; it&amp;#8217;s a complete solution to the code jam &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;challenge&lt;/a&gt;, reading trees, features, calculating cutenesses, and producing output in the required format. Sadly that&amp;#8217;s all I can say about it because the details of its operation are beyond me.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc6" name="toccode-vs-data" id="toccode-vs-data"&gt;Code vs data&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Dynamically executing code may not generally be needed or welcomed in Python production code, and over-reliance on the same trick risks reinforcing Perl&amp;#8217;s &amp;#8220;write only&amp;#8221; reputation, but Python and Perl aren&amp;#8217;t the only contenders. &lt;span /&gt;The equivalence of code and data marks Lisp&amp;#8217;s apotheosis. Take a look at a &lt;a href="http://stackoverflow.com/questions/1433263/decision-tree-code-golf/1540845#1540845" title="Arc solution to decision tree"&gt;Lisp solution&lt;/a&gt; to the challenge. This one is coded up in &lt;a href="http://arclanguage.org" title="Arc, a new dialect of Lisp"&gt;Arc&lt;/a&gt;.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
(def r () (read))
(for i 1 (r)
  (prn "Case #" i ":")
  (r)
  (= z (r))
  (repeat (r)
    (r)
    (loop (= g (n-of (r) (r))
             c z
             p 1)
       c
       (= p (* (pop c) p)
          c (if (pos (pop c) g)
                (c 0)
                (cadr c))))
    (prn p)))
&lt;/pre&gt;

&lt;p&gt;Which challenge does this solve? 
&lt;/p&gt;
&lt;p&gt;I meant the code golf challenge, of solving the decision tree problem using the fewest keystrokes. At 154 characters this Arc program is nearly half as long again as the winning Perl entry, but it&amp;#8217;s hardly flabby. What really impresses me, though, is that the code is (almost) as readable as it is succinct. It&amp;#8217;s elegant code. The only real scars left by the battle for brevity are the one-character variable names. Here&amp;#8217;s the same code with improved variable names and some comments added. It&amp;#8217;s the &lt;code&gt;(read)&lt;/code&gt; calls which evaluate expressions on standard input. The &lt;code&gt;(for ...)&lt;/code&gt; and &lt;code&gt;(repeat ...)&lt;/code&gt; expressions operate as you might expect. The third looping construct, &lt;code&gt;(loop ...)&lt;/code&gt;, initialises, tests and proceeds, much like a C for loop.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
(for i 1 (read)               ; Read N, # test cases, and loop
  (prn "Case #" i ":")
  
  (read)                      ; Skip L, # lines taken by decision tree
  (= dtree (read))            ; and read the tree in directly
  
  (repeat (read)              ; Repeat over A, # animals
    (read)                    ; Skip animal name
    ; Read in the animal's features and walk down the 
    ; decision tree calculating p, the cuteness probability
    (loop (= features (n-of (read) (read)) 
             dt dtree
             p 1)
       dt
       (= p (* (pop dt) p)
          dt (if (pos (pop dt) features)
                (car dt)
                (cadr dt))))
    (prn p)))
&lt;/pre&gt;
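&lt;p&gt;For readers more at home in Python, the inner loop above can be sketched as follows. This is a loose translation rather than a line-for-line port: the names &lt;code&gt;has_it&lt;/code&gt; and &lt;code&gt;lacks_it&lt;/code&gt; are invented here, and the tree is assumed to be the nested tuple built earlier in this article.
&lt;/p&gt;

```python
import math

def cuteness(dtree, features):
    """Walk down the tree, multiplying the weights met along the way."""
    p = 1.0
    node = dtree
    while node:
        p *= node[0]
        if len(node) == 1:
            node = None     # a leaf: nothing further to descend into
        else:
            _, feature, has_it, lacks_it = node
            node = has_it if feature in features else lacks_it
    return p

dtree = (0.2, 'furry',
         (0.81, 'fast', (0.3,), (0.2,)),
         (0.1, 'fishy', (0.3, 'freshwater', (0.01,), (0.01,)), (0.1,)))

# A furry animal which isn't fast: 0.2 * 0.81 * 0.2
p = cuteness(dtree, {'furry', 'freshwater'})
print(math.isclose(p, 0.2 * 0.81 * 0.2))   # True
```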

&lt;p&gt;You could argue the elegance of this solution is due to the fact the input comprises a sequence of tokens and &lt;a href="http://en.wikipedia.org/wiki/S-expression" title="S-expressions, Wikipedia"&gt;S-expressions&lt;/a&gt;. If commas had been used to separate input elements and the text fields had been enclosed in quotes, then maybe a Python solution would have been equally clean. Or if the input had been in XML, then we&amp;#8217;d be looking to a library rather than &lt;code&gt;eval&lt;/code&gt; for parsing the input.
&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s a fair point, but the equivalence of code and data counts as Lisp&amp;#8217;s biggest idea. Where Python&amp;#8217;s &lt;code&gt;eval&lt;/code&gt; is workable but rarely needed, Lisp&amp;#8217;s &lt;code&gt;(read)&lt;/code&gt; is fundamental.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc7" name="tocpowerful-language-vs-power-user" id="tocpowerful-language-vs-power-user"&gt;Powerful language vs power user?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;So, the most elegant answer to the code jam decision tree challenge would also be the quickest to write, and it would be written in Lisp. Did code jam champion, &lt;a href="http://www.go-hero.net/jam/09/name/ACRush" title="ACRush's code jam solutions"&gt;ACRush&lt;/a&gt;, submit a Lisp solution?
&lt;/p&gt;
&lt;p&gt;Absolutely not!
&lt;/p&gt;
&lt;p&gt;Another fundamental thing about Lisp is that it&amp;#8217;s straightforward to parse. A C++ expert can knock up an input parser for decision trees and features to order. ACRush brushed this round aside with a perfect score, taking just 45 minutes to code up working C++ solutions to this question &lt;strong&gt;and two others&lt;/strong&gt;. I&amp;#8217;ve reproduced his solution to the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#" title="Decision Tree, Google Code Jam 2009"&gt;decision tree challenge&lt;/a&gt; at the end of this article. It&amp;#8217;s plain and direct. Given the time constraints, I think it exhibits astonishing fluency &amp;#8212; the work of someone who can think in C++.
&lt;/p&gt;
&lt;p&gt;In this article we&amp;#8217;ve encountered four programming languages:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     Python
 &lt;/li&gt;

 &lt;li&gt;
     Perl
 &lt;/li&gt;

 &lt;li&gt;
     Lisp
 &lt;/li&gt;

 &lt;li&gt;
     C++
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These languages are very different but they share features too. They are all mature, popular and well-supported&lt;a id="fn1link" href="http://wordaligned.org/articles/power-programming#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. Each is a powerful general purpose programming language. &lt;span /&gt;But ultimately, the power of the programmer is what matters.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc8" name="tocappendix-a-first-impressions-of-arc" id="tocappendix-a-first-impressions-of-arc"&gt;Appendix A: First impressions of Arc&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s another revision of the Arc solution, this time decomposed into subfunctions. I found no complete formal documentation of &lt;a href="http://arclanguage.org" title="Arc, a new dialect of Lisp"&gt;Arc&lt;/a&gt;. You&amp;#8217;ll have to read the source and follow the forum, and to actually run any code you&amp;#8217;ll have to download an old version of MzScheme. The official line is: by all means have a play, but expect things to change. That said, the language looks delightful, practical, and quite &lt;a href="http://www.paulgraham.com/arcll1.html" title="No onions in the varnish, says Paul Graham"&gt;onion free&lt;/a&gt;. The &lt;a href="http://ycombinator.com/arc/tut.txt"&gt;tutorial&lt;/a&gt; made me smile. Recommended reading.
&lt;/p&gt;
&lt;pre class="prettyprint lang-lisp"&gt;
; The input is a sequence of valid Arc expressions.
; Create some read aliases to execute these.
(= skip read
   decision-tree read
   n-features read 
   n-tests read
   n-animals read)

(def animal-features ()
     ; Get an animal's features
     (skip) ; animal name
     (n-of (n-features) (read)))

(def cuteness (dtree features)
     ; Calculate cuteness from a decision tree and feature set
     (= dt dtree
        p 1.0)
     (while dt
          (= p (* (pop dt) p)
             dt (if (pos (pop dt) features)
                (car dt)
                (cadr dt))))
     p)

; Loop through the tests, printing results
(for i 1 (n-tests)
     (prn "Case #" i ":")
     (skip) ; # lines the tree specification takes
     (= dtree (decision-tree))
     (repeat 
         (n-animals)
         (prn (cuteness dtree (animal-features)))))
&lt;/pre&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc9" name="tocappendix-b-c-solution" id="tocappendix-b-c-solution"&gt;Appendix B: C++ solution&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s champion ACRush&amp;#8217;s C++ solution. I&amp;#8217;ve removed some general purpose macros from the top of the file. You can download the &lt;a href="http://code.google.com/codejam/contest/scoreboard/do?cmd=GetSourceCode&amp;amp;contest=186264&amp;amp;problem=171116&amp;amp;io_set_id=1&amp;amp;username=ACRush"&gt;original here&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;set&amp;gt;
#include &amp;lt;string&amp;gt;
#include &amp;lt;vector&amp;gt;
#include &amp;lt;sstream&amp;gt;
#include &amp;lt;cstdio&amp;gt;
#include &amp;lt;cstdlib&amp;gt;

using namespace std;

vector&amp;lt;string&amp;gt; A;
vector&amp;lt;int&amp;gt; P;
set&amp;lt;string&amp;gt; M;

#define SIZE(X) ((int)(X.size()))

double solve(int H,int T)
{
    H++;T--;
    double p=atof(A[H].c_str());
    if (H==T) return p;
    if (M.find(A[H+1])!=M.end())
        return p*solve(H+2,P[H+2]);
    else
        return p*solve(P[T],T);
}
int main()
{
    freopen("A-large.in","r",stdin);freopen("A-large.out","w",stdout);
    int testcase;
    scanf("%d",&amp;amp;testcase);
    for (int caseId=1;caseId&amp;lt;=testcase;caseId++)
    {
        int nline;
        scanf("%d",&amp;amp;nline);
        A.clear();
        char str[1024];
        gets(str);
        for (int i=0;i&amp;lt;nline;i++)
        {
            gets(str);
            string s="";
            for (int k=0;str[k];k++)
                if (str[k]=='(' || str[k]==')')
                    s+=" "+string(1,str[k])+" ";
                else
                    s+=str[k];
            istringstream sin(s);
            for (;sin&amp;gt;&amp;gt;s;A.push_back(s));
        }
        P.resize(SIZE(A),-1);
        vector&amp;lt;int&amp;gt; stack;
        for (int i=0;i&amp;lt;SIZE(A);i++)
            if (A[i]=="(")
                stack.push_back(i);
            else if (A[i]==")")
            {
                int p=stack[SIZE(stack)-1];
                P[i]=p;
                P[p]=i;
                stack.pop_back();
            }
        int cnt;
        printf("Case #%d:\n",caseId);
        for (scanf("%d",&amp;amp;cnt);cnt&amp;gt;0;cnt--)
        {
            scanf("%s",str);
            M.clear();
            int length;
            for (scanf("%d",&amp;amp;length);length&amp;gt;0;length--)
            {
                scanf("%s",str);
                M.insert(str);
            }
            double r=solve(0,SIZE(A)-1);
            printf("%.12lf\n",r);
        }
    }
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc10" name="tocappendix-c-a-python-solution" id="tocappendix-c-a-python-solution"&gt;Appendix C: A Python Solution&lt;/a&gt;&lt;/h3&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;import re
from itertools import islice

def cuteness(decision_tree, features):
    p = decision_tree[0]
    if len(decision_tree) &amp;gt; 1:
        _, feat, lt, rt = decision_tree
        p *= cuteness(lt if feat in features else rt, features)
    return p

def read_decision_tree(spec):
    tuple_rep = re.sub(r'([\.\d]+|\))', r'\1,', spec)
    tuple_rep = re.sub(r'([a-z]+)', r'"\1",', tuple_rep)
    return eval(tuple_rep)[0]

def take_lines(lines, n):
    return ''.join(islice(lines, n))

def main(fp):
    lines = iter(fp)
    n_tests = int(next(lines))
    for tc in range(1, n_tests + 1):
        print("Case #%d:" % tc)
        tree_spec = take_lines(lines, int(next(lines)))
        dtree = read_decision_tree(tree_spec)
        n_animals = int(next(lines))
        for line in islice(lines, n_animals):
            features = set(line.split()[2:])
            print(cuteness(dtree, features))

import sys
main(sys.stdin)

&lt;/pre&gt;

&lt;/div&gt;

&lt;hr /&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/power-programming#toc11" name="tocnotes" id="tocnotes"&gt;Notes&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/power-programming#fn1link"&gt;[1]&lt;/a&gt; (Arc may not be mature, popular or well-supported; but Lisp certainly is.)
&lt;/p&gt;</description>
<dc:date>2010-01-26</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/power-programming</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/JnMrlsjIGO4/power-programming</link>
<category>Python</category>
<category>Perl</category>
<category>Lisp</category>
<category>Arc</category>
<feedburner:origLink>http://wordaligned.org/articles/power-programming</feedburner:origLink></item>

<item>
<title>Python, Surprise me!</title>
<description>&lt;h3&gt;A Simple Function&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a simple function which converts the third item of a list into an integer and returns it, returning -1 if the list has fewer than three entries or if the third entry fails to convert.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    '''Convert the third item of xs into an int and return it.
        
    Returns -1 on failure.
    '''    
    try:
        return int(xs[2])
    except IndexError, ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Unfortunately this simple function is simply wrong. Evidently some exceptions aren&amp;#8217;t being caught.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; third_int([1, 2, 3, 4])
3
&amp;gt;&amp;gt;&amp;gt; third_int([1])
-1
&amp;gt;&amp;gt;&amp;gt; third_int(('1', '2', '3', '4',))
3
&amp;gt;&amp;gt;&amp;gt; third_int(['one', 'two', 'three', 'four'])
Traceback (most recent call last):
    ....
ValueError: invalid literal for int() with base 10: 'three'

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;How ever did a &lt;code&gt;ValueError&lt;/code&gt; sneak past the &lt;code&gt;except&lt;/code&gt; clause?
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;The Real Surprise&lt;/h3&gt;
&lt;p&gt;There&amp;#8217;s nothing mysterious or surprising going on here, but I&amp;#8217;ll delay answering this question for a moment. For me, the real surprise about Python is that, generally, I get it right first time. Python similarly &lt;a href="http://www.python.org/about/success/esr" title="Why Python? by Eric S. Raymond"&gt;caught Eric S. Raymond by surprise&lt;/a&gt;. His first surprise was that it took him just 20 minutes to get used to syntactically significant whitespace. And just 100 minutes later &amp;#8230;
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;My second [surprise] came a couple of hours into the project, when I noticed (allowing for pauses needed to look up new features in &lt;em&gt;Programming Python&lt;/em&gt;) I was generating working code nearly as fast as I could type. When I realized this, I was quite startled. An important measure of effort in coding is the frequency with which you write something that doesn&amp;#8217;t actually match your mental representation of the problem, and have to backtrack on realizing that what you just typed won&amp;#8217;t actually tell the language to do what you&amp;#8217;re thinking. An important measure of good language design is how rapidly the percentage of missteps of this kind falls as you gain experience with the language.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Eric S. Raymond, &lt;a href="http://www.python.org/about/success/esr" title="Why Python? by Eric S. Raymond"&gt;Why Python?&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I certainly don&amp;#8217;t generate working code as fast as I can type, and I&amp;#8217;m not even a particularly &lt;a href="http://steve-yegge.blogspot.com/2008/09/programmings-dirtiest-little-secret.html" title="Learn to type, Yegge says"&gt;quick typist&lt;/a&gt;, but I rarely make syntactic errors when writing Python &amp;#8212; and I don&amp;#8217;t often need to consult the documentation on such matters. As Chuck Allison memorably puts it: &lt;a href="http://www.artima.com/cppsource/simple.html"&gt;&amp;#8220;the syntax is so clean it squeaks&amp;#8221;&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Parentheses Required(?)&lt;/h3&gt;
&lt;p&gt;There are some oddities and gotchas though. I don&amp;#8217;t object to the &lt;a href="http://effbot.org/pyfaq/why-must-self-be-used-explicitly-in-method-definitions-and-calls.htm"&gt;explicit &lt;code&gt;self&lt;/code&gt;&lt;/a&gt; in methods, but I do sometimes forget to write it &amp;#8212; especially if I&amp;#8217;ve just switched over from C++. 
&lt;/p&gt;
&lt;p&gt;A side-effect of the whitespace thing is that you can&amp;#8217;t just wrap a long line. The &lt;a href="http://docs.python.org/reference/lexical_analysis.html#explicit-line-joining"&gt;line ending&lt;/a&gt; needs to be escaped.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;if 1900 &amp;lt; year &amp;lt; 2100 and 1 &amp;lt;= month &amp;lt;= 12 \
    and 1 &amp;lt;= day &amp;lt;= 31 and 0 &amp;lt;= hour &amp;lt; 24 \
    and 0 &amp;lt;= minute &amp;lt; 60 and 0 &amp;lt;= second &amp;lt; 60: # Looks like a valid date
    return 1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Alternatively, parenthesize.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;if (1900 &amp;lt; year &amp;lt; 2100 and 1 &amp;lt;= month &amp;lt;= 12
    and 1 &amp;lt;= day &amp;lt;= 31 and 0 &amp;lt;= hour &amp;lt; 24
    and 0 &amp;lt;= minute &amp;lt; 60 and 0 &amp;lt;= second &amp;lt; 60): # Looks like a valid date
    return 1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In the above, the parentheses aren&amp;#8217;t required to group terms, but instead serve to implicitly continue the line of code past a couple of newline characters.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Tree_%28data_structure%29"&gt;&lt;img width="360px" src="http://upload.wikimedia.org/wikipedia/commons/thumb/f/f7/Binary_tree.svg/500px-Binary_tree.svg.png" alt="Wikipedia Tree"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Parentheses serve more than one role in Python&amp;#8217;s syntax. As in all C-family languages, they can group expressions. They also get involved building tuples, &lt;code&gt;(1, 2, 3)&lt;/code&gt; or &lt;code&gt;('red', 0xff0000)&lt;/code&gt; for example. Beware the special case: a one-tuple needs a trailing comma, &lt;code&gt;("singleton",)&lt;/code&gt;. This isn&amp;#8217;t something I forget or accidentally omit, but it can make things fiddly. Here&amp;#8217;s a tuple-tised &lt;a href="http://en.wikipedia.org/wiki/Tree_%28data_structure%29"&gt;tree&lt;/a&gt;, where we represent a tree as a tuple whose first element is a node value, and any subsequent elements are sub-trees. Careful with those commas!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = (2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,))))

&lt;/pre&gt;

&lt;/div&gt;
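&lt;p&gt;A short sketch shows why the commas matter. The generator &lt;code&gt;values&lt;/code&gt; is a hypothetical helper, written for this example only, which visits a tuple-tised tree node by node; it relies on every node, leaves included, being a tuple.
&lt;/p&gt;

```python
tree = (2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,))))

def values(node):
    """Yield the node values of a tuple tree, parents before children."""
    value, *subtrees = node
    yield value
    for subtree in subtrees:
        yield from values(subtree)

print(list(values(tree)))   # [2, 7, 2, 6, 5, 11, 5, 9, 4]

# Drop the trailing comma and a leaf stops being a tuple at all.
print(type((4)).__name__, type((4,)).__name__)   # int tuple
```

&lt;p&gt;Had a leaf been written &lt;code&gt;(4)&lt;/code&gt; rather than &lt;code&gt;(4,)&lt;/code&gt;, the unpacking in &lt;code&gt;values&lt;/code&gt; would fail with a &lt;code&gt;TypeError&lt;/code&gt;, since an &lt;code&gt;int&lt;/code&gt; cannot be unpacked.
&lt;/p&gt;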

&lt;p&gt;Actually, tuples are just comma-separated lists of expressions &amp;#8212; no parentheses required &amp;#8212; so we might equally well have written:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = 2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the superfluous outermost parentheses have been omitted; the inner ones are still required for grouping.
&lt;/p&gt;
&lt;p&gt;How about we always append a trailing comma to our tuples so the one-tuple no longer looks different?
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;tree = 2, (7, (2,), (6, (5,), (11,))), (5, (9, (4,))),

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That&amp;#8217;s allowed and fine. Unless we need an empty tuple, that is, in which case the parentheses &lt;strong&gt;are&lt;/strong&gt; required. And a comma would be wrong.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; ()
()
&amp;gt;&amp;gt;&amp;gt; (),
((),)
&amp;gt;&amp;gt;&amp;gt; ,
   ....
SyntaxError: invalid syntax
&amp;gt;&amp;gt;&amp;gt; (,)
   ....
SyntaxError: invalid syntax
&amp;gt;&amp;gt;&amp;gt; tuple()
()

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Python 3 introduces a nice new syntax for &lt;code&gt;set&lt;/code&gt; literals, reusing the braces which traditionally enclose &lt;code&gt;dict&lt;/code&gt;s.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; ls = { 1, 11, 21, 1211, 111221, 312211 }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Again, beware the edge case: &lt;code&gt;{}&lt;/code&gt; is an empty &lt;code&gt;dict&lt;/code&gt;, not an empty set.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; zs = {}
&amp;gt;&amp;gt;&amp;gt; type(zs)
&amp;lt;class 'dict'&amp;gt;
&amp;gt;&amp;gt;&amp;gt; zs = set()
&amp;gt;&amp;gt;&amp;gt; type(zs)
&amp;lt;class 'set'&amp;gt;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Python 3 allows non-ASCII characters in identifiers, but not any old character, so we &lt;strong&gt;cannot&lt;/strong&gt; get away with
&lt;/p&gt;
&lt;pre&gt;
&gt;&gt;&gt; &amp;empty; = set()
      ^
SyntaxError: invalid character in identifier
&lt;/pre&gt;

&lt;p&gt;Parentheses are used for function calls too, and also for generator expressions. Here&amp;#8217;s a lazy list of squares of numbers less than a million.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sqs = (x * x for x in range(1000000))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s the sum of these numbers.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sum((x * x for x in range(1000000)))
333332833333500000

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Actually, we can omit the generator-expression parentheses in the sum. The function call parentheses magically turn the enclosed &lt;code&gt;x * x for x in range(1000000)&lt;/code&gt; into a generator expression. As usual, Python does what we want.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; sum(x * x for x in range(1000000))
333332833333500000

&lt;/pre&gt;

&lt;/div&gt;
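&lt;p&gt;One caveat, sketched below with a smaller range: the shortcut only applies when the generator expression is the function&amp;#8217;s sole argument. Pass a second argument and the parentheses are required again. The variable names here are illustrative, and &lt;code&gt;compile&lt;/code&gt; is used simply to show the syntax error without stopping the script.
&lt;/p&gt;

```python
# Sole argument: the call parentheses are enough.
squares_total = sum(x * x for x in range(10))

# With a start value as second argument, the generator expression
# needs its own parentheses.
offset_total = sum((x * x for x in range(10)), 100)

# Leaving them out is rejected before the code ever runs.
try:
    compile("sum(x * x for x in range(10), 100)", "demo", "eval")
    needs_parens = False
except SyntaxError:
    needs_parens = True

print(squares_total, offset_total, needs_parens)   # 285 385 True
```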


&lt;h3&gt;Serious about Syntax&lt;/h3&gt;
&lt;p&gt;If you&amp;#8217;ve read this far you may well be thinking: &amp;#8220;So what?&amp;#8221; I haven&amp;#8217;t shown any gotchas, merely a few quirks and corner cases. As already mentioned, the real surprise is that Python fails to surprise. Part of this, as I hope I&amp;#8217;ve shown here, can be attributed to the interpreter, which positively invites you to experiment; but mainly &lt;span /&gt;Python&amp;#8217;s clean and transparent design takes the credit. Repeating Eric S. Raymond: you don&amp;#8217;t have to &amp;#8220;actually tell the language to do what you&amp;#8217;re thinking&amp;#8221;.
&lt;/p&gt;
&lt;p&gt;Since I first started using Python the syntax has grown considerably, yet the extensions and additions seem almost as if they&amp;#8217;d been planned from the start&lt;a id="fn1link" href="http://wordaligned.org/articles/python-surprise-me#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. Generator expressions complement list comprehensions. The yield statement fits nicely with iteration.
&lt;/p&gt;
&lt;p&gt;Even more remarkably, Python 3 has chosen to break backwards compatibility, so it can undo those few early choices which now seem mistakes. Which brings us back to the broken function at the top of this article. Here it is again, docstring omitted for brevity.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except IndexError, ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I really did write a function like this, and I really did get it wrong in just this way. The code is syntactically correct, but I should have written
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except (IndexError, ValueError):
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The parentheses in the &lt;code&gt;except&lt;/code&gt; clause are crucial. The formal syntax of this form of &lt;a href="http://docs.python.org/reference/compound_stmts.html#the-try-statement"&gt;try statement&lt;/a&gt; is
&lt;/p&gt;
&lt;pre&gt;
try1_stmt ::=  "try" ":" suite
               ("except" [expression [("as" | ",") target]] ":" suite)+
               ["else" ":" suite]
               ["finally" ":" suite]
&lt;/pre&gt;

&lt;p&gt;In the corrected version of &lt;code&gt;third_int()&lt;/code&gt;, the parentheses group &lt;code&gt;IndexError, ValueError&lt;/code&gt; into a single expression, a tuple, and the except clause matches any object with class (or base class) &lt;code&gt;IndexError&lt;/code&gt; or &lt;code&gt;ValueError&lt;/code&gt;. The broken version is very different, as becomes clear if we use the alternative &lt;code&gt;"as"&lt;/code&gt; form.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def third_int(xs):
    try:
        return int(xs[2])
    except IndexError as ValueError:
        return -1

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, the except clause will match an object with class or base class &lt;code&gt;IndexError&lt;/code&gt;, and assigns that object to the target, which is called &lt;code&gt;ValueError&lt;/code&gt; (and which shadows the &amp;#8220;real&amp;#8221; ValueError in the rest of the function definition). If &lt;code&gt;int()&lt;/code&gt; raises a &lt;code&gt;ValueError&lt;/code&gt;, it will not be matched.
&lt;/p&gt;
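&lt;p&gt;The difference is easy to check in Python 3, where only the &lt;code&gt;as&lt;/code&gt; spelling of the broken version is accepted. In this sketch the name &lt;code&gt;broken_third_int&lt;/code&gt; and the variables &lt;code&gt;words&lt;/code&gt; and &lt;code&gt;escaped&lt;/code&gt; are made up for the demonstration.
&lt;/p&gt;

```python
def third_int(xs):
    try:
        return int(xs[2])
    except (IndexError, ValueError):    # a tuple: matches either class
        return -1

def broken_third_int(xs):
    try:
        return int(xs[2])
    except IndexError as ValueError:    # matches IndexError only, and binds
        return -1                       # the caught object to a local name

words = ['one', 'two', 'three', 'four']

print(third_int(words))         # -1
print(broken_third_int([1]))    # -1, the IndexError is still caught

try:
    broken_third_int(words)
    escaped = False
except ValueError:
    escaped = True
print(escaped)                  # True, the ValueError got past the except clause
```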

&lt;h3&gt;Won&amp;#8217;t Get Fooled Again&lt;/h3&gt;
&lt;p&gt;Oh, I get it, now. It &lt;strong&gt;is&lt;/strong&gt; a bit subtle, but I won&amp;#8217;t make that mistake again.
&lt;/p&gt;
&lt;p&gt;Wait, there&amp;#8217;s more! In Python 3k, my broken implementation is properly broken &amp;#8212; a syntax error.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;Python 3.1
&amp;gt;&amp;gt;&amp;gt; def third_int(xs):
...     try:
...         return int(xs[2])
...     except IndexError, ValueError:
  File "&amp;lt;stdin&amp;gt;", line 4
    except IndexError, ValueError:
                     ^
SyntaxError: invalid syntax

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The Python 3k syntax of this form of &lt;a href="http://docs.python.org/py3k/reference/compound_stmts.html#the-try-statement"&gt;try statement&lt;/a&gt; reads:
&lt;/p&gt;
&lt;pre&gt;
try1_stmt ::=  "try" ":" suite
               ("except" [expression ["as" target]] ":" suite)+
               ["else" ":" suite]
               ["finally" ":" suite]
&lt;/pre&gt;

&lt;p&gt;You can&amp;#8217;t use a comma to capture the target any more. It&amp;#8217;s an advance and a simplification. Why am I not surprised?
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/python-surprise-me#fn1link"&gt;[1]&lt;/a&gt;: With the possible exception of &lt;a href="http://docs.python.org/reference/expressions.html#boolean-operations"&gt;conditional expressions&lt;/a&gt;, that is.
&lt;/p&gt;</description>
<dc:date>2009-12-15</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/python-surprise-me</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/hC5iZAu8OII/python-surprise-me</link>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/python-surprise-me</feedburner:origLink></item>

<item>
<title>Next permutation: When C++ gets it right</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocthe-next-number-problem" name="toc0" id="toc0"&gt;The Next Number Problem&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocchoice-of-algorithm" name="toc1" id="toc1"&gt;Choice of Algorithm&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toclexicographical-ordering" name="toc2" id="toc2"&gt;Lexicographical Ordering&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocnext-permutation-in-action" name="toc3" id="toc3"&gt;Next permutation in action&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocsnail-sorts-revenge" name="toc4" id="toc4"&gt;Snail sort&amp;#8217;s revenge&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocthe-next-number-solved" name="toc5" id="toc5"&gt;The Next Number, Solved&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocimplementation" name="toc6" id="toc6"&gt;Implementation&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocwhats-happening-here" name="toc7" id="toc7"&gt;What&amp;#8217;s happening here?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocbeautiful-c" name="toc8" id="toc8"&gt;Beautiful C++?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#tocpermutations-in-python" name="toc9" id="toc9"&gt;Permutations in Python&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc0" name="tocthe-next-number-problem" id="tocthe-next-number-problem"&gt;The Next Number Problem&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Suppose you have a fixed list of digits chosen from the range 1..9. What numbers can you make with them? You&amp;#8217;re allowed as many zeros as you want. Write the numbers in increasing order.
&lt;/p&gt;
&lt;p&gt;Exactly &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;this puzzle&lt;/a&gt; came up in the recent &lt;a href="http://code.google.com/codejam"&gt;Google Code Jam&lt;/a&gt; programming contest:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;You are writing out a list of numbers. Your list contains all numbers with exactly &lt;strong&gt;D&lt;sub&gt;i&lt;/sub&gt;&lt;/strong&gt; digits in its decimal representation which are equal to i, for each i between 1 and 9, inclusive. You are writing them out in ascending order.&lt;/p&gt;&lt;p&gt;For example, you might be writing every number with two &amp;#8216;1&amp;#8217;s and one &amp;#8216;5&amp;#8217;. Your list would begin 115, 151, 511, 1015, 1051.&lt;/p&gt;&lt;p&gt;Given N, the last number you wrote, compute what the next number in the list will be.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The competition has closed now, but if you&amp;#8217;d like to give it a go sample input files can be found on the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;website&lt;/a&gt;, where you can also upload your results and have them checked.
&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s a short section from a trial I ran on my computer. Input numbers are in the left-hand column: the corresponding output numbers are in the right-hand column.
&lt;/p&gt;
&lt;pre style="font-size:150%"&gt;
50110812884911623516 &amp;rarr; 50110812884911623561
82454322474161687049 &amp;rarr; 82454322474161687094
82040229261723155710 &amp;rarr; 82040229261723157015
43888989554234187388 &amp;rarr; 43888989554234187838
76080994872481480636 &amp;rarr; 76080994872481480663
31000989133449480678 &amp;rarr; 31000989133449480687
20347716554681051891 &amp;rarr; 20347716554681051918
&lt;/pre&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc1" name="tocchoice-of-algorithm" id="tocchoice-of-algorithm"&gt;Choice of Algorithm&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As with many of the code jam challenges, you&amp;#8217;ll need to write a program which runs fast enough; but choosing the right algorithm is more important than choosing the right language. Typically a high-level interpreted language like Python allows me to code and test a solution far more quickly than a low-level language like C or C++.
&lt;/p&gt;
&lt;p&gt;In this particular case, though, like most &lt;a href="http://www.go-hero.net/jam/09/problems/2/2"&gt;successful candidates&lt;/a&gt;, I used C++. &lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;Here&amp;#8217;s why&lt;/a&gt;.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;blockquote&gt;&lt;p&gt;&lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;&lt;code&gt;Next_permutation&lt;/code&gt;&lt;/a&gt; transforms the range of elements &lt;code&gt;[first, last)&lt;/code&gt; into the lexicographically next greater permutation of the elements. [&amp;#8230;] If such a permutation exists, &lt;code&gt;next_permutation&lt;/code&gt; transforms &lt;code&gt;[first, last)&lt;/code&gt; into that permutation and returns true. Otherwise it transforms &lt;code&gt;[first, last)&lt;/code&gt; into the lexicographically smallest permutation and returns &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Although the next number problem appears to be about numbers and lexicographical ordering appears to be about words, &lt;code&gt;std::next_permutation&lt;/code&gt; is exactly what&amp;#8217;s needed here.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc2" name="toclexicographical-ordering" id="toclexicographical-ordering"&gt;Lexicographical Ordering&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/4099819327/" title="Lexicographical order by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2449/4099819327_4063635302.jpg" width="500" height="216" alt="Lexicographical order" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;A dictionary provides the canonical example of lexicographical ordering. Words are built from characters, which can be alphabetically ordered A, B, C, &amp;#8230; , so in the dictionary words which begin with &lt;strong&gt;A&lt;/strong&gt; appear before words which begin with &lt;strong&gt;B&lt;/strong&gt;, which themselves come in front of words beginning with &lt;strong&gt;C&lt;/strong&gt;, etc. If two words start with the same letter, pop that letter from the head of the word and compare their tails, which puts AARDVARK before ANIMAL, and &amp;#8212; applying this rule recursively &amp;#8212; after &lt;a href="http://www.aardman.com/" title="Bristol's finest"&gt;AARDMAN&lt;/a&gt;. Imagine there&amp;#8217;s an empty word marking position zero, before A, right at the front of the dictionary, and our recursive  definition is complete.
&lt;/p&gt;
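&lt;p&gt;The recursive definition translates almost word for word into code. Here is a sketch in Python, where &lt;code&gt;precedes&lt;/code&gt; is a hypothetical helper name; for plain strings like these, Python&amp;#8217;s built-in ordering should agree with it.
&lt;/p&gt;

```python
def precedes(a, b):
    """True if word a comes strictly before word b in the dictionary."""
    if not b:
        return False               # nothing comes before the empty word
    if not a:
        return True                # the empty word comes before any other
    if a[0] != b[0]:
        return b[0] > a[0]         # first letters differ: compare them
    return precedes(a[1:], b[1:])  # same head: pop it and compare the tails


words = ["ANIMAL", "AARDVARK", "", "AARDMAN", "B"]
print(sorted(words))
print(precedes("AARDMAN", "AARDVARK"), precedes("AARDVARK", "ANIMAL"))
```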

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc3" name="tocnext-permutation-in-action" id="tocnext-permutation-in-action"&gt;Next permutation in action&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a simple program which shows &lt;code&gt;next_permutation()&lt;/code&gt; in action.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;cstdio&amp;gt;

int main()
{
    char xs[] = "123";
    do
    {
        std::puts(xs);
    }
    while (std::next_permutation(xs, xs + sizeof(xs) - 1));
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This program outputs lexicographically ordered permutations of 1, 2 and 3. When the main function returns, the array &lt;code&gt;xs&lt;/code&gt; will have cycled round to hold the lexicographically smallest arrangement of its elements, which is &lt;code&gt;"123"&lt;/code&gt;. Note that we never convert the characters &lt;code&gt;'1'&lt;/code&gt;, &lt;code&gt;'2'&lt;/code&gt;, &lt;code&gt;'3'&lt;/code&gt; into the numbers &lt;code&gt;1&lt;/code&gt;, &lt;code&gt;2&lt;/code&gt;, &lt;code&gt;3&lt;/code&gt;. The values of both sets of data types appear in the same order, so all works as expected.
&lt;/p&gt;
&lt;pre&gt;
123
132
213
231
312
321
&lt;/pre&gt;

&lt;p&gt;If we tweak and rerun the same program with &lt;code&gt;xs&lt;/code&gt; initialised to &lt;code&gt;"AAADKRRV"&lt;/code&gt; we get rather more output.
&lt;/p&gt;
&lt;pre&gt;
AAADKRRV
AAADKRVR
AAADKVRR
...
AARDVARK
...
VRRKAADA
VRRKADAA
VRRKDAAA
&lt;/pre&gt;

&lt;p&gt;The sequence &lt;strong&gt;doesn&amp;#8217;t&lt;/strong&gt; start by repeating &lt;code&gt;"AAADKRRV"&lt;/code&gt; 6 times, once for every permutation of the 3 A&amp;#8217;s. Only strictly increasing permutations are included. And although the repeated calls to &lt;code&gt;next_permutation&lt;/code&gt; generate a series of permutations, the algorithm holds no state. Each function call works on its input range afresh.
&lt;/p&gt;
&lt;p&gt;This second run of the program yields 3360 lines of output, even though there are 8! = 40320 possible permutations of 8 characters. Each unique permutation corresponds to 3! &amp;times; 2! = 12 actual permutations of the 8 characters (because there are 3 A&amp;#8217;s and 2 R&amp;#8217;s), and 40320 &amp;divide; 12 is 3360.
&lt;/p&gt;
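&lt;p&gt;The arithmetic can be checked with a few lines of Python. This sketch uses only the standard library; &lt;code&gt;distinct_permutations&lt;/code&gt; is a name invented here, and the brute-force count is included purely as a cross-check.
&lt;/p&gt;

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_permutations(items):
    """Count distinct orderings: n! divided by k! for each repeated value."""
    count = factorial(len(items))
    for repeats in Counter(items).values():
        count //= factorial(repeats)
    return count


print(distinct_permutations("AAADKRRV"))     # 3360
print(len(set(permutations("AAADKRRV"))))    # brute force over all 40320
```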

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc4" name="tocsnail-sorts-revenge" id="tocsnail-sorts-revenge"&gt;Snail sort&amp;#8217;s revenge&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/tim_norris/2789759648/"&gt;&lt;img src="http://farm4.static.flickr.com/3143/2789759648_ab4bfb5ea8.jpg" width="500px" height="333px" alt="...and in last place. By Tim Norris"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;As you can see, &lt;code&gt;next_permutation&lt;/code&gt; sorts an input range, one step at a time.  When &lt;code&gt;next_permutation&lt;/code&gt; eventually returns false, the range will be perfectly ordered. Hence we have &lt;code&gt;snail_sort()&lt;/code&gt;, hailed by the SGI STL &lt;a href="http://www.sgi.com/tech/stl/next_permutation.html"&gt;documentation&lt;/a&gt; as the worst known deterministic sorting algorithm.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template &amp;lt;class Iter&amp;gt; 
void snail_sort(Iter first, Iter last)
{
    while (std::next_permutation(first, last)) {}
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Very witty, and evidence that code can be both &lt;a href="http://wordaligned.org/articles/elegance-and-efficiency"&gt;elegant and inefficient&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;In two important edge cases, though, &lt;code&gt;snail_sort&lt;/code&gt; performs on a par with super-charged &lt;code&gt;quicksort&lt;/code&gt;!
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     I snail sorted an array filled with 100000000 zeros in 0.502 seconds. Running quicksort on the same array took 5.504 seconds. 
 &lt;/li&gt;

 &lt;li&gt;
     Starting with an array of the same size filled with the values 99999999, 99999998, 99999997, &amp;#8230; 1, 0 snail sort&amp;#8217;s 0.500 seconds trounced quicksort&amp;#8217;s 4.08 seconds.
 &lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc5" name="tocthe-next-number-solved" id="tocthe-next-number-solved"&gt;The Next Number, Solved&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s an outline solution to the &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=p1"&gt;next number problem&lt;/a&gt;. (I&amp;#8217;ve glossed over the exact input and output file formats for clarity.) It reads numbers from standard input and writes next numbers to standard output. &lt;code&gt;Next_permutation&lt;/code&gt; does the hard work, and there&amp;#8217;s a bit of fiddling when we have to increase the number of digits by adding a zero.&lt;a id="fn1link" href="http://wordaligned.org/articles/next-permutation#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;
#include &amp;lt;iostream&amp;gt;
#include &amp;lt;string&amp;gt;

/*
 Given a string of digits, shift any leading '0's
 past the first non-zero digit and insert an extra zero.
 
 Examples:
  
 123 -&amp;gt; 1023
 008 -&amp;gt; 8000
 034 -&amp;gt; 3004
*/
void insert_a_zero(std::string &amp;amp; number)
{
    size_t nzeros = number.find_first_not_of('0');
    number = number.substr(nzeros);
    number.insert(1, nzeros + 1, '0');
}

/*
 Outline solution to the 2009 code jam Next Number problem.
 
 Given a string representing a decimal number, find the next
 number which can be formed from the same set of digits. Add
 another zero if necessary. Repeat for all such strings read
 from standard input.
*/
int main()
{
    std::string number;
    while (std::cin &amp;gt;&amp;gt; number)
    {
        if (!std::next_permutation(number.begin(), number.end()))
        {
            insert_a_zero(number);
        }
        std::cout &amp;lt;&amp;lt; number &amp;lt;&amp;lt; '\n';
    }
    return 0;
}

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc6" name="tocimplementation" id="tocimplementation"&gt;Implementation&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Having used the C++ standard library to solve the puzzle, let&amp;#8217;s take a look at how it works. Next permutation is a clever algorithm which shuffles a collection in place. My system implements it like this&lt;a id="fn2link" href="http://wordaligned.org/articles/next-permutation#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename Iter&amp;gt;
bool next_permutation(Iter first, Iter last)
{
    if (first == last)
        return false;
    Iter i = first;
    ++i;
    if (i == last)
        return false;
    i = last;
    --i;
        
    for(;;)
    {
        Iter ii = i;
        --i;
        if (*i &amp;lt; *ii)
        {
            Iter j = last;
            while (!(*i &amp;lt; *--j))
            {}
            std::iter_swap(i, j);
            std::reverse(ii, last);
            return true;
        }
        if (i == first)
        {
            std::reverse(first, last);
            return false;
        }
    }
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We start with a range delimited by a pair of bi-directional iterators, &lt;code&gt;[first, last)&lt;/code&gt;. If the range contains one item or fewer, there can be no next permutation, so leave the range as is and return &lt;code&gt;false&lt;/code&gt;. Otherwise, enter the &lt;code&gt;for&lt;/code&gt; loop with an iterator &lt;code&gt;i&lt;/code&gt; pointing at the final item in the range.
&lt;/p&gt;
&lt;p&gt;At each pass through the body of this for loop we decrement &lt;code&gt;i&lt;/code&gt; by one, stepping towards the first item in the range. We are looking for one of two conditions:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     the value pointed to by &lt;code&gt;i&lt;/code&gt; is smaller than the one it pointed to previously
 &lt;/li&gt;

 &lt;li&gt;
     &lt;code&gt;i&lt;/code&gt; reaches the first item in the range
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Put another way, we divide the range into a head and tail, where the tail is the longest possible decreasing tail of the range.
&lt;/p&gt;
&lt;p&gt;If this tail is the whole range (the second condition listed above) then the whole range is in reverse order, and we have the lexicographical maximum formed from its elements. Reversing the range returns it to its lexicographical minimum, and we can return &lt;code&gt;false&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;If this tail is not the whole range, then the final item in the head of the range, the item &lt;code&gt;i&lt;/code&gt; points to, is smaller than at least one of the items in the tail, and we can certainly generate a greater permutation by moving that item towards the end of the range. To find the next permutation, we reverse iterate from the end of the range until we find an item &lt;code&gt;*j&lt;/code&gt; bigger than &lt;code&gt;*i&lt;/code&gt; &amp;#8212; that&amp;#8217;s what the while loop does. Swapping the items pointed to by &lt;code&gt;i&lt;/code&gt; and &lt;code&gt;j&lt;/code&gt; ensures the head of the range is bigger than it was, and the tail of the range remains in reverse order. Finally, we reverse the tail of the range, leaving us with a permutation exactly one beyond the input permutation, and we return &lt;code&gt;true&lt;/code&gt;.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc7" name="tocwhats-happening-here" id="tocwhats-happening-here"&gt;What&amp;#8217;s happening here?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It&amp;#8217;s clear from this paper analysis that the algorithm is of linear complexity. Essentially, it walks up and down the tail of the list, comparing and swapping. But why does it work?
&lt;/p&gt;
&lt;p&gt;Let &lt;code&gt;xs&lt;/code&gt; be the range &lt;code&gt;[first, last)&lt;/code&gt;. As described above, divide this range into prefix and suffix subranges, &lt;code&gt;head&lt;/code&gt; and &lt;code&gt;tail&lt;/code&gt;, where &lt;code&gt;tail&lt;/code&gt; is the longest monotonically decreasing tail of the range.
&lt;/p&gt;
&lt;p&gt;If the &lt;code&gt;head&lt;/code&gt; of the range is empty, then the range &lt;code&gt;xs&lt;/code&gt; is clearly at its lexicographical maximum. 
&lt;/p&gt;
&lt;p&gt;Otherwise, &lt;code&gt;tail&lt;/code&gt; is a lexicographical maximum of the elements it contains, and &lt;code&gt;xs&lt;/code&gt; is therefore the largest permutation which starts with the subrange &lt;code&gt;head&lt;/code&gt;. What will the &lt;code&gt;head&lt;/code&gt; of the next permutation be? We have to swap the final item in &lt;code&gt;head&lt;/code&gt; with the smallest item of &lt;code&gt;tail&lt;/code&gt; which exceeds it: the definition of &lt;code&gt;tail&lt;/code&gt; guarantees at least one such item exists. Now we want to permute the new &lt;code&gt;tail&lt;/code&gt; to be at its lexicographical minimum, which is a matter of sorting it from low to high.
&lt;/p&gt;
&lt;p&gt;Since &lt;code&gt;tail&lt;/code&gt; is in reverse order, finding the smallest item larger than &lt;code&gt;head[-1]&lt;/code&gt; is a matter of walking back from the end of the range to find the first such item; and once we&amp;#8217;ve swapped these items, &lt;code&gt;tail&lt;/code&gt; remains in reverse order, so a simple reversal will sort it.
&lt;/p&gt;
&lt;p&gt;As an example consider finding the next permutation of:
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8342666411
&lt;/pre&gt;

&lt;p&gt;The longest monotonically decreasing tail is &lt;code&gt;666411&lt;/code&gt;, and the corresponding head is &lt;code&gt;8342&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8342 666411
&lt;/pre&gt;

&lt;p&gt;&lt;code&gt;666411&lt;/code&gt; is, by definition, reverse-ordered, and cannot be increased by permuting its elements. To find the next permutation, we must increase the head; a matter of finding the smallest tail element larger than the head&amp;#8217;s final &lt;code&gt;2&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;2&lt;/span&gt; 666411
&lt;/pre&gt;

&lt;p&gt;Walking back from the end of tail, the first element greater than &lt;code&gt;2&lt;/code&gt; is &lt;code&gt;4&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;2&lt;/span&gt;  666&lt;span style="color:#930"&gt;4&lt;/span&gt;11
&lt;/pre&gt;

&lt;p&gt;Swap the &lt;code&gt;2&lt;/code&gt; and the &lt;code&gt;4&lt;/code&gt;.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
834&lt;span style="color:#930"&gt;4&lt;/span&gt; 666&lt;span style="color:#930"&gt;2&lt;/span&gt;11
&lt;/pre&gt;

&lt;p&gt;Since head has increased, we now have a greater permutation. To reduce to the next permutation, we reverse tail, putting it into increasing order.
&lt;/p&gt;
&lt;pre style="font-size:250%;"&gt;
8344 112666
&lt;/pre&gt;

&lt;p&gt;Join the head and tail back together. The permutation one greater than &lt;code&gt;8342666411&lt;/code&gt; is &lt;code&gt;8344112666&lt;/code&gt;.
&lt;/p&gt;
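&lt;p&gt;The same steps can be sketched in Python, working in place on a list. This is an illustrative translation rather than the library code: it indexes the list directly instead of stepping iterators, and the variable names &lt;code&gt;tail&lt;/code&gt; and &lt;code&gt;pivot&lt;/code&gt; are chosen here to match the description above.
&lt;/p&gt;

```python
def next_permutation(xs):
    """Rearrange the list xs into its next permutation, in place.

    Returns True if a greater permutation exists. Otherwise xs is left
    as its smallest permutation and the function returns False.
    """
    # Walk back from the end to find where the longest decreasing tail starts.
    tail = len(xs) - 1
    while tail > 0 and xs[tail - 1] >= xs[tail]:
        tail -= 1
    if not tail > 0:
        # The tail is the whole range: wrap round to the minimum.
        xs.reverse()
        return False
    pivot = tail - 1                      # final item of the head
    # Walk back from the end to the first item bigger than the pivot.
    j = len(xs) - 1
    while not xs[j] > xs[pivot]:
        j -= 1
    xs[pivot], xs[j] = xs[j], xs[pivot]   # increase the head
    xs[tail:] = reversed(xs[tail:])       # put the tail in increasing order
    return True


digits = list("8342666411")
print(next_permutation(digits), "".join(digits))   # True 8344112666
```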

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc8" name="tocbeautiful-c" id="tocbeautiful-c"&gt;Beautiful C++?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://wordaligned.org/articles/looping-forever-and-ever"&gt;&lt;img  src="http://wordaligned.org/images/mite.jpg" alt="for(;;) dust mite"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;C++ has its &lt;a href="http://yosefk.com/c++fqa/defective.html" title="If you are an expert in the intricacies of C++, please consider this knowledge a kind of martial art - something a real master never uses. Yossi Kreinin"&gt;detractors&lt;/a&gt;, who characterise it as subtle, &lt;a href="http://twitter.com/dabeaz/status/5677453478" title="C++0x reminds me of blocks stacked by my toddler. Really wobbly and one block too many makes it topple. @dabeaz"&gt;complex&lt;/a&gt;, and &lt;a href="http://www2.research.att.com/~bs/bs_faq.html#really-say-that" title="C++ can blow your whole leg off. Bjarne Stroustrup"&gt;dangerous&lt;/a&gt;; but sometimes it excels. Look once more at the C++ implementation of this algorithm.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;template&amp;lt;typename Iter&amp;gt;
bool next_permutation(Iter first, Iter last)
{
    if (first == last)
        return false;
    Iter i = first;
    ++i;
    if (i == last)
        return false;
    i = last;
    --i;
    
    for(;;)
    {
        Iter ii = i;
        --i;
        if (*i &amp;lt; *ii)
        {
            Iter j = last;
            while (!(*i &amp;lt; *--j))
            {}
            std::iter_swap(i, j);
            std::reverse(ii, last);
            return true;
        }
        if (i == first)
        {
            std::reverse(first, last);
            return false;
        }
    }
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;span /&gt;With its special cases, boolean literals, multiple returns (4, count them!), disembodied and infinite loops, this code fails to exhibit conventional beauty. Yet &lt;em&gt;it is&lt;/em&gt; beautiful. All the next permutation algorithm needs are iterators which can advance forwards or backwards, step by step. And that&amp;#8217;s all this implementation uses.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m as excited as anyone by the mathematical rigour of &lt;a href="http://www.informit.com/articles/article.aspx?p=1407357&amp;amp;seqNum=3" title="Great article by Andrei Alexandrescu, which questions a pure Haskell quicksort implementation"&gt;functional programming&lt;/a&gt;, but sometimes computer science is about algorithms with virtually no space overhead, algorithms which loop rather than recurse. Sometimes it&amp;#8217;s about shuffling, nudging and swapping &amp;#8212; operations which map directly to the machine&amp;#8217;s most primitive operations. In such cases, C++ gets it right.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/next-permutation#toc9" name="tocpermutations-in-python" id="tocpermutations-in-python"&gt;Permutations in Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;For the code jam, though, as mentioned earlier, having a super-fast program rarely matters. More often, it&amp;#8217;s about developing a fast enough program super-quickly.
&lt;/p&gt;
&lt;p&gt;I find Python a far quicker language for developing code than C++. (Indeed, sometimes when it&amp;#8217;s obvious from the outset that a final program will need implementing in C++, I put together a working prototype using Python, which I then translate.) Could we solve the next number problem using code from the standard Python library?
&lt;/p&gt;
&lt;p&gt;At a first glance, &lt;a href="http://docs.python.org/py3k/library/itertools.html#itertools.permutations"&gt;itertools.permutations&lt;/a&gt; looks promising.
&lt;/p&gt;
&lt;blockquote&gt;&lt;h3&gt;&lt;tt&gt;itertools.permutations(&lt;em&gt;iterable&lt;/em&gt;, &lt;em&gt;r=None&lt;/em&gt;)&lt;/tt&gt;&lt;/h3&gt;&lt;p&gt;Return successive &lt;em&gt;r&lt;/em&gt; length permutations of elements in the &lt;em&gt;iterable&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;If &lt;em&gt;r&lt;/em&gt; is not specified or is &lt;tt&gt;None&lt;/tt&gt;, then &lt;em&gt;r&lt;/em&gt; defaults to the length
of the &lt;em&gt;iterable&lt;/em&gt; and all possible full-length permutations
are generated.&lt;/p&gt;&lt;p&gt;Permutations are emitted in lexicographic sort order.  So, if the input &lt;em&gt;iterable&lt;/em&gt; is sorted, the permutation tuples will be produced in sorted order.&lt;/p&gt;&lt;p&gt;Elements are treated as unique based on their position, not on their value.  So if the input elements are unique, there will be no repeat values in each permutation.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;However, this algorithm doesn&amp;#8217;t care about the values of the items in the iterable, and the lexicographic sort order applies to the indices of these items. So although the ordering of the generated items is well-defined:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     we get repeats, and
 &lt;/li&gt;

 &lt;li&gt;
     it&amp;#8217;s not the ordering we want (in this case)
 &lt;/li&gt;
&lt;/ol&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import permutations
&amp;gt;&amp;gt;&amp;gt; concat = ''.join
&amp;gt;&amp;gt;&amp;gt; list(map(concat, permutations('AAA')))
['AAA', 'AAA', 'AAA', 'AAA', 'AAA', 'AAA']
&amp;gt;&amp;gt;&amp;gt; list(map(concat, permutations('231')))
['231', '213', '321', '312', '123', '132']

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It &lt;strong&gt;is&lt;/strong&gt; possible to code up &lt;code&gt;next_permutation&lt;/code&gt; using nothing more than the standard itertools, but it isn&amp;#8217;t advisable.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Snail permute&lt;/div&gt;

&lt;pre class="prettyprint"&gt;from itertools import permutations, groupby

def next_permutation(xs):
    """Calculate the next permutation of the sequence xs.
    
    Returns a pair (yn, xs'), where yn is a boolean and xs' is the 
    next permutation. If yn is True, xs' will be the lexicographic 
    next permutation of xs, otherwise xs' is the lexicographic 
    smallest permutation of xs.
    """
    xs = tuple(xs)
    if not xs:
        return False, xs
    else:
        ps = [p for p, gp in groupby(sorted(permutations(xs)))]
        np = len(ps)
        ix = ps.index(xs) + 1
        if ix == np:
            return False, ps[0]
        else:
            return True, ps[ix]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As it happens, a solution based on this exhaustive search would score points in the code jam since it copes with the small input set. For the large input set its factorial complexity rules it out, and we&amp;#8217;d need to implement the next permutation algorithm &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=1"&gt;from scratch&lt;/a&gt;.
&lt;/p&gt;
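&lt;p&gt;A from-scratch version need not be long. The sketch below is one possible Python solution, assuming each input is a string of decimal digits with no leading zero. It borrows the spare-zero trick described in footnote 1, and &lt;code&gt;next_number&lt;/code&gt; is a hypothetical name.
&lt;/p&gt;

```python
def next_number(number):
    """Return the next number made from the digits of number, plus zeros."""
    digits = ["0"] + list(number)   # spare leading zero, dropped if unused
    # Find the last digit which is smaller than its successor. The spare
    # zero guarantees one exists, because number has no leading zero.
    i = len(digits) - 2
    while digits[i] >= digits[i + 1]:
        i -= 1
    # Find the rightmost digit bigger than digits[i] and swap the two.
    j = len(digits) - 1
    while not digits[j] > digits[i]:
        j -= 1
    digits[i], digits[j] = digits[j], digits[i]
    # The tail is still in decreasing order: reverse it to minimise it.
    digits[i + 1:] = reversed(digits[i + 1:])
    if digits[0] == "0":
        digits = digits[1:]         # the spare zero was not needed
    return "".join(digits)


for n in ["115", "151", "511", "1015", "82040229261723155710"]:
    print(n, next_number(n))
```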
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/next-permutation#fn1link"&gt;[1]&lt;/a&gt;: A more cunning &lt;a href="http://code.google.com/codejam/contest/dashboard?c=186264#s=a&amp;amp;a=1"&gt;solution&lt;/a&gt; avoids the special case by pushing the extra zero to head of the string before applying &lt;code&gt;next_permutation&lt;/code&gt;, then popping it if it hasn&amp;#8217;t been moved.
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/next-permutation#fn2link"&gt;[2]&lt;/a&gt;: I&amp;#8217;ve tweaked the layout and parameter names for use on this site.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ESjRGU30bM8:oLZBTsoLGK0:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ESjRGU30bM8:oLZBTsoLGK0:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ESjRGU30bM8:oLZBTsoLGK0:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ESjRGU30bM8:oLZBTsoLGK0:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ESjRGU30bM8:oLZBTsoLGK0:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ESjRGU30bM8:oLZBTsoLGK0:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ESjRGU30bM8:oLZBTsoLGK0:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/ESjRGU30bM8" height="1" width="1"/&gt;</description>
<dc:date>2009-11-19</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/next-permutation</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/ESjRGU30bM8/next-permutation</link>
<category>Puzzles</category>
<category>C++</category>
<category>Algorithms</category>
<category>Python</category>
<category>Google</category>
<feedburner:origLink>http://wordaligned.org/articles/next-permutation</feedburner:origLink></item>

<item>
<title>Python on Ice</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython" name="toc0" id="toc0"&gt;Python?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#toctwisted-81-on-python-25" name="toc1" id="toc1"&gt;Twisted 8.1 on Python 2.5&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-on-word-aligned" name="toc2" id="toc2"&gt;Python 3 on Word Aligned&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-absent-from-europython" name="toc3" id="toc3"&gt;Python 3 absent from Europython&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocpython-3-literature" name="toc4" id="toc4"&gt;Python 3 Literature&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocthe-cost-of-python-3" name="toc5" id="toc5"&gt;The Cost of Python 3&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocevolution-of-python" name="toc6" id="toc6"&gt;Evolution of Python&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/antipep#tocaccepting-python-3" name="toc7" id="toc7"&gt;Accepting Python 3&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;blockquote&gt;&lt;p&gt;A moratorium on Python changes is probably a good thing&amp;#8212;the last edition of my book nearly made my head explode. &amp;#8212; &lt;a href="http://twitter.com/dabeaz/status/5055586588"&gt;@dabeaz&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc0" name="tocpython" id="tocpython"&gt;Python?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&amp;#8220;Python?&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;Yes, Python. It&amp;#8217;s a &lt;a href="http://wordaligned.org/articles/pitching-python-in-three-syllables"&gt;high-level&lt;/a&gt; language, we used it for the prototype. We can use it for parts of the system where performance isn&amp;#8217;t critical. Connecting components together. The web server.&amp;#8221;
&lt;/p&gt;
&lt;p&gt;&amp;#8220;But what will we do when Python changes? It&amp;#8217;s a developing language, right? How can we maintain our system?&amp;#8221;
&lt;/p&gt;
&lt;p&gt;Not an issue, I explained. Python takes backwards compatibility very seriously. Besides, we choose which version of Python to deploy with, we choose when we migrate &amp;#8212; maybe never. Look, you can &lt;a href="http://python.org/download/releases"&gt;download the source&lt;/a&gt; for every version of Python ever released. All you need is a &lt;a href="http://www.python.org/dev/peps/pep-0007" title="A C89 compiler, in fact. (PEP 7, Style Guide for C Code)"&gt;C compiler&lt;/a&gt;. C is the porting layer, if you like, and C isn&amp;#8217;t going anywhere in a hurry.
&lt;/p&gt;
&lt;p&gt;In all honesty, I expected more maintenance issues with the C++ parts of our product, where the language may not have changed in a decade but &lt;a href="http://wordaligned.org/articles/code-rot"&gt;compilers are only just catching up&lt;/a&gt; with it; and in fact I didn&amp;#8217;t have to argue for long to persuade senior management, not on this issue at least. They&amp;#8217;d already seen how quickly I could get things up and running using Python. Even though the company had more experience with C, C++, Java, and even .Net, I convinced them Python had a role on the server-based system we were developing.
&lt;/p&gt;
&lt;p&gt;Nonetheless, I didn&amp;#8217;t think it the right time to mention Python 3. Why confuse things?
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc1" name="toctwisted-81-on-python-25" id="toctwisted-81-on-python-25"&gt;Twisted 8.1 on Python 2.5&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I won&amp;#8217;t go into detail about the product. Data flowed through it, redirected dynamically using tees and filters, and robots were attached to the resulting streams to monitor them. A web UI presented controls and a view of the system. We used C and C++ for managing the bulk of the flow. The robots we coded in both Python and C++. We connected and coordinated everything using &lt;a href="http://twistedmatrix.com" title="Twisted is an event-driven networking engine written in Python"&gt;Twisted&lt;/a&gt;, a Python networking engine. Our initial deployment used Python 2.5 and Twisted 8.1. We&amp;#8217;ve since upgraded to Python 2.6 and Twisted 8.2.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc2" name="tocpython-3-on-word-aligned" id="tocpython-3-on-word-aligned"&gt;Python 3 on Word Aligned&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;At work there was no question of using Python 3 even though it became available when we started development. Twisted hadn&amp;#8217;t been released against Python 3 (&lt;a href="http://stackoverflow.com/questions/172306/how-are-you-planning-on-handling-the-migration-to-python-3/214601#214601"&gt;it still hasn&amp;#8217;t&lt;/a&gt;) and even if it had been, we wouldn&amp;#8217;t have trusted it immediately. Here at &lt;a href="http://wordaligned.org/"&gt;Word Aligned&lt;/a&gt;, though, I switched to Python 3 pretty much as soon as it was officially released. Since the start of 2009, &lt;a href="http://wordaligned.org/articles/perl-6-python-3#tocword-aligned-and-python-30"&gt;any Python code published on this site has been written in Python 3&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;Since then I&amp;#8217;ve come to question my decision. I want people to visit my site and I want them to stay long enough to read any code here. Python is perfect because it&amp;#8217;s readable and accessible. Anyone who&amp;#8217;s ever written a program, whatever the language, can understand Python. But many times I&amp;#8217;ve felt the need to explain my Python 3 code, not to Java, C#, C++ and C users, not even to Perl and Ruby users, but to Python users!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Note that in Python 3 &amp;#8230; whereas in Python 2 &amp;#8230; available in Python 3.1 only &amp;#8230; you&amp;#8217;d need to write &amp;#8230; from __future__ import print_function
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I wouldn&amp;#8217;t have felt the need to say any of this if I&amp;#8217;d stuck with Python 2.
&lt;/p&gt;
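&lt;p&gt;To make one of those notes concrete, here is a minimal sketch of the &lt;code&gt;print_function&lt;/code&gt; import in use. The &lt;code&gt;greet&lt;/code&gt; function and its message are hypothetical, invented purely for illustration; the import is harmless under Python 3 and switches on the function form of &lt;code&gt;print&lt;/code&gt; under Python 2.6 and 2.7.
&lt;/p&gt;

```python
from __future__ import print_function  # harmless in Python 3, needed in Python 2.6 and 2.7

import sys


def greet(name, stream=sys.stdout):
    # greet is a hypothetical helper, not taken from any article on this site.
    # With the import above, print is a function in both language flavours,
    # so the sep, end and file keyword arguments are available.
    print("Hello", name, sep=", ", end="!\n", file=stream)


greet("Python")  # writes: Hello, Python!
```

&lt;p&gt;Without the import, Python 2 rejects the keyword arguments as a syntax error, which is exactly the sort of thing I kept having to point out.
&lt;/p&gt;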

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc3" name="tocpython-3-absent-from-europython" id="tocpython-3-absent-from-europython"&gt;Python 3 absent from Europython&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/4036266347/" title="Europython 2009 bag code by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2700/4036266347_9862579d68.jpg" width="500" height="208" alt="Code on the Europython 2009 bag" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;At Europython 2009 I was struck by the absence of Python 3 from the agenda. None of &lt;a href="http://www.europython.eu/talks/timetable/" title="Europython 2009 timetable"&gt;the sessions&lt;/a&gt; covered Python 3, used Python 3, or even mentioned Python 3 (unless you count David Jones&amp;#8217; talk on &lt;a href="http://www.europython.eu/talks/talk_abstracts/index.html#talk74"&gt;Loving Old Versions of Python&lt;/a&gt;). The only Python 3 code I saw appeared on the conference bag; a lightly obfuscated script which printed out the conference destination. Note that &lt;code&gt;print&lt;/code&gt; is being used in a way which works with both 2.x and 3.x &amp;#8212; that is, with parentheses and taking a single parameter. Very few systems resolve &lt;code&gt;/usr/bin/env python&lt;/code&gt; as Python 3, though&lt;a id="fn1link" href="http://wordaligned.org/articles/antipep#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;, which is lucky since even this simple function raises an exception under Python 3 (and transforming the code using &lt;code&gt;&lt;a href="http://docs.python.org/library/2to3.html"&gt;2to3&lt;/a&gt;&lt;/code&gt; makes it worse).
&lt;/p&gt;
&lt;p&gt;This Python 3 silence was at last broken during the question and answer session which followed the final keynote on the final day of the conference. An audibly nervous member of the audience asked &lt;a href="http://www.python.org/psf"&gt;Python Software Foundation&lt;/a&gt; supremo &lt;a href="http://holdenweb.com"&gt;Steve Holden&lt;/a&gt; a question:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Audience member: A source of confusion is the Python 2 Python 3 thing. How are you going about getting people to move from Python 2 to Python 3?
&lt;/p&gt;
&lt;p&gt;Steve Holden: I&amp;#8217;m not trying to get people to move to Python 3. [Audience applauds].
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Steve Holden went on to round out this answer, saying that 2.6 is the recommended production version of Python. Anyone who took Python 3.0 into production, he said, would have been &amp;#8220;kicked in the teeth by the fact that the IO subsystem performed execrably slowly, it was really dreadful&amp;#8221; &amp;#8212; a fact the 3.0 release notes failed to mention, but which has been &lt;a href="http://docs.python.org/3.1/whatsnew/3.1.html#optimizations"&gt;fixed in Python 3.1&lt;/a&gt;. For teaching purposes, or for greenfield development which doesn&amp;#8217;t need to reuse other people&amp;#8217;s code, by all means try Python 3, he said. Python 3 is the future of Python. There&amp;#8217;s a migration strategy in place.
&lt;/p&gt;
&lt;p&gt;And what about the overhead on the core Python development team, who now have two versions to maintain? Well, Steve Holden said, there are tools to automate patching and merging, but yes, there&amp;#8217;s an overhead.
&lt;/p&gt;
&lt;p&gt;(To hear the question and full response, there&amp;#8217;s &lt;a href="http://blip.tv/file/2351630" title="The Python Software Foundation and Us, Steve Holden, Europython 2009"&gt;a video at blip.tv&lt;/a&gt;. Fast forward to 52 minutes and 40 seconds.)
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc4" name="tocpython-3-literature" id="tocpython-3-literature"&gt;Python 3 Literature&lt;/a&gt;&lt;/h3&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0672329786"&gt;&lt;img src="http://wordaligned.org/images/books/python-essential-reference4.png" alt="Python Essential Reference, 4th edition"/&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/gp/product/1430224150?ie=UTF8&amp;amp;tag=diveintomark-20"&gt;&lt;img height="222px" src="http://ecx.images-amazon.com/images/I/51N9HK%2B7WGL._SL300_.jpg" alt="Dive into Python 3 cover"/&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;How have book authors reacted to Python 3? Mark Pilgrim has dived in with aplomb. His introductory book, &lt;a href="http://www.diveintopython3.org"&gt;Dive into Python 3&lt;/a&gt;, uses Python 3 as Python 3 was intended. For example, you won&amp;#8217;t find &lt;a href="http://docs.python.org/py3k/library/stdtypes.html#old-string-formatting-operations"&gt;% characters&lt;/a&gt; used in string formatting; {up to date &lt;a href="http://www.python.org/dev/peps/pep-3101/" title="PEP 3101, Advanced String Formatting"&gt;braces&lt;/a&gt; are used exclusively}. It&amp;#8217;s an engaging, painstakingly-written book, and (bonus!) the online version is an object lesson in how to craft HTML.
&lt;/p&gt;
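&lt;p&gt;For anyone who hasn&amp;#8217;t met the two styles side by side, here is a small sketch. The variable names and values are my own, chosen only for illustration, and this is not an excerpt from the book.
&lt;/p&gt;

```python
# Illustrative values only; nothing here is quoted from Dive into Python 3.
name, version = "Python", 3.1

# Old style: the % operator, familiar from most Python 2 code.
old_style = "%s %.1f" % (name, version)

# New style (PEP 3101): str.format with brace-delimited fields.
new_style = "{0} {1:.1f}".format(name, version)

# Both spell the same result.
assert old_style == new_style == "Python 3.1"
```

&lt;p&gt;Both forms run under Python 2.6 and under Python 3, so choosing braces is a matter of style rather than necessity.
&lt;/p&gt;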
&lt;p&gt;David Beazley attacks the problem in a different way, but then his subject is different. His comprehensive &lt;a href="http://www.amazon.com/Python-Essential-Reference-Developers-Library/dp/0672329786"&gt;Python Essential Reference&lt;/a&gt; aims to cover the core language and its standard library in its entirety: think of it as the reference any serious Python programmer &lt;a href="http://www.dabeaz.com/blog/2009/08/essential-misconceptions.html"&gt;would like to have within reach&lt;/a&gt;. David Beazley&amp;#8217;s approach is to concentrate on the common subset of Python 2 and 3, omitting features of 2 which aren&amp;#8217;t in 3 and avoiding features of 3 which haven&amp;#8217;t been backported to 2. His book succeeds but it does raise some awkward questions. Will Pythonistas find themselves maintaining parallel code-bases, and end up twisting their code until it fits into the intersection of two flavours of the language?
&lt;/p&gt;
&lt;img src="http://chart.apis.google.com/chart?cht=v&amp;amp;chco=ffdd66,33ccff&amp;amp;chs=450x330&amp;amp;chd=t:100,90,100,70,100,70,70&amp;amp;chdl=Python+2|Python+3&amp;amp;chf=bg,s,EFEFEF&amp;amp;chdlp=l&amp;amp;chtt=Safe+Programming+Zone|Python+2+%e2%88%a9+Python+3&amp;amp;chts=333333,24" alt="Safe Python Programming zone"/&gt;

&lt;p&gt;Or will they simply avoid Python 3?
&lt;/p&gt;
&lt;p&gt;David Beazley eventually covers new Python 3 features in an appendix, by which time the strain has started to show:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Finally, even though Python 3.0 is described as the latest and greatest, it suffers from numerous performance and behavioral problems [&amp;#8230;] in the opinion of this author, Python 3.0 is really only suitable for experimental use by seasoned Python veterans.
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc5" name="tocthe-cost-of-python-3" id="tocthe-cost-of-python-3"&gt;The Cost of Python 3&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;David Beazley may have ended up feeling like a Python 3 beta tester, but, as discussed at the start of this article, most Python users have a free choice. We can live a little longer with Python 2 &lt;a href="http://wiki.python.org/moin/PythonWarts"&gt;warts&lt;/a&gt; in exchange for a proven platform and an excellent set of supporting libraries. We can try and write to a language subset. We can use Python 2 and import much of the future. Or we can dive into Python 3.
&lt;/p&gt;
&lt;p&gt;The people who must find the language fork tough are the Python suppliers. Our choice, as consumers, means work for them: we&amp;#8217;ve mentioned the core Python team, who must surely spend more time patching and testing; think of Python library writers (such as the wizards behind Twisted).
&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s another important class of supplier: the people working on alternative Python implementations, the ones which work on &lt;a href="http://www.jython.org"&gt;Java&lt;/a&gt;, or &lt;a href="http://www.codeplex.com/IronPython"&gt;.Net&lt;/a&gt;, the ones which have no global interpreter lock, the ones which can run deeply recursive functions. Pythonistas are understandably excited about &lt;a href="http://code.google.com/p/unladen-swallow/"&gt;Unladen Swallow&lt;/a&gt;, a development branch of CPython 2.6. Just look at the project &lt;a href="http://code.google.com/p/unladen-swallow/wiki/ProjectPlan"&gt;goals&lt;/a&gt;!
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;We want to make Python faster, but we also want to make it easy for large, well-established applications to switch to Unladen Swallow.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     Produce a version of Python at least 5x faster than CPython.
 &lt;/li&gt;

 &lt;li&gt;
     Python application performance should be stable.
 &lt;/li&gt;

 &lt;li&gt;
     Maintain source-level compatibility with CPython applications.
 &lt;/li&gt;

 &lt;li&gt;
     Maintain source-level compatibility with CPython extension modules.
 &lt;/li&gt;

 &lt;li&gt;
     We do not want to maintain a Python implementation forever; we view our work as a branch, not a fork. 
 &lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;&lt;p&gt;In summary, if Unladen Swallow touches down safely, it will become CPython, and anyone using CPython will benefit from a high-level language capable of performing at native speeds.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;d like to highlight the final goal, the one about maintaining, branching and forking. Unladen Swallow is a branch taken from CPython 2.6&lt;a id="fn2link" href="http://wordaligned.org/articles/antipep#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt;. If it succeeds, much hard and unglamorous work will be needed to merge it to the latest CPython 2, and patch it across to CPython 3. Python implementers must pay a high price, in terms of increased workload, for the Python 2, Python 3 fork.
&lt;/p&gt;
&lt;p&gt;Could Python have evolved in a more linear way, by deprecating then removing features, while adding in new ones? I guess not: a mature language wouldn&amp;#8217;t dare break backwards compatibility.
&lt;/p&gt;
&lt;p&gt;Would it?
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc6" name="tocevolution-of-python" id="tocevolution-of-python"&gt;Evolution of Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;My employers were right to characterise Python as a developing language. On the subject of incremental change, I wanted to highlight again the &lt;a href="http://wordaligned.org/articles/python-counters"&gt;evolution of the multiset in Python&lt;/a&gt;, from Python 1.4, released in 1996, through to Python 3.1, just 4 months old.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Evolution of the Multiset in Python&lt;/div&gt;

&lt;pre class="prettyprint"&gt;def multiset_14(xs):
    multiset = {}
    for x in xs:
        if multiset.has_key(x):
            multiset[x] = multiset[x] + 1
        else:
            multiset[x] = 1
    return multiset

def multiset_15(xs):
    multiset = {}        
    for x in xs:
        multiset[x] = multiset.get(x, 0) + 1
    return multiset

import collections

def multiset_25(xs):
    multiset = collections.defaultdict(int)
    for x in xs:
        multiset[x] += 1
    return multiset

def multiset_31(xs):
    return collections.Counter(xs)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Lovely!
&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size:8px"&gt;&lt;code&gt;multiset_14()&lt;/code&gt; won&amp;#8217;t work in Python 3, and &lt;code&gt;multiset_31()&lt;/code&gt; won&amp;#8217;t work in Python 2 (not yet, anyway).&lt;/span&gt;
&lt;/p&gt;
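&lt;p&gt;As a rough check on that small print, here is a sketch which rewrites the oldest version using the &lt;code&gt;in&lt;/code&gt; operator (&lt;code&gt;dict.has_key&lt;/code&gt; was removed in Python 3) and compares it with the later approaches. The function names &lt;code&gt;multiset_membership&lt;/code&gt; and &lt;code&gt;multiset_get&lt;/code&gt; and the sample word are hypothetical, picked for this example only.
&lt;/p&gt;

```python
import collections


def multiset_membership(xs):
    # The multiset_14 logic, with has_key(x) replaced by the in operator
    # so that it also runs under Python 3.
    multiset = {}
    for x in xs:
        if x in multiset:
            multiset[x] = multiset[x] + 1
        else:
            multiset[x] = 1
    return multiset


def multiset_get(xs):
    # Same approach as multiset_15: dict.get with a default of zero.
    multiset = {}
    for x in xs:
        multiset[x] = multiset.get(x, 0) + 1
    return multiset


# Sample input chosen for illustration.
letters = "abracadabra"
expected = {"a": 5, "b": 2, "r": 2, "c": 1, "d": 1}

assert multiset_membership(letters) == expected
assert multiset_get(letters) == expected
assert collections.Counter(letters) == expected
```

&lt;p&gt;The three approaches agree on the counts; only the amount of code needed to get there has changed.
&lt;/p&gt;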

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/antipep#toc7" name="tocaccepting-python-3" id="tocaccepting-python-3"&gt;Accepting Python 3&lt;/a&gt;&lt;/h3&gt;
&lt;blockquote&gt;&lt;p&gt;The main goal of the Python development community at this point should be to get widespread acceptance of Python 3000. &amp;#8212; Guido van Rossum, &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/006305.html"&gt;2009-10-21&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Unlike Steve Holden, Guido van Rossum &lt;strong&gt;is&lt;/strong&gt; trying to get people to move to Python 3, or at least to accept it. &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/006305.html"&gt;Here&amp;#8217;s how&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;I propose a moratorium on language changes. This would be a period of several years during which no changes to Python&amp;#8217;s grammar or language semantics will be accepted. The reason is that frequent changes to the language cause pain for implementers of alternate implementations (Jython, IronPython, PyPy, and others probably already in the wings) at little or no benefit to the average user (who won&amp;#8217;t see the changes for years to come and might not be in a position to upgrade to the latest version for years after).
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Wow!
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m not close enough to Python development to know exactly what&amp;#8217;s involved here, but a scan of the &lt;a href="http://mail.python.org/pipermail/python-ideas/2009-October/thread.html#6305"&gt;email thread&lt;/a&gt; suggests this proposal has been widely accepted. I think it&amp;#8217;s clear from the rest of this article that I sympathise with the motivation behind it. Yet I can&amp;#8217;t help feeling uneasy about putting Python on ice. Yes, there have been changes to the language grammar over the past fifteen years. I wouldn&amp;#8217;t say they&amp;#8217;ve been frequent, and there aren&amp;#8217;t many I&amp;#8217;d want to do without, even if I only get to use them (in production) a year or two after they&amp;#8217;ve been released. Yes, these changes cause pain to implementers&lt;a id="fn3link" href="http://wordaligned.org/articles/antipep#fn3"&gt;&lt;sup&gt;[3]&lt;/sup&gt;&lt;/a&gt;, but that&amp;#8217;s not the whole story. Who said implementing a language would be easy? Perhaps much of the pain comes from implementing the changes twice, once for 2 and once for 3.
&lt;/p&gt;
&lt;p&gt;Soon there&amp;#8217;ll be a &lt;a href="http://www.python.org/dev/peps" title="Python Enhancement Proposals"&gt;PEP&lt;/a&gt; stating more formally what exactly a moratorium on language changes will mean. That is,
&lt;/p&gt;
&lt;p&gt;A &lt;strong&gt;P&lt;/strong&gt;ython &lt;strong&gt;E&lt;/strong&gt;nhancement &lt;strong&gt;P&lt;/strong&gt;roposal which &lt;strong&gt;P&lt;/strong&gt;roposes: Stop &lt;strong&gt;E&lt;/strong&gt;nhancing &lt;strong&gt;P&lt;/strong&gt;ython!
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/antipep#fn1link"&gt;[1]&lt;/a&gt; I wonder if anyone can guess why this code fails, just by looking at it? As a hint, highlight the rest of this paragraph. &lt;span style="color:white"&gt;Some Python 2 codecs are byte rather than text oriented, and Python 3 prohibits this kind of confusion.&lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/antipep#fn2link"&gt;[2]&lt;/a&gt; More details of the Unladen Swallow branch approach. 
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In order to achieve our combination of performance and compatibility goals, we opt to modify CPython, rather than start our own implementation from scratch. In particular, we opt to start working on CPython 2.6.1: Python 2.6 nestles nicely between 2.4/2.5 (which most interesting applications are using) and 3.x (which is the eventual future). Starting from a CPython release allows us to avoid reimplementing a wealth of built-in functions, objects and standard library modules, and allows us to reuse the existing, well-used CPython C extension API. Starting from a 2.x CPython release allows us to more easily migrate existing applications; if we were to start with 3.x, and ask large application maintainers to first port their application, we feel this would be a non-starter for our intended audience. 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;&lt;a id="fn3" href="http://wordaligned.org/articles/antipep#fn3link"&gt;[3]&lt;/a&gt;: C++ compiler writers, gear up for C++0x!
&lt;/p&gt;</description>
<dc:date>2009-10-28</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/antipep</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/yHwk_vQgrg8/antipep</link>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/antipep</feedburner:origLink></item>

<item>
<title>Steady on Subversion</title>
<description>&lt;p&gt;&lt;a href="http://subversion.tigris.org"&gt;&lt;img src="http://subversion.tigris.org/images/subversion_logo_hor-468x64.png" width="468px" height="64px" alt="Subversion banner"/&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;My secret shame&lt;/h3&gt;
&lt;p&gt;In a world of distributed version control systems I&amp;#8217;m ashamed to confess I still use &lt;a href="http://subversion.tigris.org"&gt;Subversion&lt;/a&gt;. We use it at work, exclusively. I use it at home, by default. Worse still, in a small way, I help promote and perpetuate this antiquated version control system: if you want to &lt;a href="http://www.google.com/search?q=mirror+subversion+repository"&gt;mirror a Subversion repository&lt;/a&gt; or set up a &lt;a href="http://www.google.com/search?q=subversion+pre-commit+hook"&gt;Subversion pre-commit hook&lt;/a&gt;, you may well find some faded notes I wrote on these subjects.
&lt;/p&gt;
&lt;p&gt;Whisper the words. &lt;span style="font-size:8px"&gt;I still like Subversion.&lt;/span&gt;
&lt;/p&gt;
&lt;p&gt;What do I like most? The command to revert a change. Merge it backwards.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ svn merge --change -666

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Branches and tags are one and the same. For someone who grew up with &lt;a href="http://www.nongnu.org/cvs/"&gt;CVS&lt;/a&gt;, that&amp;#8217;s quite a relief. Anyone can grasp the versioned file tree model. No one wants their version control system to surprise them. My boss, who doesn&amp;#8217;t get to program as much as he&amp;#8217;d like, has discovered a &lt;a href="http://versionsapp.com/"&gt;shiny Subversion client&lt;/a&gt; &amp;#8212; and he doesn&amp;#8217;t even use Windows. The sales team, who do use Windows, can use Subversion to collaborate on their office documents. &lt;a href="http://tortoisesvn.tigris.org/"&gt;TortoiseSVN&lt;/a&gt; interfaces to the Word diff tool, a nice touch. And software developers can surely find stable Subversion plug-ins for whatever tools they use.
&lt;/p&gt;
&lt;p&gt;The &lt;a href="http://svnbook.red-bean.com/"&gt;Subversion documentation&lt;/a&gt; is solid and has been for some while. I&amp;#8217;m surprised anyone ever arrives at my website &lt;a href="http://wordaligned.org/tag/subversion"&gt;seeking tips&lt;/a&gt;, but arrive they do, and in ever-increasing numbers.
&lt;/p&gt;
&lt;p&gt;Subversion does enough. The hard parts of my job are deciding what software to write, writing it, and working as a team. &lt;span /&gt;Version control should be frictionless, the easy bit. Which it is.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;Of course Subversion has weak points. It should be faster (whee, see how &lt;a href="http://git-scm.com"&gt;git&lt;/a&gt; flies!) and merging can be irksome (improving on CVS wasn&amp;#8217;t much of a target). But the biggest annoyance I&amp;#8217;ve had with Subversion is caused by its ubiquity and its continuing upgrade trajectory. Somehow I&amp;#8217;ve ended up accessing 1.4 and 1.5 format repositories on a machine which hosts 1.4, 1.5 and 1.6 clients in &lt;code&gt;/usr/local/bin&lt;/code&gt;, &lt;code&gt;/usr/bin&lt;/code&gt; and &lt;code&gt;/opt/local/bin&lt;/code&gt;, not necessarily in that order. Silly me, I&amp;#8217;m sorted now, I think, but I&amp;#8217;d happily see Subversion go into maintenance mode. &lt;span /&gt;For managing change, give me stable software.
&lt;/p&gt;

&lt;h3&gt;Do it the same, but better!&lt;/h3&gt;
&lt;p&gt;As mentioned at the start of this post, though, the world of version control has itself changed. Subversion represents evolution: by being a better CVS, it aimed to supplant its ancestor and become the VCS of choice for open source projects. CVS has indeed been supplanted, but true progress has come from the distributed version control revolution.
&lt;/p&gt;
&lt;p&gt;We&amp;#8217;ve been talking about a single, central source tree which develops in discrete steps. Everyone has a local working copy of the files in this tree, which they keep up to date, routinely merging changes back to base. Check out, check in. It &lt;strong&gt;is&lt;/strong&gt; an easy model to understand, but in practice there can be problems. What happens when you can&amp;#8217;t access the tree? Or when it gets pulled in different directions? Or when you lose track of who merged what where when? Now consider the distributed version control world, where the model extends to multiple trees. Everyone copies the entire repository as needed. Clone, merge.
&lt;/p&gt;
&lt;p&gt;In this distributed world a project needn&amp;#8217;t have a single, central repository&lt;a id="fn1link" href="http://wordaligned.org/articles/steady-on-subversion#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;. What&amp;#8217;s more, there is no single leading distributed version control system. As a result, open source projects are spoiled for choice.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://git-scm.com/"&gt;&lt;img src="http://wordaligned.org/images/git-logo.png" alt="git logo"/&gt;&lt;/a&gt;&lt;a href="http://bazaar-vcs.org/"&gt;&lt;img src="http://bazaar-vcs.org/bzricons/bazaar-logo.png" alt="Bazaar logo"/&gt;&lt;/a&gt;&lt;a href="http://mercurial.selenic.com"&gt;&lt;img src="http://www.selenic.com/hg-logo/logo-droplets-100.png" alt="Mercurial logo"/&gt;&lt;/a&gt;&lt;a href="http://www.darcs.net"&gt;&lt;img src="http://www.darcs.net/logos/logo.png" alt="Darcs logo"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The rise of the DVCS is a fascinating history, though one I&amp;#8217;ve yet to directly engage with &amp;#8212; unless you count the growing collection of DVCSes taking root on my hard disk (none of which shipped with my &lt;a href="http://www.apple.com/macosx/" title="Snow Leopard"&gt;operating system&lt;/a&gt;). I like the feel of &lt;a href="http://git-scm.com"&gt;git&lt;/a&gt;. Python will &lt;a href="http://python.org/dev/peps/pep-0385/" title="PEP 385. Migrating from svn to Mercurial"&gt;migrate to Mercurial&lt;/a&gt;. For now, I&amp;#8217;m staying put.
&lt;/p&gt;

&lt;h3&gt;Definitive commentary&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.red-bean.com/sussman/"&gt;Ben Collins-Sussman&lt;/a&gt; is one of the original designers and developers of Subversion, and co-author of &lt;a href="http://svnbook.red-bean.com/"&gt;Version Control with Subversion&lt;/a&gt;. His essays on the changing field of version control make fine reading. A couple of years ago he wrote:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Today, Subversion has now gone from &amp;#8220;cool subversive product&amp;#8221; to &amp;#8220;the default safe choice&amp;#8221; for both 80% and 20% audiences. The 80% companies who were once using crappy version control (or no version control at all) are now blogging to one another &amp;#8212; web developers giving &amp;#8220;hot tips&amp;#8221; to each other about using version control (and Subversion in particular) to manage their web sites at their small web-development shops. What was once new and hot to 20% people has finally trickled down to everyday-tool status among the 80%.
&lt;/p&gt;
&lt;p&gt;The great irony here &amp;#8230; is that Subversion was originally intended to subvert the open source world. It&amp;#8217;s done that to a reasonable degree, but it&amp;#8217;s proven far more subversive in the corporate world!
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Ben Collins-Sussman, &lt;a href="http://blog.red-bean.com/sussman/?p=79" title="Version Control and the 80%"&gt;Version Control and &amp;#8220;the 80%&amp;#8221;&lt;/a&gt;, October 2007
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;In April last year he followed up with:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;[&amp;#8230;] we think that [Subversion] will probably be the &amp;#8220;final&amp;#8221; centralized system that gets written in the open source world &amp;#8212; it represents the end-of-the-line for this model of code collaboration. 
&lt;/p&gt;
&lt;p&gt;It will continue to be used for many years, but specifically it will gain huge mindshare in the corporate world, while (eventually) losing mindshare to distributed systems in the open-source arena &amp;#8230; Subversion isn&amp;#8217;t anywhere near &amp;#8220;fading away&amp;#8221;. Quite the opposite: its adoption is still growing quadratically in the corporate world, with no sign of slowing down. This is happening independently of open source trailblazers losing interest in it. It may end up becoming a mainly &amp;#8220;corporate&amp;#8221; open source project (that is, all development funded by corporations that depend on it), but that&amp;#8217;s a fine way for a piece of mature software to settle down.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Ben Collins-Sussman, &lt;a href="http://blog.red-bean.com/sussman/?p=90"&gt;Subversion&amp;#8217;s Future&lt;/a&gt;, April 2008
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;Long live Subversion&lt;/h3&gt;
&lt;p&gt;Ben Collins-Sussman backs up his essay with &lt;a href="http://subversion.tigris.org/images/svn-dav-securityspace-survey.png"&gt;a graph&lt;/a&gt; showing the increasing numbers of Apache Subversion servers discoverable on the internet. His claims square with my personal experience. I&amp;#8217;m a corporate Subversion user and I don&amp;#8217;t see my employer switching version control systems any time soon (it&amp;#8217;s my decision as much as anyone&amp;#8217;s). What&amp;#8217;s more, Subversion is used in most of the companies I know of, where it has supplanted both legacy and proprietary systems. As stated already, version control isn&amp;#8217;t the hard part of my job, but should I ever need to change jobs, Subversion won&amp;#8217;t stand in my way.
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/steady-on-subversion#fn1link"&gt;[1]&lt;/a&gt;: A project may well choose to nominate a single central repository as the &amp;#8220;master&amp;#8221; repository. The functionality offered by distributed version control systems is effectively a superset of that offered by centralised ones.
&lt;/p&gt;</description>
<dc:date>2009-10-13</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/steady-on-subversion</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/MLYzlslh92Y/steady-on-subversion</link>
<category>Subversion</category>
<feedburner:origLink>http://wordaligned.org/articles/steady-on-subversion</feedburner:origLink></item>

<item>
<title>Favicon</title>
<description>&lt;p&gt;On the subject of &lt;a href="http://wordaligned.org/"&gt;this site&lt;/a&gt;, I wanted to mention the recent addition of a favicon &lt;img src="http://wordaligned.org/images/favicon.png" alt="Little chap favicon"/&gt;. Per pixel, it&amp;#8217;s cost me more effort than any other feature; but then it&amp;#8217;s accessed more than any other asset. It&amp;#8217;s meant to be a piece from a jigsaw puzzle. I got the idea when &lt;a href="http://wordaligned.org/articles/recursive-pictures"&gt;re-reading Life A User&amp;#8217;s Manual&lt;/a&gt;. I like &lt;a href="http://wordaligned.org/tag/puzzles"&gt;puzzles&lt;/a&gt; and piecing things together.
&lt;/p&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/gp/product/0002719991?ie=UTF8&amp;amp;tag=wordalig-20"&gt;&lt;img src="http://wordaligned.org/images/books/life-a-users-manual.jpg" alt="Life A User's Manual"/&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;Perec&amp;#8217;s great masterpiece is packed with interwoven stories and trickery, but at its heart is the epic battle between the millionaire, Bartlebooth, and the puzzle-maker, Gaspard Winckler. Bartlebooth begins his campaign by learning how to paint, which takes him 10 years. For the next 20 years he travels the world, painting a watercolour of a different port every couple of weeks. He sends the paintings back home to Paris. On receipt, Winckler glues each picture to a board which he then cuts, making a series of jigsaw puzzles for Bartlebooth to solve on his return, a task planned to fill a further 20 years. Once Bartlebooth completes each puzzle, an ingenious process is used to glue its pieces together and re-join the cut fibres of the paper; then the picture itself is lifted from the board, returned to the port it depicts, and washed clean in the sea; and finally the paper is returned in something close to its original state to Bartlebooth.
&lt;/p&gt;
&lt;p&gt;Thus, after 50 years of work, there will be nothing to show.
&lt;/p&gt;
&lt;img src="http://wordaligned.org/images/jigsaw-fr.png" alt="French jigsaw pieces"/&gt;

&lt;p&gt;In the book&amp;#8217;s preamble Perec describes familiar die-cut jigsaws, classifying the best known pieces of such puzzles as &amp;#8220;little chaps&amp;#8221;, &amp;#8220;double crosses&amp;#8221; and &amp;#8220;crossbars&amp;#8221;. Such diversions are eschewed by the true puzzler:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The art of jigsaw puzzling begins with wooden puzzles cut by hand, whose maker undertakes to ask himself all the questions the player will have to solve, and, instead of allowing chance to cover his tracks, aims to replace it with cunning, trickery and subterfuge. All the elements occurring in the image to be reassembled &amp;#8212; this armchair covered in gold brocade, that three-pointed black hat with its rather ruined black plume, or that silver-braided bright yellow livery &amp;#8212; serve by design as points of departure for trails that lead to false information.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;hr /&gt;

&lt;p&gt;My thanks to Tim Beard for scanning a couple of pages from his edition of &lt;a href="http://en.wikipedia.org/wiki/La_Vie_mode_d%27emploi" title="Life A User's Manual, Wikipedia"&gt;La Vie, Mode d&amp;#8217;Emploi&lt;/a&gt;. I wanted to know what the &amp;#8220;little chaps&amp;#8221; etc. were before they got translated into English. I realise my favicon &lt;img src="http://wordaligned.org/images/favicon.png" alt="Little chap favicon"/&gt; could be &lt;a href="http://typophile.com/node/60577" title="Astonishing exploded view of improved YouTube favicon"&gt;improved&lt;/a&gt; but I don&amp;#8217;t know how to go about it. Anyone?
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube1.png" width="52px" height="26px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube2.png" width="104px" height="52px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;&lt;a href="http://typophile.com/node/60577" title="YouTube Favicon - improved"&gt;&lt;img src="http://wordaligned.org/images/youtube3.png" width="208px" height="104px" alt="YouTube Favicon - improved"/&gt;&lt;/a&gt;
&lt;/p&gt;</description>
<dc:date>2009-09-16</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/favicon</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/meZDMcEn2Cg/favicon</link>
<category>Self</category>
<category>Perec</category>
<category>Puzzles</category>
<feedburner:origLink>http://wordaligned.org/articles/favicon</feedburner:origLink></item>

<item>
<title>Code Rot</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocdvbcodec-fail" name="toc0" id="toc0"&gt;Dvbcodec Fail&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toccode-rot" name="toc1" id="toc1"&gt;Code Rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocstandard-c" name="toc2" id="toc2"&gt;Standard C++&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocsupport-rot" name="toc3" id="toc3"&gt;Support Rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tochow-did-this-ever-compile" name="toc4" id="toc4"&gt;How did this ever compile?&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocboost-advances" name="toc5" id="toc5"&gt;Boost advances&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocchanging-behaviour" name="toc6" id="toc6"&gt;Changing behaviour&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toccode-inaction" name="toc7" id="toc7"&gt;Code inaction&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocrotten-artefacts" name="toc8" id="toc8"&gt;Rotten artefacts&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocstopping-the-rot" name="toc9" id="toc9"&gt;Stopping the rot&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/code-rot#tocthanks" name="toc10" id="toc10"&gt;Thanks&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;blockquote&gt;&lt;p&gt;Those of us who have to tiptoe around non-standard or ancient compilers will know that template template parameters are off limits. 
&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://www.oxyware.com/CheckedInt.pdf" title="CheckedInt: A policy-based range-checked integer, Hubert Matthews"&gt;Hubert Matthews (PDF)&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc0" name="tocdvbcodec-fail" id="tocdvbcodec-fail"&gt;Dvbcodec Fail&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Long ago, way back in 2004, I wrote an &lt;a href="http://wordaligned.org/docs/dvbcodec/index.html"&gt;article&lt;/a&gt; for &lt;a href="http://accu.org/index.php/journals/241" title="A Mini-project to decode a Mini-Language, Thomas Guest"&gt;Overload&lt;/a&gt; describing how to use the &lt;a href="http://www.boost.org/doc/libs/1_39_0/libs/spirit/index.html" title="Boost Spirit C++ parser framework"&gt;Boost Spirit&lt;/a&gt; parser framework to generate C++ code which could convert structured binary data to text. I went on to republish this article on my own website, where I also included a source distribution.
&lt;/p&gt;
&lt;p&gt;Much has changed since then. The C++ language may not have, but compiler and platform support for it has improved considerably. Boost survives &amp;#8212; indeed, many of its libraries will feed into the next version of C++. Overload thrives, adapting to an age when printed magazines about programming are all but extinct. My old website proved less durable: I&amp;#8217;ve changed domain name and shuffled things around more than once. But you can still find the article online if you look hard enough, and recently someone did indeed find it. He, let&amp;#8217;s call him Rick, downloaded the source code archive, &lt;a href="http://wordaligned.org/docs/dvbcodec/dvbcodec-1.0.zip" title="Rotten dvbcodec source distribution"&gt;dvbcodec-1.0.zip&lt;/a&gt;, extracted it, scanned the README, typed:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ make

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&amp;#8230; and discovered the code didn&amp;#8217;t even build.
&lt;/p&gt;
&lt;p&gt;At this point many of us would assume (correctly) the code had not been maintained. We&amp;#8217;d delete it and write off the few minutes it took to evaluate it. Rick decided instead to contact me and let me know my code was broken. He even offered a fix for one problem.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc1" name="toccode-rot" id="toccode-rot"&gt;Code Rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Sad to say, I wasn&amp;#8217;t entirely surprised. I no longer use this code. Unused code stops working. It decays.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m not talking about a compiled executable, which the compiler has tied to a particular platform, and which therefore progressively degrades as the platform advances. (I&amp;#8217;ve heard stories about device drivers for which the source code has long been lost, and which require ever more elaborate emulation shims to keep them alive.) I&amp;#8217;m talking about source code. And the decay isn&amp;#8217;t usually literal, though I suppose you might have a source listing on a mouldy printout, or an unreadable floppy disk.
&lt;/p&gt;
&lt;p&gt;No, the code itself is usually a pristine copy of the original. Publishers often attach checksums to source distributions so readers can verify their download is correct. I hadn&amp;#8217;t taken this precaution with my &lt;code&gt;dvbcodec-1.0.zip&lt;/code&gt; but I&amp;#8217;m certain the version Rick downloaded was exactly the same as the one I created 5 years ago. Yet in that time it had stopped working. Why?
&lt;/p&gt;
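&lt;p&gt;For the record, attaching a checksum would have been cheap. Here is a minimal sketch in Python, which is not something that shipped with the original archive; the helper name &lt;code&gt;sha256_of&lt;/code&gt; and the choice of SHA-256 are assumptions made for illustration.&lt;/p&gt;

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

&lt;p&gt;A reader would compare the digest of their download with the one published beside it. Matching digests show the bytes are unchanged; they say nothing about whether the code still builds.&lt;/p&gt;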
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc2" name="tocstandard-c" id="tocstandard-c"&gt;Standard C++&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As already mentioned, this was C++ code. C++ is backed by an ISO standard, ratified in 1998, with corrigenda published in 2003. You might expect C++ code to improve with age, compiling and running more quickly, less likely to run out of resources.
&lt;/p&gt;
&lt;p&gt;Not so. My favourite counter-example comes from a nice paper &lt;a href="http://www.oxyware.com/CheckedInt.pdf" title="CheckedInt: A policy-based range-checked integer, Hubert Matthews"&gt;&amp;#8220;CheckedInt: A policy-based range-checked integer&amp;#8221; (PDF)&lt;/a&gt; published by Hubert Matthews in 2004 which discusses how to use C++ templates to implement a range-checked integer. The paper includes a source code listing together with some notes to help readers forced to &amp;#8220;tiptoe around non-standard or ancient compilers&amp;#8221; (think: MSVC6). Yet when I experimented with this code in 2005 I found myself tripped up by a strict and up-to-date compiler.
&lt;/p&gt;
&lt;pre&gt;
$ g++ -Wall -c checked_int.cpp
checked_int.cpp: In constructor `CheckedInt&amp;lt;low, high, ValueChecker&amp;gt;::CheckedInt(int)':
checked_int.cpp:45: error: there are no arguments to `RangeCheck' that
depend on a template parameter, so a declaration of `RangeCheck' must
be available
checked_int.cpp:45: error: (if you use `-fpermissive', G++ will accept
your code, but allowing the use of an undeclared name is deprecated)
&lt;/pre&gt;

&lt;p&gt;I emailed Hubert Matthews using the address included at the top of his paper. He swiftly and kindly put me straight on how to fix the problem.
&lt;/p&gt;
&lt;p&gt;What&amp;#8217;s interesting here is that this code is pure C++, just over a page of it. It has no dependencies on third party libraries. Hubert Matthews is a C++ expert and he acknowledges the help of two more experts, &lt;a href="http://erdani.org" title="Author of Modern C++ and coauthor of C++ Coding Standards"&gt;Andrei Alexandrescu&lt;/a&gt; and &lt;a href="http://curbralan.com" title="Programming guru"&gt;Kevlin Henney&lt;/a&gt;, in his paper. Yet the code fails to build using both ancient and modern compilers. In its published form it has the briefest of shelf-lives.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc3" name="tocsupport-rot" id="tocsupport-rot"&gt;Support Rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Code alone is of limited use. What really matters for its ongoing health is that someone cares about it &amp;#8212; someone exercises, maintains and supports it. Hubert Matthews included an email address in his paper and I was able to contact him using that address.
&lt;/p&gt;
&lt;p&gt;How well would my code shape up on this front? Putting myself in Rick&amp;#8217;s position, I unzipped the source distribution I&amp;#8217;d archived 5 years ago. I was pleased to find a README which, at the very top, provides a URL for updates, &lt;a href="http://homepage.ntlworld.com/thomas.guest"&gt;http://homepage.ntlworld.com/thomas.guest&lt;/a&gt;. I was less pleased to find this URL gave me a &lt;strong&gt;404 Not Found&lt;/strong&gt; error. Similarly, when I tried emailing the project maintainer mentioned in the README, I got a &lt;strong&gt;550 Invalid recipient&lt;/strong&gt; error: the attempted delivery to &lt;a href="mailto:thomas.guest@ntlworld.com"&gt;thomas.guest@ntlworld.com&lt;/a&gt; had failed permanently.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://homepage.ntlworld.com/thomas.guest"&gt;&lt;img src="http://wordaligned.org/images/ntlworld-404.png" alt="NTL World 404" width="520px" height="400px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.w3.org/Provider/Style/URI" title="Tim Berners-Lee's classic advice on creating stable links"&gt;Cool URIs don&amp;#8217;t change&lt;/a&gt; but my old NTL homepage was anything but cool; it came for free with a dial-up connection I&amp;#8217;ve happily since abandoned. Looking back, maybe I should have found a more stable location for my code. If I&amp;#8217;d set up (e.g.) a Sourceforge project then my &lt;code&gt;dvbcodec&lt;/code&gt; project might still be alive and supported, possibly even by a new maintainer.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc4" name="tochow-did-this-ever-compile" id="tochow-did-this-ever-compile"&gt;How did this ever compile?&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Hindsight, however wise, wouldn&amp;#8217;t resurrect my code. If I wanted to continue I&amp;#8217;d have to go it alone. Here&amp;#8217;s what the README had to say about platform requirements.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;   REQUIREMENTS and PLATFORMS
&lt;/p&gt;
&lt;p&gt;   To build the dvbcodec you will need Version 1.31.0 of Boost, or later.
&lt;/p&gt;
&lt;p&gt;   You will also need a good C++ compiler. The dvbcodec has been built and
      tested on the Windows operating system using: GCC 3.3.1, MSVC 7.1
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;A &amp;#8220;good C++ compiler&amp;#8221;, eh? As we&amp;#8217;ve already seen, GCC 3.3.1 may be good but my platform has GCC 4.0.1 installed, which is better. If my records can be believed, this &lt;code&gt;upperCase()&lt;/code&gt; function compiled cleanly using both GCC 3.3.1 and MSVC 7.1.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;std::string
upperCase(std::string const &amp;amp; lower)
{
    std::string upper = lower;
    
    for (std::string&amp;lt;char&amp;gt;::iterator cc = upper.begin();
         cc != upper.end(); ++cc)
    {
        * cc = std::toupper(* cc);
    }
    
    return upper;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Huh? The type &lt;code&gt;std::string&lt;/code&gt; is a typedef for &lt;code&gt;std::basic_string&amp;lt;char&amp;gt;&lt;/code&gt; and there&amp;#8217;s no such thing as a &lt;code&gt;std::basic_string&amp;lt;char&amp;gt;&amp;lt;char&amp;gt;::iterator&lt;/code&gt;, which is what GCC 4.0.1 says:
&lt;/p&gt;
&lt;pre&gt;
stringutils.cpp:58: error: 'std::string' is not a template
&lt;/pre&gt;

&lt;p&gt;The simple fix is to write &lt;code&gt;std::string::iterator&lt;/code&gt; instead of &lt;code&gt;std::string&amp;lt;char&amp;gt;::iterator&lt;/code&gt;. A better fix, suggested by Rick, is to use &lt;code&gt;std::transform()&lt;/code&gt;. I wonder why I missed this first time round?
&lt;/p&gt;
&lt;div class="typocode"&gt;

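&lt;pre class="prettyprint"&gt;#include &amp;lt;algorithm&amp;gt;  // std::transform
#include &amp;lt;ctype.h&amp;gt;    // ::toupper, in the global namespace
#include &amp;lt;string&amp;gt;

std::string
upperCase(std::string const &amp;amp; lower)
{
    std::string upper = lower;
    std::transform(upper.begin(), upper.end(), upper.begin(), ::toupper);
    return upper;
}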

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc5" name="tocboost-advances" id="tocboost-advances"&gt;Boost advances&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;GCC has become stricter about what it accepts even though the formal specification of what it should do (the C++ standard) has stayed put. The Boost C++ libraries have more freedom to evolve, and the next round of build problems I encountered relate to Boost.Spirit&amp;#8217;s evolution. Whilst it would be possible to require dvbcodec users to build against Boost 1.31 (which can still be downloaded from the &lt;a href="http://www.boost.org"&gt;Boost website&lt;/a&gt;) it wouldn&amp;#8217;t be reasonable. So I updated my machine (using MacPorts) to make sure I had an up-to-date version of Boost, 1.38 at the time of writing.
&lt;/p&gt;
&lt;pre&gt;
$ sudo port upgrade boost
&lt;/pre&gt;

&lt;p&gt;Boost&amp;#8217;s various dependencies triggered an upgrade of boost-jam, gperf, libiconv, ncursesw, ncurses, gettext, zlib, bzip2, and this single command took over an hour to complete.
&lt;/p&gt;
&lt;p&gt;I discovered that Boost.Spirit, the C++ parser framework on which &lt;code&gt;dvbcodec&lt;/code&gt; is based, has gone through an overhaul. According to the change log the flavour of Spirit used by &lt;code&gt;dvbcodec&lt;/code&gt; is now known respectfully as Spirit Classic. A clever use of namespaces and include path forwarding meant my &amp;#8220;classic&amp;#8221; client code would at least compile, at the expense of some deprecation warnings.
&lt;/p&gt;
&lt;pre&gt;
Computing dependencies for decodeout.cpp...
Compiling decodeout.cpp...
In file included from codectypedefs.hpp:11,
                 from decodecontext.hpp:10,
                 from decodeout.cpp:8:
/opt/local/include/boost/spirit/tree/ast.hpp:18:4: warning: #warning "This header is deprecated. Please use: boost/spirit/include/classic_ast.hpp"
In file included from codectypedefs.hpp:12,
                 from decodecontext.hpp:10,
                 from decodeout.cpp:8:
&lt;/pre&gt;

&lt;p&gt;To suppress these warnings I included the preferred header. I then had to change namespace directives from &lt;code&gt;boost::spirit&lt;/code&gt; to &lt;code&gt;boost::spirit::classic&lt;/code&gt;. I fleetingly considered porting my code to Spirit V2, but decided against it: for even after this first round of changes, I still had a build problem.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc6" name="tocchanging-behaviour" id="tocchanging-behaviour"&gt;Changing behaviour&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Actually, this was a second level build problem. The &lt;code&gt;dvbcodec&lt;/code&gt; build has multiple phases:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     it builds a program to generate code. This generator can parse binary format syntax descriptions and emit C++ code which will convert data formatted according to these descriptions
 &lt;/li&gt;

 &lt;li&gt;
     it runs this generator with the available syntax descriptions as inputs
 &lt;/li&gt;

 &lt;li&gt;
     it compiles the emitted C++ code into a final &lt;code&gt;dvbcodec&lt;/code&gt; executable
 &lt;/li&gt;
&lt;/ol&gt;
&lt;img src="http://wordaligned.org/images/dvbcodec-build.png" alt="Dvbcodec build process"/&gt;

&lt;p&gt;I ran into a problem during the second phase of this process. The dvbcodec generator no longer parsed all of the supplied syntax descriptions. Specifically, I was seeing this conditional test raise an exception when trying to parse section format syntax descriptions.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;    if (!parse(section_format,
               section_grammar,
               space_p).full)
    {
        throw SectionFormatParseException(section_format);
    }

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, &lt;code&gt;parse&lt;/code&gt; is &lt;code&gt;boost::spirit::classic::parse&lt;/code&gt;, which parses something (the section format syntax description, passed as a string in this case) according to the supplied grammar. The third parameter, &lt;code&gt;boost::spirit::classic::space_p&lt;/code&gt;, is a skip parser which tells &lt;code&gt;parse&lt;/code&gt; to skip whitespace between tokens. The call returns a &lt;code&gt;parse_info&lt;/code&gt; struct whose &lt;code&gt;full&lt;/code&gt; field is a boolean which will be set to &lt;code&gt;true&lt;/code&gt; if the input section format has been fully consumed.
&lt;/p&gt;
&lt;p&gt;I soon figured out that the parse call was failing to fully consume binary syntax descriptions with trailing spaces, such as the one shown below.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;" program_association_section() {"
"    table_id                   8"
"    section_syntax_indicator   1"
"    '0'                        1"
....
"    CRC_32                    32"
" }                              "

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If I stripped the trailing whitespace after the closing brace before calling &lt;code&gt;parse()&lt;/code&gt; all would be fine. I wasn&amp;#8217;t fine about this fix though. The Spirit documentation is very good but it had been a while since I&amp;#8217;d read it and, as already mentioned, my code used the &amp;#8220;classic&amp;#8221; version of Spirit, in danger of becoming the &amp;#8220;legacy&amp;#8221; then &amp;#8220;deprecated&amp;#8221; and eventually the &amp;#8220;dead&amp;#8221; version. Re-reading the documentation it wasn&amp;#8217;t clear to me exactly what the correct behaviour of &lt;code&gt;parse()&lt;/code&gt; should be in this case. Should it fully consume trailing space? Had my program ever worked?
&lt;/p&gt;
&lt;p&gt;I went back in time, downloading and building against Boost 1.31, and satisfied myself that my code used to work, though maybe it worked due to a bug in the old version of Spirit. Stripping trailing spaces before parsing allowed my code to work with Spirit past and present, so I curtailed my investigation and made the fix.
&lt;/p&gt;
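&lt;p&gt;The shape of the fix is easier to see away from Spirit. What follows is a rough analogy in Python, with the standard &lt;code&gt;re&lt;/code&gt; module standing in for the parser; the toy grammar, the name &lt;code&gt;parse_section&lt;/code&gt; and the shortened input are all assumptions made for illustration, not part of dvbcodec.&lt;/p&gt;

```python
import re

# Hypothetical toy grammar: a name, "() {", some "field width" pairs, then "}".
# It makes no allowance for any text after the closing brace.
SECTION = re.compile(r"\s*(\w+)\(\)\s*\{((?:\s*\S+\s+\d+)*)\s*\}")


def parse_section(text):
    """Return (name, fields); raise ValueError unless all input is consumed."""
    match = SECTION.fullmatch(text)
    if match is None:
        raise ValueError("section format not fully consumed")
    fields = re.findall(r"(\S+)\s+(\d+)", match.group(2))
    return match.group(1), [(field, int(width)) for field, width in fields]


padded = " program_association_section() {    table_id 8    CRC_32 32 }      "

# The fix: strip trailing whitespace before parsing.
name, fields = parse_section(padded.rstrip())
```

&lt;p&gt;Passing &lt;code&gt;padded&lt;/code&gt; unstripped raises the exception, because the spaces after the brace are left unconsumed. Stripping first works whichever way the parser treats trailing space, which is why the same fix suited Spirit old and new.&lt;/p&gt;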
&lt;p&gt;(Interestingly, Boost 1.31 found a way to warn me I was using a compiler it didn&amp;#8217;t know about.
&lt;/p&gt;
&lt;pre&gt;
boost_1_31_0/boost/config/compiler/gcc.hpp:92:7: warning: 
#warning "Unknown compiler version - please run the configure tests and report the results"
&lt;/pre&gt;

&lt;p&gt;I ignored this warning.)
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc7" name="toccode-inaction" id="toccode-inaction"&gt;Code inaction&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Apologies for the lengthy explanation in the previous section. The point is, few software projects stand alone, and changes in any dependencies, &lt;strong&gt;including bug fixes&lt;/strong&gt;, can have knock-on effects. In this instance, I consider myself lucky; &lt;code&gt;dvbcodec&lt;/code&gt;&amp;#8217;s unusual three-phase build enabled me to catch a runtime error before generating the final product. Of course, to actually catch that error, I needed to at least try building my code.
&lt;/p&gt;
&lt;p&gt;More simply: if you don&amp;#8217;t use your code, it rots.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc8" name="tocrotten-artefacts" id="tocrotten-artefacts"&gt;Rotten artefacts&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;It wasn&amp;#8217;t just the code which had gone off. My source distribution included documentation &amp;#8212; the plain text version of the article I&amp;#8217;d written for Overload &amp;#8212; and the Makefile had a build target to generate an HTML version of this documentation. This target depended on &lt;a href="http://www.boost.org/doc/tools/quickbook/index.html" title="Quickbook, a Boost documentation tool"&gt;Quickbook&lt;/a&gt;, another Boost tool. Quickbook generates Docbook XML from plain text source, and Docbook is a good starting point for HTML, PDF and other standard output formats.
&lt;/p&gt;
&lt;p&gt;This is quite a sophisticated toolchain. It&amp;#8217;s also one I no longer use. Most of what I write goes straight to the web and I don&amp;#8217;t need such a fiddly process just to produce HTML. So I decided to freshen up dead links, leave the original documentation as a record, and simply cut the documentation target from the Makefile.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc9" name="tocstopping-the-rot" id="tocstopping-the-rot"&gt;Stopping the rot&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;As we&amp;#8217;ve seen, software, like other soft organic things, breaks down over time. How can we stop the rot?
&lt;/p&gt;
&lt;p&gt;Freezing software to a particular executable built against a fixed set of dependencies to run on a single platform is one way &amp;#8212; and maybe some of us still have an aging Windows 95 machine, kept alive purely to run some such frozen program.
&lt;/p&gt;
&lt;p&gt;A better solution is to actively tend the software and ensure it stays in shape. Exercise it regularly on a build server. Record test results. Fix faults as and when they appear. Review the architecture. Upgrade the platform and dependencies. Prune unused features, splice in new ones. This is the path taken by the Boost project, though certainly the growth far outpaces any pruning (the Boost 1.39 download is 5 times bigger than its 1.31 ancestor). Boost takes forwards and backwards compatibility seriously, hence the ongoing support for Spirit classic and the compiler version certification headers. Maintaining compatibility can be at odds with simplicity.
&lt;/p&gt;
&lt;p&gt;There is another way too. Although the &lt;code&gt;dvbcodec&lt;/code&gt; project has collapsed into disrepair the idea behind it certainly hasn&amp;#8217;t. I&amp;#8217;ve taken this same idea, of parsing formal syntax descriptions to generate code which handles binary formatted data, and enhanced it to work more flexibly and with a wider range of inputs. Whenever I come across a new binary data structure, I paste its syntax into a text file, regenerate the code, and I can work with this structure. Unfortunately I can&amp;#8217;t show you any code (it&amp;#8217;s proprietary) but I hope I&amp;#8217;ve shown you the idea. Effectively, the old C++ code has been left to rot but the idea within it remains green, recoded in Python. Maybe I should find a way to humanely destroy the C++ and all links to it, but for now I&amp;#8217;ll let it degrade, an illustration of its time.
&lt;/p&gt;
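&lt;p&gt;To give a flavour of the idea, here is a toy sketch; it is emphatically not the proprietary code. It assumes a syntax description has already been reduced to (name, width in bits) pairs, and the function name &lt;code&gt;decode&lt;/code&gt;, the field names &lt;code&gt;zero&lt;/code&gt;, &lt;code&gt;reserved&lt;/code&gt; and &lt;code&gt;section_length&lt;/code&gt;, and the sample bytes are all hypothetical.&lt;/p&gt;

```python
def decode(fields, data):
    """Decode bytes into a dict, given (name, width_in_bits) pairs in order."""
    # Render the data as one long string of binary digits, then slice it up.
    bits = "".join(format(byte, "08b") for byte in data)
    values, pos = {}, 0
    for name, width in fields:
        values[name] = int(bits[pos:pos + width], 2)
        pos += width
    return values


# Hypothetical field list, as a parsed syntax description might supply it.
FIELDS = [
    ("table_id", 8),
    ("section_syntax_indicator", 1),
    ("zero", 1),
    ("reserved", 2),
    ("section_length", 12),
]

header = decode(FIELDS, bytes([0x00, 0xB0, 0x0D]))
print(header["section_length"])  # prints 13
```

&lt;p&gt;Slicing a string of binary digits is slow but easy to check by eye, which suits a sketch. A new structure needs only a new field list; the decoding code does not change.&lt;/p&gt;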
&lt;blockquote&gt;&lt;p&gt;Is it possible that software is not like anything else, that it is meant to be discarded: that the whole point is to see it as a soap bubble? &amp;#8212; &lt;a href="http://www.cs.yale.edu/quotes.html"&gt;Alan J. Perlis&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/code-rot#toc10" name="tocthanks" id="tocthanks"&gt;Thanks&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;I would like to thank Rick Engelbrecht for reporting and helping to fix the bugs discussed in this article.
&lt;/p&gt;
&lt;p&gt;This article first appeared in Overload 92, and I would like to thank the team at Overload for their expert help.
&lt;/p&gt;</description>
<dc:date>2009-09-03</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/code-rot</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/_PKUCHV6Xms/code-rot</link>
<category>C++</category>
<category>Build</category>
<feedburner:origLink>http://wordaligned.org/articles/code-rot</feedburner:origLink></item>

<item>
<title>A useful octal escape sequence</title>
<description>&lt;p&gt;&lt;a href="http://wordaligned.org/articles/integer-literal-values"&gt;Previously&lt;/a&gt; I grumbled about &lt;a href="http://wordaligned.org/articles/octal-literals"&gt;octal integer literals&lt;/a&gt;, suggesting:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     they aren&amp;#8217;t much use, not when you&amp;#8217;ve got hexadecimal and binary literals.
 &lt;/li&gt;

 &lt;li&gt;
     they&amp;#8217;re risky. As Doug Napoleone noted in a &lt;a href="http://wordaligned.org/articles/integer-literal-values#comment-14394465"&gt;comment&lt;/a&gt;, &lt;code&gt;011&lt;/code&gt; is all too easily read as eleven rather than nine.
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Python 3 attempts to patch the readability issue: octal nine must be written as &lt;code&gt;0o11&lt;/code&gt; or &lt;code&gt;0O11&lt;/code&gt;. Choose your fonts with care, &lt;code&gt;O&lt;/code&gt; and &lt;code&gt;0&lt;/code&gt; look similar!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; 0o11, 0O11
(9, 9)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Octal numbers can also appear in escape sequences embedded in &lt;a href="http://docs.python.org/py3k/reference/lexical_analysis.html#string-and-bytes-literals"&gt;string and bytes literals&lt;/a&gt;. The syntax hasn&amp;#8217;t changed for Python 3 and the rules are as in C: the escape sequence &lt;code&gt;\OOO&lt;/code&gt; embeds a character/byte with octal value &lt;code&gt;OOO&lt;/code&gt;. Up to three octal digits are allowed in an octal escape sequence.
&lt;/p&gt;
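&lt;p&gt;As a quick illustration of these rules, here is a minimal sketch for Python 3. The variable names are made up for this example; the point is that an escape stops after three octal digits, so a fourth digit is an ordinary character.
&lt;/p&gt;

```python
# Octal escape sequences in Python 3 string and bytes literals.
letter = '\101'    # octal 101 is decimal 65, the code point of 'A'
nul = b'\0'        # a single octal digit: the NUL byte
greedy = '\1234'   # at most three digits are consumed: '\123' ('S') then '4'

print(letter, nul, len(greedy))
```

&lt;p&gt;The last literal is two characters long, not one.
&lt;/p&gt;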
&lt;p&gt;&lt;img src="http://wordaligned.org/images/buttons/octopus-small.jpg" alt="Smart mollusc"/&gt;
   &lt;img src="http://wordaligned.org/images/buttons/octopus-small.jpg" alt="Smart mollusc"/&gt;
   &lt;img src="http://wordaligned.org/images/buttons/octopus-small.jpg" alt="Smart mollusc"/&gt;
&lt;/p&gt;
&lt;p&gt;Like octal integers, these octal escape sequences may appear to be of limited use &amp;#8212; a syntactic oddity rarely seen in the wild &amp;#8212; but in fact there&amp;#8217;s one particular use case which is so common we don&amp;#8217;t even notice it.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;char const terminator = '\0';

&lt;/pre&gt;

&lt;/div&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=xT78YAcBekQ:2ioA9BJQrLE:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=xT78YAcBekQ:2ioA9BJQrLE:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=xT78YAcBekQ:2ioA9BJQrLE:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=xT78YAcBekQ:2ioA9BJQrLE:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=xT78YAcBekQ:2ioA9BJQrLE:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=xT78YAcBekQ:2ioA9BJQrLE:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=xT78YAcBekQ:2ioA9BJQrLE:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/xT78YAcBekQ" height="1" width="1"/&gt;</description>
<dc:date>2009-08-09</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/a-useful-octal-escape-sequence</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/xT78YAcBekQ/a-useful-octal-escape-sequence</link>
<category>Python</category>
<category>C</category>
<feedburner:origLink>http://wordaligned.org/articles/a-useful-octal-escape-sequence</feedburner:origLink></item>

<item>
<title>Converting integer literals in C++ and Python</title>
<description>&lt;p&gt;An integral literal in a C program can be decimal, hexadecimal or octal.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;int percent = 110;
unsigned flags = 0x80;
unsigned agent = 007;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This snippet could equally be written as, for example:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;int percent = 0156;
unsigned flags = 128;
unsigned agent = 0x7;

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;So programmers can choose the best of these options when including numbers in their code.
&lt;/p&gt;
&lt;p&gt;Python adopted this same C syntax, but has recently gone on to extend and modify it. Some Python 2.6 numbers:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;Python 2.6
&amp;gt;&amp;gt;&amp;gt; 0x80, 110, 007, 0O7, 0o7, 0b10000000
(128, 110, 7, 7, 7, 128)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I&amp;#8217;m pleased to see support for binary literals, which are useful for (e.g.) bitmasks. I&amp;#8217;ve never really seen the point of &lt;a href="http://wordaligned.org/articles/octal-literals"&gt;octals&lt;/a&gt;; nonetheless, they&amp;#8217;ve been enhanced for Python 3. Python 2.6 backports the new improved &lt;a href="http://www.python.org/dev/peps/pep-3127"&gt;octal literal syntax&lt;/a&gt; whilst retaining support for classic C-style octals. Python 3 drops C-style octals.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;Python 3.1
&amp;gt;&amp;gt;&amp;gt; 007
  File "&amp;lt;stdin&amp;gt;", line 1
    007
      ^
SyntaxError: invalid token
&amp;gt;&amp;gt;&amp;gt; 0O7
7

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now consider the compiler/interpreter writer&amp;#8217;s problem. Clearly it must be possible to take a string representing an integer literal and work out what number it represents. At first glance, the &lt;a href="http://docs.python.org/library/functions.html#int"&gt;int()&lt;/a&gt; builtin isn&amp;#8217;t quite smart enough to do the job without us supplying an explicit base for the conversion:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; int('0xff')
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
ValueError: invalid literal for int() with base 10: '0xff'
&amp;gt;&amp;gt;&amp;gt; int('0xff', 16)
255

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We might consider reading any prefix from the literal and dispatching the string to an appropriate handler. Something like this:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def integer_literal_value(s):
    if s.startswith('0x'):
        return int(s, 16)
    if s.startswith('0b'):
        return int(s, 2)
    ...

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Yuck! Surely there&amp;#8217;s an easier way to do something this fundamental? Well, there&amp;#8217;s always &lt;a href="http://docs.python.org/library/functions.html#eval"&gt;eval()&lt;/a&gt;, which turns the interpreter on itself.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; def integer_literal_value(s): return eval(s)
... 
&amp;gt;&amp;gt;&amp;gt; v = integer_literal_value
&amp;gt;&amp;gt;&amp;gt; v('0x80'), v('0o7'), v('0b1010101'), v('42')
(128, 7, 85, 42)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We should have looked more carefully at the &lt;a href="http://docs.python.org/library/functions.html#int"&gt;int()&lt;/a&gt; documentation:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;strong&gt;int([x[, radix]])&lt;/strong&gt; &amp;#8230; The &lt;em&gt;radix&lt;/em&gt; parameter gives the base for the conversion (which is 10 by default) and may be any integer in the range [2, 36], or zero. If &lt;em&gt;radix&lt;/em&gt; is zero, the proper &lt;em&gt;radix&lt;/em&gt; is determined based on the contents of string; the interpretation is the same as for integer literals. 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Perfect!
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from functools import partial
&amp;gt;&amp;gt;&amp;gt; integer_literal_value = partial(int, base=0)
&amp;gt;&amp;gt;&amp;gt; v = integer_literal_value
&amp;gt;&amp;gt;&amp;gt; v('0x80'), v('0o7'), v('0b1010101'), v('42')
(128, 7, 85, 42)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;(Notice, by the way, that &lt;em&gt;radix&lt;/em&gt; is used in the online documentation but the actual argument name is &lt;em&gt;base&lt;/em&gt;. I&amp;#8217;ll confess that before I wrote this note I hadn&amp;#8217;t spotted this use of zero as a special value for string&amp;rarr;integer conversions even though it&amp;#8217;s been available since Python 2.1.)
&lt;/p&gt;
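&lt;p&gt;Since base zero follows the interpreter&amp;#8217;s own literal rules, the result depends on which Python you run. The following is only a sketch, assuming Python 3, where a C-style octal such as &lt;code&gt;007&lt;/code&gt; is rejected; &lt;code&gt;literal_or_none&lt;/code&gt; is a hypothetical helper name invented for the illustration.
&lt;/p&gt;

```python
def literal_or_none(s):
    """Return the value of integer literal s, or None if int() rejects it.

    Hypothetical helper for illustration only. Base 0 asks int() to infer
    the base from the prefix, using Python 3 integer literal rules.
    """
    try:
        return int(s, 0)
    except ValueError:
        return None

# '007' is a valid literal in C and Python 2, but not in Python 3.
values = [literal_or_none(s) for s in ('0x80', '0o7', '0b1010101', '42', '007')]
print(values)
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; is just a convenience for the demonstration; real code would more likely let the &lt;code&gt;ValueError&lt;/code&gt; propagate.
&lt;/p&gt;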
&lt;p&gt;C++ also offers a way to convert integer literals into the numbers they represent, but it&amp;#8217;s not very well known. As is usual for format conversions, we use streams &amp;#8212; stringstreams typically, but here I show an example using standard input and output. The trick is to clear the input stream&amp;#8217;s &lt;code&gt;basefield&lt;/code&gt; flags, so that the base of each number is deduced from its prefix.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;integer_literal_value.cpp&lt;/div&gt;

&lt;pre class="prettyprint"&gt;#include &amp;lt;iostream&amp;gt;

int main()
{
    int x;
    std::cin.unsetf(std::ios::basefield);
    while (std::cin &amp;gt;&amp;gt; x)
    {
        std::cout &amp;lt;&amp;lt; x &amp;lt;&amp;lt; '\n';
    }
    return std::cin.eof() ? 0 : 1;
}

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It &lt;a href="http://www.cplusplus.com/reference/iostream/ios_base/fmtflags"&gt;works by magic&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ g++ integer_literal_value.cpp -o integer_literal_value
$ echo 007 0x80 110 | ./integer_literal_value
7
128
110

&lt;/pre&gt;

&lt;/div&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ggImbbVC4Oc:TAJVcluUmBo:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ggImbbVC4Oc:TAJVcluUmBo:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ggImbbVC4Oc:TAJVcluUmBo:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ggImbbVC4Oc:TAJVcluUmBo:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ggImbbVC4Oc:TAJVcluUmBo:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=ggImbbVC4Oc:TAJVcluUmBo:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=ggImbbVC4Oc:TAJVcluUmBo:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/ggImbbVC4Oc" height="1" width="1"/&gt;</description>
<dc:date>2009-08-06</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/integer-literal-values</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/ggImbbVC4Oc/integer-literal-values</link>
<category>Python</category>
<category>C++</category>
<feedburner:origLink>http://wordaligned.org/articles/integer-literal-values</feedburner:origLink></item>

<item>
<title>Inner, Outer, Shake it all abouter</title>
<description>&lt;p&gt;C++ programmers enjoy three levels of access control: &lt;a href="http://www.parashift.com/c++-faq-lite/basics-of-inheritance.html#faq-19.5"&gt;private, protected and public&lt;/a&gt;. Some programmers use protected instead of private just in case someone might want to derive from their class some day. Others keep everything as private as possible, hiding nested classes in anonymous namespaces; these inner classes never seem to work quite the way you&amp;#8217;d want, but if you get tangled up a &lt;a href="http://www.parashift.com/c++-faq-lite/friends.html"&gt;friend&lt;/a&gt; can cut through the knots!
&lt;/p&gt;
&lt;p&gt;Python is less sophisticated. Prefix class members with a double underscore and &lt;a href="http://docs.python.org/tutorial/classes.html#private-variables"&gt;their names are disguised&lt;/a&gt; to the world outside. Prefix module members with a single underscore to indicate they won&amp;#8217;t be exported from that module. Many Python programmers use a single underscore prefix in classes too (no mangling but better looking).
&lt;/p&gt;
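&lt;p&gt;To make the disguise concrete, here is a minimal sketch. The &lt;code&gt;Account&lt;/code&gt; class and its attributes are invented for the illustration; the mangling scheme itself (an underscore and the class name prepended to the attribute name) is standard Python behaviour.
&lt;/p&gt;

```python
class Account:
    """Made-up class showing both underscore conventions."""

    def __init__(self):
        self.__pin = 1234    # double underscore: stored as _Account__pin
        self._balance = 0    # single underscore: a convention, nothing more

account = Account()
mangled = '_Account__pin' in vars(account)   # the disguised name is present
plain = '__pin' in vars(account)             # the name as written is not
print(mangled, plain, account._balance)
```

&lt;p&gt;Nothing is truly hidden: anyone who knows the scheme can still reach &lt;code&gt;account._Account__pin&lt;/code&gt;.
&lt;/p&gt;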
&lt;p&gt;&lt;a href="http://accu.org"&gt;&lt;img style="float:right" src="http://accu.org/content/images/buttonl_120x60.gif" width="120px" alt="ACCU Button" height="60px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Once you&amp;#8217;ve used Python for a while you may well question the benefits of the C++ model. Recently a C++ question came up on the &lt;a href="http://accu.org/index.php/mailinglists"&gt;accu-general&lt;/a&gt; mailing list. It involved nested classes, &lt;code&gt;operator&amp;lt;&amp;lt;()&lt;/code&gt; and code which refused to compile. You&amp;#8217;ll have to trawl through the list archives if you want the exact question: I didn&amp;#8217;t give it much attention since it seemed an example of the kind of struggle with the language which causes me to throw in the towel. I would like to quote from one of the answers though.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Surely your issue is that f() is a friend of Inner only. f() is &lt;b&gt;not&lt;/b&gt; a friend of Outer. Inner is private to Outer. Therefore in the global scope, outside Outer, f() cannot access Inner via Outer::Inner, as that is private.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Wow, some &lt;a href="http://yosefk.com/c++fqa/class.html#fqa-7.7" title="#define private public from the C++ FQA"&gt;brain twister&lt;/a&gt;!
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.kleinbottle.com"&gt;&lt;img src="http://www.kleinbottle.com/images/giantKleinbotandCliff2.jpg" alt="Giant Klein bottle and Cliff Stoll" width="240px" height="461px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Time to get back to basics.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Encapsulation is about allocating responsibility and easing utility rather than protecting data, which is a side effect. &amp;#8212; &lt;a href="http://twitter.com/KevlinHenney/status/2684963420"&gt;@KevlinHenney&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;p&gt;My thanks to &lt;a href="http://libjmmcg.sourceforge.net"&gt;Jason McGuiness&lt;/a&gt; for allowing me to quote from his expert answer to a tricky C++ question. The photo shows Cliff Stoll holding the world&amp;#8217;s largest glass Klein bottle, which was produced by the company he owns, operates and mismanages, &lt;a href="http://www.kleinbottle.com"&gt;Acme Klein Bottles&lt;/a&gt;. Klein bottles get a mention here because they don&amp;#8217;t have an outside or an inside, they just have a side. You can solve every computing problem with an extra dimension, except the problem of too many dimensions.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=HiGdlx2EX5I:kwIXKDuljHo:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=HiGdlx2EX5I:kwIXKDuljHo:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=HiGdlx2EX5I:kwIXKDuljHo:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=HiGdlx2EX5I:kwIXKDuljHo:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=HiGdlx2EX5I:kwIXKDuljHo:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=HiGdlx2EX5I:kwIXKDuljHo:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=HiGdlx2EX5I:kwIXKDuljHo:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/HiGdlx2EX5I" height="1" width="1"/&gt;</description>
<dc:date>2009-07-29</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/inner-outer-shake-it-all-abouter</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/HiGdlx2EX5I/inner-outer-shake-it-all-abouter</link>
<category>C++</category>
<category>Python</category>
<category>ACCU</category>
<feedburner:origLink>http://wordaligned.org/articles/inner-outer-shake-it-all-abouter</feedburner:origLink></item>

<item>
<title>Blackmail made easy using Python counters</title>
<description>&lt;div class="toc"&gt;
&lt;h2&gt;Contents&lt;/h2&gt;
&lt;ul&gt;
 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocthe-obsessive-blackmailer" name="toc0" id="toc0"&gt;The Obsessive Blackmailer&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocmodeling-the-problem" name="toc1" id="toc1"&gt;Modeling the Problem&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocthe-evolution-of-multisets-in-python" name="toc2" id="toc2"&gt;The evolution of multisets in Python&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocwait-theres-more" name="toc3" id="toc3"&gt;Wait, there&amp;#8217;s more!&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocback-to-blackmail" name="toc4" id="toc4"&gt;Back to Blackmail&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocgeneric-code" name="toc5" id="toc5"&gt;Generic Code&lt;/a&gt;
 &lt;/li&gt;

 &lt;li&gt;&lt;a href="http://wordaligned.org/articles/python-counters#tocend-of-message" name="toc6" id="toc6"&gt;End of Message&lt;/a&gt;
 &lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc0" name="tocthe-obsessive-blackmailer" id="tocthe-obsessive-blackmailer"&gt;The Obsessive Blackmailer&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;An obsessive blackmailer writes anonymous messages by cutting and pasting letters from newspapers. Being obsessive, the blackmailer only writes messages which can be composed entirely from a single newspaper.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/3754867981/" title="word aligned by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm3.static.flickr.com/2567/3754867981_a752d15f74_o.png" width="480" height="309" alt="word aligned" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Devise an algorithm which determines whether a given message can be written using a given newspaper.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc1" name="tocmodeling-the-problem" id="tocmodeling-the-problem"&gt;Modeling the Problem&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;This is a nice little problem but I&amp;#8217;m about to spoil it since I&amp;#8217;m using it here as a study in Python&amp;#8217;s evolution. So if you&amp;#8217;d like to try it yourself, look away now.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;We can represent both inputs to the algorithm as sequences of characters: a message string, length M, and a newspaper string, length N. We &lt;em&gt;could&lt;/em&gt; process the message string one character at a time, at each step scanning through the newspaper and noting the first occurrence of that character we haven&amp;#8217;t used before; but this is inefficient since we potentially read the whole paper M times.
&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s better to think of this problem in terms of multisets, sometimes known as bags. A multiset is a set which can have repeated elements. Our blackmailer can proceed if the multiset of letters used in the message is contained entirely within the multiset of letters used in the newspaper.
&lt;/p&gt;

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc2" name="tocthe-evolution-of-multisets-in-python" id="tocthe-evolution-of-multisets-in-python"&gt;The evolution of multisets in Python&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;A dictionary provides a compact and efficient way to represent a multiset in Python: each dictionary key represents an item in the multiset, and the value associated with that key is the number of times the key appears in the multiset. Python dictionaries are implemented as hashed arrays, meaning that member insertion and access take constant time, on average.
&lt;/p&gt;
&lt;p&gt;It&amp;#8217;s not hard to create such a multiset from a sequence but it&amp;#8217;s interesting to see how advances in the Python language have simplified the code. 
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://python.org/doc/1.4/lib/node13.html#SECTION00316000000000000000"&gt;&lt;img src="http://python.org/doc/1.4/lib/img7.gif" height="181px" width="469px"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The complete documentation for &lt;a href="http://python.org/doc/1.4/"&gt;Python 1.4&lt;/a&gt;, released in 1996, is still available on the Python website. In version 1.4 you could write:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def multiset_14(xs):
    multiset = {}
    for x in xs:
        if multiset.has_key(x):
            multiset[x] = multiset[x] + 1
        else:
            multiset[x] = 1
    return multiset

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This code works unchanged in the current Python release, 2.6 (though note &lt;code&gt;dict.has_key()&lt;/code&gt; doesn&amp;#8217;t exist in Python 3.*). Alternatively, you might catch the &lt;code&gt;KeyError&lt;/code&gt; raised when trying to access the dict with an invalid key:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def multiset_14(xs):
    multiset = {}
    for x in xs:
        try:
            multiset[x] = multiset[x] + 1
        except KeyError:
            multiset[x] = 1
    return multiset

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Python 1.5 introduces an exception-free dictionary access method, &lt;code&gt;dict.get()&lt;/code&gt;, which returns a user supplied default (defaulting to &lt;code&gt;None&lt;/code&gt;) for missing keys.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def multiset_15(xs):
    multiset = {}
    for x in xs:
        multiset[x] = multiset.get(x, 0) + 1
    return multiset

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It&amp;#8217;s certainly shorter, a little cleaner maybe, but perhaps it takes more effort for readers to see what exactly is going on.
&lt;/p&gt;
&lt;p&gt;At Python 2.2, &lt;code&gt;x in multiset&lt;/code&gt; improves on the equivalent &lt;code&gt;multiset.has_key(x)&lt;/code&gt; and we can use augmented arithmetic operators (&lt;code&gt;+=, -=, *=, /=, %=, **=, &amp;lt;&amp;lt;=, &amp;gt;&amp;gt;=, &amp;amp;=, ^=, |=&lt;/code&gt;), allowing:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def multiset_22(xs):
    multiset = {}
    for x in xs:
        if x in multiset:
            multiset[x] += 1
        else:
            multiset[x] = 1
    return multiset

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I think I prefer the &lt;code&gt;dict.get()&lt;/code&gt; version, though.
&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;collections&lt;/code&gt; module makes its first appearance in Python 2.4 offering a &lt;code&gt;deque&lt;/code&gt; and a promise of more high performance container types to come. Python 2.5 makes good on this promise, adding &lt;code&gt;defaultdict&lt;/code&gt; to the module. A &lt;code&gt;defaultdict&lt;/code&gt; is a specialised dictionary which calls a client supplied factory function for missing keys. Setting this factory function to &lt;code&gt;int&lt;/code&gt; turns the &lt;code&gt;defaultdict&lt;/code&gt; into a multiset. No need for &lt;code&gt;dict.get()&lt;/code&gt; any more.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from collections import defaultdict

def multiset_25(xs):
    multiset = defaultdict(int)
    for x in xs:
        multiset[x] += 1
    return multiset

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc3" name="tocwait-theres-more" id="tocwait-theres-more"&gt;Wait, there&amp;#8217;s more!&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;The final improvement is available in Python 3.1 right now (or in Python 2.7, coming soon), courtesy once again of the collections module. &lt;a href="http://docs.python.org/dev/library/collections.html#collections.Counter"&gt;&lt;code&gt;collections.Counter&lt;/code&gt;&lt;/a&gt; is exactly what we&amp;#8217;ve been waiting for.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from collections import Counter

def multiset_31(xs):
    return Counter(xs)

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc4" name="tocback-to-blackmail" id="tocback-to-blackmail"&gt;Back to Blackmail&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;So our blackmailer should first generate a multiset representation of the letters in the message. Then it&amp;#8217;s a matter of iterating through the newspaper and reducing the multiset each time a letter matches up. We keep a tally of the number of letters we still need to match, and stop when this tally is zero or when we get to the end of the newspaper. Here&amp;#8217;s a sketch of an implementation.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def blackmailable(message, newspaper):
    """Return True if newspaper can be used to write the blackmail 
    message, False otherwise.
    """
    m = len(message)
    if m == 0:
        return True
    counts = multiset(message)
    for ch in newspaper:
        if counts[ch] &amp;gt; 0:
            counts[ch] -= 1
            m -= 1
            if m == 0:
                return True
    return False

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This code assumes the multiset is represented as a &lt;code&gt;Counter&lt;/code&gt; or a &lt;code&gt;defaultdict&lt;/code&gt;, since it depends on &lt;code&gt;counts[ch]&lt;/code&gt; returning 0 for any character not in the message. If we&amp;#8217;d used a plain dict, we&amp;#8217;d need to employ &lt;code&gt;dict.get(ch, 0)&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m not entirely happy with the code shown. It&amp;#8217;s what I first came up with. Here&amp;#8217;s an alternative, which I also find a bit clunky. I&amp;#8217;d welcome any improvements. It&amp;#8217;s also worth noting that the algorithm locates the matching characters in the newspaper, so we might want to cache some indices for later use.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def blackmailable(message, newspaper):
    """Return True if newspaper can be used to write the blackmail 
    message, False otherwise.
    """
    counts = multiset(message)
    m = len(message)
    n = len(newspaper)
    i = 0
    while m != 0 and i != n:
        ch = newspaper[i]
        if counts[ch] &amp;gt; 0:
            counts[ch] -= 1
            m -= 1
        i += 1
    return m == 0

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We can avoid the ugly code by persuading the obsessive blackmailer to generate and maintain multiset representations of the entire newspaper library. Then &lt;code&gt;blackmailable()&lt;/code&gt; can be implemented as multiset containment, something which the &lt;code&gt;Counter&lt;/code&gt; class handles nicely using the subtraction operator. Note here that multiset subtraction never results in any negative counts, even though a &lt;code&gt;Counter&lt;/code&gt; instance could itself have negative counts.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from collections import Counter
&amp;gt;&amp;gt;&amp;gt; missing_letters = Counter(message) - Counter(newspaper)
&amp;gt;&amp;gt;&amp;gt; blackmailable = len(missing_letters) == 0

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Alternatively:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; blackmailable = not missing_letters

&lt;/pre&gt;

&lt;/div&gt;
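&lt;p&gt;A small sketch of the point about negative counts, using toy strings in place of a real message and newspaper. The subtraction operator drops any count which would fall to zero or below, whereas the in-place &lt;code&gt;Counter.subtract()&lt;/code&gt; method keeps them.
&lt;/p&gt;

```python
from collections import Counter

message = Counter('aab')       # toy data standing in for a real message
newspaper = Counter('abbb')    # toy data standing in for a real newspaper

# Operator form: 'b' would be 1 - 3 = -2, so it is dropped from the result.
missing = message - newspaper
print(missing)

# Method form: modifies newspaper in place and keeps the negative count.
newspaper.subtract(message)
print(newspaper)
```

&lt;p&gt;Here one &lt;code&gt;a&lt;/code&gt; is missing, so this particular message could not be written.
&lt;/p&gt;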


&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc5" name="tocgeneric-code" id="tocgeneric-code"&gt;Generic Code&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;Suppose the blackmailer prefers to compose a message from words, rather than letters? (For an example, see the threat to stay away from Grimpen Moor delivered to Sir Henry Baskerville discussed later in this article.) The code works as is &amp;#8212; just pass in message and newspaper as word sequences, rather than character sequences. Anything we can hash can be counted.
&lt;/p&gt;
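&lt;p&gt;Here is one way this might look, as a sketch which packages the &lt;code&gt;Counter&lt;/code&gt; subtraction into a function and feeds it word lists. The sample sentences are invented for the illustration and are not taken from any newspaper.
&lt;/p&gt;

```python
from collections import Counter

def blackmailable(message, newspaper):
    """True if every item of message is available, with multiplicity, in newspaper.

    Works for any sequences of hashable items: characters, words, ...
    """
    return not (Counter(message) - Counter(newspaper))

# Invented sample text, split into word sequences.
leader = 'keep your reason and value your life away from the town'.split()
threat = 'value your life keep away'.split()

print(blackmailable(threat, leader))      # every word is available
print(blackmailable(['moor'], leader))    # 'moor' never appears in the leader
print(blackmailable('aab', 'abbb'))       # strings still work, letter by letter
```

&lt;p&gt;Note that this version builds a multiset of the whole newspaper up front, so it gives up the early exit of the loop-based versions.
&lt;/p&gt;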

&lt;h3&gt;&lt;a href="http://wordaligned.org/articles/python-counters#toc6" name="tocend-of-message" id="tocend-of-message"&gt;End of Message&lt;/a&gt;&lt;/h3&gt;
&lt;p&gt;In the age of the interweb anonymous cowardice is far easier and blackmailers don&amp;#8217;t need to resort to manual cut and paste techniques unless they&amp;#8217;re after a retro threatening effect.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Never_Mind_the_Bollocks,_Here's_the_Sex_Pistols"&gt;&lt;img src="http://wordaligned.org/images/never-mind-the-bollocks.jpg" alt="Never Mind the Bollocks"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;img style="float:right" alt="Sherlock Holmes" src="http://wordaligned.org/images/sherlock-holmes.png"/&gt;

&lt;p&gt;What&amp;#8217;s more, a detective can figure out plenty from these messages: so when Sir Henry Baskerville receives a threatening letter during his stay at the Northumberland Hotel, he shows it promptly to Sherlock Holmes:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Across the middle of it a single sentence had been formed by the expedient of pasting printed words upon it. It ran: &amp;#8220;As you value your life or your reason keep away from the moor.&amp;#8221; The word &amp;#8220;moor&amp;#8221; only was printed in ink.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;In a virtuoso display of deductive reasoning, Holmes shows the author of the message was in a hurry, afraid of being interrupted, and working in a hotel room using nail-scissors. (He also deduces something else, which he does not reveal at the time.) Identifying the source of the words to be yesterday&amp;#8217;s Times leader is elementary.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The detection of types is one of the most elementary branches of knowledge to the special expert in crime, though I confess that once when I was very young I confused the Leeds Mercury with the Western Morning News. But a Times leader is entirely distinctive, and these words could have been taken from nothing else.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; Sherlock Holmes, &lt;a href="http://www.gutenberg.org/dirs/etext02/bskrv11a.txt"&gt;The Hound of the Baskervilles&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Can anyone identify the newspaper I used to create the image at the start of this article?
&lt;/p&gt;
&lt;p&gt;&lt;hr /&gt;
   My thanks to jay for a &lt;a href="http://wordaligned.org/articles/python-counters#comment-13418772"&gt;correction&lt;/a&gt; to the original version of this article.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=MzW5vqkuZVg:a1kSfo23rWM:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=MzW5vqkuZVg:a1kSfo23rWM:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=MzW5vqkuZVg:a1kSfo23rWM:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=MzW5vqkuZVg:a1kSfo23rWM:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=MzW5vqkuZVg:a1kSfo23rWM:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=MzW5vqkuZVg:a1kSfo23rWM:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=MzW5vqkuZVg:a1kSfo23rWM:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/MzW5vqkuZVg" height="1" width="1"/&gt;</description>
<dc:date>2009-07-27</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/python-counters</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/MzW5vqkuZVg/python-counters</link>
<category>Python</category>
<category>Puzzles</category>
<category>Characters</category>
<feedburner:origLink>http://wordaligned.org/articles/python-counters</feedburner:origLink></item>

<item>
<title>Could OCR conquer the calligraphylion?</title>
<description>&lt;p&gt;&lt;a href="http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf" title="Tesseract presentation (PDF)"&gt;&lt;img src="http://wordaligned.org/images/ocr-outline.png" width="425px" height="135px" alt="OCR outline"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Optical character recognition (&lt;a href="http://en.wikipedia.org/wiki/Optical_character_recognition" title="Wikipedia on OCR"&gt;OCR&lt;/a&gt;) algorithms typically process an image in stages:
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     convert the image to monochrome
 &lt;/li&gt;

 &lt;li&gt;
     identify blocks of text
 &lt;/li&gt;

 &lt;li&gt;
     find lines of text within those blocks
 &lt;/li&gt;

 &lt;li&gt;
     separate out words, then characters
 &lt;/li&gt;

 &lt;li&gt;
     extract character outlines
 &lt;/li&gt;

 &lt;li&gt;
     match outlines to archetypes
 &lt;/li&gt;

 &lt;li&gt;
     match candidate words to dictionary
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;a href="http://tesseract-ocr.googlecode.com/files/TesseractOSCON.pdf" title="Tesseract presentation (PDF)"&gt;&lt;img src="http://wordaligned.org/images/ocr-matching.png" width="425px" height="135px" alt="OCR matching"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;These algorithms can be tuned and trained, to particular fonts and dictionaries for example, and the later stages can feed back into earlier ones; but the strategy basically tackles the picture character by character. This form of OCR is a mature and successful technology. It works very effectively with, for example, a page from a Western newspaper; but as with all things language-related, varying cultural conventions can lead to complications. The fundamental assumption that the atoms of a text image are characters may no longer be true. &lt;a href="http://code.google.com/p/tesseract-ocr"&gt;Tesseract&lt;/a&gt;, the leading open source OCR engine, &lt;a href="http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract"&gt;comes clean&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Tesseract is unlikely to be able to handle connected scripts like Arabic. It will take some specialized algorithms to handle this case, and right now it doesn&amp;#8217;t have them.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Mohammad S. M. Khorsheed&amp;#8217;s &lt;a href="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-495.html"&gt;PhD dissertation&lt;/a&gt; describes such algorithms, explaining the dimensions of the challenge in more detail:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Arabic scripts are inherently cursive: writing isolated characters in &amp;#8216;block letters&amp;#8217; is an unacceptable and unused writing style. The letters are context sensitive. Certain character combinations form new ligature shapes which are often font dependent. Some ligatures involve vertical stacking of characters. Since not all characters connect, word boundary location becomes an interesting problem, as spacing may not only separate words but also certain characters within a word.
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Here&amp;#8217;s an illustration of Arabic characters taking on different forms, depending on position.
&lt;/p&gt;
&lt;img src="http://wordaligned.org/images/arabic-context-sensitive-characters.png" alt="Context sensitive Arabic characters"/&gt;

&lt;p&gt;There are some commercially available specialised Arabic OCR packages but I haven&amp;#8217;t been able to try them out. They don&amp;#8217;t provide information about the algorithms they use. 
&lt;/p&gt;
&lt;p&gt;Could OCR software ever conquer the calligraphylion?
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.tate.org.uk/britain/exhibitions/eastwest/rooms/room25.htm"&gt;&lt;img width="512px" height="408px" src="http://www.tate.org.uk/britain/exhibitions/eastwest/images/calligraphylion.jpg" alt="Calligraphylion"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;This magical beast demonstrates OCR in reverse: an image which has been converted by hand into text.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;This image of a lion originates from Lahore, Pakistan and is part of a rich tradition of zoomorphic calligraphy. This practice, developed in the sixteenth century, employs the flexibility and beauty of Arabic script to delineate living forms such as tigers, parrots, ostriches and cockerels. This is done without disobeying religious injunctions that prohibit their direct depiction.
&lt;/p&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://www.tate.org.uk/britain/exhibitions/eastwest/rooms/room25.htm"&gt;Tate Gallery, East West exhibition, 2006-2007&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;(More &lt;a href="http://images.google.com/images?q=zoomorphic+calligraphy"&gt;zoomorphic calligraphy&lt;/a&gt;.)
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=aTUA54xjZSg:yxatoYQJQVg:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=aTUA54xjZSg:yxatoYQJQVg:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=aTUA54xjZSg:yxatoYQJQVg:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=aTUA54xjZSg:yxatoYQJQVg:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=aTUA54xjZSg:yxatoYQJQVg:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=aTUA54xjZSg:yxatoYQJQVg:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=aTUA54xjZSg:yxatoYQJQVg:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/aTUA54xjZSg" height="1" width="1"/&gt;</description>
<dc:date>2009-07-14</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/calligraphylion</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/aTUA54xjZSg/calligraphylion</link>
<category>Characters</category>
<category>OCR</category>
<category>Tesseract</category>
<feedburner:origLink>http://wordaligned.org/articles/calligraphylion</feedburner:origLink></item>

<item>
<title>Undogfooding</title>
<description>&lt;p&gt;David Jones perfectly &lt;a href="http://drj11.wordpress.com/2009/07/04/tony-hoare-man-of-science/"&gt;captures the look and feel&lt;/a&gt; of Sir Tony Hoare&amp;#8217;s presentation at &lt;a href="http://www.europython.eu"&gt;Europython 2009&lt;/a&gt;
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;Tony Hoare is clearly old skool. His slides had the calm and aged patina of the &lt;a href="http://en.wikipedia.org/wiki/Overhead_projector"&gt;OHP&lt;/a&gt; era, and I thought they were all the better for that. If you have a message, then that message can be conveyed without all the flash and shine that PowerPoint tempts you with (although, being a Microsoft man, of course his slides &lt;em&gt;were&lt;/em&gt; in PowerPoint). 
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;&lt;a href="http://docs.python.org/library/turtle.html#module-turtle"&gt;&lt;img src="http://wordaligned.org/images/my-1st-turtle.png" alt="My first turtle sketch"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Note the parenthetical comment: Tony Hoare works for Microsoft and he uses Microsoft software, an activity developers refer to as &lt;a href="http://catb.org/jargon/html/D/dogfood.html"&gt;&amp;#8220;eating your own dogfood&amp;#8221;&lt;/a&gt;. Also eating dogfood at Europython, &lt;a href="http://www.europython.eu/talks/speakers/index.html#lingl_gregor"&gt;Gregor Lingl&lt;/a&gt; employed his very own &lt;a href="http://docs.python.org/library/turtle.html#module-turtle"&gt;turtle&lt;/a&gt; to guide the audience through a nifty presentation belying the reptile&amp;#8217;s slow-and-steady reputation. I&amp;#8217;ve always enjoyed sketching code using the Python interpreter, and sketching pictures with a turtle feels very pythonic.
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;My first turtle sketch&lt;/div&gt;

&lt;pre class="prettyprint"&gt;Python 3.1
&amp;gt;&amp;gt;&amp;gt; from turtle import *
&amp;gt;&amp;gt;&amp;gt; shape('turtle')
&amp;gt;&amp;gt;&amp;gt; circle(100)
&amp;gt;&amp;gt;&amp;gt; fillcolour('red')
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
NameError: name 'fillcolour' is not defined
&amp;gt;&amp;gt;&amp;gt; fillcolor('red')
&amp;gt;&amp;gt;&amp;gt; begin_fill(); circle(100); end_fill()
&amp;gt;&amp;gt;&amp;gt; def grey(pc): c = pc/100; fillcolor(c, c, c)
... 
&amp;gt;&amp;gt;&amp;gt; clear()
&amp;gt;&amp;gt;&amp;gt; def doit(r): begin_fill(); grey(r); circle(r); end_fill()
... 
&amp;gt;&amp;gt;&amp;gt; for r in range(100): doit(r)
... 
^C^C
Traceback (most recent call last):
...
&amp;gt;&amp;gt;&amp;gt; clear()
&amp;gt;&amp;gt;&amp;gt; for r in range(100, 0, -5): doit(r)
...
&amp;gt;&amp;gt;&amp;gt; fillcolor('white')
&amp;gt;&amp;gt;&amp;gt; forward(150)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Gregor Lingl&amp;#8217;s session at Europython wasn&amp;#8217;t the only one delivered using software developed by the presenter. What would be the opposite of eating your own dogfood, I wondered? Abusing someone else&amp;#8217;s software, perhaps.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;blockquote&gt;&lt;p&gt;After one too many bad presentations at a meeting in January 2000, I decided to see if I could do something about it. &amp;#8212; &lt;a href="http://norvig.com/Gettysburg/making.html" title="The Making of the Gettysburg PowerPoint Presentation"&gt;Peter Norvig&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;I&amp;#8217;m fairly sure Peter Norvig had no part in PowerPoint&amp;#8217;s development. In a &lt;a href="http://norvig.com/Gettysburg"&gt;hilarious satire&lt;/a&gt; he skewers the popular presentation tool with its own Autocontent Wizard.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://norvig.com/Gettysburg/index.htm"&gt;&lt;img src="http://norvig.com/Gettysburg/img001.gif" width="500px" height="375px" alt="Gettysburg PPT, slide 1"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Back on the subject of overhead projectors, I saw one put to good use by another eminent old-skooler, Donald Knuth, at the recent &lt;a href="http://www.dcs.warwick.ac.uk/bshm/meetings/Fiction.html"&gt;Mathematics and Fiction&lt;/a&gt; workshop in Oxford. Asked by passport control at London Heathrow if he&amp;#8217;d be visiting Oxford on a business trip or a pleasure trip, Knuth had answered: no, he&amp;#8217;d come on an ego trip.
&lt;/p&gt;
&lt;p&gt;Knuth was in Oxford to talk to his fans about &lt;a href="http://www-cs-faculty.stanford.edu/~knuth/sn.html"&gt;Surreal Numbers&lt;/a&gt;, a book he wrote in just a couple of weeks in 1973 while taking a break from working on &lt;a href="http://en.wikipedia.org/wiki/The_Art_of_Computer_Programming"&gt;The Art of Computer Programming&lt;/a&gt; in a hotel in Oslo, and which may well be the only example of a mathematical &lt;a href="http://en.wikipedia.org/wiki/Surreal_numbers"&gt;theory&lt;/a&gt; first published in fictional form.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www-cs-faculty.stanford.edu/~knuth/sn.html"&gt;&lt;img src="http://wordaligned.org/images/books/surreal-numbers.jpg" width="240px" height="240px" alt="Surreal Numbers book cover" style="float:right;"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;For the workshop Knuth had gone back to his files and found his working notes, the original hotel brochure, photos, hand-written reviews by P&amp;oacute;lya and others, though he couldn&amp;#8217;t locate the paper napkin on which &lt;a href="http://en.wikipedia.org/wiki/John_Horton_Conway"&gt;John Horton Conway&lt;/a&gt; had sketched the axioms underpinning the number system a few months before&lt;a id="fn1link" href="http://wordaligned.org/articles/undogfooding#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt; &amp;#8212; Knuth had copied this source material onto acetates which he then sorted and shuffled on the OHP according to the direction his reminiscences and questions from the audience took him.
&lt;/p&gt;
&lt;p&gt;Unlike Tony Hoare, Donald Knuth wasn&amp;#8217;t making a formal presentation, but I was struck by some advantages offered by the OHP. For one, the process of getting slides displayed seemed foolproof &amp;#8212; at Europython we had the usual catalogue of computer/screen interface glitches (unwanted dialog boxes popping up, batteries dying, screensavers kicking in, missing Mac display dongles, delays while rebooting Ubuntu, alignment issues). The OHP also facilitated a dynamic presentation style: Knuth accessed his slides at random, composing screens from more than one slide on the fly, and pointing to areas of interest directly using a finger. He didn&amp;#8217;t modify his slides during the session by writing on them, but that would also have been possible.
&lt;/p&gt;
&lt;p&gt;In a supreme example of dogfood consumption, a couple of years after publishing Surreal Numbers Knuth took a rather longer break from TAOCP to work on &lt;a href="http://en.wikipedia.org/wiki/TeX"&gt;TeX&lt;/a&gt;, a typesetting system whose results match the beauty of his writing.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&amp;#8230; You get surreal numbers by playing games. I used to feel guilty in Cambridge that I spent all day playing games, while I was supposed to be doing mathematics. Then, when I discovered surreal numbers, I realized that playing games IS mathematics. &amp;#8212; &lt;a href="http://www.gap-system.org/~history/Quotations/Conway.html"&gt;John Horton Conway&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/undogfooding#fn1link"&gt;[1]&lt;/a&gt; In fact, he&amp;#8217;d lost the napkin before his stay in the hotel in Oslo, which explains the difference between the axioms used in Surreal Numbers and the ones originally suggested by John Horton Conway.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=2mS3r9LQKAk:wAezk7n8hKU:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=2mS3r9LQKAk:wAezk7n8hKU:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=2mS3r9LQKAk:wAezk7n8hKU:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=2mS3r9LQKAk:wAezk7n8hKU:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=2mS3r9LQKAk:wAezk7n8hKU:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=2mS3r9LQKAk:wAezk7n8hKU:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=2mS3r9LQKAk:wAezk7n8hKU:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/2mS3r9LQKAk" height="1" width="1"/&gt;</description>
<dc:date>2009-07-08</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/undogfooding</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/2mS3r9LQKAk/undogfooding</link>
<category>Python</category>
<category>Europython</category>
<category>Knuth</category>
<category>Drawing</category>
<feedburner:origLink>http://wordaligned.org/articles/undogfooding</feedburner:origLink></item>

<item>
<title>Tony Hoare’s vision, car crashes, and Alan Turing</title>
<description>&lt;p&gt;&lt;a href="http://www.europython.eu"&gt;&lt;img src="http://www.europython.eu/images/europython_logo.png" alt="Europython Logo"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Yesterday afternoon, at &lt;a href="http://www.europython.eu"&gt;Europython 2009&lt;/a&gt;, David Jones addressed the subject &lt;a href="http://www.europython.eu/talks/talk_abstracts/index.html#talk21"&gt;&amp;#8220;What sucks about Python?&amp;#8221;&lt;/a&gt; Despite this provocative title, David Jones had lots of good things to say about Python, and the two topics which really roused the audience (the global interpreter lock and the over-crowded Python packaging space) had more to do with Python-the-platform than Python-the-language. He also failed to mention the thing I miss most when working with Python: via Peter Norvig&amp;#8217;s &lt;a href="http://norvig.com/python-iaq.html"&gt;Python IAQ&lt;/a&gt;, quoting Bjarne Stroustrup:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&amp;#8220;If I were to design a language from scratch, I would follow the Algol68 path and make every statement and declaration an expression that yields a value.&amp;#8221;
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;Of course what &lt;strong&gt;really&lt;/strong&gt; matters to an engineer is Python-the-platform rather than Python-the-language. Python famously comes with batteries included, and, stretching the metaphor, it also excels at integrating with the other batteries used in modern software applications. The packaging confusion is a side-effect of Python-the-platform&amp;#8217;s success. More on over-stretched batteries later &amp;#8230;
&lt;/p&gt;
&lt;p&gt;The day had started with &lt;a href="http://www.mindviewinc.com"&gt;Bruce Eckel&lt;/a&gt;&amp;#8217;s keynote on Software Archeology. Bruce Eckel is a relaxed and engaging speaker but I found his presentation rather flimsy. Its substance could (and did!) fit on to a couple of David Jones&amp;#8217; slides and the remainder dwelt a little too long on Bruce Eckel, his Java and C++ books, blogs, and &lt;a href="http://www.mindviewinc.com"&gt;www.mindviewinc.com&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;During the rest of the day, I had a choice of 4 or 5 different presentations at any one time, generally on the subject of Python modules or frameworks. The common theme I took away is that &lt;strong&gt;people turn to Python to get things done&lt;/strong&gt;, and they&amp;#8217;re reluctant to turn back. As the afternoon drew on everyone regathered in the Adrian Boult Hall to listen to a brilliant series of lightning talks which developed on this same theme. We also heard a wonderful &lt;a href="http://europython09.blip.tv/file/2351409"&gt;short story&lt;/a&gt; about the best way to wreck cars and software systems.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.europython.eu/gallery/europython-2009"&gt;&lt;img alt="Sir Tony Hoare" src="http://www.europython.eu/gallery/media/gallery-title/thumbs/3kxu_jpg_310x260_crop_q85.jpg"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;On the subject of car crashes, it seemed a shame to stem the flow of these talks and keep &lt;a href="http://research.microsoft.com/en-us/people/thoare/"&gt;Sir Tony Hoare&lt;/a&gt; waiting in the wings in order to pull Guido van Rossum on stage, especially since Guido wasn&amp;#8217;t even at the conference. Nonetheless, the benevolent dictator for life dutifully skyped in, and his face beamed up on the 12m&lt;sup&gt;2&lt;/sup&gt; screen. We could see and hear him. He couldn&amp;#8217;t hear us. The connection fizzed and chirped until all protocol broke down. Eventually the skype laptop popped up a message warning its batteries were running low, then promptly and mercifully shut down, ending the experiment.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/C._A._R._Hoare"&gt;&lt;img src="http://upload.wikimedia.org/wikipedia/commons/thumb/7/70/CAR_Hoare.jpg/225px-CAR_Hoare.jpg" alt="Sir Tony Hoare"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;On walked Sir Tony Hoare, the programmer, software engineer, computer scientist and academic who now works for Microsoft Research. His thoughtful keynote analysed the differences between science and engineering, and considered their interplay in the field of software development. His presentation challenged my way of thinking about software specification and program correctness, and the message on his closing slide was both shaming and inspirational:
&lt;/p&gt;
&lt;blockquote&gt;&lt;h3&gt;One Day&lt;/h3&gt;
&lt;ul&gt;&lt;li&gt;Software will be the most reliable component of every product which contains it&lt;/li&gt;
&lt;li&gt;Software engineering will be the most dependable of all engineering professions&lt;/li&gt;
&lt;li&gt;Because of the successful interplay of research
&lt;ul&gt;&lt;li&gt;into the science of programming&lt;/li&gt;
&lt;li&gt;and the engineering of software&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&amp;#8212; &lt;a href="http://research.microsoft.com/en-us/people/thoare/"&gt;Tony Hoare&lt;/a&gt;, The Science of Computing and the Engineering of Software, 2009
&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/3681644236/" title="Enigma machine by Thomas Guest, on Flickr"&gt;&lt;img style="float:right;" src="http://farm4.static.flickr.com/3614/3681644236_2b40e5b726_m.jpg" width="240" height="199" alt="Enigma machine" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Sir Tony Hoare is a Turing Award winner. I think he said he&amp;#8217;d been working with computers for 50 years &amp;#8212; it may well have been longer. Certainly he remembered when ALGOL was the hot new language, poised to supplant FORTRAN. The conference provided an opportunity to go further back, to Turing himself, and to the first ever programmable digital computer. In another memorable keynote session after lunch &lt;a href="http://www.europython.eu/talks/speakers/index.html#greenish_simon"&gt;Simon Greenish&lt;/a&gt; and &lt;a href="http://www.europython.eu/talks/speakers/index.html#black_sue"&gt;Dr Sue Black&lt;/a&gt; spoke passionately about &lt;a href="http://www.bletchleypark.org.uk/"&gt;Bletchley Park&lt;/a&gt;, the expansive Milton Keynes mansion which was converted into a code-breaking factory during the Second World War and which is now a museum struggling to make ends meet. At the end of the presentation they unveiled a real Enigma machine to a huge round of applause.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Colossus_computer"&gt;Colossus&lt;/a&gt;, the world&amp;#8217;s first ever programmable digital computer, &lt;a href="http://www.codesandciphers.org.uk/lorenz/rebuild.htm"&gt;has been painstakingly rebuilt&lt;/a&gt; at Bletchley Park. Although it won&amp;#8217;t be running Python, it will resume its orginal task of cracking the &lt;a href="http://en.wikipedia.org/wiki/Lorenz_cipher"&gt;Lorenz cipher&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.bletchleypark.org.uk"&gt;&lt;img src="http://www.bletchleypark.org.uk/doc/image.rhtm/Head%20detail%20web.jpg" alt="Statue of Turing at Bletchley Park"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;There can be no such return for &lt;a href="http://www.turing.org.uk/turing/"&gt;Alan Turing&lt;/a&gt;, who would have been 97 last week, but who tragically took his own life 55 years ago. He was a brilliant mathematician and scientist, famously eccentric, yet his papers on machine intelligence and computability remain highly accessible, forming the way we now think about computers. Over 11 thousand people worked at Bletchley Park during the war, and Alan Turing helped direct the decryption effort, a huge task which succeeded in shortening the war by two years and saving an estimated 22 million lives. In 2007 a statue of Alan Turing was erected at Bletchley Park. Dr Sue Black showed us a photo of his coffee mug which remains where he kept it, chained to a radiator.
&lt;/p&gt;&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=im4IZXq_KSQ:Xh1txzr9nn4:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=im4IZXq_KSQ:Xh1txzr9nn4:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=im4IZXq_KSQ:Xh1txzr9nn4:F7zBnMyn0Lo"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=im4IZXq_KSQ:Xh1txzr9nn4:F7zBnMyn0Lo" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=im4IZXq_KSQ:Xh1txzr9nn4:gIN9vFwOqvQ"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?i=im4IZXq_KSQ:Xh1txzr9nn4:gIN9vFwOqvQ" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.wordaligned.org/~ff/wordaligned?a=im4IZXq_KSQ:Xh1txzr9nn4:cGdyc7Q-1BI"&gt;&lt;img src="http://feeds.feedburner.com/~ff/wordaligned?d=cGdyc7Q-1BI" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/wordaligned/~4/im4IZXq_KSQ" height="1" width="1"/&gt;</description>
<dc:date>2009-07-02</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/europython-2009</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/im4IZXq_KSQ/europython-2009</link>
<category>Python</category>
<category>Europython</category>
<feedburner:origLink>http://wordaligned.org/articles/europython-2009</feedburner:origLink></item>

<item>
<title>Partitioning with Python</title>
<description>&lt;h3&gt;Sums and Splits&lt;/h3&gt;
&lt;p&gt;On the subject of &lt;a href="http://wordaligned.org/articles/oulipo-eodermdrome"&gt;hunting for eodermdromes&lt;/a&gt;, here are a couple of semi-related partitioning problems.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     for a positive integer, N, find the positive integer sequences which sum to N
 &lt;/li&gt;

 &lt;li&gt;
     for a sequence, S, find the distinct partitions of that sequence
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;As an example of the first, the 16 distinct integer sequences which sum to 5 are:
&lt;/p&gt;
&lt;pre&gt;
5
4 + 1
3 + 1 + 1
3 + 2
2 + 1 + 2
2 + 1 + 1 + 1
2 + 2 + 1
2 + 3
1 + 1 + 3
1 + 1 + 2 + 1
1 + 1 + 1 + 1 + 1
1 + 1 + 1 + 2
1 + 2 + 2
1 + 2 + 1 + 1
1 + 3 + 1
1 + 4
&lt;/pre&gt;

&lt;p&gt;and of the second, the 8 distinct ways of partitioning the sequence ABCD are:
&lt;/p&gt;
&lt;pre&gt;
ABCD
A BCD
AB CD
ABC D
A B CD
A BC D
AB C D
A B C D
&lt;/pre&gt;

&lt;p&gt;Note that I&amp;#8217;ve counted 2 + 1 + 2, 2 + 2 + 1, and 1 + 2 + 2 as distinct sums totalling 5. That happens to be the formulation of the problem which interested me.
&lt;/p&gt;
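
&lt;p&gt;To pin down exactly what is being counted, here is one possible sketch of both problems. The names compositions and splits are hypothetical, chosen for this illustration only, and the recursion is just one of several ways to generate the sequences.
&lt;/p&gt;

```python
def compositions(n):
    """Yield every ordered sequence of positive integers summing to n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        # Pick the first term, then recurse on whatever is left over.
        for rest in compositions(n - first):
            yield (first,) + rest


def splits(seq):
    """Yield every way of cutting seq into contiguous, non-empty pieces."""
    # The lengths of the pieces form an ordered sum totalling len(seq).
    for lengths in compositions(len(seq)):
        pieces, start = [], 0
        for length in lengths:
            pieces.append(seq[start:start + length])
            start += length
        yield pieces


print(len(list(compositions(5))))   # how many ordered sums total 5
print(len(list(splits("ABCD"))))    # how many ways to split ABCD
```

&lt;p&gt;Written this way the second problem reduces to the first: each split of a sequence corresponds to an ordered sum of piece lengths.
&lt;/p&gt;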
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;Context&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://wordaligned.org/articles/oulipo-eodermdrome"&gt;&lt;img src="http://wordaligned.org/images/eodermdrome.png" width="200px" height="200px" style="float:right" alt="eodermdrome"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;Before discussing a solution to these problems, some context. Recall that an &lt;a href="http://wordaligned.org/articles/oulipo-eodermdrome"&gt;eodermdrome&lt;/a&gt; is a sequence which forms an Eulerian circuit through the fully connected graph whose vertices are the set of its elements. Put more simply: when you trace through the letters you get the figure shown, with no edge covered twice. Examples include:
&lt;/p&gt;
&lt;p style="margin:0;font-size:300%"&gt;&lt;span style="color:#888"&gt;E&lt;/span&gt;&lt;span style="color:#930"&gt;O&lt;/span&gt;&lt;span style="color:#036"&gt;D&lt;/span&gt;&lt;span style="color:#888"&gt;E&lt;/span&gt;&lt;span style="color:#555"&gt;R&lt;/span&gt;&lt;span style="color:#e50"&gt;M&lt;/span&gt;&lt;span style="color:#036"&gt;D&lt;/span&gt;&lt;span style="color:#555"&gt;R&lt;/span&gt;&lt;span style="color:#930"&gt;O&lt;/span&gt;&lt;span style="color:#e50"&gt;M&lt;/span&gt;&lt;span style="color:#888"&gt;E&lt;/span&gt;&lt;/p&gt;
&lt;p style="margin:0;font-size:300%"&gt;&lt;span style="color:#e50"&gt;E&lt;/span&gt;&lt;span style="color:#555"&gt;N&lt;/span&gt;&lt;span style="color:#036"&gt;H&lt;/span&gt;&lt;span style="color:#930"&gt;A&lt;/span&gt;&lt;span style="color:#555"&gt;N&lt;/span&gt;&lt;span style="color:#888"&gt;C&lt;/span&gt;&lt;span style="color:#e50"&gt;E&lt;/span&gt; &lt;span style="color:#930"&gt;A&lt;/span&gt;&lt;span style="color:#888"&gt;C&lt;/span&gt;&lt;span style="color:#036"&gt;H&lt;/span&gt;&lt;span style="color:#e50"&gt;E&lt;/span&gt;&lt;/p&gt;
&lt;p style="margin:0;font-size:300%"&gt;&lt;span style="color:#036"&gt;T&lt;/span&gt;&lt;span style="color:#930"&gt;A&lt;/span&gt;&lt;span style="color:#888"&gt;X&lt;/span&gt; &lt;span style="color:#e50"&gt;D&lt;/span&gt;&lt;span style="color:#555"&gt;E&lt;/span&gt;&lt;span style="color:#930"&gt;A&lt;/span&gt;&lt;span style="color:#e50"&gt;D&lt;/span&gt; &lt;span style="color:#036"&gt;T&lt;/span&gt;&lt;span style="color:#555"&gt;E&lt;/span&gt;&lt;span style="color:#888"&gt;X&lt;/span&gt;&lt;span style="color:#036"&gt;T&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Eodermdromes turn out to be surprisingly rare. Writing a computer program to find them is a nice exercise in searching and text processing. Clearly, we should start with a collection of words. Then we can generate combinations of words from this collection and filter out the eodermdromes.
&lt;/p&gt;
&lt;pre&gt;
(filter eodermdrome? (combinations words))
&lt;/pre&gt;

&lt;p&gt;A large set of words (note: &amp;#8220;set&amp;#8221; not &amp;#8220;collection&amp;#8221;, we don&amp;#8217;t need duplicates) gives the best chance of success. I started with a file containing more than 35 thousand distinct words. This gives over a billion possible word pairs, and when we consider word triples and quartets the numbers get silly even for a modern computer.
&lt;/p&gt;
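
&lt;p&gt;A quick back-of-envelope calculation shows how fast the numbers grow. It assumes a round vocabulary of 35 thousand words and counts ordered phrases in which a word may repeat; the function name is made up for this sketch.
&lt;/p&gt;

```python
def candidate_count(vocabulary, words_per_phrase):
    """Ordered phrases of the given length, repeats allowed."""
    return vocabulary ** words_per_phrase


vocabulary = 35_000  # "more than 35 thousand", taken as a round lower bound

for k in (2, 3, 4):
    print(k, f"{candidate_count(vocabulary, k):,}")
```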
&lt;p&gt;As is so often the case in computing, we have a tension between opposing concerns. We&amp;#8217;d like code which separates the task of generating candidates and the task of testing these candidates for eodermdromicity, but in order to run this code in a timely manner we need some of the eodermdrome testing to leak into the candidate generation. For example, we could preprocess the word set removing words which contain double Ls (all, ball, call, ill, Bill, kill &amp;#8230;) since these can never appear in an eodermdrome. And we could similarly remove words which end ETE (delete, Pete, effete). As I hope you can see, it&amp;#8217;s easy to end up with finickity code and co-dependent functions.
&lt;/p&gt;
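
&lt;p&gt;For the record, a prefilter of that kind might look something like the sketch below. The function name is hypothetical. It generalises the two examples: a doubled letter is an edge from a letter to itself, and a word such as delete walks the same edge twice (E to T, then T back to E).
&lt;/p&gt;

```python
def can_appear_in_eodermdrome(word):
    """Return False for words which double a letter or reuse an edge."""
    seen = set()
    letters = word.lower()
    for a, b in zip(letters, letters[1:]):
        edge = frozenset((a, b))
        if a == b or edge in seen:
            return False  # doubled letter ("ball") or edge walked twice ("delete")
        seen.add(edge)
    return True


sample = ["ball", "Bill", "delete", "Pete", "earned", "relay"]
print([w for w in sample if can_appear_in_eodermdrome(w)])
```

&lt;p&gt;Passing this filter is necessary rather than sufficient: a surviving word still has to combine with others to complete the circuit.
&lt;/p&gt;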
&lt;p&gt;I chose a simple but effective strategy to reduce the search space to something manageable, based on word length. First, then, I loaded my word set into a Python dict collecting lists of words keyed by their length.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from collections import defaultdict
&amp;gt;&amp;gt;&amp;gt; words = defaultdict(list)
&amp;gt;&amp;gt;&amp;gt; for word in open('word-set.txt').read().split():
...     words[len(word)].append(word)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Given this dict, picking out single word eodermdromes is easy.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; list(filter(is_eodermdrome, words[11]))
['eodermdrome']

&lt;/pre&gt;

&lt;/div&gt;
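&lt;p&gt;The session above relies on an &lt;code&gt;is_eodermdrome&lt;/code&gt; predicate. In case a concrete version helps, here is one possible sketch. The &lt;code&gt;vertices&lt;/code&gt; parameter and the decision to ignore anything which is not a letter are assumptions made for this sketch. It accepts either a single word or a tuple of words, and checks that the letters trace a closed path which uses every edge between the distinct letters exactly once.
&lt;/p&gt;

```python
def is_eodermdrome(candidate, vertices=5):
    '''Test a word, or a tuple of words, for eodermdromicity.'''
    # Joining works for a single string and for a tuple of strings alike.
    letters = [c for c in ''.join(candidate).lower() if c.isalpha()]
    if not letters or letters[0] != letters[-1]:
        return False                      # the path must end where it began
    # Each adjacent pair of letters is one edge; order is ignored.
    steps = [frozenset(pair) for pair in zip(letters, letters[1:])]
    return (len(set(letters)) == vertices
            and all(len(step) == 2 for step in steps)      # no doubled letters
            and len(steps) == len(set(steps))              # no edge used twice
            and len(steps) == vertices * (vertices - 1) // 2)  # all edges used
```

&lt;p&gt;With the default of five vertices the last condition requires ten steps, which is why every candidate must be exactly eleven letters long.
&lt;/p&gt;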

&lt;p&gt;How about eodermdromes composed of a 6 letter word followed by a 5 letter word? We can form the &lt;a href="http://docs.python.org/library/itertools.html#itertools.product"&gt;cartesian product&lt;/a&gt; of the lists of 6 and 5 letter words and filter out the ones we want.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import product
&amp;gt;&amp;gt;&amp;gt; eod_6_5 = filter(is_eodermdrome, product(words[6], words[5]))
&amp;gt;&amp;gt;&amp;gt; next(eod_6_5)
('earned', 'andre')
&amp;gt;&amp;gt;&amp;gt; next(eod_6_5)
('yearly', 'relay')

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;How about &lt;em&gt;all eodermdromes&lt;/em&gt; of length 11?
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import chain
&amp;gt;&amp;gt;&amp;gt; seq = chain.from_iterable
&amp;gt;&amp;gt;&amp;gt; word_lens = sum_to_n(11)
&amp;gt;&amp;gt;&amp;gt; candidates = seq(product(*[words[i] for i in s]) for s in word_lens)
&amp;gt;&amp;gt;&amp;gt; eods = filter(is_eodermdrome, candidates)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note here that I&amp;#8217;m using Python 3.0, and that &lt;a href="http://docs.python.org/py3k/library/functions.html#filter"&gt;filter&lt;/a&gt; is therefore a lazy function. The interactive session shown above hasn&amp;#8217;t actually started taking anything from these lazily-evaluated streams.
&lt;/p&gt;
&lt;p&gt;I certainly don&amp;#8217;t claim this is the quickest way to search for eodermdromes. In fact, this little program took several hours to complete. But a back-of-an-envelope calculation showed it &lt;em&gt;would&lt;/em&gt; complete in a few hours, and that was good enough.
&lt;/p&gt;
&lt;p&gt;Note also that we haven&amp;#8217;t shown an implementation of &lt;code&gt;sum_to_n()&lt;/code&gt; yet, which takes us back to the problems posed at the start of this article.
&lt;/p&gt;

&lt;h3&gt;Sum to N&lt;/h3&gt;
&lt;p&gt;Finding the positive integer series which sum to a positive integer N is a job for &lt;a href="http://docs.python.org/library/itertools.html#itertools.combinations"&gt;itertools.combinations&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from itertools import combinations, chain

def sum_to_n(n):
    'Generate the series of +ve integer lists which sum to a +ve integer, n.'
    from operator import sub
    b, mid, e = [0], list(range(1, n)), [n]
    splits = (d for i in range(n) for d in combinations(mid, i)) 
    return (list(map(sub, chain(s, e), chain(b, s))) for s in splits)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The idea here is straightforward: there&amp;#8217;s a 1-to-1 correspondence between the sums we want and ordered combinations drawn from the series 1, 2, &amp;#8230; n-1. For example, if n is 11 one such combination would be:
&lt;/p&gt;
&lt;pre&gt;
(1, 5, 7, 10)
&lt;/pre&gt;

&lt;p&gt;We can extend this by pushing 0 in front and n at the end:
&lt;/p&gt;
&lt;pre&gt;
(0, 1, 5, 7, 10, 11)
&lt;/pre&gt;

&lt;p&gt;This extended tuple can now be seen as partial sums of a series which sums to 11. Taking differences gives the series
&lt;/p&gt;
&lt;pre&gt;
(1-0, 5-1, 7-5, 10-7, 11-10)
&lt;/pre&gt;

&lt;p&gt;which is
&lt;/p&gt;
&lt;pre&gt;
(1, 4, 2, 3, 1)
&lt;/pre&gt;

&lt;p&gt;which does indeed sum to 11
&lt;/p&gt;
&lt;pre&gt;
1 + 4 + 2 + 3 + 1 = 11
&lt;/pre&gt;

&lt;p&gt;The Python code shown uses a clever idea to implement this staggered differencing, an idea I &lt;a href="http://newsimg.bbc.co.uk/media/images/45909000/jpg/_45909582_badartists.jpg" title="Bristol's famous artist and thief"&gt;cleverly stole&lt;/a&gt; from one of &lt;a href="http://code.activestate.com/recipes/users/178123/"&gt;Raymond Hettinger&amp;#8217;s brilliant Python recipes&lt;/a&gt;.
&lt;/p&gt;
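&lt;p&gt;The staggered differencing can be traced step by step for the example split. This is only a sketch of the idea; the names &lt;code&gt;upper&lt;/code&gt; and &lt;code&gt;lower&lt;/code&gt; are introduced here for illustration. Two copies of the split are chained, one with n pushed on the end and one with 0 pushed in front, and corresponding elements are subtracted.
&lt;/p&gt;

```python
from itertools import chain
from operator import sub

n = 11
split = (1, 5, 7, 10)

upper = chain(split, [n])    # 1, 5, 7, 10, 11
lower = chain([0], split)    # 0, 1, 5, 7, 10

# Subtract corresponding elements: 1-0, 5-1, 7-5, 10-7, 11-10
series = list(map(sub, upper, lower))
print(series)                # [1, 4, 2, 3, 1]
```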

&lt;h3&gt;Partitioning a Sequence&lt;/h3&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;Python Cookbook: Recipe 576795&lt;/div&gt;

&lt;pre class="prettyprint"&gt;from itertools import chain, combinations

def partition(iterable, chain=chain, map=map):
    s = iterable if hasattr(iterable, '__getslice__') else tuple(iterable)
    n = len(s)
    first, middle, last = [0], range(1, n), [n]
    getslice = s.__getslice__
    return [map(getslice, chain(first, div), chain(div, last))
            for i in range(n) for div in combinations(middle, i)]

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This &lt;a href="http://code.activestate.com/recipes/576795"&gt;recipe&lt;/a&gt; shows sum-to-n and partitioning to be very similar problems. In fact, we could easily implement &lt;code&gt;sum_to_n()&lt;/code&gt; on top of  &lt;code&gt;partition()&lt;/code&gt;:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def sum_to_n(n):
    return ([len(t) for t in p] for p in partition(range(n)))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The posted recipe needs a minor overhaul to get it working with Python 3.0, &lt;a href="http://docs.python.org/3.0/whatsnew/3.0.html#operators-and-special-methods"&gt;which does away&lt;/a&gt; with &lt;code&gt;__getslice__&lt;/code&gt;: getting a slice is simply what &lt;code&gt;__getitem__&lt;/code&gt; does when given a slice object. The 2to3 tool fails to convert the recipe, which must be recast as something like:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from itertools import chain, combinations

def sliceable(xs):
    '''Return a sliceable version of the iterable xs.'''
    try:
        xs[:0]
        return xs
    except TypeError:
        return tuple(xs)

def partition(iterable):
    s = sliceable(iterable)
    n = len(s)
    b, mid, e = [0], list(range(1, n)), [n]
    splits = (d for i in range(n) for d in combinations(mid, i))
    return [[s[sl] for sl in map(slice, chain(b, d), chain(d, e))]
            for d in splits]

&lt;/pre&gt;

&lt;/div&gt;


&lt;h3&gt;Sum to N, again&lt;/h3&gt;
&lt;p&gt;Here&amp;#8217;s a variant implementation of &lt;code&gt;sum_to_n()&lt;/code&gt;. The idea here is to fill N slots with a pattern of 0s and 1s. We then reduce this pattern to the lengths of runs of repeated elements, giving a series which sums to N. The call &lt;code&gt;itertools.product('01', repeat=n)&lt;/code&gt; generates all possible binary patterns of length N, which turns out to be twice as many as we want since (e.g.) 00001111100 and 11110000011 represent the same sum, 4 + 5 + 2; hence the n-1 &lt;code&gt;repeat&lt;/code&gt; count and the call to &lt;code&gt;chain&lt;/code&gt; in the code below&lt;a id="fn1link" href="http://wordaligned.org/articles/partitioning-with-python#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;from itertools import groupby, chain, product

def ilen(it):
    return sum(1 for _ in it)

def sum_to_n(n):
    return ([ilen(gp) for _, gp in groupby(chain('1', O1))]
            for O1 in product('01', repeat=n-1))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Fun, but the version using &lt;a href="http://docs.python.org/library/itertools.html#itertools.combinations"&gt;combinations&lt;/a&gt; is better!
&lt;/p&gt;
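&lt;p&gt;As a quick cross-check, the two approaches can be run side by side for a small n. The function names &lt;code&gt;sum_to_n_splits&lt;/code&gt; and &lt;code&gt;sum_to_n_runs&lt;/code&gt; are hypothetical, chosen so that both versions can live in one file; the bodies follow the listings in this article. Both should yield the same 2&lt;sup&gt;n-1&lt;/sup&gt; series, though not in the same order.
&lt;/p&gt;

```python
from itertools import chain, combinations, groupby, product
from operator import sub

def sum_to_n_splits(n):
    'The version built on combinations of split points.'
    b, mid, e = [0], list(range(1, n)), [n]
    splits = (d for i in range(n) for d in combinations(mid, i))
    return (list(map(sub, chain(s, e), chain(b, s))) for s in splits)

def sum_to_n_runs(n):
    'The version built on run lengths of binary patterns.'
    return ([sum(1 for _ in gp) for _, gp in groupby(chain('1', bits))]
            for bits in product('01', repeat=n-1))

n = 4
by_splits = sorted(sum_to_n_splits(n))
by_runs = sorted(sum_to_n_runs(n))
print(by_splits == by_runs, len(by_splits))   # True 8
```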
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/partitioning-with-python#fn1link"&gt;[1]&lt;/a&gt; My first thought was to use &lt;code&gt;itertools.islice&lt;/code&gt; to limit the stream to the first 2&lt;sup&gt;n-1&lt;/sup&gt; values, but I discovered &lt;code&gt;islice&lt;/code&gt; has a surprising &lt;a href="http://bugs.python.org/issue6305" title="I've reported this as a bug"&gt;limitation&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from itertools import islice, count
&amp;gt;&amp;gt;&amp;gt; islice(count(), (1&amp;lt;&amp;lt;31) - 1)
&amp;lt;itertools.islice object at 0x63a0c0&amp;gt;
&amp;gt;&amp;gt;&amp;gt; islice(count(), (1&amp;lt;&amp;lt;31))
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
ValueError: Stop argument for islice() must be a non-negative integer or None.

&lt;/pre&gt;

&lt;/div&gt;

&lt;hr /&gt;

&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;
&lt;p style="text-align:center"&gt;so reuse ours&lt;/p&gt;</description>
<dc:date>2009-06-17</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/partitioning-with-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/GKGO7wxv_fE/partitioning-with-python</link>
<category>Python</category>
<category>Puzzles</category>
<category>Algorithms</category>
<feedburner:origLink>http://wordaligned.org/articles/partitioning-with-python</feedburner:origLink></item>

<item>
<title>Oulipo and the Eodermdrome challenge</title>
<description>&lt;p style="margin:0;font-size:500%"&gt;&lt;span style="color:#888"&gt;S&lt;/span&gt;&lt;span style="color:#930"&gt;H&lt;/span&gt;&lt;span style="color:#036"&gt;O&lt;/span&gt;&lt;span style="color:#555"&gt;E&lt;/span&gt;&lt;span style="color:#888"&gt;S&lt;/span&gt; &lt;span style="color:#036"&gt;O&lt;/span&gt;&lt;span style="color:#e50"&gt;N&lt;/span&gt; &lt;span style="color:#930"&gt;H&lt;/span&gt;&lt;span style="color:#555"&gt;E&lt;/span&gt;&lt;span style="color:#e50"&gt;N&lt;/span&gt;&lt;span style="color:#888"&gt;S&lt;/span&gt;&lt;/p&gt;
&lt;p style="margin:0;font-size:500%"&gt;&lt;span style="color:#036"&gt;S&lt;/span&gt;&lt;span style="color:#888"&gt;A&lt;/span&gt;&lt;span style="color:#930"&gt;M&lt;/span&gt;&lt;span style="color:#036"&gt;S&lt;/span&gt;&lt;span style="color:#555"&gt;O&lt;/span&gt;&lt;span style="color:#e50"&gt;N&lt;/span&gt; &lt;span style="color:#930"&gt;M&lt;/span&gt;&lt;span style="color:#555"&gt;O&lt;/span&gt;&lt;span style="color:#888"&gt;A&lt;/span&gt;&lt;span style="color:#e50"&gt;N&lt;/span&gt;&lt;span style="color:#036"&gt;S&lt;/span&gt;&lt;/p&gt;
&lt;p style="margin:0;font-size:500%"&gt;&lt;span style="color:#555"&gt;D&lt;/span&gt;&lt;span style="color:#930"&gt;R&lt;/span&gt;&lt;span style="color:#888"&gt;A&lt;/span&gt;&lt;span style="color:#036"&gt;B&lt;/span&gt; &lt;span style="color:#930"&gt;R&lt;/span&gt;&lt;span style="color:#e50"&gt;E&lt;/span&gt;&lt;span style="color:#555"&gt;D&lt;/span&gt; &lt;span style="color:#036"&gt;B&lt;/span&gt;&lt;span style="color:#e50"&gt;E&lt;/span&gt;&lt;span style="color:#888"&gt;A&lt;/span&gt;&lt;span style="color:#555"&gt;D&lt;/span&gt;&lt;/p&gt;


&lt;h3&gt;Oulipo&lt;/h3&gt;
&lt;p&gt;At the &lt;a href="http://www.dcs.warwick.ac.uk/bshm/meetings/Fiction.html"&gt;Mathematics and Fiction&lt;/a&gt; workshop held last weekend in Oxford I particularly enjoyed &lt;a href="http://web.princeton.edu/sites/fit/faculty/bellos.html"&gt;David Bellos&lt;/a&gt;&amp;#8217; wonderful talk about Oulipo, the world&amp;#8217;s longest running literary movement. &lt;a href="http://www.oulipo.net"&gt;The Oulipo&lt;/a&gt; is a group of writers interested in exploring the application of mathematical structures, patterns and algorithms to writing.
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.flickr.com/photos/thomasguest/3597995774/" title="Queneau sonnets by Thomas Guest, on Flickr"&gt;&lt;img src="http://farm4.static.flickr.com/3342/3597995774_857cdd8566_o.jpg" width="450" height="325" alt="Queneau sonnets" /&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;As an example, poet and novelist &lt;a href="http://en.wikipedia.org/wiki/Raymond_Queneau"&gt;Raymond Queneau&lt;/a&gt; unleashed the exponential power of combinatorics to write a &lt;a href="http://en.wikipedia.org/wiki/Sonnet"&gt;small book&lt;/a&gt; of sonnets which he hadn&amp;#8217;t finished reading himself!
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;


&lt;h3&gt;Constraints&lt;/h3&gt;
&lt;div class="amazon"&gt;&lt;a href="http://www.amazon.com/gp/product/0099477548?tag=wordalig-20"&gt;&lt;img src="http://wordaligned.org/images/books/damascus.jpg" alt="Damascus cover"/&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/Sonnet"&gt;sonnet&lt;/a&gt; is a highly constrained literary form: 14 lines, 10 syllables per line, and a well-defined rhyme pattern. More generally, the Oulipo discovered that such mathematical constraints can generate interesting results. Constraints can also provide inspiration: tying things down helps give them shape. Consider two questions:
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     What are you doing?
 &lt;/li&gt;

 &lt;li&gt;
     What are you doing? (Limit your answer to &lt;a href="http://twitter.com/thomasguest"&gt;140 characters&lt;/a&gt;.)
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The first sounds plain nosey; but the second has spawned a whole new form of publishing.
&lt;/p&gt;
&lt;p&gt;The &lt;a href="http://www.fox.com/24" title="24. Never seen it, but I get the idea!"&gt;day-in-a-life&lt;/a&gt; format is another &lt;a href="http://en.wikipedia.org/wiki/Bloomsday"&gt;famous&lt;/a&gt; literary constraint. Oulipo-inspired writer &lt;a href="http://richardbeard.info"&gt;Richard Beard&lt;/a&gt; explains how he notched this constraint up a level, creating a novel in which the action is formally and tightly bound to a single day.
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;In &amp;#8220;Damascus,&amp;#8221; I only use nouns that appeared in The Times of Nov. 1 1993. How does this work? In one paragraph some children are racing to the sea and one of them wants to say &amp;#8212; &amp;#8220;Last to touch the water&amp;#8217;s a donkey.&amp;#8221; But there&amp;#8217;s no &amp;#8220;donkey&amp;#8221; in the paper, so they end up saying, &amp;#8220;Last to touch the water&amp;#8217;s a walrus.&amp;#8221; So you end up with some interesting and novel linguistic formulations. &amp;#8212; &lt;a href="http://richardbeard.info/html/the_japan_times_.html"&gt;Richard Beard&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h3&gt;The Eodermdrome challenge&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.oulipo.net/contraintes/docs/eodermdrome"&gt;&lt;img src="http://wordaligned.org/images/eodermdrome.png" width="200px" height="200px" style="float:right" alt="eodermdrome"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;p&gt;The simplest Oulipian structure David Bellos presented was the &lt;a href="http://www.oulipo.net/contraintes/docs/eodermdrome"&gt;eodermdrome&lt;/a&gt;. The word &amp;#8220;EODERMDROME&amp;#8221; is itself an eodermdrome &lt;a id="fn1link" href="http://wordaligned.org/articles/oulipo-eodermdrome#fn1"&gt;&lt;sup&gt;[1]&lt;/sup&gt;&lt;/a&gt;: if you place the letters E, O, D, R, M at the vertices of a pentagon, as shown, when you trace the sequence E&amp;rarr;O&amp;rarr;D&amp;rarr;E&amp;rarr;R&amp;rarr;M&amp;rarr;D&amp;rarr;R&amp;rarr;O&amp;rarr;M&amp;rarr;E you end up where you started, covering each line in the resulting figure exactly once. Mathematically speaking, the sequence EODERMDROME forms an Eulerian circuit within the fully connected graph whose vertices are the set of its constituent characters. Eodermdromes make naturally pleasing sequences, perhaps suitable for domain names or memorable phone numbers.
&lt;/p&gt;
&lt;p&gt;In his talk David Bellos offered three more eodermdromes. The second is credited to Jacques Roubaud. You&amp;#8217;ll notice that the elements in the third are words rather than characters: the pattern works at any scale, and a reader needn&amp;#8217;t be aware of it to appreciate its beauty.
&lt;/p&gt;
&lt;ol&gt;
 &lt;li&gt;
     tears at rest
 &lt;/li&gt;

 &lt;li&gt;
     &amp;eacute;toile, ortie
 &lt;/li&gt;

 &lt;li&gt;
     figs, lizards, snakes, heat, light, figs, snakes, light, lizards, heat, figs
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Eodermdromes turn out to be surprisingly thin on the ground. I include three of my own discoveries &lt;a id="fn2link" href="http://wordaligned.org/articles/oulipo-eodermdrome#fn2"&gt;&lt;sup&gt;[2]&lt;/sup&gt;&lt;/a&gt; at the start of this article. Can you find any better ones?
&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.withhugsandkisses.co.uk"&gt;&lt;img src="http://wordaligned.org/images/shoes-on-hens.jpg" alt="SHOES ON HENS"/&gt;&lt;/a&gt;
&lt;/p&gt;
&lt;hr /&gt;

&lt;p&gt;&lt;a id="fn1" href="http://wordaligned.org/articles/oulipo-eodermdrome#fn1link"&gt;[1]&lt;/a&gt; The word for such words is &amp;#8220;autological&amp;#8221;, as opposed to &amp;#8220;heterological&amp;#8221;. But is &lt;a href="http://en.wikipedia.org/wiki/Grelling-Nelson_paradox" title="Yes but no but"&gt;&amp;#8220;heterological&amp;#8221; itself heterological&lt;/a&gt;?
&lt;/p&gt;
&lt;p&gt;&lt;a id="fn2" href="http://wordaligned.org/articles/oulipo-eodermdrome#fn2link"&gt;[2]&lt;/a&gt; OK, so a computer did the hard work. It&amp;#8217;s a nice programming exercise.
&lt;/p&gt;
&lt;p style="text-align:center"&gt;&amp;sect;&lt;/p&gt;
&lt;p style="text-align:center"&gt;end code once&lt;/p&gt;</description>
<dc:date>2009-06-05</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/oulipo-eodermdrome</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/W0RdEutmkh0/oulipo-eodermdrome</link>
<category>Puzzles</category>
<feedburner:origLink>http://wordaligned.org/articles/oulipo-eodermdrome</feedburner:origLink></item>

<item>
<title>Run-length encoding in Python</title>
<description>&lt;p&gt;Recently I discussed &lt;a href="http://wordaligned.org/articles/deflate-runlength-encoding-but-better" title="DEFLATE: run-length encoding, but better"&gt;run-length encoding and DEFLATE&lt;/a&gt; compression. I never actually showed a Python implementation of a run-length encoder, so here&amp;#8217;s one now.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;import itertools as its

def ilen(it):
    '''Return the length of an iterable.
    
    &amp;gt;&amp;gt;&amp;gt; ilen(range(7))
    7
    '''
    return sum(1 for _ in it)

def runlength_enc(xs):
    '''Return a run-length encoded version of the stream, xs.
    
    The resulting stream consists of (count, x) pairs.
    
    &amp;gt;&amp;gt;&amp;gt; ys = runlength_enc('AAABBCCC')
    &amp;gt;&amp;gt;&amp;gt; next(ys)
    (3, 'A')
    &amp;gt;&amp;gt;&amp;gt; list(ys)
    [(2, 'B'), (3, 'C')]
    '''
    return ((ilen(gp), x) for x, gp in its.groupby(xs))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The decoder is equally simple. Here, &lt;code&gt;itertools.repeat&lt;/code&gt; expands a &lt;code&gt;(count, value)&lt;/code&gt; pair into an iterable which will generate &lt;code&gt;count&lt;/code&gt; elements, and &lt;code&gt;itertools.chain&lt;/code&gt; flattens these iterables into a single stream.
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def runlength_dec(xs):
    '''Expand a run-length encoded stream.
    
    Each element of xs is a pair, (count, x).
    
    &amp;gt;&amp;gt;&amp;gt; ys = runlength_dec(((3, 'A'), (2, 'B')))
    &amp;gt;&amp;gt;&amp;gt; next(ys)
    'A'
    &amp;gt;&amp;gt;&amp;gt; ''.join(ys)
    'AABB'
    '''
    return its.chain.from_iterable(its.repeat(x, n) for n, x in xs)

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If you haven&amp;#8217;t seen &lt;code&gt;&lt;a href="http://docs.python.org/library/itertools.html#itertools.itertools.chain.from_iterable"&gt;itertools.chain.from_iterable()&lt;/a&gt;&lt;/code&gt; yet, it was introduced in Python 2.6 and 3.0. The important feature here is that it lazily works its way through a single iterable argument. If instead we&amp;#8217;d written:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def runlength_dec(xs):
    ....
    return its.chain(*(its.repeat(x, n) for n, x in xs))

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;then our run-length decoder would need to consume all of &lt;code&gt;xs&lt;/code&gt; before yielding results (which is why we must interrupt the interpreter&amp;#8217;s execution below).
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; xs = its.cycle([(3, 'A'), (2, 'B')])
&amp;gt;&amp;gt;&amp;gt; runlength_dec(xs)
  C-c C-c
Traceback (most recent call last):
  File "&amp;lt;stdin&amp;gt;", line 1, in &amp;lt;module&amp;gt;
  File "&amp;lt;string&amp;gt;", line 25, in runlength_dec
  File "&amp;lt;string&amp;gt;", line 25, in &amp;lt;genexpr&amp;gt;
KeyboardInterrupt

&lt;/pre&gt;

&lt;/div&gt;
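&lt;p&gt;By contrast, the lazy &lt;code&gt;chain.from_iterable()&lt;/code&gt; version copes with the same endless stream. Here is a small sketch, repeating the decoder so that the example runs on its own, and using &lt;code&gt;islice&lt;/code&gt; to take just the first eight values.
&lt;/p&gt;

```python
import itertools as its

def runlength_dec(xs):
    'Lazily expand a stream of (count, x) pairs.'
    return its.chain.from_iterable(its.repeat(x, n) for n, x in xs)

# An infinite stream of pairs: (3, 'A'), (2, 'B'), (3, 'A'), ...
xs = its.cycle([(3, 'A'), (2, 'B')])

# Only as many pairs as are needed get pulled from xs.
first_eight = ''.join(its.islice(runlength_dec(xs), 8))
print(first_eight)   # AAABBAAA
```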


&lt;h3&gt;Named tuples for clarity&lt;/h3&gt;
&lt;p&gt;Streams of pairs (as shown above) are perfectly Pythonic. If we run-length encode a stream of numbers, clients will just have to read the manual and remember that &lt;code&gt;item[0]&lt;/code&gt; is a repeat count and &lt;code&gt;item[1]&lt;/code&gt; is a value.
&lt;/p&gt;
&lt;p&gt;If this seems fragile, a new-ish member of the &lt;a href="http://docs.python.org/dev/library/collections.html"&gt;collections module&lt;/a&gt; can give the pair more structure.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;&amp;gt;&amp;gt;&amp;gt; from collections import namedtuple
&amp;gt;&amp;gt;&amp;gt; Run = namedtuple('Run', 'count value') 
&amp;gt;&amp;gt;&amp;gt; run1 = Run(count=10, value=2)
&amp;gt;&amp;gt;&amp;gt; run2 = Run(value=2, count=10)
&amp;gt;&amp;gt;&amp;gt; run1
Run(count=10, value=2)
&amp;gt;&amp;gt;&amp;gt; run2
Run(count=10, value=2)
&amp;gt;&amp;gt;&amp;gt; run1.count
10
&amp;gt;&amp;gt;&amp;gt; run1[0]
10

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here&amp;#8217;s how we&amp;#8217;d change &lt;code&gt;runlength_enc()&lt;/code&gt; to use the new type.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;def runlength_enc(xs):
    '''Return a run-length encoded version of the stream, xs.
    
    &amp;gt;&amp;gt;&amp;gt; ys = runlength_enc('AAABBCCC')
    &amp;gt;&amp;gt;&amp;gt; next(ys)
    Run(count=3, value='A')
    &amp;gt;&amp;gt;&amp;gt; list(ys)
    [Run(count=2, value='B'), Run(count=3, value='C')]
    '''
    return (Run(ilen(gp), x) for x, gp in its.groupby(xs))

&lt;/pre&gt;

&lt;/div&gt;</description>
<dc:date>2009-06-01</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/runlength-encoding-in-python</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/I2i0l49FVYQ/runlength-encoding-in-python</link>
<category>Python</category>
<category>Streams</category>
<feedburner:origLink>http://wordaligned.org/articles/runlength-encoding-in-python</feedburner:origLink></item>

<item>
<title>DEFLATE: run-length encoding, but better</title>
<description>&lt;h3&gt;Run-length encoding&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Run-length_encoding"&gt;Run-length encoding&lt;/a&gt; is a simple compression scheme in which runs of equal values are represented by the value and a repeat count. For example, a supermarket cashier might process this line of shopping
&lt;/p&gt;
&lt;img src="http://wordaligned.org/images/fruit-line.png" alt="Fruit salad"/&gt;

&lt;p&gt;as
&lt;/p&gt;
&lt;ul&gt;
 &lt;li&gt;
     4 bananas
 &lt;/li&gt;

 &lt;li&gt;
     3 apples
 &lt;/li&gt;

 &lt;li&gt;
     2 bananas
 &lt;/li&gt;

 &lt;li&gt;
     1 pineapple
 &lt;/li&gt;

 &lt;li&gt;
     3 apples
 &lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Unix packs in its very own run-length encoder, &lt;code&gt;uniq -c&lt;/code&gt;. It works just fine, so long as the values you want to encode are newline-separated byte strings.
&lt;/p&gt;
&lt;p&gt;Let&amp;#8217;s use a sequence of coin tosses as an example stream. &lt;code&gt;$RANDOM&lt;/code&gt; generates random numbers. We use the least significant bit of these numbers as an index into an array containing the values &lt;code&gt;heads&lt;/code&gt;, &lt;code&gt;tails&lt;/code&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ HT=(heads tails)
$ toss() { echo ${HT[$RANDOM&amp;amp;1]}; }
$ toss; toss; toss
heads
tails
tails
$ tosses() { while [ 1 ]; do toss; done; }
$ tosses | head
tails
tails
tails
heads
tails
heads
heads
heads
tails
tails

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
   &lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
   &lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
   &lt;img src="http://wordaligned.org/images/heads.jpg" alt="heads"/&gt;
   &lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
   &lt;img src="http://wordaligned.org/images/heads.jpg" alt="heads"/&gt;
   &lt;img src="http://wordaligned.org/images/heads.jpg" alt="heads"/&gt;
   &lt;img src="http://wordaligned.org/images/heads.jpg" alt="heads"/&gt;
   &lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
   &lt;img src="http://wordaligned.org/images/tails.jpg" alt="tails"/&gt;
&lt;/p&gt;
&lt;span id="continue-reading"/&gt;

&lt;p&gt;Passing a fresh sample from this same stream through our run-length encoder we get:
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ tosses | uniq -c | head
   2 heads
   1 tails
   1 heads
   1 tails
   1 heads
   6 tails
   3 heads
   1 tails
   4 heads
   1 tails

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An &lt;code&gt;awk&lt;/code&gt; script can be used as a run-length decoder. (There must be a neater way, using &lt;code&gt;sed&lt;/code&gt; maybe?)
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;$ runlendec() { awk '{ while ($1--) print $2 }'; }
$ tosses | head | tee orig.log | uniq -c | runlendec | tee encdec.log
heads
tails
heads
tails
heads
heads
tails
tails
heads
heads
$ diff orig.log encdec.log

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we toss a coin 10 times, teeing the original sequence to a file. The next two links in the pipeline compress and decompress the sequence, teeing the results to another file. Finally, as a sanity check, we confirm the round trip results are the same.
&lt;/p&gt;

&lt;h3&gt;Run-length encoding in Python&lt;/h3&gt;
&lt;p&gt;This Unix run-length codec is fun, but of limited practical use. One good feature, though, is the way it operates on streams of data (including infinite streams), leaving clients free to decide how best to slice and buffer these streams.
&lt;/p&gt;
&lt;p&gt;Python has a fine library of high-level &lt;a href="http://docs.python.org/library/itertools.html"&gt;stream transformation tools&lt;/a&gt; from which we can build a generic and flexible run-length codec in just a few lines. Since I want to progress from run-length coding to something more advanced, I&amp;#8217;ll leave discussing how to implement this codec for now, but if you&amp;#8217;d like to write your own version, here&amp;#8217;s a description suitable for &lt;a href="http://docs.python.org/library/doctest#simple-usage-checking-examples-in-a-text-file"&gt;doctesting&lt;/a&gt;.
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;Import the run-length codec functions and compress a short string.
&amp;gt;&amp;gt;&amp;gt; from runlength import compress, decompress
&amp;gt;&amp;gt;&amp;gt; comp = compress('AABBBACC')

The returned compressor is a stream (an iterable).
&amp;gt;&amp;gt;&amp;gt; next(comp)
(2, 'A')

Pull the rest of the stream into memory.
&amp;gt;&amp;gt;&amp;gt; rest = list(comp)
&amp;gt;&amp;gt;&amp;gt; rest
[(3, 'B'), (1, 'A'), (2, 'C')]

Simple decompress example.
&amp;gt;&amp;gt;&amp;gt; concat = ''.join
&amp;gt;&amp;gt;&amp;gt; concat(decompress(rest))
'BBBACC'

Compress, decompress also work with infinite streams, like the 
a2b3 stream, which repeatedly cycles two pairs. 
&amp;gt;&amp;gt;&amp;gt; from itertools import cycle, islice
&amp;gt;&amp;gt;&amp;gt; a2b3 = cycle([(2, 'a'), (3, 'b')])
&amp;gt;&amp;gt;&amp;gt; dec = decompress(a2b3)

Pull 8 values from the decompressed stream.
&amp;gt;&amp;gt;&amp;gt; concat(islice(dec, 8))
'aabbbaab'

Now compress the decompressed stream, and explore a few items.
&amp;gt;&amp;gt;&amp;gt; comp = compress(dec)
&amp;gt;&amp;gt;&amp;gt; next(comp)
(2, 'b')
&amp;gt;&amp;gt;&amp;gt; list(islice(comp, 2))
[(2, 'a'), (3, 'b')]

&lt;/pre&gt;

&lt;/div&gt;
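&lt;p&gt;For anyone comparing notes, here is one way the description above could be satisfied: a minimal sketch built on &lt;code&gt;itertools.groupby&lt;/code&gt;, &lt;code&gt;chain&lt;/code&gt; and &lt;code&gt;repeat&lt;/code&gt;. The function names match the doctest session, but the bodies are an assumption, not necessarily the contents of the &lt;code&gt;runlength&lt;/code&gt; module imported there.
&lt;/p&gt;

```python
from itertools import chain, groupby, repeat


def compress(data):
    """Lazily yield a (run length, value) pair for each run in data."""
    # groupby clusters consecutive equal items; counting a cluster
    # consumes it without holding the whole run in memory.
    return ((sum(1 for _ in run), value) for value, run in groupby(data))


def decompress(pairs):
    """Lazily expand (run length, value) pairs back into single values."""
    return chain.from_iterable(repeat(value, count) for count, value in pairs)


print(list(compress('AABBBACC')))
print(''.join(decompress([(3, 'B'), (1, 'A'), (2, 'C')])))
```

&lt;p&gt;Both functions return iterators and pull items only on demand, which is what lets them cope with the infinite &lt;code&gt;a2b3&lt;/code&gt; stream.
&lt;/p&gt;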


&lt;h3&gt;DEFLATE&lt;/h3&gt;
&lt;img style="border: 2px solid #ccc;" src="http://wordaligned.org/images/chessboard-monochrome.png" alt="Chessboard"/&gt;

&lt;p&gt;The Wikipedia page on &lt;a href="http://en.wikipedia.org/wiki/Run-length_encoding"&gt;run-length encoding&lt;/a&gt; identifies monochrome images as good candidates for run-length compression. The white and black pixels typically group into long runs. Indeed, any simple image using a limited palette should reduce well using this compression scheme.
&lt;/p&gt;
&lt;p&gt;The chessboard above is 256&amp;times;256 pixels, each square being 32&amp;times;32 pixels. We &lt;em&gt;could&lt;/em&gt; run-length encode this 64K pixel image as 256&amp;times;8 = 2K runs of 32 pixels, a decent saving. (Actually, we should do slightly better, since runs of length 64 span the chessboard rank boundaries, but you get the idea.)
&lt;/p&gt;
&lt;pre&gt;
(32,W)(32,B)(32,W)(32,B)(32,W)(32,B)(32,W)(32,B),
(32,W)(32,B)(32,W)(32,B)(32,W)(32,B)(32,W)(32,B),
....
(32,B)(32,W)(32,B)(32,W)(32,B)(32,W)(32,B)(32,W)
&lt;/pre&gt;
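&lt;p&gt;The run count is easy to check. The sketch below assumes the pixels are read row by row as one continuous stream with a white square at the top left. The names &lt;code&gt;chessboard_pixels&lt;/code&gt;, &lt;code&gt;SIZE&lt;/code&gt; and &lt;code&gt;SQUARE&lt;/code&gt; are made up for the illustration.
&lt;/p&gt;

```python
from collections import Counter
from itertools import groupby

SIZE, SQUARE = 256, 32  # image width/height and square size, in pixels


def chessboard_pixels():
    """Yield 'W' or 'B' for each pixel, row by row, top-left square white."""
    for y in range(SIZE):
        for x in range(SIZE):
            yield 'W' if (x // SQUARE + y // SQUARE) % 2 == 0 else 'B'


# Run-length encode the whole image as a single stream of pixels.
runs = [(sum(1 for _ in run), colour)
        for colour, run in groupby(chessboard_pixels())]

# Tally how many runs there are of each length.
lengths = Counter(length for length, _ in runs)
print(len(runs), dict(lengths))
```

&lt;p&gt;Under those assumptions the stream holds 2041 runs: seven of length 64, one at each boundary between ranks, and the rest of length 32.
&lt;/p&gt;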

&lt;p&gt;Like a paletted image, a block of text (the web page you&amp;#8217;re reading now, for example) employs a limited alphabet. Although the characters in this text don&amp;#8217;t usually group into long runs, there&amp;#8217;s plenty of repetition, especially in the raw HTML: all the occurrences of &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt; and &lt;code&gt;class&lt;/code&gt; used for CSS styling, say. The &lt;a href="http://en.wikipedia.org/wiki/DEFLATE"&gt;DEFLATE&lt;/a&gt; compression algorithm uses a clever twist on run-length encoding to remove this redundancy:
&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;The compressed data consists of a series of elements of two types: literal bytes (of strings that have not been detected as duplicated within the previous 32K input bytes), and pointers to duplicated strings, where a pointer is represented as a pair &amp;lt;length, backward distance&amp;gt;. (&lt;a href="http://tools.ietf.org/html/rfc1951"&gt;RFC-1951&lt;/a&gt;)
&lt;/p&gt;
&lt;/blockquote&gt;&lt;p&gt;(In addition, a multiple-level dynamic Huffman encoding scheme reduces the space needed for the strings, distances and lengths themselves.)
&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s more to these pointer elements than first appears: the length can exceed the backward distance. Thus the sequence:
&lt;/p&gt;
&lt;pre&gt;
heads
heads
heads
heads
heads
&lt;/pre&gt;

&lt;p&gt;can be deflated as the literal type &lt;code&gt;heads\n&lt;/code&gt; followed by the pointer type &lt;code&gt;&amp;lt;24, 6&amp;gt;&lt;/code&gt;. 
&lt;/p&gt;
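&lt;p&gt;A toy model shows why a length greater than the distance works. The sketch below is not zlib and ignores the real DEFLATE bit format. It assumes elements arrive already decoded, as either &lt;code&gt;bytes&lt;/code&gt; literals or (length, distance) tuples, and &lt;code&gt;inflate&lt;/code&gt; is a name invented for the illustration.
&lt;/p&gt;

```python
def inflate(elements):
    """Rebuild a byte stream from literals and (length, distance) pointers."""
    out = bytearray()
    for element in elements:
        if isinstance(element, bytes):
            out += element                  # literal bytes: copy straight through
        else:
            length, distance = element
            for _ in range(length):         # copy one byte at a time, so that a
                out.append(out[-distance])  # pointer can read its own output
    return bytes(out)


print(inflate([b'heads\n', (24, 6)]))
```

&lt;p&gt;Copying byte by byte is the whole trick: by the time the pointer has copied six bytes, the bytes it needs next are the ones it has just written.
&lt;/p&gt;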
&lt;p&gt;If you&amp;#8217;ve spotted the potential for recursion, good! The inflating stream can reference itself, which can reference itself, which can &amp;#8230; &lt;a href="http://steike.com/code/useless/zip-file-quine/" title="Best ever Quine!"&gt;Confusing?&lt;/a&gt;
&lt;/p&gt;

&lt;h3&gt;Zipping pixels&lt;/h3&gt;
&lt;p&gt;&lt;a href="http://www.libpng.org/pub/png/" title="Check out the graphics on the PNG home page!"&gt;PNG&lt;/a&gt; images use DEFLATE compression (as implemented by &lt;a href="http://www.zlib.net"&gt;zlib&lt;/a&gt;) to save on pixel storage space. Here&amp;#8217;s a binary view of the raw data in the chessboard graphic shown above, all &lt;strong&gt;137 bytes&lt;/strong&gt; of it. The 64K pixels themselves compress into an 88 byte IDAT chunk, of which the final 8 bytes are a checksum and (I think?) some padding. Maybe the image could be &lt;a href="http://drj11.wordpress.com/2009/02/20/i-crush-optipng/"&gt;squeezed harder&lt;/a&gt;, but I&amp;#8217;m impressed!
&lt;/p&gt;
&lt;pre&gt;
8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .&lt;b&gt;PNG&lt;/b&gt;........&lt;b&gt;IHDR&lt;/b&gt;
0000 0100 0000 0100 0100 0000 0074 0995  .............t..
cb00 0000 5049 4441 5468 81ed ceb1 0d00  ....P&lt;b&gt;IDAT&lt;/b&gt;h......
200c 0341 f65f 1a58 803a 2f74 6e52 e424   ..A._.X.:/tnR.$
7bed 9b75 f3ba cf07 0000 df83 ca0e 0000  {..u............
7a60 ba1f 0080 2ea8 ec00 00a0 07a6 fb01  z`..............
00e8 82ca 0e00 007a 60ba 1f00 802e a8ec  .......z`.......
0000 2007 0e8a 69f0 e2b9 9471 c700 0000  .. ...i....q....
0049 454e 44ae 4260 82                   .&lt;b&gt;IEND&lt;/b&gt;.B`.
&lt;/pre&gt;
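&lt;p&gt;To get a feel for how well such pixels deflate, the uncompressed scanlines can be rebuilt and handed to Python&amp;#8217;s &lt;code&gt;zlib&lt;/code&gt; module directly. This is a rough sketch under stated assumptions: pixels packed one bit each with white as 1 (consistent with the bit depth and colour type in the IHDR bytes above), and a zero filter byte at the start of every scanline. The compressed size it reports need not match the figure in the PNG file, since encoder settings differ.
&lt;/p&gt;

```python
import zlib

WHITE, BLACK = b'\xff' * 4, b'\x00' * 4   # 32 pixels at 1 bit each = 4 bytes
FILTER_NONE = b'\x00'                     # assumed per-scanline filter byte

# A scanline crossing an even rank starts white; an odd rank starts black.
even = FILTER_NONE + (WHITE + BLACK) * 4
odd = FILTER_NONE + (BLACK + WHITE) * 4

# Eight ranks of 32 identical scanlines each.
raw = b''.join((even if rank % 2 == 0 else odd) * 32 for rank in range(8))

packed = zlib.compress(raw, 9)
print(len(raw), 'bytes of scanline data deflate to', len(packed), 'bytes')
print('round trip ok:', zlib.decompress(packed) == raw)
```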

&lt;p&gt;Here&amp;#8217;s a trace of how zlib inflates the compressed pixels in this &lt;a href="http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html"&gt;IDAT chunk&lt;/a&gt;. (Source code available via anonymous SVN at &lt;a href="http://wordaligned.org/svn/etc/zlib_trace"&gt;http://wordaligned.org/svn/etc/zlib_trace&lt;/a&gt;.)
&lt;/p&gt;
&lt;div class="typocode"&gt;

&lt;pre class="prettyprint"&gt;inflate: allocated
inflate: reset
inflate:   zlib header ok
inflate:     dynamic codes block (last)
inflate:       table sizes ok
inflate:       code lengths ok
inflate:       codes ok
inflate:         literal 0x00
inflate:         literal 0xff
inflate:         length 3
inflate:         distance 1
inflate:         literal 0x00
inflate:         length 3
inflate:         distance 1
inflate:         length 24
inflate:         distance 8
inflate:         length 25
inflate:         distance 25
inflate:         length 258
inflate:         distance 33
....

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I&amp;#8217;ve attempted to show the first few stages of the genesis of the uncompressed stream in the picture below. The way the stream recursively inflates itself is quite beautiful.
&lt;/p&gt;
&lt;img style="border: 2px solid #ccc;" src="http://wordaligned.org/images/inflate.png" alt="Inflating pixels"/&gt;

&lt;ol&gt;
 &lt;li&gt;
     put 00
 &lt;/li&gt;

 &lt;li&gt;
     put ff
 &lt;/li&gt;

 &lt;li&gt;
     go back 1 (to ff), put 3
 &lt;/li&gt;

 &lt;li&gt;
     put 00
 &lt;/li&gt;

 &lt;li&gt;
     go back 1 (to 00), put 3
 &lt;/li&gt;

 &lt;li&gt;
     go back 8 (to 00 00 00 00 ff ff ff ff)
 &lt;/li&gt;

 &lt;li&gt;
     put 24
 &lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Two elements later, and the repeat length has grown to 258. In fact, the entire chessboard is generated from just 3 literal and 43 pointer elements.
&lt;/p&gt;
&lt;p&gt;(Not all graphics have such a regular pattern, of course, so we can&amp;#8217;t always achieve such dramatic compression.)
&lt;/p&gt;
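&lt;p&gt;The first few trace elements can be replayed by hand to watch the stream grow. The sketch below is a toy interpreter rather than zlib itself. The name &lt;code&gt;replay&lt;/code&gt; is invented, and the expected scanline layout (a zero filter byte followed by alternating 4-byte white and black groups) is an assumption inferred from the trace.
&lt;/p&gt;

```python
def replay(elements):
    """Apply literals and (length, distance) copies, yielding the stream after each step."""
    out = bytearray()
    for element in elements:
        if isinstance(element, int):
            out.append(element)             # a literal byte
        else:
            length, distance = element
            for _ in range(length):         # byte-by-byte, so a copy may overlap itself
                out.append(out[-distance])
        yield bytes(out)


# The first eight elements listed in the inflate trace.
TRACE = [0x00, 0xff, (3, 1), 0x00, (3, 1), (24, 8), (25, 25), (258, 33)]

# Assumed scanline: filter byte, then four white/black pairs of squares.
scanline = b'\x00' + (b'\xff' * 4 + b'\x00' * 4) * 4

stages = list(replay(TRACE))
for stage in stages[:6]:
    print(stage.hex())
print([len(stage) for stage in stages])
print(stages[-1] == (scanline * 10)[:len(stages[-1])])
```

&lt;p&gt;After six elements the stream holds one complete 33-byte scanline. The 258-byte copy at distance 33 then stamps out further scanlines from the ones already written.
&lt;/p&gt;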

&lt;h3&gt;Deflated HTML&lt;/h3&gt;
&lt;p&gt;Web servers can and do save on bandwidth by transferring &lt;a href="http://www.gzip.org/"&gt;gzip&lt;/a&gt; compressed HTML to gzip-capable clients. (Gzip is a simple wrapper around DEFLATE.) Any PNG images transferred will also have their pixels DEFLATE compressed.
&lt;/p&gt;
&lt;pre&gt;
$ curl http://wordaligned.org --head --compressed
HTTP/1.1 200 OK
Date: Sun, 17 May 2009 17:41:53 GMT
Server: lighttpd | Word Aligned
Content-Type: text/html; charset=UTF-8
....
Vary: Accept-Encoding
&lt;b&gt;Content-Encoding: gzip&lt;/b&gt;
Content-Length: 20
&lt;/pre&gt;
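&lt;p&gt;How thin is the gzip wrapper? The sketch below builds one by hand around a raw DEFLATE stream: a fixed 10-byte header, then the deflated data, then a CRC-32 and the uncompressed length, following the layout in RFC 1952. The helper &lt;code&gt;gzip_wrap&lt;/code&gt; and the sample text are made up for the illustration, and this says nothing about how lighttpd itself produces its output.
&lt;/p&gt;

```python
import gzip
import zlib


def gzip_wrap(data):
    """Wrap a raw DEFLATE stream in a minimal gzip header and trailer."""
    deflater = zlib.compressobj(9, zlib.DEFLATED, -15)  # negative wbits: raw DEFLATE
    body = deflater.compress(data) + deflater.flush()
    # magic (2 bytes), method 8 = deflate, no flags, mtime 0 (4 bytes),
    # no extra flags, OS 255 = unknown
    header = bytes([0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 255])
    trailer = (zlib.crc32(data).to_bytes(4, 'little')
               + (len(data) % 2**32).to_bytes(4, 'little'))
    return header + body + trailer


sample = b'div class span class div ' * 400   # stand-in for repetitive markup
wrapped = gzip_wrap(sample)
print(len(sample), 'bytes wrap to', len(wrapped), 'bytes')
print('standard gzip module reads it back:', gzip.decompress(wrapped) == sample)
```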

&lt;p&gt;The Word Aligned &lt;a href="http://wordaligned.org/"&gt;front page&lt;/a&gt; contains about 75KB of HTML, which gzips to just 16KB, a decent saving. Relevant lines from the &lt;a href="http://redmine.lighttpd.net/projects/lighttpd/wiki/Docs:ModCompress"&gt;lighttpd configuration file&lt;/a&gt; read:
&lt;/p&gt;
&lt;div class="typocode"&gt;&lt;div class="codetitle"&gt;lighttpd mod_compress&lt;/div&gt;

&lt;pre class="prettyprint"&gt;server.modules = (
    ....
    "mod_compress"
)
compress.cache-dir = basedir + "lighttpd/cache/compress/"
compress.filetype  = ("text/plain", "text/html", "text/css")

&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I uphold Gzip (built on zlib, which implements DEFLATE) as a hero of the web. As we&amp;#8217;ve seen, it implements a powerful and elegant algorithm, but perhaps the best thing about it is that it&amp;#8217;s free to use, a freedom worth fighting for. Check out this battle report from the &lt;a href="http://www.gzip.org/#faq"&gt;FAQ&lt;/a&gt;.
&lt;/p&gt;
&lt;blockquote&gt;
&lt;h3&gt;What about patents?&lt;/h3&gt;
&lt;p&gt;&lt;em&gt;gzip&lt;/em&gt; was developed as a replacement for compress because of the UNISYS and IBM &lt;a href="http://www.faqs.org/faqs/compression-faq/part1/section-6.html"&gt;patents&lt;/a&gt; covering the &lt;a href="http://www.faqs.org/faqs/compression-faq/part2/section-1.html"&gt;LZW&lt;/a&gt; algorithm used by compress.
&lt;/p&gt;
&lt;p&gt;I have probably spent more time studying data compression patents than actually implementing data compression algorithms. I maintain a list of several hundred patents on lossless data compression algorithms, and I made sure that &lt;em&gt;gzip&lt;/em&gt; isn&amp;#8217;t covered by any of them. In particular, the &lt;code&gt;--fast&lt;/code&gt; option of gzip is not as fast it could, precisely to avoid a patented technique.  &amp;#8212; Jean-Loup Gailly, &lt;a href="http://www.gzip.org/#faq11"&gt;Gzip FAQ&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt;</description>
<dc:date>2009-05-21</dc:date>
<guid isPermaLink="false">http://wordaligned.org/articles/deflate-runlength-encoding-but-better</guid>
<author>tag@wordaligned.org (Thomas Guest)</author>
<link>http://feeds.wordaligned.org/~r/wordaligned/~3/L6im3K_j-Bc/deflate-runlength-encoding-but-better</link>
<category>Zlib</category>
<category>Streams</category>
<category>Graphics</category>
<category>Shell</category>
<category>Python</category>
<feedburner:origLink>http://wordaligned.org/articles/deflate-runlength-encoding-but-better</feedburner:origLink></item>

</channel>
</rss>
